File Names and Directories

Changes:

2010-09-23: make environmental restriction and dependency of not
supporting '/' and using some other directory separator explicit
(suggested by Stephen Pelc).

1st RfD: Split out this (hopefully less contentious) proposal from the
Directories proposal.


Problem
-------

What are portable file names?  How do I refer to directories?  What
about relative file names?  Forth-94 does not specify these things; as
a result, systems implement diverging approaches (in particular wrt
relative file names in loading words).  This proposal tries to outline
the common ground at the moment.


Solution outline
----------------

We specify portable file names and a directory separator (and follow
POSIX here).  We also specify working-directory-relative file names
and source-directory-relative file names, and which words are
guaranteed to use one of these approaches, and which words use either.


Typical usage
-------------

Keep all source files in one directory.  Forget about separately
packaged libraries.

Proposal
--------

Portable file name components: consist of ASCII letters, digits, ".",
   "_", and "-" [from POSIX 2004 3.276].  Case matching may be case
   sensitive or insensitive, so file names shall be referred to in the
   case in which they were created, and no two file names differring
   only by case shall be created in one directory.

Directory separator: The directory separator is "/".

File names: A file name has an optional beginning "/", followed by
   zero or more portable file name components separated by "/"
   characters.

Absolute file names: start with "/".

Relative file names: start with a portable file name component.  What
   a relative file name is relative to is defined further on.

Using file names containing other characters is an ambiguous condition
   [They are treated differently on different Forth systems, different
   operating systems, or different file systems].

Parent directory: ".." refers to the parent directory, "../.." to the
   grandparent, "../sibling" to a different directory at the same level.

Source-directory-relative file name: Given a relative file name F that
   occurs in the text of a Forth source file /SD/S, then using F as
   source-directory-relative file name results in /SD/F.  Using F as
   source-directory-relative file name while not including a file is
   an ambiguous condition [A typical fallback then might be to use F
   as working-directory relative].

Working-directory-relative file name: If the absolute file name of the
   working directory is /WD, then using F as
   working-directory-relative file name results in the absolute file
   name /WD/F.  If the operating system does not support a working
   directory, only absolute file names can be used in words that would
   use relative file names as working-directory relative.

Words:

Passing a relative filename to any word consuming a filename except
   INCLUDED, INCLUDE, REQUIRED or REQUIRE uses the file name as
   working-directory-relative file name.

Passing a relative filename to INCLUDED, REQUIRED, INCLUDE or REQUIRE
   uses the file name as either working-directory-relative file name
   or as source-directory-relative file name [you can still use that
   by making sure they are the same].  An ambiguous condition exists
   if the file does not exist or is not accessible. [A fallback then
   might be to search a path, or to throw an exception].  [Note that
   all the known implementations of REQUIRE treat the file name as
   source-directory-relative, so maybe you should refrain from
   implementing it in a working-directory-relative way.]

A system that does not support '/' as directory separator has an
environmental restriction.  A program that uses a directory separator
other than '/' has an environmental dependency.

Reference implementation and tests
----------------------------------

Will be done after the solution has solidified after more discussion.


Existing practice and experience
--------------------------------

To the best of my knowledge, all standard Forth systems that implement
the FILE wordset implement this proposal (as far as they implement the
words mentioned in the proposal).

Some systems implement source-directory-relative file names for
INCLUDED, REQUIRED and INCLUDE, some implement
working-directory-relative file names for these words.  For REQUIRE,
all systems that implement it, implement source-directory relative
file names; but in order to avoid touching a contentious topic, even
REQUIRE is specified weakly in this proposal.


Remarks
-------

Isn't specifying "/" as directory separator too OS-specific?

   It works in practice, and other approaches don't.  E.g., Peter Knaggs
   writes <4505c44b$0$2665$ed2619ec@ptn-nntp-reader02.plus.net>:
   
   |Java has a special DirectorySeparator constant you are supposed to use, 
   |but nobody ever does.
   
   If there is ever a Forth 200x system for an OS that has a different
   directory syntax (e.g., VMS), the Forth system can translate the
   filename with the slashes into the native directory syntax (I guess
   that the POSIX layer for such OSs does the same).

   Note that DOS and Windows understand "/" just fine (except in their
   command-line interpreter, but Forth systems work through system
   calls, not through the command-line interpreter).


Anton Ertl