Changes based on more thought and recent (2010-01) discussion:
added proposal part and revised discussion
REQUIRE and INCLUDE now treat relative file names as source-directory
relative, whereas all other words treat relative file names as
working-directory relative.
Cleaned up and rewrote remarks section
The main way to specify source-directory-relative filenames is now F",
not prefixes. Removed prefixes from the proposal, but discuss them in the
remarks section.
Added Sections "Which prefix?", "What if there is no currently
including file?", "Isn't specifying "/" as directory separator too
OS-specific?".
Replaced many mentions of "./" with "prefix".
Minor rewrites to make some prose clearer.
Summary: How do I refer to another file that is distributed with the
Forth file at hand, in the presence of directories?
Example: Let us assume that the current working directory is /wd/, and
that we have a program installed in directory /prog1/, but on the next
installation it might be installed in /prog2/. The file to be
INCLUDEd by the user is /prog1/prog.fs, and it is supposed to INCLUDE
library.fs and read a data file data.txt residing in the same
directory. How should prog.fs refer to library.fs and data.txt?
Remember that on the next installation they might be in a different
directory than in this installation (but in the same directory as
prog.fs).
As an additional complication, consider the case where prog.fs loads a
library /prog1/lib1/lib.fs, that has been developed independently of
prog.fs, and without knowledge that it would later wind up in
/prog1/lib1; /prog1/lib1/lib.fs refers to another file foo.fs which
resides at /prog1/lib1/foo.fs in this installation. How should Forth
code in the library refer to other files of the library?
An example of a similar structure can be found in
First we need to specify a directory separator. The "/" is supported
in all important OSs (including DOS and Windows, except in cmd.exe),
so we specify it for Forth.
Next, we need a way to specify a file name relative to the
source-directory, i.e., the directory that contains the Forth source
file that contains the file name (as a string): F" file" produces a
filename in a source-file-independent way. E.g., in the example
above, the prog.fs file could refer to the other files as follows:
and lib.fs could refer to /prog1/lib1/foo.fs as follows:
In addition, a word INCLUDE-NAME-ABS for converting a string from an
include-relative filename to an absolute filename is a natural factor
of F" and can be useful in some cases (e.g., if the file name contains
a double quote).
Programmers like to use parsing words like INCLUDE and REQUIRE, so we
refine these words to treat relative file names as source-directory
relative. So instead of using

  F" foo.fs" required

the programmer can also write

  require foo.fs

Typical use:

  F" lib1/lib.fs" required
  require lib1/lib.fs \ equivalent to the above
  F" data.txt" r/o open-file
  S\" funny\"filename" include-name-abs r/o open-file
  F" data.txt" save-mem 2constant data-filename
  ( in another file in another dir: ) data-filename r/o open-file
Directory separator: The directory separator is "/".
Absolute file names: start with "/" or "<letter>:/".
Relative file names: everything else; relative file names are not
necessarily relative to the working directory.
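The distinction between absolute and relative file names can be sketched
in standard Forth as follows. The names LETTER? and ABSOLUTE-FILENAME?
are not part of the proposal; they are made up for this illustration.

```forth
: letter? ( char -- flag )
  \ true for a-z and A-Z
  dup  [char] a [char] z 1+ within
  swap [char] A [char] Z 1+ within or ;

: absolute-filename? ( c-addr u -- flag )
  \ true if the name starts with "/" or "<letter>:/"
  dup 0= if 2drop false exit then            \ empty name: not absolute
  over c@ [char] / = if 2drop true exit then \ starts with "/"
  dup 3 < if 2drop false exit then           \ too short for "<letter>:/"
  drop                                       ( c-addr )
  dup c@ letter?
  over 1+ c@ [char] : = and
  swap 2 + c@ [char] / = and ;
```

Everything for which ABSOLUTE-FILENAME? returns false is a relative file
name in the sense of this proposal, including names starting with "..".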
Parent directory: ".." refers to the parent directory, "../.." to the
grandparent, "../sibling" to a different directory at the same level.
Source-directory-relative file name: Given a relative file name F that
occurs in the text of a Forth source file /SD/S, then using F as
source-directory-relative file name results in /SD/F. Using F as
source-directory-relative file name while not including a file is
an ambiguous condition [A typical fallback then might be to use F
as working-directory relative].
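A minimal sketch of the rule "/SD/S and F give /SD/F", assuming that the
name of the including file is passed in explicitly (standard Forth has
no word for obtaining it). DIRNAME/, SOURCE-RELATIVE, NAME-BUF and the
256-character limit are illustrative assumptions, not part of the
proposal.

```forth
256 constant /name-buf                  \ assumed maximum length of a result
create name-buf /name-buf chars allot   \ result buffer, reused by each call

: dirname/ ( c-addr u1 -- c-addr u2 )
  \ shorten the string so that it ends with its last "/" (u2=0 if none)
  begin dup while
    2dup + 1- c@ [char] / = if exit then
    1-
  repeat ;

: source-relative ( src-addr src-u f-addr f-u -- c-addr u )
  \ src: name /SD/S of the including file; f: relative file name F
  2swap dirname/                        ( f-addr f-u dir-addr dir-u )
  2 pick over + /name-buf > abort" file name too long"
  >r name-buf r@ move                   ( f-addr f-u ) ( R: dir-u )
  tuck name-buf r@ + swap move          ( f-u )
  r> + name-buf swap ;
```

For example, S" /prog1/lib1/lib.fs" S" foo.fs" SOURCE-RELATIVE yields
the name /prog1/lib1/foo.fs. The sketch does not normalize ".."
components; it leaves that to the operating system.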
Working-directory-relative file name: If the absolute file name of the
working directory is /WD, then using F as
working-directory-relative file name results in the absolute file
name /WD/F. If the operating system does not support a working
directory, only absolute file names can be used in words that would
use relative file names as working-directory relative.
Passing a relative filename to any word consuming a filename except
INCLUDED, INCLUDE, REQUIRED or REQUIRE uses the file name as
working-directory-relative file name.
Passing a relative filename to INCLUDED or REQUIRED uses the file name
as working-directory-relative file name. An ambiguous condition
exists if the file does not exist or is not accessible. [A typical
fallback then might be to search a path].
Passing a relative file name to REQUIRE or INCLUDE uses that file name
as source-directory-relative file name. An ambiguous condition
exists if that file does not exist or is not accessible. [A
typical fallback then might be to search a path].
INCLUDE-NAME-ABS FILE-EXT ( c-addr1 u1 -- c-addr2 u2 )
If the file name specified by c-addr1 u1 is an absolute file name,
c-addr2 u2 is the same file name. Otherwise c-addr2 u2 is the
absolute or (if the operating system supports a working directory)
working-directory-relative file name resulting from using c-addr1
u1 as source-directory-relative file name. The contents of the
buffer containing c-addr2 u2 are valid until the next invocation of
INCLUDE-NAME-ABS or F".
F" FILE-EXT
Interpretation: ( "ccc<quote>" -- c-addr u )
Parse ccc delimited by " (double quote). Pass the resulting string
to INCLUDE-NAME-ABS, and return the result as c-addr u.
Compilation: ( "ccc<quote>" -- )
Parse ccc delimited by " (double quote). Pass the resulting string
to INCLUDE-NAME-ABS, with the result being c-addr u. Append the
run-time semantics given below to the current definition.
run-time: ( -- c-addr u )
Return c-addr u described above.
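The semantics above can be approximated by a STATE-smart word built on
SLITERAL. This is only a sketch, not the reference implementation: the
INCLUDE-NAME-ABS shown here is a placeholder that returns its argument
unchanged, because finding the directory of the current source file
needs system support. A STATE-smart word also differs from the
specification in corner cases such as POSTPONE.

```forth
\ Placeholder (assumption): a real INCLUDE-NAME-ABS would resolve the name
\ against the directory of the current source file.
: include-name-abs ( c-addr1 u1 -- c-addr2 u2 ) ;

: f" ( "ccc<quote>" -- c-addr u )
  [char] " parse include-name-abs       \ resolve while the file is being loaded
  state @ if postpone sliteral then     \ compiling: store the resolved name
; immediate

\ usage inside a definition: the name is resolved at compile time
: data-file ( -- c-addr u ) f" data.txt" ;
```

The point of resolving the name when F" is parsed is that DATA-FILE
still returns the right name when it is executed later, from another
file in another directory. When F" is used interpretively with this
placeholder, the returned string lies in the input buffer and is only
valid until that buffer is refilled.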
Reference implementation and tests
To be done once the solution has solidified through further discussion.
Existing practice and experience
Gforth implements source-directory-relative file names by interpreting
all relative file names in INCLUDED, INCLUDE, REQUIRED and REQUIRE as
source-directory relative (as part of the file search path). This has
been in Gforth since Gforth 0.4.0 in 1998. We have had very good
experiences with that functionality, and very bad experiences with
working-directory-relative INCLUDE and REQUIRE before that.
Some other Forth systems have similar facilities, e.g., Win32Forth.
Treating relative file names as working-directory-relative for all
words except INCLUDED, INCLUDE, REQUIRED, and REQUIRE is existing
practice in most (all?) Forth systems that support files on a
hierarchical file system.
Modern C compilers like gcc resolve #include "bla.h"
relative to the directory of the currently-included file. The
proposed equivalent would be 'INCLUDE bla.h' or 'F" bla.h" INCLUDED'.
If a symbolic link in Unix contains a relative file name, that file
name is relative to the directory that contains the symbolic link.
E.g., if we have
ln -s foo/bar /tmp/flip
then /tmp/flip refers to /tmp/foo/bar.
INCLUDE, REQUIRE and backwards compatibility
INCLUDE and REQUIRE are convenience words. Even though INCLUDED
was standardized in Forth-94 and INCLUDE was not, INCLUDE seems to
be much more popular. Tightening the specification of INCLUDE to
treat relative file names as source-directory-relative helps those
programmers who don't read standards (apparently most of them) to
write portable programs.
Introducing a new word (e.g., +INCLUDE) or a special syntax for the
file name (e.g., ./file or ) won't achieve this objective,
because it will not be used by most programmers. And those
programmers who would use it would probably also use
F" ..." INCLUDED.
The disadvantage of this kind of tightening is that on some systems
the current behaviour of INCLUDE is different, and existing
programs that work around this deficiency might no longer work
after the change. There are a number of ways for these systems to
deal with this:
- Use the legacy behaviour as fallback. That probably allows
nearly all legacy applications to work unchanged. The only
possible problem is if a file that is expected to be accessed
through the legacy behaviour happens to have a namesake in the
source directory, and these files are different.
- Have a (system-specific) switch that allows switching between the
legacy and source-directory-relative behaviour. This would
cover all cases.
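The first option, using the legacy behaviour as fallback, could be
sketched as follows. FILE-EXISTS? and CHOOSE-NAME are made-up names; the
sketch assumes that the system has already computed both the
source-directory-relative and the legacy interpretation of the name.

```forth
: file-exists? ( c-addr u -- flag )
  \ true if the file can be opened for reading
  r/o open-file if drop false else close-file drop true then ;

: choose-name ( sd-addr sd-u legacy-addr legacy-u -- c-addr u )
  \ prefer the source-directory-relative name if that file exists,
  \ otherwise fall back to the legacy name
  2over file-exists? if 2drop else 2swap 2drop then ;
```

The namesake problem mentioned above is visible here: if a file with the
same name exists in the source directory, the legacy file is never
chosen.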
Finally, another option would be to forgo tightening INCLUDE, and
leave it as unstandardized with respect to directories as it is now.
Programmers desiring portability will use F" ..." INCLUDED, and the
unwashed masses will just produce unportable programs.
The case is a little bit different for REQUIRE, because most
systems have not implemented it yet, so they have no legacy to stay
compatible with, and no compatibility problem. Gforth has
implemented it, but it has also implemented source-directory
relative file names with REQUIRE.
INCLUDED, REQUIRED, and backwards compatibility
Similarly, the tightening of the specification of INCLUDED and
REQUIRED to use working-directory relative files can cause
backwards compatibility problems for legacy programs on systems
that implement source-directory-relative file names for these words
(e.g., Gforth). The same options are possible as for INCLUDE.
Why let INCLUDE and REQUIRE behave differently from the other words?
In short: Because they are parsing words and the others are not,
and because the file names for INCLUDE and REQUIRE come from the
programmer.
In long: Whether a relative file name should be treated as
source-directory relative or working-directory relative depends on
where it is coming from. Names coming from the programmer should
almost always be treated as source-directory-relative, because the
programmer knows where the source directory is, but not where the
working directory is. Conversely, names coming from the user
usually refer to the working directory.
In the present proposal we have F" to refer to files in a
source-directory relative way, and we then pass absolute or
working-directory-relative file names to words such as OPEN-FILE
and INCLUDED. However, INCLUDE and REQUIRE take the file name from
the input stream, without F". In the usual case they come from the
programmer (at least when used within programs, the main focus of
the standardization effort), and then they should be used as
source-directory relative file names.
What about specifying file names relative to a library root etc.?
Several people have suggested having a word (or a prefix) for
specifying a library directory. While this is a worthwhile goal, I
feel that finding a consensus on that topic is hard enough that it
should be attacked in a separate RfD.
Why not use a CD word for this purpose?
* You lose the user's working directory when you do a CD, so any
access to a file name provided by the user breaks after the CD.
* CD is not a standard word.
Isn't specifying "/" as directory separator too OS-specific?
It works in practice, and other approaches don't. E.g., Peter Knaggs
writes:
|Java has a special DirectorySeparator constant you are supposed to use,
|but nobody ever does.
If there is ever a Forth 200x system for an OS that has a different
directory syntax (e.g., VMS), the Forth system can translate the
filename with the slashes into the native directory syntax (I guess
that the POSIX layer for such OSs does the same).
Note that DOS and Windows understand "/" just fine (except in their
command-line interpreter, but Forth systems work through system
calls, not through the command-line interpreter).