Search Paths

From Open Watcom

Introduction

This page concerns search paths in wgml and gendev. As such, it will explore several topics:

  • the documented search paths;
  • the actual search paths;
  • how the search paths are used; and
  • what this implies for this project.

Documented Search Paths

This is what the WGML Reference has to say:

Section 9.55 INCLUDE (wgml) and Section 15.6.2 INCLUDE (gendev):

When working on a PC/DOS system, the DOS environment symbol 
GMLINC may be set with an include file list. This symbol is 
defined in the same way as a library definition list (see 
"Defining a Library List" on page 297), and provides a list 
of alternate directories for file inclusion. If an included 
file is not defined in the current directory, the directories
specified by the include path list are searched for the file.
If the file is still not found, the directories specified by
the DOS environment symbol PATH are searched.

This defines the intended use of the environment variable GMLINC: it is intended to tell wgml where to look when an :INCLUDE tag is encountered. The search path is:

  • the current directory
  • the directories given in GMLINC
  • the directories given in PATH

Note that "include path list" here refers to the contents of GMLINC.

The README file that can be produced from the WGML 3.33 Update has this note relating to this path in section "4.4 New GML Features":

The device library is now searched before the DOS PATH 
when trying to open and include file.

So, the include path list would now be:

  • the current directory
  • the directories given in GMLINC
  • the directories given in GMLLIB
  • the directories given in PATH

Section 14 Running WATCOM Script/GML:

The option file "default" is located and loaded before other
options are processed. The search path for the default option
file is the current disk location, the device library path, 
followed by the document include path.

The "default" option file is an option file named "default.opt" which is always included if found. The search path is:

  • the current directory
  • the device library path
  • the document include path

This is not clear: if the "document include path" includes the entire path used with the :INCLUDE tag, then both the current directory and the directories provided by GMLLIB are searched twice; if it only includes GMLINC, then the PATH directory list is ignored.

Section 14.3.6 DEVice:

When working on a PC/DOS system, the DOS environment symbol 
GMLLIB is used to locate the device information (see 
"Libraries with IBM PC/DOS" on page 297). If the device 
information is not found, the document include path is searched 
(see "INCLUDE" on page 102).

This, then, is the search path for binary device libraries:

  • the directories given in GMLLIB
  • the document include path

Here, the "document include path" would make more sense if it were applied to the entire path used with the :INCLUDE tag. Or, at least, it would if this did not entail searching the directories in GMLLIB a second time!

Section 14.3.8 FILE:

When working on a PC/DOS system, the DOS environment symbol
GMLLIB is used to locate the command file if it is not in the
current directory (see "Libraries with IBM PC/DOS" on page 
297). If it is still not found, the document include path is 
searched (see "INCLUDE" on page 102).

This, then, is the search path for non-default option files:

  • the current directory
  • the directories given in GMLLIB
  • the document include path

Here, the "document include path" would make more sense if it were applied to the entire path used with the :INCLUDE tag. Or, at least, it would if this did not entail searching the directories in GMLLIB a second time!

So, how many paths are there, per the documentation? I suggest that there are three (source files, option files, and binary device libraries), but what they are depends on how "the document include path" and "the device library include path" are interpreted.

If "the document include path" is taken to refer solely to the contents of GMLINC and "the device library include path" to the contents of GMLLIB, then we have:

  • for source files (.gml, .pcd, .fon, plus any extension specified by the user):
      the current directory
      the directories given in GMLINC
      the directories given in GMLLIB
      the directories given in PATH
  • for option files, including "default.opt":
      the current directory
      the directories given in GMLLIB
      the directories given in GMLINC
  • for binary device libraries:
      the directories given in GMLLIB
      the current directory
      the directories given in GMLINC

If "the document include path" is taken to mean all the directories searched for document specification files and "the device library include path" to mean all the directories searched for binary device libraries, then we have:

  • for source files (.gml, .pcd, .fon, plus any extension specified by the user):
      the current directory
      the directories given in GMLINC
      the directories given in GMLLIB
      the directories given in PATH
  • for the "default.opt" file:
      the current directory
      the directories given in GMLLIB ("device library path")
      the current directory           ("device library path")
      the directories given in GMLINC ("device library path")
      the directories given in GMLLIB ("device library path")
      the directories given in PATH   ("device library path")
      the current directory           ("document include path")
      the directories given in GMLINC ("document include path")
      the directories given in GMLLIB ("document include path")
      the directories given in PATH   ("document include path")
  • for option files:
      the current directory
      the directories given in GMLLIB
      the current directory           ("document include path")
      the directories given in GMLINC ("document include path")
      the directories given in GMLLIB ("document include path")
      the directories given in PATH   ("document include path")
  • for binary device libraries:
      the directories given in GMLLIB
      the current directory           ("document include path")
      the directories given in GMLINC ("document include path")
      the directories given in GMLLIB ("document include path")
      the directories given in PATH   ("document include path")

If we remove the duplicates from the second set, then we have:

  • for source files (.gml, .pcd, .fon, plus any extension specified by the user):
      the current directory
      the directories given in GMLINC
      the directories given in GMLLIB
      the directories given in PATH
  • for option files, including "default.opt":
      the current directory
      the directories given in GMLLIB
      the directories given in GMLINC
      the directories given in PATH
  • for binary device libraries:
      the directories given in GMLLIB
      the current directory
      the directories given in GMLINC
      the directories given in PATH
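
The reduction from the second set to the third can be reproduced mechanically. The Python sketch below is illustrative only: the function name and the short labels ("curdir" and so on) are my own shorthand and do not come from the WGML Reference. It keeps the first occurrence of each location and drops any later repeat.

```python
def remove_duplicates(locations):
    """Keep the first occurrence of each location, preserving search order."""
    seen = set()
    result = []
    for location in locations:
        if location not in seen:
            seen.add(location)
            result.append(location)
    return result


# "The document include path" under the second interpretation.
DOCUMENT_INCLUDE_PATH = ["curdir", "GMLINC", "GMLLIB", "PATH"]

# Binary device libraries: GMLLIB, then the document include path.
device_library_path = remove_duplicates(["GMLLIB"] + DOCUMENT_INCLUDE_PATH)

# Option files: the current directory, GMLLIB, then the document include path.
option_file_path = remove_duplicates(["curdir", "GMLLIB"] + DOCUMENT_INCLUDE_PATH)

print(device_library_path)  # ['GMLLIB', 'curdir', 'GMLINC', 'PATH']
print(option_file_path)     # ['curdir', 'GMLLIB', 'GMLINC', 'PATH']
```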

Interestingly, the filenames given on the command lines to gendev and wgml were not mentioned in any section found in the WGML Reference by searching on "path". Direct investigation finds this information:

Section 14 Running WATCOM Script/GML:

The "file-name" specifies the file containing the source text
and GML tags for the document. If the file type part of the
file name (see "Files" on page 281) is not specified, WATCOM
Script/GML searches for source files with the alternate file
extension followed by the file type of GML. When a file type
is specified, WATCOM Script/GML searches for source files
with that file type.

Section 16 Running WATCOM GENDEV:

The "file-name" specifies the file containing the device,
font and/or driver definitions. If the file type part of the 
file name (see "Files" on page 281) is not specified, WATCOM 
GENDEV searches for source files with the default file type
for device and driver definitions. The font definition file 
type is the default alternate extension.

Exactly where these searches are conducted is not specified. The most likely search path used is the source file path, which is the one used with the :INCLUDE tag.

So, the situation is unclear, and testing is called for, especially since, even with the aid of the README, this is for version 3.3, and we are using version 4.0 of wgml and version 4.1 of gendev.

Actual Search Paths

It might be wondered why it is important that our gendev and wgml duplicate the search paths used by gendev 4.1 and wgml 4.0. The theoretical reason is quite simple: if different files with the same name exist in directories which are listed in different environment variables, then the order of search will determine which of those files is used. Since our wgml is intended to produce the same output as wgml 4.0, it would be helpful if it used the same input files. Whether this applies to the Open Watcom document build system is not, at present, known.

Search Targets

The various documents available suggest that wgml 4.0 searches for eleven different types of files:

  • the source file given on the command line (wgml/source)
  • an option file given on the command line (wgml/opt)
  • a layout file given on the command line (wgml/lay)
  • a source file named in an :IMBED block (wgml/imbed)
  • a source file named in an :INCLUDE block (wgml/include)
  • a source file named in an .AP block (wgml/.ap)
  • a source file named in an .IM block (wgml/.im)
  • the default option file (wgml/default)
  • a device library file (wgml/cop)
  • a binary file named in a :BINCLUDE block (wgml/binary)
  • a graphic file named in a :GRAPHIC block (wgml/graphic)

Several objections might be raised to this list:

  • The tag :IMBED is documented as being the same as :INCLUDE except for being allowed only in the pre-:GDOC section of the document specification. That restriction turns out to be incorrect for wgml 4.0: :IMBED works just fine in the body of the document.
  • .IM is short for "IMBED" (the odd spelling is used in place of "EMBED" because the digraph ".EM" was already taken).
  • Two blocks naming source files (:MAILMERGE and :VALUESET, which are documented as being equivalents) are ignored.

As to the last point, :MAILMERGE and :VALUESET are related to using wgml to produce correspondence, which is not our goal. Indeed, even if some future version of our wgml were to be released, it would probably not include the ability to produce correspondence; many other programs do that.

gendev 4.1 searches for two different types of files:

  • the source file given on the command line (gendev/source)
  • a source file named in an :INCLUDE block (gendev/include)

This may be a bit surprising, since the message list for gendev includes this message:

CL--003 Missing or invalid command filename

but using "FILE" as an option produces the error:

CL--001: Invalid option in command line

and no gendev command line option for a command file is documented in the WGML Reference. The implication is that gendev 4.1 does not use user-supplied option files.

To investigate whether or not gendev 4.1 searches for a file named "default.opt", I first invoked gendev with "( incl", which is documented to cause it to list each file as it is included. The file used had no :INCLUDE statements, but was itself listed when "incl" was given, and not listed when it was not. I then placed the command line into a "default.opt" file in the same directory that gendev 4.1 was invoked in. The results clearly showed that the file was not found, which implies that it is not searched for.

It may also be surprising that gendev 4.1 does not search for binary device libraries or files, since it creates them and creates or rewrites wgmlst.cop (the directory file) and so must surely search for that. However, if gendev 4.1 is invoked in a directory which is not listed in GMLLIB, then this warning is produced:

SN--082: Current disk location and library path do not match

and the library is produced anyway.

So the situation is not that gendev 4.1 does not search for library files; the situation is that gendev 4.1 only looks in the directory it is invoked in -- and that it expects that directory to be listed in GMLLIB. Well, it does if GMLLIB is defined: if GMLLIB does not exist, no error message appears.

Individual Location Tests

Now that the situations to test have been identified, let us start with two tests:

  1. whether or not, for the current directory and each environment variable separately, the file type is found by the program only when it is in the location indicated; and
  2. whether or not, for environment variable path lists written left-to-right in a locale which generally writes left-to-right, the directories are searched left-to-right (LtoR) or right-to-left (RtoL).

               curdir     GMLLIB      GMLINC     PATH
wgml/source    yes        LtoR        LtoR       LtoR
wgml/opt       yes        LtoR        LtoR       LtoR
wgml/lay       yes        LtoR        LtoR       LtoR
wgml/imbed     yes        LtoR        LtoR       LtoR
wgml/include   yes        LtoR        LtoR       LtoR
wgml/.ap       yes        LtoR        LtoR       LtoR
wgml/.im       yes        LtoR        LtoR       LtoR
wgml/default   yes        LtoR        LtoR       LtoR
wgml/cop       no         LtoR        LtoR       LtoR
wgml/binary    yes        LtoR        LtoR       LtoR
wgml/graphic   yes        LtoR        LtoR       LtoR
gendev/source  yes        no          LtoR       no
gendev/include yes        no          LtoR       no

The layout file was found not only when named with the LAYOUT option, but also when named with the .AP and .IM control words and the :IMBED and :INCLUDE tags placed before :GDOC.

The SCRIPT control words, of course, only work when SCRIPT or WSCRIPT is given on the command line or in an option file.

The "no" shown for wgml/cop under the current directory appears to be correct: not defining GMLLIB, or defining it to point at a directory not containing the device library which includes the specified device (whether that directory contains a valid library or not), produces this error message:

IO--008: For the device (or font) 'test':
         The information file for this name cannot be found.
         If the device/font has been defined, the problem may
         be that the DOS SET symbol GMLLIB has not been
         correctly set to point to the device library.

which is consistent with this statement in the WGML Reference:

To locate the library, WATCOM Script/GML and WATCOM GENDEV
must have a list of library directories. This list is defined
with the DOS SET command.

This applies not only to the directory file but to the device, driver and font files as well. When GMLLIB is set to "." or ".\", then the device is found.

The device is also found when GMLLIB does not exist but GMLINC contains the directory of a library whose directory file contains an entry for the defined name of the device. So the current directory is, in fact, skipped, and not checked automatically. In ow\docs\mif\onebook.mif, the value assigned to GMLINC for the help file build (as opposed to PS) starts with ".;", thus ensuring that the current directory is always searched.

The tag :BINCLUDE was tested with a text binary file and worked as described in the WGML Reference. The last line was included whether it ended with a newline character or not.

The tag :GRAPHIC was tested using the PS device and an EPS graphic file created from a .BMP file. It took a while to get a file that GhostView was happy with:

  1. the :BINCLUDE tag had to be commented out; apparently, the file it included had lines so long that they made the PS invalid;
  2. the file EZAMBLE.PS had to be (copied to the test directory and) prepended to the output file; otherwise, GhostScript regarded the file as an EPS file.

The second point, however, only applies to the PS device from the WGML 3.33 Update. When GMLLIB was pointed at ow\docs\gml\syslib, wgml produced a file which GhostView not only recognized as a PS file but which it could convert to a PDF file which Acrobat could display.

The results shown for gendev/source and gendev/include for GMLLIB and PATH are surprising: they suggest that neither GMLLIB nor PATH is searched by gendev for these files.

Location Pair Testing

There are a total of six pairs to test; for easy viewing, the results will be spread over two tables. In these tables, "N/A" will be entered when either (or both) of the members of the pair is not used. When both members of the pair are used, then the member which is searched first will be entered.

               curdir/GMLLIB  curdir/GMLINC  curdir/PATH
wgml/source    curdir         curdir         curdir
wgml/opt       curdir         curdir         curdir
wgml/lay       curdir         curdir         curdir
wgml/imbed     curdir         curdir         curdir
wgml/include   curdir         curdir         curdir
wgml/.ap       curdir         curdir         curdir
wgml/.im       curdir         curdir         curdir
wgml/default   curdir         curdir         curdir
wgml/cop       N/A            N/A            N/A
wgml/binary    curdir         curdir         curdir
wgml/graphic   curdir         curdir         curdir
gendev/source  N/A            curdir         N/A
gendev/include N/A            curdir         N/A

               GMLLIB/GMLINC  GMLLIB/PATH    GMLINC/PATH
wgml/source    GMLINC         GMLLIB         GMLINC
wgml/opt       GMLLIB         GMLLIB         GMLINC
wgml/lay       GMLINC         GMLLIB         GMLINC
wgml/imbed     GMLINC         GMLLIB         GMLINC
wgml/include   GMLINC         GMLLIB         GMLINC
wgml/.ap       GMLINC         GMLLIB         GMLINC
wgml/.im       GMLINC         GMLLIB         GMLINC
wgml/default   GMLLIB         GMLLIB         GMLINC
wgml/cop       GMLLIB         GMLLIB         GMLINC
wgml/binary    GMLINC         GMLLIB         GMLINC
wgml/graphic   GMLINC         GMLLIB         GMLINC
gendev/source  N/A            N/A            N/A
gendev/include N/A            N/A            N/A
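
The pairwise results can be combined into a single ordering by counting how often each location was searched first. The Python sketch below does this for wgml/source and wgml/opt; the names are hypothetical, and the win-counting approach is only valid because the results in the tables are mutually consistent (no location beats one that, in turn, beats it).

```python
# Pairwise results copied from the two tables above: each pair maps to
# the member of the pair that was searched first.
WGML_SOURCE_PAIRS = {
    ("curdir", "GMLLIB"): "curdir",
    ("curdir", "GMLINC"): "curdir",
    ("curdir", "PATH"): "curdir",
    ("GMLLIB", "GMLINC"): "GMLINC",
    ("GMLLIB", "PATH"): "GMLLIB",
    ("GMLINC", "PATH"): "GMLINC",
}

# wgml/opt differs from wgml/source only in the GMLLIB/GMLINC column.
WGML_OPT_PAIRS = dict(WGML_SOURCE_PAIRS)
WGML_OPT_PAIRS[("GMLLIB", "GMLINC")] = "GMLLIB"


def order_from_pairs(pairs):
    """Rank locations by the number of pairwise comparisons each one won."""
    wins = {}
    for (first, second), winner in pairs.items():
        wins.setdefault(first, 0)
        wins.setdefault(second, 0)
        wins[winner] += 1
    return sorted(wins, key=lambda location: wins[location], reverse=True)


print(order_from_pairs(WGML_SOURCE_PAIRS))  # ['curdir', 'GMLINC', 'GMLLIB', 'PATH']
print(order_from_pairs(WGML_OPT_PAIRS))     # ['curdir', 'GMLLIB', 'GMLINC', 'PATH']
```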

Search Paths Observed

This section lists the search paths which are actually used, based on the above tests.

For wgml/source, wgml/lay, wgml/.ap, wgml/.im, wgml/imbed, wgml/include, wgml/binary, and wgml/graphic:

  1. the current directory
  2. any directories listed in the GMLINC environment variable
  3. any directories listed in the GMLLIB environment variable
  4. any directories listed in the PATH environment variable

For wgml/opt and wgml/default:

  1. the current directory
  2. any directories listed in the GMLLIB environment variable
  3. any directories listed in the GMLINC environment variable
  4. any directories listed in the PATH environment variable

For wgml/cop:

  1. any directories listed in the GMLLIB environment variable
  2. any directories listed in the GMLINC environment variable
  3. any directories listed in the PATH environment variable

For gendev/source and gendev/include:

  1. the current directory
  2. any directories listed in the GMLINC environment variable
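
For reference, the observed orders can be written down as a small table and expanded into a concrete list of directories. The following Python sketch is not taken from any wgml or gendev source: the names are hypothetical, the ";" separator is the one used in the GMLINC examples on this page, and skipping empty entries is an assumption. The left-to-right expansion matches the individual location tests.

```python
# Search orders transcribed from the lists above.
SOURCE_ORDER = ["curdir", "GMLINC", "GMLLIB", "PATH"]
OBSERVED_ORDER = {
    target: SOURCE_ORDER
    for target in (
        "wgml/source", "wgml/lay", "wgml/.ap", "wgml/.im",
        "wgml/imbed", "wgml/include", "wgml/binary", "wgml/graphic",
    )
}
OBSERVED_ORDER.update({
    "wgml/opt": ["curdir", "GMLLIB", "GMLINC", "PATH"],
    "wgml/default": ["curdir", "GMLLIB", "GMLINC", "PATH"],
    "wgml/cop": ["GMLLIB", "GMLINC", "PATH"],
    "gendev/source": ["curdir", "GMLINC"],
    "gendev/include": ["curdir", "GMLINC"],
})


def search_directories(target, environment, current_directory="."):
    """Expand a target's search order into a flat list of directories."""
    directories = []
    for location in OBSERVED_ORDER[target]:
        if location == "curdir":
            directories.append(current_directory)
        else:
            # Each variable holds a ';'-separated list, read left to right.
            # Ignoring empty entries is an assumption, not a tested result.
            value = environment.get(location, "")
            directories.extend(part for part in value.split(";") if part)
    return directories


# Hypothetical settings, for illustration only.
environment = {"GMLINC": "..\\docs1;..\\docs2", "GMLLIB": "..\\syslib"}

print(search_directories("wgml/include", environment))
print(search_directories("wgml/cop", environment))
```

Note that "wgml/cop" never yields the current directory unless one of the variables names it explicitly, which is the effect of starting GMLINC with ".;".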

Using Search Paths

Now that the search paths have been identified, how are they used? This turns out to be a bit more complicated than might be thought.

Note: The DOS version of wgml 4.0, at least, imposes a limit on the length of the value of the environment variables. Relative paths can produce a useable setup where absolute paths produce "File Not Found" errors. Of course, using relative paths restricts the directories in which wgml 4.0 can be used to those with respect to which the relative paths make sense.

Finding One File

This is the most common use: the search path is followed until a file whose name matches the filename given is found. This will normally be the first match, and it will hide any other matches in locations further down the path.
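
The hiding effect is easy to demonstrate. In this Python sketch the directory names and the set standing in for the file system are hypothetical; the point is only that reversing the order of the path changes which of two same-named files is used.

```python
def find_first(directories, filename, existing):
    """Return the first directory on the path that holds filename, or None."""
    for directory in directories:
        if (directory, filename) in existing:
            return directory
    return None


# Hypothetical layout: two different files share the name "chapter1.gml".
existing = {("..\\docs1", "chapter1.gml"), ("..\\docs2", "chapter1.gml")}

print(find_first(["..\\docs1", "..\\docs2"], "chapter1.gml", existing))  # ..\docs1
print(find_first(["..\\docs2", "..\\docs1"], "chapter1.gml", existing))  # ..\docs2
```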

Both gendev and wgml allow the use of multiple extensions. The WGML Reference discusses this topic in several locations.

To start with, these lists are given of default extensions:

Section 14.1.3 IBM PC/DOS Specifics:

The following default file types are used by WATCOM Script/GML:
File Type    Usage 
GML          document source files
LAY          layout files created with the :save tag
OPT          command files
VAL          value files specified by the VALUESET command line
             option

Section 16 Running WATCOM GENDEV:

File Type   Definition (IBM PC/DOS)
PCD         default file type for the device and driver definition.
FON         default file type for the font definition.
COP         default file type for the created member name.

Since the mail-merge/correspondence capabilities of wgml are not being recreated (as mentioned above) and the .LAY extension is for files which wgml creates (all layout files in the Open Watcom document build system use extension .GML), these are the default extensions we need to consider:

File Type   Usage
GML         document source files
OPT         command files
PCD         device and driver definition source files
FON         font definition source files
COP         binary device definition files

Using this list, it is possible to make the other statements in the WGML Reference clearer.

The first source file sought is the one given on the command line:

Section 14 Running WATCOM Script/GML:

The "file-name" specifies the file containing the source text
and GML tags for the document. If the file type part of the
file name (see "Files" on page 281) is not specified, WATCOM
Script/GML searches for source files with the alternate file
extension followed by the file type of GML. When a file type
is specified, WATCOM Script/GML searches for source files
with that file type.

which appears to me to be saying that wgml looks for:

  • the specified extension

or

  • the alternate extension
  • the extension .GML

Note that the first option is presumed to apply in all cases below, although it is only stated explicitly here.

Section 16 Running WATCOM GENDEV

The "file-name" specifies the file containing the device, font
and/or driver definitions. If the file type part of the file
name (see "Files" on page 281) is not specified, WATCOM GENDEV
searches for source files with the default file type for device
and driver definitions. The font definition file type is the
default alternate extension.

which appears to me to be saying that gendev looks for:

  • the specified extension

or

  • the extension .PCD
  • the alternate extension (which defaults to .FON)

Although, as shown above, there are at least four ways to include a source file, the only discussions in the WGML Reference are for files named by the :INCLUDE tag:

Section 9.55 INCLUDE:

If the specified file does not have a file type, the default 
document file type is used. For example, if the main document
file is manual.doc, doc is the default document file type. If
the file is not found, the alternate extension supplied on the
command line is used. If the file is still not found, the file
type GML is used.

which appears to me to probably be saying that wgml looks for:

  • the specified extension

or

  • the extension of the main source file
  • the alternate extension
  • the extension .GML

Section 15.6.2 INCLUDE:

The value of the required attribute file is used as the name
of the file to include. The content of the included file is
processed by WATCOM GENDEV as if the data was in the original
file. This tag provides the means whereby a definition may be
specified using a collection of separate files. More than one
definition may be included into one file for processing by
WATCOM GENDEV.

which says nothing to the point, but the situation presumably is that gendev looks for:

  • the specified extension

or

  • the extension .PCD
  • the alternate extension (which defaults to .FON)

and may or may not look for the extension of the main source file before using .PCD.

The attribute MEMBER_NAME occurs in the device, driver, and font definitions. This text is the first encountered, and is for the font definition. The other two, in Sections 15.9.1.2 & 15.8.1.2, are identical except that "driver" or "device" is used where "font" appears here.

Section 15.8.1.2 MEMBER_NAME Attribute:

The member_name attribute specifies the member name of the
font definition. The value of the member name attribute must
be a valid file name. The member name must be unique among
the member names of the font, driver and device definitions.
When the GENDEV program processes the font block, it places
the font definition in a file with the specified member name
as the file name. If the file extension part of the file name
is not specified, the GENDEV program will supply a default
extension. Refer to "Running WATCOM GENDEV" on page 277 for
more information.

which appears to me to be saying that gendev will use:

  • the extension given, if one is given

or

  • the extension .COP

Nothing was found specifying the extension used by wgml in searching for these files. It would, however, be reasonable to suppose that wgml does this:

  • the extension given, if one is given

or

  • the extension .COP

It is, of course, possible that wgml, if given a member name which has an extension other than .COP which it cannot find as such, will replace the given extension with .COP and try again.

The command-line option ALTEXTENSION, it appears, works in a straightforward and obvious manner:

Section 14.3.1 ALTEXTension:

When a GML source file is specified on the WGML command line,
or as an include file, the file type can be omitted. If a
source file with the default file type cannot be found, WATCOM
Script/GML will search for a file with the file type supplied
by the alternate extension option.

Section 16.1.1 ALTEXTension:

When a GENDEV source file is specified on the GENDEV command
line, or as an include file, the file type may be omitted. A
default file type will be supplied by WATCOM GENDEV. If the
source file cannot be found with the default file type, the
alternate extension option supplies a second file type to
find with the source file.

Depth-First or Breadth-First Searching?

Since I am using terms that usually describe methods of traversing trees, and since their application here may not be entirely clear, I will start by explaining how I am using them.

Suppose you have a file, testdoc.doc, which includes a line ":INCLUDE file='chapter1'". According to the material quoted above, wgml should look for these files:

  • chapter1.doc
  • chapter1.gml

Now suppose you have two different source file directories, docs1 and docs2, organized so that "..\" will access them from the directory in which wgml is running. If you set GMLINC to "..\docs1;..\docs2" and you place the file chapter1.doc in the directory docs2 and the file chapter1.gml in the directory docs1, will wgml include chapter1.doc or chapter1.gml? If it includes chapter1.doc, that is, if it searches the entire path using ".doc" before trying ".gml", then that is what I am calling "depth-first". If it includes chapter1.gml, that is, if it searches each directory on the path for each extension before moving on to the next directory, then that is what I am calling "breadth-first".
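
The two strategies differ only in which loop is on the outside. The Python sketch below models the docs1/docs2 example with a set of (directory, file name) pairs in place of a real file system; the function names are my own.

```python
def depth_first(directories, basename, extensions, existing):
    """Try one extension along the whole path before trying the next."""
    for extension in extensions:
        for directory in directories:
            candidate = (directory, basename + extension)
            if candidate in existing:
                return candidate
    return None


def breadth_first(directories, basename, extensions, existing):
    """Try every extension in one directory before moving to the next."""
    for directory in directories:
        for extension in extensions:
            candidate = (directory, basename + extension)
            if candidate in existing:
                return candidate
    return None


# The layout described above: chapter1.doc in docs2, chapter1.gml in docs1.
existing = {("..\\docs2", "chapter1.doc"), ("..\\docs1", "chapter1.gml")}
directories = [".", "..\\docs1", "..\\docs2"]
extensions = [".doc", ".gml"]

# Depth-first finds chapter1.doc in docs2; breadth-first finds chapter1.gml in docs1.
print(depth_first(directories, "chapter1", extensions, existing))
print(breadth_first(directories, "chapter1", extensions, existing))
```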

Initially, I tried to do this with a single table: this failed because the situation is too complicated to be summarized so simply. So now I will discuss various sets of search targets with similar search patterns in separate sections.

wgml Option Files

This section discusses two search targets:

  • wgml/default
  • wgml/opt

For these targets, wgml only uses one extension in any given instance.

For wgml/default, the file sought, "default.opt", is searched for automatically. There appears to be no way to specify a different name, and so the file name is almost certainly hard-coded into wgml.

For wgml/opt, the definitive tests showed that:

  • if the extension is specified, then that is the only extension used
  • if the extension is not specified, then .OPT is the only extension used

In particular, for this command line

wgml testdoc.doc ( device test file testopt altext txt

wgml searches for "testopt.opt" only. Neither the extension of the source file given on the command line nor any alternate extension given on the command line is used with the search target wgml/opt.

Additional testing confirms that, even with this command line

wgml testdoc.doc ( device test file testopt.txt

if "file diffopt" appears in testopt.txt, then only diffopt.opt will be searched for. Prior usage of other extensions has no effect on the current search.
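
The rule for option files is simple enough to state in a few lines. This Python sketch uses a hypothetical function name and the standard ntpath module (so that DOS-style names are split the same way on any platform); it only restates the test results above.

```python
import ntpath


def option_file_name(name):
    """Return the single file name sought for a FILE option value."""
    extension = ntpath.splitext(name)[1]
    # A given extension is used as-is; otherwise only .opt is tried.
    # Neither the source file's extension nor ALTEXT plays any part.
    return name if extension else name + ".opt"


print(option_file_name("testopt"))      # testopt.opt
print(option_file_name("testopt.txt"))  # testopt.txt
print(option_file_name("diffopt"))      # diffopt.opt
```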

wgml Source Files

This section discusses these search targets:

  • wgml/source
  • wgml/lay
  • wgml/imbed
  • wgml/include
  • wgml/.ap
  • wgml/.im
  • wgml/binary
  • wgml/graphic

Starting with wgml/source, the search pattern derived from the documentation is:

  • the specified extension

or

  • the alternate extension
  • the extension .GML

Given that the three files "testdoc.doc", "testdoc.txt", and "testdoc.gml" exist in the search path, then the command line

wgml testdoc.doc ( device test altext txt

causes the file "testdoc.doc" to always be processed. If "testdoc.doc" does not exist in the search path, then wgml reports that "testdoc.doc" cannot be found and does not process any "testdoc" file (it does, however, invoke the device functions in the :INIT block with "start" as the value of attribute place). This confirms that, if an extension is given, wgml searches for the file using only that extension.

With both file "testdoc.txt" and file "testdoc.gml" in the search path, then, when the command line

wgml testdoc ( device test altext txt

is invoked, "testdoc.gml" is processed when it occurs first in the search path and "testdoc.txt" is processed when it occurs first in the search path.

If both files are in the same directory (and so neither appears first in the search path), then "testdoc.gml" is processed.

So, the actual search pattern is:

  • the specified extension (only)

or

  • the extension .GML is used first
  • the alternate extension is used if .GML did not work
  • each directory is checked for both extensions before the next directory is checked

which is not what the documentation described.
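
As a sketch of the observed pattern for wgml/source, the Python below tries the extensions directory by directory. The function names are hypothetical, and a set of (directory, file name) pairs stands in for the file system.

```python
import ntpath


def source_extensions(name, altext=None):
    """Return the base name and the extensions tried, in order."""
    root, extension = ntpath.splitext(name)
    if extension:
        return root, [extension]        # a given extension is the only one used
    extensions = [".gml"]               # .GML is tried first
    if altext:
        extensions.append("." + altext)  # then the alternate extension
    return root, extensions


def find_source(name, directories, existing, altext=None):
    """Check each directory for every extension before moving on."""
    root, extensions = source_extensions(name, altext)
    for directory in directories:
        for extension in extensions:
            candidate = (directory, root + extension)
            if candidate in existing:
                return candidate
    return None


# Hypothetical layout: the .txt file comes earlier in the path than the .gml file.
existing = {("a", "testdoc.txt"), ("b", "testdoc.gml")}
print(find_source("testdoc", ["a", "b"], existing, altext="txt"))
```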

For the various included source files, the search pattern derived from the documentation (for :INCLUDE) is:

  • the specified extension

or

  • the extension of the main source file
  • the alternate extension
  • the extension .GML

The tests will be described in terms of the LAYOUT command line option, but each test was also performed for .AP, .IM, :IMBED, and :INCLUDE with both layout files and source files and :BINCLUDE and :GRAPHIC with binary or graphic files.

Given that the file "testdoc.gml" exists and that three files "testlay.doc", "testlay.txt", and "testlay.gml" exist in the search path, then the command line

wgml testdoc ( device test altext txt layout testlay.doc

causes the file "testlay.doc" to always be processed. If "testlay.doc" does not exist in the search path, then wgml reports that "testlay.doc" cannot be found and does not process the "testdoc.gml" file it found (it does, however, invoke the device functions in the :INIT block with "start" as the value of attribute place). This confirms that, if an extension is given with a layout file, wgml searches for the file using only that extension.

In some cases, when the document is found but an included source file is not, then the output of the functions in the :INIT block with "document" as the value of attribute place appears in the output file, but none of the document itself. In other cases the document is partially processed.

When file "testdoc.doc" and the files "testlay.doc", "testlay.txt", and "testlay.gml" are all in the search path, then the command line

wgml testdoc.doc ( device test altext txt layout testlay

causes whichever of "testlay.doc", "testlay.gml", or "testlay.txt" occurs first in the search path to be processed. If "testlay.gml" and "testlay.txt" are in the same directory, but "testlay.doc" is not in that directory or in any directory ahead of it in the search path, then "testlay.txt" is processed.

When file "testdoc.gml" and the files "testlay.txt" and "testlay.gml" are all in the search path, then the command line

wgml testdoc ( device test altext txt layout testlay

causes whichever of "testlay.gml" or "testlay.txt" occurs first in the search path to be processed. If "testlay.gml" and "testlay.txt" are in the same directory, then "testlay.gml" is processed.

So, the actual situation for the included files is:

  • the specified extension (only)

or

  • the extension of the main source file, if one is given
  • the alternate extension is used first
  • the extension .GML if the alternate extension did not work
  • each directory is checked for all three extensions before the next directory is checked

or (if the main source file has no extension)

  • the extension .GML is used first
  • the alternate extension is used if .GML did not work
  • each directory is checked for both extensions before the next directory is checked

which is far more complicated than the documentation suggests.
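The observed behavior can be summarized in a short sketch. This is only an illustration of the pattern described above, not code from wgml: the function names, the use of None for "no extension", and the injectable exists callable are all assumptions made for the example.

```python
import os

def wgml_include_extensions(given_ext, main_ext, alt_ext):
    """Extensions to try, in order, for an included wgml file.

    given_ext: extension written on the included filename, or None
    main_ext:  extension given with the main source file, or None
    alt_ext:   value of the alternate extension option, or None
    (All three parameter names are hypothetical.)
    """
    if given_ext:
        # A specified extension is the only one tried.
        return [given_ext]
    if main_ext:
        order = [main_ext, alt_ext, ".gml"]
    else:
        order = [".gml", alt_ext]
    result = []
    for ext in order:
        # Skip missing and repeated entries.
        if ext and ext not in result:
            result.append(ext)
    return result

def find_file(stem, extensions, directories, exists=os.path.isfile):
    """Check every extension in one directory before moving to the next."""
    for directory in directories:
        for ext in extensions:
            candidate = os.path.join(directory, stem + ext)
            if exists(candidate):
                return candidate
    return None
```

With a main source file "testdoc.doc" and an alternate extension of "txt", the list is .doc, .txt, .gml, so a directory holding both "testlay.gml" and "testlay.txt" yields "testlay.txt", matching the second test above.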

gendev Source Files

This section discusses these search targets:

  • gendev/source
  • gendev/include

For gendev/source, we saw above that the manual is fairly clear:

  • the specified extension

or

  • the extension .PCD
  • the alternate extension (which defaults to .FON)

If the path contains a file "genall.doc", and this command line is executed:

gendev genall.doc

then "genall.doc" is found and processed. If "genall.doc" does not exist in the search path, but "genall.pcd" and "genall.fon" both do, gendev reports that "genall.doc" cannot be found and generates no files. This confirms that, if an extension is provided, only that extension is used.

If files "genall.pcd" and "genall.fon" exist in the search path and this command line is executed:

gendev genall

then it will process whichever file is found first in the path. If they are placed in the same directory and nowhere else in the path, then "genall.pcd" is processed.

If "genall.pcd", "genall.fon", and "genall.txt" exist in the search path and this command line is executed:

gendev genall ( altext txt

then gendev processes "genall.pcd" or "genall.txt", whichever it encounters first in the search path; if both are in the same directory and neither occurs earlier in the path, "genall.pcd" is processed. When only "genall.fon" is available, however, gendev does not use it: it states that it cannot find "genall.txt".

So, for gendev/source, the actual search pattern is:

  • the specified extension (only)

or

  • the extension ".PCD" is used first
  • the alternate extension is used if ".PCD" did not work
  • if no alternate extension was given on the command line, then ".FON" is used if ".PCD" did not work
  • each directory is checked for both extensions before the next directory is checked

which is the documented behavior.
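As a rough illustration of the gendev/source pattern, the following sketch builds the extension list and walks a search path. The function names and the representation of the file system as a set of (directory, filename) pairs are assumptions made for the example; nothing here is taken from gendev itself.

```python
def gendev_source_extensions(given_ext=None, alt_ext=None):
    """Extensions gendev appears to try for its command-line source file."""
    if given_ext:
        # A specified extension is the only one tried.
        return [given_ext]
    # ".fon" is only the default alternate extension; once an alternate
    # extension is given on the command line, ".fon" is no longer tried.
    return [".pcd", alt_ext or ".fon"]

def find_in_path(stem, extensions, directories, present):
    """Return the first (directory, filename) pair found, else None.

    present is a set of (directory, filename) pairs standing in for the
    file system (a hypothetical stand-in used only for this sketch).
    """
    for directory in directories:
        for ext in extensions:
            candidate = (directory, stem + ext)
            if candidate in present:
                return candidate
    return None
```

For example, with only "genall.fon" present and "altext txt" in effect, find_in_path returns None, which corresponds to gendev reporting that it cannot find "genall.txt".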

For gendev/include, there is no search pattern given in the manual; the discussion above simply re-uses the pattern for gendev/source.

If files "testinc.pcd" and "testinc.fon" exist and are used with #INCLUDE in a source file, then they are included. If they do not exist, then gendev reports that the file requested cannot be found: it does not include "testinc.fon" if it is told to look for "testinc.pcd" or "testinc.pcd" if it is told to look for "testinc.fon".

If the #INCLUDE statement merely shows "testinc", then "testinc.pcd" is included whether "genall.pcd" or "genall.fon" or, for that matter, "genall.txt" when ALTEXTENSION is used with value "txt", is processed. A file named "testinc.fon" or "testinc.txt" is used, if available, when no "testinc.pcd" file exists.

Specifying "genall.txt" on the command line has a very interesting effect: included files ending in ".PCD" are not found, only those ending in ".TXT" or ".FON".

So, for gendev/include, the search pattern is:

  • the specified extension (only)

or

  • the extension of the source file on the command line is used first
  • the extension ".PCD" is used if no extension was used with the source file on the command line
  • the alternate extension is used if ".PCD" did not work
  • if no alternate extension was given on the command line, then ".FON" is used if the first extension tried did not work
  • each directory is checked for both extensions before the next directory is checked

which is somewhat unexpected.
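The difference from gendev/source lies only in how the first extension is chosen, which a small sketch makes explicit. The function and parameter names are hypothetical, and the handling of the case where both extensions coincide is an assumption, since that case was not tested.

```python
def gendev_include_extensions(include_ext=None, cmdline_ext=None, alt_ext=None):
    """Extensions gendev appears to try for an included file.

    include_ext: extension written on the included filename, or None
    cmdline_ext: extension given with the source file on the command line
    alt_ext:     alternate extension from the command line, or None
    """
    if include_ext:
        # A specified extension is the only one tried.
        return [include_ext]
    first = cmdline_ext or ".pcd"
    second = alt_ext or ".fon"
    # Assumption: a repeated extension is tried only once.
    return [first] if first == second else [first, second]
```

Calling gendev_include_extensions(cmdline_ext=".txt") gives .txt followed by .fon, which is why included files ending in ".PCD" are not found when "genall.txt" is named on the command line.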

wgml Binary Device Library Files

wgml is given a defined name, not a filename, for the target device, so the first step is to find the device library whose wgmlst.cop file has an entry for that defined name, as discussed in Finding Device Libraries. The member name can then be extracted for use in locating the binary file encoding the :DEVICE block for the target device.

This member name may or may not include an extension; if it does not, then ".COP" will be used as the extension by gendev when it creates the file and so must be used by wgml when it searches for it, presumably using the search path determined by actual test and summarized in Search Paths Observed. It is also reasonable to test whether, when the member name includes an extension and the only file found has that name but a different extension, a search using ".COP" is done before giving up.

The basic tool in testing this was a pair of binary libraries, "testlib1" and "testlib2", set up so that "testlib1" contained a device file "test.doc" while "testlib2" contained a device file "test.pcd", both of which had "test" as their defined name. As might be expected, whichever of "testlib1" and "testlib2" appeared first in the search path was used.

When any binary file, in whichever directory was listed first, was made inaccessible (by renaming, for easy reversibility), the same error (IO-008) occurred as is shown in Individual Location Tests in the context of searching the current directory for the binary device library files. If both "test.doc" and "test.cop" are in "testlib1" and "test.doc" is made inaccessible, then the error occurs: "test.cop" is not used.

The implication is clear: wgml only actually searches for the first directory file (wgmlst.cop) in the search path containing an entry for the defined name it is given. It looks for the member name only in the same directory, and only for the extension given (if one is given) or the extension ".COP", but not both.

Finding Device Libraries

The search path used to convert a defined name to a member name is used, not to return the first wgmlst.cop file it finds, but rather to return the first wgmlst.cop file with an entry for the defined name. Once the defined name is found in a wgmlst.cop file, the search terminates and any additional wgmlst.cop files containing the defined name are never found.

Further, as shown in wgml Binary Device Library Files, a file with the member name (with .COP appended if it has no extension) must exist in that same directory, or wgml will report that it could not be found.
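A minimal sketch of this two-step lookup follows. The binary format of wgmlst.cop is not covered on this page, so the sketch assumes a hypothetical read_directory callable that returns a mapping from defined names to member names for a library directory (or None if the directory has no wgmlst.cop file); all names are illustrative only.

```python
import os

def find_device_file(defined_name, library_dirs, read_directory,
                     exists=os.path.isfile):
    """Locate the binary device file for a defined name.

    read_directory(lib) is a hypothetical helper returning a dict of
    {defined name: member name} for lib's wgmlst.cop, or None.
    """
    for lib in library_dirs:
        entries = read_directory(lib)
        if not entries or defined_name not in entries:
            continue  # keep looking for a directory file with an entry
        member = entries[defined_name]
        if not os.path.splitext(member)[1]:
            member += ".cop"  # default extension when none is recorded
        path = os.path.join(lib, member)
        # The search stops at the first matching entry: later libraries
        # are never consulted, and no other extension is tried.
        return path if exists(path) else None
    return None
```

In the "testlib1"/"testlib2" test described above, renaming "test.doc" in "testlib1" makes this function return None even though "testlib2" holds a usable "test.pcd" and "testlib1" holds "test.cop"; this corresponds to the IO-008 error.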

Specified Paths

The bulk of this page deals with simple filenames: filenames which may have an extension but which are given with no path information at all. This section deals with filenames that include path information. As such, it does not apply to binary device library files, since those files are sought by defined name and not (directly) by filename.

There are two types of path information that may be prepended to the filename:

  • an absolute path; and
  • a relative path.

The WGML Reference does not address this issue, nor does the README file produceable from the WGML 3.33 Update. This leaves testing as the only source of information.

These are the file types discussed here whose filenames can include path information:

  • the source file given on the command line (wgml/source)
  • an option file given on the command line (wgml/opt)
  • a layout file given on the command line (wgml/lay)
  • a source file named in an :IMBED block (wgml/imbed)
  • a source file named in an :INCLUDE block (wgml/include)
  • a source file named in an .AP block (wgml/.ap)
  • a source file named in an .IM block (wgml/.im)
  • the default option file (wgml/default)
  • a binary file named in a :BINCLUDE block (wgml/binary)
  • a graphic file named in a :GRAPHIC block (wgml/graphic)
  • the source file given on the command line (gendev/source)
  • a source file named in an :INCLUDE block (gendev/include)

and they can be grouped into three categories:

  • wgml Option Files
    • wgml/default
    • wgml/opt
  • wgml Source Files
    • wgml/source
    • wgml/lay
    • wgml/imbed
    • wgml/include
    • wgml/.ap
    • wgml/.im
    • wgml/binary
    • wgml/graphic
  • gendev Source Files
    • gendev/source
    • gendev/include

Preliminary results for relative paths are suggestive:

  • If the filename includes path information, that path information is automatically prepended to simple filenames included in the file. It does this even if the filename has its own prepended information: that is, the original path replaces the given path.
  • There is no third attempt, for wgml Source Files, to find the file using ".gml" if the filename has a different extension and the alternate extension either does not exist or is not ".gml".

It appears that gendev searches the normal paths if the file is not found: the filename reported as not found includes path information from the PATH environment variable. wgml, by contrast, reports the filename as entered when a different path is provided and the file cannot be found, and opens it if it does in fact exist there. This suggests that, at least for relative paths, the search order is "enhanced" to include the original file's path information first, and then the given path information. No attempt was made to pursue this further.

Preliminary results for absolute paths are also interesting:

  • The absolute path is prepended to simple filenames included in the file. If the filename has a path, whether relative or absolute, that is used instead.
  • There is no indication that either the default extension or the third extension ".gml" is used for wgml Source Files if the filename includes an extension. If it does not include an extension, then the alternate extension, if one exists, is used.

Design and Coding Implications

It is, of course, hardly possible at present to be certain about the meaning of the information determined on this page. This is a preliminary attempt to set down some general ideas.

It is not possible to both implement the search procedures as documented and to replicate the behavior of wgml 4.0 and gendev 4.1, for these are not consistent with each other. It might be helpful to build a model of how wgml was originally used. This model is, of course, speculative.

If the non-PC sections of the WGML Reference and such documents as script-tso.txt are considered, then it becomes apparent that, originally, wgml and gendev were intended to be used on IBM VM/CMS and DEC VAX/VMS computers. In the IBM VM/CMS environment, GMLLIB was not an environment variable, but the name of something called a "MACLIB", or, alternately, was created with the "MACLIB" command. The same applies to the DEC VAX/VMS environment, except that the command used was "LIBRARY". The net effect was that there could only be one device library, and its name was GMLLIB.

When wgml and gendev were ported to the PC, GMLLIB became an environment variable giving the directory of the device library. This immediately created the possibility of having more than one device library, since an environment variable can list several directories.

The environment variable GMLINC is only mentioned in the context of PCs: apparently, it did not exist in wgml or gendev as used in the IBM VM/CMS or DEC VAX/VMS environment. The environment variable PATH, of course, is very common on PC systems, at least the non-Linux systems which are currently fully supported by Open Watcom.

Now, it occurs to me that the use of GMLLIB and GMLINC by gendev can be taken as indicative of their originally intended use: GMLLIB was intended to point to device libraries (only), and GMLINC was intended to point to source code (only). Clearly the search patterns shown to be used by wgml no longer reflect this division: both are checked (the order varies), and then PATH is checked as well. I suggest that this reflects a different understanding of GMLLIB and GMLINC: that, in fact, GMLLIB is intended for the centralized system-wide device library which ordinary users cannot alter, and GMLINC was intended for local files, whether source files or local device libraries.

That GMLLIB is searched first for device libraries not only preserves the precedence of the centralized library but, more to the point, ensures that local device libraries must use different defined names. This prevents unpleasant surprises in which a local library redefines a device under a defined name already used in the centralized library, while preserving the ability of local users to create and use modified devices if they feel the need to.

This eventually led to GMLLIB being searched (after GMLINC) for source files, and to PATH being searched in, presumably, a last, desperate effort to find the desired device library or file before giving up and disappointing the user.

The search paths found to be used by wgml for option files and for source files are the same as those documented, so there really is no reason to implement anything else in these cases. The search path found to be used by gendev for source files is not the same as the documented path, but can be regarded as a shortened form of it. Since gendev, unlike wgml, is used only occasionally, either search path would do.

The search procedure for binary device libraries is so different from the normal search procedure that it has to be implemented separately. Since it preserves the idea of a system-provided library, it should be implemented to work the way it works in wgml 4.0.

The issues of breadth versus depth and use of extensions turned out to be so easy and straightforward to implement that that is what was done. Our gendev and wgml should behave exactly as gendev 4.1 and wgml 4.0 do when searching for filenames that do not include any path information.

The issue of filenames which do contain path information is currently handled quite simply: it is treated as an error. If such filenames turn out to require support, additional testing will be needed to ensure that the behavior of gendev 4.1 and wgml 4.0 is completely understood. Based on what has already been done, this should work:

  1. Create a global variable def_path.
  2. Initialize def_path for both option and document specification files in wgml and source files in gendev if a relative path is used.
  3. If def_path is not empty, and the filename sought does not have an absolute path, search the def_path first and then any relative path given as part of the filename.
  4. In terms of the existing code, this can be done by creating directory_list object(s) containing the path(s) to be used, either instead of (wgml) or before (gendev) the normal search pattern.
  5. If absolute paths are always followed, then identifying filenames incorporating them and searching only that path for only that filename separately from the rest of the code might be an easy simplification.
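The steps above might be sketched as follows. This is speculative in the same way the steps are: the function name, the tool parameter, and the decision to treat def_path and the filename's own relative path as two separate directories (rather than joining them) are assumptions, and directory_list is modelled here as a plain Python list.

```python
import os

def build_search_dirs(filename, def_path, normal_dirs, tool):
    """Return (directories to search, simple filename).

    def_path:    path taken from the including file, or "" if none
    normal_dirs: the normal search pattern for this kind of file
    tool:        "wgml" or "gendev" (hypothetical selector)
    """
    head, tail = os.path.split(filename)
    if os.path.isabs(filename):
        # Step 5: an absolute path is searched alone.
        return [head], tail
    dirs = []
    if def_path:
        dirs.append(def_path)      # step 3: def_path first
    if head:
        dirs.append(head)          # then the relative path as given
    if tool == "gendev" or not dirs:
        # Step 4: gendev continues with the normal pattern; wgml uses
        # it only when no path information applies at all.
        dirs.extend(normal_dirs)
    return dirs, tail
```

The resulting list can then be handed to whatever routine already walks a directory list, so the extension handling described earlier on this page would not need to change.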