Directory File Format



Why This Page Is Needed

Initial investigation of the binary device library directory file format produced a very simple result: a length field of either two or four bytes followed by a set of item-and-file name pairs.

The first version of the research program cfparse.exe immediately showed that this format did not apply to the file openwatcom\docs\gml\syslib\wgmlst.cop.

The format, if that is the word for it, of this file required a fair amount of research to puzzle out. And, having done the work, I can hardly avoid documenting the results!

Is There A Directory File Format?

This may seem to be an odd question, but consider these facts discovered about the file openwatcom\docs\gml\syslib\wgmlst.cop:

1. The item count is 0x1fb, or 507, yet it contains only 406 entries.

2. The first 205 entries look like this:

01 02 
07 helpdrv 

3. The next 200 entries are arranged in 100 pairs that look like this:

01 00 
01 04 
0C vintage12iso
01 00 
06 XVI12O
01 04 

01 04 
0D vintage12biso 
07 XVI12BO 

4. The final entry looks like this:

01 00 
01 04 
19 courier12medium-ecma-94-i
01 00 
01 04 

5. The file then ends with enough null characters to make its length an exact multiple of 16.
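The arithmetic behind these observations can be checked with a few lines of Python. This is only an illustrative sketch: the counts are the ones quoted above, while the helper name padding_needed and the sample length of 1000 bytes are made up for the example and are not taken from wgmlst.cop.

```python
# Figures quoted in the list above for wgmlst.cop.
declared_count = 0x1FB        # item count stored in the file
observed = 205 + 200 + 1      # compact entries + paired entries + final entry


def padding_needed(length: int, boundary: int = 16) -> int:
    """Null bytes needed to bring length up to a multiple of boundary."""
    return (-length) % boundary


print(declared_count, observed, declared_count - observed)  # 507 406 101
# 1000 is a hypothetical unpadded length, used only to show the calculation.
print(padding_needed(1000))  # 8
```

The count stored in the file therefore exceeds the number of entries actually present by 101.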

This is a format? To me, it seems a bit lacking in regularity. Still, as shown in The Directory File, it can be documented and, as cfparse.exe and copparse.exe show, it can be parsed.

Format Decision Notes

This section records the decisions I made about how to describe and how to parse directory files.

The Compact Format

The original item format is renamed "the compact format" and referred to as "a compact entry".
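As a minimal sketch of how a compact entry might be read (this is not the code used by cfparse.exe), the Python below assumes what the byte dumps on this page suggest: a 16-bit little-endian type indicator, followed by an item name and a file name, each preceded by a one-byte length. The function names read_name and read_compact_entry are hypothetical, and the ASCII decoding is an assumption.

```python
import struct


def read_name(buf: bytes, pos: int):
    """Read a one-byte length followed by that many ASCII characters."""
    length = buf[pos]
    end = pos + 1 + length
    if end > len(buf):
        raise ValueError("name runs past the end of the buffer")
    return buf[pos + 1:end].decode("ascii"), end


def read_compact_entry(buf: bytes, pos: int):
    """Read a type indicator, an item name and a file name (assumed layout)."""
    (type_indicator,) = struct.unpack_from("<H", buf, pos)
    item_name, pos = read_name(buf, pos + 2)
    file_name, pos = read_name(buf, pos)
    return (type_indicator, item_name, file_name), pos


# Bytes of the compact entry "01 02 / 08 x2700drv / 08 X2700DRV" from this page.
sample = bytes([0x01, 0x02]) + b"\x08x2700drv" + b"\x08X2700DRV"
entry, next_pos = read_compact_entry(sample, 0)
print(hex(entry[0]), entry[1], entry[2], next_pos)  # 0x201 x2700drv X2700DRV 20
```

Returning the next position along with the entry lets a caller walk a buffer entry by entry without tracking lengths separately.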

The Extended Format

The item format used for the last entry in the file is renamed "the extended format" and referred to as "an extended entry".

When an extended entry is followed by a compact entry, the final element in the extended entry is identical to the first element of the following compact entry:

01 00  
01 04 
06 boldps
01 00 
01 02

01 02
08 x2700drv
08 X2700DRV

This might suggest that the final element in the extended entry is intended as a "preview" of the type of the following compact entry. However, since the file ends with an extended entry (with no following compact entry), it should be clear that no action anticipating a following entry can safely be taken. That is why these entries are shown as they are and not formatted or treated like this:

01 00  
01 04 
06 boldps
01 00 

01 02
01 02
08 x2700drv
08 X2700DRV

Meta-Type Indicators

The "0x0001" which invariably begins an extended entry can be considered a meta-type indicator, indicating the type of entry rather than (as the type indicators "0x101", "0x201" and "0x401" do) the type of item/file contained in the entry.

When I identified the use of "0x0001", I naturally looked for a value that indicated the start of the initial compact entries. This led to this pair of meta-type indicators:

0x0000 -- the following entries are compact entries until further notice
0x0001 -- the following entry is an extended entry

However, further reflection showed that the "0x0000" was, in fact, the third and fourth bytes of the four-byte count of entries used by the version 4.1 binary device library directory file. This leaves "0x0001" as the only meta-type indicator.
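The confusion is easy to reproduce. The sketch below assumes the count is stored in little-endian byte order, which is what the byte dumps on this page imply, and reads the same four bytes once as a 32-bit count and once as two 16-bit words. The names META_EXTENDED and starts_extended_entry are hypothetical.

```python
import struct

# Assumed little-endian encoding of the four-byte count 0x1fb.
header = bytes([0xFB, 0x01, 0x00, 0x00])

(count32,) = struct.unpack("<I", header)              # read as one 32-bit count
low_word, high_word = struct.unpack("<HH", header)    # read as two 16-bit words

print(count32, hex(low_word), hex(high_word))  # 507 0x1fb 0x0

META_EXTENDED = 0x0001  # the only meta-type indicator that remains


def starts_extended_entry(word: int) -> bool:
    """True if a 16-bit word is the meta-type indicator for an extended entry."""
    return word == META_EXTENDED
```

Read as two 16-bit values, the second word of the count is exactly the "0x0000" that first looked like a meta-type indicator.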

A History of Damage?

I noted above that this file contains fewer entries than the count would imply. The actual library (openwatcom\docs\gml\syslib) contains 44 binary device files (excluding wgmlst.cop). Most entries, then, do not refer to actual files.

Comparing the filenames generated by the preliminary form of cfparse.exe (only those preceding the first extended entry, which form the "first list") with the filenames generated by the first released version (all of them, which form the "second list") is interesting. I imported the names into an Open Office spreadsheet, sorted them into alphabetical order, and compared the lists. The results are:

  1. For the first 104 entries, the lists are identical.
  2. Each of the next 100 entries in the first list corresponds, with one exception, to three entries in the second list, the first two of which are identical to the entry in the first list.
  3. The exception, PSSHADE, appears once on both lists, and so can be grouped with the initial 104 entries.
  4. The final entry on the first list matches two identical entries (only) on the second list.

Here is an example of the second set of entries described above:

First list    Second list
PSPI          PSPI
              PSPI
              PSPR


This pattern can be explained as follows: the entry, PSPI in this case, occurs in both the initial section (before the first 0x0001) and the extended section (after the first 0x0001). I did not check whether its second occurrence is in an extended entry or a compact one. The extended section also contains a related entry, PSPR, which is not in the initial section. The final entry in the file clearly follows the same pattern, but with the related entry absent.
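I did the comparison in a spreadsheet, but the same check can be sketched in Python. The two short lists below are illustrative stand-ins built from the names mentioned on this page, not the real lists, and the function name compare_lists is hypothetical.

```python
from collections import Counter

# Illustrative stand-ins for the two lists, using names mentioned on this page.
first_list = ["PSPI", "PSSHADE"]
second_list = ["PSPI", "PSPI", "PSPR", "PSSHADE"]


def compare_lists(first, second):
    """Return names repeated in the second list and names found only there."""
    a, b = Counter(first), Counter(second)
    repeated = sorted(name for name in a if b[name] > a[name])
    only_second = sorted(name for name in b if name not in a)
    return repeated, only_second


print(compare_lists(first_list, second_list))  # (['PSPI'], ['PSPR'])
```

A name that occurs once on both lists, such as PSSHADE, shows up in neither result.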

Tests with gendev show that it is not possible to insert the same item twice. At best, the binary device file for the device, driver or font is regenerated. At worst, nothing happens at all. This file is not in a natural state for a binary device library directory file and has almost certainly suffered damage since it was first created.

Are The Extended Entries Used?

It might be wondered whether the extended entries are even used.

I have not fully tested this; these facts are certain, however:

  1. The only wgmlst.cop file that wgml can possibly access which contains a reference to psdrv is openwatcom\docs\gml\syslib\wgmlst.cop.
  2. The only reference to psdrv in openwatcom\docs\gml\syslib\wgmlst.cop is in the extended-format section (albeit in compact format).
  3. PostScript files are, nonetheless, produced when desired.

It is also clear from testing wgml that it will not look for a binary device file on its own. So, unless there is a wgml option that equates the driver "psdrv" with the file "PSDRV.COP", we must infer that wgml parses all the entries in this file, wherever they are located, although it may be skipping those in the extended format and using only those in the compact format.

Alternative Solution

The obvious alternative to accommodating this "format" in our code is to replace openwatcom\docs\gml\syslib\wgmlst.cop with a file using only compact entries and listing only the binary device files actually present.

This would not be particularly hard. Since only wgmlst.cop is to be replaced, it really would not matter whether the source files (.PCD, .FON) generated binary device files identical to those already present. What would have to be checked, however, is that all the device, driver and font defined names and member names are the same as those used for the existing binary device files.

Another consideration is the nature of Perforce: so far as I can tell, the only way to get Perforce to recognize the new file as a new version of the old file is to edit it. Unfortunately, we do not have a binary device library directory file editor -- and creating one would be a diversion from the initial goals of this project. Eventually, of course, when we have replaced wgml and gendev, a program using gendev to dynamically edit directory files could be created. But that will not be happening for a long time to come.

These considerations are what convinced me that examination and, if possible, rectification of the Open Watcom document build system is a necessary part of recreating wgml and gendev. In addition to a rebuilt wgmlst.cop, each binary device file should have a source file which, if necessary, can be used to regenerate it correctly.

Impact on CFParse

When cfparse was extended to verify designators while parsing a directory file, it became clear that, for the directory openwatcom\docs\gml\syslib, two figures are not correct: the count of unopened files, and the difference between that count and the items found (which should be the number of opened files, i.e., files that exist). This is the result of the duplicate entries: both existent and nonexistent files are, in some cases, counted twice.
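The effect of the duplicates on the counts can be illustrated with a short Python sketch. This is not how cfparse works internally. The function name tally, the sample file names and the set of files assumed to be present are all hypothetical.

```python
def tally(file_names, exists):
    """Return (found, missing) for the raw names and for the unique names."""
    def count(names):
        found = sum(1 for name in names if exists(name))
        return found, len(names) - found

    unique = list(dict.fromkeys(file_names))  # drop repeats, keep first-seen order
    return count(file_names), count(unique)


# Hypothetical data: one file that exists and one that does not, each listed twice.
present = {"PSDRV.COP"}
names = ["PSDRV.COP", "PSDRV.COP", "XVI12O.COP", "XVI12O.COP"]

raw, deduped = tally(names, lambda name: name in present)
print(raw, deduped)  # (2, 2) (1, 1)
```

Counting each name only once gives one existing and one missing file, whereas the raw tally doubles both figures.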
