Over-large Code Segments

From Open Watcom

Revision as of 00:59, 18 March 2009

Tutorial Scope

This tutorial is about the problem of over-large code segments. It is written from the perspective of 16-bit programming because that is where the problem normally occurs. If you have the same problem with 32-bit programming, you may need to adapt some of the information, but the basic principles should remain the same.

This tutorial was written using Open Watcom 1.6; it has been revalidated with Open Watcom 1.8 and the (very small number of) necessary notes added. There is no reason to believe that Open Watcom 1.7/1.7a behaved any differently than Open Watcom 1.6.

Revalidation for future versions will occur if and when a major change occurs which is likely to affect this topic.

If you discover any inaccuracy, whether the result of a new version of Open Watcom or not, please report it to either the openwatcom.contributors or openwatcom.users.c_cpp newsgroup. My intent is to revisit the tutorial, at the latest, when an inaccuracy is reported.

Unless otherwise noted, all results apply to:

  1. 16-bit programs in DOS, Windows and OS/2;
  2. Both the C (wcc) and C++ (wpp) 16-bit compilers.

This tutorial does discuss some features that C++ has which C does not have:

  1. The C++ compiler has debugging options (-d2i, -d3i, -d2s and -d3s) useful in working with inline functions;
  2. C++ has the "virtual" keyword, which affects how the C++ compiler treats inline functions; and
  3. C++ has templates.

Problem Statement

You are working on a C or C++ project and either the compiler or the linker complains about a code segment which is too large.

Procedural Summary

This section is a summary. The tutorial goes into much greater detail.

The term "ordinary code" refers to code satisfying these criteria:

  • For C and C++, no functions are declared with the keyword "inline".
  • For C++, no templates are included.

Preliminary Steps

  1. If you are using a small-code model, use a big-code model instead.
  2. For ordinary code, if you have modules containing multiple functions, either separate the functions into single-function modules or use compiler option -zm.

Compiler Error Encountered

  1. If a module containing multiple functions is producing the error, put each function in a separate module (at least temporarily) so that you can determine which function or functions are generating too much code.
  2. If inline functions are involved in C code, then commenting-out the invocations can be used to determine if an inline function is causing or contributing to the problem. A module which has no code other than the inline function invocation can be used to determine if the inline function itself generates too much code.
  3. If inline functions are involved in C++ code, then using -d2s or -d3s with -zm will place them in separate segments. This is a quicker way to determine if an inline function is causing or contributing to the problem, although identifying which one is at fault will require the same steps as for C code.
  4. If class templates are involved, the solution is considerably harder in versions of Open Watcom prior to 1.8 because the compiler will generate every member function in any module that instantiates the template. Separating the member functions into unrelated templates and instantiating each in its own module is the first step; however, the size of a function can be affected by class template structure, so further testing may be needed. If each function compiles when separated, but not when combined, then the sizes should be checked and those near the limit refactored to see if that solves the problem when they are combined. For Open Watcom 1.8, only the member functions used in a module (including the constructors and destructor) are generated; however, refactoring is still the only recourse if a compiler error is produced by a template.
  5. Each function which generates too much code must be refactored into a set of smaller functions. Once this is done, you can recombine any functions separated out in the first step, provided you are compiling with the -zm option and, if templates are involved, possibly the linker's LIBF directive.

Linker Error Encountered

  1. For ordinary code, if there is no compiler error and you are using a big-code memory model, then you have included enough of the C Standard Library to produce a _TEXT segment with more than 64K of code. The next section discusses the problem in more detail. The tutorial itself contains much more detail on how to deal with this problem in the section "_TEXT and the C Runtime Library".
  2. When using templates, virtual inline functions, or -d2i/-d3i, use LIBF to link specific modules first. This will require some experimentation to find a module, or set of modules, which will work.
  3. For virtual inline functions and templates where LIBF does not work at first, refactor the class or class template and then use LIBF to link the modules containing the refactored code. You may need to repeat this process several times before you have enough code relocated for it to be effective.

Rocks and Hard Places

There is one place in the tutorial where the proposed solutions are rather more complicated than the solutions outlined above. This is the section "_TEXT and the C Runtime Library", which covers the case where you are using enough of the C Standard Library that the _TEXT segment exceeds 64K.

At first, I simply recommended moving to 32-bit programming. But this presupposes that you are doing 16-bit programming on a lark, and not because you must.

Further reflection and several usenet posts have reminded me that there are situations in which 16-bit programming is the only option:

  1. You are targeting DOS on an 8088, 8086, 80188, 80186, or 80286 (or equivalent) processor.
  2. You are targeting 16-bit Windows on an 8088, 8086, 80188, 80186, or 80286 or equivalent processor or, on an 80386 or equivalent or higher, are targeting a 16-bit Windows other than version 3.x in standard or 386 enhanced mode.
  3. You are targeting OS/2 1.x or the 16-bit OS/2 subsystem in Windows NT or Windows 2000.

In those situations, you must apply the methods shown in "_TEXT and the C Runtime Library"; there really is no other option.

If the solutions proposed are daunting and you are willing to consider using 32-bit programming, these observations may be helpful:

  1. 32-bit DOS programs run under normal DOS by means of a DOS extender, provided the processor is a 386 or above. Since several DOS extenders are available, you may need, depending on what your program does, to review the documentation and pick the one that works best for your situation.
  2. Open Watcom's Win386 produces 32-bit programs which run under Windows 3.x in 386 enhanced mode, which, of course, requires the processor to be an 80386 or equivalent or above. Win386 programs will also run under OS/2, Win9x and NT (they should run under any emulator which emulates Windows 3.x in 386 enhanced mode).

For OS/2, usenet posts confirm that no such thing as a "32-bit extender" exists: if your OS/2 code must run on a 16-bit OS/2 system, it must be a 16-bit program.

IDE vs. Command-Line: A Note on Options

I use the IDE myself, and so this tutorial inevitably is oriented to the IDE. Thus, the compiler and linker options, except where relevant to the tutorial, are the default IDE options.

If you are using the command line, then you presumably have one or more makefiles which use the options you prefer. It should be clear from the options shown just what you need to change to apply the solutions given; indeed, there are really only three items: the memory model, -zm, and changing the module link order. Your other choices should not have to be changed unless you are working with C++ and wish to use -d2i, -d3i, -d2s or -d3s.

If you are using the IDE, then you presumably have some familiarity with the IDE and have also selected the options you need for your project. The tutorial does contain notes to IDE users which specify which of the dialogue screens the various solutions can be found on, since some of them may not be where you would expect them to be.

One place where my IDE bias intrudes is in changing module link order: in the IDE, this requires the creation of a library (as discussed below) and the LIBF directive. In a makefile, this can be done by changing the order of modules listed with the FILE directive and using the NAME directive to specify the name of the executable. Most sections of the tutorial simply refer to "using LIBF". The linker option lines all show "LIBF" when changing the link order is needed. Command-line users will have to keep this in mind.

IDE users will have to keep something else in mind: although the compiler and linker option lines shown are based on the IDE, the linker options are not identical to those shown by the IDE. Some linker options are placed by the IDE in a file with a .lk1 extension, for example, "tester.lk1". The actual linker command line then ends with, for example, "symf tester.lk1". What I will do is show the actual options from the .lk1 file.

Please remember that the options shown are examples, that only the options that actually solve the over-large code segment problem are important, and that those are clearly identified in the text.

The Open Watcom 1.8 IDE Uses A New Option

If you are using the IDE with Open Watcom 1.8, then this option:

-fo:.obj

appears in each set of compiler options before the -ms or -ml option. It has no effect on the results reported here or the various solutions proposed.

I mention it here to avoid confusion. The remainder of the tutorial ignores it, since the Open Watcom 1.6 IDE did not use it.

The Compiler Error

If you examine the help files C Diagnostic Errors and C++ Diagnostic Messages, it might seem that neither compiler reports an error for an over-large code segment, since no such message is found. So this section demonstrates that such an error can occur.

The C compiler uses Error E1118, and the C++ compiler uses Error E094. These are general-purpose messages; they display the string "***FATAL***" followed by a string showing the specific problem. For an over-large code segment, that string is "segment too large". This was observed in Small, Medium, Compact, Large, and Huge memory models.

It is not difficult to produce this error: the simple test program (available in folder Phase0 in the downloadable ZIP file Tutorial.zip)


#include <stdio.h>
#include <stdlib.h>

int main( void )
{
    int Fred = 0;
    int Judy = 0;

/* 
    Repeat these lines 1000 times for C and 5000 times for C++
    and the compiler will report 
    "Error! E1118: ***FATAL*** segment too large" for C or
    "Error! E094: ***FATAL*** segment too large" for C++.
*/

    Judy += ++Fred;
    putchar(Judy);

    return( EXIT_SUCCESS );
}

will do so.

Ordinary Code

This section deals with ordinary code, that is, with code satisfying these criteria:

  • For C and C++, no functions are declared with the keyword "inline".
  • For C++, no templates are used.

The Initial Test Framework

The program above demonstrates the existence of the problem, but something more flexible is needed to illustrate all the topics to be covered. We are going to start with a C and a C++ test framework that demonstrate the same problem.

The source code for the test framework will be found on Over-large Code Segments Code and can be downloaded from Tutorial.zip. In particular, the initial test framework is found in section Phase 1 and in folder Phase1 of Tutorial.zip.

Technical note: I use the IDE and place both the C and C++ targets in the same sub-directory (of which there are three: DOS16, OS216 and WIN16). This requires the .C and .CPP file names to be distinguished by more than just the extension, as will be immediately apparent.

Warning: The test program is intended to illustrate the causes and cures of the problem "overlarge code segments". It is not intended to be invoked. Although not deliberately dangerous, you must understand that, if you choose to run it, you are assuming full responsibility for the results.

Style note: The test program is intended to aid in following this tutorial. It should not be taken as an example of how to actually write code.

When the C Phase 1 code is compiled with these options, the compiler error "segment too large" will result:

for DOS:     -w4 -e25 -zq -od -d2 -bt=dos -ms
for OS/2:    -w4 -e25 -zq -od -d2 -bt=os2 -ms
for Windows: -w4 -e25 -zq -od -d2 -bt=windows -ms

When the C++ Phase 1 code is compiled with these options, the compiler error "segment too large" will result:

for DOS:     -w4 -e25 -zq -od -d2 -bt=dos -ms -xs -xr
for OS/2:    -w4 -e25 -zq -od -d2 -bt=os2 -ms -xs -xr
for Windows: -w4 -e25 -zq -od -d2 -bt=windows -ms -xs -xr

In both the C and C++ cases, the options shown are the default options set by the IDE for 16-bit EXE files, except for the memory model: here we are using -ms (Small) while the IDE defaults to -ml (Large).

Function Too Large? Refactor

The compiler error is caused by the size of the code generated when the function is compiled. This may appear to be rather unlikely in real code, and I suppose that it is, but there are some compiler and language features that can make a function compile much larger than might otherwise seem likely:

  1. Compiler optimizations, such as aggressive loop-unrolling applied to a function with several large loops.
  2. Functions declared with the "inline" keyword that are actually inlined and which contain significant amounts of code.

There may, of course, be others; but the point, I think, is clear: the compiled size of a function cannot be reliably predicted by the size of its source file. A function, in a real program, which produces too much code is possible.

Functions declared with "inline" are discussed in the section "Inline Functions". In this section, we deal with functions which generate too much code all by themselves.

When a single function produces too much code, there is no alternative to splitting it up. This is one of the many forms that "refactoring" can take. This tutorial is not about refactoring, but several forms will be illustrated. At this point, external refactoring will be applied to the test code.

Normally, of course, an existing function, upon examination, can be divided into natural subfunctions. However, our artificial function can simply be split in half.

The resulting code is shown in Phase 2 and can be found in folder Phase2 of Tutorial.zip. It was produced by following these steps:

  1. Refactor the function. In this case, reduce the number of reps to 500 (for C) or 2500 (for C++) and then copy it, naming the copy "test2".
  2. Move the new function(s) to new file(s). In this case, move the new function to a file named Ctest2.c (for C) or CPPtest2.cpp (for C++).
  3. Add a declaration for the new function to the header file. For C, this is just another function declaration; for C++, it is a new public class member.
  4. Determine where the new functions are to be invoked and add the code necessary to do so. In this case, add the invocations to the "main" functions after the invocations of the original function.

When you compile the result with the same compiler options as before, the compiler error vanishes and a linker error appears.

The linker options for C were:

for DOS:     d all SYS dos op m op maxe=25 op q FIL Cmain.obj,Ctest1.obj,Ctest2.obj
for OS/2:    d all SYS os2 op m op maxe=25 op q FIL Cmain.obj,Ctest1.obj,Ctest2.obj
for Windows: d all SYS windows op m op maxe=25 op q FIL Cmain.obj,Ctest1.obj,Ctest2.obj

The linker options for C++ were:

for DOS:     d all SYS dos op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj
for OS/2:    d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj
for Windows: d all SYS windows op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj

This may not appear to be much of an achievement; however, the simple fact is that if your code will not compile, you cannot even attempt to link it.

The Linker Error and The Map File

Actually, if your target is 16-bit DOS you will see two linker errors (this is from the C++ test program):

Error! E2021: size of segment _TEXT exceeds 64k by 13888 bytes
Error! E2020: size of group AUTO exceeds 64k by 13888 bytes

For 16-bit OS/2 and 16-bit Windows, you will only see the first.

The link errors are found at the top of the map file, which can be identified by its extension *.map.

Map files have a lot of useful information. Here, for example are the first three lines of the "Segments" section of the 16-bit DOS C++ test program's map file:

BEGTEXT                CODE           AUTO           0000:0000       00000007
_TEXT                  CODE           AUTO           0001:0000       00013640
FAR_DATA               FAR_DATA       AUTO           1365:0000       00000000

For 16-bit OS/2 and 16-bit Windows, you will not see the third line. Actually, looking at the 16-bit OS/2 C test program's map file shows another difference:

BEGTEXT                CODE           AUTO           0001:0000       00000007
_TEXT                  CODE           AUTO           0002:0000       0001bf6a

This reflects the difference between 16-bit DOS and the other 16-bit targets: 16-bit DOS runs in "real mode" and so the "0001" for _TEXT indicates a physical address 16 bytes after the start of the program. The others run in "protected mode" and so the "0001" and "0002" and so on are what I suppose can be called "virtual linker segments", which will eventually become actual selectors (which select segment descriptors, which describe the actual segments) when the program is loaded.

If you carefully examine the Memory Map section, you will find that BEGTEXT is the startup code. _TEXT contains the code, and that is the segment which is too large. For a 16-bit program, the last value on each line, which is the number of bytes in that segment, must start with four zeros: 0000ffff is the 16-bit segment size limit. Examination of the Memory Map section also shows that a lot of the code in _TEXT is from the run-time library. Finally, it should be clear that all three functions (main, test1 and test2) are placed in this segment.

The most obvious solution to the linker error, then, is to get the code distributed over more than one segment.

The Memory Model Solution

Because of the relatively small size of a 16-bit segment on i86 processors, the machine code itself (which, of course, is what is ultimately executed when the program is invoked) distinguishes between near function calls and returns, where the function called is in the same segment as the function calling it, and far calls and returns, where the called function is in a different segment. The compiler (or assembler) processes one source module (one .C or .CPP file) at a time, and must be told if the code it produces can use near calls and returns or the far forms, which are both larger and slower than the near forms.

This led to the introduction of standardized memory models, which take into account both code and data. From our perspective, there are two memory models: small-code and big-code. The standardized small-code models are Tiny, Small, and Compact. The standardized big-code models are Medium, Large, and Huge.

Note: There may be situations where you cannot use a big-code model. In that case, you are outside the scope of this tutorial.

If the test programs are recompiled with this set of options for C:

for DOS:     -w4 -e25 -zq -od -d2 -bt=dos -ml
for OS/2:    -w4 -e25 -zq -od -d2 -bt=os2 -ml
for Windows: -w4 -e25 -zq -od -d2 -bt=windows -ml

or this set for C++:

for DOS:     -w4 -e25 -zq -od -d2 -bt=dos -ml -xs -xr
for OS/2:    -w4 -e25 -zq -od -d2 -bt=os2 -ml -xs -xr
for Windows: -w4 -e25 -zq -od -d2 -bt=windows -ml -xs -xr

then the programs fail to compile, because the difference between far and near calls is enough to make the generated code too large (note that it is not only our functions that use far calls and returns, but the library functions now require far calls as well).

If you reduce the number of repetitions in the functions in Ctest1.c and Ctest2.c to 400, and in CPPtest1.cpp and CPPtest2.cpp to 2000, then the functions will compile. In a real program, of course, you would have to refactor the functions again. The resulting code is shown in Phase 3 and found in folder Phase3 in Tutorial.zip.

The linker now succeeds in building the executables. The map files show that changing to a big-code memory model does, in fact, work by placing each object file's contents in its own compiler segment. The map file for the OS/2 C program, for example, shows

Cmain_TEXT             CODE           AUTO           0001:0000       0000002f
Ctest1_TEXT            CODE           AUTO           0001:002f       0000d8cc
Ctest2_TEXT            CODE           AUTO           0002:0000       0000d8cc

and the remaining code segments are all in "virtual linker segment" 0002. The linker appears to be forming its segments in this way: each file's code is added to the current segment until doing so would produce an overlarge code segment, at which point a new segment is started.

That might seem to be the end of the tutorial, since we appear to have a solution that will always work:

  1. Ensure that each function produces less code than a segment can hold.
  2. Use a big-code memory model so the linker can place the code in more than one segment.

If only it were that simple!

_TEXT and the C Runtime Library

When examining the map files, you may have noticed that a _TEXT segment still exists and still contains the functions from the C runtime library. It has been reported that, under some conditions, this segment can exceed 64K bytes which, of course, does not work at all.

I should point out that not only can C++ programs call functions from the C runtime library, but the C++-specific runtime library may be implemented using functions from the C runtime library, so this problem can appear in either language.

By default, the linker, when processing a library file, only includes those modules which contain items used by the program. Thus, the map file for the 16-bit DOS C++ test program has these entries:

Module: J:\Progra~1\OpenWatcom/lib286/dos\clibl.lib(cstart)
1654:0000*     __nullarea
1654:0054*     __ovlflag
1654:0055*     __intno
1654:0056*     __ovlvec
0a72:a684      _cstart_
0a72:a75b*     _Not_Enough_Memory_
0a72:a86e      __exit_
0a72:a88d      __do_exit_with_msg__
0a72:a8e0      __GETDS

These functions are all linked in, as they are all part of the same module (cstart), but those marked with an asterisk (*) are never called. All modules shown in the map file have at least one function which is called (which has no asterisk).

The linker's behavior can be changed by adding a FILE directive such as

FILE J:\Progra~1\OpenWatcom/lib286/dos\clibl.lib

to the linker command line. (I tried using just "clibl.lib" but had to copy the entire path from the map file to get it to work -- you will need to use a path that works on your system, of course.) In the IDE, this can be done through the Options menu: pick Linker Switches and, on the first page, type the directive into the "Other Options" box.

This will produce the linker error message and examining the map file will show this (for 16-bit DOS, C or C++):

_TEXT                  CODE           AUTO           0000:0000       0003296a

which shows a size far greater than 64K (which would appear as 0000ffff in the last column). The map file also shows many segments in which no function is ever called (all functions have an asterisk). Examining the OS/2 and Windows map files shows that the linker attempts to cram the entire library into one virtual linker segment; it does not use the division into modules to distribute the code over multiple virtual linker segments.

In real programs, since the FILE directive would not be used this way, this would only occur if enough of the C Runtime Library were used. That is why it does not manifest itself very often.

The entire C Runtime Library is not actually placed in _TEXT; that is, there are modules whose functions, in whole or in part, are in their own segment. Also, _TEXT contains functions from the math library as well.

For 16-bit OS/2 and Windows, the thought of linking against a DLL naturally occurs. However, that DLL cannot be created because the _TEXT segment is too large.

Examination of the Open Watcom documentation and source code shows that the modules of the C Runtime Library which end up in _TEXT in big-code memory models are there because those modules are compiled using the -nt=_TEXT compiler option.

There are some actions that can be taken. They are not, however, as direct or as simple as most of those discussed in this tutorial.

The Wiki currently has resources such as a zip file of the C Runtime Library code, a daily snapshot of the repository, a web-based front end, and instructions on obtaining a Perforce account. These all have advantages and disadvantages:

  1. The front end would require you to download the source file-by-file.
  2. The C Runtime Library zip file not only lacks a build system but is actually a 7z file which, unless you already have it, would require installation of 7-Zip.
  3. The daily source code snapshot is a .tar.bz2 file, which, unless you already have them, would require installation of whatever it takes to decompress/extract the files from such a file.
  4. Getting a Perforce account and synching it from scratch over a dial-up (as opposed to a DSL) connection would take quite a bit of time.

Both the source code snapshot (I presume) and the Perforce account require a certain amount of effort to get to the point where it is possible to build the library. Of course, the C Runtime Library zip file requires creation of a build system from scratch, which is likely to take longer than setting up the snapshot or the Perforce client.

Since I have a Perforce account, I did some testing. It may help to know that "builder" is the program used to build Open Watcom and that it takes arguments which act very much like the targets used with makefiles. Indeed, ultimately builder invokes wmake to process makefiles to actually build stuff. Also, I will use ow\ as the root of the client.

The first step was to determine if a full build is necessary. I did a "builder clean" (thus approximating the state of a new Perforce client after the initial setup has been done but before builder has ever been invoked) and then moved to the ow\bld\clib directory and tried "builder rel2". This failed because it required the existence of several tools which did not exist; what this means is that a full build must indeed be done. This was worth checking because a full build can take several hours on a computer with a 1.6GHz Athlon and not having to do a full build could save considerable time; on a sub-1GHz computer, it might be wise to plan to allow a full build to run all night.

Before doing the full build, open the file ow\bld\clib\flags.mif (if you do not check it out with Perforce, you will have to manually remove the Read Only flag) and find and modify the line

sw_c_bigcode  = -nt=_TEXT

to

# sw_c_bigcode  = -nt=_TEXT
sw_c_bigcode  = 

to remove the -nt=_TEXT from the compiler flags.

The full build is done by invoking "builder rel2" in the ow\bld directory. The "rel2" will cause a complete Open Watcom distribution, including the libraries produced by the build, to appear in the ow\rel2 directory. If it does nothing else, this makes the modified libraries easy to find.

After the full build, and after modifying the FILE directive to use the modified library in ow\rel2\lib286\os2 for 16-bit OS/2, the map file now shows:

_TEXT                  CODE           AUTO           0001:0160       00007582

which is a considerable reduction in size!

However, there were also four linking errors:

Error! E2028: ___nearheap is an undefined reference
Error! E2052: file e:\progdev\cpp\ow\rel2\lib286\os2\clibl.lib(initrtns): relocation at 0002:e551 not in the same segment
Error! E2052: file e:\progdev\cpp\ow\rel2\lib286\os2\clibl.lib(initrtns): relocation at 0002:e5ab not in the same segment
Error! E2052: file e:\progdev\cpp\ow\rel2\lib286\os2\clibl.lib(chk8087): relocation at 0002:e85a not in the same segment

So, there may be modules that must be in _TEXT. Also, the file ow\bld\wl\ovlldr\makefile uses -nt=_TEXT, so it is possible that the overlay support requires a library built with -nt=_TEXT.

In a real program, of course, you would not use the modified library with FILE; you would use it with LIB (or simply add it to the target in the IDE). The normal library will also be checked (this is automatic unless you disable it with a linker directive). The errors should not be seen unless you use the modules that trigger them. Removing those modules from the modified library so that they are linked from the normal library (and so end up in _TEXT) should eliminate the errors -- hopefully while keeping _TEXT under 64K. Clearly, if you are in this situation, you will have to experiment a bit to see what works.

In fact, for a real program, where the _TEXT segment is only slightly too large, you might want to download just the source and do this:

  1. In the .map file, identify modules in _TEXT that appear to be rather large.
  2. Compile those modules only into a new library.
  3. Add the library to the project. As noted above, the modules in the library will be taken first, and placed into their own segments, while the bulk of the code would still be in _TEXT.

If the number of modules that works is small, this may be easier than redoing the entire library. Of course, a certain amount of trial-and-error would be needed to identify the modules needed and then to build them.

An instance has been reported on the newsgroup where compiling with -mm succeeded where compiling with -ml failed because the _TEXT segment exceeded 64K bytes.

The file ow\bld\mathlib\flags.mif contains the same line shown above:

sw_c_bigcode  = -nt=_TEXT

However, if the modified clibl.lib is added to the project and FILE is used with the normal mathl.lib, the OS/2 map file shows:

_TEXT                  CODE           AUTO           0001:0030       0000a92f

which suggests that there is, at present, no problem in big-code memory models. Indeed, making the same modification to ow\bld\mathlib\flags.mif as shown earlier for ow\bld\clib\flags.mif produced a larger size (0xa9df instead of 0xa92f), so modifying the library this way is unlikely to be useful with mathl.lib.

Please remove any FILE statements present in the linker options.

Files Containing Multiple Function Definitions

In real programs, it is quite common to define more than one function in the same .C or .CPP file. In C++, of course, the member functions of a class form a natural grouping, but even in C it is possible for several functions to serve a common purpose, such as manipulating a particular data structure.

To explore this situation, we will perform an internal refactoring. The resulting test framework is found in section Phase 4 or in folder Phase4 of Tutorial.zip. The procedure is:

  1. Starting from the Phase 3 test framework, double the number of repetitions in test1() to 800 (for C) or 4000 (for C++), thus creating a too-large function.
  2. Refactor the too-large function into smaller functions, replacing the removed code with an invocation of the new function and placing the new function in the same file after the original function. In the test framework, put 400 (for C) or 2000 (for C++) repetitions in test11() and the other 400 or 2000 repetitions in test12().
  3. For C code, the function declarations go at the top of the .c file preceded by the keyword "static". For C++, the functions become private member functions of the class defined in the .hpp file.

If you compile the test programs, the compiler error will occur, since the two new functions combined generate more than 64K bytes of code.

One solution -- moving one of the new functions to another file -- we know will work, and we will shortly explore an alternative solution, but first a final word on refactoring.

A Note on Flexible Refactoring

This tutorial is not about refactoring, but it would be remiss of me to leave you with the impression that the two forms illustrated above, which I have called "external" and "internal", must be applied only as illustrated.

In actual fact, the term "refactoring" can be applied to at least three different aspects of programming:

  1. The code itself.
  2. The physical distribution.
  3. The public interface.

Refactoring the code itself (for ordinary code) is what happens when a function is divided into multiple functions. There are many reasons for doing this, and when and how to do it are, I suppose, the heart of the topic, which you will have to explore elsewhere.

After you have a set of new functions, you must then, for each function separately, deal with the other two aspects. That is, you must first decide whether to leave the new function in the same source file it is currently in or whether to move it to another one. You must also decide which other functions can call this function.

For a C function, if you move it to a new file, you will usually have to make it publicly accessible by listing its prototype in a header file. This is because the original function will usually need to call it, and it is now in a different file.

For a C function, if you leave it in the old file, you can decide whether to make it publicly accessible or not. This decision should be based on whether you want any function, not defined in the same file, to be able to call it.

For C++, you must decide whether or not the new function is a member function. This is, again, a topic you will have to research elsewhere, although in many cases there is no choice: the new function must be a member function because it must have access to private members of the class. For a C++ non-member function, the choices are exactly the same as for a C function.

For a C++ member function, the situation is completely different. The function must be added to the class definition in the header file, for that is what makes it a member function. However, you can choose whether to make it private, protected or public, depending on which other functions you wish to be able to call it. This applies regardless of what file the member function definition is placed in.

So, the techniques shown as "external" and "internal" refactoring were really just two of a large number of alternatives, the elements of which can be used in different combinations for each new function independently.

The Compiler's -zm Option

Since moving one of the new functions to a different file works by placing it into a different compiler segment, it is natural to consider whether or not the compiler can be induced to place two (or more) code segments in the same object module.

Indeed it can, using compiler option -zm. There is also a -zmf option for C++, but the description in the User's Guide suggests that it is not useful for our purposes here.

For IDE users, the -zm option is on page 9 of the Compiler Switches dialogue for both C and C++. With it checked, these are the compiler options for C:

for DOS:     -w4 -e25 -zq -od -d2 -zm -bt=dos -ml
for OS/2:    -w4 -e25 -zq -od -d2 -zm -bt=os2 -ml
for Windows: -w4 -e25 -zq -od -d2 -zm -bt=windows -ml

and for C++:

for DOS:     -w4 -e25 -zq -od -d2 -zm -bt=dos -ml -xs -xr
for OS/2:    -w4 -e25 -zq -od -d2 -zm -bt=os2 -ml -xs -xr
for Windows: -w4 -e25 -zq -od -d2 -zm -bt=windows -ml -xs -xr

Both programs compile and link for all three targets. Although the naming conventions are different between C and C++, the map files show that the linker is clearly considering each segment separately in filling its virtual linker segments. That is, the linker is not insisting on placing every compiler segment in the same module into the same virtual linker segment, but is considering only the compiler segment itself in deciding where to place it.

Inline Functions

Inline functions are functions defined with the keyword "inline". Since this is only a suggestion to the compiler, that is, since the compiler is free to decide whether or not to actually inline the function, I will use "non-inlined" to refer to functions defined with "inline" which are, nonetheless, not compiled inline and "inlined" for those which are, in fact, compiled inline.

To be "compiled inline", of course, means that the code in the inlined function appears inside the invoking function's code body. Ideally, the result is exactly what it would have been had the code been placed there by the programmer.

The decision whether to define a particular function with "inline" is another type of refactoring, and as such is beyond the scope of this tutorial. However, since even a function which does not appear to be very large can, in fact, produce too much code for a single segment if it invokes inline functions which are actually inlined, some of the factors involved in deciding whether or not to inline a particular function are within the scope of this tutorial.

As it happens, the test framework already contains a type of inline function: member functions for which function bodies are provided within the class definition are implicitly inline. This applies to these functions:

  • the constructor
  • the copy constructor
  • the destructor
  • the assignment operator

If the OS/2 CPPtest.map file is consulted, these lines will be found:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       000000c0
CPPmain_TEXT3          CODE           AUTO           0001:00c0       0000005f

and, further down in the file:

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far testClass1::testClass1()
0001:0044+     far testClass1::testClass1( testClass1 const far & )
0001:008a+     far testClass1::~testClass1()
0001:00c0      main_

The assignment operator is not found. This might suggest that constructors and destructors are never inlined by Open Watcom; however, it will be seen that not all constructors and destructors are treated this way, and that the compiler can inline a function which is nonetheless included in a module as a separate function.

The map file also shows all three of these functions in the same compiler segment (CPPmain_TEXT2). This may seem surprising because -zm was specified. This issue was discussed on the newsgroup. This appears to be the situation: the compiler does not actually create such a compiler segment, but emits these items in such a way that the linker sees them as a single compiler segment. The -zm option is not applied because the compiler is not putting them into a compiler segment as such.

As will be seen, this type of "compiler segment" causes problems with linking because it also evades the size limitation normally imposed by the compiler (that is, the limit on segment size; the individual functions are still limited to 64K). The map file lists these items as compiler segments, and so will I; however, whenever I refer to or point out a compiler segment with more than 64K in code, it is this type of "compiler segment" I am referring to, whether I use quotes or not.

The Compiler Error Returns

As it happens, over-large code segment problems caused by inline functions require different techniques to resolve than those seen previously. To illustrate this, modify the test framework (the result is in Phase 5 or folder Phase5 of Tutorial.zip) in this way:

  1. Add the line "#include <stdio.h>" to the top of test.h and of test.hpp.
  2. Move test12() to test.h and testClass1::test12() to test.hpp and put "inline" at the start of the definition.
  3. Copy test12() and name the copy test13() in test.h; copy testClass1::test12() and name the copy testClass1::test13().
  4. Remove the static declaration of test12() from Ctest1.c .
  5. Update test.h to declare both test12() and test13(); update test.hpp to add test13() to testClass1 as a public function.
  6. The function main() now invokes test1(), test2(), test12() and test13().
  7. The function test1() now invokes test11() and test13().
  8. The function test2() now invokes test13() just before the return statement.

Compiling the code with these compiler options for C:

for DOS:     -w4 -e25 -zq -od -d2 -zm -bt=dos -ml
for OS/2:    -w4 -e25 -zq -od -d2 -zm -bt=os2 -ml
for Windows: -w4 -e25 -zq -od -d2 -zm -bt=windows -ml

and with these for C++:

for DOS:     -w4 -e25 -zq -od -d2 -zm -bt=dos -ml -xs -xr
for OS/2:    -w4 -e25 -zq -od -d2 -zm -bt=os2 -ml -xs -xr
for Windows: -w4 -e25 -zq -od -d2 -zm -bt=windows -ml -xs -xr

now produces compiler over-large segment errors for Cmain.c, Ctest2.c, CPPmain.cpp, and CPPtest2.cpp, since the main() and test2() functions include enough inlined code to push the number of repetitions to 800 (for C) or 4000 (for C++). Ctest1.c and CPPtest1.cpp avoid this error because test11() is compiled into a separate segment from test1() as the result of option -zm.

In a real program, of course, this error could be caused by an ordinary function which, all by itself, generates too much code. Unless the problem first appeared after you started converting ordinary functions to inline functions, you would not know for certain that inlined functions are causing the problem. The natural question is: how do I find out what the cause of the problem is?

Isolating Inlined Functions

Ordinary functions, as shown above, can be isolated either by using -zm or by placing each function in its own module. But how can inlined functions be isolated?

There are really two steps here:

  1. Identify which inlined functions cause the problem.
  2. Identify any inlined function which, all by itself, generates more than 64K of code.

The first step can be accomplished by using // to comment-out all invocations of inline functions. If the module then compiles, clearly the inlined functions are the problem. This step can be refined: by selectively re-enabling these invocations, the inlined function or functions causing the problem can be identified (at least for the current version of the module -- a different version might produce a different list).

The second step can be done by commenting-out all invocations of the function or functions identified in the first step and then creating a separate module for each one which does nothing but invoke that function. Clearly, a module which does nothing but invoke an inlined function and which produces the compiler error shows that the function involved generates too much code.

If an inlined function is found which generates too much code, then the refactoring steps discussed earlier apply. However, if the new functions are inlined and invoked by the original function, this will not work: so some or all of the new functions will have to be ordinary functions, that is, some or all of them must not use the keyword "inline".

This applies to both C and C++ code. The C++ compiler, however, permits an alternative approach with C++ code.

Debugging Levels and Special Switches

We have been using the default IDE debugging level, indicated by option -d2. As discussed in the User's Guide, this is one of four debugging levels; the User's Guide should be consulted on what each level provides.

Two of these levels (-d2 and -d3) have subtypes relevant to the problem at hand: -d2s/-d3s and -d2i/-d3i. They serve two different purposes, as will be seen. They are available only for C++, not for C, so from the next version onward the test framework will contain only C++ code.

Note to IDE users: These specialized switches do not appear as choices in the C++ Compiler Switches dialogue. To use them, first go to the Debugging Switches page and select "No debugging information"; then go to the Miscellaneous Switches page and put whichever one you want to use into the box. Don't forget the hyphen!

Verification: -d2s/-d3s

If -d2s or -d3s is used in place of -d2 or -d3, the problem disappears; however, to see what is happening, the linker option "op STAT" is needed because, by default, the map file does not show static functions.

Note for IDE users: "op STAT" must be typed in the "Other options" box on the first page of the Linking Switches dialogue.

So, using these compiler options:

for DOS:     -w4 -e25 -zq -d2s -od -zm -bt=dos -ml -xs -xr 
for OS/2:    -w4 -e25 -zq -d2s -od -zm -bt=os2 -ml -xs -xr 
for Windows: -w4 -e25 -zq -d2s -od -zm -bt=windows -ml -xs -xr 

and these linker options:

d all SYS dos op STAT op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj 
d all SYS os2 op STAT op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj 
d all SYS windows op STAT op m op maxe=25 op q FIL CPPmain.obj,CPPtest1.obj,CPPtest2.obj 

the test program will compile and link.

The OS/2 map file shows these compiler segments:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       000000c0
CPPmain_TEXT4          CODE           AUTO           0001:00c0       0000d313
CPPmain_TEXT5          CODE           AUTO           0002:0000       0000d313
CPPmain_TEXT6          CODE           AUTO           0002:d313       000000ac

and (note the "s" suffix indicating static functions):

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far testClass1::testClass1()
0001:0044+     far testClass1::testClass1( testClass1 const far & )
0001:008a+     far testClass1::~testClass1()
0001:00c0s     void far testClass1::test12()
0002:0000s     void far testClass1::test13()
0002:d313      main_

This shows that the constructors and destructor are still in a single compiler segment (CPPmain_TEXT2) but that test12() is in compiler segment CPPmain_TEXT4 and test13() is in compiler segment CPPmain_TEXT5. Thus, with -zm and -d2s, the compiler puts each explicitly-designated inline function in its own segment.

If we look a little further into the map file, we find these entries:

Module: CPPtest1.obj(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0003:0000s     void far testClass1::test13()
0003:d313      void far testClass1::test1()
0004:0000+     void far testClass1::test11()
Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0005:0000s     void far testClass1::test13()
0006:0000      void far testClass1::test2()

This shows that each module will contain a copy of the non-inlined static functions.

The -d2s/-d3s switch, then, is a very easy way to check two points:

  1. That the problem is being caused by inlined functions.
  2. Whether any of these functions is generating too much code.

However, since the functions appear in each module, the resulting executable will contain multiple copies and so is larger than it would be if these functions were refactored as ordinary functions. In real-world programs, this may matter.

Practicality: -d2i/-d3i

These switches are similar to -d2s/-d3s, except for one detail: the non-inlined functions are not static. This allows the creation of an executable whose size and speed are much closer to those of the executable that refactoring all of the explicitly inline functions into ordinary functions would produce. For a real program, this may help in deciding how many and/or which inline functions to refactor.

If these compiler options are used:

for DOS:     -w4 -e25 -zq -d2i -od -zm -bt=dos -ml -xs -xr 
for OS/2:    -w4 -e25 -zq -d2i -od -zm -bt=os2 -ml -xs -xr 
for Windows: -w4 -e25 -zq -d2i -od -zm -bt=windows -ml -xs -xr 

then the source code compiles, but the linker complains of an over-large code segment.

The last time this happened, it was because we were using a small-code model: each compiler segment was under the 64K limit but the single _TEXT segment used by the linker was too large. Here something quite different is happening, as the OS/2 map file shows:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       0001a6e6
CPPmain_TEXT3          CODE           AUTO           0002:0000       000000ac

and further down:

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far testClass1::testClass1()
0001:0044+     far testClass1::testClass1( testClass1 const far & )
0001:008a+     far testClass1::~testClass1()
0001:00c0+     void far testClass1::test12()
0001:d3d3      void far testClass1::test13()
0002:0000      main_

Note that testClass1::test13(), which is invoked in all three modules, does not have a "+" suffix: this is the only copy in the resulting executable. Or, at least, it would be if it were possible for the linker to produce it.

This situation was discussed earlier: the linker sees these functions as part of a single "compiler segment" which, in this case, is larger than 64K bytes. If this were a normal compiler segment, that would be the end of the story, since the linker does not remove code from within normal compiler segments. This type of "compiler segment", however, is treated differently by the linker.

The Linker's LIBF Directive

Since testClass1::test13() is invoked in all three modules, and the compiler is only aware of the content of the module it is working on, testClass1::test13() must be compiled into each of the modules. Yet it only appears once in the resulting executable (or would, if the executable could be produced).

And, since main() is the only function to invoke both testClass1::test12() and testClass1::test13(), CPPmain.cpp is the only module to include both functions.

What, then, would happen if CPPtest1.cpp were linked first? This can be explored by using the linker's LIBF directive. For command-line users, simply use these linker options (note that op STAT is no longer needed):

d all SYS dos op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj,CPPtest2.obj 
d all SYS os2 op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj,CPPtest2.obj 
d all SYS windows op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj,CPPtest2.obj 

and the program links.

For IDE users, the second tab of the Linking Switches dialogue has a box labeled "Library files" into which "CPPtest1.obj" can be placed. Deleting the "op STAT" on the first tab and adding "CPPtest1.obj" on the second almost produces the linker options above: the difference is that the IDE inserts CPPtest1.obj in the FIL list. This causes the linker to link it twice, which does not work.

After some discussion on the newsgroup, I was reminded of two facts:

  1. The linker always includes files listed with LIBF and FIL.
  2. The linker allows files listed with LIBF to override files in libraries (.LIB files).

So, for IDE users at least, the time has come to create three new targets, following this procedure for each of DOS, OS/2 and Windows:

  1. Remove CPPtest1.cpp and CPPtest2.cpp from the target for the .EXE file.
  2. Create a new target and add CPPtest1.cpp and CPPtest2.cpp to it. Configure the target to produce a .LIB file.
  3. Save the project and verify that the target is located in the proper directory. Move it and reacquire it if necessary.
  4. Make the .LIB file.
  5. Add the .LIB file to the original .EXE target.

Using the name TestLib.lib for this library, the linker options now become:

d all SYS dos op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

If you change one of the files in the .LIB, select the .EXE, and rebuild it (F4), you will see that the .LIB is automatically updated before the .EXE is built. Putting object modules into a .LIB target instead of an .EXE target and including the .LIB file in the .EXE target also allows the IDE to be used to build more than one target in the same directory using the same object files.

Be certain that the compiler is using the options above: in particular, -d2i (or -d3i) and -zm. It will work with all three targets.

Why does this work? Consulting the OS/2 version's map file again, we find:

CPPtest1_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest1_TEXT2         CODE           AUTO           0001:0000       0000d313
CPPtest1_TEXT3         CODE           AUTO           0001:d313       0000003d
CPPtest1_TEXT4         CODE           AUTO           0002:0000       0000d313
CPPmain_TEXT           CODE           AUTO           0002:d313       00000000
CPPmain_TEXT2          CODE           AUTO           0003:0000       0000d3d3
CPPmain_TEXT3          CODE           AUTO           0003:d3d3       000000ac

and

Module: CPPtest1.obj(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0001:0000      void far testClass1::test13()
0001:d313      void far testClass1::test1()
0002:0000+     void far testClass1::test11()
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0003:0000+     far testClass1::testClass1()
0003:0044+     far testClass1::testClass1( testClass1 const far & )
0003:008a+     far testClass1::~testClass1()
0003:00c0+     void far testClass1::test12()
0003:d3d3      main_
Module: TestLib.lib(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0004:0000      void far testClass1::test2()

Since testClass1::test13() is now part of CPPtest1.cpp instead of CPPmain.cpp, it is clear that the compiler is compiling testClass1::test13() in all three modules. When the linker finds the same function in more than one module, it keeps the first copy it finds and removes the other copies, thus linking each duplicated function only once. The LIBF directive works by controlling which module is linked first and thus which module will contain the code for each of this type of function.

This may seem like a lot of work for very little reward. How much it matters for a real-world program depends on how important it is to see the effect that refactoring inline functions into ordinary functions would have on the size and speed of the resulting executable. In many, perhaps most, situations, using -d2s/-d3s to confirm that inlined functions are or are not the problem may provide all the information needed.

Virtual and Inline

A major feature of C++ is the use of "virtual functions", which, for the purposes of this Tutorial, are functions declared with the keyword "virtual".

When a virtual function is also defined as "inline", then the compiler is faced with a problem. Consider this code:

void testClass1::test1()
{
    test11();
    test13();
}

where test13() is both virtual and inline. The compiler cannot inline test13() because there may exist a derived class:

class testClass2 : public testClass1
{
    public:

    void test13() { cout << "FRED"; }
};

which is used in this way:

testClass2 joker;
joker.test1();

and which would be expected to display "FRED", not what the version of test13() in the test framework displays.

The compiler could not rule this possibility out even if it were able to access the entire codebase rather than just one module at a time, for it cannot ignore the possibility that such code may be written in the future.

So the compiler has no choice but to treat the invocation of test13() as if test13() were an ordinary function. But this also requires the compiler, in at least one module, to compile test13() as if it were an ordinary function, i.e., to make it non-inlined.

If that were all there were to this topic, the conclusion would be clear: there is no point to making virtual functions inline because they will not be inlined. This, however, is not actually the case. Here are two reasons that virtual functions may be defined as inline in real programs:

  1. Some compilers, including Open Watcom, in some situations, do inline these functions.
  2. The style guide being used may require that certain functions be defined with the keyword inline, even when they are also virtual.

These are the issues that need exploration:

  1. Can these functions cause overlarge-segment errors for code segments?
  2. Can whether or not these functions are compiled in a given module be controlled?

To explore these issues, the test framework needs to be modified (the result is in Phase 6 or in folder Phase6 of Tutorial.zip):

  1. In test.hpp, make test12() into a virtual function.
  2. In test.hpp, make test13() into a virtual function.
  3. In CPPtest1.cpp, modify test1() to create an object of testClass1 and use it to invoke test11() and test13().

If these compiler options are used (note that -d2i is gone and no debugging switch is present):

for DOS:     -w4 -e25 -zq -od -zm -bt=dos -ml -xs -xr 
for OS/2:    -w4 -e25 -zq -od -zm -bt=os2 -ml -xs -xr 
for Windows: -w4 -e25 -zq -od -zm -bt=windows -ml -xs -xr

then the compiler error will result in CPPmain.cpp.

If the invocation of either test12() or test13() in main() is commented-out, compilation will work; if these linker options are used:

d all SYS dos op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib 
d all SYS windows op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the linker error occurs and the OS/2 map file shows:

CPPtest1_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest1_TEXT2         CODE           AUTO           0001:0000       0001a706
CPPtest1_TEXT3         CODE           AUTO           0002:0000       0000853a
CPPtest1_TEXT4         CODE           AUTO           0003:0000       0000d313
CPPmain_TEXT           CODE           AUTO           0003:d313       00000000
CPPmain_TEXT2          CODE           AUTO           0003:d313       00000000
CPPmain_TEXT3          CODE           AUTO           0004:0000       0000854a

and

Module: CPPtest1.obj(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0001:0000      far testClass1::testClass1()
0001:0054      far testClass1::testClass1( testClass1 const far & )
0001:00aa      far testClass1::~testClass1()
0001:00e0      void far testClass1::test12()
0001:d3f3      void far testClass1::test13()
0002:0000      void far testClass1::test1()
0003:0000+     void far testClass1::test11()
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0004:0000      main_

and if the invocations of both test12() and test13() in main() are commented out, the entry for CPPmain_TEXT3, that is, for main(), becomes:

CPPmain_TEXT3          CODE           AUTO           0003:d313       0000006f

Similarly, if the invocation of test13() is commented out in test1(), the entry for CPPtest1_TEXT3, that is, for test1(), becomes:

CPPtest1_TEXT3         CODE           AUTO           0002:0000       0000005f

Finally, with either test12() or test13() commented out in main(), if the LIBF is removed so that the linker options are:

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

the linker error occurs.

Now the OS/2 map file shows:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       0001a706
CPPmain_TEXT3          CODE           AUTO           0002:0000       0000006f
CPPtest1_TEXT          CODE           AUTO           0002:15f3       00000000
CPPtest1_TEXT2         CODE           AUTO           0002:15f3       00000000
CPPtest1_TEXT3         CODE           AUTO           0002:15f3       0000853a
CPPtest1_TEXT4         CODE           AUTO           0003:0000       0000d313

and

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000      far testClass1::testClass1()
0001:0054      far testClass1::testClass1( testClass1 const far & )
0001:00aa      far testClass1::~testClass1()
0001:00e0      void far testClass1::test12()
0001:d3f3      void far testClass1::test13()
0002:0000      main_
Module: TestLib.lib(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0002:15f3      void far testClass1::test1()
0003:0000+     void far testClass1::test11()

From which these conclusions can be drawn:

  1. Open Watcom 1.6 will inline virtual inline functions when it is certain which version of the function is being called. This is why test12() and test13() cannot both be allowed in main().
  2. All virtual inline functions end up in the same compiler segment as the constructors and destructor. As noted above, this is not an actual compiler segment, but reflects how the linker sees this code.
  3. The constructors, destructor and virtual inline functions are all compiled into every module which instantiates the class. The linker eliminates all but the first copy of the resulting functions.

Since, for each class, all of the functions are compiled into each of the affected modules, all of the functions for that class end up in the same module when linked. So we have, for each class, a single "compiler segment" which contains too much code and which the linker treats as an irreducible block: some form of refactoring is clearly called for. First, though, we need to get our main() function to compile.

Forcing Non-Inlining of Functions

Consider the situation in the main() function: both test12() and test13() cannot be invoked because both are inlined. Is it possible to convince the compiler to not inline, say, test13()?

The solution is shown in Phase 7 or in folder Phase7 of Tutorial.zip, and can be summarized in this way:

  1. Create a pointer to an object of testClass1.
  2. Invoke test13() through the pointer.

The main() function now compiles. Consulting the map file for compiler segment CPPmain_TEXT3, which contains the main() function, now shows:

CPPmain_TEXT3          CODE           AUTO           0004:0000       0000856a

Which shows that the invocation of test13() is no longer inlined.

Creating an otherwise-superfluous pointer may be appropriate when the functions that together cause the compiler error when inlined are all defined with "inline" for a good reason (that is, when refactoring them as ordinary functions is not acceptable).

Refactoring Classes: Inheritance

Refactoring a class is done by separating out the functionality of the given class into two or more classes. How this is done in a real program is a topic to be researched elsewhere; this tutorial presents, in Phase 8 or folder Phase8 of Tutorial.zip, a version of the test framework which uses refactoring by inheritance to solve the over-large code segment problem.

These are the changes made:

  1. Class testClass1 is split into testClass0 and testClass1, as shown.
  2. test1() instantiates no classes and invokes test11() and test13() directly.
  3. CPPtest1.cpp instantiates an object of testClass0 directly above testClass1::test1().

If these compiler options are used:

for DOS:     -w4 -e25 -zq -od -zm -bt=dos -ml -xs -xr 
for OS/2:    -w4 -e25 -zq -od -zm -bt=os2 -ml -xs -xr 
for Windows: -w4 -e25 -zq -od -zm -bt=windows -ml -xs -xr

and these linker options are used (note the absence of LIBF):

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the linker error occurs. The OS/2 map file shows

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       0001a797
CPPmain_TEXT3          CODE           AUTO           0002:0000       00008676
CPPtest1_TEXT          CODE           AUTO           0002:9fb9       00000000
CPPtest1_TEXT1         CODE           AUTO           0002:9fb9       0000004e
CPPtest1_TEXT2         CODE           AUTO           0002:a007       000000e0
CPPtest1_TEXT3         CODE           AUTO           0002:a0e7       00000052
CPPtest1_TEXT4         CODE           AUTO           0003:0000       0000d313

and

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far testClass1::testClass1()
0001:0088+     far testClass1::testClass1( testClass1 const far & )
0001:0112+     far testClass1::~testClass1()
0001:0171+     void far testClass1::test12()
0001:d484      void far testClass0::test13()
0002:0000      main_
Module: TestLib.lib(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0002:a007+     far testClass0::testClass0()
0002:a05b+     far testClass0::testClass0( testClass0 const far & )
0002:a0b1+     far testClass0::~testClass0()
0002:a0e7      void far testClass1::test1()
000e:0780+     testClass0 far tester
0003:0000+     void far testClass1::test11()

where virtual linker segment 000e is a data segment.

This suggests two things:

  1. Although instantiating a class (testClass1) should also instantiate its base class (testClass0), the constructors and destructor of the base class are not always placed in the module.
  2. The virtual inline function test13() is placed in the module, presumably because it is part of testClass1 by inheritance, but is shown as a member of testClass0.

Such observations, however interesting, are not actually relevant to the subject of this tutorial. The result is still that the linker sees a "compiler segment" larger than 64K.
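Checking a long map file by eye for such segments is tedious. The following is a minimal sketch of a helper that reads one segment line in the layout shown above (name, class, group, address, size) and reports whether the size exceeds 64K. The function names are hypothetical, and the layout is assumed to match the listings in this tutorial.

```cpp
#include <cstdio>

// Parses one segment line of the form
//   NAME  CLASS  GROUP  SSSS:OOOO  LLLLLLLL
// Returns true and stores the size in *size when the line matches.
bool parseSegmentSize(const char *line, unsigned long *size) {
    char name[64], cls[16], group[16];
    unsigned int seg = 0, off = 0;
    unsigned long len = 0;
    int fields = std::sscanf(line, "%63s %15s %15s %x:%x %lx",
                             name, cls, group, &seg, &off, &len);
    if (fields != 6) {
        return false;   // not a segment line (for example, a symbol line)
    }
    *size = len;
    return true;
}

// True when the line describes a segment larger than 64K (0x10000 bytes).
bool isOverLarge(const char *line) {
    unsigned long size = 0;
    return parseSegmentSize(line, &size) && size > 0x10000UL;
}
```

Applied to the listing above, only the CPPmain_TEXT2 line (size 0001a797) would be reported.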

As might be expected, if these linker options are used:

d all SYS dos op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the executables are produced. The OS/2 map file shows

CPPtest1_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest1_TEXT1         CODE           AUTO           0001:0000       0000004e
CPPtest1_TEXT2         CODE           AUTO           0001:004e       0000d3f3
CPPtest1_TEXT3         CODE           AUTO           0001:d441       00000052
CPPtest1_TEXT4         CODE           AUTO           0002:0000       0000d313
CPPmain_TEXT           CODE           AUTO           0002:d313       00000000
CPPmain_TEXT2          CODE           AUTO           0003:0000       0000d484
CPPmain_TEXT3          CODE           AUTO           0004:0000       00008676

and

Module: CPPtest1.obj(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0001:004e+     far testClass0::testClass0()
0001:00a2+     far testClass0::testClass0( testClass0 const far & )
0001:00f8+     far testClass0::~testClass0()
0001:012e      void far testClass0::test13()
0001:d441      void far testClass1::test1()
000d:075a+     testClass0 far tester
0002:0000+     void far testClass1::test11()
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0003:0000+     far testClass1::testClass1()
0003:0088+     far testClass1::testClass1( testClass1 const far & )
0003:0112+     far testClass1::~testClass1()
0003:0171+     void far testClass1::test12()
0004:0000      main_

where virtual linker segment 000d is a data segment.

This shows that refactoring by inheritance can work when the new base class is instantiated explicitly and LIBF is used to ensure that the module using it is linked first.

Refactoring Classes: Composition

This tutorial now presents, as shown in Phase 9 or folder Phase9 of Tutorial.zip, a version of the test framework which uses refactoring by composition to solve the over-large code segment problem.

These are the changes made:

  1. Classes testClass0 and testClass1 are modified so that testClass1 includes an object of testClass0 as a data member and adds a version of test13() which invokes testClass0::test13().
  2. test2(), test11() and test12() use testClass1::tester to access data members Judy and Fred.
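A rough sketch of this arrangement follows. It is not the Phase 9 code: the bodies are placeholders, copying is simply disabled to keep the example short, and tester is shown as a pointer, which is the form this tutorial ends up using for reasons given further on.

```cpp
// Sketch only: placeholder bodies, not the code from Tutorial.zip.
class testClass0 {
public:
    testClass0() : Judy(0), Fred(0) {}
    virtual ~testClass0() {}
    virtual void test13() { Judy += 13; }   // the bulky virtual inline
    int Judy;   // public so that testClass1 can reach them
    int Fred;
};

class testClass1 {
public:
    testClass1() : tester(new testClass0) {}
    virtual ~testClass1() { delete tester; }
    // Forwards to the contained object through the pointer.
    virtual void test13() { tester->test13(); }
    void test11() { tester->Judy += 11; }
    virtual void test12() { tester->Fred += 12; }
    testClass0 *tester;
private:
    // Copying is declared but left unimplemented to keep the sketch short.
    testClass1(const testClass1 &);
    testClass1 &operator=(const testClass1 &);
};
```

Here testClass1::test13() is itself implicitly inline, since it is defined in the class body.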

If these compiler options are used:

for DOS:     -w4 -e25 -zq -od -zm -bt=dos -ml -xs -xr 
for OS/2:    -w4 -e25 -zq -od -zm -bt=os2 -ml -xs -xr 
for Windows: -w4 -e25 -zq -od -zm -bt=windows -ml -xs -xr

and these linker options are used (note the absence of LIBF):

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the linker error occurs. The OS/2 map file shows

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       00015a57
CPPmain_TEXT3          CODE           AUTO           0002:0000       000085fe
CPPtest1_TEXT          CODE           AUTO           0002:9c03       00000000
CPPtest1_TEXT1         CODE           AUTO           0002:9c03       0000004e
CPPtest1_TEXT2         CODE           AUTO           0002:9c51       000000e0
CPPtest1_TEXT3         CODE           AUTO           0002:9d31       00000053
CPPtest1_TEXT4         CODE           AUTO           0003:0000       00008514

and

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far testClass1::testClass1()
0001:00d7+     far testClass1::testClass1( testClass1 const far & )
0001:01b0+     far testClass1::~testClass1()
0001:01e6+     void far testClass1::test13()
0001:0230+     void far testClass1::test12()
0001:8744      void far testClass0::test13()
0002:0000      main_
Module: TestLib.lib(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0002:9c51+     far testClass0::testClass0()
0002:9ca5+     far testClass0::testClass0( testClass0 const far & )
0002:9cfb+     far testClass0::~testClass0()
0002:9d31      void far testClass1::test1()
0009:0334+     testClass0 far tester
0003:0000+     void far testClass1::test11()

where virtual linker segment 0009 is a data segment. Note that testClass1::test13() appears even though it is implicitly inlined: this confirms that all virtual inline functions (whether inlined explicitly or implicitly) are seen by the linker as in the same "compiler segment" as the constructors and destructor of the class instantiated.

As might be expected, if these linker options are used:

d all SYS dos op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest1.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the executables are produced. The OS/2 map file shows

CPPtest1_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest1_TEXT1         CODE           AUTO           0001:0000       0000004e
CPPtest1_TEXT2         CODE           AUTO           0001:004e       0000d3f3
CPPtest1_TEXT3         CODE           AUTO           0001:d441       00000053
CPPtest1_TEXT4         CODE           AUTO           0002:0000       00008514
CPPmain_TEXT           CODE           AUTO           0002:8514       00000000
CPPmain_TEXT2          CODE           AUTO           0003:0000       00008744
CPPmain_TEXT3          CODE           AUTO           0004:0000       000085fe

and

Module: CPPtest1.obj(E:\ProgDev\Cpp\Tutorial\CPPtest1.cpp)
0001:004e+     far testClass0::testClass0()
0001:00a2+     far testClass0::testClass0( testClass0 const far & )
0001:00f8+     far testClass0::~testClass0()
0001:012e      void far testClass0::test13()
0001:d441      void far testClass1::test1()
0009:0334+     testClass0 far tester
0002:0000+     void far testClass1::test11()
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0003:0000+     far testClass1::testClass1()
0003:00d7+     far testClass1::testClass1( testClass1 const far & )
0003:01b0+     far testClass1::~testClass1()
0003:01e6+     void far testClass1::test13()
0003:0230+     void far testClass1::test12()
0004:0000      main_

where virtual linker segment 0009 is a data segment.

This shows that refactoring by composition can work when the new class (testClass0) is instantiated explicitly and LIBF is used to ensure that the module using it is linked first.

Virtual Inline Functions in Real Programs

Both forms of refactoring end with the same solution: refactor the class causing the problem and instantiate the new class in a separate module which is linked first with LIBF. Now, instantiating a class solely for this reason may seem to be a bit heavy-handed. But suppose the test framework had been designed from the start with testClass0 and testClass1. Then refactoring would not have been needed: use of LIBF would have been enough -- if a module already existed which instantiated testClass0 but not testClass1.

In a real program, classes are created and structured (and refactored) the way they are for definite reasons, which vary from program to program. One such reason is being able to use the functionality of a class directly, rather than indirectly through a derived class or a class containing an instance of it as a data member. It is therefore quite possible that neither refactoring nor artificial class instantiation is needed to solve this problem: all that may be needed is to identify a module which instantiates an appropriate class (one used as a base class or as a data member of the class causing the problem) and use LIBF to link that module first.

Finding such a module (or set of modules) is a matter of trial-and-error and may have to be redone if the code is altered. But it does avoid the artificiality of otherwise-unneeded instantiations.

Although not strictly on-topic, there are two items that should be mentioned.

The first is that testClass1::tester is a pointer to testClass0 rather than an object of testClass0 because, when tried as an object, testClass1::test13() ended up with testClass0::test13() inlined inside it, and even using LIBF on CPPtest1.obj was not enough to solve the linker problem. This is a practical application of the technique discussed above for forcing the non-inlining of an inline function by accessing it through a pointer.

And I would be remiss if I did not point out that using inheritance as shown required the data members to be changed from private to protected and that using composition as shown required them to become public data members. While protected data members may be acceptable, public data members are almost never acceptable in real programs. There are alternatives to doing this:

  1. Provide accessor functions in testClass0, so the data members can remain private.
  2. Keep the data members in testClass1 and change test13() to accept two int parameters.

With accessor functions, the first two lines of, for example, testClass1::test11() might look like this:

int Judy = tester->getJudy();
int Fred = tester->getFred();

which is not that much of a change when composition is used.

This technique, which uses function-local variables to capture the state of the object on entry, can be useful in some situations, but it can also cause problems. For example:

  1. If the values change, then they usually must be written back to the data members before the function exits. This works fine in a single-threaded environment.
  2. In a multi-threaded environment, synchronization must be considered: without it, the state can change in mid-function execution with no way to change the local values (which may, of course, be exactly what you want) and race conditions may occur if the values are written back; with it, deadlock and the other problems resulting from synchronization and blocking may occur.
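The single-threaded case can be sketched as follows. getJudy() and getFred() are the accessor names used above; the setters, the constructor taking a pointer, and the arithmetic are assumptions made for illustration only.

```cpp
class testClass0 {
public:
    testClass0() : Judy(0), Fred(0) {}
    int getJudy() const { return Judy; }
    int getFred() const { return Fred; }
    void setJudy(int value) { Judy = value; }   // hypothetical setter
    void setFred(int value) { Fred = value; }   // hypothetical setter
private:
    int Judy;   // the data members can remain private
    int Fred;
};

class testClass1 {
public:
    // Hypothetical constructor: the caller supplies the testClass0 object.
    explicit testClass1(testClass0 *t) : tester(t) {}
    void test11();
private:
    testClass0 *tester;
};

void testClass1::test11() {
    // Capture the state once on entry.
    int Judy = tester->getJudy();
    int Fred = tester->getFred();

    // Work on the local copies (placeholder arithmetic).
    Judy += 11;
    Fred += Judy;

    // Write the results back before the function exits.
    tester->setJudy(Judy);
    tester->setFred(Fred);
}
```

Each accessor is called once per function, however many times the values are used in between.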

On the other hand, accessor functions may increase code size to the point where, if used frequently enough, they cause it to exceed the 64K segment limit. Using function-local variables is one solution to this problem.

Template Effects

Templates (so far) only exist in C++, not in C. The test framework will be C++-only for this entire section.

This will not be a discussion of template programming. Instead, it will explore how template instantiation affects code segment size, and what can be done to solve any problems that appear.

The terminology used with templates is not always clear. This is how it will be used here:

  1. A class template is a template which defines a class. No code is generated by a class template.
  2. A function template is a template which defines a non-member function. No code is generated by a function template.
  3. A template class is a class template which has been instantiated. Code is generated for a template class.
  4. A template function is a function template which has been instantiated. Code is generated for a template function.
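In code, the four terms look like this. Pair, larger and useTemplates are hypothetical names chosen for this illustration; they do not appear in the test framework.

```cpp
// Class template: by itself this generates no code.
template <class T>
class Pair {
public:
    Pair(T a, T b) : first(a), second(b) {}
    T sum() const { return first + second; }
private:
    T first;
    T second;
};

// Function template: by itself this generates no code either.
template <class T>
T larger(T a, T b) { return (a < b) ? b : a; }

// Using Pair<int> and larger<int> instantiates them: Pair<int> is a
// template class and larger<int> is a template function, and code is
// generated for both.
int useTemplates() {
    Pair<int> p(3, 4);
    return larger(p.sum(), 5);
}
```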

To explore this topic, the test framework must be extensively modified. The result can be found in Phase 10 or in folder Phase10 of Tutorial.zip. Note that CPPtest1.cpp and CPPtest2.cpp are empty and can be ignored (for now). For IDE users, this means that the .LIB files can be removed (for now) from the .EXE targets.

This will now compile with these options:

-w4 -d25 -zq -od -zm -bt=dos -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=os2 -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=windows -ml -xs -xr

but when linked with these options:

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj

it fails to link because of an over-large code segment.

The OS/2 map file shows:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       00000082
CPPmain_TEXT3          CODE           AUTO           0002:0000       0003d253

and:

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000      main_
0002:0000+     far testClass1<int far >::testClass1()
0002:0044+     far testClass1<int far >::testClass1( testClass1<int far > const far & )
0002:008a+     far testClass1<int far >::~testClass1()
0002:00c0+     void far test3( int )
0002:85bd+     void far testClass1<int far >::test1()
0002:85fa+     void far testClass1<int far >::test11()
0002:590d+     void far testClass1<int far >::test12()
0002:2c20+     void far testClass1<int far >::test13()
0002:ff33+     void far testClass1<int far >::test2()

which looks very much like what happened with virtual inline functions: the entire template class is seen, by the linker, as being in a single "compiler segment" which can be over 64K in size.

As was the case with virtual inline functions, the template class must be refactored before LIBF can be used to solve the problem.

Refactoring Templates: Inheritance

In working with templates, it is important not to get too attached to them. Although many of the constructs which result when a class template is refactored will be class templates using the same template parameters as the original class template, some member functions may not depend on the template parameters and so can be grouped into a normal class. Thus, the refactored test framework shown in Phase 11 or in folder Phase11 of Tutorial.zip has a four-class linear hierarchy with both normal and template base classes.

Note for IDE users: you should add the library to the .EXE targets again. Also, CPPtest3.cpp and CPPtest4.cpp should be added to the library projects.

It might be asked why the lines

   T Fred = this->Fred;
   T Judy = this->Judy;

have been placed at the start and the lines

   this->Fred = Fred;
   this->Judy = Judy;

have been placed at the end of test12(), test13() and test2(). The initial form of this code did not have them, and, while it compiled for DOS and OS/2, it did not compile for Windows. After considerable discussion and assistance on the newsgroup, the reason finally appeared:

  1. The Windows code is larger than the DOS and OS/2 code for reasons having to do with the requirements imposed by Windows.
  2. The references to "Fred" and "Judy" were always, in C++, actually offsets from the value of "this". In Phase10, the offset for "Fred" was zero; in Phase11, it is eight. This changes the resulting instruction to include the offset, which increases its size. Since this happens 2000 times, the code size becomes considerably larger.
  3. The DOS and OS/2 code size increased but stayed below 64K; the Windows code size was pushed above 64K.

This illustrates the use of function-local variables (discussed above) to solve an overlarge code segment problem resulting from, in this case, refactoring the code. The same problem could easily occur if accessor functions were used. In this case, resetting the data members "Fred" and "Judy" at the end of the function is also illustrated.
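Put together in one member function, the pattern looks roughly like this. The class is a cut-down stand-in for the Phase 11 code: the loop replaces the long generated body, and getFred() and getJudy() are hypothetical helpers added so the result can be checked.

```cpp
template <class T>
class testClass0 {
public:
    testClass0() : Fred(0), Judy(0) {}
    void test12();
    T getFred() const { return Fred; }   // hypothetical helper
    T getJudy() const { return Judy; }   // hypothetical helper
protected:
    T Fred;
    T Judy;
};

template <class T>
void testClass0<T>::test12() {
    T Fred = this->Fred;    // one load through "this"
    T Judy = this->Judy;

    // Stand-in for the long body: every use below refers to a local
    // variable, not to a member at an offset from "this".
    for (int i = 0; i < 3; ++i) {
        Fred = Fred + 1;
        Judy = Judy + Fred;
    }

    this->Fred = Fred;      // one store back
    this->Judy = Judy;
}
```

Only the four lines at the start and end of the function touch the data members themselves.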

When compiled with these options:

-w4 -d25 -zq -od -zm -bt=dos -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=os2 -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=windows -ml -xs -xr

and linked with these options:

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

the linker error occurs.

The OS/2 map file shows:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       00021d4b
CPPmain_TEXT3          CODE           AUTO           0002:0000       00000265

and:

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0001:0000+     far baseTest::baseTest()
0001:0054+     far baseTest::baseTest( baseTest const far & )
0001:00aa+     far baseTest::~baseTest()
0002:0000      main_
0001:01a5+     far testClass0<int far >::testClass0()
0001:028e+     far testClass0<int far >::testClass0( testClass0<int far > const far & )
0001:0379+     far testClass0<int far >::~testClass0()
0001:04ba+     far testClass2<int far >::testClass2()
0001:0613+     far testClass2<int far >::testClass2( testClass2<int far > const far & )
0001:076e+     far testClass2<int far >::~testClass2()
0001:08bf+     void far test3( int )
0001:8dbc+     void far testClass0<int far >::test12()
0001:12d3+     void far testClass1<int far >::test13()
0001:97ea+     void far testClass2<int far >::test1()
0001:9827+     void far testClass2<int far >::test2()

which shows that the linker is seeing all of the template code as part of a single "compiler segment". It also shows that, in some cases, base class constructors and destructors are produced by the compiler, although the exact conditions affecting this are outside the scope of this tutorial.

The same functions are in the same segment with Open Watcom 1.8; however, they are in a different order.

If LIBF is used with the linker, so that the linker options are:

d all SYS dos op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the test framework will now link with versions of Open Watcom prior to 1.8.

Now the OS/2 map file shows:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest2_TEXT2         CODE           AUTO           0001:0000       00008517
CPPtest3_TEXT          CODE           AUTO           0001:8517       00000000
CPPtest3_TEXT2         CODE           AUTO           0002:0000       00008517
CPPtest4_TEXT          CODE           AUTO           0002:8517       00000000
CPPtest4_TEXT2         CODE           AUTO           0003:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0003:84fd       00000023
CPPmain_TEXT           CODE           AUTO           0003:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0004:0000       00008e20
CPPmain_TEXT3          CODE           AUTO           0004:8e20       00000265

and:

Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0001:0000      void far testClass0<int far >::test12()
Module: CPPtest3.obj(E:\ProgDev\Cpp\Tutorial\CPPtest3.cpp)
0002:0000      void far testClass1<int far >::test13()
Module: CPPtest4.obj(E:\ProgDev\Cpp\Tutorial\CPPtest4.cpp)
0003:84fd*     void far doTest3()
0003:0000      void far test3( int )
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0004:0000+     far baseTest::baseTest()
0004:0054+     far baseTest::baseTest( baseTest const far & )
0004:00aa+     far baseTest::~baseTest()
0004:8e20      main_
0004:01a5+     far testClass0<int far >::testClass0()
0004:028e+     far testClass0<int far >::testClass0( testClass0<int far > const far & )
0004:0379+     far testClass0<int far >::~testClass0()
0004:04ba+     far testClass2<int far >::testClass2()
0004:0613+     far testClass2<int far >::testClass2( testClass2<int far > const far & )
0004:076e+     far testClass2<int far >::~testClass2()
0004:08bf+     void far testClass2<int far >::test1()
0004:08fc+     void far testClass2<int far >::test2()

This works by moving the code out of CPPmain.cpp into other modules. It is interesting that this works when CPPtest2.cpp and CPPtest3.cpp include only a typedef: unlike virtual inline functions, where an object had to be created to force generation of the function, for a template a typedef is sufficient, at least in Open Watcom 1.6. Also, the linker knows nothing of typedefs and so eliminates duplicate instantiations based entirely on the template class signature. Thus, the typedef need not be used anywhere else.

For Open Watcom 1.8, the only function that moves is template function test3< int far >(). If the data from the OS/2 MAP file is considered:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest3_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest4_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest4_TEXT2         CODE           AUTO           0001:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0001:84fd       00000023
CPPmain_TEXT           CODE           AUTO           0001:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0002:0000       00019a41
CPPmain_TEXT3          CODE           AUTO           0003:0000       000003c0

it is clear that the LIBF modules were linked in the desired order.

However, CPPtest2.obj and CPPtest3.obj each consists entirely of a typedef. The conclusion to be drawn is that typedefs no longer generate template instantiations.
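For comparison, standard C++ provides an explicit instantiation directive, which asks the compiler to generate a template class in a particular module. The sketch below shows it next to the typedef form. It has not been validated against the test framework: whether Open Watcom 1.8 accepts it, and whether it changes where the linker places the code, would have to be confirmed with the map file. The class is a cut-down stand-in, and forcedByTypedef and getFred() are hypothetical names.

```cpp
template <class T>
class testClass0 {
public:
    testClass0() : Fred(0) {}
    void test12();
    T getFred() const { return Fred; }   // hypothetical helper
private:
    T Fred;
};

template <class T>
void testClass0<T>::test12() { Fred = Fred + 12; }

// The typedef form: this only names the template class. Standard C++
// does not require any code to be generated for it.
typedef testClass0<int> forcedByTypedef;   // hypothetical name

// Standard explicit instantiation: requests that the members of
// testClass0<int> be generated in this translation unit.
template class testClass0<int>;
```

The directive must appear after the member function definitions it is meant to instantiate.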

Refactoring Templates: Composition

The test framework refactored to use composition is shown in Phase 12 and in folder Phase12 of Tutorial.zip.

When compiled with these options:

-w4 -d25 -zq -od -zm -bt=dos -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=os2 -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=windows -ml -xs -xr

and linked with these options:

d all SYS dos op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the linker error results.

The OS/2 map file shows:

CPPmain_TEXT           CODE           AUTO           0001:0000       00000000
CPPmain_TEXT2          CODE           AUTO           0001:0000       00021638
CPPmain_TEXT3          CODE           AUTO           0002:0000       000000d4

and:

Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0002:0000      main_
0001:0000+     far testClass2<int far >::testClass2()
0001:009c+     far testClass2<int far >::testClass2( testClass2<int far > const far & )
0001:013a+     far testClass2<int far >::~testClass2()
0001:0170+     void far test3( int )
0001:866d+     void far testClass0<int far >::test12()
0001:0b84+     void far testClass1<int far >::test13()
0001:909b+     void far testClass2<int far >::test1()
0001:9108+     void far testClass2<int far >::test2()

which shows that the linker is viewing the template code as one over-large "compiler segment". The absence of testClass0<int>::test11(), testClass1<int>::test11(), testClass1<int>::test12(), testClass2<int>::test11(), testClass2<int>::test12(), and testClass2<int>::test13() suggests that their invocations were, in fact, inlined.

If LIBF is used with the linker, so that the linker options are:

d all SYS dos op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the test framework will now link with versions of Open Watcom prior to 1.8.

Now the OS/2 map file shows:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest2_TEXT2         CODE           AUTO           0001:0000       00008517
CPPtest3_TEXT          CODE           AUTO           0001:8517       00000000
CPPtest3_TEXT2         CODE           AUTO           0002:0000       00008517
CPPtest4_TEXT          CODE           AUTO           0002:8517       00000000
CPPtest4_TEXT2         CODE           AUTO           0003:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0003:84fd       00000023
CPPmain_TEXT           CODE           AUTO           0003:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0004:0000       0000870d
CPPmain_TEXT3          CODE           AUTO           0004:870d       000000d4

and:

Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0001:0000      void far testClass0<int far >::test12()
Module: CPPtest3.obj(E:\ProgDev\Cpp\Tutorial\CPPtest3.cpp)
0002:0000      void far testClass1<int far >::test13()
Module: CPPtest4.obj(E:\ProgDev\Cpp\Tutorial\CPPtest4.cpp)
0003:84fd*     void far doTest3()
0003:0000      void far test3( int )
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0004:870d      main_
0004:0000+     far testClass2<int far >::testClass2()
0004:009c+     far testClass2<int far >::testClass2( testClass2<int far > const far & )
0004:013a+     far testClass2<int far >::~testClass2()
0004:0170+     void far testClass2<int far >::test1()
0004:01dd+     void far testClass2<int far >::test2()

which shows that LIBF works by moving the code for testClass0<int>::test12(), testClass1<int>::test13() and test3() out of module CPPmain.cpp.

For Open Watcom 1.8, the only function that moves is template function test3< int far >(). If the data from the OS/2 MAP file is considered:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest3_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest4_TEXT          CODE           AUTO           0001:0000       00008520
CPPmain_TEXT           CODE           AUTO           0001:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0002:0000       00019169
CPPmain_TEXT3          CODE           AUTO           0003:0000       0000020e

it is clear that the LIBF modules were linked in the desired order.

However, CPPtest2.obj and CPPtest3.obj each consists entirely of a typedef. The conclusion to be drawn is that typedefs no longer generate template instantiations.

The <string> Header

This tutorial was prompted by a Usenet post in which it was noted that a C++ file containing one line:

#include<string>

generates over 30K of code!

This is not hard to explore; the test framework from Phase 11 altered for this purpose is shown in Phase 13 or in folder Phase13 of Tutorial.zip.

Note to IDE users: CPPtest5.cpp will need to be added to TestLib.

When compiled with these options:

-w4 -d25 -zq -od -zm -bt=dos -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=os2 -ml -xs -xr
-w4 -d25 -zq -od -zm -bt=windows -ml -xs -xr

and linked with these options:

d all SYS dos op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the linker error results.

The OS/2 map file shows:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest2_TEXT2         CODE           AUTO           0001:0000       00008517
CPPtest3_TEXT          CODE           AUTO           0001:8517       00000000
CPPtest3_TEXT2         CODE           AUTO           0002:0000       00008517
CPPtest4_TEXT          CODE           AUTO           0002:8517       00000000
CPPtest4_TEXT2         CODE           AUTO           0003:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0003:84fd       00000023
CPPmain_TEXT           CODE           AUTO           0003:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0004:0000       00018164
CPPmain_TEXT3          CODE           AUTO           0005:0000       000001e9

and:

Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0001:0000      void far testClass0<int far >::test12()
Module: CPPtest3.obj(E:\ProgDev\Cpp\Tutorial\CPPtest3.cpp)
0002:0000      void far testClass1<int far >::test13()
Module: CPPtest4.obj(E:\ProgDev\Cpp\Tutorial\CPPtest4.cpp)
0003:84fd*     void far doTest3()
0003:0000      void far test3( int )
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0004:0000      far std::exception::exception()
0004:0086      far std::exception::exception( std::exception const far & )
0004:0124      far std::exception::~exception()

and so on at incredible length. All the code shown in CPPmain.obj before is still present; the code from header <string> is present as well for Open Watcom versions prior to 1.8. And it is clear that the code from all templates, related to each other or not, will be seen by the linker as part of a single "compiler segment".

If CPPtest5.obj is added to the LIBF directive, so that these linker options are used:

d all SYS dos op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj,CPPtest5.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS os2 op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj,CPPtest5.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib
d all SYS windows op m LIBF CPPtest2.obj,CPPtest3.obj,CPPtest4.obj,CPPtest5.obj op maxe=25 op q FIL CPPmain.obj LIBR TestLib.lib

then the test framework will link for Open Watcom versions prior to 1.8.

Now the OS/2 map file shows:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest2_TEXT2         CODE           AUTO           0001:0000       00008517
CPPtest3_TEXT          CODE           AUTO           0001:8517       00000000
CPPtest3_TEXT2         CODE           AUTO           0002:0000       00008517
CPPtest4_TEXT          CODE           AUTO           0002:8517       00000000
CPPtest4_TEXT2         CODE           AUTO           0003:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0003:84fd       00000023
CPPtest5_TEXT          CODE           AUTO           0003:8520       00000000
CPPtest5_TEXT2         CODE           AUTO           0004:0000       0000f344
CPPmain_TEXT           CODE           AUTO           0004:f344       00000000
CPPmain_TEXT2          CODE           AUTO           0005:0000       00008e20
CPPmain_TEXT3          CODE           AUTO           0005:8e20       000001e9

and:

Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0001:0000      void far testClass0<int far >::test12()
Module: CPPtest3.obj(E:\ProgDev\Cpp\Tutorial\CPPtest3.cpp)
0002:0000      void far testClass1<int far >::test13()
Module: CPPtest4.obj(E:\ProgDev\Cpp\Tutorial\CPPtest4.cpp)
0003:84fd*     void far doTest3()
0003:0000      void far test3( int )
Module: CPPtest5.obj(E:\ProgDev\Cpp\Tutorial\CPPtest5.cpp)
0004:0000      far std::exception::exception()
0004:0086      far std::exception::exception( std::exception const far & )
0004:0124      far std::exception::~exception()
<and so forth>
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0005:0000+     far baseTest::baseTest()
0005:0054+     far baseTest::baseTest( baseTest const far & )
0005:00aa+     far baseTest::~baseTest()
0005:8e20      main_
0005:01a5+     far testClass0<int far >::testClass0()
0005:028e+     far testClass0<int far >::testClass0( testClass0<int far > const far & )
0005:0379+     far testClass0<int far >::~testClass0()
0005:04ba+     far testClass2<int far >::testClass2()
0005:0613+     far testClass2<int far >::testClass2( testClass2<int far > const far & )
0005:076e+     far testClass2<int far >::~testClass2()
0005:08bf+     void far testClass2<int far >::test1()
0005:08fc+     void far testClass2<int far >::test2()

This works by shifting the code generated by header <string> from module CPPmain.cpp to module CPPtest5.cpp. The size of CPPtest5_TEXT2 (that is, of the code generated by header <string>) is 0x0000f344 or 62,276 bytes (60.8 KB), rather more than the 30K reported!
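
The size arithmetic above can be checked with a small sketch. The helper names (SEGMENT_LIMIT, fitsInSegment, headroom) are hypothetical and are not part of Open Watcom; the sketch also assumes that a segment of exactly 64K (65,536 bytes) is still acceptable, which the tutorial does not state.

```cpp
// Hypothetical helpers (not part of Open Watcom) for checking a segment
// size taken from a MAP file against the 16-bit segment limit.

// Assumption: the limit is 64K = 0x10000 = 65,536 bytes, inclusive.
const unsigned long SEGMENT_LIMIT = 0x10000UL;

// True when a code segment of the given size does not exceed the limit.
bool fitsInSegment(unsigned long size)
{
    return size <= SEGMENT_LIMIT;
}

// Bytes remaining before the segment reaches the limit (0 if it is over).
unsigned long headroom(unsigned long size)
{
    return fitsInSegment(size) ? SEGMENT_LIMIT - size : 0UL;
}
```

For CPPtest5_TEXT2, headroom(0xf344UL) is 65,536 - 62,276 = 3,260 bytes, so the segment fits, but not by much.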

It might be wondered why the code generated by header <string> starts with std::exception. The reason is not hard to determine: the header <string> is required to throw exceptions, so it includes the header <stdexcep>; the header <stdexcep> extends class std::exception and adds a data member of class std::string. If std::string were defined in the header <string>, this would create a circular inclusion, since the header <stdexcep> would have to include the header <string>; to avoid this, the definition of std::string is in the internal header <_strdef.h>. And so the header <_strdef.h> provides typedefs for template classes std::string and std::wstring. The header <string> provides a typedef for template class std::istring.

The net effect is that including the header <string> also includes three typedefs which generate three versions of class template basic_string<>, along with all of the related classes and template classes.

At the moment, of course, this is not a problem: the size is under the 64K limit. However, the header <string> has been described on the newsgroup as not yet finished, so it may eventually become a problem.

For Open Watcom 1.8, the situation is much simpler: since typedefs do not instantiate templates, no code appears as a result of simply including header <string>.

Open Watcom 1.8

The test framework refactored to allow the Phase 13 code to link with Open Watcom 1.8 is shown in Phase 14 and in folder Phase14 of Tutorial.zip.

The OS/2 MAP file now shows:

CPPtest2_TEXT          CODE           AUTO           0001:0000       00000000
CPPtest2_TEXT2         CODE           AUTO           0001:0000       00008908
CPPtest2_TEXT3         CODE           AUTO           0001:8908       00000166
CPPtest3_TEXT          CODE           AUTO           0001:8a6e       00000000
CPPtest3_TEXT2         CODE           AUTO           0002:0000       00008818
CPPtest3_TEXT3         CODE           AUTO           0002:8818       00000202
CPPtest4_TEXT          CODE           AUTO           0002:8a1a       00000000
CPPtest4_TEXT2         CODE           AUTO           0003:0000       000084fd
CPPtest4_TEXT3         CODE           AUTO           0003:84fd       00000023
CPPmain_TEXT           CODE           AUTO           0003:8520       00000000
CPPmain_TEXT2          CODE           AUTO           0004:0000       00008d66
CPPmain_TEXT3          CODE           AUTO           0004:8d66       00000274

and:

Module: CPPtest2.obj(E:\ProgDev\Cpp\Tutorial\CPPtest2.cpp)
0001:0000      far baseTest::baseTest()
0001:0054      far baseTest::baseTest( baseTest const far & )
0001:00aa      far baseTest::~baseTest()
0001:8908*     void far test0()
0001:01aa      far testClass0<int far >::testClass0()
0001:0293      far testClass0<int far >::~testClass0()
0001:03f1      void far testClass0<int far >::test12()
Module: CPPtest3.obj(E:\ProgDev\Cpp\Tutorial\CPPtest3.cpp)
0002:8818*     void far test1()
0002:0000      far testClass1<int far >::testClass1()
0002:0149      far testClass1<int far >::~testClass1()
0002:0301      void far testClass1<int far >::test13()
Module: CPPtest4.obj(E:\ProgDev\Cpp\Tutorial\CPPtest4.cpp)
0003:84fd*     void far doTest3()
0003:0000      void far test3<int far >( int )
Module: CPPmain.obj(E:\ProgDev\Cpp\Tutorial\CPPmain.cpp)
0004:8d66      main_
0004:0000+     far std::allocator<char far >::allocator()
0004:0035+     far std::allocator<char far >::~allocator()
0004:006b+     far std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far >::basic_string( std::allocator<char far > const far & )
0004:0172+     far std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far >::basic_string( std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far > const far & )
0004:02d0+     far std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far >::~basic_string()
0004:03c5+     far testClass2<int far >::testClass2()
0004:0573+     far testClass2<int far >::~testClass2()
0004:0785+     void far testClass2<int far >::test1()
0004:07c2+     void far testClass2<int far >::test2()
0004:8ce6+     char far * far std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far >::alloc( int unsigned, int unsigned far & )
000d:0060+     int unsigned const far std::basic_string<char far,std::char_traits<char far > far,std::allocator<char far > far >::npos

from which it is clear that the change in the C++ compiler starting with Open Watcom 1.8 has these effects:

  1. typedefs no longer instantiate templates.
  2. Creating an object of a template class instantiates the constructor(s) and destructor for that class and any other classes used in constructing the object, but no other functions.
  3. LIBF still works, but only when template functions (including template class member functions) are used in the modules used with LIBF.
  4. Commenting-out the invocation of tester1.test2() in CPPmain.cpp results in the line
0004:07c2+     void far testClass2<int far >::test2()

vanishing from the MAP file: member functions which are not used are not instantiated at all.
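
The effects listed above match the way standard C++ handles implicit instantiation, and can be illustrated with a minimal sketch. The names Widget, IntWidget and demo are hypothetical and do not come from the test framework; the sketch assumes a compiler that, like Open Watcom 1.8 as described here, instantiates member functions only when they are used.

```cpp
// Illustrative only: Widget, IntWidget and demo are hypothetical names.
template <class T>
class Widget {
public:
    Widget() : value_(T()) {}

    // Instantiated for T = int because demo() calls it.
    T used() const { return value_; }

    // Never called. If it were instantiated for T = int it would fail to
    // compile, because int has no member named no_such_member.
    void unused() { value_.no_such_member(); }

private:
    T value_;
};

// The typedef only names Widget<int>; by itself it instantiates nothing.
typedef Widget<int> IntWidget;

// Creating the object instantiates the constructor (and the implicit
// destructor); calling used() instantiates used(). unused() is left alone.
int demo()
{
    IntWidget w;
    return w.used();
}
```

That this compiles at all shows that unused() was not instantiated. A compiler that instantiated every member of Widget<int>, whether from the typedef or from the object, would be expected to reject the body of unused().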

This greatly simplifies the task of separating out the member functions of a template class when it is necessary to determine if one or more of them is generating, all by itself, more than 64K bytes of code:

  1. Create a module that instantiates an object of the template class; this will produce the constructor(s) and destructor only for that class and any other classes needed to produce the object.
  2. For each member function, create a module that instantiates an object of the template class and invokes that function (only).
  3. Using LIBF, link in the first module described first, and then each of the others in turn. The result will be that the constructor(s)/destructor are in one module, and each member function is in its own module.

If these modules all compile and link, then none of the member functions generates 64K bytes of code all by itself.
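
The three steps above might be laid out as in the following sketch. It is written as a single file so that it can be compiled as it stands; in practice each marked section would be its own source file, compiled separately and named in the LIBF directive. The template Sample, its members, and the file names in the comments are all hypothetical.

```cpp
// --- hypothetical header Sample.h ---------------------------------------
// Stands in for a template class whose member functions are very large.
template <class T>
class Sample {
public:
    Sample() {}
    T first(T v)  { return v + v; }   // placeholder for a large member function
    T second(T v) { return v * v; }   // placeholder for another one
};

// --- hypothetical module SampCtor.cpp (step 1) ---------------------------
// Creates an object and nothing else: only the constructor(s) and
// destructor are instantiated here.
void instantiateObjectOnly()
{
    Sample<int> obj;
    (void)obj;   // silence "unused variable" warnings
}

// --- hypothetical module SampFst.cpp (step 2, first member) --------------
// Creates an object and invokes first() only.
int invokeFirstOnly()
{
    Sample<int> obj;
    return obj.first(5);
}

// --- hypothetical module SampSnd.cpp (step 2, second member) -------------
// Creates an object and invokes second() only.
int invokeSecondOnly()
{
    Sample<int> obj;
    return obj.second(5);
}
```

With the modules listed in LIBF in this order, the intent is that the constructor and destructor come from the first module, first() from the second and second() from the third, so that each member function's size can be read from its own segment in the MAP file.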

Templates in Real Programs

When a class or class template is refactored into new classes or class templates (or one or more of each), the resulting classes/class templates will almost always need to be related to each other in some manner, since they will still have to provide the same functionality as the original class or class template.

When a class or class template is refactored into exactly two new classes or class templates (or one of each), then only two choices for relating them exist: inheritance or composition.

When a class or class template is refactored into more than two new classes or class templates (or any combination of classes and class templates) then each pair of classes/class templates that are to be related directly to each other may be related by inheritance or by composition independently of how other such pairs are related. The resulting structure should, of course, be whatever makes sense for the specific program you are writing.
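
As a minimal sketch of the two choices, suppose one class template has been split into two pieces. All of the names below (Accumulator, AveragerByInheritance, AveragerByComposition and the two helper functions) are hypothetical; they are not taken from the test framework.

```cpp
// First piece split out of the original (hypothetical) class template.
template <class T>
class Accumulator {
public:
    Accumulator() : sum_(T()) {}
    void add(T v) { sum_ = sum_ + v; }
    T sum() const { return sum_; }
private:
    T sum_;
};

// Choice 1, inheritance: the second piece is an Accumulator and extends it.
template <class T>
class AveragerByInheritance : public Accumulator<T> {
public:
    AveragerByInheritance() : count_(0) {}
    void add(T v) { Accumulator<T>::add(v); ++count_; }
    // this-> is needed because sum() lives in a dependent base class.
    T average() const { return count_ ? this->sum() / count_ : T(); }
private:
    int count_;
};

// Choice 2, composition: the second piece contains an Accumulator.
template <class T>
class AveragerByComposition {
public:
    AveragerByComposition() : count_(0) {}
    void add(T v) { acc_.add(v); ++count_; }
    T average() const { return count_ ? acc_.sum() / count_ : T(); }
private:
    Accumulator<T> acc_;
    int count_;
};

// Both versions offer the same functionality to their callers.
int averageByInheritance()
{
    AveragerByInheritance<int> a;
    a.add(2);
    a.add(4);
    return a.average();
}

int averageByComposition()
{
    AveragerByComposition<int> a;
    a.add(2);
    a.add(4);
    return a.average();
}
```

Either way, the code for Accumulator<T> and the code for the averaging class are generated from separate templates, which is what gives the refactoring a chance of placing them in separate segments.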

In a real program, refactoring may not be necessary: over-large code segments involving templates are more likely to involve multiple independent templates (class or function) rather than one large template class or one large interrelated set of template and normal classes. It may be sufficient to use LIBF on modules which instantiate separate parts of the existing templates to reduce each segment of each module to less than 64K. This is even more the case in Open Watcom 1.8, where templates are only instantiated when needed.

And the modules used with LIBF may not need to be specifically created for this purpose. Finding them may take some time, but it is quite possible that modules which form a natural part of a real program will, when used with LIBF, solve the over-large code segment problem for that program.

The situation in the test framework, where no two of the functions test12(), test13(), test2() and test3() can end up in the same segment, is not likely to appear in a real program, and having to use LIBF with every single module except one or two is also not likely to be necessary in a real program.

Other Options and Directives

This section discusses other compiler options in general and one linker option in particular, which is sometimes suggested as a solution to the over-large code segment problem.

Compiler Options

The compiler has a great many options; I have focused on only six (-ml, -zm, -d2i, -d3i, -d2s, and -d3s), and only two of those (-ml and -zm) can truly be said to be part of the solution to the problem. The other four (-d2i, -d3i, -d2s, and -d3s) were discussed for two reasons. First, historically, it was the use of -d2i that caused me to discover -zm (and linker directive LIBF). Second, they help in debugging C++ programs and so in solving problems otherwise difficult to explore, which, at least for -d2i and -d3i, makes dealing with any resulting over-large code segment errors relevant to this tutorial.

The reasons I have not mentioned other options are:

  1. Other options should be used or not used depending on whether or not you want or need their effects, not to solve the over-large code segment problem.
  2. Those other options which do affect code size may vary in their effects depending on the nature of the code to which they are applied and so cannot be relied on to have the same effect on every program.

Of course, if you are required to use a small-code model, and so are outside the scope of this tutorial, then playing around with the compiler switches to reduce the code size may make sense.

Op EL

Using the linker option OPTION ELIMINATE (short form OP EL) is, from time to time, suggested as a solution to this problem. And, indeed, the first sentence of its description in the documentation can give the impression that it might be useful here:

The "ELIMINATE" option can be used to enable dead code elimination.

The usage notes also suggest that it requires the use of -zm to be effective, which reinforces this impression.

However, the next sentence of its description shows that it is, in fact, useless for this purpose:

Dead code elimination is a process the linker uses to remove  
unreferenced segments from the application.

Note that it applies to "segments". OP EL will (when used properly) remove entire code segments and so reduce the overall size of the executable. It will not remove unused code from within a segment, and so will not reduce the size of that segment. It should be clear from the tutorial above that this makes OP EL irrelevant to the problem of over-large code segments.
