Windows, even in its latest incarnations, still exhibits quite a bit of quirky behavior
which is due to its DOS roots, or at least due to the attempt to remain compatible
with code which was created for DOS. Most of the time, I am not even surprised anymore
when I come across 16-bit limitations or similar remnants of the past.
But sometimes, I only become aware of them when my code crashes.
This happened some time ago with an application I am working on. When I started
the app in a certain way, it would simply crash very early during startup. It took
a while to reduce this to the following trivial code example, which consists
of a main executable and a DLL which is loaded into the executable dynamically
via LoadLibrary. Here is the code for the main executable, SampleApp.cpp:
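#include <stdio.h>
#include <conio.h>
#include <windows.h>
#include <psapi.h>

// Print the path and load address of every module in the current process.
static void EnumModules(const char *msg)
{
    printf("\n==========================================================\n");
    printf("List of modules in the current process %s:\n", msg);

    HMODULE hMods[1024];
    DWORD cbNeeded;
    // GetCurrentProcess() returns a pseudo handle which does not need to be closed
    HANDLE hProcess = GetCurrentProcess();

    // inquire modules loaded into process
    if (EnumProcessModules(hProcess, hMods, sizeof(hMods), &cbNeeded)) {
        // cbNeeded can exceed the buffer size, so clamp the module count
        DWORD count = cbNeeded / sizeof(HMODULE);
        if (count > sizeof(hMods) / sizeof(hMods[0])) {
            count = sizeof(hMods) / sizeof(hMods[0]);
        }
        // print name and handle for each module
        for (DWORD i = 0; i < count; i++) {
            char szModName[MAX_PATH];
            // ANSI variant, because the buffer is a char array
            if (GetModuleFileNameExA(hProcess, hMods[i], szModName, sizeof(szModName))) {
                // %p instead of %X, so that the handle also prints correctly on 64-bit
                printf(" %s (0x%p)\n", szModName, (void *)hMods[i]);
            }
        }
    }
}

// Exported from the executable so that the DLL can import it.
extern "C" __declspec(dllexport) int functionInExe(void)
{
    printf("Now in functionInExe()\n");
    return 42;
}

int main(void)
{
    EnumModules("before loading DLL");
    HMODULE hmod = LoadLibraryA("SampleDLL.dll");
    if (hmod == NULL) {
        printf("LoadLibrary failed, error %lu\n", GetLastError());
    }
    EnumModules("after loading DLL");
    printf("\nPress key to exit.\n");
    _getch();
    return 0;
}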
This code loads a DLL called SampleDLL.dll. Before and after loading
the DLL, it enumerates the modules which are currently loaded into the process;
this is only to demonstrate the effect which led to the crash in the other app I
was working on. SampleDLL.dll is built from this code (SampleDLL.cpp):
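// functionInExe is exported by the main executable
extern "C" __declspec(dllimport) int functionInExe(void);

extern "C" __declspec(dllexport) void gazonk(void)
{
    // this call makes the DLL import from the main executable
    int i = functionInExe();
    (void)i;  // the result is not needed
}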
The main executable exports a function called functionInExe, and the DLL
calls this function, so the DLL has an explicit reference to the main executable
which the linker needs to resolve. This is an important piece of the puzzle.
And here is a simple makefile which shows how to build the two modules:
all: SampleApplication.exe SampleDLL.dll

clean:
    del *.obj *.exe *.dll

# linking the executable also produces the import library SampleApplication.lib
SampleApplication.exe: SampleApp.obj
    link /debug /out:SampleApplication.exe SampleApp.obj psapi.lib

SampleApp.obj: SampleApp.cpp
    cl /Zi /c SampleApp.cpp

# the DLL links against SampleApplication.lib, so the executable must be built first
SampleDLL.dll: SampleDLL.obj SampleApplication.exe
    link /debug /dll /out:SampleDLL.dll SampleDLL.obj SampleApplication.lib

SampleDLL.obj: SampleDLL.cpp
    cl /Zi /c SampleDLL.cpp
Let's assume that the above files (SampleApp.cpp, SampleDLL.cpp and makefile)
are all in a directory c:\temp\dupemod, and that we built the code by
running nmake in that directory. Now let's run the code as shown in
the screenshots below.
Take a close look at the command shell window on the right: After loading the DLL,
the process maps both c:\temp\dupemod\Sample~1.exe and
c:\temp\dupemod\SampleApplication.exe into its address space.
Both refer, of course, to the same file, which means that we have loaded
the executable twice!
This happens only if we run the executable using its 8+3 DOS name, i.e. Sample~1.exe.
When run with a long filename, everything works as expected. So when we load
SampleDLL.dll, the OS loader tries to resolve the references which this
DLL makes to other modules. One of those modules is SampleApplication.exe,
which the DLL's import table references by its long name. The OS loader
should be able to map this reference to the instance of the
executable which is already mapped into the address space. However, it seems
that the OS loader cannot figure out that Sample~1.exe and SampleApplication.exe
are actually the same file, and therefore loads another instance of the executable!
BTW, this happens both on Windows 2000 and Windows XP systems.
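To make the suspected mechanism more tangible, here is a toy model in portable C++.
It is only a sketch of the behavior I observed, not the actual loader algorithm: it
assumes that an import is resolved purely by comparing file names, and the names
baseName, equalsNoCase and resolveImport are made up for this illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Return the file name part of a Windows-style path.
static std::string baseName(const std::string &path)
{
    std::string::size_type pos = path.find_last_of("\\/");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Case-insensitive comparison, since file names on Windows ignore case.
static bool equalsNoCase(const std::string &a, const std::string &b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Toy model (assumption): resolve an import by name comparison only.
// If no loaded module carries that name, the file is "mapped" again.
static void resolveImport(std::vector<std::string> &loaded,
                          const std::string &dir,
                          const std::string &importName)
{
    for (const std::string &path : loaded) {
        if (equalsNoCase(baseName(path), importName)) {
            return;  // already mapped, reuse the existing module
        }
    }
    loaded.push_back(dir + "\\" + importName);  // mapped a second time
}
```

Starting from a module table which contains only c:\temp\dupemod\Sample~1.exe,
resolving the import SampleApplication.exe appends a second entry, which mirrors
the module list shown above. Starting from the long name leaves the table unchanged.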
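In this trivial example, the only damage done is probably just that the main
executable consumes twice the virtual address space. In a large application,
the consequences can be more severe, for example because the second copy of the
executable comes with its own, separate set of global variables. In our case,
the consequences were indeed severe.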
Microsoft also documents some effects of this issue in Knowledge Base articles,
for example KB218475 and KB193513.
The only workarounds I see are:
- Rename the executable so that it uses a name which fits into the 8+3 format.
- Make sure that nobody ever runs the executable using its short name.
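For the first workaround, a build script or unit test could verify that the chosen
executable name really fits. The following is a minimal sketch with a hypothetical
helper called fitsEightDotThree; it only checks lengths, dots and spaces, whereas
the real 8+3 rules also restrict the set of allowed characters.

```cpp
#include <string>

// Simplified check (assumption): at most one dot, 1 to 8 characters before it,
// at most 3 characters after it, and no spaces anywhere in the name.
static bool fitsEightDotThree(const std::string &name)
{
    if (name.find(' ') != std::string::npos) {
        return false;
    }
    std::string::size_type dot = name.find('.');
    if (dot == std::string::npos) {
        // no extension at all
        return !name.empty() && name.size() <= 8;
    }
    if (name.find('.', dot + 1) != std::string::npos) {
        return false;  // more than one dot
    }
    std::string::size_type baseLen = dot;
    std::string::size_type extLen = name.size() - dot - 1;
    return baseLen >= 1 && baseLen <= 8 && extLen <= 3;
}
```

With this check, SampleApplication.exe is rejected because its base name is too
long, while a name such as Sample.exe passes.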
We basically went with the latter approach - by making sure that end users
always run the application by double-clicking shortcuts which contain
the full executable name, and by fixing an interesting related bug in ATL
which, hopefully, I may have the time and the nerves to describe in
more detail one fine day...
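As an additional safety net, an application could check at startup whether it was
launched via a short name. This is not what we shipped, just a sketch of the idea.
It assumes that GetModuleFileName reports the path in the form which was used to
start the process, as the module list above suggests, and it compares that path
with the result of GetLongPathName. The helper names differsFromLongPath and
startedWithShortName are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// True if the two paths differ in more than letter case.
static bool differsFromLongPath(const std::string &startedAs,
                                const std::string &longPath)
{
    bool same = startedAs.size() == longPath.size() &&
                std::equal(startedAs.begin(), startedAs.end(), longPath.begin(),
                           [](unsigned char x, unsigned char y) {
                               return std::tolower(x) == std::tolower(y);
                           });
    return !same;
}

#ifdef _WIN32
// True if the path of the running executable changes when it is expanded
// to its long form, i.e. the process was apparently started via a short name.
static bool startedWithShortName()
{
    char actual[MAX_PATH];
    char expanded[MAX_PATH];
    DWORD n = GetModuleFileNameA(NULL, actual, MAX_PATH);
    if (n == 0 || n >= MAX_PATH) {
        return false;  // failure or truncated path: cannot tell
    }
    DWORD m = GetLongPathNameA(actual, expanded, MAX_PATH);
    if (m == 0 || m >= MAX_PATH) {
        return false;  // failure or buffer too small: cannot tell
    }
    return differsFromLongPath(actual, expanded);
}
#endif
```

An application could call startedWithShortName() early in main and then warn the
user or restart itself using the long path; which reaction is appropriate depends
on the application.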