These days, I spend quite some time in Microsoft's Windows Error Reporting
forum, which is where David Ching, who is a Microsoft MVP, posed
an interesting problem this week.
On Vista, Windows Error Reporting will create and transmit minidump files only if the WER
servers request them. At least this seems to be the default behavior which both David and I have
observed on Vista systems. David, however, wanted to make sure that whenever an application
crashes, a minidump file is generated which the user or tester can then send directly
to the developers of the application for analysis - even if Microsoft's WER servers
never actually request the minidumps, which, as far as I can tell, is the default
for applications which have not been explicitly registered with and mapped
at Winqual.
My first idea was to force the system into queuing mode. When crash reports are queued,
minidumps are always generated and stored locally, so that they can be transmitted to
the error reporting server later on. Queuing is enabled by setting
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\ForceQueue
(DWORD) to 1. (See WER Settings for documentation on this and other
WER-related registry keys.)
Crash report data will be stored in directories such as
c:\Users\someusername\AppData\Local\temp and
C:\ProgramData\Microsoft\Windows\WER\ReportQueue.
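For repeatable test setups, the ForceQueue value can also be imported from a .reg file instead of being edited by hand. The following is a minimal sketch and not part of the original demo code: the helper names make_reg_dword and force_queue_reg are made up for illustration, and the generated text assumes the usual "Windows Registry Editor Version 5.00" file format. Importing into HKEY_LOCAL_MACHINE requires administrator rights.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Builds the text of a .reg file that sets a single DWORD value.
// Key paths in the [section] header use single backslashes.
// (Hypothetical helper, for illustration only.)
std::string make_reg_dword(const std::string &key, const std::string &name,
                           std::uint32_t value)
{
    assert(!key.empty() && !name.empty());
    char hex[16];
    std::snprintf(hex, sizeof(hex), "%08x", static_cast<unsigned>(value));
    return "Windows Registry Editor Version 5.00\r\n\r\n[" + key + "]\r\n\"" +
           name + "\"=dword:" + hex + "\r\n";
}

// Convenience wrapper for the ForceQueue setting discussed in the text.
std::string force_queue_reg(bool enable)
{
    return make_reg_dword(
        "HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\Windows Error Reporting",
        "ForceQueue", enable ? 1u : 0u);
}
```

Writing the result of force_queue_reg(true) to a file such as forcequeue.reg (the file name is arbitrary) and importing it turns queuing on; force_queue_reg(false) produces the counterpart that turns it off again.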
That works, but it also suppresses the WER UI, which isn't ideal either. Isn't
there some way to have the cake and eat it, too?
Let's see: A variation of the above approach is to disable the Internet connection before
the crash occurs. You'll get the dialogs, but WER won't be able to connect to the Microsoft
servers, and so it should then also queue the crash information. Alternatively,
and this is something that I have tried myself a few times, you could set
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\CorporateWERServer
(string) to the name of some non-existing system. When a crash occurs, WER will try
to contact that server, find that it's not responding, and then store all crash data
locally so that it can be re-sent when the connection is later established.
Or you could go all the way and actually install such a Corporate Error Reporting server
on one of your systems. Probably one of the best solutions, since this gives you
direct access to minidump files within your organization.
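But this blog isn't about IT, it's about hacking and coding. Here's an idea
how David's goals could be accomplished without implementing a full-blown
crash handler:
- Install a top-level exception filter using SetUnhandledExceptionFilter().
- In the filter, write a minidump file using MiniDumpWriteDump().
- Return EXCEPTION_CONTINUE_SEARCH from the filter, so that the system's
default crash handling, including the WER dialogs, still takes place.
And here's the demo code which demonstrates this technique: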
// Demo program using SetUnhandledExceptionFilter() and
// MiniDumpWriteDump().
//
// Claus Brod, http://www.clausbrod.de/Blog
#include <windows.h>
#include <DbgHelp.h>
#pragma comment(lib, "DbgHelp.lib")
#include <stdio.h>

// Top-level exception filter: writes a minidump file, then hands the
// exception back to the system for its default crash handling.
static LONG WINAPI myfilter(_EXCEPTION_POINTERS *exc_ptr)
{
    static const char *minidumpFilename = "myminidump.mdmp";
    // CreateFileA matches the narrow file name in ANSI and Unicode builds.
    HANDLE hDumpFile = CreateFileA(minidumpFilename, GENERIC_WRITE, 0, NULL,
        CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hDumpFile != INVALID_HANDLE_VALUE) {
        __try {
            MINIDUMP_EXCEPTION_INFORMATION exceptionInfo;
            exceptionInfo.ThreadId = GetCurrentThreadId();
            exceptionInfo.ExceptionPointers = exc_ptr;
            exceptionInfo.ClientPointers = FALSE;
            BOOL ret = MiniDumpWriteDump(GetCurrentProcess(),
                GetCurrentProcessId(), hDumpFile, MiniDumpNormal,
                &exceptionInfo, NULL, NULL);
            if (ret) {
                printf("Minidump information has been written to %s.\n",
                    minidumpFilename);
            }
        } __except(EXCEPTION_EXECUTE_HANDLER) {
            // Ignore failures while writing the dump; we are crashing anyway.
        }
        CloseHandle(hDumpFile);
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

// Provokes an access violation when called with a null pointer.
static int wedding_crasher(int *pp)
{
    *pp = 42;
    return 42;
}

int main(void)
{
    SetUnhandledExceptionFilter(myfilter);
    wedding_crasher(0);
    return 0;
}
And finally, here's a really weird idea from Dmitry Vostokov:
Resurrecting Dr. Watson on Vista. If you're into exception handling and crash
analysis, Dmitry's http://www.dumpanalysis.org/ web site is a fantastic resource.
This guy lives in an exception filter.
In the first part of this mini-series, I demonstrated the ReportFault API
and explained why it didn't fit my needs on Vista. Last time around, I discussed
my first attempt to use the new Windows Error Reporting (WER) APIs instead,
which failed to produce any crash reports on Microsoft's Winqual site.
When the curtain fell last time, I had a WER test application which,
on the surface, appeared to work, but didn't manage to get any crash
reports through to Winqual. Also, entries for crash reports produced
by this application looked a little funny in Vista's Problem History window.
In particular, the Bucket ID value stands out. What are bucket IDs? Essentially,
the Winqual site combines various attributes of the crash report (application,
signatures, crash address etc.) and creates a unique integer value from them,
which then becomes an identifier for this particular type of crash.
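To make the idea of a bucket ID more tangible, here is a toy sketch of how several crash attributes could be folded into one integer. This is purely hypothetical: the function name toy_bucket_id is made up, and the choice of 64-bit FNV-1a hashing is an assumption for illustration. How Winqual really assigns bucket IDs is not described anywhere in this post.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration only: folds a list of crash signature
// attributes into one integer using 64-bit FNV-1a. The real Winqual
// bucketing scheme is not public and very likely works differently.
std::uint64_t toy_bucket_id(const std::vector<std::string> &attributes)
{
    assert(!attributes.empty());
    std::uint64_t hash = 14695981039346656037ULL;   // FNV-1a offset basis
    for (const std::string &attr : attributes) {
        for (unsigned char c : attr) {
            hash ^= c;
            hash *= 1099511628211ULL;               // FNV-1a prime
        }
        // Mix in a separator so that {"ab", "c"} and {"a", "bc"}
        // do not end up with the same value.
        hash ^= 0x1f;
        hash *= 1099511628211ULL;
    }
    return hash;
}
```

Two reports with identical attributes map to the same value, while a change in any attribute (for example, the crash offset) yields a different one. That is the property which makes such a number usable as an identifier for "this particular type of crash".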
All my WER-induced crash reports submitted from Vista clients had a
bucket ID of 8, regardless of which test application I used and how exactly I provoked the
crash. Also, I knew from earlier, successful attempts to talk to the Winqual
servers what real bucket IDs usually look like (much larger integers).
Something fishy was going on here.
The application I tested was properly registered, signed and mapped
at the Winqual site, and crash reports submitted from XP systems made it to
the Winqual servers just fine. Hence, registration issues could be ruled
out. I posted to the Windows Error Reporting forum
and asked for help and clarification. Saar Picker responded: "We filter out
unknown event types. Since your report is not of a recognized event type, it
is being rejected. The Bucket ID 8 event is reporting the rejection to us."
So my crash reports were not of a recognized event type. What's a poor
crash report supposed to do to be recognized?
The first parameter for WerReportCreate is an event type. The documentation
says: "wzEventType - A pointer to a Unicode string that specifies the name of
the event." Hmmm, so maybe this is
the event type that Saar mentioned. If so, what kind of event are we talking
about? Win32 events? Events like the ones captured in the Windows event log?
None of those, as it turns out. Instead, error reporting servers can define
types of error events that they want to capture.
Microsoft's Winqual servers, for example, are configured to accept event types
which represent application or operating system crashes.
So what is the magic event type which represents an application crash?
Hint 1: The werapi.h header file defines an undocumented macro constant called
APPCRASH_EVENT:
#define APPCRASH_EVENT L"APPCRASH"
Hint 2: When a crash report is submitted using WerReportSubmit, this API tries to
contact the error reporting server. In Vista, the protocol is based
on XML snippets which the client sends to the server via HTTP. One of
the attributes in the initial XML that is transmitted is called eventtype,
and for applications which do not try to handle fatal crashes themselves,
the value of that attribute is indeed "APPCRASH".
So I modified my WER code to use "APPCRASH" instead of some arbitrary
string. And indeed, this made a difference, although not the one I had hoped for:
With the new event type, WerReportSubmit() now returned an error
(E_FAIL), where it previously succeeded...
To debug the problem, I intercepted the XML exchange between the client
and the server, and looked at the differences between a non-WER client
and my own test code. (If you're interested in the interception details,
drop me a line.) The non-WER client transmitted
additional data (so-called "signature parameters"), and it also
specified a "report type" of 2 instead of 1. So my strategy was
to eliminate the differences one by one by working the WER APIs.
The extra parameters sent by the non-WER client were things like the
application's name, version and timestamp; the faulting module's name,
version and timestamp; and the exception code and address offset.
And now, finally, I understood the purpose of the underdocumented
WerReportSetParameter API - depending on the server's setup, it expects certain extra
parameters to safely identify an event, and those can be set using
WerReportSetParameter:
static void wer_report_set_parameters(HREPORT hReportHandle,
    EXCEPTION_POINTERS *exc_ptr)
{
    // Parameters 0-2: name, version and timestamp of the application
    TCHAR moduleName[1024];
    get_module_name(NULL, moduleName, _countof(moduleName));
    pWerReportSetParameter(hReportHandle, 0, L"Application Name", moduleName);

    TCHAR buffer[1024];
    get_module_file_version(moduleName, buffer, _countof(buffer));
    pWerReportSetParameter(hReportHandle, 1, L"Application Version", buffer);

    HMODULE hModule = GetModuleHandle(0);
    DWORD timeStamp = GetTimestampForLoadedLibrary(hModule);
    _sntprintf_s(buffer, _countof(buffer), _TRUNCATE,
        __T("%x"), timeStamp);
    pWerReportSetParameter(hReportHandle, 2, L"Application Timestamp", buffer);

    // Parameters 3-5: name, version and timestamp of the faulting module.
    // determine module name from crash address
    moduleName[0] = 0;
    void *exceptionAddress = exc_ptr->ExceptionRecord->ExceptionAddress;
    if (GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
            GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
            (LPCTSTR)exceptionAddress, &hModule)) {
        get_module_name(hModule, moduleName, _countof(moduleName));
    }
    pWerReportSetParameter(hReportHandle, 3, L"Fault Module Name", moduleName);

    get_module_file_version(moduleName, buffer, _countof(buffer));
    pWerReportSetParameter(hReportHandle, 4, L"Fault Module Version", buffer);

    timeStamp = GetTimestampForLoadedLibrary(hModule);
    _sntprintf_s(buffer, _countof(buffer), _TRUNCATE, __T("%x"), timeStamp);
    pWerReportSetParameter(hReportHandle, 5, L"Fault Module Timestamp", buffer);

    // Parameters 6-7: exception code and offset of the crash address
    // relative to the base address of the faulting module
    _sntprintf_s(buffer, _countof(buffer), _TRUNCATE,
        __T("%08x"), exc_ptr->ExceptionRecord->ExceptionCode);
    pWerReportSetParameter(hReportHandle, 6, L"Exception Code", buffer);

    INT_PTR offset = (char *)exceptionAddress - (char *)hModule;
    // %p expects a pointer argument, so cast the offset accordingly.
    _sntprintf_s(buffer, _countof(buffer), _TRUNCATE, __T("%p"), (void *)offset);
    pWerReportSetParameter(hReportHandle, 7, L"Exception Offset", buffer);
}
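The last two parameters boil down to a little address arithmetic and hex formatting. The following stand-alone sketch isolates that step so it can be tried without windows.h. The helper names fault_offset and to_hex8 are hypothetical, addresses are passed as plain integers, and the zero-padded format simply mirrors the "%08x" used above; whether the server insists on exactly this format is not something the code above proves.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Offset of the faulting address relative to the base address of the
// module which contains it. (Hypothetical helper, for illustration.)
std::uint64_t fault_offset(std::uint64_t moduleBase, std::uint64_t faultAddress)
{
    assert(faultAddress >= moduleBase);
    return faultAddress - moduleBase;
}

// Formats a value as zero-padded lowercase hex, at least 8 digits wide.
std::string to_hex8(std::uint64_t value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%08llx",
                  static_cast<unsigned long long>(value));
    return buffer;
}
```

For example, a crash at address 0x00401234 in a module loaded at 0x00400000 gives to_hex8(fault_offset(0x00400000, 0x00401234)), which is "00001234", and an access violation code formats as to_hex8(0xc0000005), which is "c0000005".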
The other significant change was to use the undocumented
WerReportApplicationCrash constant as the "report type" parameter for
WerReportCreate. After these changes,
the Winqual servers finally started talking to me: I received bucket IDs, sometimes also
requests to transmit minidump data - and after a few days, the crash reports appeared
on the Winqual site! Whoopee!
The full demo code is attached. To build, open
a Visual Studio command prompt and run the compiler:
cl werapitest.cpp
My special thanks to Saar Picker and Jason Hardester at Microsoft for their help!
Now that I've achieved my original goal (reporting crashes using the WER APIs
under Vista), let me spoil the fun by warning you not to use this approach. Why?
Because this is clearly not the way Microsoft recommends handling application
crashes. Now, while I'm not sure whether Microsoft as a whole has an official
recommendation, the documentation and the postings in newsgroups and blogs clearly
suggest that an application shouldn't even try to handle a crash
explicitly - instead, it should just crash and let the OS do the reporting.
The basic rationale behind this is that an application is probably already
deeply confused when a crash occurs, and some of its data may already have been
damaged. This makes crash recovery a difficult and unreliable endeavor.
There are circumstances where an application needs to keep control of the reporting
process, but Microsoft expects such cases to be very rare. Which explains
a lot of the initial communication disconnects that I experienced while discussing
my case with Saar and Jason.
There's a reason why it's called "WER" (Windows Error Reporting)
and not "WCR" (Windows Crash Reporting). Apparently, Microsoft doesn't
expect us to use those APIs for crash reporting, but rather for more generic
"error" or "event" reporting. For example, this U.S. patent claim
discusses how the WER APIs can be used to report failures in handwriting recognition.
(By the way, there's also a patent for WER itself, see
http://www.freepatentsonline.com/20060271591.html.)
A few days ago, I wrote about the peculiarities of the ReportFault API,
particularly on Windows Vista, and how those peculiarities drove me to
give in to Microsoft's sound advice and use the new and shiny
Windows Error Reporting (WER) APIs on Vista.
ReportFault() is a great one-stop shopping API: A one-liner will display
all required dialogs, ask the user if he wants to contact Microsoft,
create report data (including minidumps) if required, and send the whole
report off to Microsoft.
The new WER APIs in Vista are slightly more complex, but also provide more
control over the details of error reporting. Well, if you know how to handle
the APIs, that is. Apparently, I do not know how to handle them since I still
haven't solved all the problems around them.
More on this in a moment. Let's first take a look at the core of a test application I wrote:
static bool report_crash(_EXCEPTION_POINTERS *inExceptionPointer)
{
    // Set up parameters for WerReportCreate()
    WER_REPORT_INFORMATION werReportInfo;
    memset(&werReportInfo, 0, sizeof(werReportInfo));
    werReportInfo.dwSize = sizeof(werReportInfo);
    wcscpy_s(werReportInfo.wzFriendlyEventName,
        _countof(werReportInfo.wzFriendlyEventName),
        L"werapitest (friendly event name)");
    wcscpy_s(werReportInfo.wzApplicationName,
        _countof(werReportInfo.wzApplicationName), L"");
    wcscpy_s(werReportInfo.wzDescription,
        _countof(werReportInfo.wzDescription), L"Critical runtime problem");

    PCWSTR eventType = L"werapitest (eventType)"; // APPCRASH
    HREPORT hReportHandle;
    if (FAILED(pWerReportCreate(eventType, WerReportCritical,
            &werReportInfo, &hReportHandle)) || !hReportHandle) {
        return false;
    }

    bool ret = false;

    // Ask WER to add a minidump for the current process to the report
    WER_EXCEPTION_INFORMATION werExceptionInformation;
    werExceptionInformation.bClientPointers = FALSE;
    werExceptionInformation.pExceptionPointers = inExceptionPointer;
    bool dumpAdded = SUCCEEDED(pWerReportAddDump(hReportHandle, ::GetCurrentProcess(),
        ::GetCurrentThread(), WerDumpTypeMiniDump, &werExceptionInformation, NULL, 0));
    if (!dumpAdded) {
        FATAL_ERROR("Minidump generation failed.\n");
    }

    // Submit the report; this also displays the WER dialogs
    DWORD submitOptions = WER_SUBMIT_OUTOFPROCESS | WER_SUBMIT_NO_CLOSE_UI;
    WER_SUBMIT_RESULT submitResult;
    if (SUCCEEDED(pWerReportSubmit(hReportHandle, WerConsentNotAsked,
            submitOptions, &submitResult))) {
        switch(submitResult)
        {
        // ... decode result ...
        }
    }
    pWerReportCloseHandle(hReportHandle);
    return ret;
}

static int filter_exception(EXCEPTION_POINTERS *exc_ptr)
{
    report_crash(exc_ptr);
    return EXCEPTION_EXECUTE_HANDLER;
}

static void wedding_crasher(void)
{
    __try {
        int *foo = (int *)0;
        *foo = 42;
    } __except(filter_exception(GetExceptionInformation())) {
        printf("Now in exception handler, process is still alive!\n");
    }
    Sleep(5000);
}

int main()
{
    // Load the WER entry points dynamically, so that the program
    // still starts on systems without wer.dll.
    HMODULE hWer = LoadLibraryA("Wer.dll");
    if (hWer) {
        pWerReportCreate =
            (pfn_WERREPORTCREATE)GetProcAddress(hWer, "WerReportCreate");
        pWerReportSubmit =
            (pfn_WERREPORTSUBMIT)GetProcAddress(hWer, "WerReportSubmit");
        pWerReportCloseHandle =
            (pfn_WERREPORTCLOSEHANDLE)GetProcAddress(hWer, "WerReportCloseHandle");
        pWerReportAddDump =
            (pfn_WERREPORTADDDUMP)GetProcAddress(hWer, "WerReportAddDump");
    }
    if (!pWerReportCreate || !pWerReportSubmit ||
        !pWerReportCloseHandle || !pWerReportAddDump) {
        printf("Cannot initialize WER API.\n");
        return 1;
    }
    wedding_crasher();
    return 0;
}
The fundamental approach is still the same as for the ReportFault test program
presented recently:
- A structured exception block is established using __try and __except.
- Code provokes an access violation.
- The exception filter filter_exception is consulted by the
exception handling infrastructure to find out how to proceed
with the exception.
- The filter calls the WER APIs to display the crash dialog(s),
and to give the user options to debug the problem, ignore it,
or report it to Microsoft.
- The exception filter returns EXCEPTION_EXECUTE_HANDLER
to indicate that its associated exception handler should
be called.
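The following WER APIs are used to create and send a crash report:
- WerReportCreate creates the report and sets its event type and report type.
- WerReportAddDump adds minidump data to the report.
- WerReportSubmit submits the report.
- WerReportCloseHandle closes the report handle.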
The WER APIs do indeed solve a problem that I found with ReportFault
on Vista: They don't force the calling process to be terminated, and
allow me to proceed as I see fit. That's really good news.
The problem I haven't resolved yet is this: Even though I call
WerReportAddDump, I have no idea whether minidump data are
actually generated and sent. In fact, from the feedback
provided by the system, it seems likely that those data are not
generated.
To illustrate my uncertainties, I wrote a test program called werapitest.
The code is attached as a ZIP file; unpack it into a directory, open
a Visual Studio command prompt window, and build the code as follows:
cl werapitest.cpp
Run the resulting executable, then open up the "Problem Reports and Solutions"
control panel and click on "View problem history". On my system, I get
something like this:
Double-clicking on the report entry leads to this:
The problem history entry does not mention any attached files, such as
minidump data!
When a crash occurs, the system also writes entries into the event log;
those log entries claim there are additional data in paths such as
C:\Users\clausb\AppData\Local\Microsoft\Windows\WER\ReportArchive\Report0f8918ad,
and indeed, such directories exist and each contain a file called Report.wer,
which holds data such as:
Version=1
EventType=werapitest (eventType)
EventTime=128266502225896608
ReportType=1
Consent=1
UploadTime=128266502257542112
Response.BucketId=8
Response.BucketTable=5
Response.type=4
DynamicSig[1].Name=OS Version
DynamicSig[1].Value=6.0.6000.2.0.0.256.16
DynamicSig[2].Name=Locale ID
DynamicSig[2].Value=1033
UI[3]=werapitest.exe has stopped working
UI[4]=Windows can check online for a solution to the problem.
UI[5]=Check online for a solution and close the program
UI[6]=Check online for a solution later and close the program
UI[7]=Close the program
State[0].Key=Transport.DoneStage1
State[0].Value=1
State[1].Key=DataRequest
State[1].Value=Bucket=8/nBucketTable=5/nResponse=1/n
FriendlyEventName=werapitest (friendly event name)
ConsentKey=werapitest (eventType)
AppName=werapitest.exe
AppPath=C:\tmp\werapitest.exe
ReportDescription=Critical runtime problem
So again, the minidump is not mentioned anywhere.
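Checking many such reports by eye gets tedious, so here is a small sketch which parses the Key=Value lines shown above and looks for anything resembling a dump file. It is not part of werapitest: the function names are made up, the text is assumed to have been read and converted to a narrow string already, and the ".mdmp"/".dmp" substring test is only a heuristic, not a documented property of Report.wer files.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parses "Key=Value" lines into a map. Only the first '=' separates key
// and value, so values such as "Bucket=8/nBucketTable=5" stay intact.
// Lines without a key or without '=' are skipped.
std::map<std::string, std::string> parse_wer_report(const std::string &text)
{
    std::map<std::string, std::string> fields;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();              // tolerate CRLF line endings
        }
        std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) {
            continue;
        }
        fields[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return fields;
}

// Heuristic (an assumption, not a documented rule): treat any value
// which contains ".mdmp" or ".dmp" as a reference to a dump file.
bool mentions_dump(const std::map<std::string, std::string> &fields)
{
    for (const auto &entry : fields) {
        const std::string &value = entry.second;
        if (value.find(".mdmp") != std::string::npos ||
            value.find(".dmp") != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

Fed with the report contents listed above, parse_wer_report yields "8" for the key Response.BucketId, and mentions_dump returns false, which matches the observation that no minidump is referenced.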
Now let's try some minimal code which uses neither ReportFault nor
the new WER API:
int main(void)
{
    int *p = (int *)0;
    *p = 42;
    return 0;
}
After running this code and letting it crash and report to Microsoft, I get the following
problem history entry:
This problem report contains a lot more data than the one for werapitest,
and it even refers to a minidump file which was apparently generated by the system
and probably also sent to Microsoft.
So the lazy code which doesn't do anything about crashes gets full and proper
service from the OS, while the application which tries to deal with a crash in
an orderly manner and elaborately goes through all the trouble of using the
proper APIs doesn't get its message across to Microsoft. I call this unfair.
Oh, and in case you're wondering: Yes, we've registered with Microsoft's Winqual site
where the crash reports are supposed to be sent to, and we established
"product mappings" there, and the whole process seems to work for XP clients
just fine.
I'm pretty sure that I'm just missing a couple of details with the new
APIs, or maybe I'm misinterpreting the feedback from the system.
I ran numerous experiments and umpteen variations, I've searched the web
high and low, read the docs, consulted newsgroups here and there -
and now I'm running out of ideas. Any hints most welcome...
PS: I did indeed receive some hints. For updated WER code, along with
an explanation on why the above failed, see
Crashing with style on Vista, part II.
How can you tell that you're the control freak type of Windows programmer?
Easy: You feel that irresistible urge to install top-level exception handlers which
report application crashes to the end user and provide useful options on how to
proceed, such as to report the issue to the software vendor, save the currently
loaded data, inspect the issue in more detail, or call the police.
In fact, this is pretty much what Windows Error Reporting
is all about, only that the crash reports are sent to Microsoft
first (to their Winqual site, that is), from where
ISVs can then download them for further analysis. Oh, and the other difference is that
Microsoft dropped the "call the police" feature in order to get Vista done in time.
One of the applications that I'm working on already had its own top-level crash handler
which performed some of the services also provided by Windows Error Reporting. It
was about time to investigate Microsoft's offerings in this area and see how they
can replace or augment the existing crash handler code.
The first option I looked at was the ReportFault API. Microsoft's documentation
says that the function is obsolete, and we should rather use a different
set of APIs collectively called the "WER functions". However, understanding them requires a lot more
brain calories than the trivial ReportFault call which you can simply drop into an
exception filter, and you're done.
The required code is pretty trivial and looks roughly like this:
int filter_exception(EXCEPTION_POINTERS *exc_ptr)
{
    EFaultRepRetVal repret = ReportFault(exc_ptr, 0);
    switch (repret)
    {
    // decode return value...
    //
    }
    return EXCEPTION_EXECUTE_HANDLER;
}

int main(void)
{
    __try {
        int *foo = (int *)0;
        *foo = 42;
    } __except(filter_exception(GetExceptionInformation())) {
        _tprintf(__T("Nothing to see here, move on, process is still alive!\n"));
    }
    Sleep(5000);
    return 0;
}
Sequence of events:
- A structured exception block is established using __try and __except.
- Code provokes an access violation.
- The exception filter filter_exception is consulted by the exception
handling infrastructure to find out how to proceed with the exception.
- The filter calls ReportFault to display the crash dialog as shown above,
and to give the user options to debug the problem, ignore it, or report
it to Microsoft.
- After performing its menial reporting duties, the exception filter
returns EXCEPTION_EXECUTE_HANDLER to indicate that its associated
exception handler should be called.
That exception handler is, in fact, essentially the _tprintf statement
which spreads the good news about the process still being alive.
On XP, that is. On Vista, the _tprintf statement may actually never execute.
You'll still get a nice reporting dialog, such as the one in the screenshot
to the right, but when you click the "Close program" button, the calling process
will be terminated immediately, i.e. ReportFault never really
returns to the caller!
I debugged into ReportFault on my Vista machine and found that
ReportFault spawns off a process called wermgr.exe
which performs the actual work.
My current hypothesis is that it is wermgr.exe which terminates
the calling process if the user chooses "Close program".
If you want to try it yourself, click here
to download the demo code. To compile, simply run it through cl.exe:
cl.exe reportfault.cpp
Now, can we complain about this, really? After all, you can't call it surprising
if a program closes after hitting the "Close program" button.
Still, the behavior differs from the old XP dialog - and it is inconsistent
even on Vista. What I just described is the behavior that I found with the default
error reporting settings in Vista. By default, Vista "checks for solutions
automatically" and doesn't ask the user what to do when a crash occurs.
This can be configured in the "Problem Reports and Solutions" control panel:
After changing the report settings as shown above ("Ask me") and then running the test
application again, the error reporting dialog looks like this:
When I click on "Close program" now, guess what happens - the process does
not terminate, and the _tprintf statement in my exception handler is executed,
just like on XP!
So that "Close program" button can mean two different things on Vista...
It's not just this inconsistency which bugged me. I also don't like the idea of
letting the error reporting dialog pull the rug from under my feet. Sure, I'd like to
use the dialog's services, but when it returns, I want to make my own decisions
about how to proceed. For example, I could try and save the currently loaded data
in my application, or I could add my own special reporting. Or call the cops.
ReportFault won't let me do that on Vista. And so I set out to
burn those extra brain calories anyway and learn about the new WER APIs
which were introduced with Windows Vista.
And burn calories I did, oh yes. More on this hopefully soon.