In an ATL COM client which uses #import to generate wrapper code for objects,
I recently traced a subtle reference-counting issue to this single line:
This code calls a method ILoadComponents on an application object which returns
an array of components. Innocent-looking as it is, this one-liner caused me
quite a bit of grief. If you can already explain what the reference counting
issue is, you shouldn't be wasting your time reading this blog. For the rest
of us, I'll try to dissect the problem.
(And for those who don't want to rely on my explanation: After I had learnt
enough about the problem so that I could intelligently feed Google with
search terms, I discovered a Microsoft Knowledge Base article on this very
topic. However, even after reading the article, some details were still unclear
to me, especially since I don't live and breathe ATL all day.)
The #import statement automatically generates COM wrapper functions. For
ILoadComponents, the wrapper looks like this:
IComponentArrayPtr is a typedef-ed template instance of _com_ptr_t.
The constructor used in the code snippet above will only call AddRef
on the interface pointer if its second argument is true. In our case, however,
the second arg is false, so AddRef will not be called. The IComponentArrayPtr
destructor, however, always calls Release().
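To make that asymmetry concrete, here is a minimal sketch which imitates the behaviour just described without needing any COM headers. MockUnknown, MockComPtr and refsAfterScope are made-up names for illustration only; the real _com_ptr_t does considerably more than this.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object; it only tracks its reference count.
struct MockUnknown {
    long refs = 1;                 // as if the server had already called AddRef once
    void AddRef()  { ++refs; }
    void Release() { --refs; }     // a real object would delete itself at 0
};

// Hypothetical, stripped-down imitation of the smart pointer described above.
class MockComPtr {
public:
    MockComPtr(MockUnknown* p, bool fAddRef) : p_(p) {
        if (p_ && fAddRef) p_->AddRef();       // AddRef only when asked to
    }
    ~MockComPtr() { if (p_) p_->Release(); }   // Release unconditionally
    MockComPtr(const MockComPtr&) = delete;
    MockComPtr& operator=(const MockComPtr&) = delete;
private:
    MockUnknown* p_;
};

// Returns the count left over after a smart pointer built with the given
// flag has gone out of scope again.
long refsAfterScope(bool fAddRef) {
    MockUnknown obj;
    assert(obj.refs == 1);
    {
        MockComPtr sp(&obj, fAddRef);
    }                                          // destructor calls Release here
    return obj.refs;
}
```

With the flag set to true, the count is back where it started once the smart pointer is gone. With false, the smart pointer takes over the reference that already existed, so the count ends up one lower than before.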
Feeling uneasy already? Yeah, me too. But let's follow the course of action a little
bit longer. When returning from the wrapper function, the copy constructor of the
class will be called, and intermediate IComponentArrayPtr objects will be
created. As those intermediate objects are destroyed, Release() is called.
Now let us assume that the caller looks like above, i.e. we assign the return value
of the wrapper function to a CComPtr<IComponentArray> type. The sequence
of events is as follows:
1. The wrapper function for ILoadComponents is called.
2. The wrapper function calls into the COM server. The server returns
   an interface pointer for which AddRef() was called (at least)
   once inside the server. The reference count is 1.
3. The wrapper function constructs an IComponentArrayPtr smart pointer
   object which simply copies the interface pointer value, but
   does not call AddRef(). The refcount is still 1.
Now we return from the wrapper function. In C++, temporary objects
are destroyed at the end of the "full expression" which creates them. See also
section 6.3.2 in Stroustrup's "Design and Evolution of C++". This means that
the following assignment is safe:
ILoadComponents returns an object of type IComponentArrayPtr. At this
point, the reference count for the interface is 1 (see above).
The compiler implicitly converts IComponentArrayPtr to IComponentArray*, then
calls the CComPtr assignment operator which copies the pointer and calls AddRef
on it. The refcount is now 2. At the completion of the statement, the temporary
IComponentArrayPtr is destroyed and calls Release on the interface. The
refcount is 1. Just perfect.
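The bookkeeping in this safe case can be replayed with a small mock, again without real COM. Everything here is a sketch under assumptions: MockComPtr plays the role of IComponentArrayPtr, MockCComPtr that of CComPtr<IComponentArray>, and loadComponents that of the generated wrapper. None of these names come from ATL.

```cpp
#include <cassert>

struct MockUnknown {               // hypothetical COM object, counts only
    long refs = 0;
    void AddRef()  { ++refs; }
    void Release() { --refs; }     // a real object would delete itself at 0
};

class MockComPtr {                 // imitates IComponentArrayPtr
public:
    MockComPtr(MockUnknown* p, bool fAddRef) : p_(p) { if (p_ && fAddRef) p_->AddRef(); }
    MockComPtr(const MockComPtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    MockComPtr& operator=(const MockComPtr&) = delete;
    ~MockComPtr() { if (p_) p_->Release(); }
    operator MockUnknown*() const { return p_; }   // implicit conversion to raw pointer
private:
    MockUnknown* p_;
};

class MockCComPtr {                // imitates CComPtr<IComponentArray>
public:
    MockCComPtr() = default;
    MockCComPtr(const MockCComPtr&) = delete;
    MockCComPtr& operator=(const MockCComPtr&) = delete;
    ~MockCComPtr() { if (p_) p_->Release(); }
    MockCComPtr& operator=(MockUnknown* p) {       // copies the pointer and AddRefs it
        if (p) p->AddRef();
        if (p_) p_->Release();
        p_ = p;
        return *this;
    }
private:
    MockUnknown* p_ = nullptr;
};

MockComPtr loadComponents(MockUnknown& server) {   // imitates the #import wrapper
    server.AddRef();                               // the "server" hands out one reference
    return MockComPtr(&server, false);             // attach without AddRef
}

struct Counts { long whileHeld; long afterRelease; };

Counts safeAssignment() {
    MockUnknown obj;
    Counts c{};
    {
        MockCComPtr holder;
        holder = loadComponents(obj);   // AddRef (2), then the temporary dies (1)
        c.whileHeld = obj.refs;
    }                                   // holder releases its reference (0)
    c.afterRelease = obj.refs;
    return c;
}
```

The count is 1 for as long as the holder lives and only drops to 0 when the holder itself goes away, which is exactly the balance one wants.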
Now back to the original client code:
Here, we assign to a "raw" interface pointer, rather than to a CComPtr.
When returning from the wrapper function,
the refcount for the interface is 1. The compiler implicitly converts
IComponentArrayPtr to IComponentArray* and directly assigns the pointer. At the
end of the statement (i.e. the end of the "full expression"), the temporary
IComponentArrayPtr is destroyed and calls Release, decrementing the
refcount to 0. The object behind the interface pointer disappears, and
subsequent method calls on compArray will fail miserably or crash!
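The failing case can be reproduced safely with a mock object which merely records that it would have been destroyed, instead of actually deleting itself. This is only a sketch: MockUnknown, MockComPtr and loadComponents are invented stand-ins for the COM object, IComponentArrayPtr and the generated wrapper.

```cpp
#include <cassert>

struct MockUnknown {               // hypothetical COM object
    long refs = 0;
    bool destroyed = false;
    void AddRef()  { ++refs; }
    void Release() { if (--refs == 0) destroyed = true; }  // real code: delete this
};

class MockComPtr {                 // imitates IComponentArrayPtr
public:
    MockComPtr(MockUnknown* p, bool fAddRef) : p_(p) { if (p_ && fAddRef) p_->AddRef(); }
    MockComPtr(const MockComPtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    MockComPtr& operator=(const MockComPtr&) = delete;
    ~MockComPtr() { if (p_) p_->Release(); }
    operator MockUnknown*() const { return p_; }   // implicit conversion to raw pointer
private:
    MockUnknown* p_;
};

MockComPtr loadComponents(MockUnknown& server) {   // imitates the #import wrapper
    server.AddRef();                               // refcount is 1
    return MockComPtr(&server, false);             // attach without AddRef
}

// Returns true if the object was "destroyed" although we still hold a raw pointer.
bool rawPointerDangles() {
    MockUnknown obj;
    MockUnknown* compArray = loadComponents(obj);  // the temporary dies at the ';'
    assert(compArray == &obj);                     // the pointer value itself looks fine
    return obj.destroyed;
}
```

The raw pointer still holds a perfectly plausible address, which is what makes this bug so hard to spot: nothing looks wrong until the next method call.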
So while ATL, in conjunction with the compiler's #import support,
is doing its best to shield us from the perils of reference counting
bugs, it won't help us if someone pulls the plug from the ATL force-field
generator by incorrectly mixing smart and raw pointers.
This kind of reference counting bug would not have occurred if I had
used raw interface pointers throughout; the mismatch in calls to AddRef
and Release would be readily apparent in such code. Still, those
smart pointers are really convenient in practice because
they make C++ COM code so much simpler to read. They do not, however,
relieve the programmer of learning about the intricacies of
reference counting. You'd better learn your IUnknown before you do
CComPtr.
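For completeness, here is a sketch of two ways to keep the books balanced, again with made-up mock classes rather than real COM. Either the result stays in a smart pointer, or the temporary is told to give up ownership before it dies. The real _com_ptr_t offers a Detach() method for the latter; the mock version below only imitates the idea, and keepSmart and detachToRaw are hypothetical helper names.

```cpp
#include <cassert>

struct MockUnknown {               // hypothetical COM object, counts only
    long refs = 0;
    void AddRef()  { ++refs; }
    void Release() { --refs; }     // a real object would delete itself at 0
};

class MockComPtr {                 // imitates IComponentArrayPtr
public:
    MockComPtr(MockUnknown* p, bool fAddRef) : p_(p) { if (p_ && fAddRef) p_->AddRef(); }
    MockComPtr(const MockComPtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    MockComPtr& operator=(const MockComPtr&) = delete;
    ~MockComPtr() { if (p_) p_->Release(); }
    // Hands the raw pointer to the caller and forgets it, so the
    // destructor no longer calls Release.
    MockUnknown* Detach() { MockUnknown* p = p_; p_ = nullptr; return p; }
private:
    MockUnknown* p_;
};

MockComPtr loadComponents(MockUnknown& server) {   // imitates the #import wrapper
    server.AddRef();
    return MockComPtr(&server, false);
}

// Option 1: stay in smart-pointer land.
long keepSmart() {
    MockUnknown obj;
    MockComPtr compArray = loadComponents(obj);    // owns the single reference
    return obj.refs;                               // count while compArray is alive
}

// Option 2: take over ownership explicitly and release by hand.
long detachToRaw() {
    MockUnknown obj;
    MockUnknown* compArray = loadComponents(obj).Detach();
    long held = obj.refs;                          // the reference survived the ';'
    compArray->Release();                          // now it is our job to release
    assert(obj.refs == 0);
    return held;
}
```

In both variants the count is still 1 after the statement completes, so the object survives until the caller is done with it.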
This reminds me of Joel Spolsky's "The Perils of JavaSchools",
which is soooo 1990 (just like myself), but good fun to read.
The other day, we were writing some .NET code for which we produced COM wrappers,
including a type library (through tlbexp). One of the exported methods
looked like this (in Managed C++):
In a COM test client which referred to said type library, the compiler
reported inexplicable errors. They hinted that the parameter named
(!) type was somehow thought to be of type System::Type...
but why? type is correctly declared as an ActionType, not as System::Type!
I won't tell you what other means we used to track down this issue;
let me just advise you to clean those chicken blood stains as early as
possible, before they stick to your keyboard just like, well, chicken blood .-)
In the end, Adam Nathan's blue bible had the right hint for us: Type libraries maintain a case-insensitive
identifier table. Once an identifier has been added to the table in one
case, any subsequent attempts to add the identifier to the table again
will simply fail, regardless of the case.
So in our example, the first "type"-ish identifier which was added to the table
was System::Type. (Or maybe it was actually a parameter called type which
was of type System::Type?). Later, the parameter name type was encountered,
but no longer added to the table because of its unlikely relative which made it
into the table first. Any subsequent references to anything called "type"
or "Type" or "tYPE" would then resolve to System::Type, with the aforementioned
consequences.
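The "first spelling wins" behaviour is easy to imitate. The following is only a toy model of a case-insensitive identifier table, not how type libraries are actually implemented; NameTable and intern are invented names.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Toy case-insensitive identifier table: the first spelling that is
// added for a name is the one every later lookup gets back.
class NameTable {
public:
    // Adds the name if no case-insensitive match exists yet and
    // returns the spelling stored in the table.
    const std::string& intern(const std::string& name) {
        // emplace leaves an existing entry untouched, so the first spelling stays.
        return table_.emplace(lower(name), name).first->second;
    }
private:
    static std::string lower(std::string s) {
        for (char& c : s)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return s;
    }
    std::map<std::string, std::string> table_;     // lower-cased key -> first spelling
};
```

If "Type" is added first, then asking for "type" or "tYPE" later hands back "Type", which mirrors how the parameter name ended up confused with its better-known relative.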
New software section (31.12.2005)
After hesitating for quite a while, I decided to split my blog into two
sections, one of which will focus on software development. While my
general blog is in German, the software section will be in English to broaden the
potential audience.
-- ClausBrod - 31 Dec 2005