Exception Handling: A False Sense Of Security

C++, Error Handling No Comments »

Long-time readers know that I have a bit of a penchant for error handling, especially with respect to exceptions. I just noticed that I have never, to my knowledge, posted about the classic article “Exception Handling: A False Sense of Security” by Tom Cargill.

Read it and weep.

C++ Exception Handling

C++, Error Handling No Comments »

Quick quiz: What is the behavior of the following code:

#include <stdio.h>

int main(int argc, char* argv[])
{
    try
    {
        *((char*) 0) = 1;
    }
    catch (…)
    {
        printf(”Caught exception.\n”);
    }

    return 0;
}

The answer is below the fold.

Read the rest of this entry »

C++ COM Error Handling

C++, Error Handling No Comments »

After a long hiatus, I wrote a bit of code in C++ over the last two days. While I haven’t yet internalized a set of rules for error handling in C++ (see my previous posts about error handling A C/C++ Error Handling Discipline: Part 1, A C/C++ Error Handling Discipline: Part 2, A C/C++ Error Handling Discipline: Part 3, and On C++ Error Handling), I decided to use C++ exceptions in this application. Furthermore, since I was using MSXML, a lot of the error handling information was contained in IErrorInfo objects. To this end, I wrote the following macro:

#define ThrowComErrorIfFailed(x) \\
    do { \\
        HRESULT hr = (x); \\
        if (FAILED(hr)) { \\
            CComPtr<IErrorInfo> spErrorInfo; \\
            BOOL hasErrorInfo = (::GetErrorInfo(0, &spErrorInfo) == S_OK); \\
            _com_raise_error(hr, hasErrorInfo ? spErrorInfo : 0); \\
        } \\
    } while (0)

This macro uses the GetErrorInfo() function to retrieve the IErrorInfo object and _com_raise_error() to throw a _com_error class as the exception. (By the way, this macro can—and probably should—be written as an inline function. Old habits die hard, I guess.)

When I first wrote this macro, I thought it might be general enough to use in any COM application. Unfortunately, I was wrong. There’s no guarantee that the GetErrorInfo() call corresponds to the most recent COM call, as there’s no guarantee that it will be cleared or reset if the COM call doesn’t support IErrorInfo. A proper macro would first test if the COM object supports IErrorInfo by querying for ISupportErrorInfo and calling ISupportErrorInfo::InterfaceSupportsErrorInfo(). Only if these tests succeed would the macro then proceed to call GetErrorInfo().

To this end, I wrote the following code:

void ThrowComErrorIfFailed
    (
    HRESULT hr,
    IUnknown* pUnk,
    REFIID iid
    )
{
    if (FAILED(hr)) {
        bool hasErrorInfo = false;
        CComPtr<IErrorInfo> spErrorInfo;

        if (pUnk)
        {
            CComQIPtr<ISupportErrorInfo> spSupportErrorInfo(pUnk);
            if (spSupportErrorInfo) {
                if (spSupportErrorInfo->InterfaceSupportsErrorInfo(iid) == S_OK) {
                    if (::GetErrorInfo(0, &spErrorInfo) == S_OK) {
                        hasErrorInfo = true;
                    }
                }
            }
        }

        _com_raise_error(hr, hasErrorInfo ? spErrorInfo : 0);
    }
}

Note that to use this function, you must provide the COM object upon which you made the call (pUnk) and the exact interface of the object (iid). Because I used ATL’s CComPtr extensively, I also wrote the following utility function:

template <class T>
void ThrowComErrorIfFailed
    (
    HRESULT hr,
    const CComPtr<T>& comPtr
    )
{
    ThrowComErrorIfFailed(hr, comPtr, __uuidof(T));
}

This experience led me to the following observation: In order to correctly use IErrorInfo, you must know what object generated it. This means that if you call a COM object which implements ISupportErrorInfo, the call fails, and you return only the HRESULT failure code, you have lost the IErrorInfo information completely. Yet another reason to use exceptions, I guess…

On C++ Error Handling

C++, Error Handling No Comments »

Raymond Chen, a developer at Microsoft, recently wrote a weblog post entitled A rant against flow control macros which cautions against using goto or return in a macro. He uses this code example to demonstrate his point:

HRESULT SomeFunction(Block *p)
{
    HRESULT hr;
    EnterCriticalSection(&g_cs);
    VALIDATE_BLOCK(p);
    MUST_SUCCEED(p->DoSomething());
    if (andSomethingElse) {
        LeaveCriticalSection(&g_cs);
        TRAP_FAILURE(p->DoSomethingElse());
        EnterCriticalSection(&g_cs);
    }
    hr = p->DoSomethingAgain();

  Cleanup:
    LeaveCriticalSection(&g_cs);
    return hr;
}

It is rather difficult to determine the correctness of this code, as Chen points out:

Is the critical section leaked? What happens if the BLOCK fails to validate? If DoSomethingElse fails, does DoSomethingAgain get called? What’s with that unused “Cleanup” label? Is there a code path that leaves the “hr” variable uninitialized?

Obviously, given this example, Chen’s qualms against flow control in macros are quite justified. Instead, Chen seems to favor the “nested-if” style of programming, at least on his blog, as demonstrated by the following code sample from this post:

void OnLButtonDown(HWND hwnd, BOOL fDoubleClick,
                   int x, int y, UINT keyFlags)
{
    IDataObject *pdto;
    if (SUCCEEDED(GetUIObjectOfFile(hwnd,
                      L"C:throwaway.txt",
                      IID_IDataObject, (void**) &pdto))) {
        IDropSource *pds = new CDropSource();
        if (pds) {
            DWORD dwEffect;
            if (DoDragDrop(pdto, pds,
                    DROPEFFECT_COPY | DROPEFFECT_LINK | DROPEFFECT_MOVE,
                    &dwEffect) == DRAGDROP_S_DROP) {
                if (dwEffect & DROPEFFECT_MOVE) {
                    DeleteFile(TEXT("C:throwaway.txt"));
                }
            }
            pds->Release();
        }
        pdto->Release();
    }
}

While I admit that this code sample can be easily understood by anyone who knows C++, and it is much easier to understand than the first example, let me demonstrate what it might look like if it had followed my recommendation to use flow control constructs in macros to simplify error handling in C/C++ from here, here, and here:

void OnLButtonDown(HWND hwnd, BOOL fDoubleClick,
                   int x, int y, UINT keyFlags)
{
    HRESULT hr;
    IDataObject *pdto = NULL;
    IDropSource *pds = NULL;
    DWORD dwEffect;

    Chk(GetUIObjectOfFile(hwnd, L"C:throwaway.txt",
            IID_IDataObject, (void**) &pdto));

    pds = new DropSource();
    ChkTrue(pds != NULL, E_OUTOFMEM);

    if (DoDragDrop(pdto, pds,
            DROPEFFECT_COPY | DROPEFFECT_LINK | DROPEFFECT_MOVE,
            &dwEffect) == DRAGDROP_S_DROP) {
        if (dwEffect & DROPEFFECT_MOVE) {
            DeleteFile(TEXT("C:throwaway.txt"));
        }
    }

  Cleanup:
    if (pds != NULL) pds->Release();
    if (pdto != NULL) pdto->Release();
}

To me, this is mostly a wash. Some functions are highly amenable to the Chk()-style of programming, such as those which are a sequence of operations which must all succeed in order. However, it seems the more non-fatal errors that the function must deal with, the worse Chk() does. Then again, I can envision functions written in the “nested-if” style which quickly run off your screen, or miss a cleanup step.

However, the Chk()-style of programming has major flaws. Note that I had to pull up all variable declarations to the top of the function to avoid the dreaded error C2362: initialization of ‘identifierr’ is skipped by ‘goto label’. (Alternatively I could have added extra blocks). This flaw becomes especially annoying when dealing with C++ classes which use the RAII idiom and do not have a default constructor as I describe here. Additionally, the initialization and comparisons to NULL, as well as executing any default constructors, add extra, possibly unnecessary work.

Some of the above points may be moot. For example, as I note in this post, to support any type of “return-value-is-error-code” programming, your objects must have default constructors (or you construct objects on the heap and return pointers, which has its own disadvantages). The common alternative is to use an exception-based error-handling model and return anything you like from a function, but I am growing more and more leery of exceptions because of articles like this, this, and this; basically one must be very careful about cleaning up side-effects if an exception is thrown, and an exception may be thrown virtually anywhere. (Andrei Alexandrescu and Petru Marginean have developed an intriguing solution to this problem called the ScopeGuard.) Note that C# and Java also have the same side-effect related problems with exceptions, but as Michael Grier has noted, without deterministic finalization and references the problem is effectively insoluble in the general case. Scary. Furthermore, have you ever written code which needs to catch and recover from an exception? It is just ugly!

What’s the solution? I’m starting to lean towards Grier’s suggestion of RAII and status codes. RAII seems to largely be a no-brainer — perhaps augmented with ScopeGuard — but I’m not sure how to deal with returning objects from functions, especially if they don’t have a default constructor. As for preferring status codes to exceptions, I found this quote by Grier to be amusing:

Ada’s original guideline on exception usage was probably the best I’ve heard (something to the effect of “no correct program may rely on exception handling for correct execution”) but stuck its head in the sand on the issue of “uh, then what do you use them for?”

I suppose the answer to the question is truly exceptional situations, like when your computer is on fire — where reasonable recovery methods are unavailable.

However, there’s a problem. Exception-based error handling mechanisms are viral — just ask anyone who wants to use the STL from a COM component. It can be terribly annoying to write functions which look like:

HRESULT DoStuff(...)
{
    HRESULT hr;

    try {
        ... use a function which throws an exception
    } catch (...) {
        hr = E_FAIL;
    }

    ...
}

There’s also the problem that without exceptions, one cannot generate a failure from a constructor or overloaded operator. (BTW, one shouldn’t ever throw an exception from a destructor as per the basic exception safety guarantee).

I would be remiss if I did not also mention that there are some operations which may “fail”, in that they do something nonobvious and often unexpected, but do not produce an error code of any kind. The now-classic example is integer overflow, as illustrated by Chen’s post Integer overflow in the new[] operator. It may be a good idea to wrap these operations into the error-handling model you use — I showed how to throw an exception upon overflow here — but it introduces yet another possible point of failure and puts further demands on your error recovery mechanisms.

Error handling is hard!

Writing Exception-Safe Code

C++, Error Handling No Comments »

Today I ran across the article catch considered harmful on Michael Grier’s weblog which discusses some of the perils of exceptions. This, combined with articles in the C++ Users Journal about the difficulties in writing exception-safe containers, has given me terrible pause. Considering Herb Sutter’s excellent reputation, I want to read the following article as it may help clean up the mess:

Sutter, Herb. “When and How to Use Exceptions.” C++ Users Journal, August 2004.

It seems that most of the problems of exceptions come from interactions with persistent state, which functional languages (the original source of exceptions) do not have.

Programming is much more difficult than most programmers think it is, especially considering how even seemingly simple bugs such as integer overflow can lead to viruses and losses of billions of dollars in productivity.

A C/C++ Error Handling Discipline: Part 3

C, C++, Error Handling No Comments »

One major disadvantage of using the error handling style I mentioned in A C/C++ Error Handling Discipline is that as you are using out parameters as opposed to a return value, any object returned from a function must have a default constructor. The alternative is to return values directly from the function and use exceptions for error handling.

This obviously isn’t an issue if you are using COM as you are forced into HRESULT-style code.

A C/C++ Error Handling Discipline: Part 2

C, C++, Error Handling No Comments »

In my post A C/C++ Error Handling Discipline, I described a macro-based method for cleanly handling errors along with proper resource deallocation in C/C++. I want to point out a few observations I’ve made since then:

  • Using the version of the Chk() macro which tests for SUCCEEDED(hr) before executing the statement means that if you ever want to execute any code not within a Chk() block, you must explicitly check for SUCCEEDED(hr) before executing it. For example:

    #define Chk(f)
        do {
            if (SUCCEEDED(hr)) {
                hr = (f);
            }
        } while (0)
    
    HRESULT DoStuff()
    {
        HRESULT hr = S_OK;
        RESOURCE r1 = INVALID_RESOURCE_VALUE;
    
        Chk(AllocResource(&r1));
    
        // The following line of code will execute EVEN IF AllocResource failed.
        // Consider wrapping in SUCCEEDED(hr).
        if (SomeEarlyTerminationConditionIsMet())
            goto Cleanup;
    
        ...
    
    Cleanup:
        if (r1 != INVALID_RESOURCE_VALUE)
            FreeResource(r1);
    
        return hr;
    }
    

    This is different from the version of Chk() which uses goto to jump to the end of the function, as this guarantees no subsequent code will be executed. Therefore, I recommend using the goto-based version of Chk().

  • However, using the goto-based version of Chk() interacts poorly with C++, due to C++’s ability to declare a variable anywhere within the function. It interacts especially poorly with C++’s RAII (resource acquisition is initialization) idiom, as this may mean you cannot declare the variable until somewhere in the middle of your function (after you have computed some value the constructor requires, for example). The problem stems from the fact that the goto may jump past the constructor of any object after the Chk() test — this will cause most (all?) C++ compilers to generate an error. For example:

    #define Chk(f)
        do {
            hr = (f);
            if (FAILED(hr)) {
                goto Cleanup;
            }
        } while (0)
    
    HRESULT DoStuff()
    {
        HRESULT hr = S_OK;
    
        Chk(Function1());
    
        std::string s("This is an argument to Function2()");
        Chk(Function2(s));
    
    Cleanup:
        return hr;
    }
    

    Note how if Function1 fails the constructor for s will never execute, but s is still in scope and could hypothetically be used. Therefore, C++ compilers will usually forbid this construct.

    To solve this problem, one has a number of options:

    • Use a C-like style of declaration at the top of the function block and usage anywhere else in the function. The downside to this is that a default constructor must exist and any work done in the default constructor is wasted. Furthermore, this may directly conflict with the RAII idiom and it requires your objects to support an “uninitialized” state.
    • Encase the use of variables such as s in their own context by using { and }. This is a little ugly and sometimes requires recursive nesting, which can get out of hand.
    • Recommended: Write/use C++ class wrappers for all dynamically acquired resources using RAII and replace the goto within Chk() with return. However, once you’ve reached this point, you are only an easy step or two away from exceptions…
    • Hope that someday C++ compilers become smart enough to notice that even though you used goto to jump past the initialization of the object, you never use the object after that point, and thus there is nothing to worry about (however, this may not be technically possible).

Update: See this post for some more thoughts.

A C/C++ Error Handling Discipline: Part 1

C, C++, Error Handling No Comments »

One of my passions (although I don’t know why) is proper error handling in programming. I find the fact that so few people do it properly incredibly frustrating, but I find solace in the fact that doing proper error handling is hard, and I have yet to find a solution which I deem fully satisfactory. Exceptions come close, but their terrible performance implications, out-of-bound processing, and sometimes expressly forbidden nature (such as when writing a COM function) leads to the conclusion that they should only be used for exceptional situations and cannot be used universally.

An interesting technical article entitled Thinking About Error Cases discusses a number of commong error-handling styles found in C and C++. Without talking too much about the advantages and disadvantages to these styles (I shall do so in a future post, in addition to discussing a new style), I shall simply state matter-of-factly that many Windows components simply return HRESULTs as a status code and encode return values in function parameters, and this style is usually sufficient.

One of the advantages of using the HRESULT-based error-handling style is that if your function also returns a HRESULT, you can typically return any failure codes as-is (somewhat reminiscent of exceptions). A naïve implementation of such code might look like:

HRESULT DoStuff()
{
    HRESULT hr;

    hr = Function1();
    if (FAILED(hr))
        return hr;

    hr = Function2();
    if (FAILED(hr))
        return hr;

    return S_OK;
}

The main downside to this code is that if there are any resources that were allocated by the function (such as file handles, dynamically allocated memory, etc.) they almost always will not be deallocated when a failure occurs. An exception to this case is if you wrap your resources in C++ objects using the RAII (resource acquisition is initialization) idiom where the object destructors deallocate the resources.

Given that you are not wrapping your resources in C++ objects, how might you properly perform deallocation of resources? Your first attempt might be to have each failure deallocate any resources that were allocated previously, such as:

HRESULT DoStuff()
{
    HRESULT hr;
    RESOURCE r1, r2, r3;

    hr = AllocResource(&r1);
    if (FAILED(hr))
        return hr;

    hr = AllocResource(&r2);
    if (FAILED(hr)) {
        FreeResource(&r1);
        return hr;
    }

    hr = AllocResource(&r3);
    if (FAILED(hr)) {
        // Free in the reverse order of allocation
        FreeResource(&r2);
        FreeResource(&r1);
        return hr;
    }

    ...
}

This style is obviously tedious and error-prone. Furthermore, in addition to the duplication of cleanup code between failure blocks, the function must also duplicate the cleanup code once again as part of freeing resources during normal operation. However, this was something I personally, briefly experimented with when trying to settle on a error-handling style of my own.

Your next step might be to observe that you only need to execute a subsequent operation if all previous operations were successful. By keeping track of what resources have been successfully allocated (an easy task when you can represent a not-yet-allocated resource by a unique value such as NULL), you can use this to your advantage by writing code in the following style:

HRESULT DoStuff()
{
    HRESULT hr;
    RESOURCE r1 = INVALID_RESOURCE_VALUE;
    RESOURCE r2 = INVALID_RESOURCE_VALUE;

    hr = AllocResource(&r1);
    if (SUCCEEDED(hr)) {
        hr = AllocResource(&r2);
        if (SUCCEEDED(hr)) {
            hr = FunctionWhichUsesResources(r1, r2);
            if (SUCCEEDED(hr)) {
                ...
            }
        }
    }

    // Perform cleanup
    if (r2 != INVALID_RESOURCE_VALUE)
        FreeResource(r2);
    if (r1 != INVALID_RESOURCE_VALUE)
        FreeResource(r1);

    return hr;
}

This style is a vast improvement upon the previous in that the cleanup code is performed exactly once. The nesting gets a little out of hand but it isn’t strictly necessary: the above code is equivalent to:

HRESULT DoStuff()
{
    HRESULT hr;
    RESOURCE r1 = INVALID_RESOURCE_VALUE;
    RESOURCE r2 = INVALID_RESOURCE_VALUE;

    hr = AllocResource(&r1);
    if (SUCCEEDED(hr))
        hr = AllocResource(&r2);
    if (SUCCEEDED(hr))
        hr = FunctionWhichUsesResources(r1, r2);
    if (SUCCEEDED(hr))
        ...

    // Perform cleanup
    if (r2 != INVALID_RESOURCE_VALUE)
        FreeResource(r2);
    if (r1 != INVALID_RESOURCE_VALUE)
        FreeResource(r1);

    return hr;
}

I believe this style to be somewhat common, but I still think it could use some improvement, what with all the SUCCEEDED(hr)s in there. Quick: What C/C++ feature can you think of which will allow you to easily repeat the same code over and over? That’s right, macros! One could easily make the following changes to make the above code a little cleaner (although ultimately I do not recommend it, see this post for why):

#define Chk(f)
    do {
        if (SUCCEEDED(hr)) {
            hr = (f);
        }
    } while (0)

HRESULT DoStuff()
{
    HRESULT hr;
    RESOURCE r1 = INVALID_RESOURCE_VALUE;
    RESOURCE r2 = INVALID_RESOURCE_VALUE;

    hr = AllocResource(&r1);
    Chk(AllocResource(&r2));
    Chk(FunctionWhichUsesResources(r1, r2));
    Chk(Function2WhichUsesResource(r1, r2));

    // Perform cleanup
    if (r2 != INVALID_RESOURCE_VALUE)
        FreeResource(r2);
    if (r1 != INVALID_RESOURCE_VALUE)
        FreeResource(r1);

    return hr;
}

Note that the macro uses the standard do/while trick to embed multiple statements into one macro. Also note it has the implicit dependency that there must be an HRESULT named hr in the function.

We can even make things a little cleaner. Changing the first line of the function to be HRESULT hr = S_OK allows you to use the Chk() macro consistently throughout the function:

#define Chk(f)
    do {
        if (SUCCEEDED(hr)) {
            hr = (f);
        }
    } while (0)

HRESULT DoStuff()
{
    HRESULT hr = S_OK;
    RESOURCE r1 = INVALID_RESOURCE_VALUE;
    RESOURCE r2 = INVALID_RESOURCE_VALUE;

    Chk(AllocResource(&r1));
    Chk(AllocResource(&r2));
    Chk(FunctionWhichUsesResources(r1, r2));
    Chk(Function2WhichUsesResource(r1, r2));

    // Perform cleanup
    if (r2 != INVALID_RESOURCE_VALUE)
        FreeResource(r2);
    if (r1 != INVALID_RESOURCE_VALUE)
        FreeResource(r1);

    return hr;
}

This is pretty good, but what if there is a condition where you consider the function successful, but you would like to terminate without further processing. For this case, the easiest thing is probably to tell Edsger Dijkstra to shove it and use a goto statement:

#define Chk(f)
    do {
        if (SUCCEEDED(hr)) {
            hr = (f);
        }
    } while (0)

HRESULT DoStuff()
{
    HRESULT hr = S_OK;
    RESOURCE r1 = INVALID_RESOURCE_VALUE;
    RESOURCE r2 = INVALID_RESOURCE_VALUE;

    Chk(AllocResource(&r1));
    Chk(AllocResource(&r2));
    Chk(FunctionWhichUsesResources(r1, r2));
    if (SomeEarlyTerminationConditionIsMet())
        goto Cleanup;
    Chk(Function2WhichUsesResource(r1, r2));

Cleanup:
    if (r2 != INVALID_RESOURCE_VALUE)
        FreeResource(r2);
    if (r1 != INVALID_RESOURCE_VALUE)
        FreeResource(r1);

    return hr;
}

Recommended: (see this post) Now that we have used goto once, why not use it everywhere? For my code, I standardize on the presence of a label such as Cleanup and have the Chk() macro goto upon failure. A Chk() which uses goto looks like:

#define Chk(f)
    do {
        hr = (f);
        if (FAILED(hr)) {
            goto Cleanup;
        }
    } while (0)

Now that all this infrastructure is there, you can do some pretty neat things. Want to print to the screen every time we encounter a failed HRESULT in a debug build? Simply modify Chk() such as:

#ifdef DEBUG
# define IfDebug(x) (x)
#else
# define IfDebug(x)
#endif

#define Chk(f)
    do {
        hr = (f);
        if (FAILED(hr)) {
            IfDebug(printf("Encountered hr = 0x%x at %s:%d (in function %s)", hr,
                            __FILE__, __LINE__, __FUNCTION__));
            goto Cleanup;
        }
    } while (0)

Want to simplify the if (...) goto Cleanup; for early termination? Write a ChkTrue() macro:

// If f is false, unconditionally set hr to hrNew and jump to Cleanup
#define ChkTrue(f, hrNew)
    do {
        if (!(f)) {
            hr = (hrNew);
            goto Cleanup;
        }
    } while (0)

Want to call Windows functions which return a boolean and provide extended error information in GetLastError() (such as CreateDirectory?):

#define ChkTrueGLE(f) ChkTrue(f, HRESULT_FROM_WIN32(::GetLastError()))

Want to call Windows functions which return a HANDLE and provide extended error information in GetLastError() (such as CreateFile?):

#define ChkHandleGLE(f) ChkTrueGLE((f) != INVALID_HANDLE_VALUE)

Once this infrastructure is in place, writing code which properly handles errors becomes a lot easier!

Update: Be sure to read this post with further observations about this style of error handling.

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in