Debugging Lesson

C No Comments »

Today, I had to debug a crash which was caused by a developer providing an untrusted string to a printf() function. This string had escape sequences in it which caused printf() to start popping parameters off the stack when no such parameters existed. While I was fortunate in that the crash was immediate and reproducible, I must note the following:

  • DO NOT DO THIS. Either use printf("%s", str); or one of the puts() functions.
  • Variable-argument functions are inherently brittle and crash-prone.
  • If you suspect the crash is due to a buffer-overflow problem such as this, and you are trying to debug after the behavior has already happened (for example, after the access violation exception has been thrown) don’t blindly trust the values of any variables the debugger tells you after this point, as the stack is likely garbage.

The proper way for writing multiple-line macros in C

C, C++ No Comments »

An article I just read on Eric Lippert’s blog finally made me realize the full justification for why I write C/C++ multi-line macros the way I do. Now to build up the justification piece by piece…

Imagine you are writing a macro, DoStuff(), which must call two functions in sequence, Func1() and Func2(). Naïvely, one might write:

#define DoStuff() Func1(); Func2();

Now say you want to execute DoStuff() only if SomeTest() returns true. One might reasonably write:

if (SomeTest())
    DoStuff();

What does the C preprocessor expand this into?

if (SomeTest())
    Func1(); Func2();;

Due to C’s syntax rules, the semicolon after Func1() will terminate the if statement. This means that Func2() will be executed regardless of the returned value of SomeTest()! Not exactly the behavior the user of the macro expected!

There is also an extra semicolon after Func2(), but it is harmless in this instance as it will be interpreted by the compiler as a null statement. However, this extra semicolon will end up being rather important in the next example!

The next idea might be to encase the macro in a scoping block using { and }. The macro would then look like:

#define DoStuff() { Func1(); Func2(); }

Expanding the macro as in the above example will result in the code:

if (SomeTest())
    { Func1(); Func2(); };

Excellent — the code now functions as expected, with the extra semicolon harmlessly terminating the if statement. However, there’s a different problem with this construct. Consider the following use of the macro:

if (SomeTest())
    DoStuff();
else
    Func3();

This will be expanded by the C preprocessor into:

if (SomeTest())
    { Func1(); Func2(); };
else
    Func3();

As I mentioned before, the semicolon at the end of the { } block will terminate the if statement, which means that the else statement is no longer bound to any if, and the code will not compile! Uh-oh!

Naturally, there is a solution, which is to use a do { ... } while (0) construct when writing multi-line macros, as follows:

#define DoStuff() do { Func1(); Func2(); } while (0)

The C preprocessor will expand this into:

if (SomeTest())
    do { Func1(); Func2(); } while (0);
else
    Func3();

This code compiles and acts exactly how a user of the macro would expect it to. Therefore, when writing multi-line macros, be sure to encase them in do { ... } while(0).

A C/C++ Error Handling Discipline: Part 3

C, C++, Error Handling No Comments »

One major disadvantage of using the error handling style I mentioned in A C/C++ Error Handling Discipline is that as you are using out parameters as opposed to a return value, any object returned from a function must have a default constructor. The alternative is to return values directly from the function and use exceptions for error handling.

This obviously isn’t an issue if you are using COM as you are forced into HRESULT-style code.

A C/C++ Error Handling Discipline: Part 2

C, C++, Error Handling No Comments »

In my post A C/C++ Error Handling Discipline, I described a macro-based method for cleanly handling errors along with proper resource deallocation in C/C++. I want to point out a few observations I’ve made since then:

  • Using the version of the Chk() macro which tests for SUCCEEDED(hr) before executing the statement means that if you ever want to execute any code not within a Chk() block, you must explicitly check for SUCCEEDED(hr) before executing it. For example:

    #define Chk(f)
        do {
            if (SUCCEEDED(hr)) {
                hr = (f);
            }
        } while (0)
    
    HRESULT DoStuff()
    {
        HRESULT hr = S_OK;
        RESOURCE r1 = INVALID_RESOURCE_VALUE;
    
        Chk(AllocResource(&r1));
    
        // The following line of code will execute EVEN IF AllocResource failed.
        // Consider wrapping in SUCCEEDED(hr).
        if (SomeEarlyTerminationConditionIsMet())
            goto Cleanup;
    
        ...
    
    Cleanup:
        if (r1 != INVALID_RESOURCE_VALUE)
            FreeResource(r1);
    
        return hr;
    }
    

    This is different from the version of Chk() which uses goto to jump to the end of the function, as this guarantees no subsequent code will be executed. Therefore, I recommend using the goto-based version of Chk().

  • However, using the goto-based version of Chk() interacts poorly with C++, due to C++’s ability to declare a variable anywhere within the function. It interacts especially poorly with C++’s RAII (resource acquisition is initialization) idiom, as this may mean you cannot declare the variable until somewhere in the middle of your function (after you have computed some value the constructor requires, for example). The problem stems from the fact that the goto may jump past the constructor of any object after the Chk() test — this will cause most (all?) C++ compilers to generate an error. For example:

    #define Chk(f)
        do {
            hr = (f);
            if (FAILED(hr)) {
                goto Cleanup;
            }
        } while (0)
    
    HRESULT DoStuff()
    {
        HRESULT hr = S_OK;
    
        Chk(Function1());
    
        std::string s("This is an argument to Function2()");
        Chk(Function2(s));
    
    Cleanup:
        return hr;
    }
    

    Note how if Function1 fails the constructor for s will never execute, but s is still in scope and could hypothetically be used. Therefore, C++ compilers will usually forbid this construct.

    To solve this problem, one has a number of options:

    • Use a C-like style of declaration at the top of the function block and usage anywhere else in the function. The downside to this is that a default constructor must exist and any work done in the default constructor is wasted. Furthermore, this may directly conflict with the RAII idiom and it requires your objects to support an “uninitialized” state.
    • Encase the use of variables such as s in their own context by using { and }. This is a little ugly and sometimes requires recursive nesting, which can get out of hand.
    • Recommended: Write/use C++ class wrappers for all dynamically acquired resources using RAII and replace the goto within Chk() with return. However, once you’ve reached this point, you are only an easy step or two away from exceptions…
    • Hope that someday C++ compilers become smart enough to notice that even though you used goto to jump past the initialization of the object, you never use the object after that point, and thus there is nothing to worry about (however, this may not be technically possible).

Update: See this post for some more thoughts.

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in