STL objects and module boundaries

STL, Win32 3 Comments »

Let’s say you have the following function:

  1. void AppendChar(std::string& s, char ch)
  2. {
  3.     s += ch;
  4. }

What happens if this function is exported as an ordinal function from a DLL (not an inlined piece of code inside a header) and you call it from an EXE?
Read the rest of this entry »

STL Map Use

STL 2 Comments »

What’s wrong with the following code?

  1. template<typename T1, typename T2>
  2. struct my_pair
  3. {
  4.     typedef T1 first_type;
  5.     typedef T2 second_type;
  6.  
  7.     my_pair() : first(T1()), second(T2()) {}
  8.     my_pair(const T1& v1, const T2& v2) : first(v1), second(v2) {}
  9.  
  10.     T1 first;
  11.     T2 second;
  12. };
  13.  
  14. template<typename T1, typename T2>
  15. inline bool operator<
  16.     (
  17.     const my_pair<T1, T2>& x,
  18.     const my_pair<T1, T2>& y
  19.     )
  20. {
  21.     return (x.first < y.first || x.second < y.second);
  22. }
  23.  
  24. void f()
  25. {
  26.     typedef my_pair<…, …> key_type;
  27.     typedefvalue_type;
  28.     typedef std::map<key_type, value_type> map_type;
  29.  
  30.     map_type map;
  31.     // Use map
  32. }

Read the rest of this entry »

STL Vector Use

STL 4 Comments »

I recently wrote a piece of code that looked something like the following:

  1. static const int NUM_TOTAL_VALUES = …;
  2. typedefT;
  3.  
  4. // Create vec and reserve NUM_TOTAL_VALUES spaces for later insertion
  5. std::vector<T> vec(NUM_TOTAL_VALUES);
  6.  
  7. // Insert values into vec
  8. for (int i = 0; i != NUM_TOTAL_VALUES; ++i)
  9.     vec.push_back();
  10.  
  11. // vec should now have NUM_TOTAL_VALUES values in it (but doesn’t!)

What’s wrong with this code?

Read the rest of this entry »

Simple STL Map Tricks

STL No Comments »

Let’s say you have two STL std::maps with identical types, and you want to copy all the elements from one to the other. The easiest way to do this is to use map::insert():

typedef std::map<key_type, value_type> map_t;

map_t map1;
map_t map2;

// Copy all elements from map1 to map2
map2.insert(map1.begin(), map1.end());

Alternatively, you could use the STL std::copy algorithm:

// Copy all elements from map1 to map2.
std::copy(map1.begin(), map1.end(), std::inserter(map2, map2.begin()));

Both methods’ performance should be an amortized O(n) because they insert records in sorted order and use the hinting form of map::insert.

Note that because both methods ultimately call map::insert they will not overwrite a preexisting key’s associated value. In other words, if map1 has the value V1 associated with key K and map2 the value V2 associated with the same key K, V2 will remain in map2 after the copy operation.

Let’s say you want to perform the copy but have map1’s values overwrite map2’s for identical keys. The first way to solve this problem that entered my mind was to write my own OutputIterator which performs an overwriting assignment and pass it to std::copy. However, there’s a far simpler approach. You can copy map2’s values into map1, relying on the fact that map2’s values won’t overwrite map1’s, and then swap the results:

map1.insert(map2.begin(), map2.end());
map2.swap(map1);

Thanks, Sam, for helping me figure all this out.

C++, STL, DLLs, and Buggy Optimizations

STL No Comments »

Recently a bug in the STL implementation that comes with Visual C++ 6.0 was brought to my attention. Consider the following function:

typedef std::map<std::string, std::string> map_t;

void f(const map_t& map)
{
    for (map_t::const_iterator iter = map.begin();
         iter != map.end();
         ++iter)
    {
    }
}

For Visual C++ 6.0, if you call f() from within the same EXE or DLL, it works fine, but if calling f() requires you to cross an EXE or DLL barrier, it will fail.

After some debugging, I figured out that this bug is because Visual C++ 6.0’s implementation of std::map, which stores data in a tree structure internally, uses a class static variable to represent a NULL tree node. After putting f() in a separate DLL, the std::map class static value gets initialized to one value in the EXE (on a run on my machine, 0x002f1000) and another value in the DLL (on a run on my machine, 0x003127c0). The ++iter line in the DLL leads to a loop which terminates when it encounters a node it thinks is NULL (0x003127c0). However, as the NULL node was set by the EXE, it has the value 0x002f1000, so the loop within the DLL walks off the tree and crashes.

Per Microsoft Knowledge Base Article 172396: You may experience an access violation when you access an STL object through a pointer or reference in a different DLL or EXE, this is a problem that’s fairly endemic across all of VC++ 6.0 when using STL objects across DLL/EXE barriers. The reason is Microsoft’s STL implementation used class static variables in a number of places to try to avoid storing multiple copies of the same information, but they did not anticipate this DLL-related problem. By the way, this bug is reportedly fixed in Visual C++ 7.

The KB article has a number of suggested workarounds, but I don’t like any of them. Dinkumware, who wrote the Visual C++ 6.0 STL library, has provided a xtree header replacement which fixes the bug by making the value of NULL tree nodes a member of the class rather than a class static variable as part of their Fixes for Library Bugs in VC++ V5.0/V6.0 page, but I don’t like the idea of using third-party compiler header replacements.

My recommendation is to avoid passing STL objects as parameters to DLLs. I also suggest being very careful about using class static variables in your own code.

Write Functions Which Take Iterators, Not Collections

C#, STL No Comments »

If my experience is typical, this is a very common construct:

ReturnType Function
    (
    const std::vector<T>& container
    )
{
    typedef std::vector<T>::const_iterator iterator_t;
    for (iterator_t iter = container.begin();
         iter != container.end();
         ++iter) {
        // Work with *iter
    }
}

The problem with this construct is that you have forced a container choice upon the user of your function. Slightly better, and basically your only choice when interoping with C, is this:

ReturnType Function
    (
    T* array,
    int numItems
    )
{
    for (int i = 0; i < numItems; ++i) {
        // Work with array[numItems]
    }

    // Or perhaps:
    // for (T* pT = array; pT != array + numItems; ++pT) {
    //     Work with *pT
    // }
}

With the above construct you can pass in any container which uses contiguous storage, such as an array or a std::vector (yes, std::vectors are guaranteed to store the data contiguously). Passing a std::vector to the above function looks like:

std::vector<T> v;
ReturnType ret = Function(v.empty() ? 0 : &v[0], v.size());

However, in C++ its far better to do as the STL does and write your function to accept iterators, as in:

template <class InputIterator>
ReturnType Function
    (
    InputIterator first,
    InputIterator last
    )
{
    for (InputIterator iter = first; iter != last; ++iter) {
        // Work with *iter
    }
}

By using this construct, you allow vast usage flexibility. Try to limit yourself to input iterator expressions on first and last (basically preincrement, dereference, and comparison) to minimize the requirements the InputIterator class must fulfill.

Most (all?) STL containers can pass their contents to Function() by using the containers’ begin() and end() functions, as in:

std::vector<T> v;
ReturnType ret = Function(v.begin(), v.end());

As C pointers are random-access iterators, you can pass arrays to Function() as follows:

const int arraySize = ...;
T array[arraySize];

ReturnType ret = Function(array, array + arraySize);

By the way, this lesson also applies to C#: prefer writing functions which accept IEnumerables rather than collections such as arrays. (In C# 2.0, you should be able to regain the lost typesafety by accepting IEnumerable<T>)

Prefer Iteration To Indexing

STL No Comments »

I’ve seen the following STL construct countless times:

std::vector<T> container;

for (int i = 0; i < container.size(); ++i) {
    // Work with container[i]
}

Unless otherwise necessary, it is better to use an STL iterator because it enables you to more easily change the underlying container. You can isolate the code changes required to one line by using typedef, as in:

typedef std::vector<T> container_t;
container_t container;

// Or ::const_iterator as necessary
for (container_t::iterator iter = container.begin();
     iter != container.end();
     ++iter) {
    // Work with *iter
}

Note that I wrote iter != container.end() as opposed to iter < container.end(). The former only requires an input iterator, while the latter requires a random access iterator—a more complicated iterator type supported by fewer STL containers.

Also note that I wrote ++iter as opposed to iter++. In general, you should prefer the former expression because it is always at least as efficient as the latter, and often times more so.

Of course, for a code block like the one above, you really should consider using the STL algorithm for_each.

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in