A Function Which Returns A Pointer To Itself In C++

Courtesy of my current reading, Herb Sutter’s More Exceptional C++, I have learned that it is possible in C++ to write a function which returns a pointer to itself using a somewhat sneaky trick:

class FuncPtr_;
typedef FuncPtr_ (*FuncPtr)();

class FuncPtr_
{
public:
    FuncPtr_(FuncPtr p) : m_p(p) {}
    operator FuncPtr() { return m_p; }

private:
    FuncPtr m_p;
};

FuncPtr_ f() { return f; }

OK, I admit that this function doesn’t exactly return a pointer to itself—it returns an instance of a class which is implicitly convertible to a pointer to the original function—but it accomplishes the same thing in effect. For example:

FuncPtr p = f();
p(); // or (*p)();

C# Utility Class: CompoundKey

While working in C#, I’ve often found that I want to use more than one variable as a key to a Hashtable. I’ve abstracted this functionality into an object called CompoundKey. CompoundKey allows one to combine any number of variables into a single object which, through proper implementation of Equals(), GetHashCode(), and ToString(), can be used as a key to any IDictionary or even System.Web.Caching.Cache.

Usage is very simple. For example:

IDictionary urlUserAccessTimes = new Hashtable();
CompoundKey urlUserKey = new CompoundKey(new Uri("http://www.deez.info/sengelha/"), "Steven Engelhardt");
urlUserAccessTimes[urlUserKey] = DateTime.Now;

Here’s the code:

CompoundKey code
using System.Diagnostics;
using System.Text;

/// <summary>
/// Creates a key for an IDictionary or a System.Web.Caching.Cache
/// out of a collection of values.
/// </summary>
/// <remarks>
/// Each value stored in CompoundKey must implement Equals()
/// correctly.
/// </remarks>
public struct CompoundKey
{
    private object[] m_keyParts;

    public CompoundKey(params object[] keyParts)
    {
        Debug.Assert(keyParts != null);

        m_keyParts = keyParts;
    }

    public override bool Equals(object obj)
    {
        if (!(obj is CompoundKey))
            return false;

        CompoundKey key = (CompoundKey) obj;
        return ArrayUtils.Equals(m_keyParts, key.m_keyParts);
    }

    public override int GetHashCode()
    {
        int hashCode = 0;
        foreach (object keyPart in m_keyParts)
        {
            if (keyPart != null)
            {
                hashCode ^= keyPart.GetHashCode();
            }
        }
        return hashCode;
    }

    /// <remarks>
    /// Unfortunately, System.Web.Caching.Cache uses strings as keys
    /// instead of objects.
    /// </remarks>
    public override string ToString()
    {
        StringBuilder sb = new StringBuilder();

        foreach (object keyPart in m_keyParts)
        {
            if (sb.Length > 0)
                sb.Append(",");
            sb.Append(keyPart != null ? keyPart.ToString() : "(null)");
        }

        return sb.ToString();
    }
}
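
One property of the GetHashCode() above is worth noting: because XOR is commutative and self-cancelling, keys whose parts differ only in order hash identically, and two equal parts cancel each other out. That is still correct (Equals() decides the final match), but it can cause extra collisions. Here is a minimal sketch of an order-sensitive alternative. The name HashCombiner, the seed 17, and the multiplier 31 are illustrative assumptions, not part of CompoundKey.

```csharp
using System;

public static class HashCombiner  // hypothetical helper, not part of CompoundKey
{
    // Combines the hash codes of the parts so that their order matters.
    public static int Combine(params object[] parts)
    {
        if (parts == null)
            throw new ArgumentNullException("parts");

        unchecked // integer overflow is expected and simply wraps around
        {
            int hash = 17;
            foreach (object part in parts)
            {
                // A null part contributes 0 but still shifts the running hash.
                hash = hash * 31 + (part != null ? part.GetHashCode() : 0);
            }
            return hash;
        }
    }
}
```

With integer parts, HashCombiner.Combine(1, 2) and HashCombiner.Combine(2, 1) produce different values, whereas 1 ^ 2 and 2 ^ 1 are equal.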

C# Utility Class: ArrayUtils

At work, I’ve written a collection of C# utility classes which implement commonly used functionality. The classes are encapsulated in a class library project which is included by virtually every C# application or class library I write. One of the simplest utility classes I’ve written is called ArrayUtils, and it is (duh) a set of useful functions for dealing with arrays. First, the code:

ArrayUtils code
using System;
using System.Diagnostics;

/// <summary>
/// ArrayUtils is a collection of static helper functions which implement
/// common array tasks.
/// </summary>
public sealed class ArrayUtils
{
    /// <summary>
    /// Determines whether the provided array contains the specified
    /// member.
    /// </summary>
    /// <remarks>
    /// Execution time is O(n).
    /// </remarks>
    public static bool Contains(Array a1, object val)
    {
        Debug.Assert(a1 != null);
        Debug.Assert(val != null);

        foreach (object o in a1)
        {
            if (Object.Equals(o, val))
                return true;
        }

        return false;
    }

    /// <summary>
    /// Determines whether the provided array has any duplicated members.
    /// </summary>
    /// <remarks>
    /// Execution time is O(n^2).
    /// </remarks>
    public static bool ContainsDuplicates(Array a1)
    {
        Debug.Assert(a1 != null);

        for (int i = 0; i < a1.Length; i++)
        {
            for (int j = i + 1; j < a1.Length; j++)
            {
                if (Object.Equals(a1.GetValue(i), a1.GetValue(j)))
                {
                    return true;
                }
            }
        }

        return false;
    }

    /// <summary>
    /// Determine whether the two arrays are equal.  Equality is defined
    /// as having the same number of members and each member, in order,
    /// matches the corresponding member in the other array.
    /// </summary>
    /// <remarks>
    /// Execution time is O(n).
    /// </remarks>
    public static bool Equals(Array a1, Array a2)
    {
        Debug.Assert(a1 != null);
        Debug.Assert(a2 != null);

        if (a1.Length != a2.Length)
            return false;

        for (int i = 0; i < a1.Length; i++)
        {
            if (!Object.Equals(a1.GetValue(i), a2.GetValue(i)))
                return false;
        }

        return true;
    }

    private ArrayUtils() {}
}

Most of the code is very straightforward. Here are a few pieces which I believe deserve closer attention:

public sealed class ArrayUtils
{
    ...

    private ArrayUtils() {}
}

The Utils suffix on the class name is a convention I use to denote a class with only static methods. The combination of sealed and a private constructor is a common idiom when declaring such classes. The next version of C#, which ships with Whidbey, introduces a static class modifier to denote such classes; classes declared with this modifier will be disallowed from declaring any non-static members.
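
Assuming a compiler that supports the new modifier, the sealed-plus-private-constructor idiom would collapse to something like the following sketch. StringUtils and IsBlank are hypothetical names chosen only for illustration.

```csharp
// Requires a compiler with static class support (C# 2.0 or later).
public static class StringUtils  // hypothetical utility class
{
    // Returns true when the string is null or contains only white space.
    public static bool IsBlank(string s)
    {
        return s == null || s.Trim().Length == 0;
    }

    // Uncommenting the next line is a compile-time error, because a static
    // class may declare only static members:
    // public int InstanceCount;
}
```

No private constructor is needed: the compiler itself refuses to instantiate or derive from a static class.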

public static bool Contains(Array a1, object val)
{
    Debug.Assert(a1 != null);
    Debug.Assert(val != null);

    ...
}

I tend to use debug-only parameter validation for my utility classes; the alternative is to do things such as throw ArgumentNullExceptions. My justification is that these utility methods are internal and all validation should have been performed before calling them. I’m not particularly beholden to this convention—I consider there to be a decent chance that in the future I will change my mind.
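
For comparison, here is a rough sketch of the alternative: the same Contains() logic with validation that also runs in release builds. CheckedArrayUtils is a hypothetical name; only the two checks differ from the method shown earlier.

```csharp
using System;

public static class CheckedArrayUtils  // hypothetical variant of ArrayUtils
{
    public static bool Contains(Array a1, object val)
    {
        // These checks survive in release builds, unlike Debug.Assert().
        if (a1 == null)
            throw new ArgumentNullException("a1");
        if (val == null)
            throw new ArgumentNullException("val");

        foreach (object o in a1)
        {
            if (Object.Equals(o, val))
                return true;
        }

        return false;
    }
}
```

The trade-off is a small run-time cost on every call in exchange for a clear exception at the point of misuse.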

if (Object.Equals(o, val))

I am very careful to use Object.Equals() throughout this class to allow for the possibility that either object may be null and to consider two null objects identical.
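
The difference between the static and instance forms can be sketched as follows. EqualityDemo and its method names are made up for this illustration.

```csharp
using System;

public static class EqualityDemo  // hypothetical name
{
    // Static form: never throws, and treats two nulls as equal.
    public static bool SafeEquals(object a, object b)
    {
        return Object.Equals(a, b);
    }

    // Instance form: throws NullReferenceException when a is null.
    public static bool UnsafeEquals(object a, object b)
    {
        return a.Equals(b);
    }
}
```

SafeEquals(null, null) returns true and SafeEquals(null, "x") returns false, while UnsafeEquals(null, "x") fails before any comparison takes place.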

The rest, I think, is self-explanatory. Future posts about these C# utility classes should be more interesting as I plan to discuss much less trivial code.

By the way, I mentioned earlier that this post was going to be part of a series. I’ve changed my mind; these utility classes are largely independent and will stand on their own.

Beware ThreadPools And HttpWebRequest

At work I am responsible for a program which involves a very large number of authenticated HTTP requests to retrieve data. In an effort to make it as efficient as possible, I used asynchronous HTTP requests. Furthermore, I needed to retrieve and cache certain data from the webserver across threads.

However, I ran across a terrible problem. To illustrate, I’ve simplified the code down to the following (the TimedLock object is from my post Useful IDisposable Class 1: TimedLock (Post 3 of 5)):

class Class1
{
    private ValueType sharedData;
    private object sharedDataLock = new object();

    public void Run()
    {
        for (int i = 0; i < 100; i++)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(Process));
        }
    }

    private void Process(object state)
    {
        using (TimedLock.TryLock(sharedDataLock, TimeSpan.FromSeconds(60)))
        {
            WebRequest request = HttpWebRequest.Create(...);
            using (WebResponse response = request.GetResponse())
            {
                // update sharedData
            }
        }

        // Do other work
    }
}

The problem manifests itself as a lock timeout (or a deadlock if you use the lock keyword): One thread acquires the lock in TimedLock.TryLock() but request.GetResponse() blocks forever, so it never releases the lock. The other threads remain stuck at the TimedLock.TryLock() line, waiting for the first thread to relinquish the lock.

Why does this happen? Well, it took a long time for me to figure out, but I finally determined it is due to the problem described in this .NET Matters article:

The first thing to be aware of is that in version 1.x of the Microsoft®.NET Framework, HttpWebRequest never makes synchronous requests. What do I mean by that? Take a look at the code for HttpWebRequest.GetResponse as coded in the Shared Source CLI (SSCLI), shown here omitting the code that checks to see if the response was previously retrieved and that accounts for timeouts:

public override WebResponse GetResponse() {
    •••
    IAsyncResult asyncResult = BeginGetResponse(null, null);
    •••
    return EndGetResponse(asyncResult);
}

As you can see, HttpWebRequest.GetResponse is simply a wrapper around the pairing of BeginGetResponse and EndGetResponse. These operate asynchronously, meaning that BeginGetResponse makes the actual HTTP request from a different thread than the one from which it was called, and EndGetResponse blocks until the request has completed. The net result of this is that HttpWebRequest queues a work item to the ThreadPool for every outbound request.

In the scenario I have provided, the ThreadPool spins up a number of threads to process the QueueUserWorkItem() requests, which then all block on TimedLock.TryLock() — except for one, which gets to request.GetResponse(). Then, when GetResponse() attempts to grab a ThreadPool thread of its own (per the article), it deadlocks waiting for a ThreadPool thread to become free. Incidentally, the HttpWebRequest class is supposed to throw an Exception if the number of threads in the ThreadPool is too low, but that didn’t seem to be happening for me.

What’s the solution? I first tried to keep a minimum number of ThreadPool threads free by checking ThreadPool.GetAvailableThreads() before each call to QueueUserWorkItem(). That did not help: ThreadPool threads aren’t started immediately but are spaced apart with a small delay, so at the time of each check GetAvailableThreads() still reported plenty of available threads.

Outside of upgrading to the .NET Framework 2.0 — which isn’t even released yet — the article suggests writing a “throttling” ThreadPool which handles thread management itself and limits the number of active threads to a programmer-specified maximum number. Here’s the article’s sample implementation:

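using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Threading;

public sealed class Semaphore : WaitHandle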
{
    public Semaphore() : this(1, 1) {}

    public Semaphore(int initialCount, int maximumCount)
    {
        if (initialCount < 0 || initialCount > maximumCount)
            throw new ArgumentOutOfRangeException("initialCount");
        if (maximumCount < 1)
            throw new ArgumentOutOfRangeException("maximumCount");
        IntPtr h = CreateSemaphore(
            IntPtr.Zero, initialCount, maximumCount, null);
        if (h == WaitHandle.InvalidHandle || h == IntPtr.Zero)
            throw new Win32Exception();
        Handle = h;
    }

    public void ReleaseOne()
    {
        int previousCount;
        if (!ReleaseSemaphore(Handle, 1, out previousCount))
            throw new Win32Exception();
    }

    [DllImport("kernel32.dll", SetLastError=true)]
    private static extern IntPtr CreateSemaphore(
        IntPtr lpSemaphoreAttributes, int lInitialCount,
        int lMaximumCount, string lpName);

    [DllImport("kernel32.dll", SetLastError=true)]
    private static extern bool ReleaseSemaphore(
        IntPtr hSemaphore, int lReleaseCount, out int lpPreviousCount);
}

public class ThreadPoolThrottle : IDisposable
{
    private Semaphore _throttle;

    public ThreadPoolThrottle(int maximumAllowed)
    {
        if (maximumAllowed < 1)
            throw new ArgumentOutOfRangeException("maximumAllowed");
        _throttle = new Semaphore(maximumAllowed,maximumAllowed);
    }

    public void QueueUserWorkItem(WaitCallback callback)
    {
        QueueUserWorkItem(callback, null);
    }

    public void QueueUserWorkItem(WaitCallback callback, object state)
    {
        if (_throttle == null)
            throw new ObjectDisposedException(this.GetType().FullName);
        if (callback == null)
            throw new ArgumentNullException("callback");

        _throttle.WaitOne();
        try
        {
            QueuedCallback qc = new QueuedCallback();
            qc.Callback = callback;
            qc.State = state;
            ThreadPool.QueueUserWorkItem(
                new WaitCallback(HandleWorkItem), qc);
        }
        catch
        {
            _throttle.ReleaseOne();
            throw;
        }
    }

    private void HandleWorkItem(object state)
    {
        QueuedCallback qc = (QueuedCallback)state;
        try { qc.Callback(qc.State); }
        finally { _throttle.ReleaseOne(); }
    }

    private class QueuedCallback
    {
        public WaitCallback Callback;
        public object State;
    }

    public void Dispose()
    {
        if (_throttle != null)
        {
            ((IDisposable)_throttle).Dispose();
            _throttle = null;
        }
    }
}

I was able to easily adapt this solution for my needs. Unfortunately, I do not have a solution if one desires to continue using asynchronous method calls — such as delegates’ BeginInvoke() method or Stream.BeginRead() — as they internally use ThreadPool threads.
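
For readers on .NET Framework 2.0 or later, where System.Threading.Semaphore is built in, the same throttling idea can be sketched without P/Invoke. This is a condensed illustration under that assumption, not the article’s code; SimpleThrottle is a hypothetical name.

```csharp
using System;
using System.Threading;

public sealed class SimpleThrottle : IDisposable  // hypothetical name
{
    private readonly Semaphore _slots;

    public SimpleThrottle(int maximumAllowed)
    {
        if (maximumAllowed < 1)
            throw new ArgumentOutOfRangeException("maximumAllowed");
        _slots = new Semaphore(maximumAllowed, maximumAllowed);
    }

    // Blocks the caller until a slot is free, then hands the work to the
    // ThreadPool. The slot is returned when the callback finishes.
    public void QueueUserWorkItem(WaitCallback callback, object state)
    {
        if (callback == null)
            throw new ArgumentNullException("callback");

        _slots.WaitOne();
        try
        {
            ThreadPool.QueueUserWorkItem(delegate(object s)
            {
                try { callback(s); }
                finally { _slots.Release(); }
            }, state);
        }
        catch
        {
            // Queuing failed, so the callback will never release the slot.
            _slots.Release();
            throw;
        }
    }

    // Call only after all queued work has finished; a callback that
    // completes after disposal would fail when releasing its slot.
    public void Dispose()
    {
        _slots.Close();
    }
}
```

Choosing a maximum below the ThreadPool’s size leaves threads free for whatever the callbacks themselves need, which is the point of the throttle.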

Matching Enumerated Types Using Regular Expressions

Regular expressions are a very useful tool. Among the uses I’ve found for them are validating user input, performing simple HTML manipulation (although in general this is a bad idea — one should prefer a real HTML parser), and parsing textual data in custom formats from numerous sources.

Naturally, regular expressions have downsides as well. They are virtually a write-only language (although Perl’s x flag combined with copious comments largely alleviates this), some regular expressions have ghastly performance characteristics, learning their syntax takes quite a bit of time, far too many developers seem to be unaware of their existence, different regular expression implementations have different features, and one needs to become intimately familiar with the escaping rules for both regular expressions and the programming language (e.g., to create a regular expression which matches a single backslash character in C/C++, one needs to write “\\\\”).
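
The same double layer of escaping applies in C#, where verbatim string literals remove one of the two layers. A small sketch follows; BackslashPattern and its members are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class BackslashPattern  // hypothetical name
{
    // Regular string literal: four backslashes in the source become two in
    // the string, which the regex engine reads as one literal backslash.
    public const string Escaped = "\\\\";

    // Verbatim string literal: no C#-level escaping, so only the
    // regex-level escaping remains.
    public const string Verbatim = @"\\";

    // True when the input consists of exactly one backslash character.
    public static bool IsSingleBackslash(string input)
    {
        return Regex.IsMatch(input, @"\A" + Verbatim + @"\z");
    }
}
```

Both constants hold the same two-character pattern; the verbatim form is simply easier to read.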

One common task I often need to perform is to create a regular expression which matches any one of a number of values, e.g., matching an enumerated type [1]. Consider creating a regular expression which matches any two-letter U.S. state code. Most people will write something like (greatly simplified):

regex = "(AK|AL|AR|AZ|...|WA|WI|WV|WY)"

This will work fine, but as ( ) defines a capturing group I prefer to use the non-capturing (?: ) unless otherwise required:

regex = "(?:AK|AL|AR|AZ|...|WA|WI|WV|WY)"
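
The difference is visible in the number of groups the .NET regex engine reports. A minimal sketch, with GroupCountDemo as a hypothetical name and a two-state pattern standing in for the full list:

```csharp
using System.Text.RegularExpressions;

public static class GroupCountDemo  // hypothetical name
{
    // Number of groups reported for a match, including group 0,
    // which always represents the whole match.
    public static int GroupCount(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups.Count;
    }
}
```

GroupCountDemo.GroupCount("(AK|WA)", "WA") reports 2 groups, while GroupCountDemo.GroupCount("(?:AK|WA)", "WA") reports only group 0.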

Furthermore, since I don’t know the rules for operator precedence in regular expressions very well, I prefer to encase each allowed value in its own non-capturing group. This will also allow me to use any regular expression as an allowed value, even those which include | characters:

regex = "(?:(?:AK)|(?:AL)|(?:AR)|(?:AZ)|...|(?:WA)|(?:WI)|(?:WV)|(?:WY))"

One can easily write a function to perform this enumerated type regular expression generation. Here’s one implementation in C#:

using System.Diagnostics;
using System.Text;

class RegexUtils
{
    public static string CreateEnumeration(string[] regexs)
    {
        Debug.Assert(regexs != null);
        Debug.Assert(regexs.Length >= 2);

        StringBuilder sb = new StringBuilder();
        sb.Append("(?:");

        foreach (string regex in regexs)
        {
            sb.Append("(?:");
            sb.Append(regex);
            sb.Append(")|");
        }

        sb.Remove(sb.Length - 1, 1);
        sb.Append(")");
        return sb.ToString();
    }
}

The function is used as follows:

string[] stateCodeRegexs = new string[] { "AK", "AL", "AR", "AZ", ..., "WA", "WI", "WV", "WY" };
string anyStateCodeRegex = RegexUtils.CreateEnumeration(stateCodeRegexs);

Please note that the contents of stateCodeRegexs (AK, AL, etc.) are themselves regular expressions and not simple character strings. This means that one can use the full set of regular expression features, but one must also beware of escaping issues.
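
When the allowed values are plain strings rather than patterns, one way to sidestep the escaping issue is to run each value through Regex.Escape() first. The sketch below is a variant under that assumption; LiteralEnumeration is a hypothetical name and is separate from RegexUtils.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class LiteralEnumeration  // hypothetical name
{
    // Builds a pattern which matches any one of the given literal strings.
    public static string Create(params string[] literals)
    {
        if (literals == null || literals.Length == 0)
            throw new ArgumentException("At least one literal is required.", "literals");

        StringBuilder sb = new StringBuilder("(?:");
        for (int i = 0; i < literals.Length; i++)
        {
            if (i > 0)
                sb.Append('|');
            sb.Append("(?:");
            // Escape metacharacters such as '.' and '|' so that they
            // match themselves instead of acting as operators.
            sb.Append(Regex.Escape(literals[i]));
            sb.Append(')');
        }
        sb.Append(')');
        return sb.ToString();
    }
}
```

For example, LiteralEnumeration.Create("a.b", "c|d") yields a pattern in which the dot and the pipe are escaped, so it matches the text "a.b" but not "axb".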

In general, one must be very careful when combining regular expressions together. Typically, copious use of non-capturing groups is required in order to ensure correct behavior; blind string concatenation is just asking for bugs.
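
As a small illustration of the kind of bug meant here, consider anchoring an alternation. Alternation binds more loosely than concatenation, so the anchors attach to the first and last alternatives only, unless the inner expression is grouped. RegexCombineDemo and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class RegexCombineDemo  // hypothetical name
{
    // Blind concatenation: "AK|AL" becomes "^AK|AL$", which means
    // "starts with AK" OR "ends with AL".
    public static string AnchorNaively(string inner)
    {
        return "^" + inner + "$";
    }

    // Grouped: "AK|AL" becomes "^(?:AK|AL)$", so both anchors apply
    // to every alternative.
    public static string AnchorGrouped(string inner)
    {
        return "^(?:" + inner + ")$";
    }
}
```

The naive pattern accepts the input "AKXYZ" because it starts with AK; the grouped pattern rejects it.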

[1] For single characters one can use the [ ] construct, but that doesn’t work for more complicated enumerated types.