Implementing a Spell Checker


The following algorithms might be useful if you want to implement a spell checker (or Google-style “did you mean” feature):

Reader/Writer Lock Pattern


A reader-writer lock allows multiple concurrent readers but only one writer. It can be significantly more efficient than a standard mutex if reads of your shared memory far outnumber writes.

Reader-writer locks naturally fit together with caches, as caches are only effective if reads far outnumber writes.

Here is a general pattern for using a reader-writer lock with a cache:

  1. Acquire a reader lock.
  2. Check the cache for the value. If it exists, save the value and go to step 8.
  3. Upgrade the reader lock to a writer lock.
  4. Check the cache for the value. If it exists, save the value and go to step 7.
  5. Calculate the value (expensive, otherwise we wouldn’t cache it)
  6. Insert the value into the cache.
  7. Release the writer lock.
  8. Release the reader lock.
  9. Return the value.

We have to check the cache for the value again in step 4 because of the following possibility (shown here as if step 4 didn't exist):

Thread 1                         Thread 2
============================     ===============================
- Acquire reader lock            - Acquire reader lock
- Check cache for value          - Check cache for value
  with key A (not found)           with key A (not found)
- Upgrade to writer lock         - (block)
- Calculate value (expensive)
- Insert value into cache
- Release writer lock
- Release reader lock
                                 - Upgrade to writer lock
                                 - Calculate value (expensive)
                                   (We are paying this cost
                                   twice)
                                 - Insert value into cache
                                   (We are inserting two values
                                   with the same key, which may
                                   be fatal)

With step 4 it becomes:

Thread 1                         Thread 2
============================     ===============================
- Acquire reader lock            - Acquire reader lock
- Check cache for value          - Check cache for value
  with key A (not found)           with key A (not found)
- Upgrade to writer lock         - (block)
- Calculate value (expensive)
- Insert value into cache
- Release writer lock
- Release reader lock
                                 - Upgrade to writer lock
                                 - Check cache for value
                                   with key A (found)
                                 - Release writer lock
                                 - Release reader lock
- Return value                   - Return value
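
The pattern above can be sketched in C++17 with std::shared_mutex. Two things here are assumptions rather than part of the recipe: the ReadMostlyCache class and its member names are hypothetical, and because std::shared_mutex offers no atomic upgrade, the sketch releases the reader lock and then acquires the writer lock instead of upgrading (so steps 7 and 8 collapse into a single release). That gap between the two locks is one more reason the second lookup is needed.

```cpp
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of expensive-to-compute ints keyed by string.
class ReadMostlyCache {
public:
    using Compute = std::function<int(const std::string&)>;

    explicit ReadMostlyCache(Compute compute) : compute_(std::move(compute)) {}

    int Get(const std::string& key)
    {
        {
            // Steps 1-2: take the reader lock and look for the value.
            std::shared_lock<std::shared_mutex> reader(mutex_);
            auto it = values_.find(key);
            if (it != values_.end())
                return it->second;
        }   // The reader lock is released here (no atomic upgrade available).

        // Step 3 (approximated): take the writer lock.
        std::unique_lock<std::shared_mutex> writer(mutex_);

        // Step 4: check again; another thread may have inserted the value
        // while we were waiting for the writer lock.
        auto it = values_.find(key);
        if (it != values_.end())
            return it->second;

        // Steps 5-6: calculate the value and insert it into the cache.
        int value = compute_(key);
        values_.emplace(key, value);
        return value;   // Step 7: the writer lock is released on return.
    }

private:
    std::shared_mutex mutex_;
    std::unordered_map<std::string, int> values_;
    Compute compute_;
};
```

With this shape, the expensive calculation runs at most once per key, no matter how many threads ask for the same key at the same time.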

Handling Multiple QueryString Parameters With the Same Key in ASP.NET


When you are processing an HTTP request in ASP.NET you can retrieve the user-provided query string parameters using the HttpRequest.QueryString property.  This property is an instance of the NameValueCollection class.

If the user has provided multiple parameters with the same key in the query string, HttpRequest.QueryString[key] will return all the values concatenated together with commas.  If you would rather process the values individually, use HttpRequest.QueryString.GetValues(key), which will return an array of all the provided values.

For example:

URL: http://example.com?a=1&a=2
HttpRequest.QueryString["a"] = "1,2"
HttpRequest.QueryString.GetValues("a") = { "1", "2" }
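
To see why the per-value form is worth having, here is a small C++ model of the two behaviours. This is not ASP.NET code: ParseQuery and JoinWithCommas are hypothetical helpers, no URL decoding is attempted, and the sketch only assumes that the combined string is a plain comma join. Under that assumption, a value that itself contains a comma cannot be told apart once the values are joined.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: split "a=1&a=2" into key -> list of values,
// keeping repeated keys in the order they appear. No URL decoding.
std::map<std::string, std::vector<std::string>>
ParseQuery(const std::string& query)
{
    std::map<std::string, std::vector<std::string>> result;
    std::string::size_type start = 0;
    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();

        std::string pair = query.substr(start, end - start);
        if (!pair.empty()) {
            std::string::size_type eq = pair.find('=');
            std::string key = pair.substr(0, eq);
            std::string value =
                (eq == std::string::npos) ? std::string() : pair.substr(eq + 1);
            result[key].push_back(value);
        }
        start = end + 1;
    }
    return result;
}

// Hypothetical helper: the "all values in one string" view.
std::string JoinWithCommas(const std::vector<std::string>& values)
{
    std::string joined;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0)
            joined += ',';
        joined += values[i];
    }
    return joined;
}
```

In this model, the query a=1,2&a=3 yields two values ("1,2" and "3"), but the joined form "1,2,3" looks like three.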

Pseudoloc


I have released a pseudolocalization tool I wrote for 32-bit Windows resource DLLs here.

Balloon Tooltips


In the Windows XP login screen, the password text box will warn you with a balloon tooltip if you accidentally turn Caps Lock on:

[Image: Windows XP Caps Lock warning message]

The balloon tooltip appears to be a Windows tooltip common control with the TTS_BALLOON style.

To replicate this functionality, I decided to write a function called ShowMsgBalloon() which, given a control and the various balloon tooltip parameters, creates and shows the balloon tooltip below the control.

The key insight to making ShowMsgBalloon() work as intended was to use the TTF_TRACK option to create a tracking tooltip. This will immediately show the tooltip without requiring the user to position the mouse over the control. The main downside to using TTF_TRACK is that the tooltip will not move with the control if the window is moved; you need to manually move the tooltip using TTM_TRACKPOSITION as required. One could probably make this automatic by subclassing the tooltip’s parent control and handling WM_WINDOWPOSCHANGED messages.

Here is the source code to ShowMsgBalloon(). When you are done with the balloon, call DestroyWindow() on the returned HWND. Note: you may want your application to use comctl32.dll version 6 as it will lead to a nicer visual style, including a close button.

    #include <windows.h>
    #include <commctrl.h>

    // Options to ShowMsgBalloon() (see dwOpts parameter).  These are the
    // standard icon types for balloon tooltips.
    #define SMB_ICON_INFO    (1 << 0)
    #define SMB_ICON_WARNING (1 << 1)
    #define SMB_ICON_ERROR   (1 << 2)

    // Given the options passed to ShowMsgBalloon(), determine what
    // parameter to send to TTM_SETTITLE for the balloon tooltip's icon.
    static DWORD
    GetTitleIcon(DWORD dwOpts)
    {
        if (dwOpts & SMB_ICON_INFO)
            return TTI_INFO;
        else if (dwOpts & SMB_ICON_WARNING)
            return TTI_WARNING;
        else if (dwOpts & SMB_ICON_ERROR)
            return TTI_ERROR;
        else
            return 0;
    }

    // Create and show a balloon tooltip immediately below the control
    // hwndCtrl with the given title, message, and options.
    HWND
    ShowMsgBalloon(HWND hwndCtrl, LPCTSTR szTitle, LPCTSTR szMsg,
                   DWORD dwOpts)
    {
        HWND hwndRet = NULL;
        HWND hwndTT = NULL;
        TOOLINFO ti = { 0 };
        RECT rc;

        // Even though TTS_CLOSE is always specified, a close button will
        // only be shown if your application has a manifest that requires
        // comctl32.dll version 6.
        hwndTT = CreateWindow
            (
            TOOLTIPS_CLASS,
            TEXT(""),
            WS_POPUP | TTS_NOPREFIX | TTS_BALLOON | TTS_CLOSE,
            CW_USEDEFAULT, CW_USEDEFAULT,
            CW_USEDEFAULT, CW_USEDEFAULT,
            hwndCtrl,
            NULL,
            NULL,
            NULL
            );
        if (hwndTT == NULL)
            goto Cleanup;

        // By using TTTOOLINFO_V1_SIZE rather than sizeof(TOOLINFO),
        // we don't require users to be using comctl32 version 6.
        ti.cbSize = TTTOOLINFO_V1_SIZE;
        ti.uFlags = TTF_TRACK;
        ti.hwnd = hwndCtrl;
        ti.lpszText = const_cast<LPTSTR>(szMsg);
        if (!SendMessage(hwndTT, TTM_ADDTOOL, 0, (LPARAM) &ti))
            goto Cleanup;
        if (!SendMessage(hwndTT, TTM_SETTITLE, GetTitleIcon(dwOpts),
                         (LPARAM) szTitle))
            goto Cleanup;

        // Position the tooltip below the control
        if (!GetWindowRect(hwndCtrl, &rc))
            goto Cleanup;
        SendMessage(hwndTT, TTM_TRACKPOSITION, 0,
                    MAKELONG(rc.left + 10, rc.bottom));

        // Show the tooltip
        if (!SendMessage(hwndTT, TTM_TRACKACTIVATE, TRUE, (LPARAM) &ti))
            goto Cleanup;

        // Success: hand ownership of the tooltip window to the caller
        hwndRet = hwndTT;
        hwndTT = NULL;

    Cleanup:
        if (hwndTT != NULL)
            ::DestroyWindow(hwndTT);

        return hwndRet;
    }

Update 2008-11-01 3:08PM: If you are targeting comctl32.dll version 6 or later, I recommend using the EM_SHOWBALLOONTIP message.  Comctl32.dll version 6 or later also automatically shows the caps lock warning balloon for edit boxes with the ES_PASSWORD window style.

Escaping Strings in XPath 1.0


XPath is a language for selecting nodes from an XML document. XPath is used extensively in XSLT and other XML technologies. I also vastly prefer using XPath (e.g. with XPathNavigator) over the XML DOM when manipulating XML in a non-streaming fashion.

In XPath, strings must be delimited by either single or double quotes. Given a quote character used to delimit a string, one can’t represent that same quote character within the string. This means that if you use single quotes to delimit your XPath string, you can’t represent the string O'Reilly; if you use double quotes, you can’t represent "Hello".

However, given a quote delimiter, you can represent the other quote character. We can use this observation along with the concat XPath function to devise a general quoting rule for XPath strings. It’s easiest to show this via a series of examples:

Original String              Quoted XPath String
=========================    ============================================
a                            'a' (or "a")
O'Reilly                     "O'Reilly"
"Hello"                      '"Hello"'
"Hello, Mr. O'Reilly"        concat('"Hello, Mr. O', "'Reilly", '"')

Below is a piece of C++ code which implements these quotation rules:

    #include <sstream>
    #include <string>

    std::string
    QuoteXPathString(const std::string& xpath)
    {
        // If we don't have any single or double-quote characters, quote the
        // expression in single quotes.
        std::string::size_type pos = xpath.find_first_of("'\"");
        if (pos == std::string::npos)
            return "'" + xpath + "'";

        // If we cannot find the alternate quotation character, quote the
        // expression in the alternate quotation character.
        char chOther = (xpath[pos] == '"' ? '\'' : '"');
        pos = xpath.find(chOther, pos + 1);
        if (pos == std::string::npos)
            return chOther + xpath + chOther;

        // The string has both quotation characters.  We need to use concat()
        // to form the string.  Each piece contains only one kind of quote
        // and is delimited by the other kind.
        std::stringstream ss;
        ss << "concat("
           << chOther
           << xpath.substr(0, pos)
           << chOther;
        do {
            chOther = (xpath[pos] == '"' ? '\'' : '"');
            std::string::size_type pos2 = xpath.find(chOther, pos + 1);
            // When pos2 is npos, substr() takes the rest of the string.
            ss << ','
               << chOther
               << xpath.substr(pos, pos2 - pos)
               << chOther;
            pos = pos2;
        } while (pos != std::string::npos);
        ss << ")";

        return ss.str();
    }

Usage looks like:

    std::string lastName = …; // May come from user input
    std::string xpath = "//Customer[LastName = " +
        QuoteXPathString(lastName) + "]";

Representing Date/Times as Strings for Non-Human Consumption


If you ever have the need to represent a date/time (or part of a date/time) as a string for programmatic rather than human consumption (e.g. you are defining a save file format or a network protocol), please use ISO 8601 unless you have a very strong reason not to.

For more information, please read what the W3C has to say about ISO 8601 style date and time formats.
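
As a small illustration, here is one way to produce an ISO 8601 UTC timestamp in C++. The function name FormatIso8601Utc is hypothetical, and the sketch only covers the common "complete date plus time in UTC" form with a trailing Z; other ISO 8601 forms (time zone offsets, fractional seconds, dates only) are not handled.

```cpp
#include <ctime>
#include <string>

// Format a time_t as an ISO 8601 UTC timestamp, e.g. 1970-01-01T00:00:00Z.
// Returns an empty string if the time cannot be converted or formatted.
std::string FormatIso8601Utc(std::time_t t)
{
    // std::gmtime returns a pointer to shared static storage, so copy the
    // result immediately. It is not safe to call from multiple threads.
    const std::tm* shared = std::gmtime(&t);
    if (shared == nullptr)
        return std::string();
    std::tm utc = *shared;

    char buf[32];
    if (std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", &utc) == 0)
        return std::string();
    return std::string(buf);
}
```

For years 0000 through 9999, strings in this form have a fixed width and sort lexicographically in chronological order, which is part of what makes the format pleasant for programs to consume.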

Don’t Forget to Reap your Zombies


I recently received a bug report for my quick-and-dirty TCP debugging tool tcpconndbg where it was creating a large number of zombie processes. The person who filed the bug, Peter Viskup, was even kind enough to send a patch. While this is old news to anyone with extensive Unix programming experience, always remember the following:

If you create a child process using fork(), you must either:

  1. Explicitly retrieve the child process’s exit code using one of the wait() functions (e.g. waitpid())
  2. Tell the system that you aren’t interested in the child process’s exit code by using either:
    1. sigaction() with the SA_NOCLDWAIT flag (preferred)
    2. signal(SIGCHLD, SIG_IGN); (for systems which do not support sigaction())
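
Both options can be sketched in a few lines of POSIX C++. This is not the tcpconndbg patch; RunChildAndReap and DiscardChildExitCodes are hypothetical names, and the child here does nothing except exit with the given code.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Option 1: fork a child and explicitly reap it with waitpid().
// Returns the child's exit code, or -1 on failure.
int RunChildAndReap(int exitCode)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exitCode);    // Child: exit immediately.

    // Parent: wait for this specific child, retrying if a signal
    // interrupts the call.
    int status = 0;
    pid_t waited;
    do {
        waited = waitpid(pid, &status, 0);
    } while (waited < 0 && errno == EINTR);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

// Option 2: tell the system we will never ask for child exit codes, so
// terminated children are not kept around as zombies. After this call,
// waitpid() can no longer retrieve a child's exit status.
bool DiscardChildExitCodes()
{
    struct sigaction sa = {};
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDWAIT;
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}
```

The two options are alternatives: pick the first when the exit code matters, and the second when the parent only needs the children to go away.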

As I fixed this bug, I realized I hadn’t looked at tcpconndbg in 5 years. My, how programming style changes…

Geometric Annual Return In SQL


Here is some quick-and-dirty SQL to calculate a geometric annual return (as a percent) from a column of monthly returns (in percent).

    /* Convert the annualized number back to a percent */
    SELECT (T3.AnnHPR - 1) * 100 AS GeomAnnRet
    FROM
      (
      /* Annualize the holding period return */
      SELECT POWER(T2.HPR, 12.0 / T2.NumReturns) AS AnnHPR
      FROM
        (
        /* Calculate the holding period return over the time
           period.

           POWER(10, SUM(LOG10(n))) is a simulated PRODUCT(n)
           aggregate function.

           The precision of POWER is determined by the precision
           of the first argument, so use a lot of decimals. */
        SELECT POWER(10.0000000000000000,
                     SUM(LOG10(T.MonthReturn))) AS HPR,
               COUNT(*) AS NumReturns
        FROM
          (
          /* Convert all percent returns to multipliers (1% ->
             1.01) */
          SELECT 1 + MonthPctReturn / 100 AS MonthReturn
          FROM
          ) AS T
        ) AS T2
      ) AS T3

Update 2008-01-30 10:52PM: Here’s the equivalent “one-liner”:

    SELECT 100 * (POWER(POWER(10.000000000000000,
                              SUM(LOG10(1 + MonthPctReturn / 100))),
                        12.0 / COUNT(*)) - 1)
    FROM
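
To check the SQL against known numbers, the same arithmetic can be written out in C++. The function name GeomAnnualReturnPct is hypothetical; the sketch mirrors the query step by step (percent to multiplier, product via a sum of logarithms, annualize with the exponent 12 / N, back to percent) and assumes every monthly return is greater than -100%, since the logarithm is undefined otherwise.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric annual return (in percent) from monthly returns (in percent).
double GeomAnnualReturnPct(const std::vector<double>& monthlyPct)
{
    assert(!monthlyPct.empty());

    // Sum of log10(multiplier) stands in for a PRODUCT aggregate,
    // just like SUM(LOG10(...)) in the SQL.
    double logSum = 0.0;
    for (double pct : monthlyPct)
        logSum += std::log10(1.0 + pct / 100.0);

    // Holding period return over the whole series.
    double hpr = std::pow(10.0, logSum);

    // Annualize: raise to 12 / (number of monthly returns).
    double annHpr = std::pow(hpr, 12.0 / static_cast<double>(monthlyPct.size()));

    // Convert the annualized multiplier back to a percent.
    return (annHpr - 1.0) * 100.0;
}
```

For example, twelve monthly returns of 1% compound to 1.01^12, or about 12.68% per year, and 24 months of 1% annualize to the same figure.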

Silverlight 1.0 Mandelbrot Set Generator


While reading the Mandelbrot set chapter in Dewdney’s The New Turing Omnibus, I realized that this would be a great test application for Microsoft’s new interactive Web application framework Silverlight. Below is the component, its source code, and a few things I learned along the way.

Beware: the Mandelbrot set is computationally expensive and may appear to lock up your web browser. If a “Stop running this script?” dialog pops up, please click no to allow the calculations to finish.
