String Handling

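WDM drivers can work with string data in any of four formats: null-terminated ANSI (8-bit) strings, null-terminated Unicode (wide-character) strings, counted strings described by an ANSI_STRING structure, and counted strings described by a UNICODE_STRING structure.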

The UNICODE_STRING and ANSI_STRING data structures both have the layout depicted in Figure 3-14. The Buffer field of either structure points to a data area elsewhere in memory that contains the string data. MaximumLength gives the length of the buffer area, and Length provides the (current) length of the string without regard to any null terminator that might be present. Both length fields are in bytes, even for the UNICODE_STRING structure.

Figure 3-14. The UNICODE_STRING and ANSI_STRING structures.
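To make the "lengths are in bytes" point concrete, here is a minimal user-mode sketch. DEMO_UNICODE_STRING and both helper functions are hypothetical stand-ins invented for illustration; a real driver uses the UNICODE_STRING definition from the DDK headers and calls RtlInitUnicodeString instead. The sketch assumes 16-bit characters and mirrors the field order described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-mode stand-in for UNICODE_STRING; the real
   definition comes from the DDK headers. */
typedef struct {
    uint16_t  Length;         /* current length in BYTES, terminator excluded */
    uint16_t  MaximumLength;  /* size of the buffer area in BYTES */
    uint16_t *Buffer;         /* 16-bit characters stored elsewhere */
} DEMO_UNICODE_STRING;

/* Describe an existing null-terminated 16-bit string without copying it,
   roughly in the spirit of RtlInitUnicodeString. */
static void demo_init_unicode_string(DEMO_UNICODE_STRING *s, uint16_t *src)
{
    size_t chars = 0;
    while (src[chars] != 0)
        ++chars;
    /* Both length fields are 16 bits wide, so the byte count must fit. */
    assert((chars + 1) * sizeof(uint16_t) <= UINT16_MAX);
    s->Length = (uint16_t)(chars * sizeof(uint16_t));
    s->MaximumLength = (uint16_t)((chars + 1) * sizeof(uint16_t));
    s->Buffer = src;
}

/* Number of characters, as opposed to bytes. */
static size_t demo_char_count(const DEMO_UNICODE_STRING *s)
{
    return s->Length / sizeof(uint16_t);
}
```

For a three-character string, this sketch records a Length of 6 and a MaximumLength of 8. Dividing Length by the character size recovers the character count.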

Table 3-7 lists the service functions that you can use for working with Unicode and ANSI strings. I've listed them side by side because there's a fair amount of duplication. I've also listed some functions from the standard C run-time library that are available in kernel mode for manipulating regular C-style strings. The standard DDK headers include declarations of these functions, and the libraries with which you link drivers contain them, so there's no particular reason not to use them even though they've never been documented in the DDK as being available.

Table 3-7. Functions for string manipulation.

Operation | ANSI String Function | Unicode String Function
Length | strlen | wcslen
Concatenate | strcat, strncat | wcscat, wcsncat, RtlAppendUnicodeStringToString, RtlAppendUnicodeToString
Copy | strcpy, strncpy, RtlCopyString | wcscpy, wcsncpy, RtlCopyUnicodeString
Reverse | _strrev | _wcsrev
Compare | strcmp, strncmp, _stricmp, _strnicmp, RtlCompareString, RtlEqualString | wcscmp, wcsncmp, _wcsicmp, _wcsnicmp, RtlCompareUnicodeString, RtlEqualUnicodeString, RtlPrefixUnicodeString
Initialize | _strset, _strnset, RtlInitAnsiString, RtlInitString | _wcsnset, RtlInitUnicodeString
Search | strchr, strrchr, strspn, strstr | wcschr, wcsrchr, wcsspn, wcsstr
Upper/lowercase | _strlwr, _strupr, RtlUpperString | _wcslwr, _wcsupr, RtlUpcaseUnicodeString
Character | isdigit, islower, isprint, isspace, isupper, isxdigit, tolower, toupper, RtlUpperChar | towlower, towupper, RtlUpcaseUnicodeChar
Format | sprintf, vsprintf, _snprintf, _vsnprintf | swprintf, _snwprintf
String conversion | atoi, atol, _itoa | _itow, RtlIntegerToUnicodeString, RtlUnicodeStringToInteger
Type conversion | RtlAnsiStringToUnicodeSize, RtlAnsiStringToUnicodeString | RtlUnicodeStringToAnsiString
Memory release | RtlFreeAnsiString | RtlFreeUnicodeString

Many more RtlXxx functions are exported by the system DLLs, but I've listed the ones for which the DDK header files (and the SDK headers they include) define prototypes. These are the only ones we should use in drivers.

Allocating and Releasing String Buffers

I'm not going to describe the string manipulation functions in detail because the DDK documentation does this perfectly well and you already know, based on your general programming experience, how to put functions like this together to get your work done. But I do want to discuss a problem that can rear up and bite you if you don't look out for it.

You often define UNICODE_STRING (or ANSI_STRING) structures as automatic variables or as parts of your own device extension. The string buffers to which these structures point usually occupy dynamically allocated memory, but you'll sometimes want to work with string constants, too. Keeping track of who owns the memory to which a particular UNICODE_STRING or ANSI_STRING structure points can be a bit of a problem. Consider the following fragment of a function:

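UNICODE_STRING foo;
ANSI_STRING bar;
if (bArriving)
  RtlInitUnicodeString(&foo, L"Hello, world!");  // foo.Buffer points at a constant
else
  {
  RtlInitAnsiString(&bar, "Goodbye, cruel world!");
  RtlAnsiStringToUnicodeString(&foo, &bar, TRUE); // system allocates foo.Buffer
  }
...
RtlFreeUnicodeString(&foo); //  don't do this!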

In one case, we initialize foo.Length, foo.MaximumLength, and foo.Buffer to describe a wide character string constant in our driver. In another case, we ask the system (by means of the TRUE third argument to RtlAnsiStringToUnicodeString) to allocate memory for the Unicode translation of an ANSI string. In the first case, it's a mistake to call RtlFreeUnicodeString because it will unconditionally try to release a memory block that's part of our code or data. In the second case, it's mandatory to call RtlFreeUnicodeString eventually if we want to avoid a memory leak.
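One way to keep the two cases straight is to record, next to the string itself, whether the buffer was allocated on your behalf. The following user-mode sketch illustrates that idea only. OWNED_STRING and its three helpers are hypothetical names, malloc and free stand in for the kernel allocator, and the byte-by-byte widening is a naive substitute for the real ANSI-to-Unicode conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for UNICODE_STRING plus an ownership flag. */
typedef struct {
    uint16_t  Length;
    uint16_t  MaximumLength;
    uint16_t *Buffer;
    bool      OwnsBuffer;   /* true only if Buffer came from the allocator */
} OWNED_STRING;

/* Point at storage someone else owns (for example, a string constant). */
static void owned_string_borrow(OWNED_STRING *s, uint16_t *text, uint16_t bytes)
{
    assert(s != NULL && text != NULL);
    s->Length = bytes;
    s->MaximumLength = bytes;
    s->Buffer = text;
    s->OwnsBuffer = false;
}

/* Widen an 8-bit string into freshly allocated storage.
   Returns false if the string is too long or the allocation fails. */
static bool owned_string_from_ansi(OWNED_STRING *s, const char *ansi)
{
    assert(s != NULL && ansi != NULL);
    size_t chars = strlen(ansi);
    if ((chars + 1) * sizeof(uint16_t) > UINT16_MAX)
        return false;
    uint16_t *buf = malloc((chars + 1) * sizeof(uint16_t));
    if (buf == NULL)
        return false;
    for (size_t i = 0; i <= chars; ++i)      /* copies the terminator too */
        buf[i] = (unsigned char)ansi[i];     /* naive widening */
    s->Length = (uint16_t)(chars * sizeof(uint16_t));
    s->MaximumLength = (uint16_t)((chars + 1) * sizeof(uint16_t));
    s->Buffer = buf;
    s->OwnsBuffer = true;
    return true;
}

/* Safe to call in either case; frees only what we allocated. */
static void owned_string_release(OWNED_STRING *s)
{
    assert(s != NULL);
    if (s->OwnsBuffer)
        free(s->Buffer);
    s->Buffer = NULL;
    s->Length = s->MaximumLength = 0;
    s->OwnsBuffer = false;
}
```

With a flag like this, the cleanup path no longer has to know which branch ran. In a driver, the same bookkeeping would decide whether to call RtlFreeUnicodeString.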

Data Blobs

I've borrowed the term data blob from the world of database management to describe a random collection of bytes that you want to manipulate somehow. Table 3-8 lists the functions (including some from the standard run-time library) that you can call in kernel mode for that purpose. Once again, I'm going to assume that you can figure out how to use these functions (based on their largely mnemonic names). I need to point out a few nonobvious facts, however:

Table 3-8. Service functions for working with blobs of data.

Service Function or Macro | Description
memchr | Find a byte in a blob
memcpy, RtlCopyBytes, RtlCopyMemory | Copy bytes, assuming no overlap
memmove, RtlMoveMemory | Copy bytes when there might be an overlap
memset, RtlFillBytes, RtlFillMemory | Fill blob with given value
memcmp, RtlCompareMemory, RtlEqualMemory | Compare one blob to another
memset, RtlZeroBytes, RtlZeroMemory | Zero-fill a blob
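Two rows of Table 3-8 are easy to misuse, so here is a small user-mode sketch of each. Both helper names are hypothetical. The first shows an overlapping copy, where memmove (or RtlMoveMemory in a driver) is the appropriate choice. The second assumes, as I understand the DDK documentation, that RtlCompareMemory reports how many leading bytes match rather than returning a memcmp-style sign, and imitates that convention in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shift the tail of a buffer left by `gap` bytes. Source and destination
   overlap, so memmove is used rather than memcpy. */
static void demo_close_gap(unsigned char *buf, size_t len, size_t gap)
{
    assert(gap <= len);
    memmove(buf, buf + gap, len - gap);
}

/* Hypothetical analogue of a count-returning comparison: the number of
   leading bytes that are equal in both blobs. A result equal to `len`
   means the blobs are identical. */
static size_t demo_matching_prefix(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i = 0;
    while (i < len && pa[i] == pb[i])
        ++i;
    return i;
}
```

The practical consequence of the count-returning convention is that testing the result against zero, as you would with memcmp, gives the wrong answer. Compare the result with the length instead.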