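WDM drivers can work with string data in any of four formats: counted Unicode strings described by a UNICODE_STRING structure, counted ANSI strings described by an ANSI_STRING structure, and regular null-terminated C-style strings of either 8-bit or wide characters.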
The UNICODE_STRING and ANSI_STRING data structures both have the layout depicted in Figure 3-14. The Buffer field of either structure points to a data area elsewhere in memory that contains the string data. MaximumLength gives the length of the buffer area, and Length provides the (current) length of the string without regard to any null terminator that might be present. Both length fields are in bytes, even for the UNICODE_STRING structure.
Figure 3-14. The UNICODE_STRING and ANSI_STRING structures.
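The byte-based length fields are easy to get wrong, so here is a minimal user-mode sketch of the layout described above. The names COUNTED_WSTRING, WCHAR16, init_counted_wstring, and counted_wstring_chars are hypothetical stand-ins invented for this illustration; a real driver would use UNICODE_STRING, WCHAR, and RtlInitUnicodeString from the DDK headers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode stand-ins for the DDK types, for illustration only. */
typedef uint16_t WCHAR16;          /* hypothetical: one 16-bit character */

typedef struct {
    uint16_t Length;               /* bytes in use, excluding any terminator */
    uint16_t MaximumLength;        /* bytes available in Buffer */
    WCHAR16 *Buffer;               /* string data stored elsewhere */
} COUNTED_WSTRING;                 /* hypothetical mirror of UNICODE_STRING */

/* Describe an existing null-terminated 16-bit string without copying it,
   much as RtlInitUnicodeString does for a wide string constant. */
void init_counted_wstring(COUNTED_WSTRING *s, WCHAR16 *text)
{
    size_t chars = 0;

    assert(s != NULL && text != NULL);
    while (text[chars] != 0)
        ++chars;
    assert(chars < 0x7FFF);        /* byte counts must fit in 16 bits */

    s->Length = (uint16_t)(chars * sizeof(WCHAR16));
    s->MaximumLength = (uint16_t)(s->Length + sizeof(WCHAR16));
    s->Buffer = text;
}

/* The character count has to be derived from the byte count. */
size_t counted_wstring_chars(const COUNTED_WSTRING *s)
{
    return s->Length / sizeof(WCHAR16);
}
```

For a three-character string, this sketch sets Length to 6 and MaximumLength to 8, because both fields count bytes and only MaximumLength includes the terminator.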
Table 3-7 lists the service functions that you can use for working with Unicode and ANSI strings. I've listed them side by side because there's a fair amount of duplication. I've also listed some functions from the standard C run-time library that are available in kernel mode for manipulating regular C-style strings. The standard DDK headers include declarations of these functions, and the libraries with which you link drivers contain them, so there's no particular reason not to use them even though they've never been documented in the DDK as being available.
Table 3-7. Functions for string manipulation.
Operation | ANSI String Function | Unicode String Function |
---|---|---|
Length | strlen | wcslen |
Concatenate | strcat, strncat | wcscat, wcsncat, RtlAppendUnicodeStringToString, RtlAppendUnicodeToString |
Copy | strcpy, strncpy, RtlCopyString | wcscpy, wcsncpy, RtlCopyUnicodeString |
Reverse | _strrev | _wcsrev |
Compare | strcmp, strncmp, _stricmp, _strnicmp, RtlCompareString, RtlEqualString | wcscmp, wcsncmp, _wcsicmp, _wcsnicmp, RtlCompareUnicodeString, RtlEqualUnicodeString, RtlPrefixUnicodeString |
Initialize | _strset, _strnset, RtlInitAnsiString, RtlInitString | _wcsnset, RtlInitUnicodeString |
Search | strchr, strrchr, strspn, strstr | wcschr, wcsrchr, wcsspn, wcsstr |
Upper/lowercase | _strlwr, _strupr, RtlUpperString | _wcslwr, _wcsupr, RtlUpcaseUnicodeString |
Character | isdigit, islower, isprint, isspace, isupper, isxdigit, tolower, toupper, RtlUpperChar | towlower, towupper, RtlUpcaseUnicodeChar |
Format | sprintf, vsprintf, _snprintf, _vsnprintf | swprintf, _snwprintf |
String conversion | atoi, atol, _itoa | _itow, RtlIntegerToUnicodeString, RtlUnicodeStringToInteger |
Type conversion | RtlAnsiStringToUnicodeSize, RtlAnsiStringToUnicodeString | RtlUnicodeStringToAnsiString |
Memory release | RtlFreeAnsiString | RtlFreeUnicodeString |
Many more RtlXxx functions are exported by the system DLLs, but I've listed the ones for which the DDK header files (and the SDK headers they include) define prototypes. These are the only ones we should use in drivers.
I'm not going to describe the string manipulation functions in detail because the DDK documentation does this perfectly well and you already know, based on your general programming experience, how to put functions like this together to get your work done. But I do want to discuss a problem that can rear up and bite you if you don't look out for it.
You often define UNICODE_STRING (or ANSI_STRING) structures as automatic variables or as parts of your own device extension. The string buffers to which these structures point usually occupy dynamically allocated memory, but you'll sometimes want to work with string constants, too. Keeping track of who owns the memory to which a particular UNICODE_STRING or ANSI_STRING structure points can be a bit of a problem. Consider the following fragment of a function:
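    UNICODE_STRING foo;
    ANSI_STRING bar;
    if (bArriving)
        RtlInitUnicodeString(&foo, L"Hello, world!");
    else
    {
        // RtlAnsiStringToUnicodeString takes an ANSI_STRING, not a char*
        RtlInitAnsiString(&bar, "Goodbye, cruel world!");
        RtlAnsiStringToUnicodeString(&foo, &bar, TRUE);
    }
    ...
    RtlFreeUnicodeString(&foo); // don't do this!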
In one case, we initialize foo.Length, foo.MaximumLength, and foo.Buffer to describe a wide character string constant in our driver. In another case, we ask the system (by means of the TRUE third argument to RtlAnsiStringToUnicodeString) to allocate memory for the Unicode translation of an ANSI string. In the first case, it's a mistake to call RtlFreeUnicodeString because it will unconditionally try to release a memory block that's part of our code or data. In the second case, it's mandatory to call RtlFreeUnicodeString eventually if we want to avoid a memory leak.
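One way to keep the two cases straight is to record, next to the string itself, whether the buffer was allocated on its behalf. The following is only a user-mode sketch of that idea: TRACKED_STRING and the tracked_xxx functions are hypothetical names made up for this example, and malloc and free stand in for the allocation that RtlAnsiStringToUnicodeString performs and that RtlFreeUnicodeString releases.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string plus an ownership flag. */
typedef struct {
    size_t Length;       /* bytes in use */
    char  *Buffer;       /* string data */
    int    OwnsBuffer;   /* nonzero only if Buffer was allocated here */
} TRACKED_STRING;

/* Point at a string constant: there is nothing to free later. */
void tracked_init_constant(TRACKED_STRING *s, const char *text)
{
    assert(s != NULL && text != NULL);
    s->Length = strlen(text);
    s->Buffer = (char *)text;
    s->OwnsBuffer = 0;
}

/* Make a private heap copy: it must be freed later. Returns 0 on failure. */
int tracked_init_copy(TRACKED_STRING *s, const char *text)
{
    size_t n;

    assert(s != NULL && text != NULL);
    n = strlen(text);
    s->Buffer = (char *)malloc(n + 1);
    if (s->Buffer == NULL) {
        s->Length = 0;
        s->OwnsBuffer = 0;
        return 0;
    }
    memcpy(s->Buffer, text, n + 1);
    s->Length = n;
    s->OwnsBuffer = 1;
    return 1;
}

/* Release only what was allocated; safe to call in either case. */
void tracked_free(TRACKED_STRING *s)
{
    assert(s != NULL);
    if (s->OwnsBuffer)
        free(s->Buffer);
    s->Buffer = NULL;
    s->Length = 0;
    s->OwnsBuffer = 0;
}
```

With a flag like this, the cleanup path can call one release function unconditionally, and the decision about whether memory actually gets freed is made where the string was initialized.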
I've borrowed the term data blob from the world of database management to describe a random collection of bytes that you want to manipulate somehow. Table 3-8 lists the functions (including some from the standard run-time library) that you can call in kernel mode for that purpose. Once again, I'm going to assume that you can figure out how to use these functions (based on their largely mnemonic names). I need to point out a few nonobvious facts, however:
Table 3-8. Service functions for working with blobs of data.
Service Function or Macro | Description |
---|---|
memchr | Find a byte in a blob |
memcpy, RtlCopyBytes, RtlCopyMemory | Copy bytes, assuming no overlap |
memmove, RtlMoveMemory | Copy bytes when there might be an overlap |
memset, RtlFillBytes, RtlFillMemory | Fill blob with given value |
memcmp, RtlCompareMemory, RtlEqualMemory | Compare one blob to another |
memset, RtlZeroBytes, RtlZeroMemory | Zero-fill a blob |
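Two distinctions in Table 3-8 are easy to illustrate outside the kernel. The sketch below uses only the standard C library; matching_prefix_length and shift_bytes_up are hypothetical helpers written for this example. The first mimics the way RtlCompareMemory reports its result, which is the number of leading bytes that match rather than the negative, zero, or positive ordering that memcmp returns. The second shows a copy whose source and destination overlap, which is the case memmove and RtlMoveMemory exist for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many leading bytes of two blobs match,
   which is the style of result RtlCompareMemory gives. */
size_t matching_prefix_length(const void *a, const void *b, size_t length)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;
    size_t i = 0;

    assert(a != NULL && b != NULL);
    while (i < length && pa[i] == pb[i])
        ++i;
    return i;
}

/* Hypothetical helper: move the first `count` bytes of `buffer` up by
   `shift` bytes in place. The caller must make sure the buffer holds at
   least count + shift bytes. Source and destination overlap, so this
   uses memmove rather than memcpy. */
void shift_bytes_up(unsigned char *buffer, size_t count, size_t shift)
{
    assert(buffer != NULL);
    memmove(buffer + shift, buffer, count);
}
```

A result equal to the requested length from matching_prefix_length means the two blobs are identical over that range, which is why a test written as if the function returned 0 for equality gets the answer backward.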