As I said at the outset of this chapter, WDM drivers need to track their devices through the state transitions diagrammed in Figure 6-1. This state tracking also ties in with how you queue and cancel I/O requests. Cancellation in turn implicates the global cancel spin lock, which is a performance bottleneck in a multi-CPU system. The standard model of IRP processing can't solve all these interrelated problems. In this section, therefore, I'll present a new type of object, called a DEVQUEUE, that you can use in your PnP request handlers and in place of the standard model routines IoStartPacket and IoStartNextPacket. DEVQUEUE is my own invention, but it's based on sample drivers, especially PNPPOWER and CANCEL, that used to be in the DDK. See also the discussion of IRP cancellation in Ervin Peretz's "The Windows Driver Model Simplifies Management of Device Driver I/O Requests" (Microsoft Systems Journal, January 1999). A portion of the IRP cancellation logic I'm describing also derives from work by Peretz and other Microsoft employees and by Jamie Hanrahan that had not been published at the time I was writing this book.
I described the KDEVICE_QUEUE queue object in the previous chapter as encompassing an idle state, a busy but empty state, and a busy but not empty state. The support routines you use to manipulate a KDEVICE_QUEUE assume that if the device is not currently busy, all you want to do is start any new request running on the device. It's precisely this behavior that we need to overcome to successfully manage PnP states. Figure 6-4 illustrates the states of a DEVQUEUE.
Figure 6-4. States of a DEVQUEUE object.
In the READY state, the queue operates much like a KDEVICE_QUEUE, accepting and forwarding requests to your StartIo routine in such a way that the device stays busy. In the STALLED state, however, the queue does not forward IRPs to StartIo even when the device is idle. In the REJECTING state, the queue doesn't even accept new IRPs. Figure 6-5 illustrates the flow of IRPs through the queue.
Figure 6-5. Flow of IRPs through a DEVQUEUE.
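To make the three states concrete, here is a minimal user-mode sketch of the decision the queue makes when a new request arrives. It is a model only, not code from DEVQUEUE.H: the names queue_state, disposition, and classify_request are hypothetical and exist just for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-mode model; none of these names come from DEVQUEUE.H. */
typedef enum { Q_READY, Q_STALLED, Q_REJECTING } queue_state;
typedef enum { IRP_STARTED, IRP_QUEUED, IRP_REJECTED } disposition;

/* Decide what happens to a newly arriving request. */
static disposition classify_request(queue_state state, bool device_busy)
{
    if (state == Q_REJECTING)
        return IRP_REJECTED;        /* the queue refuses new IRPs */
    if (state == Q_STALLED || device_busy)
        return IRP_QUEUED;          /* hold the IRP until later */
    assert(state == Q_READY);
    return IRP_STARTED;             /* idle device and ready queue */
}
```

Note that a stalled queue holds a request even when the device is idle, which is the one case a KDEVICE_QUEUE can't express.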
You define a DEVQUEUE object for each queue of requests you'll manage in the driver. For example, if your device manages reads and writes in a single queue, you'd define one DEVQUEUE:
typedef struct _DEVICE_EXTENSION {
  ...
  DEVQUEUE dqReadWrite;
  ...
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;
Table 6-3 lists the support functions you can use with a DEVQUEUE.
Table 6-3. DEVQUEUE service routines.
Support Function | Description |
---|---|
AbortRequests | Aborts current and future requests |
AllowRequests | Undoes effect of previous AbortRequests |
AreRequestsBeingAborted | Are we currently aborting new requests? |
CancelRequest | Generic cancel routine |
CheckBusyAndStall | Checks for idle device and stalls requests in one atomic operation |
CleanupRequests | Cancels all requests for a given file object in order to service IRP_MJ_CLEANUP |
GetCurrentIrp | Determines which IRP is currently being processed by associated StartIo routine |
InitializeQueue | Initializes DEVQUEUE object |
RestartRequests | Restarts a stalled queue |
StallRequests | Stalls the queue |
StartNextPacket | Dequeues and starts the next request |
StartPacket | Starts or queues a new request |
WaitForCurrentIrp | Waits for current IRP to finish |
For the moment, I'll just discuss the support functions that replace functions like IoStartPacket and IoStartNextPacket in the standard IRP processing model. For each queue, you provide a separate StartIo routine. Your DriverEntry routine would not store anything in the DriverStartIo pointer field of the driver object. Instead, during AddDevice, you'd initialize your queue object(s) like so:
NTSTATUS AddDevice(...)
{
  ...
  PDEVICE_EXTENSION pdx = ...;
  InitializeQueue(&pdx->dqReadWrite, StartIo);
  ...
}
The dispatch function for an IRP that uses a DEVQUEUE follows this pattern:
NTSTATUS DispatchWrite(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  <some power management stuff you haven't heard about yet>
  IoMarkIrpPending(Irp);
  StartPacket(&pdx->dqReadWrite, fdo, Irp, OnCancel);
  return STATUS_PENDING;
}
That is, instead of calling IoStartPacket, you call the queue's StartPacket function with the address of the queue object, the device object, the IRP, and your cancel routine. At the start of a dispatch routine, you'll also have a small bit of code to handle restoring power after a period of disuse; I'll discuss that code in Chapter 8.
Here's a sketch of the new kind of StartIo routine you use with a DEVQUEUE:
VOID StartIo(PDEVICE_OBJECT fdo, PIRP Irp)
{
  <some PnP stuff you haven't heard about yet>
  // start request on device
}
StartIo doesn't worry about IRP cancellation. The cancel routine you use in this scheme is different from a standard one—it simply delegates all work to the DEVQUEUE:
VOID OnCancel(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  CancelRequest(&pdx->dqReadWrite, Irp);
}
CancelRequest will release the global cancel spin lock, which your cancel routine owns when it gets control, and will then cancel the IRP in a thread-safe and multiprocessor-safe way.
The deferred procedure call (DPC) routine you use when the request finishes also looks a little different from the standard-model one I showed you in Chapter 5, as you can see here:
VOID DpcForIsr(PKDPC Dpc, PDEVICE_OBJECT device, PIRP junk, PVOID context)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;
  PIRP Irp = GetCurrentIrp(&pdx->dqReadWrite);
  ...
  StartNextPacket(&pdx->dqReadWrite, device);
  <some PnP stuff you haven't heard about yet>
  CompleteRequest(Irp, ...);
}
Like IoStartNextPacket, the StartNextPacket function removes the next IRP from the queue and sends it to your (queue-specific) StartIo routine. It also returns the address of the IRP you were processing or NULL to indicate that your device was not processing an IRP. A NULL return value indicates that the IRP was cancelled or aborted for some reason, so it would be incorrect for you to try to complete it. Since you'll obtain the address of the finishing IRP by calling GetCurrentIrp, don't use the IRP pointer that comes to you as the third argument to the DPC routine. I named it junk to reinforce the point.
The DEVQUEUE also simplifies the handling of an IRP_MJ_CLEANUP. In fact, the code is almost trivial:
NTSTATUS DispatchCleanup(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  CleanupRequests(&pdx->dqReadWrite, stack->FileObject, STATUS_CANCELLED);
  return CompleteRequest(Irp, STATUS_SUCCESS, 0);
}
The real point of using a DEVQUEUE instead of a KDEVICE_QUEUE is that a DEVQUEUE makes it easier to manage the transitions between PnP states. In all of my sample drivers, the device extension contains a state variable with the imaginative name state. I also define an enumeration named DEVSTATE whose values correspond to the PnP states. When you initialize your device object in AddDevice, you'll call InitializeQueue for each of your device queues and also indicate that the device is in the STOPPED state:
NTSTATUS AddDevice(...)
{
  ...
  PDEVICE_EXTENSION pdx = ...;
  InitializeQueue(&pdx->dqReadWrite, StartIo);
  pdx->state = STOPPED;
  ...
}
After AddDevice returns, the system sends IRP_MJ_PNP requests to direct you through the various PnP states the device can assume.
A newly initialized DEVQUEUE is in a STALLED state, such that a call to StartPacket will queue a request even when the device is idle. You'll keep the queue(s) in the STALLED state until you successfully process IRP_MN_START_DEVICE, whereupon you'll execute code like the following:
NTSTATUS HandleStartDevice(...)
{
  status = StartDevice(...);
  if (NT_SUCCESS(status))
  {
    pdx->state = WORKING;
    RestartRequests(&pdx->dqReadWrite, fdo);
  }
}
You record WORKING as the current state of your device, and you call RestartRequests for each of your queues to release any IRPs that might have arrived between the time AddDevice ran and the time you received the IRP_MN_START_DEVICE request.
The PnP Manager always asks your permission before sending you an IRP_MN_STOP_DEVICE. The query takes the form of an IRP_MN_QUERY_STOP_DEVICE request that you can succeed or fail as you choose. The query basically means, "Would you be able to immediately stop your device if the system were to send you an IRP_MN_STOP_DEVICE in a few nanoseconds?" You can handle this query in two slightly different ways. Here's the first way, which is appropriate when your device might be busy with an IRP that either finishes quickly or can be easily terminated in the middle:
NTSTATUS HandleQueryStop(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (pdx->state != WORKING)
    return DefaultPnpHandler(fdo, Irp);
  if (!OkayToStop(pdx))
    return CompleteRequest(Irp, STATUS_UNSUCCESSFUL, 0);
  StallRequests(&pdx->dqReadWrite);
  WaitForCurrentIrp(&pdx->dqReadWrite);
  pdx->state = PENDINGSTOP;
  return DefaultPnpHandler(fdo, Irp);
}
The other basic way of handling QUERY_STOP is appropriate when your device might be busy with a request that will take a long time and can't be stopped in the middle, such as a tape retension operation that can't be stopped without potentially breaking the tape. In this case, you can use the DEVQUEUE's CheckBusyAndStall function. That function returns TRUE if the device is busy, whereupon you'd fail the QUERY_STOP with STATUS_UNSUCCESSFUL. The function returns FALSE if the device is idle, in which case it also stalls the queue. (The operations of checking the state of the device and stalling the queue need to be protected by a spin lock, which is why I wrote this function in the first place.)
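The decision just described can be modeled in a few lines of user-mode C. This is only a sketch under stated assumptions: model_queue, model_check_busy_and_stall, and model_query_stop are hypothetical names, and the model is single-threaded, so it leaves out the spin lock that makes the real check-and-stall atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two DEVQUEUE fields that matter here. */
typedef struct {
    const void *current;   /* request active on the device, or NULL */
    long stallcount;       /* nonzero means the queue is stalled */
} model_queue;

/* Models CheckBusyAndStall: report busy, or stall an idle queue.
   The real routine performs both steps while holding the queue spin lock. */
static bool model_check_busy_and_stall(model_queue *q)
{
    assert(q != NULL);
    bool busy = q->current != NULL;
    if (!busy)
        ++q->stallcount;
    return busy;
}

/* Models the query-stop decision: true means the query succeeds. */
static bool model_query_stop(model_queue *q)
{
    return !model_check_busy_and_stall(q);
}
```

A busy device fails the query and leaves the stall counter untouched, so there is nothing to undo afterward. An idle device succeeds the query with the queue already stalled.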
You can fail a stop query for many reasons. Disk devices that are used for paging, for example, cannot be stopped. Neither can devices that are used for storing hibernation or crash dump files. (You'll know about these characteristics as a result of an IRP_MN_DEVICE_USAGE_NOTIFICATION request, which I'll discuss later in "Other Configuration Functionality.") Other reasons may also apply to your device.
Even if you succeed the query, one of the drivers underneath you might fail it for some reason. Even if all the drivers succeed the query, the PnP Manager might decide not to shut you down. In any of these cases, you'll receive another PnP request with the minor code IRP_MN_CANCEL_STOP_DEVICE to tell you that your device won't be shut down. You should then clear whatever state you set during the initial query:
NTSTATUS HandleCancelStop(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (pdx->state != PENDINGSTOP)
    return DefaultPnpHandler(fdo, Irp);
  NTSTATUS status = ForwardAndWait(fdo, Irp);
  if (NT_SUCCESS(status))
  {
    pdx->state = WORKING;
    RestartRequests(&pdx->dqReadWrite, fdo);
  }
  return CompleteRequest(Irp, status, Irp->IoStatus.Information);
}
We first check to see whether a stop operation is even pending. Some higher-level driver might have vetoed a query that we never saw, so we'd still be in the WORKING state. If we're not in the PENDINGSTOP state, we simply forward the IRP. Otherwise, we send the CANCEL_STOP IRP synchronously to the lower-level drivers. That is, we use our ForwardAndWait helper function to send the IRP down the stack and await its completion. We wait for low-level drivers because we're about to resume processing IRPs, and the drivers might have work to do before we send them an IRP. If the lower layers successfully handle this IRP_MN_CANCEL_STOP_DEVICE, we change our state variable to indicate that we're back in the WORKING state, and we call RestartRequests to unstall the queues we stalled when we succeeded the query.
If, on the other hand, all device drivers succeed the query and the PnP Manager decides to go ahead with the shutdown, you'll get an IRP_MN_STOP_DEVICE next. Your subdispatch function would look like this one:
NTSTATUS HandleStopDevice(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (pdx->state != PENDINGSTOP)
  {
    <complicated stuff>
  }
  StopDevice(fdo, pdx->state == WORKING);
  pdx->state = STOPPED;
  return DefaultPnpHandler(fdo, Irp);
}
Just as the PnP Manager asks your permission before shutting your device down with a stop device request, it also might ask your permission before removing your device. This query takes the form of an IRP_MN_QUERY_REMOVE_DEVICE request that you can, once again, succeed or fail as you choose. And, just as with the stop query, the PnP Manager will use an IRP_MN_CANCEL_REMOVE_DEVICE request if it changes its mind about removing the device.
NTSTATUS HandleQueryRemove(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (OkayToRemove(fdo))
  {
    StallRequests(&pdx->dqReadWrite);
    WaitForCurrentIrp(&pdx->dqReadWrite);
    pdx->prevstate = pdx->state;
    pdx->state = PENDINGREMOVE;
    return DefaultPnpHandler(fdo, Irp);
  }
  return CompleteRequest(Irp, STATUS_UNSUCCESSFUL, 0);
}

NTSTATUS HandleCancelRemove(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (pdx->state != PENDINGREMOVE)
    return DefaultPnpHandler(fdo, Irp);
  NTSTATUS status = ForwardAndWait(fdo, Irp);
  if (NT_SUCCESS(status))
  {
    pdx->state = pdx->prevstate;
    if (pdx->state == WORKING)
      RestartRequests(&pdx->dqReadWrite, fdo);
  }
  return CompleteRequest(Irp, status, Irp->IoStatus.Information);
}
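The state bookkeeping in the stop and remove handlers can be summarized as a small state machine. The following user-mode sketch assumes each event was handled successfully. It uses the DEVSTATE values that appear in the handlers, while the event names, the dev_model type, and apply_event are hypothetical and serve only this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* State names follow the handlers in this section; everything else here
   is a hypothetical model, not code from the sample drivers. */
typedef enum { STOPPED, WORKING, PENDINGSTOP, PENDINGREMOVE, REMOVED } DEVSTATE;
typedef enum { EV_START, EV_QUERY_STOP, EV_CANCEL_STOP, EV_STOP,
               EV_QUERY_REMOVE, EV_CANCEL_REMOVE, EV_REMOVE } pnp_event;

typedef struct {
    DEVSTATE state;
    DEVSTATE prevstate;   /* remembered across a remove query */
} dev_model;

/* Apply one successfully handled PnP event to the model. */
static void apply_event(dev_model *d, pnp_event ev)
{
    assert(d != NULL);
    switch (ev) {
    case EV_START:
        d->state = WORKING;
        break;
    case EV_QUERY_STOP:          /* only a working device stalls */
        if (d->state == WORKING) d->state = PENDINGSTOP;
        break;
    case EV_CANCEL_STOP:         /* ignored unless a stop is pending */
        if (d->state == PENDINGSTOP) d->state = WORKING;
        break;
    case EV_STOP:
        d->state = STOPPED;
        break;
    case EV_QUERY_REMOVE:        /* may arrive in any state, so save it */
        d->prevstate = d->state;
        d->state = PENDINGREMOVE;
        break;
    case EV_CANCEL_REMOVE:       /* return to wherever we were */
        if (d->state == PENDINGREMOVE) d->state = d->prevstate;
        break;
    case EV_REMOVE:
        d->state = REMOVED;
        break;
    }
}
```

The asymmetry between the two cancel cases is the reason for prevstate: a cancelled stop always returns to WORKING, whereas a cancelled remove returns to whatever state the device was in when the query arrived.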
It turns out that the I/O Manager can send you PnP requests simultaneously with other substantive I/O requests, such as requests that involve reading or writing. It's entirely possible, therefore, for you to receive an IRP_MN_REMOVE_DEVICE at a time when you're still processing another IRP. It's up to you to prevent untoward consequences, and the standard way to do that involves using an IO_REMOVE_LOCK object and several associated kernel-mode support routines.
The basic idea behind the standard scheme for preventing premature removal is that you acquire the remove lock each time you start processing a request and you release the lock when you're done. Before you remove your device object, you make sure that the lock is free. If not, you wait until all references to the lock are released. Figure 6-6 illustrates the process.
Figure 6-6. Operation of an IO_REMOVE_LOCK.
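Conceptually, the remove lock behaves like a reference count with a "removal in progress" flag. The sketch below is a single-threaded user-mode model of that idea only. It is not how IO_REMOVE_LOCK is implemented, and every name in it is hypothetical. In particular, the real release-and-wait call blocks until the count drains, whereas this model just reports whether it would have to wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of the remove-lock idea. */
typedef struct {
    long holders;    /* outstanding acquisitions */
    bool removing;   /* set once removal has begun */
} model_remove_lock;

/* Returns true on success, false once removal is pending. */
static bool model_acquire(model_remove_lock *l)
{
    if (l->removing)
        return false;
    ++l->holders;
    return true;
}

static void model_release(model_remove_lock *l)
{
    assert(l->holders > 0);
    --l->holders;
}

/* Drop the caller's own reference and begin removal. Returns true when
   no holders remain, that is, when a real driver would not need to wait. */
static bool model_release_and_begin_removal(model_remove_lock *l)
{
    l->removing = true;
    model_release(l);
    return l->holders == 0;
}

/* The device object may be deleted only after the last holder lets go. */
static bool model_safe_to_delete(const model_remove_lock *l)
{
    return l->removing && l->holders == 0;
}
```

Once removal begins, new acquisitions fail, so the count can only go down. That property is what guarantees the wait eventually ends.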
To handle the mechanics of this process, you define a variable in the device extension:
struct DEVICE_EXTENSION {
  ...
  IO_REMOVE_LOCK RemoveLock;
  ...
};
You initialize the lock object during AddDevice:
NTSTATUS AddDevice(PDRIVER_OBJECT DriverObject, PDEVICE_OBJECT pdo)
{
  ...
  IoInitializeRemoveLock(&pdx->RemoveLock, 0, 0, 256);
  ...
}
The last three parameters to IoInitializeRemoveLock are, respectively, a tag value, an expected maximum lifetime for a lock, and a maximum lock count, none of which are used in the free build of the operating system.
These preliminaries set the stage for what you do during the lifetime of the device object. Whenever you receive an I/O request, you call IoAcquireRemoveLock. IoAcquireRemoveLock will return STATUS_DELETE_PENDING if a removal operation is underway. Otherwise, it will acquire the lock and return STATUS_SUCCESS. Whenever you finish an I/O operation, you call IoReleaseRemoveLock, which will release the lock and might unleash a heretofore pending removal operation. In the context of some purely hypothetical dispatch function that completes the IRP it's handed, the code might look like this:
NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);
  if (!NT_SUCCESS(status))
    return CompleteRequest(Irp, status, 0);
  ...
  IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
  return CompleteRequest(Irp, <some code>, <info value>);
}
The second argument to IoAcquireRemoveLock and IoReleaseRemoveLock is just a tag value that a checked build of the OS can use to match up acquisition and release calls, by the way.
The calls to acquire and release the remove lock dovetail with additional logic in the PnP dispatch function and the remove device subdispatch function. First, DispatchPnp has to obey the rule about locking and unlocking the device, so it will contain the following code that I didn't show you earlier in "IRP_MJ_PNP Dispatch Function":
NTSTATUS DispatchPnp(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);
  if (!NT_SUCCESS(status))
    return CompleteRequest(Irp, status, 0);
  ...
  status = (*fcntab[fcn])(fdo, Irp);
  if (fcn != IRP_MN_REMOVE_DEVICE)
    IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
  return status;
}
In other words, DispatchPnp locks the device, calls the subdispatch routine, and then (usually) unlocks the device afterward. The subdispatch routine for IRP_MN_REMOVE_DEVICE has additional special logic that you also haven't seen yet:
NTSTATUS HandleRemoveDevice(PDEVICE_OBJECT fdo, PIRP Irp)
{
  Irp->IoStatus.Status = STATUS_SUCCESS;
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  AbortRequests(&pdx->dqReadWrite, STATUS_DELETE_PENDING);
  DeregisterAllInterfaces(pdx);
  StopDevice(fdo, pdx->state == WORKING);
  pdx->state = REMOVED;
  NTSTATUS status = DefaultPnpHandler(fdo, Irp);
  IoReleaseRemoveLockAndWait(&pdx->RemoveLock, Irp);
  RemoveDevice(fdo);
  return status;
}
NOTE
You'll notice that the IRP_MN_REMOVE_DEVICE handler might block while some IRP finishes. This is certainly okay in Windows 98 and Windows 2000, which were designed with this possibility in mind—the IRP gets sent in the context of a system thread that is allowed to block. Some WDM functionality (a Microsoft developer even called it "embryonic") is present in OEM releases of Microsoft Windows 95, but you can't block a remove device request there. Consequently, if your driver needs to run in Windows 95, you need to discover that fact and avoid blocking. That discovery process is left as an exercise for you.
These are the mechanics of locking and unlocking the device to forestall removing the device while it's still in use. You still need to know when to invoke IoAcquireRemoveLock and IoReleaseRemoveLock to bring that mechanism into play. Basically, an IRP dispatch function that will complete the request quickly should acquire and release the lock.
A dispatch routine that queues an IRP should not acquire the remove lock, however. For a queued IRP, you acquire the lock inside StartIo and release it inside your DPC routine. So, we can expand the earlier skeleton of StartIo and DpcForIsr as follows:
VOID StartIo(PDEVICE_OBJECT fdo, PIRP Irp)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);
  if (!NT_SUCCESS(status))
  {
    CompleteRequest(Irp, status, 0);
    return;
  }
  // start request on device
}

VOID DpcForIsr(PKDPC Dpc, PDEVICE_OBJECT device, PIRP junk, PVOID context)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;
  PIRP Irp = GetCurrentIrp(&pdx->dqReadWrite);
  ...
  StartNextPacket(&pdx->dqReadWrite, device);
  IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
  CompleteRequest(Irp, ...);
}
You should also acquire the remove lock when you successfully process an IRP_MJ_CREATE. In contrast to the other situations we've considered, you don't release the lock before returning from the DispatchCreate routine. The balancing call to IoReleaseRemoveLock occurs instead in the dispatch routine for IRP_MJ_CLOSE. In other words, you hold the remove lock for the entire time something has a handle open to your device. Here's a sketch of what I mean:
NTSTATUS DispatchCreate(...)
{
  ...
  IoAcquireRemoveLock(&pdx->RemoveLock, stack->FileObject);
  return CompleteRequest(...);
}

NTSTATUS DispatchClose(...)
{
  ...
  IoReleaseRemoveLock(&pdx->RemoveLock, stack->FileObject);
  return CompleteRequest(...);
}
For debugging purposes, the balancing calls to IoAcquireRemoveLock and IoReleaseRemoveLock should use the same value for the second argument. You wouldn't use the IRP pointer as I've done in my other examples because the CREATE and CLOSE requests are different IRPs. The file object will be the same in both requests, though, which is why I used the file object in this example.
If the end user uses the Device Manager to remove a device when some application has an open handle, the operating system declines to remove the device and so informs the user. In that situation, the fact that you've also claimed the remove lock won't influence the course of events because you'll never get the IRP_MN_REMOVE_DEVICE that would cause you to wait for all holders of the lock to release it. If it's possible for the device to be physically removed from the computer without first going through the Device Manager, however, a correctly written application will be looking for a WM_DEVICECHANGE message that signals departure of the device. (See the discussion of user-mode notifications near the end of this chapter in "PnP Notifications".) The application will then close its handles. You should delay IRP_MN_REMOVE_DEVICE until the handles are actually closed, and the locking logic I've just described allows you to do that.
In contrast to other examples in this book, I'm going to show you the full implementation of the DEVQUEUE object even though the source code is on the companion disc. I'm making an exception in this case because I think an annotated listing of the functions will make it easier for you to understand how to use it.
The DEVQUEUE object has this declaration in my DEVQUEUE.H header file:
typedef struct _DEVQUEUE {
  LIST_ENTRY head;
  KSPIN_LOCK lock;
  PDRIVER_STARTIO StartIo;
  LONG stallcount;
  PIRP CurrentIrp;
  KEVENT evStop;
  NTSTATUS abortstatus;
} DEVQUEUE, *PDEVQUEUE;
InitializeQueue initializes one of these objects like this:
VOID NTAPI InitializeQueue(PDEVQUEUE pdq, PDRIVER_STARTIO StartIo)
{
  InitializeListHead(&pdq->head);
  KeInitializeSpinLock(&pdq->lock);
  pdq->StartIo = StartIo;
  pdq->stallcount = 1;
  pdq->CurrentIrp = NULL;
  KeInitializeEvent(&pdq->evStop, NotificationEvent, FALSE);
  pdq->abortstatus = (NTSTATUS) 0;
}
Stalling the IRP queue involves two DEVQUEUE functions:
VOID NTAPI StallRequests(PDEVQUEUE pdq)
{
  InterlockedIncrement(&pdq->stallcount);
}

BOOLEAN NTAPI CheckBusyAndStall(PDEVQUEUE pdq)
{
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  BOOLEAN busy = pdq->CurrentIrp != NULL;
  if (!busy)
    InterlockedIncrement(&pdq->stallcount);
  KeReleaseSpinLock(&pdq->lock, oldirql);
  return busy;
}
IRPs get added to the queue when a dispatch function calls StartPacket:
VOID NTAPI StartPacket(PDEVQUEUE pdq, PDEVICE_OBJECT fdo, PIRP Irp,
  PDRIVER_CANCEL cancel)
{
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  if (pdq->abortstatus)
  {
    // queue is rejecting: fail the IRP with the abort status
    KeReleaseSpinLock(&pdq->lock, oldirql);
    Irp->IoStatus.Status = pdq->abortstatus;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
  }
  else if (pdq->CurrentIrp || pdq->stallcount)
  {
    // device busy or queue stalled: queue the IRP unless already cancelled
    IoSetCancelRoutine(Irp, cancel);
    if (Irp->Cancel && IoSetCancelRoutine(Irp, NULL))
    {
      KeReleaseSpinLock(&pdq->lock, oldirql);
      Irp->IoStatus.Status = STATUS_CANCELLED;
      IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }
    else
    {
      InsertTailList(&pdq->head, &Irp->Tail.Overlay.ListEntry);
      KeReleaseSpinLock(&pdq->lock, oldirql);
    }
  }
  else
  {
    // idle device, ready queue: start the IRP at DISPATCH_LEVEL
    pdq->CurrentIrp = Irp;
    KeReleaseSpinLock(&pdq->lock, DISPATCH_LEVEL);
    (*pdq->StartIo)(fdo, Irp);
    KeLowerIrql(oldirql);
  }
}
I'd like to discuss a pesky nonproblem in the above code. Programs that change CurrentIrp do so while owning our spin lock, so we can be sure there's no ambiguity in our test of CurrentIrp. The stall counter, on the other hand, can be incremented without the spin lock inside StallRequests. It should be obvious that the only potential problem occurs when the counter is being incremented from 0 to 1 more or less simultaneously with us, because we behave the same way no matter what nonzero value the counter might have. Consider the potential race with a call to StallRequests that will increment the counter from 0 to 1. If we beat the increment and find the counter 0, we'll go ahead and start a request. That's okay, because the caller of StallRequests is willing to have the device be busy. (If the caller weren't willing, it would have used CheckBusyAndStall instead.) If we find the counter already incremented, we'll queue the IRP, which is also consistent with what the caller of StallRequests intended.
The function that dequeues most IRPs is StartNextPacket, which is called from a DPC routine:
PIRP NTAPI StartNextPacket(PDEVQUEUE pdq, PDEVICE_OBJECT fdo)
{
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  PIRP CurrentIrp = (PIRP) InterlockedExchangePointer(&pdq->CurrentIrp, NULL);
  if (CurrentIrp)
    KeSetEvent(&pdq->evStop, 0, FALSE);
  while (!pdq->stallcount && !pdq->abortstatus && !IsListEmpty(&pdq->head))
  {
    PLIST_ENTRY next = RemoveHeadList(&pdq->head);
    PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
    if (!IoSetCancelRoutine(Irp, NULL))
    {
      // the cancel routine is already running; let it complete this IRP
      InitializeListHead(&Irp->Tail.Overlay.ListEntry);
      continue;
    }
    pdq->CurrentIrp = Irp;
    KeReleaseSpinLockFromDpcLevel(&pdq->lock);
    (*pdq->StartIo)(fdo, Irp);
    KeLowerIrql(oldirql);
    return CurrentIrp;
  }
  KeReleaseSpinLock(&pdq->lock, oldirql);
  return CurrentIrp;
}
The RestartRequests function balances a call to StallRequests or CheckBusyAndStall. It's complicated—very slightly—by the need to send the first IRP to the StartIo routine. Luckily, it can just call StartNextPacket:
VOID NTAPI RestartRequests(PDEVQUEUE pdq, PDEVICE_OBJECT fdo)
{
  if (InterlockedDecrement(&pdq->stallcount) > 0)
    return;
  StartNextPacket(pdq, fdo);
}
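Because stalling is counted rather than flagged, stalls nest: the queue resumes only when every stall has been balanced by a restart. Here is a minimal single-threaded model of that counting. The names are hypothetical, and plain increments stand in for the interlocked operations the real functions use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the DEVQUEUE stall counter. */
typedef struct { long stallcount; } model_stall;

/* A new queue starts out stalled, as InitializeQueue arranges. */
static void model_init(model_stall *q) { q->stallcount = 1; }

static void model_stall_requests(model_stall *q) { ++q->stallcount; }

/* Returns true when this call released the last stall, which is the point
   at which the real RestartRequests goes on to call StartNextPacket. */
static bool model_restart_requests(model_stall *q)
{
    assert(q->stallcount > 0);
    return --q->stallcount == 0;
}
```

With this model, a stall requested while the queue is still in its initial stalled state needs two restarts before requests flow, one for the explicit stall and one for the initial count of 1.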
StartPacket registers a cancel routine supplied by its caller, which in turn simply delegates the work to the queue's CancelRequest function:
VOID NTAPI CancelRequest(PDEVQUEUE pdq, PIRP Irp)
{
  KIRQL oldirql = Irp->CancelIrql;
  IoReleaseCancelSpinLock(DISPATCH_LEVEL);
  KeAcquireSpinLockAtDpcLevel(&pdq->lock);
  RemoveEntryList(&Irp->Tail.Overlay.ListEntry);
  KeReleaseSpinLock(&pdq->lock, oldirql);
  Irp->IoStatus.Status = STATUS_CANCELLED;
  IoCompleteRequest(Irp, IO_NO_INCREMENT);
}
We're called while we own the global cancel spin lock, which we release almost immediately. After this everything is protected by the queue's spin lock instead. When IoCancelIrp called IoAcquireCancelSpinLock, it saved the previous interrupt request level (IRQL) value in the CancelIrql field of the IRP, and we need to eventually revert to that same IRQL; hence, we save it in the oldirql variable.
NOTE
The caller of IoCancelIrp is responsible for making sure that the IRP has not already been completed.
IRPs can also be cancelled as a result of an IRP_MJ_CLEANUP, which we'll receive prior to an IRP_MJ_CLOSE. The DEVQUEUE CleanupRequests function is almost identical to the standard-model DispatchCleanup routine I showed you in the previous chapter. The only substantive difference between the two is that we only need to acquire the queue's spin lock:
VOID NTAPI CleanupRequests(PDEVQUEUE pdq, PFILE_OBJECT fop, NTSTATUS status)
{
  LIST_ENTRY cancellist;
  InitializeListHead(&cancellist);
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  PLIST_ENTRY first = &pdq->head;
  PLIST_ENTRY next;
  for (next = first->Flink; next != first; )
  {
    PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
    PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
    PLIST_ENTRY current = next;   // remember this entry before advancing
    next = next->Flink;
    if (fop && stack->FileObject != fop)
      continue;
    if (!IoSetCancelRoutine(Irp, NULL))
      continue;
    RemoveEntryList(current);
    InsertTailList(&cancellist, current);
  }
  KeReleaseSpinLock(&pdq->lock, oldirql);
  while (!IsListEmpty(&cancellist))
  {
    next = RemoveHeadList(&cancellist);
    PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
    Irp->IoStatus.Status = status;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
  }
}
CleanupRequests can be called from elsewhere in the driver, by the way. For example, earlier I showed you the IRP_MN_REMOVE_DEVICE handler calling AbortRequests, which in turn calls CleanupRequests with a NULL file object pointer (in order to select all IRPs) and a status code of STATUS_DELETE_PENDING.
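A loop that unlinks entries while walking a doubly linked list has to remember the current entry and advance past it before unlinking it. The user-mode sketch below shows that pattern in isolation, with matching entries moved to a second list just as cleanup moves IRPs to a private cancel list. The node and request types and all the function names are hypothetical stand-ins for LIST_ENTRY and the kernel list macros, and an integer owner stands in for the file object.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on LIST_ENTRY. All names
   here are hypothetical stand-ins for the kernel types and macros. */
typedef struct node { struct node *flink, *blink; } node;

typedef struct { node link; int owner; } request;   /* owner ~ file object */

static void list_init(node *h) { h->flink = h->blink = h; }
static int list_empty(const node *h) { return h->flink == h; }

static void list_insert_tail(node *h, node *e)
{
    e->blink = h->blink;
    e->flink = h;
    h->blink->flink = e;
    h->blink = e;
}

static void list_remove(node *e)
{
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
}

/* Move every request owned by `owner` (or all of them, if owner < 0) from
   `queue` to `cancelled`. Returns the number of requests moved. */
static int move_matching(node *queue, node *cancelled, int owner)
{
    assert(queue != NULL && cancelled != NULL);
    int moved = 0;
    node *next = queue->flink;
    while (next != queue) {
        node *current = next;
        next = next->flink;              /* advance before unlinking */
        request *r = (request *)((char *)current - offsetof(request, link));
        if (owner >= 0 && r->owner != owner)
            continue;
        list_remove(current);
        list_insert_tail(cancelled, current);
        ++moved;
    }
    return moved;
}
```

Collecting the matches on a private list first means the slow part, completing each request, can happen after the queue lock has been released.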
The handler for IRP_MN_STOP_DEVICE might need to wait for the current IRP, if any, to finish by calling WaitForCurrentIrp:
VOID NTAPI WaitForCurrentIrp(PDEVQUEUE pdq)
{
  KeClearEvent(&pdq->evStop);
  ASSERT(pdq->stallcount != 0);
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  BOOLEAN mustwait = pdq->CurrentIrp != NULL;
  KeReleaseSpinLock(&pdq->lock, oldirql);
  if (mustwait)
    KeWaitForSingleObject(&pdq->evStop, Executive, KernelMode, FALSE, NULL);
}
Surprise removal of the device demands that we immediately halt every outstanding IRP that might try to touch the hardware. In addition, we want to make sure that all further IRPs get rejected. The AbortRequests function helps with these tasks:
VOID NTAPI AbortRequests(PDEVQUEUE pdq, NTSTATUS status)
{
  pdq->abortstatus = status;
  CleanupRequests(pdq, NULL, status);
}
Setting abortstatus puts the queue into the REJECTING state so that all future IRPs will be rejected with whatever status value our caller supplied. Calling CleanupRequests at this point—with a NULL file object pointer so that CleanupRequests will process the entire queue—empties the queue.
We don't dare try to do anything with the IRP, if any, that's currently active on the hardware. Drivers that don't use the hardware abstraction layer (HAL) to access the hardware—USB drivers, for example, which rely on the hub and host-controller drivers—can count on another driver to fail the current IRP. Drivers that use the HAL might, however, need to worry about hanging the system or, at the very least, leaving an IRP in limbo because the nonexistent hardware can't generate the interrupt that would let the IRP finish. To deal with situations like this, you call AreRequestsBeingAborted:
NTSTATUS AreRequestsBeingAborted(PDEVQUEUE pdq)
{
  return pdq->abortstatus;
}
It would be silly, by the way, to use the queue spin lock in this routine. Suppose that we were to capture the instantaneous value of abortstatus in a thread-safe and multiprocessor-safe way. The value we return could become obsolete as soon as we release the spin lock.
NOTE
If your device might be removed in such a way that an outstanding request simply hangs, you should also have a watchdog timer of some sort running that will let you kill the IRP after some period of time. See the "Watchdog Timers" section in Chapter 9, "Specialized Topics."
Sometimes we need to undo the effect of a previous call to AbortRequests. AllowRequests lets us do that:
VOID NTAPI AllowRequests(PDEVQUEUE pdq)
{
  pdq->abortstatus = (NTSTATUS) 0;
}