Particle physics has its "standard model" for the universe, and so does WDM. Figure 5-5 illustrates a typical flow of ownership for an IRP as it progresses through various stages in its life. Not every type of IRP would go through these steps, and some of the steps might be missing or altered depending on the type of device and the type of IRP. Notwithstanding the possible variability, however, the picture provides a useful starting point for discussion.
Figure 5-5. The "standard model" for IRP processing.
It's Even More Complicated than You Thought…
The first time you encounter the concepts that make up the standard model for IRP processing, they'll probably seem pretty complicated. Unfortunately, the standard model is also not quite sufficient to handle all the problems that can arise in a regime that includes hot pluggable devices, dynamic resource reconfiguration, and power management. In later chapters, I'll describe another way of queuing and cancelling IRPs that deals with these extra problems. The standard model will seem like a model of clarity when you're done reading about that!
Despite the problems that some devices present, many devices can still employ the standard model (which is, of course, why I'm bothering to explain it here). If your device cannot be removed or reconfigured while the system is running and can reject I/O requests while in a low-power state, you can use the standard model.
The IRP begins life when some entity calls an I/O Manager function to create it. In the figure, I used the term I/O Manager to describe this entity, as though there were a single system component responsible for creating IRPs. In reality, no such single actor in the population of operating system routines exists, and it would have been more accurate to just say that something creates the IRP. Your own driver will be creating IRPs from time to time, for example, and you will occupy the initial ownership box for those particular IRPs.
You can use any of four functions to create a new IRP:
The Fsd in the first two of these function names stands for file system driver (FSD). Although FSDs are the primary users of the functions, any driver is allowed to call them. The DDK also documents a function named IoMakeAssociatedIrp for building an IRP that's subordinate to some other IRP. WDM drivers should not call this function. Indeed, completion of associated IRPs doesn't work correctly in Microsoft Windows 98 anyway.
Deciding which of these functions to call and determining what additional initialization you need to perform on an IRP is a rather complicated matter. I'll come back to this subject, therefore, at the end of this chapter.
After you create an IRP, you call IoGetNextIrpStackLocation to obtain a pointer to the first stack location. Then you initialize just that first location. At the very least, you need to fill in the MajorFunction code. Having initialized the stack, you call IoCallDriver to send the IRP to a device driver:
PDEVICE_OBJECT DeviceObject; // something gives you this
PIO_STACK_LOCATION stack = IoGetNextIrpStackLocation(Irp);
stack->MajorFunction = IRP_MJ_Xxx;
<other initialization of "stack">
NTSTATUS status = IoCallDriver(DeviceObject, Irp);
The first argument to IoCallDriver is the address of a device object that you've obtained somehow. I'll describe two common ways of getting a device object pointer at the very end of this chapter in "Where Do Device Object Pointers Come From?" For the time being, imagine that these pointers just come to you out of the blue.
The initial stack location pointer in the IRP gets initialized to one before the actual first location. Since the I/O stack is an array of IO_STACK_LOCATION structures, you could think of the stack pointer as being initialized to point to the "-1" element, which doesn't exist. (In fact, the stack "grows" from high toward low addresses, but that detail shouldn't obscure the concept I'm trying to describe here.) We therefore ask for the "next" stack location when we want to initialize the first one. IoCallDriver will advance the stack pointer to the 0 entry and extract the major function code that we left there. That's the made-up value IRP_MJ_Xxx in this example. Then, IoCallDriver will follow the DriverObject pointer inside the device object to the MajorFunction table belonging to the top-level driver. Recall that the driver's DriverEntry function filled that table in with pointers to dispatch functions in the driver. IoCallDriver will use the major function code to index the table, and it will then call the function whose address it finds.
You can imagine IoCallDriver as looking something like this (but I hasten to add that this is not a copy of the actual source code):
NTSTATUS IoCallDriver(PDEVICE_OBJECT device, PIRP Irp)
{
  IoSetNextIrpStackLocation(Irp);
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  stack->DeviceObject = device;
  ULONG fcn = stack->MajorFunction;
  PDRIVER_OBJECT driver = device->DriverObject;
  return (*driver->MajorFunction[fcn])(device, Irp);
}
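To make the stack-pointer arithmetic concrete, here is a minimal user-mode sketch of the same idea. It is not kernel code and makes no claim about how the I/O Manager is really implemented: every Model-prefixed type, constant, and function is a hypothetical stand-in invented for illustration, and the return values 100 and 200 are arbitrary markers.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode model only; none of these names exist in the DDK. */
enum { MODEL_MJ_READ = 0, MODEL_MJ_WRITE = 1, MODEL_MJ_COUNT = 2 };
enum { MODEL_STACK_SIZE = 2 };

struct ModelDevice;
struct ModelIrp;
typedef int (*ModelDispatch)(struct ModelDevice *, struct ModelIrp *);

typedef struct ModelStackLocation {
    int MajorFunction;
    struct ModelDevice *DeviceObject;
} ModelStackLocation;

typedef struct ModelDriver {
    ModelDispatch MajorFunction[MODEL_MJ_COUNT]; /* filled in by "DriverEntry" */
} ModelDriver;

typedef struct ModelDevice {
    ModelDriver *DriverObject;
} ModelDevice;

typedef struct ModelIrp {
    ModelStackLocation Stack[MODEL_STACK_SIZE];
    ModelStackLocation *Current;
} ModelIrp;

/* The stack grows from high toward low addresses, so "one before the
   first location" is the address just past the end of the array. */
static void ModelInitIrp(ModelIrp *irp)
{
    irp->Current = irp->Stack + MODEL_STACK_SIZE;
}

/* The "next" location is the one the callee will see as current. */
static ModelStackLocation *ModelGetNextStackLocation(ModelIrp *irp)
{
    return irp->Current - 1;
}

/* Advance the stack pointer, then index the driver's dispatch table
   with the major function code found in the new current location. */
static int ModelCallDriver(ModelDevice *device, ModelIrp *irp)
{
    assert(irp->Current > irp->Stack); /* a location must remain */
    irp->Current--;
    irp->Current->DeviceObject = device;
    int fcn = irp->Current->MajorFunction;
    return device->DriverObject->MajorFunction[fcn](device, irp);
}

/* Two trivial dispatch routines that return distinct marker values. */
static int ModelDispatchRead(ModelDevice *device, ModelIrp *irp)
{
    (void)device; (void)irp;
    return 100;
}

static int ModelDispatchWrite(ModelDevice *device, ModelIrp *irp)
{
    (void)device; (void)irp;
    return 200;
}
```

A caller would initialize the IRP with ModelInitIrp, store a major function code through ModelGetNextStackLocation, and then call ModelCallDriver, which routes the request to the matching dispatch routine.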
An archetypal IRP dispatch routine would look similar to this example:
NTSTATUS DispatchXxx(PDEVICE_OBJECT device, PIRP Irp)
{
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;
  ...
  return STATUS_Xxx;
}
In this book, I'll be using names of the form DispatchXxx (for example, DispatchRead, DispatchPnp, and so forth) for the dispatch functions in my sample drivers. Other authors use different conventions for these names. Microsoft recommends, for example, that you use a name like RandomDispatchRead for the IRP_MJ_READ dispatch function in a driver named RANDOM.SYS. Conventions like this make it easier to understand debugger traces in some situations, but they also require you to do more typing. Since these names aren't visible outside the name space of your own driver, it's up to you whether you use very specific names as Microsoft recommends or names such as Fred that have meaning to you.
Where I used an ellipsis in the prototypical dispatch function above, a dispatch function has to choose between three courses of action. It can complete the request immediately, pass the request down to a lower-level driver in the same driver stack, or queue the request for later processing by other routines in this driver. I'm going to discuss each of these alternatives fully in this chapter, but I'm going to talk about only the queuing possibility now because that's what comes next in the standard model for IRP processing. You see, the largest number of requests that come into a device involves reading or writing data, and you usually need to put these requests into a queue to serialize access to your hardware.
Every device object gets a request queue object "for free," and there's a standard way of using this queue:
NTSTATUS DispatchXxx(...)
{
  ...
  IoMarkIrpPending(Irp);
  IoStartPacket(device, Irp, NULL, NULL);
  return STATUS_PENDING;
}
It's very important not to touch the IRP once we call IoStartPacket. By the time that function returns, the IRP may have been completed and the memory it occupies released. The pointer we have might, therefore, now be invalid.
The I/O Manager calls your StartIo routine to process one IRP at a time:
VOID StartIo(PDEVICE_OBJECT device, PIRP Irp)
{
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;
  ...
}
Your StartIo routine receives control at DISPATCH_LEVEL, meaning that it must not generate any page faults. In addition, the CurrentIrp field of the device object and the Irp argument will both point to the IRP that's being submitted to you for processing.
Your job in StartIo is to commence the IRP you've been handed. How you do this depends entirely on your device. Often you will need to access hardware registers that are also used by your interrupt service routine (ISR) and, perhaps, by other routines in your driver. In fact, sometimes the easiest way to commence a new operation is to store some state information in your device extension and then fake an interrupt. Since either of these approaches needs to be carried out under protection of the same spin lock that protects your ISR, the correct way to proceed is to call KeSynchronizeExecution. For example:
VOID StartIo(...)
{
  ...
  KeSynchronizeExecution(pdx->InterruptObject, TransferFirst, (PVOID) pdx);
}

BOOLEAN TransferFirst(PVOID context)
{
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) context;
  ...
  return TRUE;
}
The TransferFirst routine shown here is an example of the generic class of SynchCritSection routines, so called because they are synchronized with the ISR. I'll discuss the SynchCritSection concept in more detail in Chapter 7, "Reading and Writing Data."
Once StartIo gets the device busy handling the new request, it returns. You'll see the request next when your device interrupts to signal that it's done with whatever transfer you started.
When your device is finished transferring data, it might signal a hardware interrupt. In Chapter 7, I'll show you how to use IoConnectInterrupt to "hook" the interrupt. One of the arguments to IoConnectInterrupt is the address of your ISR. When the interrupt occurs, the hardware abstraction layer (HAL) calls your ISR. The ISR runs at the device IRQL (DIRQL) of your particular device and under the protection of a spin lock associated specifically with your ISR. The ISR has the following prototype:
BOOLEAN OnInterrupt(PKINTERRUPT InterruptObject, PVOID context)
{
  ...
}
The first argument of your ISR is the address of the interrupt object created by IoConnectInterrupt, but you're unlikely to use this argument. The second argument is whatever context value you specified in your original call to IoConnectInterrupt; it will probably be the address of your device object or of your device extension, depending on your preference.
I'll discuss the duties of your ISR in detail in Chapter 7 in connection with reading and writing data, the subject to which interrupt handling is most relevant. To carry on with this discussion of the standard model, I need to tell you that one of the likely things for the ISR to do is to schedule a deferred procedure call (DPC). The purpose of the DPC is to let you do things, like calling IoCompleteRequest, that can't be done at the rarefied DIRQL at which your ISR runs. So, supposing you develop a pointer named device to your device object inside the ISR, you'd have a line of code like this one:
IoRequestDpc(device, device->CurrentIrp, NULL);
You'll next see the IRP in the DPC routine you registered inside AddDevice with your call to IoInitializeDpcRequest. The traditional name for that routine is DpcForIsr because it's the DPC routine your ISR requests.
The DpcForIsr routine requested by your ISR receives control at DISPATCH_LEVEL. Generally, its job is to finish up the processing of the IRP that caused the most recent interrupt. Often that job entails calling IoCompleteRequest to complete this IRP and IoStartNextPacket to remove the next IRP from your device queue for forwarding to StartIo.
VOID DpcForIsr(PKDPC Dpc, PDEVICE_OBJECT device, PIRP Irp, PVOID context)
{
  ...
  IoStartNextPacket(device, FALSE);
  IoCompleteRequest(Irp, boost);
}
The call to IoCompleteRequest is the end of this standard way of handling an I/O request. After that call, the I/O Manager (or whatever created the IRP in the first place) owns the IRP once more. That entity will destroy the IRP and might unblock a thread that has been waiting for the request to complete.
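The whole cycle of queuing, starting, interrupting, and completing can be traced with a small user-mode simulation. This is only a sketch under stated assumptions: the Sim-prefixed names are hypothetical, a fixed-size FIFO stands in for the device object's built-in queue, a boolean flag stands in for IoCompleteRequest, and a direct function call stands in for the interrupt and its DPC. It models the serialization behavior described above, not the actual I/O Manager code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SIM_MAX_PENDING = 8 };

typedef struct SimIrp {
    int id;
    bool completed;                   /* set where IoCompleteRequest would run */
} SimIrp;

typedef struct SimDevice {
    SimIrp *CurrentIrp;               /* request the "hardware" is working on */
    SimIrp *pending[SIM_MAX_PENDING]; /* FIFO of requests waiting their turn */
    size_t head, count;
    int startOrder[SIM_MAX_PENDING];  /* ids in the order StartIo saw them */
    size_t started;
} SimDevice;

/* Stand-in for StartIo: a real driver would program the hardware here. */
static void SimStartIo(SimDevice *dev, SimIrp *irp)
{
    assert(dev->started < SIM_MAX_PENDING);
    dev->startOrder[dev->started++] = irp->id;
}

/* Models IoStartPacket: start at once if the device is idle, else queue. */
static void SimStartPacket(SimDevice *dev, SimIrp *irp)
{
    if (dev->CurrentIrp == NULL) {
        dev->CurrentIrp = irp;
        SimStartIo(dev, irp);
        return;
    }
    assert(dev->count < SIM_MAX_PENDING);
    dev->pending[(dev->head + dev->count) % SIM_MAX_PENDING] = irp;
    dev->count++;
}

/* Models IoStartNextPacket: dequeue the next request, if any, and start it. */
static void SimStartNextPacket(SimDevice *dev)
{
    if (dev->count == 0) {
        dev->CurrentIrp = NULL;       /* nothing waiting: device goes idle */
        return;
    }
    SimIrp *next = dev->pending[dev->head];
    dev->head = (dev->head + 1) % SIM_MAX_PENDING;
    dev->count--;
    dev->CurrentIrp = next;
    SimStartIo(dev, next);
}

/* Models the interrupt followed by DpcForIsr for the current request. */
static void SimInterrupt(SimDevice *dev)
{
    SimIrp *irp = dev->CurrentIrp;    /* capture before CurrentIrp changes */
    assert(irp != NULL);
    SimStartNextPacket(dev);
    irp->completed = true;
}
```

Submitting three requests back to back starts only the first; each simulated interrupt then completes the current request and starts the next, so the hardware never sees two requests at once.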
Some devices operate in such a way that it makes sense to have more than one queue of requests. A common example is a serial port, which can handle independent streams of input and output requests simultaneously. Both IoStartPacket and IoStartNextPacket (and their key-sorting equivalents) work with a queue that you get "for free" as part of the device object. It's relatively easy to create additional queues that work the same way as the standard queue managed by those routines.
To make it easier to discuss things, let's suppose that you need a separate queue to manage IRP_MJ_SPECIAL requests. (There's no such major function code—I made it up just so that we'd have a concrete topic for the discussion.) You would write two helper functions that would do for these special IRPs pretty much the same thing as the StartIo and DpcForIsr routines I mentioned earlier:
You'll also create a KDEVICE_QUEUE object in your device extension. You'd initialize this object during AddDevice:
NTSTATUS AddDevice(...)
{
  ...
  KeInitializeDeviceQueue(&pdx->dqSpecial);
  ...
}
where dqSpecial is the name of the KDEVICE_QUEUE object we'll use for IRP_MJ_SPECIAL requests. A device queue object is a three-state object. (See Figure 5-6.) These states influence how the support routines for device queues operate:
Figure 5-6. States of a KDEVICE_QUEUE queue.
We use these support routines and the special device queue in our dispatch and DPC routines, as follows:
NTSTATUS DispatchSpecial(PDEVICE_OBJECT fdo, PIRP Irp)
{
  IoMarkIrpPending(Irp);
  KIRQL oldirql;
  KeRaiseIrql(DISPATCH_LEVEL, &oldirql);
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (!KeInsertDeviceQueue(&pdx->dqSpecial, &Irp->Tail.Overlay.DeviceQueueEntry))
    StartIoSpecial(fdo, Irp);
  KeLowerIrql(oldirql);
  return STATUS_PENDING;
}

VOID DpcSpecial(...)
{
  ...
  PKDEVICE_QUEUE_ENTRY qep = KeRemoveDeviceQueue(&pdx->dqSpecial);
  if (qep)
    StartIoSpecial(fdo, CONTAINING_RECORD(qep, IRP, Tail.Overlay.DeviceQueueEntry));
  ...
}
It's no coincidence that my earlier descriptions of IoStartPacket and IoStartNextPacket sound so similar to what I've just described. Those functions work with a KDEVICE_QUEUE object named DeviceQueue that's one of the opaque fields of a device object, and their logic is the same as your logic when you manage your own device queue.
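The three-state behavior that DispatchSpecial and DpcSpecial rely on can be modeled in a few lines of user-mode C. Treat this as a sketch, not the kernel's implementation: the Model-prefixed names are hypothetical, the real KDEVICE_QUEUE is protected by a spin lock that this model omits, and the state names (idle, busy and empty, busy and not empty) are labels chosen here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelQueueEntry {
    struct ModelQueueEntry *next;
} ModelQueueEntry;

typedef struct ModelDeviceQueue {
    bool busy;                        /* true while a request is in progress */
    ModelQueueEntry *head, *tail;     /* FIFO of waiting entries */
} ModelDeviceQueue;

typedef enum {
    MODEL_IDLE,                       /* not busy, nothing queued */
    MODEL_BUSY_EMPTY,                 /* one request active, none waiting */
    MODEL_BUSY_NOT_EMPTY              /* one request active, others waiting */
} ModelQueueState;

static void ModelInitializeDeviceQueue(ModelDeviceQueue *q)
{
    q->busy = false;
    q->head = q->tail = NULL;
}

static ModelQueueState ModelQueueGetState(const ModelDeviceQueue *q)
{
    if (!q->busy)
        return MODEL_IDLE;
    return q->head ? MODEL_BUSY_NOT_EMPTY : MODEL_BUSY_EMPTY;
}

/* Modeled on KeInsertDeviceQueue: if the queue is idle, mark it busy and
   return false WITHOUT queuing, so the caller starts the request itself.
   Otherwise append the entry and return true. */
static bool ModelInsertDeviceQueue(ModelDeviceQueue *q, ModelQueueEntry *e)
{
    if (!q->busy) {
        q->busy = true;
        return false;
    }
    e->next = NULL;
    if (q->tail)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    return true;
}

/* Modeled on KeRemoveDeviceQueue: return the oldest waiting entry, or
   mark the queue idle and return NULL when nothing is waiting. */
static ModelQueueEntry *ModelRemoveDeviceQueue(ModelDeviceQueue *q)
{
    ModelQueueEntry *e = q->head;
    if (e == NULL) {
        q->busy = false;
        return NULL;
    }
    q->head = e->next;
    if (q->head == NULL)
        q->tail = NULL;
    return e;
}
```

The false return from the insert routine is what lets a dispatch routine call its start routine directly for the first request, and the NULL return from the remove routine is what returns the queue to the idle state once the DPC finds nothing left to start.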