The "Standard Model" for IRP Processing

Particle physics has its "standard model" for the universe, and so does WDM. Figure 5-5 illustrates a typical flow of ownership for an IRP as it progresses through various stages in its life. Not every type of IRP would go through these steps, and some of the steps might be missing or altered depending on the type of device and the type of IRP. Notwithstanding the possible variability, however, the picture provides a useful starting point for discussion.

Figure 5-5. The "standard model" for IRP processing.

Creating an IRP

The IRP begins life when some entity calls an I/O Manager function to create it. In the figure, I used the term I/O Manager to describe this entity, as though there were a single system component responsible for creating IRPs. In reality, no such single actor in the population of operating system routines exists, and it would have been more accurate to just say that something creates the IRP. Your own driver will be creating IRPs from time to time, for example, and you will occupy the initial ownership box for those particular IRPs.

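You can use any of four functions to create a new IRP:

  1. IoBuildAsynchronousFsdRequest builds an IRP on whose completion you don't plan to wait.
  2. IoBuildSynchronousFsdRequest builds an IRP on whose completion you do plan to wait.
  3. IoBuildDeviceIoControlRequest builds a synchronous IRP_MJ_DEVICE_CONTROL or IRP_MJ_INTERNAL_DEVICE_CONTROL request.
  4. IoAllocateIrp builds an IRP of any type.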

The Fsd in the first two of these function names stands for file system driver (FSD). Although FSDs are the primary users of the functions, any driver is allowed to call them. The DDK also documents a function named IoMakeAssociatedIrp for building an IRP that's subordinate to some other IRP. WDM drivers should not call this function. Indeed, completion of associated IRPs doesn't work correctly in Microsoft Windows 98 anyway.

Deciding which of these functions to call and determining what additional initialization you need to perform on an IRP is a rather complicated matter. I'll come back to this subject, therefore, at the end of this chapter.

Forwarding to a Dispatch Routine

After you create an IRP, you call IoGetNextIrpStackLocation to obtain a pointer to the first stack location. Then you initialize just that first location. At the very least, you need to fill in the MajorFunction code. Having initialized the stack, you call IoCallDriver to send the IRP to a device driver:

PDEVICE_OBJECT DeviceObject;   // something gives you this
PIRP Irp;                      // the IRP you just created
PIO_STACK_LOCATION stack = IoGetNextIrpStackLocation(Irp);
stack->MajorFunction = IRP_MJ_Xxx;
// <other initialization of "stack">
NTSTATUS status = IoCallDriver(DeviceObject, Irp);

The first argument to IoCallDriver is the address of a device object that you've obtained somehow. I'll describe two common ways of getting a device object pointer at the very end of this chapter in "Where Do Device Object Pointers Come From?" For the time being, imagine that these pointers just come to you out of the blue.

The initial stack location pointer in the IRP gets initialized to one before the actual first location. Since the I/O stack is an array of IO_STACK_LOCATION structures, you could think of the stack pointer as being initialized to point to the "-1" element, which doesn't exist. (In fact, the stack "grows" from high toward low addresses, but that detail shouldn't obscure the concept I'm trying to describe here.) We therefore ask for the "next" stack location when we want to initialize the first one. IoCallDriver will advance the stack pointer to the 0 entry and extract the major function code that we left there. That's the made-up value IRP_MJ_Xxx in this example. Then, IoCallDriver will follow the DriverObject pointer inside the device object to the MajorFunction table belonging to the top-level driver. Recall that the driver's DriverEntry function filled that table in with pointers to dispatch functions in the driver. IoCallDriver will use the major function code to index the table, and it will then call the function whose address it finds.

You can imagine IoCallDriver as looking something like this (but I hasten to add that this is not a copy of the actual source code):

NTSTATUS IoCallDriver(PDEVICE_OBJECT device, PIRP Irp)
  {
  IoSetNextIrpStackLocation(Irp);
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  stack->DeviceObject = device;
  ULONG fcn = stack->MajorFunction;
  PDRIVER_OBJECT driver = device->DriverObject;
  return (*driver->MajorFunction[fcn])(device, Irp);
  }
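
The "one before the first location" bookkeeping is easier to see in a model you can compile in user mode. The following is only a sketch under stated assumptions: the Model* types, their field names, and the two-entry stack are all invented for illustration and are not the real IRP or device object layout. It shows the stack pointer starting just past the high end of the array, the "next" location being the one below it, and the dispatch through a table indexed by major function code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MJ_MAX 4   /* hypothetical number of major function codes */

typedef struct ModelIrp ModelIrp;
typedef struct ModelDevice ModelDevice;
typedef int (*ModelDispatch)(ModelDevice *device, ModelIrp *irp);

typedef struct {
    unsigned char major_function;
    ModelDevice *device;
} ModelStackLocation;

struct ModelDevice {
    /* Stands in for the MajorFunction table that DriverEntry fills in. */
    ModelDispatch major_function[MODEL_MJ_MAX];
};

struct ModelIrp {
    int stack_count;               /* number of usable stack locations */
    int current_location;          /* starts at stack_count + 1 */
    ModelStackLocation *current;   /* starts one past the end of the array */
    ModelStackLocation stack[2];
};

static void model_init_irp(ModelIrp *irp)
{
    irp->stack_count = 2;
    irp->current_location = irp->stack_count + 1;
    /* Points at the nonexistent entry; it is never dereferenced here. */
    irp->current = irp->stack + irp->stack_count;
}

/* The "next" location is one step toward lower addresses. */
static ModelStackLocation *model_get_next(ModelIrp *irp)
{
    return irp->current - 1;
}

/* Advance the stack pointer, then dispatch on the major function code. */
static int model_call_driver(ModelDevice *device, ModelIrp *irp)
{
    irp->current_location--;
    irp->current--;
    irp->current->device = device;
    return device->major_function[irp->current->major_function](device, irp);
}

/* A dispatch routine that just reports which location is current. */
static int model_dispatch_read(ModelDevice *device, ModelIrp *irp)
{
    (void)device;
    return irp->current_location;
}
```

A caller would initialize the IRP, fill in the location returned by model_get_next, and then call model_call_driver, which mirrors the order of operations in the fragment shown earlier.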

Duties of a Dispatch Routine

An archetypal IRP dispatch routine would look similar to this example:



NTSTATUS DispatchXxx(PDEVICE_OBJECT device, PIRP Irp)
  {
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);          // 1
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;   // 2
  ...
  return STATUS_Xxx;                                                     // 3
  }

  1. You generally need to access the current stack location to determine parameters or to examine the minor function code.
  2. You also generally need to access the device extension you created and initialized during AddDevice.
  3. You'll be returning some NTSTATUS code to IoCallDriver, which will propagate the code back to its caller.

In this book, I'll be using names of the form DispatchXxx (for example, DispatchRead, DispatchPnp, and so forth) for the dispatch functions in my sample drivers. Other authors use different conventions for these names. Microsoft recommends, for example, that you use a name like RandomDispatchRead for the IRP_MJ_READ dispatch function in a driver named RANDOM.SYS. Conventions like this make it easier to understand debugger traces in some situations, but they also require you to do more typing. Since these names aren't visible outside the name space of your own driver, it's up to you whether you use very specific names as Microsoft recommends or names such as Fred that have meaning to you.

Where I used an ellipsis in the prototypical dispatch function above, a dispatch function has to choose between three courses of action. It can complete the request immediately, pass the request down to a lower-level driver in the same driver stack, or queue the request for later processing by other routines in this driver. I'm going to discuss each of these alternatives fully in this chapter, but I'm going to talk about only the queuing possibility now because that's what comes next in the standard model for IRP processing. You see, the largest number of requests that come into a device involves reading or writing data, and you usually need to put these requests into a queue to serialize access to your hardware.

Every device object gets a request queue object "for free," and there's a standard way of using this queue:




NTSTATUS DispatchXxx(...)
  {
  ...
  IoMarkIrpPending(Irp);                    // 1
  IoStartPacket(device, Irp, NULL, NULL);   // 2
  return STATUS_PENDING;                    // 3
  }

  1. Whenever we return STATUS_PENDING from a dispatch routine (as we're about to do here), we make this call to help the I/O Manager avoid an internal race condition. We must do this before we relinquish ownership of the IRP.
  2. If our device is currently busy, IoStartPacket puts the request onto a queue. If our device is idle, IoStartPacket marks the device as being busy and calls our StartIo routine. I'll describe the StartIo routine in the next section. The third argument to IoStartPacket is the address of a ULONG key used for sorting the queue. Disk drivers, for example, might specify a cylinder address here to provide for ordered-seek queuing. If you supply NULL, as here, this request is added to the tail of the queue. The last argument is the address of a cancel routine. I'll discuss cancel routines later in this chapter—they're complicated!
  3. We return STATUS_PENDING to tell our caller that we're not done with this IRP yet.

It's very important not to touch the IRP once we call IoStartPacket. By the time that function returns, the IRP may have been completed and the memory it occupies released. The pointer we have might, therefore, now be invalid.
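
To make the sort key idea concrete, here is a small user-mode sketch of a queue that supports both tail insertion and key-ordered insertion. Everything in it is an assumption made for illustration: the ModelRequest and ModelQueue types are invented, and the choice to keep requests with equal keys in arrival order is a property of this sketch rather than a statement about the system's own queue.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ModelRequest {
    unsigned long sort_key;        /* for example, a cylinder number */
    struct ModelRequest *next;
} ModelRequest;

typedef struct {
    ModelRequest *head;
} ModelQueue;

/* Append at the tail: the behavior requested by a NULL key pointer. */
static void model_insert_tail(ModelQueue *q, ModelRequest *r)
{
    ModelRequest **link = &q->head;
    while (*link != NULL)
        link = &(*link)->next;
    r->next = NULL;
    *link = r;
}

/* Insert in ascending key order, after queued requests with the same key. */
static void model_insert_by_key(ModelQueue *q, ModelRequest *r,
                                unsigned long key)
{
    ModelRequest **link = &q->head;
    r->sort_key = key;
    while (*link != NULL && (*link)->sort_key <= key)
        link = &(*link)->next;
    r->next = *link;
    *link = r;
}

/* Remove and return the request at the head, or NULL if the queue is empty. */
static ModelRequest *model_remove_head(ModelQueue *q)
{
    ModelRequest *r = q->head;
    if (r != NULL)
        q->head = r->next;
    return r;
}
```

With keyed insertion, requests for cylinders 40, 10, and 25 come off the queue as 10, 25, 40, which is the ordered-seek effect a disk driver would be after. With tail insertion they come off in arrival order.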

The StartIo Routine

The I/O Manager calls your StartIo routine to process one IRP at a time:

VOID StartIo(PDEVICE_OBJECT device, PIRP Irp)
  {
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) device->DeviceExtension;
  ...
  }

Your StartIo routine receives control at DISPATCH_LEVEL, meaning that it must not generate any page faults. In addition, the CurrentIrp field of the device object and the Irp argument will both point to the IRP that's being submitted to you for processing.

Your job in StartIo is to commence the IRP you've been handed. How you do this depends entirely on your device. Often you will need to access hardware registers that are also used by your interrupt service routine (ISR) and, perhaps, by other routines in your driver. In fact, sometimes the easiest way to commence a new operation is to store some state information in your device extension and then fake an interrupt. Since either of these approaches needs to be carried out under protection of the same spin lock that protects your ISR, the correct way to proceed is to call KeSynchronizeExecution. For example:

VOID StartIo(...)
  {
  ...
  KeSynchronizeExecution(pdx->InterruptObject,
    TransferFirst, (PVOID) pdx);
  }

BOOLEAN TransferFirst(PVOID context)
  {
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) context;
  ...
  return TRUE;
  }

The TransferFirst routine shown here is an example of the generic class of SynchCritSection routines, so called because they are synchronized with the ISR. I'll discuss the SynchCritSection concept in more detail in Chapter 7, "Reading and Writing Data."
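
The calling pattern can be modeled in user mode to show what the callback is entitled to assume. This is a sketch only: a plain flag stands in for raising IRQL and acquiring the interrupt spin lock, and the Model* types, the bytes_left field, and the started_under_lock field are hypothetical names that exist just for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*ModelSynchRoutine)(void *context);

typedef struct {
    bool lock_held;            /* stands in for the interrupt spin lock */
} ModelInterrupt;

typedef struct {
    ModelInterrupt interrupt;
    int bytes_left;            /* hypothetical state shared with the ISR */
    bool started_under_lock;   /* did the callback run with the lock held? */
} ModelExtension;

/* Take the "lock", run the callback, release, and pass its result back. */
static bool model_synchronize(ModelInterrupt *intr,
                              ModelSynchRoutine routine, void *context)
{
    assert(!intr->lock_held);
    intr->lock_held = true;
    bool result = routine(context);
    intr->lock_held = false;
    return result;
}

/* Plays the role of TransferFirst: touches state the ISR also uses. */
static bool model_transfer_first(void *context)
{
    ModelExtension *pdx = (ModelExtension *)context;
    pdx->started_under_lock = pdx->interrupt.lock_held;
    pdx->bytes_left -= 1;
    return true;
}
```

The point of the pattern is that the shared state is modified only inside the callback, so the code that starts a transfer and the code that services the interrupt never touch it at the same time.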

Once StartIo gets the device busy handling the new request, it returns. You'll see the request next when your device interrupts to signal that it's done with whatever transfer you started.

The Interrupt Service Routine

When your device is finished transferring data, it might signal a hardware interrupt. In Chapter 7, I'll show you how to use IoConnectInterrupt to "hook" the interrupt. One of the arguments to IoConnectInterrupt is the address of your ISR. When the interrupt occurs, the hardware abstraction layer (HAL) calls your ISR. The ISR runs at the device IRQL (DIRQL) of your particular device and under the protection of a spin lock associated specifically with your ISR. The ISR has the following prototype:

BOOLEAN OnInterrupt(PKINTERRUPT InterruptObject, PVOID context)
  {
  ...
  }

The first argument of your ISR is the address of the interrupt object created by IoConnectInterrupt, but you're unlikely to use this argument. The second argument is whatever context value you specified in your original call to IoConnectInterrupt; it will probably be the address of your device object or of your device extension, depending on your preference.

I'll discuss the duties of your ISR in detail in Chapter 7 in connection with reading and writing data, the subject to which interrupt handling is most relevant. To carry on with this discussion of the standard model, I need to tell you that one of the likely things for the ISR to do is to schedule a deferred procedure call (DPC). The purpose of the DPC is to let you do things, like calling IoCompleteRequest, that can't be done at the rarefied DIRQL at which your ISR runs. So, supposing you develop a pointer named device to your device object inside the ISR, you'd have a line of code like this one:

IoRequestDpc(device, device->CurrentIrp, NULL);

You'll next see the IRP in the DPC routine you registered inside AddDevice with your call to IoInitializeDpcRequest. The traditional name for that routine is DpcForIsr because it's the DPC routine your ISR requests.

Deferred Procedure Call Routine

The DpcForIsr routine requested by your ISR receives control at DISPATCH_LEVEL. Generally, its job is to finish up the processing of the IRP that caused the most recent interrupt. Often that job entails calling IoCompleteRequest to complete this IRP and IoStartNextPacket to remove the next IRP from your device queue for forwarding to StartIo.




VOID DpcForIsr(PKDPC Dpc, PDEVICE_OBJECT device, PIRP Irp, PVOID context)
  {
  ...
  IoStartNextPacket(device, FALSE);   // 1
  IoCompleteRequest(Irp, boost);      // 2
  }

  1. IoStartNextPacket removes the next IRP from your queue and sends it to StartIo. The FALSE argument indicates that this IRP can't be cancelled in the normal way. By the time you finish this chapter, you'll know how to handle the more normal case in which you specify TRUE for the second argument.
  2. IoCompleteRequest completes the IRP you specify as the first argument. The second argument specifies a priority boost for the thread that has been waiting for this IRP. You'll also fill in the IoStatus block within the IRP before calling IoCompleteRequest, as I'll explain later in the section "Completion Mechanics."

The call to IoCompleteRequest is the end of this standard way of handling an I/O request. After that call, the I/O Manager (or whatever created the IRP in the first place) owns the IRP once more. That entity will destroy the IRP and might unblock a thread that has been waiting for the request to complete.

Custom Queues

Some devices operate in such a way that it makes sense to have more than one queue of requests. A common example is a serial port, which can handle independent streams of input and output requests simultaneously. Both IoStartPacket and IoStartNextPacket (and their key-sorting equivalents) work with a queue that you get "for free" as part of the device object. It's relatively easy to create additional queues that work the same way as the standard queue managed by those routines.

To make it easier to discuss things, let's suppose that you need a separate queue to manage IRP_MJ_SPECIAL requests. (There's no such major function code. I made it up just so that we'd have a concrete topic for the discussion.) You would write two helper functions, StartIoSpecial and DpcSpecial, that would do for these special IRPs pretty much the same thing as the StartIo and DpcForIsr routines I mentioned earlier.

You'll also create a KDEVICE_QUEUE object in your device extension. You'd initialize this object during AddDevice:

NTSTATUS AddDevice(...)
  {
  ...
  KeInitializeDeviceQueue(&pdx->dqSpecial);
  ...
  }

where dqSpecial is the name of the KDEVICE_QUEUE we'll use for IRP_MJ_SPECIAL requests. A device queue object is a three-state object. (See Figure 5-6.) These states influence how the support routines for device queues operate.

Figure 5-6. States of a KDEVICE_QUEUE queue.
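
The three states and the way the insert and remove operations move between them can be modeled in a few lines of user-mode C. This is a sketch, not the kernel's implementation: the Model* names and the state labels are my own, and the model leaves out the locking a real device queue needs. What it does reproduce is the documented contract: inserting into an idle queue marks it busy and reports that nothing was queued, and removing from an empty queue marks it idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelEntry {
    struct ModelEntry *next;
} ModelEntry;

typedef struct {
    bool busy;
    ModelEntry *head;
    ModelEntry *tail;
} ModelDeviceQueue;

typedef enum {
    MODEL_IDLE,             /* not busy, nothing queued */
    MODEL_BUSY_EMPTY,       /* a request is active, nothing queued */
    MODEL_BUSY_NOT_EMPTY    /* a request is active, others are waiting */
} ModelState;

static void model_init(ModelDeviceQueue *q)
{
    q->busy = false;
    q->head = q->tail = NULL;
}

static ModelState model_state(const ModelDeviceQueue *q)
{
    if (!q->busy)
        return MODEL_IDLE;
    return q->head ? MODEL_BUSY_NOT_EMPTY : MODEL_BUSY_EMPTY;
}

/* Returns true if the entry was queued. Returns false if the queue was
   idle: the queue becomes busy and the caller starts the request itself. */
static bool model_insert(ModelDeviceQueue *q, ModelEntry *e)
{
    if (!q->busy) {
        q->busy = true;
        return false;
    }
    e->next = NULL;
    if (q->tail)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    return true;
}

/* Returns the oldest queued entry, or NULL (and goes idle) if none waits. */
static ModelEntry *model_remove(ModelDeviceQueue *q)
{
    ModelEntry *e = q->head;
    if (e == NULL) {
        q->busy = false;
        return NULL;
    }
    q->head = e->next;
    if (q->head == NULL)
        q->tail = NULL;
    return e;
}
```

Note that the entry handed to model_insert while the queue is idle never appears on the list. That is why the caller has to start that request directly.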

We use these support routines and the special device queue in our dispatch and DPC routines, as follows:



NTSTATUS DispatchSpecial(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  IoMarkIrpPending(Irp);                                   // 1
  KIRQL oldirql;
  KeRaiseIrql(DISPATCH_LEVEL, &oldirql);                   // 2
  PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
  if (!KeInsertDeviceQueue(&pdx->dqSpecial,                // 3
    &Irp->Tail.Overlay.DeviceQueueEntry))
    StartIoSpecial(fdo, Irp);
  KeLowerIrql(oldirql);
  return STATUS_PENDING;
  }

VOID DpcSpecial(...)
  {
  ...
  PKDEVICE_QUEUE_ENTRY qep = KeRemoveDeviceQueue(&pdx->dqSpecial);   // 4
  if (qep)
    StartIoSpecial(fdo, CONTAINING_RECORD(qep, IRP,
      Tail.Overlay.DeviceQueueEntry));
  ...
  }

  1. As with a "regular" dispatch routine, we mark this IRP as pending because we're going to queue it and return STATUS_PENDING.
  2. KeInsertDeviceQueue and our own StartIoSpecial expect to be called at DISPATCH_LEVEL. Hence, we explicitly raise IRQL to that level. We'll use KeLowerIrql shortly to lower IRQL back to what it currently is (probably PASSIVE_LEVEL).
  3. This call to KeInsertDeviceQueue might add the IRP to the queue, in which case the return value will be TRUE and we won't do anything more with the IRP. If the device is currently idle, however, the return value will be FALSE and the IRP will not have been placed on the queue. We therefore call StartIoSpecial directly.
  4. This call to KeRemoveDeviceQueue from the DPC routine will have one of two results. If the queue is currently empty, the return value will be NULL and we won't do anything more about starting a new request (as there aren't any!). Otherwise, the return value will be the address of the queue linking field within the IRP. We use CONTAINING_RECORD to recover the address of the IRP, which we then pass to StartIoSpecial. Note that this DPC routine is already running at DISPATCH_LEVEL, so we don't need to adjust IRQL before removing an entry from the queue or calling the StartIo routine.
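
If CONTAINING_RECORD is unfamiliar, the following user-mode sketch shows the pointer arithmetic behind it. The macro here is a stand-in written with offsetof, and ModelIrp is a hypothetical structure that only imitates the nesting of the linking field. Neither is the DDK definition.

```c
#include <assert.h>
#include <stddef.h>

/* Step back from the address of a field to the start of its structure. */
#define MODEL_CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

typedef struct ModelQueueEntry {
    struct ModelQueueEntry *next;
} ModelQueueEntry;

/* Hypothetical request structure with a nested queue-linking field. */
typedef struct {
    int request_id;
    struct {
        struct {
            ModelQueueEntry DeviceQueueEntry;
        } Overlay;
    } Tail;
} ModelIrp;

/* Recover the request from a pointer to the linking field inside it. */
static ModelIrp *model_irp_from_entry(ModelQueueEntry *qep)
{
    return MODEL_CONTAINING_RECORD(qep, ModelIrp,
                                   Tail.Overlay.DeviceQueueEntry);
}
```

The queue only ever sees pointers to the embedded linking field, so no separate allocation is needed to put a request on a queue. The cost is this one subtraction when an entry comes back off.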

It's no coincidence that my earlier descriptions of IoStartPacket and IoStartNextPacket sound so similar to what I've just described. Those functions work with a KDEVICE_QUEUE object named DeviceQueue that's one of the opaque fields of a device object, and their logic is the same as your logic when you manage your own device queue.