Performing power management tasks correctly requires very accurate coding, and there are many complicating factors. For example, your device might have the ability to wake up the system from a sleeping state. Deciding whether to succeed or fail a query, and deciding which device power state corresponds to a given new system power state, depends on whether your wake-up feature is currently armed. You may have powered down your own device because of inactivity, and you need to provide for restoring power when a substantive IRP comes along. Maybe your device is an "inrush" device that needs a large spike of current to power on, in which case the Power Manager treats you specially. And so on.
When I thought about solving all the problems of handling query-power and set-power operations in a traditional way—that is, with normal-looking dispatch and completion routines—I was daunted by the sheer number of different subroutines that would be required and that would end up doing fairly similar things. I therefore decided to build my power support around a finite state machine that could easily deal with the asynchronous nature of the activities.
I'll explain this finite state machine as it appears in GENERIC.SYS, which is a support driver that most of the code samples on the companion disc use. Appendix B, "Using GENERIC.SYS," explains the client interface to GENERIC.SYS in complete detail. GENERIC.SYS amounts to a kernel-mode DLL containing helper functions for WDM drivers. You could think of it as a generic class driver with broad applicability. Client drivers, including most of my own sample drivers, delegate handling of power IRPs to GENERIC.SYS by calling GenericDispatchPower. GENERIC.SYS also implements the DEVQUEUE object I discussed in Chapter 6, "Plug and Play."
I wrote a function named HandlePowerEvent to implement the finite state machine that manages power IRPs. I call this function with two arguments:
    NTSTATUS HandlePowerEvent(PPOWCONTEXT ctx, enum POWEVENT event);
The first argument is a context structure that contains a state variable, among other things:
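    typedef struct _POWCONTEXT {
        LONG id;                      // for debugging
        LONG eventcount;              // for debugging
        PGENERIC_EXTENSION pdx;       // GENERIC's portion of the device extension
        PIRP irp;                     // power IRP the machine is working on
        enum POWSTATE state;          // state variable for the machine
        NTSTATUS status;              // ending status of the IRP
        PKEVENT pev;                  // event used to await completion, if any
        DEVICE_POWER_STATE devstate;  // device power state for a device IRP
        UCHAR MinorFunction;          // minor function code for that IRP
        BOOLEAN UnstallQueue;         // unstall the IRP queue when finished?
    } POWCONTEXT, *PPOWCONTEXT;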
The id and eventcount fields are for debugging. If you compile POWER.CPP in the GENERIC project with the preprocessor macro VERBOSETRACE defined as a nonzero value, the POWTRACE macro will produce volumes of trace messages. I used this feature to debug the finite state machine. The prebuilt version of GENERIC.SYS on the companion disc was built without VERBOSETRACE to cut down on the sheer number of trace messages you'd be confronted with when you began to try out my samples.
The pdx member points to GENERIC's portion of the device extension for a given device. There are just a couple of members in the device extension that are important for power management, and I'll mention them later in "Initial Handling for a New IRP." The irp member points to the power IRP that the finite state machine is currently working on; state is the state variable for the machine. The status member is the ending status of an IRP. In some situations, we want to wait while HandlePowerEvent originates and completes a device power IRP; we use the event pointed to by pev to await completion in those situations. The devstate member holds the device power state we want to use in a device IRP, and MinorFunction holds the minor function code (IRP_MN_QUERY_POWER or IRP_MN_SET_POWER) we want to use in that IRP. Finally, UnstallQueue indicates whether we want the state machine to unstall the substantive IRP queue when it finishes handling the current power IRP.
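The second argument to HandlePowerEvent is an event code that indicates why we're calling the function. There are just three event codes: NewIrp, which the dispatch routine signals when a new query-power or set-power IRP arrives; MainIrpComplete, which MainCompletionRoutine signals when the lower drivers complete the power IRP we forwarded; and AsyncNotify, which signals the end of some other asynchronous activity, such as the completion of a device IRP we requested or a call to GenericSaveRestoreComplete.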
HandlePowerEvent uses the value of the state variable and the event code to determine an action to take. See Table 8-3. (In the table, by the way, an empty cell denotes an impossible situation that leads to an ASSERT failure in the checked build of GENERIC.SYS.) An action corresponds to a series of program steps that advance the power IRP along its processing path.
Table 8-3. Table giving initial action for each event and state.
State | Event: NewIrp | Event: MainIrpComplete | Event: AsyncNotify |
---|---|---|---|
InitialState | TriageNewIrp | | |
SysPowerUpPending | | SysPowerUpComplete | |
SubPowerUpPending | | | SubPowerUpComplete |
SubPowerDownPending | | | SubPowerDownComplete |
SysPowerDownPending | | SysPowerDownComplete | |
DevPowerUpPending | | DevPowerUpComplete | |
DevPowerDownPending | | CompleteMainIrp | |
ContextSavePending | | | ContextSaveComplete |
ContextRestorePending | | | ContextRestoreComplete |
DevQueryUpPending | | DevQueryUpComplete | |
DevQueryDownPending | | DevQueryDownComplete | |
QueueStallPending | | | QueueStallComplete |
FinalState | | | |
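A table-driven lookup is one natural way to encode Table 8-3. The following user-mode sketch is not the GENERIC.SYS source: the enumerator values, the InvalidAction marker, and the LookupInitialAction helper are assumptions made for illustration. The column chosen for the DevPowerDownPending and DevQueryDownPending rows is also an assumption, based on the fact that both states follow a forwarded main IRP.

```cpp
#include <assert.h>

/* Event codes: the three columns of Table 8-3. */
typedef enum { NewIrp, MainIrpComplete, AsyncNotify, NUMPOWEVENTS } POWEVENT;

/* State codes: the rows of Table 8-3. */
typedef enum {
    InitialState, SysPowerUpPending, SubPowerUpPending, SubPowerDownPending,
    SysPowerDownPending, DevPowerUpPending, DevPowerDownPending,
    ContextSavePending, ContextRestorePending, DevQueryUpPending,
    DevQueryDownPending, QueueStallPending, FinalState, NUMPOWSTATES
} POWSTATE;

/* Initial actions. InvalidAction is a hypothetical marker for an empty cell. */
typedef enum {
    InvalidAction, TriageNewIrp, SysPowerUpComplete, SubPowerUpComplete,
    SubPowerDownComplete, SysPowerDownComplete, DevPowerUpComplete,
    CompleteMainIrp, ContextSaveComplete, ContextRestoreComplete,
    DevQueryUpComplete, DevQueryDownComplete, QueueStallComplete
} POWACTION;

#define NA InvalidAction

/* One row per state, in the same order as the POWSTATE enumeration. */
static const POWACTION actiontable[NUMPOWSTATES][NUMPOWEVENTS] = {
    /*                            NewIrp        MainIrpComplete       AsyncNotify */
    /* InitialState          */ { TriageNewIrp, NA,                   NA },
    /* SysPowerUpPending     */ { NA,           SysPowerUpComplete,   NA },
    /* SubPowerUpPending     */ { NA,           NA,                   SubPowerUpComplete },
    /* SubPowerDownPending   */ { NA,           NA,                   SubPowerDownComplete },
    /* SysPowerDownPending   */ { NA,           SysPowerDownComplete, NA },
    /* DevPowerUpPending     */ { NA,           DevPowerUpComplete,   NA },
    /* DevPowerDownPending   */ { NA,           CompleteMainIrp,      NA },
    /* ContextSavePending    */ { NA,           NA,                   ContextSaveComplete },
    /* ContextRestorePending */ { NA,           NA,                   ContextRestoreComplete },
    /* DevQueryUpPending     */ { NA,           DevQueryUpComplete,   NA },
    /* DevQueryDownPending   */ { NA,           DevQueryDownComplete, NA },
    /* QueueStallPending     */ { NA,           NA,                   QueueStallComplete },
    /* FinalState            */ { NA,           NA,                   NA },
};

/* Return the first action for a state and event, or InvalidAction for an
   empty cell (the case in which the checked build would ASSERT). */
POWACTION LookupInitialAction(POWSTATE state, POWEVENT event)
{
    assert(state < NUMPOWSTATES && event < NUMPOWEVENTS);
    return actiontable[state][event];
}
```

With a table like this, the first line of the state machine reduces to a single array lookup, and an empty cell is easy to detect in one place.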
Since many of the events require multiple actions in some situations, I coded HandlePowerEvent in what may seem at first like a peculiar way, as follows:
    NTSTATUS HandlePowerEvent(...)
    {
        NTSTATUS status;
        POWACTION action = ...;
        while (TRUE)
        {
            switch (action)
            {
            case <someaction>:
                action = <someotheraction>;
                continue;       // repeat the loop to perform the next action
            case <anotheraction>:
                break;          // leave the switch, then the loop
            }
            break;
        }
        return status;
    }
That is, the function amounts to a switch on the action code embedded within an infinite loop. An action case that performs a continue statement repeats the loop; this is how I string together a series of actions during one call to the function. An action case that performs a break from the switch reaches another break statement that exits from the loop, whereupon the function returns.
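To see the control flow in isolation, here is a small user-mode sketch of the same loop-around-a-switch idiom. The action names (FirstStep, SecondStep, WaitForCallback) and the RunActions function are hypothetical. The use of -1 as a "status not yet set" marker mirrors the convention the state machine's own action routines test for.

```cpp
#include <assert.h>

/* Hypothetical actions, used only to illustrate the chaining idiom. */
typedef enum { FirstStep, SecondStep, WaitForCallback } TOYACTION;

/* Perform actions starting with `action` until one of them breaks out of
   the loop. Returns the number of actions performed during this call. */
int RunActions(TOYACTION action)
{
    int performed = 0;
    int status = -1;            /* -1 means nobody has set status yet */

    while (1) {
        ++performed;
        switch (action) {
        case FirstStep:
            action = SecondStep;
            continue;           /* chain to the next action */
        case SecondStep:
            action = WaitForCallback;
            continue;           /* chain again */
        case WaitForCallback:
            status = 0;         /* the last action sets the status */
            break;              /* leave the switch... */
        }
        break;                  /* ...and then the loop */
    }

    assert(status != -1);       /* single exit point: easy to check rules */
    return performed;
}
```

Starting from FirstStep, one call performs three actions in a row; starting from WaitForCallback, it performs just one. Either way there is a single place at the bottom of the function to check invariants.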
I adopted this coding style for the state machine because I really took to heart the structured programming precepts I learned in my youth. I wanted there to be just one return statement in this whole function to make it easier to prove that the function worked correctly. To aid in the proof, I developed a couple of rules for myself that I could test either by inspection or with ASSERT statements at the end of the function. Here are the rules:
When we receive a new query-power or set-power IRP, we create a context structure to drive the finite state machine and call HandlePowerEvent:
    NTSTATUS GenericDispatchPower(PGENERIC_EXTENSION pdx, PIRP Irp)
    {
        NTSTATUS status = IoAcquireRemoveLock(pdx->RemoveLock, Irp);
        if (!NT_SUCCESS(status))
            return CompleteRequest(Irp, status);
        PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
        ULONG fcn = stack->MinorFunction;
        if (fcn == IRP_MN_SET_POWER || fcn == IRP_MN_QUERY_POWER)
        {
            PPOWCONTEXT ctx = (PPOWCONTEXT)
                ExAllocatePool(NonPagedPool, sizeof(POWCONTEXT));
            if (!ctx)
            {   // out of memory: fail the IRP rather than dereference NULL
                PoStartNextPowerIrp(Irp);
                status = CompleteRequest(Irp, STATUS_INSUFFICIENT_RESOURCES);
            }
            else
            {
                RtlZeroMemory(ctx, sizeof(POWCONTEXT));
                ctx->pdx = pdx;
                ctx->irp = Irp;
                status = HandlePowerEvent(ctx, NewIrp);
            }
        }
        IoReleaseRemoveLock(pdx->RemoveLock, Irp);
        return status;
    }
The initial state of the finite state machine is InitialState. When we call HandlePowerEvent for the NewIrp event, the first action taken will be the following, which I named TriageNewIrp:
    case TriageNewIrp:
    {
        status = STATUS_PENDING;
        IoMarkIrpPending(Irp);
        IoAcquireRemoveLock(pdx->RemoveLock, Irp);
        if (stack->Parameters.Power.Type == SystemPowerState)
        {   // system IRP
            if (stack->Parameters.Power.State.SystemState < pdx->syspower)
            {
                action = ForwardMainIrp;
                ctx->state = SysPowerUpPending;
            }
            else
            {
                action = SelectDState;
                ctx->state = SubPowerDownPending;
            }
        }   // system IRP
        else
        {   // device IRP
            ctx->state = QueueStallPending;
            if (!pdx->StalledForPower)
            {
                ctx->UnstallQueue = TRUE;
                pdx->StalledForPower = TRUE;
                NTSTATUS qstatus = StallRequestsAndNotify(pdx->dqReadWrite,
                    GenericSaveRestoreComplete, ctx);
                if (qstatus == STATUS_PENDING)
                    break;
            }
            action = QueueStallComplete;
        }   // device IRP
        continue;
    }
Basically, TriageNewIrp is distinguishing between system power IRPs (that is, IRPs whose Type is SystemPowerState) that increase the power level, system power IRPs that leave the power level alone or reduce it, and device power IRPs (that is, IRPs whose Type is DevicePowerState), regardless of whether they raise or lower the power level. The state machine doesn't distinguish at this stage between QUERY_POWER and SET_POWER requests, so they end up being treated very similarly up to a point.
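The three-way split can be expressed as a small pure function. This is a user-mode sketch only: the two enumerations are stand-ins that follow the ordering of the real DDK types (a numerically lower state means more power), and IRPCATEGORY and ClassifyPowerIrp are hypothetical names that don't appear in GENERIC.SYS.

```cpp
/* Stand-ins for the DDK enumerations; a lower number means more power. */
typedef enum { SystemPowerState, DevicePowerState } POWER_STATE_TYPE;

typedef enum {
    PowerSystemUnspecified, PowerSystemWorking, PowerSystemSleeping1,
    PowerSystemSleeping2, PowerSystemSleeping3, PowerSystemHibernate,
    PowerSystemShutdown, PowerSystemMaximum
} SYSTEM_POWER_STATE;

/* Hypothetical names for the three categories TriageNewIrp distinguishes. */
typedef enum { SystemPowerUp, SystemPowerSameOrDown, DevicePowerIrp } IRPCATEGORY;

/* Classify a power IRP the way TriageNewIrp does. For a device IRP the
   direction of the change is sorted out later, so `requested` is ignored. */
IRPCATEGORY ClassifyPowerIrp(POWER_STATE_TYPE type,
                             SYSTEM_POWER_STATE requested,
                             SYSTEM_POWER_STATE current)
{
    if (type != SystemPowerState)
        return DevicePowerIrp;
    return requested < current ? SystemPowerUp : SystemPowerSameOrDown;
}
```

Note that a system IRP requesting the state the system is already in lands in the same category as a power reduction, just as in the else branch of TriageNewIrp.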
For us to know whether power is rising or falling, our device extension needs two variables for keeping track of system power and device power states:
    typedef struct _GENERIC_EXTENSION {
        ...
        DEVICE_POWER_STATE devpower;    // current dev power state
        SYSTEM_POWER_STATE syspower;    // current sys power state
    } GENERIC_EXTENSION, *PGENERIC_EXTENSION;
We initialize these values to PowerDeviceD0 and PowerSystemWorking, respectively, when the client driver first registers with GENERIC.SYS.
You can guess from context that the device extension also has a BOOLEAN member named StalledForPower. This flag, when set, indicates that the substantive IRP queue is presently stalled for purposes of power management. Incidentally, you'll notice (if you've got the right sort of nasty and suspicious mind to be doing device driver programming, that is) that I'm not explicitly synchronizing access to the power state fields or this flag. No additional synchronization is required beyond the serialization that the Power Manager already imposes.
I'll discuss the three initial categories of IRPs separately now.
If a system power IRP implies an increase in the system power level, you'll forward it immediately to the next lower driver. In your completion routine for the system power IRP, you'll request the corresponding device power IRP and return STATUS_MORE_PROCESSING_REQUIRED to temporarily halt the completion process. In a completion routine for the device power IRP, you'll finish the completion processing for the system power IRP. Figure 8-5 diagrams the flow of the IRP through all of the drivers. Figure 8-6 is a state diagram that shows how our finite state machine handles the IRP.
Figure 8-5. IRP flow when increasing system power.
Figure 8-6. State transitions when increasing system power.
In terms of how the code works, I showed you earlier that TriageNewIrp puts the machine into the SysPowerUpPending state and requests the ForwardMainIrp action, which is as follows:
    case ForwardMainIrp:
    {
        IoCopyCurrentIrpStackLocationToNext(Irp);
        IoSetCompletionRoutine(Irp,
            (PIO_COMPLETION_ROUTINE) MainCompletionRoutine,
            (PVOID) ctx, TRUE, TRUE, TRUE);
        PoCallDriver(pdx->LowerDeviceObject, Irp);
        break;
    }
HandlePowerEvent will now return STATUS_PENDING, as mandated by the code we already saw in TriageNewIrp. This return value percolates back out through GenericDispatchPower and, presumably, the client driver's IRP_MJ_POWER dispatch function.
Our next contact with this IRP is when the bus driver completes it. Our own MainCompletionRoutine gets control as part of the completion process, saves the IRP's ending status in the context structure's status field, and invokes the finite state machine:
    NTSTATUS MainCompletionRoutine(PDEVICE_OBJECT junk, PIRP Irp,
        PPOWCONTEXT ctx)
    {
        ctx->status = Irp->IoStatus.Status;
        return HandlePowerEvent(ctx, MainIrpComplete);
    }
Our initial action will be SysPowerUpComplete:
    case SysPowerUpComplete:
    {
        if (!NT_SUCCESS(ctx->status))
            action = CompleteMainIrp;
        else
        {
            if (stack->MinorFunction == IRP_MN_SET_POWER)
                pdx->syspower = stack->Parameters.Power.State.SystemState;
            action = SelectDState;
            ctx->state = SubPowerUpPending;
            status = STATUS_MORE_PROCESSING_REQUIRED;
        }
        continue;
    }
If the IRP failed, you can see that we'll do the CompleteMainIrp action next:
    case CompleteMainIrp:
    {
        PoStartNextPowerIrp(Irp);
        if (event == MainIrpComplete)
            status = ctx->status;       // let the completion process continue
        else
        {
            Irp->IoStatus.Status = ctx->status;
            IoCompleteRequest(Irp, IO_NO_INCREMENT);
        }
        IoReleaseRemoveLock(pdx->RemoveLock, Irp);
        if (ctx->UnstallQueue)
        {
            pdx->StalledForPower = FALSE;
            RestartRequests(pdx->dqReadWrite, pdx->DeviceObject);
        }
        action = DestroyContext;
        continue;
    }
When handling a system power IRP that increases power, the machine enters CompleteMainIrp after a MainIrpComplete event. CompleteMainIrp will therefore arrange to return the error status we originally fetched (inside MainCompletionRoutine) from the IRP. That will permit the completion process to continue. There are other code paths we haven't studied yet in which CompleteMainIrp calls IoCompleteRequest instead. CompleteMainIrp finishes by requesting yet another action:
    case DestroyContext:
    {
        if (ctx->pev)
            KeSetEvent(ctx->pev, IO_NO_INCREMENT, FALSE);
        else
            ExFreePool(ctx);
        break;
    }
DestroyContext is, of course, the last action the finite state machine ever performs.
The other possible path out of SysPowerUpComplete generates a device power IRP with a power state that corresponds to the system power state. We perform the mapping of system to device states in the SelectDState action:
    case SelectDState:
    {
        SYSTEM_POWER_STATE sysstate =
            stack->Parameters.Power.State.SystemState;
        if (sysstate == PowerSystemWorking)
            ctx->devstate = PowerDeviceD0;
        else
        {
            // highest-power device state allowed for this system state
            DEVICE_POWER_STATE maxstate = pdx->devcaps.DeviceState[sysstate];
            // lowest-power device state we're willing to enter
            DEVICE_POWER_STATE minstate = pdx->WakeupEnabled
                ? pdx->devcaps.DeviceWake : PowerDeviceD3;
            ctx->devstate = minstate > maxstate ? minstate : maxstate;
        }
        ctx->MinorFunction = stack->MinorFunction;
        action = SendDeviceIrp;
        continue;
    }
By the way, the Power Manager never transitions directly from one low system power state to another: it always moves via PowerSystemWorking. That's why I coded SelectDState to choose one mapping for PowerSystemWorking and a different mapping for all other system power states.
In general, we always want to put our device into the lowest power state that's consistent with current device activity, with our own wake-up feature (if any), with device capabilities, and with the impending state of the system. These factors can interplay in a relatively complex way. To explain them fully, I need to digress briefly and talk about a Plug and Play IRP that I avoided discussing in Chapter 6: IRP_MN_QUERY_CAPABILITIES.
The PnP Manager sends a capabilities query shortly after starting your device and perhaps at other times. The parameter for the request is a DEVICE_CAPABILITIES structure that contains several fields relevant to power management. Since this is the only time in this book I'm going to discuss this structure, I'm showing you the entire declaration:
    typedef struct _DEVICE_CAPABILITIES {
        USHORT Size;
        USHORT Version;
        ULONG DeviceD1:1;
        ULONG DeviceD2:1;
        ULONG LockSupported:1;
        ULONG EjectSupported:1;
        ULONG Removable:1;
        ULONG DockDevice:1;
        ULONG UniqueID:1;
        ULONG SilentInstall:1;
        ULONG RawDeviceOK:1;
        ULONG SurpriseRemovalOK:1;
        ULONG WakeFromD0:1;
        ULONG WakeFromD1:1;
        ULONG WakeFromD2:1;
        ULONG WakeFromD3:1;
        ULONG HardwareDisabled:1;
        ULONG NonDynamic:1;
        ULONG Reserved:16;
        ULONG Address;
        ULONG UINumber;
        DEVICE_POWER_STATE DeviceState[PowerSystemMaximum];
        SYSTEM_POWER_STATE SystemWake;
        DEVICE_POWER_STATE DeviceWake;
        ULONG D1Latency;
        ULONG D2Latency;
        ULONG D3Latency;
    } DEVICE_CAPABILITIES, *PDEVICE_CAPABILITIES;
Table 8-4 describes the fields in this structure that relate to power management.
Table 8-4. Power-management fields in DEVICE_CAPABILITIES structure.
Field | Description |
---|---|
DeviceState | Array of highest device states possible for each system state |
SystemWake | Lowest system power state from which the device can generate a wake-up signal for the system—PowerSystemUnspecified indicates that device can't wake up the system |
DeviceWake | Lowest power state from which the device can generate a wake-up signal—PowerDeviceUnspecified indicates that device can't generate a wake-up signal |
D1Latency | Approximate worst-case time (in 100-microsecond units) required for device to switch from D1 to D0 states |
D2Latency | Approximate worst-case time (in 100-microsecond units) required for device to switch from D2 to D0 states |
D3Latency | Approximate worst-case time (in 100-microsecond units) required for device to switch from D3 to D0 states |
WakeFromD0 | Flag indicating whether device's system wake-up feature is operative when the device is in the indicated state |
WakeFromD1 | Same as above |
WakeFromD2 | Same as above |
WakeFromD3 | Same as above |
You normally handle the query capabilities IRP synchronously by passing it down and waiting for the lower layers to complete it. After the pass-down, you'll make any desired changes to the capabilities recorded by the bus driver. Your subdispatch routine would look like this one:
    NTSTATUS HandleQueryCapabilities(IN PDEVICE_OBJECT fdo, IN PIRP Irp)
    {
        PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
        PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
        PDEVICE_CAPABILITIES pdc =
            stack->Parameters.DeviceCapabilities.Capabilities;
        if (pdc->Version < 1)
            return DefaultPnpHandler(fdo, Irp);
        NTSTATUS status = ForwardAndWait(fdo, Irp);
        if (NT_SUCCESS(status))
        {
            stack = IoGetCurrentIrpStackLocation(Irp);
            pdc = stack->Parameters.DeviceCapabilities.Capabilities;
            <stuff>                 // change capabilities set by the bus driver
            pdx->devcaps = *pdc;    // save a copy for power management
        }
        return CompleteRequest(Irp, status);
    }
Don't bother altering the capabilities structure before you pass this IRP down: the bus driver will completely reinitialize it. When you regain control, you can modify SystemWake and DeviceWake to specify a higher power state than the bus driver thought was appropriate. You can't specify a lower power state for the wake-up fields, and you can't override the bus driver's decision that your device is incapable of waking the system. If your device is ACPI-compliant, the ACPI filter will set the LockSupported, EjectSupported, and Removable flags automatically based on the ACPI Source Language (ASL) description of the device, so you won't need to worry about these capabilities.
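The rule for the wake-up fields (raise, never lower, and never override "can't wake") can be captured in a small helper. This is a sketch under stated assumptions: AdjustDeviceWake is a hypothetical function, the enumeration is a user-mode stand-in that follows the DDK ordering (a lower number means more power), and treating an unspecified request as "no preference" is a choice made for this example.

```cpp
/* Stand-in for the DDK enumeration; a lower number means more power. */
typedef enum {
    PowerDeviceUnspecified, PowerDeviceD0, PowerDeviceD1,
    PowerDeviceD2, PowerDeviceD3, PowerDeviceMaximum
} DEVICE_POWER_STATE;

/* Hypothetical helper: combine the DeviceWake value reported by the bus
   driver with the value the function driver would like to report. */
DEVICE_POWER_STATE AdjustDeviceWake(DEVICE_POWER_STATE fromBus,
                                    DEVICE_POWER_STATE wanted)
{
    if (fromBus == PowerDeviceUnspecified)
        return fromBus;     /* bus driver says no wake-up: can't override */
    if (wanted == PowerDeviceUnspecified)
        return fromBus;     /* no preference (an assumption of this sketch) */
    /* Only a higher power state (smaller number) may replace the bus value. */
    return wanted < fromBus ? wanted : fromBus;
}
```

For example, if the bus driver reports D2 and the function driver knows wake-up really only works from D1, the result is D1; a request for D3 in the same situation is ignored.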
You might want to set the SurpriseRemovalOK flag in the capabilities handler, at the place where the listing shows <stuff>. Setting the flag suppresses the dialog box that Windows 2000 normally presents when it detects the sudden and unexpected removal of a device. It's normally okay for the end user to remove a universal serial bus (USB) or 1394 device without first telling the system, and the function driver should set this flag to avoid annoying the user.
To return to our discussion of SelectDState, suppose we're dealing with a set-power request that will take the computer from Working to Sleeping1; we'll therefore execute the second branch of the if statement in SelectDState. Let's suppose that the bus driver knows that our device can be in any of the states D0, D1, D2, or D3 when the system is in Sleeping1. When it answered the PnP capabilities query, it would therefore have filled in DeviceState[PowerSystemSleeping1] in the device capabilities structure with the value PowerDeviceD0 because D0 is the highest power state our device can occupy for this system state. We'll initially record PowerDeviceD0, then, as the value of maxstate.
Our device might also have a wake-up feature. I'll say more about wake-up later on. If so, the bus driver will have set the DeviceWake member of the capabilities structure equal to the lowest power state from which wake-up can occur. Let's suppose that value is PowerDeviceD1. If our wake-up feature happens to be enabled right now, we'll set minstate to PowerDeviceD1.
If we don't have a wake-up feature, however, or if we have one and it's not currently enabled, we're free to choose any device power state lower than the maxstate value we derived from the device capabilities structure. We could blindly choose D3, but that wouldn't be right for every type of device because generally speaking it takes longer to resume from D3 to D0 than from D2 or D1. The choice you make in this case therefore depends on factors for which I can't give you cut-and-dried guidance. If your device is capable of the D2 state, for example, you might decide to enter D2 for any of the system sleeping states and reserve D3 for the hibernate and shutdown states.
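The mapping performed by SelectDState can be restated as a pure function, which makes the worked example easy to check. The enumerations below are user-mode stand-ins that follow the DDK ordering (a lower number means more power), and SelectDeviceState with its parameter list is a hypothetical rearrangement of the logic, not a function in GENERIC.SYS.

```cpp
/* Stand-ins for the DDK enumerations; a lower number means more power. */
typedef enum {
    PowerDeviceUnspecified, PowerDeviceD0, PowerDeviceD1,
    PowerDeviceD2, PowerDeviceD3, PowerDeviceMaximum
} DEVICE_POWER_STATE;

typedef enum {
    PowerSystemUnspecified, PowerSystemWorking, PowerSystemSleeping1,
    PowerSystemSleeping2, PowerSystemSleeping3, PowerSystemHibernate,
    PowerSystemShutdown, PowerSystemMaximum
} SYSTEM_POWER_STATE;

/* sysstate:      system state named in the system power IRP
   maxstate:      DeviceState[sysstate] from the capabilities structure
   devicewake:    DeviceWake from the capabilities structure
   wakeupEnabled: nonzero if the wake-up feature is currently armed */
DEVICE_POWER_STATE SelectDeviceState(SYSTEM_POWER_STATE sysstate,
                                     DEVICE_POWER_STATE maxstate,
                                     DEVICE_POWER_STATE devicewake,
                                     int wakeupEnabled)
{
    if (sysstate == PowerSystemWorking)
        return PowerDeviceD0;

    /* Lowest-power state we're willing to enter. */
    DEVICE_POWER_STATE minstate = wakeupEnabled ? devicewake : PowerDeviceD3;

    /* The numerically larger value is the lower-power state. */
    return minstate > maxstate ? minstate : maxstate;
}
```

For the example in the text (Sleeping1, with DeviceState[PowerSystemSleeping1] equal to PowerDeviceD0 and DeviceWake equal to PowerDeviceD1), the function yields PowerDeviceD1 when wake-up is enabled and PowerDeviceD3 when it isn't. A driver that prefers D2 for the sleeping states would replace the PowerDeviceD3 default with its own policy.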
It seems reasonable to leave your device in a low power state when the system resumes from a sleeping state. The DDK suggests you do this, and so does good sense. There are two situations in which you would need to restore your device to D0 when the system goes to Working. The first situation is when your device has the INRUSH characteristic. In this case, the Power Manager won't send power IRPs to any other INRUSH device until you've powered on your device. The second situation is when you've got substantive IRPs queued and waiting to run once power is back. Notwithstanding what a good idea it seems to be to just leave your device in a low power state, you'll notice that the code fragment I just showed you for SelectDState unconditionally picks the D0 state. In my testing, Windows 2000 seemed to hang coming out of standby if I didn't do that. Maybe there's a mistake in my code or in the operating system. Stay tuned to my errata page for more information about this.
In Chapter 5, "The I/O Request Packet," I discussed support functions such as IoAllocateIrp that you can use to build IRPs. You don't use those functions when you want to create power IRPs, though. (Actually, you would use one of those functions for an IRP_MN_POWER_SEQUENCE request, but not for the other IRP_MJ_POWER requests.) Instead, you use PoRequestPowerIrp, as shown here in the code for the SendDeviceIrp action we'd perform after SelectDState:
    case SendDeviceIrp:
    {
        if (win98 && ctx->devstate == pdx->devpower)
        {
            ctx->status = STATUS_SUCCESS;
            action = actiontable[ctx->state][AsyncNotify];
            continue;
        }
        POWER_STATE powstate;
        powstate.DeviceState = ctx->devstate;
        NTSTATUS postatus = PoRequestPowerIrp(pdx->Pdo, ctx->MinorFunction,
            powstate, (PREQUEST_POWER_COMPLETE) PoCompletionRoutine,
            ctx, NULL);
        if (NT_SUCCESS(postatus))
            break;
        action = CompleteMainIrp;
        ctx->status = postatus;
        continue;
    }
In the system power-up scenario I'm currently discussing, our state machine will be in the SubPowerUpPending state when we get to SendDeviceIrp. The status variable will be STATUS_MORE_PROCESSING_REQUIRED, which is the right value for MainCompletionRoutine to return if we're going to wait for the device IRP to finish. Normally, then, when we break from SendDeviceIrp, we'll interrupt the completion processing for the system power IRP for the time being.
I'll discuss what happens to the device IRP we request via PoRequestPowerIrp later on.
Eventually, the device IRP that SendDeviceIrp requests will finish, whereupon the Power Manager will call the PoCompletionRoutine callback routine. It in turn calls HandlePowerEvent with the event code AsyncNotify. Our first action in the SubPowerUpPending state will be SubPowerUpComplete:
    case SubPowerUpComplete:
    {
        if (status == -1)
            status = STATUS_SUCCESS;
        action = CompleteMainIrp;
        continue;
    }
The only job performed by this action routine is to alter the status variable. The reason we do that is that we have an ASSERT statement at the end of HandlePowerEvent to make sure someone changes status. In this exact scenario, it doesn't matter what status value we return because PoCompletionRoutine is a void function. But you don't want to trigger an ASSERT and a BSOD unless something is really wrong.
The next action after SubPowerUpComplete is CompleteMainIrp, which leads to DestroyContext. You've already seen what those action routines do.
If the system power IRP implies no change or a reduction in the system power level, you'll request a device power IRP with the same minor function code (set or query) and a device power state that corresponds to the system state. When the device power IRP completes, you'll forward the system power IRP to the next lower driver. You'll need a completion routine for the system power IRP so that you can make the requisite call to PoStartNextPowerIrp and so that you can perform some additional cleanup. See Figure 8-7 for an illustration of how the IRPs flow through the system in this case.
Figure 8-7. IRP flow when decreasing system power.
Figure 8-8 diagrams how our finite state machine handles this type of IRP. TriageNewIrp puts the state machine into the SubPowerDownPending state and jumps to the SelectDState action. You already saw that SelectDState selects a device power state and leads to a SendDeviceIrp action to request a device power IRP. In the system power-down scenario, we'll be specifying a lower power state in this device IRP.
Figure 8-8. State transitions when decreasing system power.
When the device IRP finishes, we execute SubPowerDownComplete:
    case SubPowerDownComplete:
    {
        if (status == -1)
            status = STATUS_SUCCESS;
        if (NT_SUCCESS(ctx->status))
        {
            ctx->state = SysPowerDownPending;
            action = ForwardMainIrp;
        }
        else
            action = CompleteMainIrp;
        continue;
    }
As you can see, if the device IRP fails, we fail the system IRP too. If the device IRP succeeds, we enter the SysPowerDownPending state and exit via ForwardMainIrp. When the system IRP finishes, and MainCompletionRoutine runs, we'll execute SysPowerDownComplete:
    case SysPowerDownComplete:
    {
        if (stack->MinorFunction == IRP_MN_SET_POWER)
            pdx->syspower = stack->Parameters.Power.State.SystemState;
        action = CompleteMainIrp;
        continue;
    }
The only purpose of this action is to record the new system power state in our device extension and then to exit via CompleteMainIrp and DestroyContext.
All we actually do with system power IRPs is act as a conduit for them and request a device IRP either as the system IRP travels down the driver stack or as it travels back up. We have more work to do with device power IRPs, however.
To begin with, we don't want our device occupied by any substantive I/O operations while a change in the device power state is under way. As early as we can in a sequence that leads to powering down our device, therefore, we wait for any outstanding operation to finish, and we stop processing new operations. Since we're not allowed to block the system thread in which we receive power IRPs, an asynchronous mechanism is required. Once the current IRP finishes, we'll continue processing the device IRP.
If the device power IRP implies an increase in the device power level, we'll forward it to the next lower driver. Refer to Figure 8-9 for an illustration of how the IRP flows through the system. The bus driver will process a device set-power IRP by, for example, using whatever bus-specific mechanism is appropriate to turn on the flow of electrons to your device, and it will complete the IRP. Your completion routine will initiate whatever operations are required to restore context information to the device, and it will return STATUS_MORE_PROCESSING_REQUIRED to interrupt the completion process for the device IRP. When the context restore operation finishes, you'll resume processing substantive IRPs and finish completing the device IRP.
Figure 8-9. IRP flow when increasing device power.
If the device power IRP implies no change or a reduction in the device power level, you perform any device-specific processing (asynchronously, as we've discussed) and then forward the device IRP to the next lower driver. See Figure 8-10. The "device-specific processing" for a set operation includes saving device context information, if any, in memory so that you can restore it later. There probably isn't any device-specific processing for a query operation beyond deciding whether to succeed or fail the query. The bus driver completes the request. In the case of a query operation, you can expect the bus driver to complete the request with STATUS_SUCCESS to indicate acquiescence in the proposed power change. In the case of a set operation, you can expect the bus driver to take whatever bus-dependent steps are required to put your device into the specified device power state. Your completion routine cleans up by calling PoStartNextPowerIrp, among other things.
Figure 8-10. IRP flow when decreasing device power.
I invented StallRequestsAndNotify for use in TriageNewIrp. (It's so new that Chapter 6, where all the other DEVQUEUE functions are described, was already beyond my reach when I created it.) The first step it performs is to stall the request queue. If the device is currently busy, it records a callback routine address—in this case, GenericSaveRestoreComplete, which I'm overloading for purposes of receiving a notification—and returns STATUS_PENDING. TriageNewIrp will then exit in the QueueStallPending state.
If the device isn't busy, StallRequestsAndNotify returns STATUS_SUCCESS without arranging any callback; the device can't become busy now because the queue is stalled. TriageNewIrp will then go directly to the QueueStallComplete action.
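The stall-and-notify behavior just described can be modeled in a few lines. This is a single-threaded, user-mode sketch and not the DEVQUEUE implementation: TOYQUEUE, StallAndNotify, FinishCurrentRequest, and CountNotification are hypothetical names, the status codes are local stand-ins, and a real queue would protect these fields with a spin lock.

```cpp
#include <stddef.h>

#define STATUS_SUCCESS 0L
#define STATUS_PENDING 0x103L

typedef void (*NOTIFY)(void* context);

/* Hypothetical, stripped-down stand-in for a DEVQUEUE. */
typedef struct {
    int    stallcount;      /* greater than zero means the queue is stalled */
    int    busy;            /* nonzero while a request is being processed */
    NOTIFY notify;          /* callback to run when the current request ends */
    void*  notifycontext;   /* argument for the callback */
} TOYQUEUE;

/* Stall the queue. If a request is in progress, remember the callback and
   report that the caller must wait; otherwise report immediate success. */
long StallAndNotify(TOYQUEUE* q, NOTIFY notify, void* context)
{
    ++q->stallcount;
    if (!q->busy)
        return STATUS_SUCCESS;      /* idle: no callback will be made */
    q->notify = notify;
    q->notifycontext = context;
    return STATUS_PENDING;          /* callback comes when the request ends */
}

/* Stand-in for the point at which the client driver finishes the current
   request: deliver the pending notification, if any, exactly once. */
void FinishCurrentRequest(TOYQUEUE* q)
{
    q->busy = 0;
    if (q->notify) {
        NOTIFY n = q->notify;
        void* c = q->notifycontext;
        q->notify = NULL;
        q->notifycontext = NULL;
        n(c);
    }
}

/* Example callback: counts how many notifications were delivered. */
void CountNotification(void* context)
{
    ++*(int*)context;
}
```

In the state machine, the STATUS_PENDING result corresponds to leaving TriageNewIrp in the QueueStallPending state, and the callback corresponds to the AsyncNotify event that later drives QueueStallComplete.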
We reach the QueueStallComplete routine either directly from TriageNewIrp (when the device is idle or if the queue was previously stalled for some other power-related reason) or when the client driver calls StartNextPacket to indicate that it's finished processing the current IRP. StartNextPacket calls the notification routine we gave to StallRequestsAndNotify, and that routine signals an AsyncNotify event to the state machine. QueueStallComplete now separates the device IRP into one of four categories, as follows:
    case QueueStallComplete:
    {
        if (stack->MinorFunction == IRP_MN_SET_POWER)
        {
            if (stack->Parameters.Power.State.DeviceState < pdx->devpower)
            {
                action = ForwardMainIrp;
                SETSTATE(DevPowerUpPending);
            }
            else
                action = SaveContext;
        }
        else
        {
            if (stack->Parameters.Power.State.DeviceState < pdx->devpower)
            {
                action = ForwardMainIrp;
                SETSTATE(DevQueryUpPending);
            }
            else
                action = DevQueryDown;
        }
        continue;
    }
The upshot of QueueStallComplete is that we perform the next action indicated in Table 8-5 for the type of IRP we're dealing with.
Table 8-5. Next action for device IRPs.
Minor Function | More or Less Power? | Next Action |
---|---|---|
IRP_MN_QUERY_POWER | More power | ForwardMainIrp |
IRP_MN_QUERY_POWER | Less or same power | DevQueryDown |
IRP_MN_SET_POWER | More power | ForwardMainIrp |
IRP_MN_SET_POWER | Less or same power | SaveContext |
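Table 8-5 amounts to a two-input decision function, which the following user-mode sketch spells out so that each row can be checked. NextDeviceAction and NEXTACTION are hypothetical names; the minor function codes and the device state enumeration are local stand-ins that follow the DDK values and ordering (a lower number means more power).

```cpp
/* Stand-ins for DDK definitions. */
#define IRP_MN_SET_POWER   0x02
#define IRP_MN_QUERY_POWER 0x03

typedef enum {
    PowerDeviceUnspecified, PowerDeviceD0, PowerDeviceD1,
    PowerDeviceD2, PowerDeviceD3, PowerDeviceMaximum
} DEVICE_POWER_STATE;

/* The three possible next actions listed in Table 8-5. */
typedef enum { ForwardMainIrp, SaveContext, DevQueryDown } NEXTACTION;

/* Decide the next action for a device power IRP once the queue is stalled.
   `requested` is the state named in the IRP; `current` is the device's
   present state. Any minor code other than set-power is treated as a query. */
NEXTACTION NextDeviceAction(unsigned char minor,
                            DEVICE_POWER_STATE requested,
                            DEVICE_POWER_STATE current)
{
    if (requested < current)
        return ForwardMainIrp;      /* more power: same path for set and query */
    return minor == IRP_MN_SET_POWER ? SaveContext : DevQueryDown;
}
```

Written this way, it's apparent that the minor function code matters only when power stays the same or goes down.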
Figure 8-11 diagrams the state transitions that occur for an IRP_MN_SET_POWER that specifies a higher device power state than the current one.
Figure 8-11. State transitions when setting a higher device power state.
ForwardMainIrp will install a completion routine and send the IRP down the driver stack. When MainCompletionRoutine eventually gains control, it signals a MainIrpComplete event. We will be in the DevPowerUpPending state, so we'll execute the DevPowerUpComplete action:
    case DevPowerUpComplete:
    {
        if (!NT_SUCCESS(ctx->status)
            || stack->MinorFunction != IRP_MN_SET_POWER)
        {
            action = CompleteMainIrp;
            continue;
        }
        status = STATUS_MORE_PROCESSING_REQUIRED;
        DEVICE_POWER_STATE oldpower = pdx->devpower;
        pdx->devpower = stack->Parameters.Power.State.DeviceState;
        if (pdx->RestoreDeviceContext)
        {   // client driver supplied a restore callback
            ctx->state = ContextRestorePending;
            (*pdx->RestoreDeviceContext)(pdx->DeviceObject, oldpower,
                pdx->devpower, ctx);
            break;
        }
        action = ContextRestoreComplete;
        continue;
    }
The main task we need to accomplish is restoring any device context that was lost during the previous power-down transition. Since we're not allowed to block our thread, we initiate whatever operations are required and return STATUS_MORE_PROCESSING_REQUIRED to interrupt the completion of the device IRP. When the restore operations finish, the client driver calls GenericSaveRestoreComplete, which signals an AsyncNotify event. We'll be in the ContextRestorePending state at that point, so we'll perform the ContextRestoreComplete action:
    case ContextRestoreComplete:
    {
        if (event == AsyncNotify)
            status = STATUS_SUCCESS;
        action = CompleteMainIrp;
        if (!NT_SUCCESS(ctx->status) || pdx->devpower != PowerDeviceD0)
            continue;
        ctx->UnstallQueue = TRUE;
        continue;
    }
The main result of this action routine is that we unstall the queue of substantive IRPs at the conclusion of an IRP_MN_SET_POWER to the D0 state. We exit via CompleteMainIrp and DestroyContext.
You shouldn't expect to receive an IRP_MN_QUERY_POWER that refers to a higher power state than your device is already in, but you shouldn't crash the system if you happen to receive one. The following code shows what GENERIC does when such a query completes in the lower-level drivers. (Refer to Figure 8-12 for a state diagram.)
case DevQueryUpComplete:
{
  // Give the client driver a chance to veto the query.
  if (NT_SUCCESS(ctx->status) && pdx->QueryPower)
    if (!(*pdx->QueryPower)(pdx->DeviceObject, pdx->devpower,
        stack->Parameters.Power.State.DeviceState))
      ctx->status = STATUS_UNSUCCESSFUL;
  action = CompleteMainIrp;
  continue;
}
That is, GENERIC allows the client driver to accept or veto the query by calling its QueryPower function, and then it exits via CompleteMainIrp and DestroyContext.
Figure 8-12. State transitions for a query about a higher device power state.
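To make the accept-or-veto decision concrete, here is a small user-mode sketch of the kind of test a client's QueryPower callback might apply. It is only an illustration under stated assumptions: `DeviceModel`, its `transferInProgress` field, and the `QueryPowerSketch` name are hypothetical, and a real callback would receive a device object rather than this stand-in structure.

```cpp
// Illustrative user-mode sketch; DeviceModel and its field are hypothetical.
enum DevicePower { D0 = 1, D1, D2, D3 };   // same ordering as DEVICE_POWER_STATE

struct DeviceModel {
    bool transferInProgress;   // hypothetical "device is busy" indicator
};

// Return true to accept the proposed change, false to veto it.
bool QueryPowerSketch(const DeviceModel& dev, DevicePower oldState,
                      DevicePower newState)
{
    if (newState <= oldState)
        return true;                   // same or more power: nothing is lost
    return !dev.transferInProgress;    // refuse to lose power while busy
}
```

Because numerically larger device power states mean less power, the comparison `newState <= oldState` identifies requests for the same or a higher power level, which this sketch always accepts.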
If the IRP is an IRP_MN_SET_POWER for the same or a lower device power state than the current one, the finite state machine goes through the state transitions diagrammed in Figure 8-13.
Figure 8-13. State transitions when setting a lower device power state.
SaveContext will initiate an asynchronous process to save any device context that will be lost when the device loses power:
case SaveContext:
{
  DEVICE_POWER_STATE devpower = stack->Parameters.Power.State.DeviceState;
  // Save context only when the device is actually losing power.
  if (pdx->SaveDeviceContext && devpower > pdx->devpower)
  {
    ctx->state = ContextSavePending;
    (*pdx->SaveDeviceContext)(pdx->DeviceObject, pdx->devpower, devpower, ctx);
    break;
  }
  action = ContextSaveComplete;
  continue;
}
When the save operations finish, the client driver calls GenericSaveRestoreComplete, which signals an AsyncNotify event. We'll be in the ContextSavePending state at that point, so we'll perform the ContextSaveComplete action:
case ContextSaveComplete:
{
  if (event == AsyncNotify)
    status = STATUS_SUCCESS;
  ctx->state = DevPowerDownPending;
  action = ForwardMainIrp;
  DEVICE_POWER_STATE devpower = stack->Parameters.Power.State.DeviceState;
  if (devpower <= pdx->devpower)
    continue;
  // Record the new state and keep the queue stalled below D0.
  pdx->devpower = devpower;
  if (devpower > PowerDeviceD0)
    ctx->UnstallQueue = FALSE;
  continue;
}
The next action, ForwardMainIrp, sends the device IRP down the driver stack. The bus driver will turn the physical flow of current off and complete the IRP. We'll see it next when MainCompletionRoutine signals a MainIrpComplete event, which takes us directly to CompleteMainIrp and thence to DestroyContext.
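The ordering on the power-down path (save context first, then forward the IRP, then complete it) can be traced with a user-mode model. This is a sketch under assumptions rather than the GENERIC.SYS implementation: the `NewIrp` event, the `trace` vector, and every type and function name below are invented for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

// User-mode model only; these names are illustrative, not GENERIC.SYS's.
enum class DevPower { D0 = 1, D1, D2, D3 };   // same ordering as DEVICE_POWER_STATE
enum class DownState { Initial, ContextSavePending, DevPowerDownPending, Done };
enum class DownEvent { NewIrp, AsyncNotify, MainIrpComplete };

struct PowerDownContext {
    DownState state = DownState::Initial;
    DevPower current = DevPower::D0;     // stands in for pdx->devpower
    DevPower target = DevPower::D3;      // state requested by the IRP
    bool unstallQueue = true;            // stands in for ctx->UnstallQueue
    std::function<void(DevPower, DevPower)> save;  // starts an async save
    std::vector<std::string> trace;      // records the order of the steps
};

void HandlePowerDownEvent(PowerDownContext& ctx, DownEvent ev)
{
    switch (ctx.state) {
    case DownState::Initial:             // SaveContext
        if (ev != DownEvent::NewIrp) return;
        if (ctx.save && ctx.target > ctx.current) {
            ctx.state = DownState::ContextSavePending;
            ctx.trace.push_back("save started");
            ctx.save(ctx.current, ctx.target);
            return;                      // wait for AsyncNotify
        }
        break;                           // nothing to save
    case DownState::ContextSavePending:
        if (ev != DownEvent::AsyncNotify) return;
        ctx.trace.push_back("save finished");
        break;
    case DownState::DevPowerDownPending:
        if (ev == DownEvent::MainIrpComplete) {
            ctx.trace.push_back("IRP completed");
            ctx.state = DownState::Done;
        }
        return;
    case DownState::Done:
        return;
    }
    // ContextSaveComplete: record the new state, then forward the IRP.
    if (ctx.target > ctx.current) {
        ctx.current = ctx.target;
        if (ctx.target > DevPower::D0)
            ctx.unstallQueue = false;    // queue stays stalled while below D0
    }
    ctx.state = DownState::DevPowerDownPending;
    ctx.trace.push_back("IRP forwarded");
}
```

Feeding the model NewIrp, AsyncNotify, and MainIrpComplete in turn shows that the recorded power state changes only after the save finishes and before the IRP goes down the stack.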
An IRP_MN_QUERY_POWER that specifies the same or a lower device power state than the current one is the basic vehicle by which a function driver gets to vote on changes in power levels. Although the DDK doesn't specifically say you should create one of these requests when you handle a system query, it's a good idea to do so. You have to handle device queries anyway, so you might as well put all the query logic in one place. Figure 8-14 shows how our state machine will handle such a query.
The DevQueryDown action follows QueueStallComplete for this kind of IRP:
case DevQueryDown:
{
  DEVICE_POWER_STATE devpower = stack->Parameters.Power.State.DeviceState;
  // Let the client driver veto a query for a lower power state.
  if (devpower > pdx->devpower && pdx->QueryPower
    && !(*pdx->QueryPower)(pdx->DeviceObject, pdx->devpower, devpower))
  {
    ctx->status = STATUS_UNSUCCESSFUL;
    action = DevQueryDownComplete;
    continue;
  }
  ctx->state = DevQueryDownPending;
  action = ForwardMainIrp;
  continue;
}
Figure 8-14. State transitions for a query about a lower device power state.
GENERIC basically lets the client driver decide whether the query should succeed. If the client driver says "Yes," we enter the DevQueryDownPending state and exit via ForwardMainIrp to send the query down the driver stack. Completion of the IRP sends us to the DevQueryDownComplete action:
case DevQueryDownComplete:
{
  // A successful query leaves the queue stalled.
  if (NT_SUCCESS(ctx->status))
    ctx->UnstallQueue = FALSE;
  action = CompleteMainIrp;
  continue;
}
The basic action we take is to leave the substantive IRP queue stalled if the query succeeds. (CompleteMainIrp will unstall the queue if it sees the UnstallQueue flag set in the context structure. Clearing the flag causes this step to be skipped.) Recall that we first stalled the queue when we received the query. We'll leave it stalled until someone eventually sends us a set-power IRP to put the device into D0.
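The effect of the UnstallQueue flag on a query can be shown with a short user-mode model. This is a sketch only: `QueueModel`, `QueryContext`, and `StalledAfterQuery` are hypothetical names, and the model assumes the flag is set at the moment the queue is stalled, which the text implies but does not show.

```cpp
// Illustrative user-mode model; names are hypothetical, not DEVQUEUE's API.
struct QueueModel {
    bool stalled = false;
};

struct QueryContext {
    bool succeeded = false;     // stands in for NT_SUCCESS(ctx->status)
    bool unstallQueue = false;  // stands in for ctx->UnstallQueue
};

// Runs one query-down request through the model and reports whether the
// queue is still stalled afterward.
bool StalledAfterQuery(bool querySucceeds)
{
    QueueModel queue;
    QueryContext ctx;

    queue.stalled = true;        // stall on receipt of the query
    ctx.unstallQueue = true;     // assumption: flag is set when we stall
    ctx.succeeded = querySucceeds;

    // DevQueryDownComplete: a successful query clears the flag.
    if (ctx.succeeded)
        ctx.unstallQueue = false;

    // CompleteMainIrp: unstall only if the flag is still set.
    if (ctx.unstallQueue)
        queue.stalled = false;

    return queue.stalled;
}
```

In this model a successful query leaves the queue stalled, awaiting the later set-power IRP, while a failed or vetoed query releases it immediately.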