Table Of Contents
The Anatomy of a VxD
The Virtual Machine Manager
Memory Management
V86/PM VxD API
Nested Execution
I/O Trapping
IRQ Virtualization
Virtualized DMA
VKD and Keyboard Processing
Writing VxDs in C
Using the Debugging Services
VCOMMD Design Notes
Win-Link Design and Implementation Notes

Chapter 1

The Anatomy of a VxD

The VxD Structure
Virtual Device Initialization
The VSIMPLED Sources
Debugging the VSIMPLED VxD

Virtual device drivers (VxDs) are not just for people writing drivers for hardware devices, any more than DOS device drivers are. A VxD is Windows' way of letting you do almost anything you want. If you miss the DOS world, where you have direct access to the hardware, can interface to vital CPU functions, or can take over parts of the operating system, then welcome to VxDs, where you can do a lot of the same under Windows.

A VxD is code and data that runs at ring 0 in 32-bit flat model as part of the Windows 386 virtual machine manager (VMM). In fact, the VMM (WIN386.EXE) is primarily a number of standard VxDs combined in a single file. VxDs only operate when Windows runs in 386 Enhanced mode.

VMM is not really a part of Windows; instead, it is a preemptive, multitasking kernel that controls multiple virtual machines. Once VMM has initialized, the Windows graphical user interface (KRNL386.EXE, GDI.EXE, USER.EXE, and all of the supporting drivers) is loaded into the System VM, the initial virtual machine created when VMM is started. However, VMM could just as easily load COMMAND.COM into the System VM and, with the assistance of a VxD and some helper hot-keys, give you a multitasking DOS instead of the fancy Windows GUI.

Because VxDs operate at ring 0, the operating-system level of protection, the CPU allows the code to execute any 386 instruction. At higher ring levels, access to memory addresses or I/O ports can be restricted from the VM, allowing the VMM or a VxD to process the exception as it wishes. Of course, certain instructions executed by the VM always cause processor exceptions, but a VxD can simulate the functionality of that instruction for the VM, allowing it to operate as if it has sufficient privilege.

With this power comes responsibility. Although a VxD can play with the Interrupt Descriptor Table (IDT) entries directly, this is something that should probably be avoided. Besides, the VMM provides enough functionality to get as close to the IDT as needed, so why reinvent the wheel?

A VxD is always active, unlike any other part of Windows. When a DOS box is running in exclusive mode, the primary code executing apart from the DOS box itself belongs to VxDs responding to IRQs, faulting instructions, trapped I/O, or page faults in the DOS box.

A VxD is the only program with unobstructed access to the hardware. If a VxD performs I/O to a port, it communicates directly to the physical port, without restrictions. If a VxD owns a hardware interrupt, the VxD receives the IRQ directly from the Virtual Programmable Interrupt Controller Driver (VPICD), without ring transitions. For example, an interrupt service routine for a non-owned interrupt in a VM sees a virtualized interrupt through events scheduled by the VPICD, whereas a VxD has a more direct path for interrupt servicing. Where code communicating to hardware in a VM may be restricted or slowed by ring transitions and access permission lookups, a VxD is unrestricted and extremely fast.

VxDs operate in 32-bit flat model, the 386 equivalent of small model. All of the segment registers are fixed to the same base address. The CS and DS selector values differ, due to access and execution restrictions (code versus data), but point to the same memory. Because a VxD is in 32-bit flat model, all offsets to code and data are 32-bit; therefore, you can access any part of the address space (4 gigabytes) with just an offset.

A VxD is also given priority on all actions in a VM. A VxD can intercept and/or generate interrupts (hardware or software), trap port I/O, and even restrict access to specific regions of memory. VxDs can determine whether to allow such access to occur, provide simulation, terminate (or nuke) the VM, or simply ignore the request.

Because VxDs utilize the base components of the 80386 processor, it is important that you have a working knowledge of 386 architecture.

For a good description of 80386/80486 system architecture, see Hummel, Robert L. (1992), PC Magazine Programmer's Technical Reference: The Processor and Coprocessor, Emeryville, CA: Ziff-Davis Press.

A misbehaving MS-DOS application will usually crash the DOS virtual machine. A misbehaving Windows application may affect the operation of other Windows applications. However, a misbehaving VxD will crash the entire Windows operating system. Because a VxD is part of the WIN386 kernel, the VxD is active during critical processing of the Windows operating system. The smallest, most subtle bug can have devastating effects on the operating system. Thorough testing of virtual device drivers is absolutely necessary. Do not simply test how the VxD operates under stringent configurations; instead, expand your testing to include all possible permutations of end-user system configurations you can design (limited only by a testing hardware budget of course!).

VxDs were originally designed to handle hardware device contention between multiple processes and to translate or buffer data transfers from a VM to hardware devices. When two or more programs attempt to access the same device, some method of contention management must be used. You can use a VxD to allow each process to act as though it has exclusive access to the device. For example, a Virtual Printer Device (VPD) would provide the process with a virtual printer port, and characters written to the port would be written to a print spooler. The VxD would then send the job to the printer when it becomes available. Windows 3.X does not operate in this fashion, but the Win-Link VxD provides this functionality (see Chapter 13 for more information). Another method would be to assign the physical device to only one process at a time, so that when a process attempts to access the device while it is in use, the VxD does not pass the request to the actual hardware, and the process operates as though the hardware did not exist. The Virtual COMM Device (VCD) uses this method.

Recently, the use of VxDs has been expanded to include interprocess communication (demonstrated in the Win-Link example). Some VxDs now also implement a truly virtual device, providing the necessary mechanisms to allow a virtual machine to see a device that may not actually exist in hardware. VxDs can also implement client-server hardware management, providing an interface to a VM that virtualizes I/O to the device and translates this information to commands to be sent across a network to a hardware server.

The VxD Structure

A VxD has a rather simple structure. It includes a 16-bit real-mode initialization code and data segment, 32-bit initialization code and data segments, 32-bit locked or "non-locked" code and data segments, and a virtual device driver declaration block (exported in the linear executable file as the VxD's DDB). Similar to the "suicide" fence of a DOS terminate-and-stay resident program, the initialization fragments of the VxD are discarded after initialization has been completed. Under Windows 3.x, all 32-bit code and data segments are locked, because the macros provided in the VMM.INC included with the Windows 3.X Device Driver Kit resolve to the same segment definition. However, you should not assume that non-locked segments are the same as locked segments, as these definitions most likely will change in the future. Note the distinction between the two now and save yourself the bug-tracking hassle later.

Real-Mode Initialization Segment

The real-mode initialization segment is a 16-bit code and data segment of the VxD defined by the VxD_REAL_MODE_INIT_SEG macro and is called during VMM's startup. This allows a VxD to communicate with TSRs or other real-mode procedures to gather and then pass vital information to the VxD's protected-mode initialization routines, or to fail the load of the VxD or VMM prior to entering protected mode. The term "real-mode initialization" is relative. If you have installed an EMM emulator (EMM386, 386Max, or QEMM), it is likely that the real-mode initialization procedures are invoked in V86 mode and are subject to trapped I/O or other virtualization occurring under these systems. In other words, during real-mode initialization, VMM does not switch the processor to real mode and then call these procedures. Instead, it executes the 16-bit code in whatever mode was in effect before VMM started (for example, when WIN.COM was invoked).

Note: Due to problems in Windows 3.x, you will need to make sure that your real-mode initialization segment is not exactly 4k, 8k, 12k, or 16k in size. Additionally, real-mode initialization segments greater than 8k (or 12k in Windows 3.1) must be a multiple of 4. Real-mode initialization segments cannot be greater than 12k under Windows 3.0 or greater than 16k under Windows 3.1. Using code segments greater than these restrictions will cause problems and will eventually hang VMM. These problems were reported on the CompuServe WinSDK forum and confirmed by Developer Support Engineers. Avoid these problems with real-mode initialization by adding the necessary boundary checks in your code.
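The size rules in this note lend themselves to a small build-time check. The following C sketch is only an illustration of such a check, not DDK code: the function name rm_init_size_ok and the win_version enum are hypothetical, and the sketch assumes a literal reading of the note (sizes in bytes, "multiple of 4" meaning a multiple of 4 bytes).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names used only for this illustration. */
enum win_version { WIN_3_0, WIN_3_1 };

/* Returns true if a real-mode initialization segment of `size` bytes
 * satisfies the restrictions listed in the note, read literally. */
bool rm_init_size_ok(size_t size, enum win_version ver)
{
    const size_t k = 1024;
    /* Upper limit: 12k under Windows 3.0, 16k under Windows 3.1. */
    size_t limit = (ver == WIN_3_0) ? 12 * k : 16 * k;
    /* Above this size the segment must be a multiple of 4 bytes. */
    size_t threshold = (ver == WIN_3_0) ? 8 * k : 12 * k;

    if (size > limit)
        return false;
    /* Within the limit, a nonzero multiple of 4k is exactly 4k, 8k, 12k, or 16k. */
    if (size != 0 && size % (4 * k) == 0)
        return false;
    if (size > threshold && size % 4 != 0)
        return false;
    return true;
}
```

For example, rm_init_size_ok(13000, WIN_3_1) succeeds, while the same size fails for WIN_3_0 because it exceeds the 12k limit.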

Declare_Virtual_Device	VSIMPLED,VSIMPLED_Major_Ver,\
			VSIMPLED_Minor_Ver,\
			VSIMPLED_Control_Proc,\
			VSIMPLED_Device_ID,\
			Undefined_Init_Order,\
			VSIMPLED_V86_API_Proc,\
			VSIMPLED_PM_API_Proc

This declaration dispatches the system control events to VSIMPLED_Control_Proc. This procedure handles system control events such as the initialization dispatch, VM control events (creation or suspension of VMs), device focus changes, and system shutdown notifications. It must be declared in a VxD_LOCKED_CODE segment; defining it in any other segment will cause problems.

VxD Control Procedure

The control procedure is the main dispatch entry point for the VxD. The initialization messages from VMM are sent to this procedure and then dispatched to the appropriate handlers:

VxD_LOCKED_CODE_SEG

;VSIMPLED_Control_Proc
;
;Description:
;	This is the entry point for system control calls from VMM.
;	System control messages are dispatched through the
;	Control_Dispatch macro in VMM.INC.

BeginProc VSIMPLED_Control_Proc

	Control_Dispatch Sys_Critical_Init, VSIMPLED_Sys_Critical_Init
	Control_Dispatch Device_Init, VSIMPLED_Device_Init
	clc				;message not handled: report success
	ret

EndProc VSIMPLED_Control_Proc

VxD_LOCKED_CODE_ENDS

VxD_LOCKED_CODE_SEG and VxD_LOCKED_CODE_ENDS are macros that define a segment of 32-bit code in a page-locked segment. Defining this segment as "page-locked" is necessary because the calls are dispatched during critical processing of the VMM. This procedure cannot be included in the initialization code segments, because it would be discarded after VMM completed its startup procedures, and system failure would occur when the VMM attempted to dispatch a control message to the VxD during later processing.
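Conceptually, a control procedure compares the incoming message against each message it cares about and hands it to the matching handler; anything else falls through and returns with carry clear. The C sketch below models only that shape. The message numbers, the handler names, and the use of a boolean return value in place of the carry flag are assumptions made for the illustration; the real constants come from VMM.INC.

```c
#include <stdbool.h>

/* Illustrative message numbers; the real values are defined in VMM.INC. */
enum control_msg { MSG_SYS_CRITICAL_INIT, MSG_DEVICE_INIT, MSG_OTHER };

static int g_phases_seen;   /* counts the initialization phases handled */

/* Handlers return true for "carry set" (failure), false for "carry clear". */
static bool on_sys_critical_init(void) { g_phases_seen++; return false; }
static bool on_device_init(void)       { g_phases_seen++; return false; }

/* Models the dispatch: one comparison per handled message, and a
 * default path that reports success for everything else. */
bool control_proc(enum control_msg msg)
{
    switch (msg) {
    case MSG_SYS_CRITICAL_INIT: return on_sys_critical_init();
    case MSG_DEVICE_INIT:       return on_device_init();
    default:                    return false;   /* not handled */
    }
}
```

The default case matters: a control procedure that failed unknown messages would fail notifications it never intended to handle.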

The BeginProc and EndProc macros define the beginning and end of a specific VxD entry point. Depending on the parameters given, BeginProc defines the procedure name, declares the procedure public for access outside of this module, aligns it for "fast-calling", declares it callable by other VxDs, or additionally defines it as an asynchronous service callable from another VxD at interrupt time. The valid parameters to the BeginProc macro are PUBLIC, HIGH_FREQ, SERVICE, and ASYNC_SERVICE, and their functionality corresponds to the following table:

PUBLIC	Procedure is callable from an external module
HIGH_FREQ	Aligns this procedure on a DWORD boundary. Useful for procedures called frequently such as hardware interrupt procedures or I/O trapping routines.
SERVICE	Procedure can be called from another VxD. Requires an exported service table.
ASYNC_SERVICE	Same as SERVICE, but the VxD routine can be called during interrupt procedures. VxD services that do not specify this option and are called at interrupt time will cause debug traces when using the debug version of VMM (WIN386.EXE). If you declare a service to be asynchronous, be sure that it is atomic or can safely be interrupted while processing the request.

Virtual Device ID

A specialized VxD ID may be required if your VxD provides an external V86 or PM API or if your VxD exports services callable by other VxDs. In these cases, you need to request a VxD ID from Microsoft (Internet address vxdid@microsoft.com; CompuServe email, >INTERNET:vxdid@microsoft.com). Microsoft will send you a registration form, which you will need to fill out and return for processing.

If you are replacing an existing VxD, such as the Virtual Comm Device (VCD), you should use the value specified in VMM.INC. The replacement VCD would then have a device ID of VCD_Device_ID. Otherwise, assuming that your VxD does not provide an external API or services, you can use the predefined value of Undefined_Device_ID.

Initialization Order

The initialization order of the DDB defines the load order of the virtual device drivers. VMM will load and initialize the VxDs in the order specified by the load-order values. For most secondary virtual device drivers, the Undefined_Init_Order equate is sufficient. If your VxD requires other VxDs to be present and initialized prior to calling your initialization procedures, you need to specify a load order constant here.
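The effect of the load-order value can be pictured as a sort: devices with smaller values are initialized first. The C sketch below is a model of that idea only, not of VMM's loader; the vxd structure and the order values used with it are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical record: a device name and its initialization order value. */
struct vxd {
    const char   *name;
    unsigned long init_order;
};

/* qsort comparator: ascending by initialization order. */
static int by_init_order(const void *a, const void *b)
{
    const struct vxd *x = a;
    const struct vxd *y = b;
    return (x->init_order > y->init_order) - (x->init_order < y->init_order);
}

/* Arranges the list so that lower order values come first. Note that
 * qsort is not stable, so devices sharing a value may be reordered. */
void sort_for_init(struct vxd *list, size_t count)
{
    qsort(list, count, sizeof list[0], by_init_order);
}
```

A device that depends on another would simply be given a larger value than the device it depends on, so that it sorts later in the list.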

API Entry Procedures

API entry procedures are invoked when a VM running in either protected mode or V86 mode calls the VxD's entry point. A VxD entry point is available to VMs only when the VxD defines the necessary entry procedures in the DDB. These procedures are discussed in depth in Chapter 4.

Virtual Device Initialization

System Critical Initialization (Sys_Critical_Init)

VMM dispatches a Sys_Critical_Init message to all VxDs in the order defined by the Initialization Order values of the VxDs. During Sys_Critical_Init, interrupts are disabled, and VxDs perform system-critical initialization such as memory mapping and hooking V86 interrupts or faults. Because interrupts are disabled, you should keep the initialization during this message to a minimum.

Sys_Critical_Init may also be used to hook your VxD in front of certain handlers, such as GP fault or NMI processing. Sys_Critical_Init is an optional procedure, and you should only define this procedure if you have specific initialization to perform. Below is a sample Sys_Critical_Init handler as used in the VSIMPLED Sample:

;VSIMPLED_Sys_Critical_Init
;
;Description:
;	On entry, interrupts are disabled. Critical initialization
;	for this VxD should occur here. For example, we can read
;	settings from VMM's cached copy of SYSTEM.INI and set up
;	our VxD as appropriate.
;
;	This procedure is called when the VSIMPLED_Control_Proc
;	dispatches the Sys_Critical_Init notification from VMM.
;	We notify VMM of failure by returning with carry set;
;	returning with carry clear indicates success.

BeginProc VSIMPLED_Sys_Critical_Init

	clc
	ret

EndProc VSIMPLED_Sys_Critical_Init

Device Initialization (Device_Init)

Device initialization is where non-system critical initialization of your VxD is performed. For example, you may want to install I/O trap handlers, virtualize an interrupt using VPICD services, or allocate a VM control block. Returning from this procedure with carry set will fail the loading procedure of the VxD. If everything has passed initialization, clear the carry flag and return.

The VSIMPLED Sources

Using the information provided in this chapter, we are ready to create our first VxD. This skeleton VxD declares a DDB, and defines a control procedure supporting the two system initialization messages (Sys_Critical_Init and Device_Init):

MAKEFILE

!IFDEF DEBUG
DEFS=-DDEBUG
!ENDIF

.asm.obj:
	masm5 -p -w2 -Mx $(DEFS) $*;

.asm.lst:
	masm5 -l -p -w2 -Mx $(DEFS) $*;


OBJS=vsimpled.obj

all: vsimpled.386

vsimpled.obj: vsimpled.asm

vsimpled.386:  vsimpled.def $(OBJS)
	link386 /NOI /NOD /NOP /MAP @<<
$(OBJS)
vsimpled.386
vsimpled.map

vsimpled.def
<<
	addhdr vsimpled.386
	mapsym32 vsimpled

clean:
	del *.386
	del *.obj
	del *.map
	del *.sym

VSIMPLED.ASM

	page    60, 132
;
	title VSIMPLED - A simple virtual device driver example
;
;(C)Copyright Woodruff Software Systems, 1993
;Title:    VSIMPLED.386 - Sample virtual device driver
;Module:   VSIMPLED.ASM - Core code
;Version:  1.00
;Date:     November 24, 1992
;Author:   Bryan A. Woodruff
;
;Change log:
;   DATE       REVISION DESCRIPTION                  AUTHOR
;   11/24/92   1.00     Wrote it.                    BryanW
;
;Functional Description:
;	Provides a minimal virtual device driver interface.
;
	.386p
;		INCLUDES & EQUATES
;
	.XLIST
	INCLUDE VMM.Inc
	INCLUDE Debug.Inc
	.LIST

VSIMPLED_Major_Ver      equ     01h
VSIMPLED_Minor_Ver      equ     00h
VSIMPLED_Device_ID      equ     Undefined_Device_ID

;		VIRTUAL DEVICE DECLARATION

Declare_Virtual_Device  VSIMPLED, VSIMPLED_Major_Ver,\
			VSIMPLED_Minor_Ver, VSIMPLED_Control_Proc,\
			VSIMPLED_Device_ID, Undefined_Init_Order,,,

;		ICODE

VxD_ICODE_SEG

;VSIMPLED_Sys_Critical_Init
;
;Description:
;	On entry, interrupts are disabled. Critical initialization
;	for this VxD should occur here. For example, we can read
;	settings from VMM's cached copy of SYSTEM.INI and
;	set up our VxD as appropriate.
;
;	This procedure is called when the VSIMPLED_Control_Proc
;	dispatches the Sys_Critical_Init notification from VMM.
;
;	We notify VMM of failure by returning with carry set;
;	returning with carry clear indicates success.

BeginProc VSIMPLED_Sys_Critical_Init

	Trace_Out "VSIMPLED: Sys_Critical_Init"
	clc
	ret

EndProc VSIMPLED_Sys_Critical_Init

;
;    VSIMPLED_Device_Init
;
;Description:
;	This is a non-system critical initialization procedure.
;	IRQ virtualization, I/O port trapping and VM control
;	block allocation can occur here.
;	Again, the same return value applies.
;	CLC for success, STC for error notification.

BeginProc VSIMPLED_Device_Init

	Trace_Out "VSIMPLED: Device_Init"
	clc
	ret

EndProc VSIMPLED_Device_Init

VxD_ICODE_ENDS

VxD_LOCKED_CODE_SEG

;	NONPAGEABLE CODE

;
;VSIMPLED_Control_Proc
;
;DESCRIPTION:
;	Dispatches VMM control messages to the appropriate handlers.
;ENTRY:
;	EAX = Message
;	EBX = VM associated with message
;EXIT:
;	Carry clear if no error (or if not handled by the VxD)
;	or set to indicate failure if the message can be failed.
;USES:
;	All registers.

BeginProc VSIMPLED_Control_Proc

	Control_Dispatch Sys_Critical_Init, VSIMPLED_Sys_Critical_Init
	Control_Dispatch Device_Init, VSIMPLED_Device_Init
	clc
	ret

EndProc VSIMPLED_Control_Proc

VxD_LOCKED_CODE_ENDS

END

;	End of File: vsimpled.asm

VSIMPLED.DEF

LIBRARY  VSIMPLED

DESCRIPTION 'Win386 VSIMPLED Sample Device (Version 3.10)'

EXETYPE  DEV386

SEGMENTS
	_LTEXT PRELOAD NONDISCARDABLE
	_LDATA PRELOAD NONDISCARDABLE
	_ITEXT CLASS 'ICODE' DISCARDABLE
	_IDATA CLASS 'ICODE' DISCARDABLE
	_TEXT  CLASS 'PCODE' NONDISCARDABLE
	_DATA  CLASS 'PCODE' NONDISCARDABLE

EXPORTS
	VSIMPLED_DDB @1

Debugging the VSIMPLED VxD

Before entering the Windows environment, you need to copy the debug version of the VMM into your system directory. The Windows 3.1 Device Development Kit contains this special version. There are many reasons to use this version of the VMM when developing your VxDs:

VMM displays debug traces when unexpected events occur. These messages help you track down problems with your VxD. You will know that you have a "clean" VxD, when the system does not display these messages while running with your VxD installed.
VMM includes a special debugging trace log that logs faults, device calls, and interrupt counts. These logs help to pinpoint the exact cause of a failure in your VxD.
Special services are enabled for debuggers to display VMM's execution state, VM information such as event lists, interrupt vector tables, the VM execution state, and other critical information not available in the retail release of VMM.

Using the debug version of WIN386.EXE requires either WDEB386 (the 386 debugger included with the Windows Software Development Kit and Device Driver Development Kit) with a serial terminal on COM1 or COM2, or a Windows Enhanced Mode debugger such as Soft-ICE/W(TM), available from NuMega.

Note: WDEB386 and the debug version of WIN386.EXE are provided with VxD-Lite included on the accompanying disk.

The VSIMPLED device displays trace information at each initialization phase. Before the GUI starts, break into the debugger by using the appropriate hot-key (Control-D for Soft-ICE/W or a Control-C from the terminal keyboard for WDEB386) and unassemble the VSIMPLED_Sys_Critical_Init procedure:

Registration # SIW012345
:ALTSCR OFF
:LINES 50
:i1here on
:wc
:X
VSIMPLED: Sys_Critical_Init
Break Due to Hot Key
D800:00001A20   MOV    CX,0040
:u VSIMPLED_Sys_Critical_Init
VSIMPLED_Sys_Critical_Init
0028:8029478C   CALL   [Log_Proc_Call]
0028:80294792   PUSHFD
0028:80294793   PUSHAD
0028:80294794   MOV    ESI,VSIMPLED_DDB+38(800FEA2C)
0028:80294799   CALL   [Out_Debug_String]
0028:8029479F   POPAD
0028:802947A0   POPFD
:g
VSIMPLED: Device_Init
VMM Version 03.10 - Build Rev 00000103
Break Due to Hot Key
0028:800110A6   CMP    AX,0030
:u VSIMPLED_Sys_Critical_Init
VSIMPLED_Sys_Critical_Init
0028:8029478C   INVALID
0028:8029478E   INVALID
0028:80294790   INVALID
0028:80294796   INVALID
0028:80294798   INVALID
:g

Re-enter the debugger when the Windows GUI has completed initialization and unassemble the same procedure. You will find that the address is invalid because the initialization code and data segments were discarded after the device initialization was completed.

For more information on VMM's debugging services and debugging techniques, see Chapter 11, "Using the Debugging Services".

Chapter 2

The Virtual Machine Manager

Event Processing
Scheduling
Services and Dynalinking
Critical Sections
Suspending VMs, Resuming VMs, and Semaphores
Asynchronous Services

The Virtual Machine Manager is a single-threaded, non-reentrant, preemptive multi-tasking, event-driven operating system. This operating system is often referred to as "WIN386" or "VMM". VMM provides an interface layer to VxDs for event scheduling, memory management, descriptor table management, and other vital system services.

The VMM creates, runs, and destroys virtual machines (VMs). On startup, the VMM creates the System VM for the Windows GUI. The System VM interfaces to the SHELL VxD in VMM to create new virtual machines or DOS boxes -- each new VM starts operation in Virtual 8086 (V86) mode. Because a VxD is a part of the VMM, it runs within whatever VM is active when it is called. Consequently, when a DOS VM calls a VxD, the VxD runs in protected mode in the context of the calling VM.

To write a VxD, you must have a clear understanding of how the VMM works.

Event Processing

The execution path of VMM is driven by event lists. Event lists are linked lists of scheduled event procedure calls. These scheduled calls are created by the WIN386 system as the result of faults, interrupts, or specific VxD requests.

There are two types of event lists: the global event list and VM-specific event lists. The global event list is the event list for the VMM. As each VM is created, VMM creates an event list for specific events of that VM. Prior to returning control to a VM, VMM processes any events in the global event list, any pending NMI events (a special form of a global event), and then the VM event list as shown in Figure 2.1. Note that VM-specific events are only processed for the active VM.

Figure 2.1: VMM Event Processing Order
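The processing order described above (global events, then pending NMI events, then the active VM's events) can be sketched as a small C model. This is not VMM code: the event_log structure and the function names are hypothetical, and each "event" is reduced to a tag recorded in the order it would be serviced.

```c
#include <stddef.h>

#define MAX_LOG 16

enum event_kind { EV_GLOBAL, EV_NMI, EV_VM };

/* Records the order in which events were serviced. */
struct event_log {
    enum event_kind kind[MAX_LOG];
    size_t          count;
};

/* Services `pending` events of one kind by appending them to the log. */
static void run_events(struct event_log *log, enum event_kind kind, size_t pending)
{
    for (size_t i = 0; i < pending && log->count < MAX_LOG; i++)
        log->kind[log->count++] = kind;
}

/* Drains the lists in the order used before control returns to a VM:
 * global events, then NMI events, then the active VM's own events. */
void service_events(struct event_log *log, size_t global, size_t nmi, size_t active_vm)
{
    run_events(log, EV_GLOBAL, global);
    run_events(log, EV_NMI, nmi);
    run_events(log, EV_VM, active_vm);
}
```

Events belonging to VMs other than the active one do not appear in the model at all, which mirrors the point that VM-specific events are only processed for the active VM.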

When a VxD processes an event, it has complete control of the system. Because extended event processing reduces the system performance, the event procedure must be fast and avoid lengthy processing. Returning from the event allows VMM to continue the processing of the event list.

When VM events are created, the execution priority of the VM can be adjusted. This is also known as a "boost". The boost can be temporary (automatically removed by VMM) or can be specifically removed by the VxD when all of the necessary event processing for that VM is completed. The execution priority of a VM is used by the primary scheduler (execution priority scheduling) to determine the active VM. (See the section on Scheduling for more detail.)

When all events from the global event list and active VM event list have been processed, the primary scheduler walks the VM list searching for the VM with the highest execution priority. The VM with the highest execution priority becomes the active VM. VMM returns to the active VM until it is reactivated by interrupt or fault processing.

When a VxD is processing an event, asynchronous VMM services may be called and new events generated as the result of IRQ handling. When an IRQ is generated by the PIC, the handlers installed into the IDT by VPICD (Virtual PIC Device) call the Hw_Int_Proc for the IRQ. During non-virtualized IRQ processing, the default VPICD handlers then schedule VM events for interrupt simulation. VxDs must be aware that VPICD handles interrupts while events are processed, and disabling interrupts during event processing may be necessary for VxDs performing critical hardware processing. IRQ handling is detailed in Chapter 7.

Because a VM does not continue executing until all events in the global event list and VM event list have been dispatched, the results of event processing in a VxD can become stacked in the VM. For example, a VxD processing a global timeout event may schedule an asynchronous call to a procedure in a VM. During this processing, the VxD may request that the VM resume execution. Before resuming execution of the VM, VMM processes any remaining events on the event list. If this includes an interrupt event scheduled by VPICD, the VxD may request a simulated interrupt in the VM. Finally, when VMM returns to the VM, the actual results of the event processing are executed in the reverse of the order in which they were pushed onto the VM's stack: the interrupt service is processed first, before the callback scheduled by the timeout event.
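The stacking effect is ordinary last-in, first-out behavior. The C sketch below models it with a tiny stack of "frames"; the type and function names are hypothetical and stand in for whatever a VxD actually pushes onto the VM's stack when it simulates a call or an interrupt.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 8

enum frame { FRAME_CALLBACK, FRAME_INTERRUPT };

/* A toy model of the VM's stack of simulated calls. */
struct vm_stack {
    enum frame frames[MAX_FRAMES];
    size_t     depth;
};

/* A VxD "simulates" a call or interrupt by pushing a frame. */
void simulate_call(struct vm_stack *s, enum frame f)
{
    assert(s->depth < MAX_FRAMES);
    s->frames[s->depth++] = f;
}

/* When the VM resumes, the most recently pushed frame runs first. */
enum frame resume_next(struct vm_stack *s)
{
    assert(s->depth > 0);
    return s->frames[--s->depth];
}
```

Pushing FRAME_CALLBACK (the timeout event's callback) and then FRAME_INTERRUPT (the simulated interrupt) makes resume_next return the interrupt first and the callback second, matching the order in the example.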

Scheduling

There are two schedulers used in the WIN386 system: the primary scheduler and the secondary, or time-slice scheduler. The primary scheduler (execution priority scheduler) selects the active VM based on highest execution priority of the non-suspended VMs. A VM will remain active until a higher priority VM is found in the queue.
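The primary scheduler's selection rule can be sketched in a few lines of C. This is a model under simplifying assumptions, not the VMM implementation: the vm structure is hypothetical, the queue is a plain array, and ties are resolved in favor of the first VM found.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VM record used only for this illustration. */
struct vm {
    int  id;
    long exec_priority;
    bool suspended;
};

/* Returns the non-suspended VM with the highest execution priority,
 * or NULL if no VM is eligible to run. */
const struct vm *pick_active_vm(const struct vm *vms, size_t count)
{
    const struct vm *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (vms[i].suspended)
            continue;               /* suspended VMs are never selected */
        if (best == NULL || vms[i].exec_priority > best->exec_priority)
            best = &vms[i];
    }
    return best;
}
```

Raising one VM's exec_priority above the others and calling pick_active_vm again selects that VM, which is the essence of a priority boost.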

When a VM is boosted, its order is changed in the queue. Normally, the active VM has a boost of Cur_Run_VM_Boost as its execution priority. Devices that require a VM to become active as the result of I/O or interrupt processing may use a device boost of High_Pri_Device_Boost to force the VM to become active. This is typically implemented using the Call_Priority_VM_Event service. Using this service, VMM adjusts the execution priority of the specified VM, and a callback is notified when the VM has activated. The VxD can then continue its processing for the VM. Figures 2.2 and 2.3 demonstrate the effect in the scheduling queue of changing the execution priority. The following code example demonstrates the technique of boosting a VM's execution priority:

// Example of calling priority VM event in 'C'

DWORD			dwEventHandle;
static PEVENTPROC	pEventProc=NULL;

 if (!pEventProc)
  pEventProc=vmmwrapThunkEventProc(BoostEventProc);
 dwEventHandle=vmmCallPriorityVMEvent(hVM,High_Pri_Device_Boost,
   PEF_Wait_Not_Crit,dwRefData,pEventProc,0);

// BoostEventProc - handler for VM event callback

VOID BoostEventProc(DWORD hVM, DWORD dwRefData, PCRS_32 pCRS){
 TRACEMSGPARAM("VM #EAX is now active\r\n", hVM);
} // end of BoostEventProc()

Figure 2.2: Scheduler queue prior to device boost

Figure 2.3: Scheduler queue after device boost.

The secondary scheduler (or time-slice scheduler) adjusts the execution priority for VMs for a period of time based on the background and foreground priorities set for each VM. The secondary scheduler determines which VM to boost based on the time-slice priorities specified in the .PIF file of a DOS application.

The time-slice priorities are also used to determine how long the execution priority of a VM will be boosted. The boost value is constant -- that is, changing the time-slice priorities does not affect the amount of execution priority boost that a VM receives. When the next time-slice occurs and the VM's time-slice period has been exhausted, the VM is unboosted and the next VM in the time-slice scheduler's queue receives the execution priority boost.

The time-slice scheduler's execution priority boost for a VM is low compared to other high-priority event processing. Thus, the high-priority VM remains active until it is unboosted or until another VM of higher priority is found in the primary scheduler's queue.

Services and Dynalinking

VMM, its component VxDs, and third-party VxDs can provide services callable by other VxDs. The calls to these services are resolved at runtime by the dynalink mechanism. The VxDCall and VMMCall macros provided by VMM.INC are expanded in code as follows:

	<Push any C parameters>

	int	Dyna_Link_Int
	dd	VxD-ID SHL 16 + VxD_Service

	<Clean up C parameters>

When the IDT dispatches the software interrupt to VMM, the dynalink routine patches the int 20h and the following dword with an indirect call to the VxD service handler. Stack parameters to the service are passed with the 'C' calling convention. VxDJmp is similar to VxDCall, with the exception that stack parameters cannot be used and the resulting code jumps to the VxD service handler, avoiding the extra cycles involved when the service call is followed by a return instruction.
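To make the layout of a dynalink concrete, the C sketch below packs the dword that follows the interrupt instruction and shows why it can be patched in place. The function names are hypothetical, and the byte values are an assumption based on the standard 386 encodings of int 20h (CD 20) and of an indirect near call through a 32-bit address (FF 15 followed by the address); both sequences occupy six bytes. The exact bytes VMM writes are not taken from the DDK sources.

```c
#include <stdint.h>

#define DYNA_LINK_INT 0x20u     /* interrupt number used for dynalinks */

/* Packs the dword: VxD ID in the high word, service number in the low word. */
uint32_t dynalink_dword(uint16_t vxd_id, uint16_t service)
{
    return ((uint32_t)vxd_id << 16) | service;
}

/* Stores a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Emits the unresolved form: INT 20h followed by the packed dword. */
void emit_dynalink(uint8_t code[6], uint16_t vxd_id, uint16_t service)
{
    code[0] = 0xCD;             /* INT imm8 */
    code[1] = DYNA_LINK_INT;
    put_le32(&code[2], dynalink_dword(vxd_id, service));
}

/* Overwrites the same six bytes with an indirect call through the
 * pointer stored at slot_addr (assumed encoding: FF 15 addr32). */
void patch_dynalink(uint8_t code[6], uint32_t slot_addr)
{
    code[0] = 0xFF;
    code[1] = 0x15;
    put_le32(&code[2], slot_addr);
}
```

Because the two forms are the same length, the patch never disturbs the instructions that follow, and every later pass through the same code executes the direct call without taking the interrupt.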

Under some 386 'C' compilers, you cannot generate the appropriate in-line assembly instructions to duplicate this interface and/or load the registers required by the service. Consequently, you need to use .ASM thunks to provide a 'C' callable interface. Similarly, replacement VxDs (for example, a replacement VCD) may require register-parameter passing, and an assembly language front-end is necessary. The VDDVGA sample was written in 'C' and demonstrates the techniques required to interface to some of these services.

Note: The complete VDDVGA sample sources written in 'C' can be found on the enclosed diskette in the C\VDDVGA directory. The VMM "wrapper" for VxDs written in 'C' can be found in the C\VMMWRAP directory. For more information on writing VxDs in 'C' see Chapter 10.

Critical Sections

The primary scheduler implements a single critical section using the Begin_Critical_Section and End_Critical_Section services in VMM. The critical section can be claimed on behalf of a VM by a VxD. The critical section is most commonly used when calling MS-DOS or BIOS interrupt handlers because these real-mode code pieces are not reentrant. However, the critical section can also be used for other drivers or TSRs loaded prior to starting WIN386.

Note that the critical section does not halt scheduling of VMs; that is, other VMs may be scheduled while the critical section is claimed. If a second VM attempts to claim the critical section, the VM is suspended until the current critical section owner has released the critical claim. When a VM claims a critical section, the execution priority of the VM is adjusted by the predefined value of Critical_Section_Boost; the execution priority is restored when the critical section is released.
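The ownership and boost rules can be modeled in C as follows. This is a deliberately small sketch, not the VMM implementation: the structures and function names are hypothetical, the boost constant is an illustrative value rather than the one from VMM.INC, and nested claims by the current owner are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)
#define CRITICAL_SECTION_BOOST 0x100000L   /* illustrative value only */

struct vm {
    int  id;
    long exec_priority;
    bool blocked;
};

struct crit_section {
    int owner;                  /* VM id of the owner, or NO_OWNER */
};

/* Tries to claim the section for `vm`. On success the VM is boosted;
 * otherwise the VM is marked blocked until the owner releases. */
bool claim(struct crit_section *cs, struct vm *vm)
{
    assert(cs->owner != vm->id);        /* nested claims are not modeled */
    if (cs->owner != NO_OWNER) {
        vm->blocked = true;
        return false;
    }
    cs->owner = vm->id;
    vm->blocked = false;
    vm->exec_priority += CRITICAL_SECTION_BOOST;
    return true;
}

/* Releases the section and restores the owner's execution priority. */
void release(struct crit_section *cs, struct vm *vm)
{
    assert(cs->owner == vm->id);
    vm->exec_priority -= CRITICAL_SECTION_BOOST;
    cs->owner = NO_OWNER;
}
```

Note that nothing in the model stops other VMs from running while the section is owned; only a VM that tries to claim it is held back.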

The critical section allows a VxD to prevent multiple VMs from entering the same piece of code. If two VMs are executing and interfacing to the same TSR, and the TSR cannot handle multiple VMs calling simultaneously because it maintains global non-instanced data for the specific procedure, a VxD may wrap the V86 interrupt chain and claim a critical section prior to reflecting the interrupt to the VM. It releases the critical section when the interrupt has returned. This prevents two VMs from simultaneously entering the same interrupt routine in the TSR. The following example demonstrates hooking the V86 interrupt, watching for a specific signature, and claiming a critical section around the API call:

;Hook the V86 interrupt (Int 60h)

BeginProc VSIMPLED_Sys_Critical_Init

	pushad
	mov	eax,60h
	mov	esi,OFFSET32 VSIMPLED_Int60_Hook
	VMMCall	Hook_V86_Int_Chain
	popad
	clc
	ret

EndProc VSIMPLED_Sys_Critical_Init

;Watches for the API signature. If found, claims
;a critical section and hooks the "back-end".

BeginProc VSIMPLED_Int60_Hook, High_Freq
	cmp     [ebp.Client_AX],4257h
	jne     SHORT VIH_Exit
	pushad

;Claim the critical section but allow interrupts
;to be serviced if we block.
	mov      ecx,Block_Svc_Ints or Block_Enable_Ints
	VMMCall Begin_Critical_Section

;Hook the back end of the Int60 call.
	xor      eax,eax
	xor      edx,edx
	mov      esi,OFFSET32 VSIMPLED_Int60_Complete
	VMMCall  Call_When_VM_Returns

	popad
VIH_Exit:
	stc			;always chain
	ret

EndProc VSIMPLED_Int60_Hook

;Completes the Int 60h handling by releasing the
;critical section and returning.

BeginProc VSIMPLED_Int60_Complete, High_Freq

	VMMCall End_Critical_Section
	ret

EndProc VSIMPLED_Int60_Complete

Suspending VMs, Resuming VMs, and Semaphores

VMM provides services to suspend and resume the execution of VMs (Suspend_VM and Resume_VM). It is not possible for a VxD to suspend the execution of the System VM because VMM prevents this, but all other VMs can be suspended. Also, if a VM is the critical section owner, suspending the VM is not valid, and consequently the suspend call will fail.

When it suspends a VM, a VxD causes the VM to be removed from the active queue and added to the inactive queue. The primary scheduler does not activate this VM until it is resumed. If a VxD suspends a VM that is currently active, an immediate task switch occurs and the execution path in the VxD halts at the Suspend_VM call. To see this, try using debug traces to "wrap" the call to the Suspend_VM service. The debug trace in front of the call is displayed, and then a task switch occurs as the active VM is placed in the inactive queue (the VM with the highest priority becomes the active VM), after which global events and VM events are processed. When the suspended VM has been resumed, the debug trace after the Suspend_VM call is displayed as the execution path of the VM continues.

VMM provides services (Wait_Semaphore and Signal_Semaphore) that allow VxDs to block and unblock VMs, based on events occurring in the VxD that decrement a token count by signaling the semaphore. A VM waiting on a semaphore resumes when the token count is less than or equal to zero. Additionally, it is possible to specify that certain events can be processed in a blocked VM. The following list describes the flags associated with the Wait_Semaphore service:

Block_Enable_Ints	Forces interrupts to be enabled and serviced even if interrupts are disabled in the blocked VM. (Only relevant if Block_Svc_Ints or Block_Svc_If_Ints_Locked is specified.)
Block_Poll	Causes the primary scheduler to not switch away from the blocked VM unless another VM has higher priority.
Block_Svc_Ints	Service interrupts in the VM even if the virtual machine is blocked.
Block_Svc_If_Ints_Locked	Same as Block_Svc_Ints with the additional requirement that the VMStat_V86IntsLocked flag is set.

Figure 2.4 shows the flow control possible using the semaphore services. For example, a VxD can signal or wait on semaphores in response to API calls from both the V86 VM (DOS application) and the PM VM (Windows application), allowing the VxD to control a data transfer channel through the VxD. Note: A complete sample demonstrating semaphore usage and DOS-to-Windows communication can be found on the enclosed diskette in the ASM\SEMAPHOR directory.

Figure 2.4: Possible design of semaphore implementation.

Asynchronous Services

Because VMM is non-reentrant, only a subset of VMM's API is available when a VxD is entered through an asynchronous interrupt. Services in a VxD can be declared ASYNC and are available at interrupt time. If your VxD declares such a service, it may call only asynchronous services. The following tables list all the asynchronous services that may be called in interrupt handlers:

Asynchronous VMM Services
Begin_Reentrant_Execution	Get_Time_Slice_Info
Call_Global_Event	Get_VM_Exec_Time
Call_Priority_VM_Event	Get_VMM_Reenter_Count
Call_VM_Event	Get_VMM_Version
Cancel_Global_Event	List_Allocate
Cancel_VM_Event	List_Attach
Close_VM	List_Attach_Tail
Crash_Cur_VM	List_Deallocate
End_Reentrant_Execution	List_Get_First
Fatal_Error_Handler	List_Get_Next
Fatal_Memory_Error	List_Insert
Get_Crit_Section_Status	List_Remove
Get_Crit_Status_No_Block	List_Remove_First
Get_Cur_VM_Handle	Schedule_Global_Event
Get_Execution_Focus	Schedule_VM_Event
Get_Last_Updated_System_Time	Signal_Semaphore
Get_Last_Updated_VM_Exec_Time	Test_Cur_VM_Handle
Get_Next_VM_Handle	Test_Debug_Installed
GetSetDetailedVMError	Test_Sys_VM_Handle
Get_System_Time	Update_System_Clock
Get_Sys_VM_Handle	Validate_VM_Handle
Asynchronous Debugging Services
Clear_Mono_Screen	Is_Debug_Chr
Debug_Convert_Hex_Binary	Log_Proc_Call
Debug_Convert_Hex_Decimal	Out_Debug_Chr
Debug_Test_Cur_VM	Out_Debug_String
Debug_Test_Valid_Handle	Out_Mono_Chr
Disable_Touch_1st_Meg	Out_Mono_String
Enable_Touch_1st_Meg	Queue_Debug_String
Get_Mono_Chr	Set_Mono_Cur_Pos
Get_Mono_Cur_Pos	Test_Reenter
In_Debug_Chr	Validate_Client_Ptr
Asynchronous VxD Services
BlockDev_Command_Complete	VPICD_Get_Complete_Status
BlockDev_Send_Command	VPICD_Get_IRQ_Complete_Status
DOSMGR_Get_DOS_Crit_Status	VPICD_Get_Status
PageFile_Read_Or_Write	VPICD_Phys_EOI
VPICD_Call_When_Hw_Int	VPICD_Physically_Mask
VPICD_Clear_Int_Request	VPICD_Physically_Unmask
VPICD_Convert_Handle_To_IRQ	VPICD_Set_Auto_Masking
VPICD_Convert_Int_To_IRQ	VPICD_Set_Int_Request
VPICD_Convert_IRQ_To_Int	VPICD_Test_Phys_Request
VPICD_Force_Default_Behavior	VTD_Update_System_Clock
VPICD_Force_Default_Owner

Chapter 3

Memory Management

The VMM implements two memory managers. The V86MMGR VxD manages memory for V86-mode applications, including Expanded Memory Specification (EMS) and Extended Memory Specification (XMS) memory. The Memory Manager (MMGR) provides services such as GDT/LDT management, global heap management, physical memory management, protected-mode address translation, and V86 page management, including V86 address mapping and allocation.

If you are writing a virtual display device or writing a VxD for a device requiring contiguous physical memory (such as devices using DMA transfers), you need to implement some form of memory management. Additionally, certain memory management implementations in your VxD such as memory mapped devices may require knowledge of the way the 80386 implements memory management using page tables.

VMM Memory Management Services

All memory in the system is allocated by the memory manager. This includes large allocations for VMs as well as a small heap available to VxDs requiring dynamic memory allocation.

While each VM has its own memory and linear address space, any VM that is presently executing is also mapped into the first megabyte of the linear address space. The MMGR performs this mapping on each task switch by updating the page tables to reflect the new mapping of the lower linear address space. Figure 3.1 shows a possible memory configuration with multiple VMs.

Figure 3.1: VMM Memory Map

The MMGR can provide per-VM data to a VxD. When a VxD initializes, it can request a number of bytes of control block data. The MMGR returns an offset from the VM handle, which is reserved for your VxD's control block area at the same offset in each VM control block. The following 'C' code sample shows how a VxD control block is allocated and assigned a pointer.

//Allocate part of VM control block for VDD usage

 dwVidCBOff=vmmAllocateDeviceCBArea(sizeof(VDDCB),0);
 if (dwVidCBOff==NULL){
  vmmDebugout("VDD ERROR: Could not allocate control block area!\r\n");
  vddFatalMemoryError();
  return FALSE;
 }
 pSysVMCB=(PVDDCB)(hVM+dwVidCBOff);

VMM allocates a control block containing vital information for each VM and is located at the zero offset from the VM handle. VMM's control block has the following structure:

//VM control block structure (VMM)

typedef struct tagVMMCB{
 DWORD	CB_VM_Status;
 DWORD	CB_High_Linear;
 DWORD	CB_Client_Pointer;
 DWORD	CB_VMID;
}VMMCB, *PVMMCB;

Thus, given a VM handle, a VxD can obtain the VM's ID using the following method:

DWORD dwVMID;

 dwVMID=((PVMMCB)hVM)->CB_VMID;
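
The relationship between the VM handle, VMM's own fields at offset zero, and a device's reserved area can be sketched in ordinary C. This is a user-mode model under stated assumptions, not the VMM allocator: the 64-byte block size and the model_ function names are hypothetical, and a heap block stands in for the real control block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* VMM's part of the control block, with the fields listed above. */
typedef struct {
    uint32_t CB_VM_Status;
    uint32_t CB_High_Linear;
    uint32_t CB_Client_Pointer;
    uint32_t CB_VMID;
} ModelVMMCB;

/* Hypothetical per-VM data kept by one device. */
typedef struct { uint32_t faults; } ModelDevCB;

#define MODEL_CB_SIZE 64u   /* assumed size of one control block */

/* First free byte after VMM's own fields; shared by every VM. */
static uint32_t next_free = sizeof(ModelVMMCB);

/* Reserves cb bytes at the same offset in every control block.
   Returns the offset, or 0 if the request cannot be satisfied. */
uint32_t model_alloc_cb_area(uint32_t cb)
{
    uint32_t off = next_free;
    cb = (cb + 3u) & ~3u;                 /* keep areas dword aligned */
    if (cb == 0 || cb > MODEL_CB_SIZE - off)
        return 0;
    next_free = off + cb;
    return off;
}

/* Creates a zeroed control block; its address plays the role of hVM. */
void *model_create_vm(uint32_t id)
{
    ModelVMMCB *cb = calloc(1, MODEL_CB_SIZE);
    assert(cb != NULL);
    cb->CB_VMID = id;
    return cb;
}
```

A device would call model_alloc_cb_area once and then form its pointer for any VM as (ModelDevCB *)((char *)hVM + offset), which mirrors the hVM+dwVidCBOff arithmetic in the VDD listing.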

The low memory (interrupt vector table, BIOS & DOS data, and so forth) for each VM is located in high linear address space along with the rest of the memory for that VM. It is preferable to access VM memory using the high linear addresses, as these will not change. If a task switch occurs during memory reads or writes to a low linear address, your VxD may access an invalid address.

Translation Services

The MMGR provides an address translation API. While registers are preserved when making a ring transition between V86 mode and flat 32-bit mode, a pointer using a real-mode segment and offset is meaningless in protected mode. A number of macros in VMM.INC use MMGR services to convert the parameters in the client VM's registers automatically.

Client_Ptr_Flat is a macro that sets up a call to the Map_Flat service:

	Client_Ptr_Flat esi,DS,DX

which expands to:

	push	eax
	mov	ax,Client_DS*100h + Client_DX
	VMMCall	Map_Flat
	mov	esi,eax
	pop	eax

The actual address mapping magic is performed in VMM's Map_Flat service. The following algorithm is used by Map_Flat to map the pointer to a 32-bit flat offset:

	mov	esi,[ebp.Client_EDX]
	mov	eax,[ebp.Client_DS]
if (VM is V86 mode)
	shl	eax,4
	movzx	esi,si		;zero high order offset
	add	eax,esi
	add	eax,[ebx.CB_High_Linear]
else (VM is prot. mode)
 if (!32-bit)
	movzx  esi, si
 eax = _Selector_Map_Flat(hVM, [ebp.Client_DS], 0)
 if (eax != -1)
	add  eax, esi
 if (eax < 1 MB + 64KB)
	add  eax,[ebx.CB_High_Linear]
endif
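
For readers more comfortable with C, the V86 branch of the algorithm reduces to one line of arithmetic. The function below is an illustrative sketch only; its name and parameters are hypothetical, and it does not attempt the protected-mode branch, which depends on the selector's base address.

```c
#include <assert.h>
#include <stdint.h>

/* V86 branch: segment * 16 plus the 16-bit offset, rebased to the VM's
   high linear address (CB_High_Linear in the listing above). */
uint32_t model_map_flat_v86(uint16_t seg, uint32_t off, uint32_t high_linear)
{
    uint32_t linear = ((uint32_t)seg << 4) + (off & 0xFFFFu); /* movzx esi,si */
    assert(linear <= 0x10FFEFu);   /* largest address reachable from V86 mode */
    return linear + high_linear;
}
```

For example, segment B800h and offset 0010h in a VM whose high linear base is 81000000h map to the flat address 810B8010h.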

The translation APIs are often used when accessing memory specified through V86 or PM APIs. Dual-mode (combination V86 and PM) APIs accessing application-provided buffers can be easily implemented using the Map_Flat service as demonstrated here:

;VSIMPLED_Get_Info, PMAPI, RMAPI
;
;DESCRIPTION:
;	This function is used to get information about the
;	VSIMPLED configuration.
;ENTRY:
;	Client_ES = selector/segment of VSIMPLEDINFO structure
;	Client_BX = offset of VSIMPLEDINFO structure
;EXIT:
;	IF carry clear
;	    success
;	    Client_AX = non-zero
;	    Client_ES:BX ->filled in VSIMPLEDINFO structure
;	ELSE carry set
;	    Client_AX = 0
;USES:
;	Flags, EAX, EBX, ECX, ESI, EDI

BeginProc VSIMPLED_API_Get_Info

	Assert_Client_Ptr ebp

	Trace_Out "VSIMPLED_API_Get_Info: called"

	Client_Ptr_Flat edi, ES, BX
	cmp      edi, -1
	je       SHORT GI_Fail

	lea     esi, [gVxDInfo]
	mov	ecx, size VSIMPLEDINFO
	cld
	shr	ecx, 1
	rep	movsw
	adc	cl, cl
	rep	movsb

	mov	[ebp.Client_AX],1	;success
	clc
	ret

GI_Fail:
	Debug_Out "VSIMPLED_API_Get_Info: FAILED!!"
	mov	[ebp.Client_AX],0	;failed
	stc
	ret

EndProc VSIMPLED_API_Get_Info

Page Allocation

Allocation of memory can be accomplished using either the _HeapAllocate or _PageAllocate VMM services. In most cases, using the heap allocation services is sufficient for your VxD and may make implementation easier than using the page allocation services. To allocate memory using the heap services use the following code:

	VMMCall	_HeapAllocate, <cbSize,dwFlags>
	or	eax,eax
	jz	SHORT Alloc_Failed
	mov	pDataBlock,eax

VMM allocates the memory on a doubleword boundary, but the cbSize parameter does not have to be dword aligned. The VxD is responsible for making sure that it stays within the bounds of the memory block, because VMM does not provide protection against accessing memory beyond the allocated range. The memory allocated by this service is fixed, and frequent allocating and freeing of memory may fragment the heap. Also, the memory block is not page-locked and may not be present when accessed. The PageSwap VxD resolves the not-present fault so your VxD can continue with memory accesses.

If you require page-locked memory and are using the heap management services, the _LinPageLock service can be used. This avoids the possibility of VMM discarding the physical memory between accesses by a VxD. However, because physical memory is a limited resource, you should only use this service in cases where page-locked memory is vital to your implementation.

_HeapGetSize, _HeapReAllocate, and _HeapFree are used to determine the block size and to reallocate and free the memory block, respectively. Using _HeapReAllocate may cause the address of the block to change, so VxDs must not rely on the address remaining constant. _HeapReAllocate can preserve the contents of the old block by copying the contents to the new block. The following flags are defined for use with this service:

HeapNoCopy	Do not copy the contents of the existing block.
HeapZeroInit	Initialize the new bytes in the heap to zero.
HeapZeroReInit	Fill all bytes in the block with zero.

MMGR also provides low-level memory management services, allowing a VxD to allocate memory within a physical address range, to perform allocations within physical boundary constraints (not crossing 64k or 128k boundaries), and to allocate memory visible to all VMs or to only a single VM. Additionally, the page-fault handler for the allocated pages can be redirected to a specific handler in your VxD. (See the next section for more information on hooked pages.)

Allocation of pages with physical boundary restrictions and/or physical address limitations can only be performed during initialization. The following example demonstrates allocating a buffer for use with a DMA device:

;VSIMPLED_Allocate_DMA_Buffer
;
;DESCRIPTION:
;	This function allocates a buffer suitable for DMA transfers.
;	It attempts to allocate enough contiguous pages to hold the
;	requested size. If the request fails, the size is halved
;	until all allocation attempts have failed.
;ENTRY:
;	EAX = Desired size (in KB) of the DMA buffer to allocate.
;	  This size cannot exceed 64.
;EXIT:
;	IF carry clear
;	  EAX = memory handle of the memory block allocated
;	  EBX = _physical address_ of memory block
;	  ECX = actual size in _bytes_ of memory block allocated
;	  EDX = _ring 0 linear address_ of memory block
;	ELSE carry set
;	  EAX = EBX = ECX = EDX = 0
;USES:
;	Flags, EAX, EBX, ECX, EDX

BeginProc VSIMPLED_Allocate_DMA_Buffer

	cmp	eax,64
	jle	SHORT ADB_Start

	Debug_Out "Requested size #EAX too big!"
	mov	eax,64
ADB_Start:
	add	eax,3		;round up to get
	shr	eax,2		;# of pages
ADB_Allocate_DMA_Buffer_Loop:
	mov	ebx,eax		; EBX = # of pages to allocate
				; (examples:        3     7     11
				;                 12K   28K    44K)
	dec     eax             ; # pages -  1    10b  110b  1010b
	bsr     cx, ax          ; max power  of 2   1     2      3
	inc     cl              ; shift cnt         2     3      4
	mov     eax, 1
	shl     eax, cl         ; mask + 1       100b 1000b 10000b
	dec     eax             ; mask            11b  111b  1111b
				; alignment       16K   32K    64K
	mov	ecx, ebx

	Trace_Out "pages=#ECX alignment=#EAX"

; EAX = alignment mask for allocation
; ECX = number of pages to allocate
	push	ecx
	VMMcall	_PageAllocate, <ecx,PG_SYS,0, eax,\
			0, 0FFFh, ebx,\
			<PageUseAlign + PageContig + PageFixed>>
	pop	ecx
	or	eax, eax
	jnz	short ADB_Success

	Trace_Out "Allocation failed! pages=#ECX"
	mov	eax, ecx
	shr	eax, 1
	jnz	ADB_Allocate_DMA_Buffer_Loop

	xor	ebx, ebx
	xor	ecx, ecx
	stc
	ret

ADB_Success:

	shl	ecx,12		; pages-->bytes

;Returns:
;  EAX = memory handle of the memory block allocated
;  EBX = _physical address_ of memory block
;  ECX = size in _bytes_ of memory block allocated
;  EDX = _ring 0 linear address_ of memory block
	clc			; success
	ret

EndProc VSIMPLED_Allocate_DMA_Buffer
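
The alignment-mask arithmetic in the loop above is easier to check in C. The sketch below uses hypothetical function names and computes the same mask with a loop instead of BSR. Unlike the assembly, it gives a defined result (a mask of 0) for a single page, a case in which BSR would be applied to zero. The second helper shows the property the alignment is meant to guarantee: the buffer does not cross a 64K physical boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment mask, in pages, for a contiguous block of n_pages (1..16):
   the next power of two that is >= n_pages, minus one. */
uint32_t model_align_mask(uint32_t n_pages)
{
    uint32_t size = 1;
    assert(n_pages >= 1 && n_pages <= 16);
    while (size < n_pages)
        size <<= 1;
    return size - 1;
}

/* Nonzero if [phys, phys + bytes) lies inside a single 64K DMA page. */
int model_fits_64k(uint32_t phys, uint32_t bytes)
{
    return bytes != 0 && (phys >> 16) == ((phys + bytes - 1) >> 16);
}
```

With 4K pages, masks of 3, 7, and 15 pages correspond to the 16K, 32K, and 64K alignments shown in the comments of the listing.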

Hooked Pages and Page Faults

Hooked pages are allocated with _PageAllocate, using the PG_HOOKED attribute. This form of memory management is most commonly used in virtual display drivers to manage multiple VMs that access video display memory. A range of V86 pages is assigned to the VxD and then hooked using the _Assign_Device_V86_Pages and Hook_V86_Page services, respectively. V86 pages can be assigned globally (global to all VMs) to a device at any time, provided that the page is not already assigned. V86 page assignment to a specific VM can only be performed after device initialization, again with the restriction that the page is not already assigned to a device.

To hook V86 pages, a range of pages is first assigned to the VxD:

//Buffer used for reserving pages
DWORD aVMPagesBuf[9];

 vmmGetDeviceV86PagesArray(NULL,&aVMPagesBuf,NULL);
 if (aVMPagesBuf[0xA0/32] & 0xFF00FFFF){
  vmmDebugOut("VDD ERROR: Pages already allocated\r\n");
  vmmFatalError(szVDD_Str_CheckVidPgs);
  return FALSE;
 }
 if (!vmmAssignDeviceV86Pages(0xA0,16,NULL,NULL)){
  vmmDebugOut("VDD ERROR: Could not allocate pages\r\n");
  vmmFatalError(szVDD_Str_CheckVidPgs);
  return FALSE;
 }
 if (!vmmAssignDeviceV86Pages(0xB8,8,NULL,NULL)){
  vmmDebugOut("VDD ERROR: Could not allocate pages\r\n");
  vmmFatalError(szVDD_Str_CheckVidPgs);
  return FALSE;
 }
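
The constant 0xFF00FFFF in the listing above is a bit mask over one dword of the page-assignment array. The following sketch shows how such a mask can be derived; it assumes, as the listing's mask implies, one bit per V86 page with page 0 in bit 0 of the first dword. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if any page in [first, first + count) is marked in the
   9-dword bitmap (32 pages per dword). */
int model_any_assigned(const uint32_t map[9], unsigned first, unsigned count)
{
    unsigned page;
    for (page = first; page < first + count; page++) {
        assert(page < 9u * 32u);
        if (map[page / 32] & (UINT32_C(1) << (page % 32)))
            return 1;
    }
    return 0;
}

/* Builds the mask for a page range that lies within a single dword. */
uint32_t model_range_mask(unsigned first, unsigned count)
{
    uint32_t mask = 0;
    unsigned page;
    for (page = first; page < first + count; page++)
        mask |= UINT32_C(1) << (page % 32);
    return mask;
}
```

Pages A0h through AFh occupy bits 0 to 15 of dword 5 and pages B8h through BFh occupy bits 24 to 31, which together give FF00FFFFh; pages B0h through B7h are deliberately left out of the test.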

The V86 pages are then directed to a page fault handler:

//Put an .ASM front end on the page-fault procedure.
 pVDD_PFault=VMWRAP_ThunkV86PHProc(VDD_PFault);
 if (pVDD_PFault==NULL){
  vmmDebugout("VDD ERROR: Could not thunk VDD_PFault!\r\n");
  vmmFatalError();
  return FALSE;
 }

//Hook graphics pages
 for (i=0; i<16; i++)
  vmmHookV86Page(0xA0+i,pVDD_PFault);

//Hook text pages
 for (i=0; i<8; i++)
  vmmHookV86Page(0xB8+i,pVDD_PFault);

During the Create_VM message processing, the V86 pages are marked as not available (not present and not writeable), using the _ModifyPageBits service:

 vmmModifyPageBits(hVM,0xA0,16,~P_AVAIL,NULL,PG_HOOKED,NULL);
 vmmModifyPageBits(hVM,0xB8,8, ~P_AVAIL,NULL,PG_HOOKED,NULL);

Note that it is necessary to specify PG_HOOKED in the type parameter of the _ModifyPageBits service when clearing any of the P_PRES, P_USER, or P_WRITE bits.

After the initialization is complete, any read or write access to the hooked pages causes a page fault. The page fault handler is called with the faulting page number and the handle of the VM that caused the fault. It is the responsibility of the page fault handler to map memory into the page to resolve the fault or to terminate the virtual machine. To map physical memory into the faulting page, use the following code:

//dwPhysPage is the physical page allocated using
//_PageAllocate with PG_HOOKED
 vmmPhysIntoV86(dwPhysPage,hVM,uFaultPage,nPages,0);

Under some circumstances (such as low memory or another memory-mapping error), it may be preferable to let the VM continue rather than crash it. In these cases, the system null page is assigned to the faulting linear page:

 vmmMapIntoV86(VMM_GetNulPageHandle(),hVM,uFaultPage,1,0,0);

The system null page is guaranteed to contain invalid information for any given VM. Do not rely on its contents for further processing in your VxD.

The VDD uses these techniques to allow multiple VMs to access the video display hardware and maintain separate virtual displays for virtual machines. It is also possible to simulate ROM in a virtual machine using hooked pages. When the page fault occurs, map the pages using _PhysIntoV86 and clear the P_WRITE bit using _ModifyPageBits.
Note, however, that when the VM restarts, the instruction causing the fault also restarts. If the VM was performing a write operation, a page fault would occur immediately. To resolve this loop, you would need to modify the VM client registers to point the IP to the instruction following the faulting instruction.

Note: A sample VxD demonstrating these hooked memory techniques can be found in the C\VMEMTRAP directory on the enclosed diskette. Also, C\VDDVGA is a good source of memory management sample code.

Examining Page Table Entries

A VxD can determine whether pages in the linear address space have been accessed and whether data has been written to these pages by examining the page table entries (PTEs) using VMM's _CopyPageTable service. The VDD uses this technique to determine which pages have been accessed and need to be updated in the virtual display of a windowed MS-DOS box.

A linear address in a paging operating system such as VMM is decoded as shown in Figure 3.2. Each PTE is 4 bytes in length and contains the access bits and physical address of the page. To examine the PTEs of the first megabyte of the active virtual machine, use page numbers in the range 0 to 10Fh. Page numbers of other virtual machines are computed using the CB_High_Linear field in the control block of the respective VM.

Given a pointer to a memory block in a VM, a VxD can use the Map_Flat service to translate this address to a flat offset. Shifting this address right by 12 gives you the page number. To determine if pages in a hooked V86 range have been accessed or if data has been written to these pages use the following code:

	VMMCall	_CopyPageTable, <guHookedPagesStart,\
				guNumHookedPages,\
				<OFFSET32 aPageBuf>,0>
	mov	ecx,guNumHookedPages	;ECX = loop count (assumed nonzero)

Check_Accessed_Or_Dirty:
;Each PTE is a dword, so entry ECX-1 is at offset ECX*4-4.
	test	dword ptr aPageBuf[ecx*4-4],P_ACC or P_DIRTY
	jz	SHORT Next_Page
	Trace_Out "Page #ECX of hooked range is dirty or has been accessed"

Next_Page:
	loop	Check_Accessed_Or_Dirty

Figure 3.2: Decoding a linear address to a physical address
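
The decoding that Figure 3.2 depicts follows the standard 80386 split of a linear address into a 10-bit page directory index, a 10-bit page table index, and a 12-bit offset. As a rough sketch (the structure and function names are hypothetical), the same split can be written in C:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t dir_index;    /* bits 31..22: entry in the page directory */
    uint32_t table_index;  /* bits 21..12: entry in the page table */
    uint32_t offset;       /* bits 11..0 : byte within the 4K page */
    uint32_t page_number;  /* bits 31..12: linear page number */
} ModelLinear;

/* Splits a 32-bit linear address into its paging components. */
ModelLinear model_decode_linear(uint32_t linear)
{
    ModelLinear d;
    d.dir_index   = linear >> 22;
    d.table_index = (linear >> 12) & 0x3FFu;
    d.offset      = linear & 0xFFFu;
    d.page_number = linear >> 12;
    return d;
}
```

The page_number field is the value obtained by shifting the flat address right by 12, which is the form of page number described in the text above.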

Allocating Selectors

A VxD can allocate selectors in the GDT or in a VM's LDT using the _Allocate_GDT_Selector and _Allocate_LDT_Selector services. Two descriptor double-words are required when allocating selectors. VMM provides the _BuildDescriptorDWORDs service to generate these double-words:

	VMMCall	_BuildDescriptorDWORDs,<dwLinAddr,cbSize,\
			RW_Data_Type,0,0>
	VMMCall	_Allocate_GDT_Selector,<edx,eax,0>

The following equates are useful when building descriptor double-words:

;Common definitions for segment and control descriptors

D_PRES		segment is present in memory
D_NOTPRES	segment not present

D_DPL0		descriptor privilege level definitions
D_DPL1
D_DPL2
D_DPL3

D_SEG		segment descriptor (application type)
D_CTRL		control descriptor (system type)

D_GRAN_BYTE	limit in byte granularity
D_GRAN_PAGE	limit in page granularity

D_DEF16		default operation size is 16 bits (code)
D_DEF32		default operation size is 32 bits (code)

;Definitions specific to segment descriptors

D_CODE		code segment
D_DATA		data segment
D_RX		if code, readable
D_X		if code, executable only
D_W		if data, writeable
D_R		if data, read only
D_ACCESSED	segment accessed bit

;Useful segment definitions

RW_Data_Type	present R/W data segment
R_Data_Type	read-only data segment
Code_Type	code segment
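
To see what the two descriptor double-words contain, it can help to pack one by hand. The sketch below follows the 80386 segment descriptor layout; it is not the _BuildDescriptorDWORDs implementation, and the function name, the structure, and the raw access and flags parameters are hypothetical stand-ins for the equates listed above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t low, high; } ModelDesc;

/* base  : 32-bit linear base address of the segment
   limit : 20-bit limit (in bytes or pages, depending on granularity)
   access: descriptor byte 5 (present bit, DPL, and type bits)
   flags : upper nibble of byte 6 (granularity and default size) */
ModelDesc model_build_descriptor(uint32_t base, uint32_t limit,
                                 uint8_t access, uint8_t flags)
{
    ModelDesc d;
    assert(limit <= 0xFFFFFu && flags <= 0xFu);
    d.low  = (base << 16) | (limit & 0xFFFFu);
    d.high = (base & 0xFF000000u)
           | ((uint32_t)flags << 20)
           | (limit & 0xF0000u)
           | ((uint32_t)access << 8)
           | ((base >> 16) & 0xFFu);
    return d;
}
```

For instance, a present, ring 0, read/write data segment (access byte 92h) with base 0, a limit of FFFFFh pages, and 32-bit default size packs to the pair 00CF9200h and 0000FFFFh.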

Instance Pages

The MMGR manages instance data for VMs. Instance data is a range in V86 address space that VMM maintains separately for each VM. It is used frequently for MS-DOS and some TSRs.

For example, if an MS-DOS device driver maintains an input buffer, it may be useful to have the buffered input directed to the VM that was active when the buffer was filled. In this case, the VxD would query the device driver for the buffer address and maximum size and add an instance data area as shown here:

//Define instance data for instance data manager

INSTDATASTRUC Instance_Area={
  NULL,NULL,NULL,NULL,ALWAYS_Field};

//Specify instanced area as provided by DOS driver.
 Instance_Area.dwInstLinAddr=pInputBuffer;
 Instance_Area.dwInstSize=dwBufferSize;
 if (!VMM_AddInstanceItem(&Instance_Area,0)) goto DI_FatalError;

Mapping Memory into Multiple VMs

When writing VxDs for use with "Windows-aware" TSRs, it may be necessary to allocate a block of memory that is global to all VMs, that is, a memory block with a V86 address mapped to the same physical memory in all VMs. The _Allocate_Global_V86_Data_Area service performs this type of allocation as shown here:

//Allocate a global V86 data area of 512 bytes
 gdwGlobalArea=vmmAllocateGlobalV86DataArea(512,GVDADWordAlign);
 if (gdwGlobalArea==NULL){
  vmmDebugout("Failed to allocate global V86 data area!\r\n");
  return FALSE;
 }
 vmmTraceOutParam("Allocated global area at #EAX\r\n",gdwGlobalArea);

The _Allocate_Global_V86_Data_Area service accepts the following flags:

GVDADWordAlign	Aligns the block on a doubleword boundary.
GVDAHighSysCritOK	Informs the services that the VxD can handle a block that is allocated from high MS-DOS memory, such as UMBs or XMS. (Win 3.1 only)
GVDAInquire	Returns the size in bytes of the largest block that can be allocated, given the requested alignment restrictions. (Win 3.1 only)
GVDAInstance	Creates an instance data block, allowing the VxD to maintain separate blocks for each VM.
GVDAPageAlign	Aligns the block on a page boundary.
GVDAParaAlign	Aligns the block on a paragraph boundary.
GVDAReclaim	Unmaps the physical pages in the block when mapping the system null page into the block. The physical pages are added to the free list when this value is specified. Only applies to blocks allocated on a page boundary. If this flag is not specified, it is up to the virtual device to reclaim these pages.
GVDAWordAlign	Aligns the block on a word boundary.
GVDAZeroInit	Fills the allocated block with zeros.

In the VMEMTRAP sample, an unassigned V86 area is located and assigned to the virtual device. Pages are allocated for each new VM and "instanced" pages are simulated, using hooked V86 pages and a page-fault handler. Calling the _Allocate_Global_V86_Data_Area service with the GVDAInstance flag accomplishes the same thing in a single service call, with the exception that a specific V86 range cannot be specified. The VMEMTRAP sample on the enclosed diskette is designed to demonstrate the techniques necessary to manage contention of memory-mapped devices.

_Allocate_Global_V86_Data_Area has limitations. For example, you cannot hook the page fault handler or modify the page bits of the V86 linear range returned by this service. Windows 3.x does not provide an interface to allow VxDs to monitor access to these pages other than viewing the page table entry access bits. A virtual device must provide an additional interface to manage VM contention for these pages using software interrupts or the VxD's API.

Page Protection

As stated in the preceding section, VMM's support for monitoring access to a given V86 address space is limited. Page protection can be implemented with pages assigned to a device using the _Assign_Device_V86_Pages service, but these pages are usually only available when memory is not already mapped into the reserved ROM addresses. Because of upper memory blocks (UMBs) implemented by most 386 memory managers, this region is usually already claimed by VMM. Also, the normal accessible regions of V86 memory (between _GetFirstV86Page and _GetLastV86Page) are off limits to a VxD using the API provided by VMM.

An unsupported method of providing page protection is to modify the page table entries (PTEs) directly and hook the Invalid_Page_Fault handler. The PTE contains the page frame address in the upper 20 bits (4k page aligned), and the lower 12 bits provide access restriction and accessed and/or dirty information.

Entry 0 in the page directory contains the physical address of the page table for the V86 address space of the active VM. By modifying these page table entries, you can modify the access rights to a given page in V86 address space.
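
The bit manipulation involved can be tried safely on plain integers before touching a live page table. The sketch below uses the hardware-defined 80386 PTE bit positions; the MODEL_ names and functions are hypothetical, and the code only computes values without writing to any page table.

```c
#include <assert.h>
#include <stdint.h>

/* 80386 page table entry bits (defined by the processor). */
#define MODEL_P_PRES   0x01u   /* page is present */
#define MODEL_P_WRITE  0x02u   /* page is writable */
#define MODEL_P_USER   0x04u   /* page is accessible from ring 3 */
#define MODEL_P_ACC    0x20u   /* page has been accessed */
#define MODEL_P_DIRTY  0x40u   /* page has been written to */

/* Physical address of the page frame: the upper 20 bits of the PTE. */
uint32_t model_pte_frame(uint32_t pte)
{
    return pte & 0xFFFFF000u;
}

/* Returns the PTE value with write access removed. */
uint32_t model_pte_write_protect(uint32_t pte)
{
    return pte & ~MODEL_P_WRITE;
}

/* Counts present entries whose accessed or dirty bit is set. */
unsigned model_count_touched(const uint32_t *ptes, unsigned n)
{
    unsigned i, count = 0;
    for (i = 0; i < n; i++) {
        if ((ptes[i] & MODEL_P_PRES) &&
            (ptes[i] & (MODEL_P_ACC | MODEL_P_DIRTY)))
            count++;
    }
    return count;
}
```

On a real system, a change to a live PTE only takes effect reliably once the processor's cached translation for that page has been discarded, which is one more reason this technique calls for caution.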

You must use caution when accessing the page tables directly. Modifying not-present page tables or incorrectly modifying page access bits will cause the system to crash. In other words, "Ok, here's your weapon, first point it at your foot before pulling the trigger!"

Page protection is risky business when it is not directly supported by the host operating system, but some implementations require such information about how a VM is behaving. Take note: anything you do now to provide this mechanism may not be supported in future releases of Windows. Use this information at your own risk, and version-bind your code to the Microsoft Windows 3.1 VMM.

Figure 3.3: Possible design of TSR to VxD communication

The VGLOBALD sample on the enclosed diskette demonstrates the allocation of a global V86 data area that would be suitable for a TSR and VxD to use for communication in multiple VMs. If you run this sample under the debugging version of WIN386.EXE you should notice that, when new VMs are created and the System VM does not have access to the pages that are hooked using this page protection scheme, VMM will "gripe" about the not-present page within the V86 page range. You may decide to modify the page table entries to match WIN386 expectations before creating a new VM.

V86MMGR

V86MMGR provides an interface for VxDs to map protected-mode data buffers to V86 interfaces. When a virtual device translates an API that passes pointers to data blocks from protected-mode applications to DOS-mode device drivers, it needs to use services provided by V86MMGR to translate these buffers to V86-addressable memory. Also, DOS device drivers that update buffers asynchronously require memory to be mapped into global V86 address space.

For example, Int 21h commonly uses buffers referenced by DS:DX. The DOSMGR virtual device provides automatic buffer translation for most of these APIs by hooking Int 21h and translating the protected-mode addresses so that DOS can understand the request without additional work by the protected-mode application. Additionally, VNETBIOS provides buffer mapping for NetBIOS data packets using V86MMGR services. These buffers are updated as the result of interrupt processing.

V86MMGR provides two types of services: buffer mapping and buffer translation. The mapping services update the page tables in all VMs so that the buffer is in global V86 space. The translation services copy a buffer to a V86 copy buffer and use the copy buffer's address to communicate with the DOS device driver code. The mapping services should be used only when the buffers will be updated asynchronously. Do not use the mapping services in place of the translation services to avoid copying the buffer's data; it is faster to copy data to and from a translation buffer than to map a buffer into multiple virtual machines.

V86MMGR does not directly support the mapping or translation of buffers referenced by pointers within a structure. The VxD is responsible for translating or mapping the buffer using V86MMGR services; it updates the structure to contain a valid V86 pointer and then passes the call to the DOS device driver.

When a VxD requires V86MMGR services, it must inform V86MMGR how many pages are required by using the V86MMGR_Set_Mapping_Info service. This service call must be made during initialization, preferably during Sys_Critical_Init processing. Alternatively, the VxD can call this service during Device_Init, if the VxD has an Init_Order less than V86MMGR_Init_Order.

When a call to the DOS device has been intercepted by the VxD, the VxD should determine whether the call is from V86 mode or protected mode. When a V86 call is trapped, buffer translation is not necessary, but mapping for asynchronously updated buffers may be necessary if the buffer is not located in global V86 address space, which can be determined by using the _TestGlobalV86Mem service.

To map pages to DOS addressable memory, a VxD calls V86MMGR_Map_Pages with the linear address and number of bytes to map. The returned linear address is guaranteed to be in the first megabyte and in global V86 address space. A map handle is also returned by this service. When the mapping region is no longer required, it is freed using the V86MMGR_Free_Page_Map_Region service with the map handle that was returned by V86MMGR_Map_Pages.

To translate a protected-mode buffer to V86-addressable memory, a VxD calls V86MMGR_Allocate_Buffer with the linear address of the buffer to translate and the number of bytes to allocate. If specified, this service copies data to the new buffer. Translation buffers are allocated in a "stack" fashion. In other words, the last buffer allocated must be the first buffer freed. When the translation buffer is no longer required, the V86MMGR_Free_Buffer service is used.
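
The "stack" discipline can be illustrated with a small model. The following is a sketch of the last-allocated, first-freed rule only; the ModelXlat type, the function names, and the 4K buffer size are hypothetical, and the optional copy-back on free is this model's assumption rather than a statement about V86MMGR itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_XLAT_SIZE 4096u

typedef struct {
    unsigned char mem[MODEL_XLAT_SIZE]; /* stands in for the V86 copy buffer */
    size_t top;                         /* index of the first free byte */
} ModelXlat;

/* Allocates cb bytes, optionally copying src in; NULL if out of space. */
void *model_xlat_alloc(ModelXlat *x, const void *src, size_t cb)
{
    void *p;
    assert(x != NULL);
    if (cb > MODEL_XLAT_SIZE - x->top)
        return NULL;
    p = x->mem + x->top;
    if (src != NULL)
        memcpy(p, src, cb);
    x->top += cb;
    return p;
}

/* Frees the most recent allocation, optionally copying it back to dst.
   Returns 0 if p is not the buffer on top of the stack. */
int model_xlat_free(ModelXlat *x, void *p, void *dst, size_t cb)
{
    unsigned char *q = p;
    assert(x != NULL);
    if (cb > x->top || q != x->mem + (x->top - cb))
        return 0;
    if (dst != NULL)
        memcpy(dst, q, cb);
    x->top -= cb;
    return 1;
}
```

Freeing out of order is rejected in the model; a VxD that nests several translations therefore has to release them in the reverse order of allocation.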

The following code fragment demonstrates how a buffer passed with a protected-mode software interrupt is mapped into global V86 memory and the interrupt is reflected to a real-mode driver:

; On entry Client_DS:Client_DX points to a buffer that is
; filled asynchronously and needs to be mapped globally.
; Eat the PM interrupt and reflect it to V86 mode.
; When the DOS device driver has completed the data
; transfer, the pages must be unmapped using the
; V86MMGR_Free_Page_Map_Region service.

BeginProc PM_Translate

	pushad
	test	[ebx.CB_VM_Status], VMStat_PM_Exec
	jz	SHORT PT_Bail

	VMMCall	Simulate_Iret
	Map_Flat esi, DS, DX
	movzx	ecx,[ebp.Client_CX]
	VxDCall V86MMGR_Map_Pages
	mov	hPageMap,esi
	shl	edi,12
	shr	di,12

; Simulate the interrupt to V86

	Push_Client_State
	VMMCall	Begin_Nest_V86_Exec
	mov	[ebp.Client_DX],di
	shr	edi,16
	mov	[ebp.Client_DS],di
	mov	eax,Trapped_INT
	VMMCall	Exec_Int
	VMMCall	End_Nest_Exec
	Pop_Client_State
	clc
	jmp	SHORT PT_Exit

PT_Bail:
	Debug_out "Failure: Call not from protected mode!"
	stc

PT_Exit:
	popad
	ret

EndProc PM_Translate
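
The shl edi,12 and shr di,12 pair in the fragment above converts the linear address returned by the mapping service into a real-mode segment:offset pair. As a rough illustration only, the same arithmetic can be written in C; the segoff type and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t seg; uint16_t off; } segoff;

/* Convert a linear address below 1 MB into a normalized segment:offset
   pair, mirroring the `shl edi,12` / `shr di,12` sequence. */
segoff linear_to_segoff(uint32_t linear)
{
    assert(linear < 0x100000u);             /* must be V86 addressable */
    uint32_t edi = linear << 12;            /* upper word = linear >> 4 */
    segoff r;
    r.off = (uint16_t)((uint16_t)edi >> 12);/* low nibble becomes the offset */
    r.seg = (uint16_t)(edi >> 16);          /* paragraph number is the segment */
    return r;
}
```

For example, linear address 0x12345 becomes 1234:0005, and (seg << 4) + off reproduces the original address.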

V86MMGR provides a number of macros to define a script for use with the V86MMGR_Xlat_API service. A VxD defines a translation script in its data segment using these translation macros and calls the V86MMGR service to execute the script. This provides the VxD with a way to reduce the code size of V86 translation services and to use the optimized routines in V86MMGR.

The translation scripts are terminated by Xlat_API_Exec_Int or Xlat_API_Jmp_To_Proc. When the V86MMGR_Xlat_API service executes one of these commands, control returns to the VxD after the command has been executed. The following sample code demonstrates the use of these macros to translate a null-terminated string for a call to a DOS device driver:

; This code demonstrates a simple translation of a NULL
; terminated string in DS:SI to a local V86 buffer.

VxD_DATA_SEG
Xlat_ASCIIZ_Script:
	Xlat_API_ASCIIZ		ds, si
	Xlat_API_Exec_Int	60h
VxD_DATA_ENDS

VxD_CODE_SEG
BeginProc Translate_Int60h_Buffer

      mov   edx,OFFSET32 Xlat_ASCIIZ_Script
      VxDJmp V86MMGR_Xlat_API

EndProc Translate_Int60h_Buffer
VxD_CODE_ENDS

Chapter 4

V86/PM VxD API

The Faulting Mechanism and API Dispatch
The Client Register Structure
Examining and Modifying Information of the Active VM
Creating a Dual-Mode API
Callbacks and Hooking Existing DOS Devices

A VxD can export an API to protected-mode and V86 mode applications, extending the capabilities of a Windows or MS-DOS driver using supervisor code. For example, the VCD provides an interface to the Windows communications driver (COMM.DRV) to acquire a COM port. The COMM driver queries the VCD for the availability of a given port. If the port is in use by an MS-DOS application, the VCD returns failure. This API allows the COMM.DRV to provide intelligent information regarding the availability of COM ports to the calling application and provides a mechanism to manage device contention.

A VxD declares the API support by defining API procedure entry points in the DDB (see Chapter 1). In the following example, VSIMPLED_V86_API_Proc and VSIMPLED_PM_API_Proc procedures are the entry points for the API from V86 mode and protected mode, respectively. Additionally, the VxD must declare the device ID, as supplied by Microsoft.

Declare_Virtual_Device	VSIMPLED,\
			VSIMPLED_MAJOR_VER,\
			VSIMPLED_MINOR_VER,\
			VSIMPLED_Control_Proc,\
			VSIMPLED_Device_ID,\
			Undefined_Init_Order,\
			VSIMPLED_V86_API_Proc,\
			VSIMPLED_PM_API_Proc

An application acquires the entry point of the VxD by using Int 2Fh with AX=1684h and BX=VxD_Device_ID:

; Obtain the VxD entry point; if NULL, the VxD is not present.
	xor	di,di			; preset ES:DI to NULL in case
	mov	es,di			; the call is not handled
	mov	ax,1684h		; get VxD API entry point
	mov	bx,VSIMPLED_Device_ID
	int	2fh
	mov	word ptr dwVxDEntry[0],di
	mov	word ptr dwVxDEntry[2],es

When this entry point is called by the application, the call is dispatched to the VxD, where it processes the request and returns control to the calling application.

Prior to requesting the VxD entry point from VMM, the application should first determine whether Windows/386 (VMM) is present. A Windows application can use the GetWinFlags() API. A DOS application needs to use the Int 2Fh, AX=1600h interface to determine whether VMM is present:

	mov	ax,1600h		;Enhanced Windows Check
	int	2fh
	test	al,7fh			;VMM (Win386) present?
	jz	Not_Win386

The Faulting Mechanism and API Dispatch

If calling ring-0 VxD code directly from ring 3 seems too good to be true, you should be interested in how this call is dispatched to the VxD. When the Int 2Fh request is processed, the VMM allocates a callback address in the VM's address space. When the VM calls this address, the code generates a fault, a ring transition results, and the fault is dispatched to VMM's fault handler.

VMM determines the operation mode of the VM by testing the status flags in the VM control block. It determines whether the call was made from V86 or protected mode and then dispatches the call at ring 0 to the appropriate handler, as declared in the DDB.

The Client Register Structure

When the API entry points are called, the EBP register points to the Client_Register_Structure (CRS):

typedef struct tagCRS_32{
 DWORD	Client_EDI;
 DWORD	Client_ESI;
 DWORD	Client_EBP;
 DWORD	dwReserved_1;		//ESP at pushall
 DWORD	Client_EBX;
 DWORD	Client_EDX;
 DWORD	Client_ECX;
 DWORD	Client_EAX;
 DWORD	Client_Error;		//DWORD error code
 DWORD	Client_EIP;
 WORD	Client_CS;
 WORD	wReserved_2;		//(padding)
 DWORD	Client_EFlags;
 DWORD	Client_ESP;
 WORD	Client_SS;
 WORD	wReserved_3;		//(padding)
 WORD	Client_ES;
 WORD	wReserved_4;		//(padding)
 WORD	Client_DS;
 WORD	wReserved_5;		//(padding)
 WORD	Client_FS;
 WORD	wReserved_6;		//(padding)
 WORD	Client_GS;
 WORD	wReserved_7;		//(padding)

 DWORD	Client_Alt_EIP;
 WORD	Client_Alt_CS;
 WORD	wReserved_8;		//(padding)
 DWORD	Client_Alt_EFlags;
 DWORD	Client_Alt_ESP;
 WORD	Client_Alt_SS;
 WORD	wReserved_9;		//(padding)
 WORD	Client_Alt_ES;
 WORD	wReserved_10;		//(padding)
 WORD	Client_Alt_DS;
 WORD	wReserved_11;		//(padding)
 WORD	Client_Alt_FS;
 WORD	wReserved_12;		//(padding)
 WORD	Client_Alt_GS;
 WORD	wReserved_13;		//(padding)
} CRS_32, *PCRS_32;

The parameters to the API call, as set by the calling application, are contained in the CRS, and the current VM handle is in EBX.

A VxD usually defines a jump table to the specific API functions that perform the requested action and return the results to the API handler that reflects the results in the CRS. The following example code demonstrates how functions are dispatched from a VxD API procedure entry point:

; DEVICE  DATA

VxD_DATA_SEG

DOSXFER_PM_Call_Table LABEL DWORD
	dd	OFFSET32 DOSXFER_Get_Version
	dd	OFFSET32 DOSXFER_PM_Enable_CallBacks
	dd	OFFSET32 DOSXFER_PM_Copy_Data

Max_DOSXFER_PM_Service   equ    ($ - DOSXFER_PM_Call_Table) / 4

VxD_DATA_ENDS

; EXPORTED  API

BeginProc DOSXFER_PM_API_Proc, PUBLIC

	Trace_Out "In DOSXFER_PM_API_Proc"

	VMMCall	Test_Sys_VM_Handle
IFDEF DEBUG
	jz	SHORT @f
	Debug_Out "DOSXFER_PM_API_Proc not from SYS VM"
@@:
ENDIF
	jnz	SHORT DOSXFER_PM_Call_Bad
	movzx	eax,[ebp.Client_DX]		; function in DX
	cmp	eax,Max_DOSXFER_PM_Service
	jae	SHORT DOSXFER_PM_Call_Bad
	and	[ebp.Client_EFLAGS],NOT CF_Mask	; clear carry
	call	DOSXFER_PM_Call_Table[eax*4]	; call service
	jc	SHORT DOSXFER_PM_API_Failed
	ret

DOSXFER_PM_Call_Bad:
IFDEF DEBUG
	Debug_Out "Invalid function #EAX on DOSXFER_PM_API_Proc"
ENDIF

DOSXFER_PM_API_Failed:
	or	[ebp.Client_EFLAGS],CF_Mask	; set carry
	ret

EndProc DOSXFER_PM_API_Proc
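
The same dispatch pattern (bounds-check the function number, clear carry, call through a table, set carry on failure) can be sketched in C. This is an illustrative model rather than VxD code: the reduced client_regs structure, the two sample services, and their return values are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CF_MASK 0x0001u              /* carry is bit 0 of EFLAGS */

typedef struct {                     /* hypothetical, reduced client register view */
    uint16_t dx;                     /* function number supplied by the caller */
    uint16_t ax;                     /* return value */
    uint32_t eflags;
} client_regs;

typedef int (*api_fn)(client_regs *);    /* returns nonzero on failure */

/* Sample services, invented for this sketch. */
int api_get_version(client_regs *r)  { r->ax = 0x0100; return 0; }
int api_always_fails(client_regs *r) { (void)r; return 1; }

const api_fn api_table[] = { api_get_version, api_always_fails };
#define API_COUNT (sizeof api_table / sizeof api_table[0])

/* Dispatch on the function number; report failure through the client's carry flag. */
void api_dispatch(client_regs *r)
{
    assert(r != NULL);
    if (r->dx >= API_COUNT) {        /* invalid function number */
        r->eflags |= CF_MASK;
        return;
    }
    r->eflags &= ~CF_MASK;           /* assume success */
    if (api_table[r->dx](r) != 0)
        r->eflags |= CF_MASK;        /* service reported failure */
}
```

Deriving API_COUNT from the table itself, as the assembly does with Max_DOSXFER_PM_Service, keeps the bounds check correct when services are added.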

Examining and Modifying Information of the Active VM

Changes made in the CRS by the API handler are reflected to the VM when VMM returns control. This is the primary communication channel between code executing in the VM and the API handlers. VMM defines three structures for the CRS: one references the registers by their 32-bit names (EAX), another by their 16-bit names (AX), and the last by their 8-bit names (AH and AL).

Modification of the client registers is made easy using these structure definitions:

;Copy the data structure to the VM and return the results
;of the function.
;EBX = VM handle, EBP = -> CRS

	Client_Ptr_Flat edi,ES,DI
	lea	esi,gDataStruc
	mov	ecx,size DATASTRUCT
	shr	ecx,1
	rep	movsw
	adc	cl,cl
	rep	movsb
	mov	[ebp.Client_CX],size DATASTRUCT
	mov	[ebp.Client_AX],1			; SUCCESS!
	and	[ebp.Client_EFlags],NOT CF_Mask		; clc
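
The three structure definitions describe the same storage under different names. As a rough picture only, the overlap for a single register can be modeled with a C union; the client_reg type and write_ax helper are hypothetical, and the layout assumes a little-endian host such as the x86.

```c
#include <assert.h>
#include <stdint.h>

/* One client register seen three ways (little-endian layout assumed). */
typedef union {
    uint32_t e;                      /* 32-bit view: EAX */
    uint16_t x;                      /* 16-bit view: AX, the low word */
    struct { uint8_t l, h; } b;      /* 8-bit views: AL and AH */
} client_reg;

/* Return the 32-bit value that results from storing ax through the 16-bit view. */
uint32_t write_ax(uint32_t eax, uint16_t ax)
{
    client_reg r;
    r.e = eax;
    r.x = ax;
    assert((r.e >> 16) == (eax >> 16));   /* upper half is untouched */
    return r.e;
}
```

Storing through the 16-bit view changes only the low word, which is why the fragment above can set Client_AX and Client_CX without disturbing the upper halves of the client's EAX and ECX.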

A VxD may also update a buffer referenced in the CRS by obtaining a flat address using the mapping services discussed in Chapter 3.

Creating a Dual-Mode API

By setting both the V86 and PM API entry points in the DDB to the same handler, a VxD can provide the same services to all VMs and avoid duplicating dispatch code. To determine the operating mode of the calling VM, the VxD queries the execution status of the VM using the status flags of the VM control block. By testing CB_VM_Status for VMStat_PM_Exec, a VxD can determine whether a VM is calling from V86 or protected mode:

;Determine the execution mode of the VM.

	test	[ebx.CB_VM_Status],VMStat_PM_Exec
	jz	SHORT API_VM_In_V86
	test	[ebx.CB_VM_Status],VMStat_PM_Use32
	jz	SHORT API_VM_In_PM16

API_VM_In_PM32:
	Debug_Out "VM calling from 32-bit protected mode."
	ret

API_VM_In_V86:
	Debug_Out "VM calling from V86 mode."
	ret

API_VM_In_PM16:
	Debug_Out "VM calling from 16-bit protected mode."
	ret

Note: In Windows 3.x, calling VxD procedures through VxD API calls from 32-bit code segments in the System VM can cause unexpected results when the offset of the return address of the calling routine is greater than 0xFFFF. The problem lies in the way that VMM determines the "32-bitness" of the calling application. The System VM is flagged for 16-bit protected-mode operation, because KRNL386.EXE is responsible for the switch to protected mode when the Windows GUI is started. When 32-bit segments are allocated within the System VM and code within these segments calls VxD APIs, VMM still determines that the calling application is 16-bit because of the VM flags. The return address is assumed to be 16 bits and is truncated. This is also a problem for protected-mode software interrupts hooked by a VxD. The only current work-around is to guarantee that the code calling the VxD has a return address with an offset no greater than 0xFFFF.

Callbacks and Hooking Existing DOS Devices

Callbacks are used indirectly when defining a VxD API. However, a VxD can also allocate a callback entry point that, when called by a VM, switches control to the associated callback procedure in the VxD.

Callbacks can be used to simulate DOS devices that return a pointer to a jump table by allocating a global V86 table and stuffing the address of the callback allocated using the Allocate_V86_Call_Back service into this table. A segment and offset are returned that direct any calls to this routine to the VxD's callback procedure. The CRS reflects the current state of the VM when the callback entry point was called by the VM. A VxD can also provide a "chaining" interface to hooked software interrupts by using these services.

A VxD with "carnal" knowledge of a DOS device driver can intercept calls to this device by using the Install_V86_Break_Point service. This service patches the memory at the requested address with a call to the break point. When the break point is executed, the VxD can process the VM request as necessary and then return control by "bumping" the IP to the next instruction or by using Simulate_Far_Jmp to move the Client_CS:Client_IP to the correct address.

Chapter 5

Nested Execution

Simulating Software Interrupts
Calling Windows Functions from a VxD
Calling Code in a TSR at Ring 0

The nested execution services of VMM provide a controlled environment in which a VxD can cause a redirection of the execution path in a VM. A VxD saves the client registers, begins a nested execution block forcing a VM into V86 or protected mode, calls the necessary services to set up stack frames, and then resumes the VM execution. When the VM returns, the nested execution block is ended and the client registers are restored. Using this technique, a VxD can force the execution of code in TSRs, DOS applications, and even Windows procedures.

When calling routines in a VM other than the current VM, you may need to schedule a VM event to force a specific VM to become active. You may also need to determine the execution status of the VM and wait for critical sections to be completed, interrupts to be enabled, and so on. In these cases, you can use the Call_Priority_VM_Event service and begin the nested execution when the event is processed.

Simulating Software Interrupts

As demonstrated in Chapter 3, a VxD can simulate software interrupts to a VM using the Simulate_Int or Exec_Int services. Simulated interrupts are subject to being trapped by other VxDs and will respond exactly as if a VM executed the software interrupt in application code. Additionally, a VxD that has hooked a protected-mode interrupt can adjust the caller's stack to "eat the interrupt" in protected mode by using a non-nested Simulate_Iret and then reflect it to V86 mode by using nested execution services.

Note that when a VxD simulates calls to a VM and the execution has returned to the VxD, the VxD must copy the results from the CRS before restoring the client's state:

;Simulate a software interrupt to the current VM

	Push_Client_State
	VMMCall Begin_Nest_V86_Exec
	mov     [ebp.Client_AX], 4257h  ; specific function
	mov     [ebp.Client_BX], 4C57h  ; subfunction
	mov     eax, 60h
	VMMCall Simulate_Int
	VMMCall Resume_Exec
	VMMCall End_Nest_Exec
	movzx   eax,[ebp.Client_AX]     ; get return value
	Pop_Client_State

What magic occurs in this code that allows a VxD to simulate an interrupt call in a VM? The Push_Client_State macro allocates space on the stack and copies the current CRS to this block. Begin_Nest_V86_Exec modifies the VM state so that the execution block occurs in V86 mode. Simulate_Int builds an IRET frame and modifies the client's stack and CS:(E)IP to call the interrupt handler. Resume_Exec forces VMM to complete event processing and then resumes the execution of the VM. When the VM completes the execution block, control returns to the VxD and the End_Nest_Exec restores the VM's execution state. The Pop_Client_State macro restores the client's registers, as saved on the stack.
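
The ordering constraint (read the result before the saved state is restored) is easy to get wrong, so here is a minimal C model of the sequence. It is only a sketch: client_state is a reduced stand-in for the CRS, demo_int60 is an invented handler, and the function calls stand in for the VMM services named in the comments.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t ax, bx; } client_state;   /* hypothetical, reduced CRS */
typedef void (*int_handler)(client_state *);

/* Stand-in for the VM's Int 60h handler; its result is made up for the demo. */
void demo_int60(client_state *crs)
{
    crs->ax = (uint16_t)(crs->ax ^ crs->bx);
}

/* Run handler as a nested call and return its AX result, leaving *crs unchanged. */
uint16_t nested_call(client_state *crs, int_handler handler,
                     uint16_t ax, uint16_t bx)
{
    assert(crs != 0 && handler != 0);
    client_state saved = *crs;       /* Push_Client_State */
    crs->ax = ax;                    /* set up the parameters for the call */
    crs->bx = bx;
    handler(crs);                    /* Simulate_Int followed by Resume_Exec */
    uint16_t result = crs->ax;       /* copy the result BEFORE restoring */
    *crs = saved;                    /* Pop_Client_State */
    return result;
}
```

If the result were read after the restore, the caller would see the VM's original AX rather than the value returned by the interrupt handler.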

Calling Windows Functions from a VxD

The techniques used to simulate software interrupts to a VM can be extended to call functions in the System VM. There are a few restrictions when calling Windows functions or functions provided by Windows DLLs:

The function must be able to handle reentrancy. Many Windows functions are not reentrant. PostMessage() and its derivatives are safe, as are a few other Windows multimedia services.
The code segment of the function must be present. The Windows Kernel does not support not-present segment faults when reentered. Because a VxD cannot determine when the Windows Kernel is executing code, the segment must always be present (or non-discardable).
If DOS or BIOS is used for paging, the function code must be page-locked in memory. Because DOS and BIOS are not reentrant, a page-fault cannot be resolved if DOS or BIOS code is currently executing in any VM.

The safest segmentation for a function called by a VxD is in a FIXED code segment of a DLL. Calling application code is dangerous and is not recommended.

To call Windows functions, you must use a helper application or DLL to provide the procedure address to the VxD. The VxD can then use the nested execution services to simulate a far call to the procedure in the System VM. If a VM context switch is required (if the current VM is other than the System VM), the VxD must schedule a VM event to call the procedure. The following code sample calls the Windows PostMessage() function from a VxD assuming the PostMessage function pointer was obtained from the application or DLL:

;VSIMPLED_NotifyApp
;
;This routine notifies the Windows application through a
;call to the PostMessage() API.
;ENTRY:
;    EDX:contains the lParam of the message
;USES:
;    FLAGS

BeginProc VSIMPLED_NotifyApp, High_Freq
	VMMCall Test_Sys_VM_Handle
	je      SHORT VSIMPLED_PostEvent
NA_Schedule:
	push    ebx
	mov     eax, High_Pri_Device_Boost
	VMMCall Get_Sys_VM_Handle
	mov     ecx, PEF_Wait_For_STI OR PEF_Wait_Not_Crit
	mov     esi, OFFSET32 VSIMPLED_PostEvent
	xor     edi, edi
	VMMCall Call_Priority_VM_Event
	pop      ebx
	ret
EndProc VSIMPLED_NotifyApp

;VSIMPLED_PostEvent
;
;Called by the priority VM event dispatch routine or
;directly if System VM was already active.
;
;ENTRY:
;    EBX: The system VM handle
;    EBP: Client register structure
;    EDX: Reference data
;USES:
;    EAX, EDX, FLAGS

BeginProc VSIMPLED_PostEvent
	Trace_Out "In VSIMPLED_PostEvent"
	cmp     lpPostMessage, 0            ; Q: ptr == NULL?
	je      SHORT PE_Exit               ; Y: can't call
	Push_Client_State
	VMMCall Begin_Nest_Exec
	mov     ax, NotifyWnd               ; handle to window
	VMMCall Simulate_Push
	mov     ax, NotifyMsg               ; notification msg
	VMMCall Simulate_Push
	xor     ax, ax
	VMMCall Simulate_Push               ; wParam is NULL
	mov     eax, edx
	shr     eax, 16
	VMMCall Simulate_Push               ; lParam is ref data
	mov     eax, edx
	VMMCall Simulate_Push
	movzx   edx, WORD PTR [lpPostMessage]
	mov     cx, WORD PTR [lpPostMessage + 2]
	VMMCall Simulate_Far_Call           ; call PostMessage()
	VMMCall Resume_Exec
	VMMCall End_Nest_Exec
	Pop_Client_State
PE_Exit:
	ret
EndProc VSIMPLED_PostEvent
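
The sequence of Simulate_Push calls above builds a 16-bit Pascal-convention argument frame: parameters are pushed left to right, and the 32-bit lParam goes on as its high word followed by its low word. The following C model of that frame is a sketch only; the stack16 type and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint16_t words[STACK_WORDS];
    int sp;                          /* index of the top word; grows downward */
} stack16;

/* Push one 16-bit word, as Simulate_Push does on the client's stack. */
void push16(stack16 *s, uint16_t w)
{
    assert(s->sp > 0);               /* no room left would be a caller error */
    s->words[--s->sp] = w;
}

/* Push the arguments of PostMessage(hWnd, msg, wParam, lParam) left to right. */
void push_postmessage_args(stack16 *s, uint16_t hwnd, uint16_t msg,
                           uint16_t wparam, uint32_t lparam)
{
    push16(s, hwnd);
    push16(s, msg);
    push16(s, wparam);
    push16(s, (uint16_t)(lparam >> 16));      /* high word first */
    push16(s, (uint16_t)(lparam & 0xFFFFu));  /* low word ends up on top */
}
```

Because the low word is pushed last, the two lParam words sit in memory in little-endian order and the callee can read them as a single 32-bit value.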

Calling Code in a TSR at Ring 0

In Windows 3.1, the VPICD added services that allow a Windows driver to provide interrupt service routines callable at ring 0. This allows a Windows device driver to provide a common code base for hardware interrupt servicing. This technique can be implemented by other VxDs to call routines in a VM directly from ring 0, as shown in Figure 5.1.

Figure 5.1: Possible design of calling a TSR directly (at ring 0) from a VxD

The technique to call TSR code from ring 0 is actually quite simple. A VxD provides an API that allows a V86 or PM application to register a procedure as a "direct" callback procedure. Ring 0 16-bit GDT selectors are built to access code and data of the callback procedure. When the required event occurs, the VxD calls the callback procedure by setting up a far return frame, including a 32-bit flat far return address to a return-to-flat procedure and a 16:16 far return address to a return-from-16 procedure in the VxD. The VxD then performs a far return, kicking out to the 16-bit code in the TSR. When the TSR has completed processing, its far return kicks back to the return-from-16 procedure in the VxD. The last remaining step is to return to the 32-bit flat model by using a final far return to the return-to-flat procedure.

This method makes some assumptions of the way TSRs are loaded in the system:

The TSR is loaded before Windows is started and is therefore global to all VMs.
The GDT selectors are based on the low linear address of the TSR. Because the TSR is global in all VMs, this mapping must remain constant in all page tables.
If the code was specific to a VM, a priority VM event would be required to make the VM active before calling the code directly at ring 0.
Using this scheme, the stack is provided by VMM and is a Use32 segment. Stack parameter passing is not valid unless the TSR uses 32-bit references to the stack (ESP and EBP). The TSR code should not attempt to change SS.

The following code fragments demonstrate the technique of calling TSR code (16-bit code) at ring 0. In Sys_Critical_Init the GDT selectors used for the call to the TSR are allocated. For this sample, a global timeout is used to initiate the calls to the TSR.

;VCALLTSR_Sys_Critical_Init

;DESCRIPTION:
;    Allocates necessary GDT selectors.
;ENTRY:
;    EBX = handle to Sys_VM
;    EDX = reference data from real-mode init
;EXIT:
;    Carry clear if no error, otherwise set if failure.
;USES:
;    Flags

BeginProc VCALLTSR_Sys_Critical_Init
	Trace_Out "VCALLTSR: Sys_Critical_Init"
	pushad
; Note:
; An assumption is made that CS:0 is the base of the TSR.
; Since we don't have a segment size, we'll assume 1 page,
; but this could be handled by using a pointer to a structure
; within the TSR obtained from Exec_Int instead of using
; Real_Mode_Init to gather the information.
	mov     eax, edx
	movzx   edx, ax
	mov     dwTSR_Ring0_EIP, edx
	shr     eax, 16
	shl     eax, 4
	push    eax                          ; save address
	VMMCall _BuildDescriptorDWORDs, < eax, <P_SIZE - 1>,\
			<Code_Type + D_DPL0>,\
			D_DEF16,\
			BDDExplicitDPL >
	VMMCall _Allocate_GDT_Selector, < edx, eax, 0 >
	or      eax, eax
	jnz     SHORT SCI_GotCSSel
	pop     eax
	jmp     SHORT SCI_Failure
SCI_GotCSSel:
	mov     dwTSR_Ring0_CS, eax
	pop     eax                          ; restore address
	VMMCall _BuildDescriptorDWORDs, < eax, <P_SIZE - 1>,\
			<RW_Data_Type + D_DPL0>,\
			D_DEF16,\
			BDDExplicitDPL >
	VMMCall _Allocate_GDT_Selector, < edx, eax, 0 >
	or      eax, eax
	jz      SHORT SCI_Failure
	mov     dwTSR_Ring0_DS, eax
	VMMCall _BuildDescriptorDWORDs, < <OFFSET32 VCT_Switch>,\
			VCT_Switch_Size,\
			<Code_Type + D_DPL0>,\
			D_DEF32,\
			BDDExplicitDPL >
	VMMCall _Allocate_GDT_Selector, < edx, eax, 0 >
	or      eax, eax
	jz      SHORT SCI_Failure
	mov     wTSR_Switch_To_Flat_CS, ax
	mov     eax, 500                     ; 500 ms timeout
	xor     edx, edx                     ; no data
	mov     esi, OFFSET32 VCALLTSR_TimeOut
	VMMCall Set_Global_Time_Out
	mov     hTimeout, esi
        popad
	clc
	ret
SCI_Failure:
; Free any allocated selectors and exit
	mov     eax, dwTSR_Ring0_CS
	or      eax, eax
	jz      SHORT SCI_Failure_TryDS
	VMMCall _Free_GDT_Selector, <eax, 0>
SCI_Failure_TryDS:
	mov     eax, dwTSR_Ring0_DS
	or      eax, eax
	jz      SHORT SCI_Failure_TryFlat
	VMMCall _Free_GDT_Selector, <eax, 0>
SCI_Failure_TryFlat:
	movzx   eax, wTSR_Switch_To_Flat_CS
	or      eax, eax
	jz      SHORT SCI_Failure_Exit
	VMMCall _Free_GDT_Selector, <eax, 0>
SCI_Failure_Exit:
        popad
        stc
        ret
EndProc VCALLTSR_Sys_Critical_Init

When the timeout procedure is called, the stack frames are created to call the TSR code directly. When the TSR returns, the VxD unwinds the stack to get back to the 32-bit flat model:

;VCALLTSR_TimeOut

;DESCRIPTION:
;    Event handler for global timeout. Calls TSR code directly
;    from ring 0.
;ENTRY:
;    EBX = Current VM handle
;    ECX = additional ms since timeout
;    EDX = reference data
;    EBP = &CRS
;EXIT:
;    Reschedules time-out.
;USES:
;    All registers.

BeginProc VCALLTSR_TimeOut
	pushad
	mov     hTimeout, 0                         ; clear handle
	Trace_Out "Setting up stack frames to call TSR."
; This stack frame is so we can get back to flat model.
	push    cs                                  ; save CS
	mov     eax, OFFSET32 VCALLTSR_Back_To_Flat
	push    eax                                 ; save EIP
; This stack frame will get us back to 32-bit code in
; the VxD and is addressable via 16:16 for the TSR.
	push    ds                                  ; save off DS
	push    dwTSR_RETF_From_16
; This is the stack frame used to get us to the TSR
; code. Additionally, DS is setup with a R/W pointer
; to the same base address.
        mov     eax, dwTSR_Ring0_DS
        mov     ds, ax
        push    cs:dwTSR_Ring0_CS
        push    cs:dwTSR_Ring0_EIP
        retf                                        ; go to the TSR
VCT_Switch:
	pop     ds                                  ; restore DS
	retf                                        ; return to flat

VCT_Switch_Size equ ($ - VCT_Switch) - 1

VCALLTSR_Back_To_Flat:
        Trace_Out "Back in flat model. Return from TSR"
; Reschedule time out event
	mov     eax, 500                            ; 500 ms timeout
	xor     edx, edx                            ; no data
	mov     esi, OFFSET32 VCALLTSR_TimeOut
	VMMCall Set_Global_Time_Out
	mov     hTimeout, esi
        popad
	ret
EndProc VCALLTSR_TimeOut

Chapter 6

I/O Trapping

Trapping and Dispatching I/O
Device Contention Management
Simulating Hardware

I/O protection is a powerful feature provided by the 80386 and 80486 processors. When the Current Privilege Level (CPL) is less than or equal to the I/O privilege level (IOPL), the following instructions can be executed:

in, out, ins, outs
cli, sti, [interrupt flag modifying] popf

If CPL is less than or equal to IOPL in protected mode, the processor allows the I/O operation to proceed. If CPL is greater than IOPL or if the processor is operating in virtual 8086 mode, the I/O permissions bitmap (IOPM) is used to determine whether access to the port is allowed. Because MS-DOS VMs run in virtual 8086 mode and a Windows application has a CPL of 3 (for Windows 3.1) and IOPL is 0, the I/O permissions bitmap is always used in these cases to determine whether access to the port is valid.
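
The decision described above can be summarized as a small C predicate. This is a sketch of the rule as stated in the text, not VMM code: the function name and parameters are hypothetical, a set bit in the bitmap is taken to mean "trap this port", and ports beyond the end of the bitmap are treated as trapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if an I/O access of `width` bytes at `port` proceeds without
   a fault; returns false if it would fault and be reported to the handler. */
bool io_allowed(unsigned cpl, unsigned iopl, bool v86,
                const uint8_t *iopm, size_t iopm_bytes,
                uint16_t port, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4);
    if (!v86 && cpl <= iopl)
        return true;                      /* privileged enough: bitmap not consulted */
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;  /* every byte of the access is checked */
        if (p / 8 >= iopm_bytes)
            return false;                 /* beyond the bitmap: treated as trapped */
        if (iopm[p / 8] & (1u << (p % 8)))
            return false;                 /* bit set: the access faults */
    }
    return true;
}
```

Note that a word or doubleword access faults if any one of the ports it covers is trapped, so a VxD that traps a single byte-wide port also sees wider accesses that overlap it.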

VMM keeps a copy of the IOPM for each VM (it is associated with the TSS and other task information). VxDs can enable or disable access to ports by modifying the IOPM using VMM services. Also, it is possible to trap ports in one VM and allow access to the hardware directly in another VM.

The Install_IO_Handler and Install_Mult_IO_Handlers services install handlers that are called when the GP fault handler has determined that I/O to the associated port has caused the fault. VMM provides the Enable_Local_Trapping, Enable_Global_Trapping, Disable_Local_Trapping, and Disable_Global_Trapping services to modify the IOPM of virtual machines to enable and disable access to the I/O ports.

I/O trapping is the primary method used to manage device contention. By allowing only one VM access to a hardware device address space, the VxD can manage accesses by other VMs. For cases of contention, a VxD can simulate the device I/O and submit the actual hardware request when the hardware is free; ignore the hardware access and return as though the hardware did not exist; or crash the VM attempting to access the hardware.

A VxD can simulate hardware that does not exist by virtualizing the device using a finite state machine (or other similar method) and returning the appropriate information to the requesting application.

Trapping and Dispatching I/O

To trap I/O addresses, a VxD uses the Install_IO_Handler or Install_Mult_IO_Handlers services of VMM. These services are only available during device initialization.

These services associate a callback (or table of callbacks) with an I/O port (or table of I/O ports). By default, global trapping is enabled: any access to the trapped ports causes a fault, and the associated callback procedure is called.

An I/O table has the following format:

VxD_IDATA_SEG

Begin_VxD_IO_Table VTRAPIOD_Port_Table

	VxD_IO  TRAP_IO_IDX, VTRAPIOD_IO_Index_Reg
	VxD_IO  TRAP_IO_DATA, VTRAPIOD_IO_Data_Reg

End_VxD_IO_Table VTRAPIOD_Port_Table

VTRAPIOD_Port_Table_Entries equ (($-VTRAPIOD_Port_Table)-\
    (SIZE VxD_IOT_Hdr)) / (SIZE VxD_IO_Struc)

VxD_IDATA_ENDS

This table uses offsets from the base I/O address as the port address. When the base address of the hardware has been determined, the VxD can update the I/O table and install the handlers:

;VTRAPIOD_Device_Init
;
;DESCRIPTION:
;    Non critical system initialization procedure.
;ENTRY:
;    EBX = Sys VM Handle
;EXIT:
;    CLC if everything's A-OK, otherwise STC
;USES:
;    Flags.
BeginProc VTRAPIOD_Device_Init
        Trace_Out "VTRAPIOD: Device_Init"
        pushad
; Build an I/O port table for Install_Mult_IO_Handlers
; using the base address.
        mov     ecx, VTRAPIOD_Port_Table_Entries
        mov     esi, OFFSET32 VTRAPIOD_Port_Table
	mov     edx, VTRAPIOD_Base_IO
DI_Install_IO_Handlers:
	mov     edi, esi                 ; save a copy in EDI
	add     esi, (size VxD_IOT_Hdr)
DI_Bump_IO_Loop:
        add     [esi.VxD_IO_Port], dx    ; add port base to offset
	add     esi, (size VxD_IO_Struc)
	loop    DI_Bump_IO_Loop
; Tell VMM to trap ports.
        VMMcall Install_Mult_IO_Handlers
ifdef DEBUG
        jnc     SHORT DI_Exit
	Debug_Out "VTRAPIOD: cannot trap ports!!"
endif
DI_Exit:
        popad
	ret
EndProc VTRAPIOD_Device_Init

When an I/O port within the given range has been accessed, the fault handler dispatches to the associated I/O handler. For this example, the index register simply stores the index if valid (on write) or returns the current index (on read):

;VTRAPIOD_IO_Index_Reg
;
;DESCRIPTION:
;    Handles IO trapping.
;    This is a virtual R/W index register.
;ENTRY:
;    EBX = VM Handle.
;    ECX = Type of I/O
;    EDX = Port number
;    EBP = Pointer to client register structure
;EXIT:
;    EAX = data input or output depending on type of I/O
;USES:
;    FLAGS

BeginProc VTRAPIOD_IO_Index_Reg, High_Freq
	Dispatch_Byte_IO Fall_Through, <SHORT IIR_Out>
	mov     al, bIndex
	clc
	ret
IIR_Out:
	cmp     al, VTRAPIOD_Max_Index
	ja      SHORT IIR_Exit
	mov     bIndex, al
IIR_Exit:
	clc
	ret
EndProc VTRAPIOD_IO_Index_Reg
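
To make the index/data pairing concrete, the following C model shows one way the two virtual registers might fit together. It is a sketch under assumptions: the data register handler is not shown in the listing above, so its behavior here (an array of byte registers selected by the index) is invented for illustration, as are the names and the maximum index.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INDEX 3u                     /* assumed highest valid index */

typedef struct {
    uint8_t index;                       /* current index (bIndex in the listing) */
    uint8_t data[MAX_INDEX + 1];         /* hypothetical indexed data registers */
} virt_dev;

typedef enum { IO_READ, IO_WRITE } io_dir;

/* Index register: reads return the current index; writes store it if valid. */
uint8_t index_reg_io(virt_dev *d, io_dir dir, uint8_t value)
{
    assert(d != 0);
    if (dir == IO_READ)
        return d->index;
    if (value <= MAX_INDEX)
        d->index = value;                /* out-of-range writes are ignored */
    return value;
}

/* Data register: reads and writes go to the register selected by the index. */
uint8_t data_reg_io(virt_dev *d, io_dir dir, uint8_t value)
{
    assert(d != 0 && d->index <= MAX_INDEX);
    if (dir == IO_READ)
        return d->data[d->index];
    d->data[d->index] = value;
    return value;
}
```

Because the index is a single piece of shared state, two callers that interleave index writes and data accesses can end up reading or writing the wrong data register.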

The one drawback with this simple I/O trapping interface is that there is a single global virtual device. Multiple VMs can simultaneously (well, almost simultaneously) access this device and may inadvertently affect the processing of another VM by switching the index register while a different VM is updating an indexed data register. This is commonly referred to as device contention, and this VxD must be improved to properly handle contention between VMs. The next section discusses this topic in greater detail.

Note: The VTRAPIOD sample in the ASM\VTRAPIOD directory of the enclosed diskette demonstrates I/O trapping and dispatching techniques.

Device Contention Management

When multiple virtual machines attempt to access the same hardware interface and device contention is not handled by a VxD, the VMs probably interact with the hardware in such a way that all the hardware sees is gibberish.

To avoid these problems, a VxD implements one of the following methods of contention management:

A VxD can completely virtualize the hardware interface, buffer the requests, and submit them when the hardware is free.
A VxD can allow only one VM to access the hardware at a time. The hardware will not be visible to other VMs until the hardware is released by the owner.
The VM can be terminated for attempting to access the hardware. (Not the most user-friendly or recommended method.)

The most commonly used method is to allow only one VM to access the hardware at a time. Other VMs cannot access the hardware until it has been released by the owner.

To implement this form of device contention, all I/O ports for the hardware device are trapped. When a VM accesses a trapped port, the handler routine checks to see whether the device has been assigned to a VM. If a contention is detected, the VxD may display a warning message using the Shell VxD's API and then return with carry set for all reads and writes to the hardware. If there is no current owner, the VxD assigns the device to the VM and disables the I/O trapping for the VM using the Disable_Local_Trapping service. When the VM terminates or when the hardware is explicitly released by the VM, the VxD re-enables the trapping for the VM, using the Enable_Local_Trapping service, and clears the owner status of the hardware.

The following sample code is contention management in its simplest form:

;VCONTEND_Check_Owner
;
;DESCRIPTION:
;    Checks the current VM owner; if none, assigns
;    device to VM. If the VM is the owning VM, returns
;    carry clear, otherwise it returns carry set.
;ENTRY:
;    EBX = VM Handle.
;EXIT:
;    CLC if owner OK, or STC if contention
;USES:
;    FLAGS

BeginProc VCONTEND_Check_Owner, High_Freq
	push    eax
	mov     eax, hOwnerVM
	or      eax, eax
	jz      SHORT CO_Assign_To_VM
	cmp     eax, ebx
	jne     SHORT CO_Failure
CO_Success:
	pop   eax
	clc
	ret
CO_Assign_To_VM:
	mov      hOwnerVM, ebx
	jmp      SHORT CO_Success
CO_Failure:
	pop      eax
	stc
	ret
EndProc VCONTEND_Check_Owner


;VCONTEND_IO_Index_Reg
;
;DESCRIPTION:
;    Handles IO trapping.
;    This is a virtual R/W index register.
;ENTRY:
;    EBX = VM Handle.
;    ECX = Type of I/O
;    EDX = Port number
;    EBP = Pointer to client register structure
;EXIT:
;    EAX = data input or output depending on type of I/O
;USES:
;    FLAGS

BeginProc VCONTEND_IO_Index_Reg, High_Freq
	call    VCONTEND_Check_Owner
	jc      SHORT IIR_Exit
	Dispatch_Byte_IO Fall_Through, <SHORT IIR_Out>
	mov     al, bIndex
	clc
	ret
IIR_Out:
	cmp      al, VCONTEND_Max_Index
	ja       SHORT IIR_Exit
	mov      bIndex, al
	clc
IIR_Exit:
	ret
EndProc VCONTEND_IO_Index_Reg

Note that with this method of contention management, the hardware remains in the state the last owning VM left it in. You may decide to define an initial state for a VM in the VM control block and update the state when the VM releases the hardware. When a VM acquires the hardware, the state would be copied from the VM's control block to the hardware.
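The save-and-restore idea described above can be sketched in C as a small model. This is only an illustration under stated assumptions: the names VmState, Device, device_acquire, and device_release are hypothetical, the "hardware" is a plain array rather than real I/O ports, and a real VxD would keep the saved state in its VM control block area.

```c
#include <stddef.h>
#include <string.h>

#define VCONTEND_NUM_REGS 4   /* hypothetical number of device registers */

/* Hypothetical per-VM data, standing in for the VM control block area. */
typedef struct {
    unsigned char regs[VCONTEND_NUM_REGS];  /* saved device state for this VM */
} VmState;

/* Stand-in for the physical device and its current owner. */
typedef struct {
    unsigned char regs[VCONTEND_NUM_REGS];  /* "hardware" registers */
    VmState *owner;                         /* NULL when unowned */
} Device;

/* Returns 1 if vm owns the device after the call, 0 on contention. */
int device_acquire(Device *dev, VmState *vm)
{
    if (dev->owner == vm)
        return 1;                           /* already the owner */
    if (dev->owner != NULL)
        return 0;                           /* another VM owns the device */
    /* Load this VM's saved state into the hardware. */
    memcpy(dev->regs, vm->regs, sizeof dev->regs);
    dev->owner = vm;
    return 1;
}

/* Saves the hardware state back to the VM and clears ownership. */
void device_release(Device *dev, VmState *vm)
{
    if (dev->owner != vm)
        return;                             /* only the owner may release */
    memcpy(vm->regs, dev->regs, sizeof vm->regs);
    dev->owner = NULL;
}
```

With this arrangement, a VM that reacquires the device sees the register values it left behind rather than those of the previous owner.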

Simulating Hardware

As demonstrated in the preceding code fragments, it is possible to simulate (or virtualize) hardware through the use of trapped I/O interfaces. The Windows 3.1 Device Driver Kit contains sources to VxDs that simulate hardware such as the Virtual DMA Device and the Virtual COMM Device. You should investigate these sources for examples of more complex interfaces.

VxDs can use these techniques to translate common hardware interfaces to new or improved hardware interfaces and maintain the backward compatibility of the older platforms for MS-DOS applications.

To fully virtualize a hardware interface, your VxD may need to incorporate IRQ virtualization and/or DMA virtualization. These topics are covered in Chapters 7 and 8, respectively.

Note: The VCONTEND sample in the ASM\VCONTEND directory on the enclosed diskette demonstrates the virtualization of a simple hardware interface and manages contention between multiple virtual machines.

Chapter 7

IRQ Virtualization

Default VPICD Handling
IRQ Virtualization and Sharing
Dispatching IRQs to a VM
Servicing Interrupts in a VxD
Bimodal Interrupt Handlers

The Virtual Programmable Interrupt Controller Device (VPICD) provides an interface to hook (virtualize) IRQs, query information about the state of a hooked IRQ, simulate hardware interrupts to VMs, share interrupts, and handle interrupts in the System VM with a single ISR interface using the bimodal interrupt interface.

During initialization, the VPICD configures the PICs (slave and master), hooks the IDT entries, and establishes default handling for non-virtualized IRQs. The PICs are virtualized to all VMs. When a VM masks an interrupt, it is communicating with the VPICD and does not perform I/O directly to the PIC. VPICD provides services to affect the physical state of the PICs. It is strongly recommended that VxDs use this interface to change the physical state of a virtualized IRQ.

IRQ virtualization is recommended for hardware devices that use hardware interrupts as a form of communication with device drivers. There are several reasons for this recommendation:

IRQ virtualization is a requirement for proper device contention management.
Some devices require immediate interrupt servicing. The interrupt latency caused by non-virtualized interrupt handling in an ISR in either a TSR or a Windows device driver does not satisfy this requirement.
VPICD's default IRQ handling is sometimes inappropriate for devices that intend to be "Windows-GUI-only" oriented. IRQs that are unmasked prior to starting Windows are designated as "global". Global interrupts are not appropriate for this implementation.

The most common complaint about interrupt processing under Windows is the interrupt latency introduced by simulating interrupts to VMs. Additionally, you may be interested in monitoring interrupt response from a hardware device before simulating the interrupt to a VM. In these cases, IRQ virtualization is required.

Default VPICD Handling

Before discussing IRQ virtualization in detail, we need to explain the default operation of VPICD when an interrupt is not virtualized. By default, all IRQs are "virtualized" by VPICD. If the interrupt was unmasked prior to starting Win386 (or the special case of IRQ9), the default owner is global. Otherwise, no default owner exists.

The default hardware interrupt procedure (Hw_Int_Proc) simulates an interrupt to the current VM if the IRQ is unowned. When the IRQ is global, VPICD simulates the interrupt to the current critical section owner or, if there is no critical section owner, to the current VM. Also, interrupts simulated for global IRQs are nested in the VM until the nesting has been "unwound", but non-owned interrupts are always simulated to the current VM in all circumstances. When an interrupt is simulated to a VM (by a default IRQ handler or using the VPICD_Set_Int_Request service), the VM priority is boosted and the VM's IRET is hooked so that the IRET procedure is notified when the interrupt has been completed. These events only occur when the IRQ is not nested.

End-of-Interrupt results when the VM issues an EOI to the virtual PIC. The default EOI handler clears the virtual interrupt request and performs a physical EOI using the VPICD_Clear_Int_Request and VPICD_Phys_EOI services respectively.

By default each unowned or global interrupt procedure has a timeout of 500 ms. A VM timeout is scheduled to watch the interrupt processing time in a VM. If the ISR in the VM does not service the interrupt within the specified timeout period, VPICD continues execution as though the ISR had issued an IRET. The timeout is canceled when the VM issues an IRET (or the last IRET in a nested block).

VPICD simulates a level-triggered PIC. That is, when a virtual EOI occurs another interrupt will be simulated immediately unless the virtual interrupt request has been cleared by the VPICD_Clear_Int_Request service.
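The level-triggered behavior can be illustrated with a toy C model. It is not VPICD's implementation; VirtIrq and the three functions are hypothetical names that only mimic the rule stated above: after a virtual EOI, a request that is still set causes another simulated interrupt.

```c
#include <stdbool.h>

/* Toy model of one virtual IRQ line as seen by a VM. */
typedef struct {
    bool request;      /* virtual interrupt request is set */
    bool in_service;   /* the VM is running its ISR for this IRQ */
    int  simulated;    /* number of interrupts simulated so far */
} VirtIrq;

/* Simulate an interrupt if one is requested and none is in service. */
static void try_simulate(VirtIrq *irq)
{
    if (irq->request && !irq->in_service) {
        irq->in_service = true;
        irq->simulated++;
    }
}

/* Models setting the virtual interrupt request. */
void virt_set_request(VirtIrq *irq)
{
    irq->request = true;
    try_simulate(irq);
}

/* Models clearing the virtual interrupt request. */
void virt_clear_request(VirtIrq *irq)
{
    irq->request = false;
}

/* Models the VM issuing an EOI to the virtual PIC. */
void virt_eoi(VirtIrq *irq)
{
    irq->in_service = false;
    try_simulate(irq);   /* level-triggered: fires again if still requested */
}
```

In the model, an EOI handler that forgets to clear the request sees the interrupt simulated again on every EOI, which is the behavior the text warns about.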

IRQ Virtualization and Sharing

IRQ Virtualization

A VxD can change the default behavior of interrupt processing by virtualizing the IRQ using the VPICD_Virtualize_IRQ service. The VxD fills the following structure and calls this service to obtain an IRQ handle:

VPICD_IRQ_Descriptor STRUC
	VID_IRQ_Number          dw      ?
	VID_Options             dw      0
	VID_Hw_Int_Proc         dd      ?
	VID_Virt_Int_Proc       dd      0
	VID_EOI_Proc            dd      0
	VID_Mask_Change_Proc    dd      0
	VID_IRET_Proc           dd      0
	VID_IRET_Time_Out       dd      500
VPICD_IRQ_Descriptor ENDS

Some of the elements of this structure require further detail:

VID_Hw_Int_Proc contains a pointer to the procedure called when hardware interrupts occur for the specified IRQ. (Required)
VID_Virt_Int_Proc contains a pointer to the procedure called when interrupts are simulated to the VM for this IRQ. (Optional)
VID_EOI_Proc contains a pointer to the procedure called when the hardware interrupt service routine in the VM issues an EOI to the PIC. (Optional)
VID_Mask_Change_Proc contains a pointer to the procedure called when the VM changes the mask status of the IRQ on the PIC. (Optional)
VID_IRET_Proc contains a pointer to the procedure called when the VM IRETs (or the last IRET of a nested block) from the simulated interrupt. This procedure is also called when a timeout occurs while a VM is servicing an interrupt. (Optional)
VID_IRET_Time_Out is the timeout value for a VM to service an interrupt. When the timeout occurs, VPICD reacts as though the VM issued an IRET with the exception that the interrupt has not been physically serviced. (Optional, default is 500 ms)

A VxD must virtualize an interrupt during device initialization. It is recommended that the VxD virtualize the interrupt during Sys_Critical_Init if you are using IRQ 9 to avoid problems introduced when interrupts occur between the Sys_Critical_Init and Device_Init control messages.

The following sample code demonstrates the use of VPICD services to virtualize an IRQ:

;INIT DATA

VxD_IDATA_SEG
VIRQD_IRQ_Descriptor VPICD_IRQ_Descriptor <,,\
			OFFSET32 VIRQD_Hw_Int_Proc,,\
			OFFSET32 VIRQD_EOI_Proc,,,>
VxD_IDATA_ENDS

;INIT CODE

VxD_ICODE_SEG

;VIRQD_Device_Init
;
;DESCRIPTION:
;    Non critical system initialization procedure.
;ENTRY:
;    EBX = Sys VM handle
;EXIT:
;    CLC if everything's A-OK, otherwise STC
;USES:
;    Flags.

BeginProc VIRQD_Device_Init
        Trace_Out "VIRQD: Device_Init"
	push    eax
	push    edi
        mov     edi, OFFSET32 VIRQD_IRQ_Descriptor
        mov     [edi.VID_IRQ_Number], VIRQD_Interrupt
        VxDCall VPICD_Virtualize_IRQ
ifdef DEBUG
	jnc     SHORT @F
	Debug_Out "VIRQD: Unable to virtualize IRQ"
	jmp     SHORT DI_Exit
@@:
else
	jc      SHORT DI_Exit
endif
	mov     hVirtIRQ, eax           ; save the IRQ handle
DI_Exit:
	pop     edi
	pop     eax
	ret
EndProc VIRQD_Device_Init
VxD_ICODE_ENDS

When the hardware interrupt occurs, the following procedures simulate the interrupt to the current VM and clear the interrupt when the ISR issues an EOI to the virtual PIC:

;==================================
;  HARDWARE INTERRUPT PROCEDURES
;==================================

VxD_LOCKED_CODE_SEG

;VIRQD_Hw_Int_Proc
;
;DESCRIPTION:
;    Hardware interrupt handler. Called by VPICD.
;ENTRY:
;    EAX = IRQ handle
;    EBX = current VM handle
;EXIT:
;    CLC if processed, STC otherwise.
;USES:
;    Flags.

BeginProc VIRQD_Hw_Int_Proc, High_Freq
	Trace_Out "<i"
        VxDCall VPICD_Set_Int_Request
        clc
        ret
EndProc VIRQD_Hw_Int_Proc

;VIRQD_EOI_Proc
;
;DESCRIPTION:
;    EOI handler. Called by VPICD when the ISR in the
;    VM issues an EOI to the virtual PIC.
;ENTRY:
;    EAX = IRQ handle
;    EBX = current VM handle
;EXIT:
;    Nothing.
;USES:
;    Nothing.

BeginProc VIRQD_EOI_Proc, High_Freq
	Trace_Out "i>"
	VxDCall VPICD_Clear_Int_Request
	VxDCall VPICD_Phys_EOI
	ret
EndProc VIRQD_EOI_Proc

VxD_LOCKED_CODE_ENDS

Note that services called during the processing of the Hw_Int_Proc procedure must be declared asynchronous (see Chapter 2 for a complete list of asynchronous services). If a VxD requires the use of a non-asynchronous service to continue interrupt processing, the VxD must schedule a global event to continue. The debug version of WIN386.EXE notifies you when you attempt to call a non-asynchronous service during interrupt processing. Heed the warnings of VMM, lest your ignorance cause the system to crash.

Shared IRQ Procedures

If the hardware platform supports shared interrupts (Micro Channel Architecture) or the device is using an ISA shared interrupt strategy, the IRQ can be virtualized by specifying the VPICD_Opt_Can_Share flag in the VID_Options element of the VPICD_IRQ_Descriptor structure. When the hardware interrupt is dispatched to the Hw_Int_Proc, the VxD should determine whether the interrupt was generated by the associated hardware device and, if so, process the interrupt and return with carry clear. If the interrupt was not generated by the supported hardware, the VxD should return immediately with carry set. VPICD will continue to walk the shared interrupt list until a VxD responds with carry clear.
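The walk over the shared interrupt list can be pictured with a short C model. It follows the "CLC if processed, STC otherwise" convention used in the sample procedure headers, with a boolean return value standing in for the carry flag. The types and function names (SharedHandler, dispatch_shared_irq, ToyDevice) are hypothetical and do not correspond to VPICD internals.

```c
#include <stdbool.h>
#include <stddef.h>

/* A handler returns true for "processed" (carry clear)
   and false for "not mine" (carry set). */
typedef bool (*HwIntProc)(void *device);

typedef struct SharedHandler {
    HwIntProc proc;
    void *device;
    struct SharedHandler *next;
} SharedHandler;

/* Walks the list until a handler claims the interrupt.
   Returns the claiming handler, or NULL if nobody claimed it. */
const SharedHandler *dispatch_shared_irq(const SharedHandler *list)
{
    for (const SharedHandler *h = list; h != NULL; h = h->next) {
        if (h->proc(h->device))
            return h;
    }
    return NULL;
}

/* Example device: 'pending' plays the role of a status register bit. */
typedef struct {
    bool pending;
    int  serviced;
} ToyDevice;

bool toy_hw_int_proc(void *device)
{
    ToyDevice *d = device;
    if (!d->pending)
        return false;      /* not our interrupt: return at once */
    d->pending = false;    /* acknowledge and service the device */
    d->serviced++;
    return true;
}
```

The important design point is that each handler checks its own device before doing any work, so an interrupt raised by another device on the same line costs only a quick status test.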

Note that the VxD cannot assume that subsequent calls to other callback procedures specified in the IRQ descriptor structure are the result of an interrupt for the associated hardware device. The VxD should set a flag when it has simulated an interrupt to a VM and test against this flag when notifications from VPICD are processed. When the VxD processes the EOI_Proc it should clear the flag, perform the necessary EOI procedures, and then return.

Dispatching IRQs to a VM

The preceding example demonstrates a very simple IRQ virtualization. The VIRQD_Hw_Int_Proc simply sets the interrupt request for the current VM and returns. When the ISR performs an EOI to the PIC, the VIRQD_EOI_Proc clears the interrupt request and performs a physical EOI.

When a VxD requests an interrupt for a VM using the VPICD_Set_Int_Request service, the interrupt simulation may not occur immediately. There are several conditions that do not allow an interrupt to be simulated immediately:

Interrupts are disabled in the VM.
The virtual IRQ is masked in the VM.
A higher priority virtual IRQ is already in service.
The virtual machine is suspended.

In these cases, the interrupt is simulated as soon as the conditions are met.

Note that using VPICD_Set_Int_Request does not guarantee that an interrupt will be simulated to a VM. For example, if a VM has masked and never unmasks the IRQ, the interrupt will not be simulated. Additionally, a call to VPICD_Clear_Int_Request before the interrupt has been simulated prevents the VM from receiving the interrupt.
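The four conditions listed above can be combined into a single predicate. The following C sketch is a simplified model, assuming a single 8-line PIC in which a lower IRQ number means a higher priority; VmPicState and can_simulate_now are hypothetical names, not VPICD data structures.

```c
#include <stdbool.h>

/* Simplified view of the virtual PIC state kept for one VM. */
typedef struct {
    bool     interrupts_enabled;  /* VM's virtual interrupt flag */
    bool     suspended;           /* VM is suspended */
    unsigned mask;                /* bit n set: virtual IRQ n is masked */
    unsigned in_service;          /* bit n set: virtual IRQ n is in service */
    unsigned requested;           /* bit n set: interrupt request is set */
} VmPicState;

/* True if the requested interrupt can be simulated right away.
   irq must be in the range 0..7 for this single-PIC model. */
bool can_simulate_now(const VmPicState *vm, unsigned irq)
{
    unsigned bit = 1u << irq;
    unsigned higher = bit - 1u;   /* lines 0..irq-1 have higher priority */

    if (!(vm->requested & bit))
        return false;             /* nothing requested (or request cleared) */
    if (vm->suspended)
        return false;
    if (!vm->interrupts_enabled)
        return false;
    if (vm->mask & bit)
        return false;
    if (vm->in_service & higher)
        return false;             /* higher-priority IRQ still in service */
    return true;
}
```

A request that fails the test stays pending in the model; it is delivered only if the blocking condition goes away before the request is cleared.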

The example also does not demonstrate proper techniques when processing hardware interrupts for device contention management. The VIRQD_Hw_Int_Proc should be expanded to first determine whether an owner VM exists and then simulate the interrupt to that VM, as follows:

;VIRQD_Hw_Int_Proc
;
;DESCRIPTION:
;    Hardware interrupt handler. Called by VPICD.
;    Simulates the interrupt to the hardware owner or
;    to the current VM if unowned.
;ENTRY:
;    EAX = IRQ handle
;    EBX = current VM handle
;EXIT:
;    CLC if processed, STC otherwise.
;USES:
;    EBX, Flags.

BeginProc VIRQD_Hw_Int_Proc, High_Freq
        Trace_Out "<i"
	cmp     hOwnerVM, 0
	je      SHORT HIP_SetIt
	mov     ebx, hOwnerVM
HIP_SetIt:
        VxDCall VPICD_Set_Int_Request
        clc
        ret
EndProc VIRQD_Hw_Int_Proc

Servicing Interrupts in a VxD

To reduce the interrupt latency incurred when a hardware device is serviced by ISR code in a VM, a VxD can service interrupts directly during processing of the Hw_Int_Proc procedure. In cases where a steady stream of data is processed, the VxD should buffer the information from the hardware device and provide the information to the owning VM in chunks.

A Hw_Int_Proc for servicing an interrupt directly might be similar to this:

;VIRQD_Hw_Int_Proc
;
;DESCRIPTION:
;    Hardware interrupt handler. First, EOI the PIC
;    so we avoid missing another IRQ generated by the
;    device. Call a procedure elsewhere in the VxD to
;    service the hardware device and then return.
;ENTRY:
;    EAX = IRQ handle
;    EBX = current VM handle
;       Interrupts are disabled.
;EXIT:
;    CLC if processed, STC otherwise.
;USES:
;    EBX, Flags.

BeginProc VIRQD_Hw_Int_Proc, High_Freq
	Trace_Out "<i>"
	VxDCall VPICD_Phys_EOI
	call    VIRQD_Service_Hardware
	clc
	ret
EndProc VIRQD_Hw_Int_Proc

In this example, VIRQD_Hw_Int_Proc does not set the interrupt request for the VM. The VIRQD_Service_Hardware procedure may set an interrupt request to the owning VM when a threshold has been reached. This depends strictly on the requirements of your hardware and the maximum amount of CPU load you wish to generate. The VxD could also use some other form of communication to a driver in a VM, such as nested execution or updating global memory buffers.

Additionally, the VIRQD_EOI_Proc would not perform a physical EOI of the PIC. Its only requirement would be to clear the interrupt request status for the VM if simulated interrupts are used to communicate with the VM's device driver. Note that interrupt simulation is an expensive procedure. Ring transitions and VM context switches are often a result of interrupt simulation, and reducing simulated interrupt generation will help reduce the total burden on the CPU.
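The buffering strategy can be sketched in C as a ring buffer that asks for a simulated interrupt only when a threshold is crossed. This is a minimal model under stated assumptions: RxBuffer, rx_put, rx_drain, and the capacity and threshold values are all hypothetical, and a real VxD would place the buffer in locked memory and call the VPICD services where the comments indicate.

```c
#include <stdbool.h>
#include <stddef.h>

#define RX_CAPACITY  64   /* hypothetical buffer size (power of two) */
#define RX_THRESHOLD 16   /* hypothetical notification threshold */

typedef struct {
    unsigned char data[RX_CAPACITY];
    size_t head;          /* total bytes written */
    size_t tail;          /* total bytes read */
    bool   vm_notified;   /* an interrupt request is already outstanding */
} RxBuffer;

size_t rx_count(const RxBuffer *rb)
{
    return rb->head - rb->tail;
}

/* Called from the hardware interrupt path for each byte read from
   the device. Returns true when the caller should set the interrupt
   request for the owning VM. */
bool rx_put(RxBuffer *rb, unsigned char byte)
{
    if (rx_count(rb) == RX_CAPACITY)
        return false;                       /* overrun: byte is dropped */
    rb->data[rb->head % RX_CAPACITY] = byte;
    rb->head++;
    if (rx_count(rb) >= RX_THRESHOLD && !rb->vm_notified) {
        rb->vm_notified = true;
        return true;
    }
    return false;
}

/* Called when the VM's driver collects data. Copies up to max bytes
   and re-arms the notification. Returns the number of bytes copied. */
size_t rx_drain(RxBuffer *rb, unsigned char *out, size_t max)
{
    size_t n = 0;
    while (n < max && rx_count(rb) > 0) {
        out[n++] = rb->data[rb->tail % RX_CAPACITY];
        rb->tail++;
    }
    rb->vm_notified = false;
    return n;
}
```

In this model, twenty hardware interrupts produce a single simulated interrupt, which is the kind of reduction in ring transitions the text recommends.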

Bimodal Interrupt Handlers

Bimodal interrupt handlers are a new feature of the Windows 3.1 VPICD. They allow a Windows device driver (or DLL) to service interrupts without waiting for VPICD to simulate an interrupt to the System VM, which avoids the associated delays of VM focus changes and VM event processing. Interrupt latency can be reduced using these services while maintaining a common code base for the ISR under Standard and Enhanced Mode Windows. Note that servicing interrupts directly in a VxD (as discussed in the preceding section) yields minimal interrupt latency.

Bimodal interrupt handlers are installed and removed through the PM API of the VPICD. The function numbers of the API services are defined in the equates that accompany the entry-point code below.

The VPICD API can only be accessed via the protected mode API entry point. It is not available to V86 VMs. To access the VPICD API, a VM obtains the API entry point:

VPICD_Device_ID           EQU    0003h
VPICD_API_Get_Ver         EQU    0000h
VPICD_Install_Handler     EQU    0001h
VPICD_Remove_Handler      EQU    0002h
VPICD_Call_At_Ring0       EQU    0003h

        xor     di, di
        mov     es, di
        mov     ax, 1684h                 ; get API entry point
        mov     bx, VPICD_Device_ID       ; of the VPICD
        int     2fh
        mov     word ptr lpVPICDEntry, di
        mov     word ptr lpVPICDEntry + 2, es
        mov     ax, es
        or      ax, di
        jz      SHORT No_VPICD_API

Under Windows 3.0, the VPICD entry point will be NULL, because it does not support any API functionality. If the entry point is not NULL, VPICD's version can be obtained:

Get_VPICD_Version:
        mov     ax, VPICD_API_Get_Ver
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error
        cmp     ax, 30Ah
        jb      SHORT VPICD_Error         ; require VPICD 3.10 or later

A DLL installs and removes a bimodal IRQ handler using the VPICD_Install_Handler and VPICD_Remove_Handler functions, respectively:

Install_Bimodal_Handler:
        les     di, lpBIS              ; pointer to BIS struct.
        mov     ax, VPICD_Install_Handler
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error

Remove_Bimodal_Handler:
        les     di, lpBIS              ; pointer to BIS struct.
        mov     ax, VPICD_Remove_Handler
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error

In these routines, the Bimodal_Int_Struc (BIS) is referenced. This structure has the following format:

Bimodal_Int_Struc STRUC
	BIS_IRQ_Number      dw    ?
	BIS_VM_ID           dw    0
	BIS_Next            dd    ?
	BIS_Reserved1       dd    ?
	BIS_Reserved2       dd    ?
	BIS_Reserved3       dd    ?
	BIS_Reserved4       dd    ?
	BIS_Flags           dd    0
	BIS_Mode            dw    0
	BIS_Entry           dw    ?
	BIS_Control_Proc    dw    ?
			    dw    ?
	BIS_User_Mode_API   dd    ?
	BIS_Super_Mode_API  dd    ?
	BIS_User_Mode_CS    dw    ?
	BIS_User_Mode_DS    dw    ?
	BIS_Super_Mode_CS   dw    ?
	BIS_Super_Mode_DS   dw    ?
	BIS_Descriptor_Count dw   ?
Bimodal_Int_Struc ENDS

The field definitions of this structure are detailed as follows:

BIS_IRQ_Number	VPICD installs a bimodal interrupt for the IRQ specified by this field when the `VPICD_Install_Handler` API is called.
BIS_VM_ID	Contains the current VM ID when the interrupt handler specified by `BIS_Entry` is called.
BIS_Next	Currently not used by the Windows 3.1 VPICD.
BIS_Flags	Must be set to zero.
BIS_Mode	Set to 0 to indicate user mode or 4 to indicate supervisor mode. This value can be used as an offset to obtain the appropriate user-mode or super-mode BIS API handler. (Set by VPICD when calling the procedures defined by the BIS_Entry and BIS_Control_Proc offsets.)
	mov     bx, es:[di.BIS_Mode]            ; mode 0=user, 4=super
	call    es:[bx][di.BIS_User_Mode_API]
BIS_Entry	Specifies the offset of the ISR from the CS specified in the `BIS_User_Mode_CS` field. When VPICD calls the interrupt handler for interrupt servicing, ES:DI points to this structure. (Filled by caller for the call to VPICD_Install_Handler.)
BIS_Control_Proc	Specifies the offset of the control procedure from the CS specified in the `BIS_User_Mode_CS` field. The control procedure is currently not used by the Windows 3.1 VPICD, but should point to a dummy control procedure that performs a far return. (Filled by the caller for VPICD_Install_Handler.)
BIS_User_Mode_API	Specifies the far address of the user-mode API procedure entry point. (Filled by VPICD after a call to `VPICD_Install_Handler` API.)
BIS_Super_Mode_API	Specifies the far address of the supervisor mode API procedure entry point. (Filled by VPICD after a call to the `VPICD_Install_Handler` API.)
BIS_User_Mode_CS	Specifies the selector of the user-mode code segment of the interrupt handler. The `BIS_Entry` and `BIS_Control_Proc` offsets must be relative to the code selector specified by this field. (Filled by caller for `VPICD_Install_Handler`.)
BIS_User_Mode_DS	Specifies the selector of the user-mode data segment of the interrupt handler. The `Bimodal_Int_Struc` structure should be located in this segment. (Filled by caller for `VPICD_Install_Handler`.)
BIS_Super_Mode_CS	VPICD stores the GDT alias of the user-mode CS selector in this field after a call to `VPICD_Install_Handler`.
BIS_Super_Mode_DS	VPICD stores a GDT alias of the user-mode DS selector in this field after a call to `VPICD_Install_Handler`.
BIS_Descriptor_Count	Specifies the number of EBIS_Sel_Struc structures immediately following the `Bimodal_Int_Struc` structure. VPICD creates a GDT alias for each of the selectors in the structures that follow.

EBIS_Sel_Struc STRUC
	EBIS_User_Mode_Sel  dw    ?
			    dw    ?
	EBIS_Super_Mode_Sel dw    ?
EBIS_Sel_Struc ENDS

EBIS_User_Mode_Sel  User mode selector

EBIS_Super_Mode_Sel GDT alias of selector created by VPICD after a call to
                    VPICD_Install_Handler.
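The use of BIS_Mode as an offset works because the supervisor-mode API pointer sits exactly 4 bytes after the user-mode API pointer. The following C sketch mirrors the layout of Bimodal_Int_Struc and shows the same selection done with byte offsets. The C type and field names are hypothetical, far pointers are modeled as plain 32-bit values, and the layout assumes a compiler that aligns each field to its natural size, which matches the offsets in the assembly structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* C mirror of Bimodal_Int_Struc; far pointers are held as 32-bit values. */
typedef struct {
    uint16_t irq_number;
    uint16_t vm_id;
    uint32_t next;
    uint32_t reserved[4];
    uint32_t flags;
    uint16_t mode;             /* 0 = user mode, 4 = supervisor mode */
    uint16_t entry;
    uint16_t control_proc;
    uint16_t pad;
    uint32_t user_mode_api;
    uint32_t super_mode_api;   /* exactly 4 bytes after user_mode_api */
    uint16_t user_mode_cs;
    uint16_t user_mode_ds;
    uint16_t super_mode_cs;
    uint16_t super_mode_ds;
    uint16_t descriptor_count;
} BimodalIntStruc;

/* Equivalent of: mov bx, es:[di.BIS_Mode]
                  call es:[bx][di.BIS_User_Mode_API]
   except that it returns the selected pointer instead of calling it. */
uint32_t bis_select_api(const BimodalIntStruc *bis)
{
    const unsigned char *base =
        (const unsigned char *)bis + offsetof(BimodalIntStruc, user_mode_api);
    uint32_t api;
    memcpy(&api, base + bis->mode, sizeof api);
    return api;
}
```

Because VPICD sets the mode field before calling the handler, the same ISR code can reach the correct API entry point in either mode without a conditional branch.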

VPICD automatically creates GDT aliases for the ISR code and data segments as specified in BIS_User_Mode_CS and BIS_User_Mode_DS, respectively. Additionally, the caller can request that VPICD create GDT aliases for a number of selectors specified by BIS_Descriptor_Count. The user-mode selectors are filled in an array of EBIS_Sel_Struc structures immediately following the Bimodal_Int_Struc structure. The associated GDT aliases are returned in the EBIS_Super_Mode_Sel element of each of the EBIS_Sel_Struc structures. For example, the Windows 3.1 COMM driver uses this functionality to create GDT aliases of the receive and transmit queues.

A DLL creates a Bimodal_Int_Struc and fills the appropriate fields. When the IRQ occurs, VPICD calls the ISR directly at ring 0, regardless of the current VM. On entry to the ISR, the CS is set to the GDT alias of the ISR code segment and ES:DI is set to the GDT alias of the Bimodal_Int_Struc. If this structure is located in the data segment, you can make the data addressable by moving ES into DS.

The ISR executes at ring 0 (CPL=0) through a 16-bit GDT code segment alias. As with calling TSR code directly from a VxD, the provided stack is a Use32 segment, and parameter passing must reference the stack using the 32-bit registers (ESP and EBP). The ISR cannot switch to a different stack unless a ring 0 stack selector is created. Note that a DLL cannot legally create such a selector.

The ISR must return from the procedure with a far return and carry clear if the IRQ was serviced or carry set if the IRQ was not serviced. When the ISR is called directly by VPICD, it must not manipulate the PIC directly. Instead, VPICD provides services through the BIS_Super_Mode_API procedure to perform these operations:

	BIH_API_EOI
	BIH_API_Mask
	BIH_API_Unmask
	BIH_API_Get_Mask   EQU   0003h
	BIH_API_Get_IRR    EQU   0004h
	BIH_API_Get_ISR
	BIH_API_Call_Back

BIH_API_EOI	Equivalent to calling `VPICD_Phys_EOI`.
BIH_API_Mask	Equivalent to calling the `VPICD_Physically_Mask` service.
BIH_API_Get_IRR	Equivalent to calling the `VPICD_Test_Phys_Request` service. Returns carry set if the physical interrupt request is set.
BIH_API_Get_ISR	Retrieves the in-service state of the IRQ. Returns with carry set if the IRQ is in service.
BIH_API_Call_Back	Uses the `Call_Priority_VM_Event` service to schedule an event for the target VM specified in BX. When the event callback is processed, VPICD will use nested execution services to simulate a far call to the address specified by CX:DX.

The BIH_API_Call_Back procedure is useful for calling routines that do not have GDT aliases or that must be executed in a specific VM. A common use of this service is to call a routine in the driver that posts a message using the PostMessage() Windows API.

Note: VMM schedules event services to process the callback in the specified VM. The callback is not executed synchronously. A driver should not post more than one event without notification that the event has been processed. If multiple events are posted without first checking for outstanding callbacks, the VMM event services may run out of resources and crash the system.
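One way to honor this restriction is to keep a "callback pending" flag in the driver's data segment and to post a new callback only when the flag is clear. The following C sketch models that guard; CallbackGate, gate_request, and gate_complete are hypothetical names, and the point at which the ISR would actually issue BIH_API_Call_Back is marked in the comments.

```c
#include <stdbool.h>

/* Guard that permits at most one outstanding callback. */
typedef struct {
    bool callback_pending;  /* a callback has been posted, not yet run */
    int  posted;            /* callbacks actually posted */
    int  coalesced;         /* requests folded into a pending callback */
} CallbackGate;

/* Called from the ISR. Returns true if the caller should post a
   callback now (this is where BIH_API_Call_Back would be issued). */
bool gate_request(CallbackGate *g)
{
    if (g->callback_pending) {
        g->coalesced++;     /* the pending callback will cover this one */
        return false;
    }
    g->callback_pending = true;
    g->posted++;
    return true;
}

/* Called at the start of the callback routine in the VM, before it
   examines the driver's queues, so later interrupts post a new event. */
void gate_complete(CallbackGate *g)
{
    g->callback_pending = false;
}
```

The callback routine should then process everything that has accumulated, since several interrupts may have been folded into the one event.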

Chapter 8

Virtualized DMA

Physical State vs. Virtual State
DMA Virtualization
DMA Region Mapping
Avoiding VDMAD Interference

The Virtual DMA Device (VDMAD) provides services that allow a VxD to take control of a DMA channel. A VxD using these services can intercept the DMA requests and modify the VM state causing the VM to believe that the request completed. Also, it is possible to translate or modify the VM's request before the physical state of the DMA controller is updated. Additionally, by using these services, a VxD can add another level of hardware contention management or indirectly replace portions of VDMAD's default handling.

All DMA channels are virtualized by VDMAD to map DMA requests by drivers to the physical hardware. VDMAD validates the memory region supplied by the driver, and if necessary, allocates the region from an internal DMA buffer.

Certain restrictions imposed by the DMA controller require the region management of VDMAD¹:

The DMA controller can only understand contiguous physical memory addresses.
The DMA controller cannot cross 64K boundaries, because the page register does not auto-increment.
The DMA controller has an address limit of 16 MB.

¹For simplicity, this discussion references only the hardware with the lowest common denominator, the 8237 DMA controller. Other controllers may support advanced features, but for proper coverage by your VxD, this controller interface constrains the functionality of the DMA interface.

VDMAD breaks up requests into partial DMA transfers to satisfy these requirements. DMA buffers submitted using the auto-init mode of the DMA controller cannot be broken; consequently, these requests must be submitted with regions adhering to the restrictions. For this reason, auto-init-mode DMA requires special memory management on behalf of the device driver.

Note that this discussion does not cover advanced DMA topics, such as bus-mastering devices and DMA controllers supporting scatter-gather.
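The controller restrictions listed earlier in this chapter translate into a few lines of address arithmetic. The following C sketch checks whether a physically contiguous region can be programmed as one transfer and shows how a larger region would be cut at 64K boundaries. It is an illustration only: the function names are hypothetical, an 8-bit DMA channel is assumed, and this is not the algorithm VDMAD itself uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_BOUNDARY   0x10000u     /* 64K page size of the controller */
#define DMA_ADDR_LIMIT 0x1000000u   /* 16 MB address limit */

/* True if the physically contiguous region [phys, phys+len) can be
   programmed as a single transfer. */
bool dma_region_ok(uint32_t phys, uint32_t len)
{
    if (len == 0 || len > DMA_BOUNDARY)
        return false;
    if (phys >= DMA_ADDR_LIMIT || len > DMA_ADDR_LIMIT - phys)
        return false;               /* reaches past 16 MB */
    /* First and last byte must lie in the same 64K page. */
    return (phys / DMA_BOUNDARY) == ((phys + len - 1) / DMA_BOUNDARY);
}

/* Size of the largest leading piece that stays inside one 64K page. */
uint32_t dma_first_chunk(uint32_t phys, uint32_t len)
{
    uint32_t room = DMA_BOUNDARY - (phys % DMA_BOUNDARY);
    return len < room ? len : room;
}

/* Number of partial transfers needed to cover the whole region. */
unsigned dma_count_chunks(uint32_t phys, uint32_t len)
{
    unsigned n = 0;
    while (len > 0) {
        uint32_t piece = dma_first_chunk(phys, len);
        phys += piece;
        len  -= piece;
        n++;
    }
    return n;
}
```

An auto-init transfer cannot be split this way, so a driver using auto-init mode would need a buffer for which the single-transfer check succeeds.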

Physical State vs. Virtual State

As a VM programs the DMA controller, the controller's virtual state is updated, but that state is not submitted to the hardware until the VM unmasks the channel. This is important to remember when you are debugging drivers using DMA. To display the channel status, use the debug version of Win386 supplied with VxD-Lite and query VDMAD.

After the VM has unmasked the channel, VDMAD attempts to lock the memory region, as programmed by the VM. If it is unsuccessful, VDMAD buffers the DMA transfer and modifies the DMA controller's physical state.

VDMAD uses the VPICD_Hw_Int_Proc service to provide a watchdog event to poll for the DMA controller's terminal count when non-auto-init-mode DMA transfers are requested. When the DMA controller has completed the request, the necessary buffers are updated (if a read operation was requested and buffers were allocated) and the VM's virtual DMA state is updated to reflect the completed transfer.

A VxD can modify the DMA controller's virtual and physical states using the VDMAD_Set_Virt_State and VDMAD_Set_Phys_State services, which are usually called with the handle of a DMA channel that the VxD has virtualized.

DMA Virtualization

A VxD uses DMA virtualization to add functionality to the base support of VDMAD. A VxD can use this virtualization to change the virtual state before the request is submitted to the hardware. To virtualize a DMA channel, a VxD uses the VDMAD_Virtualize_Channel service:

;Tell VDMAD that we want to know about this DMA controller.
        xor     eax, eax
	mov     [gdwDMAHandle], eax
	movzx   eax, gbDMAchannel
	mov     esi, OFFSET32 VSIMPLED_Virtual_DMA_Trap
	VxDCall VDMAD_Virtualize_Channel
	mov     [gdwDMAHandle], eax
	jc      SHORT VDC_Exit_Failure

When a VM changes the mask state of the virtualized DMA channel, VDMAD calls the supplied procedure, in this case VSIMPLED_Virtual_DMA_Trap. The VxD can modify the virtual state of the VM and then call the default handler, VDMAD_Default_Handler, to allow VDMAD to continue the region management as follows:

;VSIMPLED_Virtual_DMA_Trap
;
;DESCRIPTION:
;    Forces DMA_block_mode and then calls the default DMA handler.

BeginProc VSIMPLED_Virtual_DMA_Trap, High_Freq
        VxDCall VDMAD_Get_Virt_State
	test    dl, DMA_requested
	jz      SHORT VDT_Exit
	test    dl, DMA_masked
	jnz     SHORT VDT_Exit
; Force block mode DMA, channel is requested and
; unmasked by the VM.
	and     dl, NOT (DMA_mode_mask)
	or      dl, DMA_block_mode
	xor     dh, dh
	VxDCall VDMAD_Set_Virt_State
VDT_Exit:
        VxDCall VDMAD_Default_Handler
	ret
EndProc VSIMPLED_Virtual_DMA_Trap

If necessary, a VxD can handle the actual DMA buffer translation and program the physical state of the DMA controller. This type of virtualization requires the use of the VDMAD buffer copy and region management services (listed in Appendix A).

Additionally, a VxD can translate the DMA request to a replacement interface, such as those supplied by the PCMCIA hardware implementations. Again, the VxD must virtualize the DMA channel and process the notifications from VDMAD.

Although some of the buffer management details are discussed in the next section, you should investigate the VDMAD sources provided in the Microsoft Windows 3.1 Device Driver Kit for code samples and to develop a better understanding of the operation of VDMAD.

DMA Region Mapping

As already mentioned, the primary purpose of VDMAD is to buffer DMA requests and to map the regions to memory accessible by the DMA controller. DMA region mapping is automatically performed by VDMAD on a non-virtualized channel when the DMA channel is unmasked. A VxD virtualizing a DMA channel can use these services without additional code overhead simply by calling the VDMAD_Default_Handler. When a non-standard interface is implemented, some or all of the region mapping services of VDMAD will be needed.

To request a DMA buffer from VDMAD and copy information from a VM to this buffer, the VxD uses the VDMAD_Request_Buffer and VDMAD_Copy_To_Buffer services:

;Request a buffer from VDMAD and copy from VM
;On entry, EAX is DMA handle, EBX is VM handle.

        VxDCall VDMAD_Get_Virt_State
        push    edx                    ; save mode for later
	push    ebx                    ; save VM for later
;ESI = linear address
;ECX = count
;DL/DH = mode/flags
        test    dl, DMA_requested
	jnz     SHORT Buffer_New
	test    dl, DMA_masked
	jnz     SHORT Buffer_CleanUp
	VxDCall VDMAD_Request_Buffer
	jc      SHORT Error_No_Buffer
;EDX now contains the physical address of the DMA buffer, so DL no
;longer holds the mode. Test the mode byte saved on the stack instead.
	test    BYTE PTR [esp+4], DMA_type_read
	jz      SHORT Dont_Copy
;EBX = buffer handle
;ESI = linear region
;ECX = size
;EDI = offset
        xor     edi, edi
        VxDCall VDMAD_Copy_To_Buffer
	jc      SHORT Error_Copy

To prepare the hardware state, the VxD updates the region information and programs the physical state to the DMA controller. The VxD starts the DMA transfer by unmasking the channel:

Dont_Copy:
        pop     ebx
	VxDCall VDMAD_Set_Region_Info
	VxDCall VDMAD_Set_Phys_State
;Unmask the DMA channel to begin the transfer
	VxDCall VDMAD_UnMask_Channel

Note that these code fragments are very simple and incomplete. For instance, the VxD does not check to see whether the region can be locked by using the VDMAD_Lock_DMA_Region service before requesting the buffer from VDMAD.

When a DMA channel is unmasked using the VDMAD_UnMask_Channel service, the ownership of the DMA channel is assigned to the requesting VM. VDMAD sets up the watchdog event to modify the virtual channel state when the terminal count is reached for non-auto-init-mode transfers. When the watchdog event determines that the channel has reached terminal count, VDMAD virtually masks it. If the operation was a DMA write operation, the buffer is copied to the VM's linear address, as supplied with VDMAD_Set_Region_Info. The virtual count register is updated, the channel is physically masked, and the channel owner is set to NULL.

Avoiding VDMAD Interference

VDMAD always attempts to complete the DMA transfer when the channel has been unmasked by using the VDMAD_UnMask_Channel service. To completely control the DMA channel in your VxD, you can virtualize the DMA channel using a NULL handling procedure and then program the DMA controller directly from your VxD. VDMAD will continue to trap the I/O range for the controller but will not update the physical state. Alternatively, you can provide a virtual DMA handling procedure and program the controller directly by using the virtual controller state information as provided by VDMAD. When using this implementation, you must avoid VDMAD services that affect the physical state or make assumptions about the ownership of the channel. Also, you need to resolve contention by other VMs in your procedure. Consult the VDMAD sources for further details.

Chapter 9

VKD and Keyboard Processing

Hot Keys
Simulating Keystrokes to VMs

The Virtual Keyboard Driver (VKD) provides an interface to the keyboard that allows a VxD to trap for hot keys, simulate keystrokes into a VM, and simulate a paste operation from a supplied buffer into a VM. This interface can be used to force certain actions in a VxD or to serve as a form of communication between a VxD and an active application in a VM.

Hot Keys

Hot keys are registered with the VKD through the VKD_Define_Hot_Key service. When the Local_Key flag is specified, hot keys are enabled and disabled on a per-VM basis using the VKD_Local_Enable_Hot_Key and VKD_Local_Disable_Hot_Key services, as follows:

;Define hot keys for ctrl-pgup and ctrl-pgdn

	mov	al, 49h			; page-up
	mov	ah, ExtendedKey_B
	ShiftState <SS_Toggle_mask + SS_Either_Ctrl>, <SS_Ctrl>
	mov     cl, CallOnPress + CallOnRepeat + Local_Key
	mov     esi, OFFSET32 VSIMPLED_Hot_Key_Handler
	xor     edx, edx
	xor     edi, edi
	VxDCall VKD_Define_Hot_Key
	jc      SHORT Exit_Failure
	mov     ghhkCtrlPgUp, eax
	mov     al, 51h			; page-down
	mov     ah, ExtendedKey_B
	ShiftState <SS_Toggle_mask + SS_Either_Ctrl>, <SS_Ctrl>
	mov     cl, CallOnPress + CallOnRepeat + Local_Key
	mov     esi, OFFSET32 VSIMPLED_Hot_Key_Handler
	xor     edx, edx
	xor     edi, edi
	VxDCall VKD_Define_Hot_Key
	jc      SHORT Exit_Failure
	mov     ghhkCtrlPgDn, eax

To disable these keys by default, use the VKD_Local_Disable_Hot_Key service during the Sys_VM_Init and VM_Critical_Init message processing:

VSIMPLED_Sys_VM_Init LABEL NEAR
BeginProc VSIMPLED_VM_Critical_Init
	mov     eax, ghhkCtrlPgUp
	VxDCall VKD_Local_Disable_Hot_Key
	mov     eax, ghhkCtrlPgDn
	VxDCall VKD_Local_Disable_Hot_Key
	clc
	ret
EndProc VSIMPLED_VM_Critical_Init

Once a hot key has been enabled in a VM, the VxD receives a notification from VKD whenever the hot key is pressed and processes it accordingly:

BeginProc VSIMPLED_Hot_Key_Handler
	push    eax
;Turn off hot key mode in case we're going
;to expand this to force keys. Don't want
;to be in hot key mode when forcing keys to a VM.

	VxDCall VKD_Cancel_Hot_Key_State
	cmp     al, 49h
	jne     SHORT HK_PgDn
;Ctrl-PgUp pressed...
	Trace_Out "Control-PgUp pressed in VM #EBX"
	jmp     SHORT HK_Exit
HK_PgDn:        ; Ctrl-pgDn pressed...
	Trace_Out "Control-PgDn pressed in VM #EBX"
HK_Exit:
	pop         eax
	ret
EndProc VSIMPLED_Hot_Key_Handler

Simulating Keystrokes to VMs

VKD provides services to force keys to a VM's keyboard buffer, so that the VM reacts as if the key had been pressed on the physical keyboard. The buffer passed to the VKD_Force_Keys service contains actual keyboard scan codes, such as the "key down", "key repeat", and "key up" codes.

;This code snippet just forces PgDn and PgUp
;to the VM in place of Ctrl-PgDn and Ctrl-PgUp.

ForceKey_Buffer_Down label byte
	db          51h, 0D1h
ForceKey_Buffer_Down_Len equ $-ForceKey_Buffer_Down

ForceKey_Buffer_Up label byte
	db     49h, 0C9h
ForceKey_Buffer_Up_Len equ $-ForceKey_Buffer_Up

BeginProc VSIMPLED_Hot_Key_Handler
	push	eax
;Don't want to be in hot key mode
;when forcing keys to a VM.
	VxDCall VKD_Cancel_Hot_Key_State
	cmp     al, 49h
	jne     SHORT HK_PgDn
; Ctrl-PgUp pressed...
	Trace_Out "Control-PgUp pressed in VM #EBX"
	mov     ecx, ForceKey_Buffer_Up_Len
	lea     esi, ForceKey_Buffer_Up
	jmp     SHORT HK_ForceEm
HK_PgDn:        ; Ctrl-PgDn pressed...
	Trace_Out "Control-PgDn pressed in VM #EBX"
	mov     ecx, ForceKey_Buffer_Down_Len
	lea     esi, ForceKey_Buffer_Down
HK_ForceEm:
	VxDCall VKD_Force_Keys
IFDEF DEBUG
	jnc     SHORT @F
	Debug_Out "VKD_Force_Keys failed!"
@@:
ENDIF
	pop     eax
	ret
EndProc VSIMPLED_Hot_Key_Handler

Using the force keys service is quite simple, but determining which scan codes to send is probably the most time-consuming part of using this interface. To make determining the scan codes simpler, I have created a simple utility that watches INT 9h and displays the keystrokes to the screen until you press the <ESC> key. The code for the KEYDISP utility can be found on the accompanying disk in the ASM\KEYDISP directory.
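The two buffers in the listing above (51h/0D1h and 49h/0C9h) follow the usual pattern for single-byte scan codes: the "key up" code is the "key down" code with bit 7 set. The following is a minimal sketch of that relationship in 'C'. The helper name BuildKeyTap and its buffer layout are assumptions made for illustration; they are not part of VKD or the sample sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a VKD service): fills buf with the "key down"
 * and "key up" scan codes for one key. Returns the number of bytes
 * written, or 0 if the buffer is too small or makeCode is not a valid
 * single-byte "key down" code. */
size_t BuildKeyTap(uint8_t makeCode, uint8_t *buf, size_t bufLen)
{
    if (bufLen < 2 || makeCode == 0 || (makeCode & 0x80u))
        return 0;
    buf[0] = makeCode;                      /* key down */
    buf[1] = (uint8_t)(makeCode | 0x80u);   /* key up: bit 7 set */
    return 2;
}
```

A buffer built this way for 51h holds the same two bytes as ForceKey_Buffer_Down. Keys that send prefix bytes or longer sequences are outside the scope of this sketch, which is where a utility such as KEYDISP helps.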

Chapter 10

Writing VxDs in C

Segment Attributes
A 'C'-callable Wrapper for VMM
VSIMPLED Sources in 'C'

The concept of writing VxDs in 'C' has been widely misunderstood. Writing VxDs in 'C' is not impossible -- on the contrary, you can do it without a great deal of grief. Forget everything anyone has ever told you about writing VxDs in 'C' and open your mind. VxDs written in 'C' are the wave of the future, not just a passing fad.

VMM does not look in the object code of VxDs for magical embedded notations to determine whether the code was generated by a 'C' compiler or the magical MASM 5.10B assembler. When a good 386 32-bit 'C' compiler generates the necessary code, the LINK386 linker will link the objects and generate a proper executable, which can be called a VxD.

The main hurdle to overcome when writing VxDs in 'C' is that a great portion of VMM services either take their parameters in registers or must be called through the mystical dynalinking macro, which generates the code that calls VxD or VMM services. Additionally, services declared by VxDs are created with tables hidden by the VMM.INC macros, and the actual procedure entry points are renamed with a new prefix. That doesn't mean it's time to give up and return to assembly, only that you may not be able to write all of your VxD in 'C'. Some assembly may be required: I affectionately refer to this as MASM-tape. I'll provide the MASM-tape on the accompanying disk along with some instruction, and you can begin writing VxDs in 'C' almost immediately, assuming you have the rest of the necessary tools. I have been successful using the WATCOM C/386 V9.5 compiler to generate flat 32-bit code. The samples included on the diskette were created using this compiler.

The limitations and restrictions of writing a VxD in 'C' include the following:

Because most VxDs have been written in assembly, interfacing to these VxDs requires external procedures written in assembly.
Some of the debugging functionality (call logging, for example) is not available to VxD procedures written in 'C'.
Testing and debugging are more difficult, because you must rely on the compiler's code generation instead of an assembler.

Segment Attributes

VxD segments require the following specific attributes:

All code and data segments are USE32 with the exception of the Real Mode Initialization segment, which is the only USE16 segment in the VxD executable (excluding the stub executable).
Initialization code and data are defined by the _ITEXT and _IDATA segments, respectively. These code segments have the segment class definition of ICODE.
Pageable code and data are defined by the _TEXT and _DATA segments and have a segment class definition of PCODE.
Locked code and data are defined by the _LTEXT and _LDATA segments and have a segment class definition of LCODE.
Because VxDs require flat model, the 'C' compiler in question must be able to generate 32-bit flat model code.

Most compilers support the #pragma code_seg and #pragma data_seg directives. The following directives will define the necessary segments and classes:

// code and data segment directives for init code
#pragma code_seg("_ITEXT", "ICODE")
#pragma data_seg("_IDATA", "ICODE")

// code and data segment directives for pageable code
#pragma code_seg("_TEXT", "PCODE")
#pragma data_seg("_DATA", "PCODE")

// code and data segment directives for locked code
#pragma code_seg("_LTEXT", "LCODE")
#pragma data_seg("_LDATA", "LCODE")

When developing the samples in 'C' for this book, I experienced problems with the WATCOM C/386 compiler using the #pragma code_seg directive and was forced to use command line options to define the segment and class names (see the sample makefiles for more information). Also, some 'C' compilers may not support multiple segment declarations in a single module. You may be required to create one module for initialization code and data, another for locked code and data and another for pageable code and data.

A 'C'-callable Wrapper for VMM

A VxD entry point is defined in the Device Declaration Block (DDB) as defined in Chapter 1. The DDB is exported using a .DEF file. A typical export is as follows:

EXPORTS
        VSIMPLED_DDB @1

In order to maintain compatibility with this naming convention, the compiler must not generate the 'C'-style underscore prefix. The WATCOM C/386 compiler provides an option for disabling this naming convention.

The DDB structure, as defined using 'C', is as follows:

#define DDK_Version 0x30A

typedef struct tagVxD_Desc_Block {
  DWORD  DDB_Next;                 // VMM reserved field
  WORD   DDB_SDK_Version;          // VMM reserved field
  WORD   DDB_Req_Device_Number;    // Required device number
  BYTE   DDB_Dev_Major_Version;    // Major device version
  BYTE   DDB_Dev_Minor_Version;    // Minor device version
  WORD   DDB_Flags;                // Flags init calls complete
  BYTE   DDB_Name[8];              // Device name
  DWORD  DDB_Init_Order;           // Initialization Order
  DWORD  DDB_Control_Proc;         // Offset of control procedure
  DWORD  DDB_V86_API_Proc;         // Offset of API procedure
  DWORD  DDB_PM_API_Proc;          // Offset of API procedure
  DWORD  DDB_V86_API_CSIP;         // CS:IP of API entry point
  DWORD  DDB_PM_API_CSIP;          // CS:IP of API entry point
  DWORD  DDB_Reference_Data;       // Ref. data from real mode
  DWORD  DDB_Service_Table_Ptr;    // Pointer to service table
  DWORD  DDB_Service_Table_Size;   // Number of services
} DDB;
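Because VMM reads the DDB by fixed offsets, the 'C' structure has to match the assembly layout byte for byte, with no padding inserted by the compiler. The following is a minimal sketch of a compile-time check. It assumes a compiler that supports the C11 _Static_assert keyword and the fixed-width types in <stdint.h>, neither of which comes from the DDK headers; the DWORD, WORD, and BYTE typedefs here stand in for the ones in vmm.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the types normally supplied by vmm.h (assumption). */
typedef uint32_t DWORD;
typedef uint16_t WORD;
typedef uint8_t  BYTE;

typedef struct tagVxD_Desc_Block {
  DWORD  DDB_Next;
  WORD   DDB_SDK_Version;
  WORD   DDB_Req_Device_Number;
  BYTE   DDB_Dev_Major_Version;
  BYTE   DDB_Dev_Minor_Version;
  WORD   DDB_Flags;
  BYTE   DDB_Name[8];
  DWORD  DDB_Init_Order;
  DWORD  DDB_Control_Proc;
  DWORD  DDB_V86_API_Proc;
  DWORD  DDB_PM_API_Proc;
  DWORD  DDB_V86_API_CSIP;
  DWORD  DDB_PM_API_CSIP;
  DWORD  DDB_Reference_Data;
  DWORD  DDB_Service_Table_Ptr;
  DWORD  DDB_Service_Table_Size;
} DDB;

/* The fields as listed add up to 56 bytes when no padding is inserted. */
_Static_assert(offsetof(DDB, DDB_Name) == 12, "unexpected padding before DDB_Name");
_Static_assert(offsetof(DDB, DDB_Control_Proc) == 24, "unexpected padding before DDB_Control_Proc");
_Static_assert(sizeof(DDB) == 56, "DDB contains compiler padding");
```

If one of these assertions fails, the compiler's structure packing options are the first place to look.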

The following example declares a DDB within a 'C' module:

#include <vmm.h>
#include "vsimpled.h"

#pragma data_seg("_LDATA", "LCODE")
/*
 *     V I R T U A L    D E V I C E    D E C L A R A T I O N
 */

DDB VSIMPLED_DDB = { NULL,                 // must be NULL
		     DDK_Version,          // DDK_Version
		     VSIMPLED_Device_ID,   // Device ID
		     VSIMPLED_Major_Ver,   // Major Version
		     VSIMPLED_Minor_Ver,   // Minor Version
		     NULL,
		     "VSIMPLED",
		     Undefined_Init_Order,
		     (DWORD) vmmwrapVxDControlProc,
		     NULL,
		     NULL,
		     NULL,
		     NULL,
		     NULL,
		     NULL,
		     NULL};

To provide an interface to the register parameters for VxD control procedures, an assembly wrapper is necessary. This procedure creates a 'C' stack frame and calls the associated procedure as defined in a dispatch table:

// This table is used by the vmmwrapVxDControlProc defined
// in "VMMWRAP.ASM". It lists the messages and associated
// dispatch functions. It must be terminated with -1 and NULL.

DISPATCHINFO alpVxDDispatchProcs[] =
   { Create_VM,           VSIMPLED_Create_VM,
     Sys_Critical_Init,   VSIMPLED_Sys_Critical_Init,
     Device_Init,         VSIMPLED_Device_Init,
     -1,                  NULL };

When the VxD control procedure is called by VMM, the vmmwrapVxDControlProc (provided by VMMWRAP.ASM) walks this table and dispatches the system message to the associated procedure. Note that vmmwrapVxDControlProc uses a linear search algorithm; consequently, the least-frequent system events should be located at the end of the table. Some of the dispatch functions have slightly different prototypes, not listed here because the sample sources demonstrate their use and the VMMWRAP.ASM code is well documented.
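The table walk itself lives in assembly, but its logic can be sketched in 'C' to show why entry order matters. In the sketch below, every name prefixed with Demo or DEMO_ is hypothetical, the message numbers are placeholders rather than the real DDK values, and the handler prototype is simplified to take no parameters.

```c
#include <stddef.h>

/* Placeholder message numbers for illustration only; the real values
 * come from the DDK headers. */
enum { DEMO_SYS_CRITICAL_INIT = 100, DEMO_DEVICE_INIT = 101, DEMO_CREATE_VM = 102 };

typedef int (*DEMOPROC)(void);   /* simplified: real handlers take parameters */

typedef struct {
    long     lMessage;           /* control message, -1 terminates the table */
    DEMOPROC pProc;              /* handler, NULL in the terminating entry   */
} DEMODISPATCH;

static int DemoSysCriticalInit(void) { return 1; }
static int DemoDeviceInit(void)      { return 2; }
static int DemoCreateVM(void)        { return 3; }

/* Most frequent message first: the search stops at the first match. */
static const DEMODISPATCH DemoTable[] = {
    { DEMO_CREATE_VM,         DemoCreateVM },
    { DEMO_SYS_CRITICAL_INIT, DemoSysCriticalInit },
    { DEMO_DEVICE_INIT,       DemoDeviceInit },
    { -1,                     NULL }
};

/* Linear search: returns the handler for lMessage, or NULL if the
 * message does not appear before the -1 terminator. */
DEMOPROC DemoFindHandler(const DEMODISPATCH *pTable, long lMessage)
{
    for (; pTable->lMessage != -1; ++pTable) {
        if (pTable->lMessage == lMessage)
            return pTable->pProc;
    }
    return NULL;
}
```

Messages that occur once during initialization cost the most comparisons in this arrangement, which is acceptable because they are only dispatched once.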

The following code excerpt demonstrates a VxD initialization procedure as written in 'C':

#pragma data_seg("_IDATA", "ICODE")
/*	I C O D E  */

BOOL VSIMPLED_Device_Init(DWORD VM, PSTR pcmdTail, PCRS_32 pCRS) {
/*  Description:
 * This is a non-system critical initialization procedure.
 * IRQ virtualization, I/O port trapping, and VM control
 * block allocation can occur here.
 * Again, the same return value applies... TRUE for success,
 * FALSE for error notification.
 * Parameters:
 *     DWORD VM - System VM handle
 *     PSTR pcmdTail - pointer to WIN.COM's command tail
 *     PCRS_32 pCRS - pointer to System VM client register structure
 *
 *  History:   Date      Author          Comment
 *             3/ 9/93   BryanW          Wrote it.
 */
  vmmTraceOut("VSIMPLED_Device_Init\r\n");
  return TRUE;
} // end of VSIMPLED_Device_Init()

Wrapping VxD Services

As mentioned earlier, service calls to other VxDs or to VMM use the Int 20h dynalink interface. Embedding this code throughout your VxD is inefficient, and some form of 'C' to assembly interface is necessary with some services because of register parameter passing.

VMMWRAP.ASM defines a large number of 'C' callable routines that convert stack parameters into the correct register parameter interfaces used by the various services and return the results of the service call. For example, the VMM service List_Create uses the ECX, EAX, and ESI registers to define a node size and flags and to return a handle to the list. It then becomes necessary to provide a 'C'-callable interface:

;DWORD PASCAL vmmListCreate(UINT uNodeSize, UINT uFlags);
;DESCRIPTION:
;  Creates a new list structure.
;PARAMETERS:
;  UINT uNodeSize
;  UINT uFlags
;      Specifies the creation flags, it can be a
;      combination of the following values:
;          LF_Alloc_Error, LF_Async, LF_Use_Heap
;RETURN VALUE:
;  DWORD
;      handle to the list or NULL if failure

BeginProc vmmListCreate, PUBLIC
	uFlags  equ [ebp + 8]
	uNodeSize equ [ebp + 12]
	push    ebp
	mov     ebp, esp
	push    esi
	push    ecx
	mov     ecx, uNodeSize
	mov     eax, uFlags
	VMMCall List_Create
	pop     ecx
	mov     eax, esi
	pop     esi
	jnc     SHORT VLC_Exit
	xor     eax, eax
VLC_Exit:
        pop     ebp
	ret     8
EndProc vmmListCreate

A VxD in 'C' can then call this service as follows:

//Create a list with elements of the type NODE
hList = vmmListCreate (sizeof(NODE), 0);

Thunking Callbacks

A thunk is a piece of assembly code that fronts your 'C' procedure to map registers as passed by VMM to a 'C' stack frame and then calls your procedure. A thunk also converts the 'C' return value to the expected return value for the callback. Callbacks are used by VMM and other VxDs for notification and event processing. For example, when a V86 page is hooked, a page fault handler in the VxD is called to resolve the fault.

A thunk is created "on the fly" by a thunking procedure. Given a procedure address, a thunking procedure copies the base code, patches the necessary offsets, and returns a pointer to this piece of code. An advantage to using flat model code here is that a VxD can reference code and data with the same offset. Creating executable code with a simple heap allocation is easy, because selector restrictions are not an issue. For example, the following will create a procedure thunk for a generic VMM event callback:

;EVENTPROC PASCAL vmmwrapThunkEventProc(EVENTPROC pProc);
;DESCRIPTION:
;  Creates a procedure thunk for VxD generic event callbacks.
;PARAMETERS:
;  EVENTPROC pProc
;    pointer to callback procedure, must have the form:
;    VOID CDECL EventProc( DWORD hVM, DWORD dwRefData, PCRS_32 pCRS);
;RETURN VALUE:
;  EVENTPROC
;    pointer to thunk or NULL if failure

BeginProc vmmwrapThunkEventProc, PUBLIC
	pCRS    equ [ebp]
	pProc   equ [ebp + 8]
	push    ebp
	mov     ebp, esp
	call    Allocate_Procedure_Thunk
	jc      SHORT VEProc_Failure
	jmp     SHORT VEProc_CreateThunk

; Begin thunk code
EventThunk label  byte
	push    pCRS
	push    edx                         ; dwRefData
	push    ebx                         ; hVM
	call    $
EventThunkCallAddr equ $ -EventThunk
	add     esp, 12                     ; fixup for CDECL
	ret
EventThunkSize    equ     $-EventThunk
; End thunk code

VEProc_CreateThunk:
	push    ecx
	push    edi
	push    esi
; Copy the thunk...
	lea     esi, EventThunk
	mov     edi, eax
	mov     ecx, EventThunkSize
	cld
	shr     ecx, 1
	rep     movsw
	adc     cl, cl
	rep     movsb
;Fix it up...
	push    eax
	add     eax, EventThunkCallAddr
	mov     esi, eax
	sub     esi, 4
	sub     eax, pProc
	neg     eax
	mov     dword ptr [esi], eax
	pop     eax
	pop     esi
	pop     edi
	pop     ecx
	jmp     SHORT VEProc_Exit
VEProc_Failure:
	xor     eax, eax
VEProc_Exit:
	pop     ebp
	ret     4
EndProc vmmwrapThunkEventProc

To avoid page faults while executing thunk code, allocate a non-pageable memory block for a thunk table on the first call to Allocate_Procedure_Thunk. To simplify thunk allocation management, the allocation routine uses a fixed, maximum thunk size; this routine could be improved to be more memory efficient. The actual thunk code is embedded in the specific thunk allocation procedure. After the memory allocation for the thunk has been performed, the thunk code is copied and patched with the correct offset to the caller's provided procedure address. Thunks should be created only once per procedure, as follows:

/*NOTE!!! pVMEMTRAP_PFault is a global pointer to the
 * Page_Fault procedure thunk.
 */
if (!pVMEMTRAP_PFault) {
  pVMEMTRAP_PFault = vmmwrapThunkV86PHProc(VMEMTRAP_PFault);
  if (!pVMEMTRAP_PFault) {
    vmmDebugOut("Could not allocate Page_Fault thunk!\r\n");
    return FALSE;
  }
}
vmmHookV86Page(wPage, pVMEMTRAP_PFault);
return TRUE;
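The "fix it up" step in vmmwrapThunkEventProc patches the 32-bit displacement of the near call so that the copied thunk reaches the caller's procedure: the displacement is the target address minus the address of the instruction that follows the call. The following 'C' sketch models that arithmetic on a byte buffer. It is an illustration only: the template bytes are a simplified stand-in for the thunk in the listing, addresses are passed as plain 32-bit values so the sketch can run on any host, and PatchThunk and ThunkCallTarget are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Simplified thunk template (an assumption, not the book's exact bytes). */
static const uint8_t ThunkTemplate[] = {
    0x55,                          /* push ebp   (pCRS)                */
    0x52,                          /* push edx   (reference data)      */
    0x53,                          /* push ebx   (hVM)                 */
    0xE8, 0x00, 0x00, 0x00, 0x00,  /* call rel32 (patched below)       */
    0x83, 0xC4, 0x0C,              /* add esp, 12 (fixup for CDECL)    */
    0xC3                           /* ret                              */
};

enum {
    THUNK_SIZE     = sizeof ThunkTemplate,
    THUNK_CALL_END = 8             /* offset of the byte after the call */
};

/* Copies the template into pThunk and patches the call displacement.
 * thunkAddr is the address the thunk will execute at; procAddr is the
 * address of the 'C' callback. */
void PatchThunk(uint8_t *pThunk, uint32_t thunkAddr, uint32_t procAddr)
{
    uint32_t rel = procAddr - (thunkAddr + THUNK_CALL_END);
    int i;

    memcpy(pThunk, ThunkTemplate, THUNK_SIZE);
    for (i = 0; i < 4; ++i)        /* store little-endian rel32 */
        pThunk[THUNK_CALL_END - 4 + i] = (uint8_t)(rel >> (8 * i));
}

/* Decodes the patched displacement back into the call target. */
uint32_t ThunkCallTarget(const uint8_t *pThunk, uint32_t thunkAddr)
{
    uint32_t rel = 0;
    int i;

    for (i = 0; i < 4; ++i)
        rel |= (uint32_t)pThunk[THUNK_CALL_END - 4 + i] << (8 * i);
    return thunkAddr + THUNK_CALL_END + rel;
}
```

Unsigned 32-bit arithmetic wraps around, so the same computation works whether the callback lies above or below the thunk in memory. This mirrors the sub/neg sequence in the assembly version.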

Service Tables

Service tables are best left to assembly. Although it is possible to create a service table using 'C', there are many restrictions:

Predefined services for replacement system drivers (such as VDD, VCD, etc.) almost always use register parameter passing. An assembly front end must be used for these procedures. There is no need to create a service table in C for these cases.
VMM.INC uses the `@` prefix for the actual procedure name. It also generates a service number for each of the listed services. Your "public" header file must provide these definitions and your service names must be distinct from the service number definitions.
VMM.INC creates debugging calls to watch for VMM reentrancy of non-asynchronous services. These services will not be available to your VxD's service procedures if they are written in 'C'.

A service table can be declared in 'C' as follows:

#define VSIMPLED_Get_Version ((VSIMPLED_Device_ID << 16) + 0x0000)
#define VSIMPLED_Get_Info    ((VSIMPLED_Device_ID << 16) + 0x0001)

DWORD CDECL I_VSIMPLED_Get_Version(VOID);
BOOL CDECL I_VSIMPLED_Get_Info(PINFOSTRUCT);

SERVICETABLE VSIMPLED_ServiceTable = {
  I_VSIMPLED_Get_Version,
  I_VSIMPLED_Get_Info};

The service table must be located in the locked data segment. The DDB should contain a pointer to the service table and the number of services declared.
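A service number combines the device ID in the high word with the index of the service in the low word. The following sketch shows that composition and its inverse. The macro names and the DEMO_DEVICE_ID value are placeholders for illustration; they are not taken from the DDK or from the sample sources, and any flag bits a dynalink may carry in the low word are ignored here.

```c
#include <stdint.h>

#define DEMO_DEVICE_ID 0x1234u   /* placeholder, not a real device ID */

/* Device ID in bits 31..16, service index in bits 15..0. */
#define DEMO_MAKE_SERVICE(devId, index) \
    ((((uint32_t)(devId)) << 16) | ((uint32_t)(index) & 0xFFFFu))

#define DEMO_SERVICE_DEVICE(svc)  ((uint16_t)(((uint32_t)(svc)) >> 16))
#define DEMO_SERVICE_INDEX(svc)   ((uint16_t)(((uint32_t)(svc)) & 0xFFFFu))

/* Returns nonzero when svc belongs to devId and its index lies within
 * a service table holding cServices entries. */
int DemoServiceInRange(uint32_t svc, uint16_t devId, uint16_t cServices)
{
    return DEMO_SERVICE_DEVICE(svc) == devId &&
           DEMO_SERVICE_INDEX(svc) < cServices;
}
```

The index is what selects an entry in the service table, so the order of the entries in the table must match the order of the service number definitions.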

If your VxD is replacing a standard VxD, such as the Virtual Display Driver, a service interface already exists. To support this interface and to allow the VxD service procedures to be written in 'C', the service entry points are thunked using a macro, such as the following, to provide an interface to the register parameters:

Service_Thunk   MACRO   Service_Name, Type
IFNB <Type>
  IFIDNI <Type>, <ASYNC_SERVICE>
     BeginProc Service_Name, ASYNC_SERVICE
  ELSE
     %OUT ERROR: Service_Thunk <Type> parameter must be\
ASYNC_SERVICE or undefined
     err
  ENDIF
ELSE
  BeginProc Service_Name, SERVICE
ENDIF
  EXTRN   _&Service_Name:NEAR

IFDEF DEBUG
  Debug_Out "In &Service_Name"
ENDIF
       pushad
       pushfd
       push     esp
       cCall    _&Service_Name
       add      esp, 4
       popfd
       popad
       ret
  EndProc Service_Name
ENDM

The service thunks are defined as follows using the macro:

VxD_CODE_SEG
	Service_Thunk   VDD_Get_ModTime
	Service_Thunk   VDD_Set_HcurTrk
	Service_Thunk   VDD_Msg_ClrScrn
	Service_Thunk   VDD_Msg_ForColor
	Service_Thunk   VDD_Msg_BakColor
        Service_Thunk   VDD_Msg_TextOut
	Service_Thunk   VDD_Msg_SetCursPos
	Service_Thunk   VDD_Query_Access
        ; New services for 3.1
	Service_Thunk   VDD_Check_Update_Soon
VxD_CODE_ENDS

The service table is defined as usual:

	.xlist
	INCLUDE VMM.INC
PUBLIC VDD_Service_Table
Create_VDD_Service_Table EQU True
	INCLUDE VDD.INC
	.list

Finally, a service procedure written in 'C' uses a pointer reference to the registers, as provided by the thunk, to access the parameters:

/*
 * VOID VDD_PIF_State
 * Description:
 *     Informs VDD about PIF bits for newly created VM.
 * Parameters:
 *     PREGS pRegs
 *     pRegs -> ebx = VM handle
 *     pRegs -> ax = PIF bits
 * Return (VOID):
 *     Nothing.
 */

VOID CDECL VDD_PIF_State(PREGS pRegs, PVDDCB pVMCB) {
  if (vmmTestSysVMHandle(pRegs -> ebx)) {
    wPIFSave = (WORD) pRegs -> eax;
  }else{
    pVMCB = (PVDDCB) (pRegs -> ebx + dwVidCBoff);
    if (pVMCB -> VDD_PIF != (WORD) pRegs -> eax) {
      pVMCB -> VDD_PIF = (WORD) pRegs -> eax;
      VDD_TIO_SetTrap( pRegs -> ebx, pVMCB);
    }
  }
} // VDD_PIF_State()
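The pointer handed to the service procedure comes from the push esp in the Service_Thunk macro, which executes after pushad and pushfd. It therefore points at the saved flags, followed by the general registers in the order pushad leaves them in memory. The following sketch spells out that layout. DEMO_REGS and the two accessors are hypothetical names; the actual PREGS definition in the sample sources may differ, so treat this as a derivation from the macro rather than a copy of the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame built by "pushad / pushfd / push esp", lowest address first. */
typedef struct tagDEMO_REGS {
    uint32_t eflags;   /* pushfd, pushed last                      */
    uint32_t edi;      /* pushad pushes EDI last ...               */
    uint32_t esi;
    uint32_t ebp;
    uint32_t esp;      /* ESP before pushad; discarded by popad    */
    uint32_t ebx;
    uint32_t edx;
    uint32_t ecx;
    uint32_t eax;      /* ... and EAX first (highest address)      */
} DEMO_REGS;

/* Reads the low word of the saved EAX (the caller's AX). */
uint16_t DemoGetAX(const DEMO_REGS *pRegs)
{
    return (uint16_t)(pRegs->eax & 0xFFFFu);
}

/* Stores a 16-bit result in the saved AX without touching the high
 * word; popad in the thunk then returns it to the caller. */
void DemoSetAX(DEMO_REGS *pRegs, uint16_t value)
{
    pRegs->eax = (pRegs->eax & 0xFFFF0000u) | value;
}
```

Because the thunk restores the registers with popfd and popad, anything the 'C' procedure writes into this frame becomes the register state the caller sees on return.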

VSIMPLED Sources in 'C'

The VSIMPLED VxD introduced in Chapter 1 has been rewritten in 'C' to demonstrate some of the techniques discussed in this chapter:

VSDINIT.C

/*  Module: vsdinit.c
 *  Purpose:
 *      Init code and data for VSIMPLED.
 *  Development Team:
 *      Bryan A. Woodruff
 *  History:    Date      Author      Comment
 *		 3/14/93  BryanW      Wrote it.
 *
 *	  Copyright (c) 1993 Woodruff Software Systems.
 *		      All Rights Reversed.
 */
#include <vmm.h>
#include "vsimpled.h"

#pragma data_seg("_IDATA","ICODE")

/*		  I C O D E
 * BOOL VSIMPLED_Sys_Critical_Init
 * Description:
 *   On entry, interrupts are disabled. Critical initialization
 *   for this VxD should occur here. For example, we can read
 *   settings from VMM's cached copy of the SYSTEM.INI and
 *   set up our VxD as appropriate.
 *
 *   This procedure is called when the VxD_Control_Proc
 *   dispatches the Sys_Critical_Init notification from VMM.
 *
 *   We can notify VMM of success or failure by returning TRUE or
 *   FALSE.
 *
 * Parameters:
 *   DWORD hVM          System VM handle
 *   DWORD dwRefData	reference data passed from real-mode init
 *   PSTR pcmdTail      pointer to WIN.COM's command tail
 *   PCRS_32 pCRS       pointer to System VM client register structure
 *
 *  History:   Date       Author      Comment
 *		3/ 9/93   BryanW      Wrote it.
 */

BOOL CDECL VSIMPLED_Sys_Critical_Init(
  DWORD    hVM,
  DWORD    dwRefData,
  PSTR     pCmdTail,
  PCRS_32  pCRS) {

  vmmDebugOut("VSIMPLED_Sys_Critical_Init\r\n");
  return TRUE;
} // end of VSIMPLED_Sys_Critical_Init()

/*  BOOL VSIMPLED_Device_Init
 * Description:
 *   This is a non-system critical initialization procedure.
 *   IRQ virtualization, I/O port trapping, and VM control
 *   block allocation can occur here.
 *   Again, the same return value applies: TRUE for success,
 *   FALSE for error notification.
 * Parameters:
 *   DWORD hVM          System VM handle
 *   PSTR pCmdTail      pointer to WIN.COM's command tail
 *   PCRS_32 pCRS	pointer to System VM client register structure
 *
 * History:   Date       Author      Comment
 *             3/ 9/93   BryanW      Wrote it.
 */

BOOL CDECL VSIMPLED_Device_Init(
  DWORD    hVM,
  PSTR     pCmdTail,
  PCRS_32  pCRS) {

  vmmTraceOut("VSIMPLED_Device_Init\r\n");
  return TRUE;

} // end of VSIMPLED_Device_Init()

//  End of File: vsdinit.c

VSIMPLED.C

/*
 * Module: vsimpled.c
 * Purpose:
 *   A simple VxD written in 'C'.
 * Development Team:
 *   Bryan A. Woodruff
 * History:   Date        Author      Comment
 *             3/ 9/93    BryanW      Wrote it.
 *
 *   Copyright (c) 1993 Woodruff Software Systems.
 *                  All Rights Reversed.
 */

#include <vmm.h>
#include "vsimpled.h"

#pragma data_seg("_LDATA", "LCODE")

//       V I R T U A L    D E V I C E    D E C L A R A T I O N

DDB VSIMPLED_DDB = {NULL,                       // must be NULL
		    DDK_Version,                // DDK_Version
		    VSIMPLED_Device_ID,         // Device ID
		    VSIMPLED_Major_Ver,         // Major Version
		    VSIMPLED_Minor_Ver,         // Minor Version
		    NULL,
		    "VSIMPLED",
		    Undefined_Init_Order,
		    (DWORD) vmmwrapVxDControlProc,
		    NULL,
		    NULL,
		    NULL,
		    NULL,
		    NULL,
		    NULL,
		    NULL};

// This table is used by the vmmwrapVxDControlProc.
// It lists the messages and associated dispatch functions.
// It must be terminated with -1 and NULL.

DISPATCHINFO alpVxDDispatchProcs[] =
  { Sys_Critical_Init,   VSIMPLED_Sys_Critical_Init,
    Device_Init,         VSIMPLED_Device_Init,
    Create_VM,           VSIMPLED_Create_VM,
    -1,                  NULL};

/* BOOL CDECL VSIMPLED_Create_VM( DWORD hVM, PCRS_32 pCRS)
 * Description:
 *   Notification when VMs (other than system VM) are created.
 * Parameters:
 *   hVM        VM handle
 *   pCRS       pointer to client register structure
 *
 * History:   Date       Author      Comment
 *             3/ 9/93   BryanW      Wrote it.
 */

BOOL CDECL VSIMPLED_Create_VM( DWORD hVM, PCRS_32 pCRS) {

  vmmTraceOut("VSIMPLED_Create_VM\r\n");

  return TRUE;

} // end of VSIMPLED_Create_VM()

//  End of File: vsimpled.c

Chapter 11

Using the Debugging Services

Debug Strings
Assertions
Extended Debug Commands

Debugging services are some of the most important, but least used, services of the VMM. The debugging services provide important feedback during the operation of your VxD. The debug version of WIN386, through the debugger interface, provides key information that can help you track down even the most difficult bugs. A better understanding of the debug services and VMM's debugging interface can save you time and frustration.

Debug Strings

The most commonly used macros are Debug_Out and Trace_Out, which expand to calls to the Out_Debug_String service. Debug_Out also embeds an INT 1 in the code to cause a debugger break after displaying the string.

Debug trace strings are useful when you are tracking the last action before a crash or watching the execution path of code. Trace_Out is particularly well-suited to this. Debug_Out is most commonly used when an assertion fails or some other unexpected event occurs.

In Windows 3.1, the Mono_Out and Mono_Out_At macros call the Out_Mono_String service to display a string on the monochrome display. The Out_Mono_String service offers you a fast memory write so you don't have to wait for the serial port when using the WDEB386 debugger. This is excellent for high-frequency debug strings in such places as interrupt handlers.

The Queue_Out macro calls the Queue_Debug_String service, which queues a message string until it is retrieved by the .LQ command from the debugger interface. This is useful when multiple debug traces are occurring and scrolling from view. The Queue_Out macro lets you record events and display them at your convenience.

Assertions

The DEBUG.INC header file includes a few useful assertions that are only available in a debug build of your VxD. Some of these services may not be available in the retail build of WIN386. See Appendix A for details.

Assert_VM_Handle	Verifies that the provided register or memory location contains a valid VM handle.
Assert_Cur_VM_Handle	Verifies that the provided register or memory location contains the current VM handle.
Assert_Client_Ptr	Verifies that the provided register or memory location points to the client register structure of the current VM.
Assert_Ints_Disabled	Verifies that interrupts are disabled.
Assert_Ints_Enabled	Verifies that interrupts are enabled.

Extended Debug Commands

Extended debug commands are available in the debug version of WIN386 through the .VMM command from a debugger prompt. The following menu appears when you invoke this command:

VMM DEBUG INFORMATIONAL SERVICES
[A] System time
[B] Time-slice information/profile
[C] Dyna-link service profile information
[D] Reset dyna-link profile counts
[E] I/O port trap information
[F] Reset I/O profile counts
[G] Turn procedure call trace logging on
[H] V86 interrupt hook information
[I] PM interrupt hook information
[J] Reset PM and V86 interrupt profile counts
[K] Display event lists
[L] Display device list
[M] Display V86 break points
[N] Display PM break points
[O] Display interrupt profile
[P] Reset interrupt profile counts
[Q] Display GP fault profile
[R] Reset GP fault profile counts
[S] Toggle Adjust_Exec_Priority Log AND DISPLAY
[T] Reset Adjust_Exec_Priority Log info
[U] Toggle verbose device call trace
[V] Fault Hook information
Enter selection or [ESC] to exit:

The information available through this interface is quite extensive and specific to VMM. For example, the time slice command displays the following:

# VMs scheduled = 02
# idle VMs = 01
Time-Slice focus VM = 804A1000
Scheduled VM    = 804A1000
Time slice size = 00000014
Timer period     = 14
804A1000 background

The following dot (.) commands are also available in the debug version of VMM:

.VM [#]         Displays complete VM status
.VC [#]         Displays the current VM's control block
.VH             Displays the current VM handle
.VR [#]         Displays the registers of the current VM
.VS [#]         Displays the current VM's virtual mode stack
.VL             Displays a list of all valid VM handles
.T              Toggles the trace switch
.S [#]          Displays short logged exceptions starting at #
.SL [#]         Displays long logged exceptions
.LQ             Display Queue outs from most recent
.DS             Dumps the protected mode stack with labels
.HE [handle]    Displays Heap information
.ME [handle]    Displays Memory information
.MV             Displays VM Memory information
.MS PFTaddr     Display PFT info
.MF             Display Free List
.MI             Display Instance data info
.ML LinAddr     Display Page table info for given linear address
.MP PhysAddr    Display ALL Linear addrs that map the given addr
.MD             Change debug MONO paging display
.MO             Set a page out of all present pages
.VMM            Menu VMM state information
.<dev name>     Display device specific info

One of the most useful commands is the exception tracing option. To turn tracing on, use the .T command:

start tracing

stop tracing
exceptions logged = 00000C9D
00000C9D: OUT  804A1000 02 EI VMM  800E097E
00000C9C: 0050 804A1000 02 EI VMM  800E097E
00000C9B: 0006 804A1000 02 EI V86  2586:2230
00000C9A: OUT  804A1000 02 DI V86  C803:0A05
00000C99: 0006 804A1000 03 EI V86  2586:2230
00000C98: OUT  804A1000 03 DI V86  FFFF:0BEB
00000C97: 0006 804A1000 04 DI V86  265F:14A0
00000C96: OUT  804A1000 04 EI V86  D800:04A1
00000C95: 001A 804A1000 04 EI V86  D800:04A1 INT 1A   00000004
00000C94: OUT  804A1000 04 EI V86  D800:04A1
00000C93: 001A 804A1000 04 EI V86  D800:04A1 INT 1A   0000008C
00000C92: OUT  804A1000 04 EI V86  0486:0EF0
00000C91: 0050 804A1000 04 EI V86  0486:0EF0 INT 50   00000308
00000C90: OUT  804A1000 04 DI V86  1024:0F3C
00000C8F: 0013 804A1000 03 EI V86  FFFF:0BEB INT 13   00000308
00000C8E: OUT  804A1000 02 DI V86  B1AD:0031
00000C8D: 002A 804A1000 02 DI V86  B1AD:0031 INT 2A   00008200
00000C8C: OUT  804A1000 02 DI V86  C803:0A05
00000C8B: 0006 804A1000 02 EI V86  2586:2230
00000C8A: OUT  804A1000 02 DI V86  B1AD:0031
00000C89: 002A 804A1000 02 DI V86  B1AD:0031 INT 2A   00008200

The exception log shows 0xC9D exceptions during the short period that the system is allowed to run. To display details about an exception, use the .SL command:

#.sl c8b
stop tracing
Show exception 00000C8B
00000C8B: 0006 804A1000 02 EI V86  2586:2230

V86 Fault 0006  VM_Handle = 804A1000     00000C8B
AX=00007000 CS=2586  IP=00002230 FS=0000
BX=00000005 SS=0BCC  SP=00000190 GS=0000     TIME=00000096:1930
CX=0000001A DS=9E9B  SI=0000003F BP=0000201A
DX=0000001A ES=0000  DI=00004000 FL=00033202

This fault occurred in V86 mode and was an invalid opcode (exception 6). To learn why an invalid opcode occurred, we need to look at the disassembly:

#u &2586:2230
&2586:00002230 6380fc90      arpl   word ptr [bx+si+90fc],ax

Obviously, an arpl is not a valid V86 instruction. This arpl instruction is really a V86 break point. To demonstrate that this assumption is valid and to find the owner, we can use the M command (Display V86 break points) in the VMM debugging interface:

	       CS:IP    Hit Count Ref Data   Procedure
	   2586:2230    00002D76  00000031   @Resume_Exec + 2a

The owner of this break point is the Resume_Exec service, which probably means that this fault was generated as the result of V86 nested execution in the VM.

As you can see, using the debug version of WIN386 is essential to tracking down problems with your VxD. Some additional helpful debugging tips:

Always run the debug version of WIN386.EXE during your development and test cycle, however painful it may be. Although this version may be slower, it is much more informative than the retail version. The debug version of VMM will let you know when you've done bad things to the system.
Use the debug string services to output information during essential operations of your VxD. Watch for return codes and use Debug_Out when something unexpected occurs.
If you suspect that code in a particular VM is causing problems with your VxD, use the .VL and .VM commands to display the VM status and then set a break point at the current CS:IP. Restart the system and trace through the VM's code.
Become familiar with the P (step over) and T (trace into) commands of the WDEB386 debugger or similar commands in your favorite debugger. Watching the code as it executes (especially with nested execution) is essential to locating problems.
Never treat a system hang as the end of the world. Restart the system, turn on exception tracing, reproduce the problem, and break into the debugger. You should find that the exception tracing will assist in pin-pointing the problem. Once you become familiar with the fault sequences under normal operation of the system, you should be able to look at an exception log and find the areas of interest.
Load the symbols for the debugger, including WIN386.SYM and any core components of the Windows GUI that may be of interest, such as KRNL386.SYM. Once you've located an address that may be causing problems, you can locate the nearest symbol by using the LN (list near symbols) debugger command.

Chapter 12

VCOMMD Design Notes

Design
The Code
ComSysCritInit
Port Trapping
IRQ Trapping
Com_Api_Proc
VM Creation and Destruction
The Total VxD

Unfortunately, some of the best example programs are not themselves terribly usable. That holds for the example here: While it is useful as a teaching tool, I strongly recommend against actually using it in your system.

The following program virtualizes the COM1 port. One of the biggest problems with WIN386 today is the multitude of hardware cards, mostly used for communication of one type or another (modem, fax, network, tape, and so forth), that attempt to run without a VxD. I chose this topic in the hope that, by focusing on this particular problem, more hardware vendors will provide VxDs for their cards.

This driver does not fully replace the VCD. It virtualizes the COMM port and can be used instead of the VCD by DOS apps. However, it does not include the calls required to support Windows COMM drivers, so it cannot be used by Windows programs that talk to the Windows COMM API.

Design

The goal of our COMM device is to virtualize the COMM port. If at all possible, we want to allow several applications to use the port simultaneously. Many applications should be able to read the state of the port and even set the communication parameters, even if they are not going to talk over the line.

We can fully virtualize all of the ports except for the actual data port. Because we cannot virtualize the actual data port, we have to make sure that only one application can talk on the line at any given time. If two try to talk at the same time, we have to let the user decide which application can use the port.

We also need to reflect interrupts into the proper VM, which is an expensive operation, so we want to make sure that we only do it if absolutely necessary. We can establish this by watching the value that the application writes to the Interrupt Enable Register and by trapping when the application does an EOI. Also, since emulation has so much overhead, we need to define a new interface that is directly callable from DOS, Windows, and other VxDs, is designed to allow block I/O (which is much faster than handling things on a byte-by-byte basis), and implements an open and close on the port so that we know when an app is done with the port. This eliminates the need to handle contention problems.

So, while we emulate to support existing applications, we also create a new API that works a lot more efficiently in a WIN386 world. If you write the only code that touches your card, then you should consider creating just the new interface. In this case, you still want to trap on your ports, so that other applications cannot write to them by mistake.

The Code

Declare_Virtual_Device sets up our VxD. RS232_DEVICE_ID is an identification number Microsoft has assigned to me personally; do not use it in any of your own VxDs. I use this same number for other VxDs I write about. The init order is set to VCD_InitOrder+1, so that RS232 loads before VCD, allowing us to get the IRQ and ports instead of VCD.

VidComIrq

VidComIrq is the data structure required by VPICD_Virtualize_IRQ to grab the IRQ. ComHwInt is called on each IRQ that comes in. Because we reflect the IRQ into a VM, we need ComEoi. ComEoi is called when the VM does an EOI. We then do a VPICD_Phys_EOI.

Finally, when we are reflecting interrupts to a VM, we want to be careful not to use up all of its stack. Therefore, rather than simulating another IRQ when the VM does an EOI, we wait until its IRQ handler does an iret, completely unwinding the stack, before we send in another one. We use ComIret, which is called after the VM does an iret, to emulate the next pending IRQ.

When VPICD receives an interrupt, it masks the interrupt off and sends an EOI. It then reflects the IRQ to our VxD. When we do a VPICD_Phys_EOI, the VPICD unmasks the interrupt. This has two important ramifications. First, another interrupt can then occur immediately, and we can see it as soon as we unmask it. Second, if we never EOI, the interrupt is never unmasked, and we never see it again.

The Buffers

When a byte comes in on the data port, we want to read it before the next data byte overwrites it. A VM cannot always respond this quickly. While on average we must be able to reflect data to the VM as fast as it comes in, we can't guarantee this for every byte (much like the argument for using an interrupt rather than polling to handle an asynchronous line). Therefore, all reads and writes are done within the VxD using buffers. All port emulation reads and writes also go to the buffers. Both the read and write buffers are circular buffers.

If the read and write pointers point to the same location, the buffer is empty. There is no buffer overrun check, because on an overrun we lose data either way: if we checked, we would lose new data; since we ignore the problem, we lose old data. The result is the same: the program still runs but data is lost. (Granted, we lose more data this way, but if we lose any data, we are generally in trouble.) This eliminates the performance hit of checking the buffer size on each read and write.

The read buffer needs three bytes for each data byte received. For each data byte, we first read the two status registers and store them. We then read the data byte and store it. We read the status bytes first so that the line status shows the data byte. By saving all three bytes, the calling application can get the status for each data byte.
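To make the buffer layout concrete, here is a minimal sketch in C of a circular read buffer that stores three bytes per received byte and skips the overrun check. The type and function names and the buffer size are my own assumptions for illustration; the VxD itself is written in assembler and its actual layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical size; a power of two so the index can wrap with a mask. */
#define RX_SLOTS 64u

typedef struct {
    uint8_t modem_status;   /* status register read before the data byte */
    uint8_t line_status;    /* status register read before the data byte */
    uint8_t data;           /* the received byte itself                  */
} RxEntry;

typedef struct {
    RxEntry  slot[RX_SLOTS];
    unsigned get;           /* next entry to hand to the VM */
    unsigned put;           /* next free slot               */
} RxRing;

/* The buffer is empty exactly when both pointers refer to the same slot. */
static int rx_empty(const RxRing *r)
{
    return r->get == r->put;
}

/* No overrun check: filling the buffer completely makes it look empty,
   so the old data is silently lost. */
static void rx_put(RxRing *r, uint8_t msr, uint8_t lsr, uint8_t data)
{
    RxEntry *e = &r->slot[r->put];
    e->modem_status = msr;
    e->line_status  = lsr;
    e->data         = data;
    r->put = (r->put + 1u) & (RX_SLOTS - 1u);
}

/* Returns 1 and fills *out if an entry was available, 0 if empty. */
static int rx_get(RxRing *r, RxEntry *out)
{
    assert(out != 0);
    if (rx_empty(r))
        return 0;
    *out = r->slot[r->get];
    r->get = (r->get + 1u) & (RX_SLOTS - 1u);
    return 1;
}
```

Note what happens when exactly RX_SLOTS entries are written without a read: the put pointer wraps back onto the get pointer and the buffer reports empty. That is the "lose old data" case described above.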

Other Data

Next comes a number of jmp tables. These are used at various places within the code to quickly jmp to the proper function.

bInVmirq is a count of how many IRQs sent to the VM have not yet returned. Sending several at once is not a problem, as long as we don't overflow the VM's stack. This count should never go over 2.

bIntEnb holds the value of the Interrupt Enable Register as set by the VM that owns the port. Regardless of the value set, the hardware always has bits 0111b set. If the app in the VM has not set these bits, we do not want the performance hit of emulating an IRQ. Therefore, we use the values in bIntEnb to see whether we need to reflect an IRQ.

ComSysCritInit

We do all of our initialization during Sys_Critical_Init. This allows us to get on the IRQ and ports while no interrupts are occurring. We first use Allocate_Device_CB_Area to get some per-VM data. We can then access this data by adding the returned value to the VM handle.

Next we take over the eight COM1 I/O ports. If we cannot take over all of them, we return with carry set, which tells WIN386 not to load our VxD. If we don't own all of the ports, we are in conflict with another VxD (this is why VCD will fail to load if you load this VxD).

Following that, we take over IRQ4. In a commercial VxD, both the port numbers and the IRQ should be able to be overridden by values in system.ini. You can read system.ini by using Get_Profile_String. This allows you to change settings if the board is reconfigured. Once we have both the ports and the IRQ, we know we can run.

Now, we hook interrupts 21h, 23h, and 24h, so that we can take ownership of the port away from a VM if it terminates. While interrupts 23h and 24h do not guarantee that an app has terminated, an app can terminate in this manner.

Finally, we initialize the COM hardware, turning the interrupts on and enabling the transmit and receive interrupts.

Port Trapping

Trapping is where half the work of emulating the port occurs (the other half is the IRQ emulation). ComIoPortTrap is the common entry point. If the call comes from the VM that owns the port, the logic is quite simple.

First we call Emulate_Non_Byte_IO. If we get a request for non-byte I/O (word, dword, string), this macro breaks it into byte-sized calls. Since I don't foresee anyone actually using these calls, I use the emulate macro. If an app is likely to do a string of 512 bytes, you will want to handle it yourself. The overhead of Emulate_Non_Byte_IO is significant.

Next, we clear the direction flag. (If we don't, we will get an annoying, time-consuming, intermittent bug.)

Then, if we don't take the jmp, we build the jmp vector offset. This takes into account the sizes of the read and write tables, as well as the specific values of ECX for reads and writes. We then jmp to the proper function, so that the ret from that function takes us directly back to WIN386. Any call, taken jmp, or ret flushes the instruction prefetch queue on the 386 and 486, so we want to minimize these. Conditional jmps that are not taken do not flush it. That's why ComIoPortTrap has a single jmp for the common code path throughout this code. Generally, emulation code is never fast enough, so you do everything you can to speed it up.
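The same table-driven dispatch can be sketched in C with an array of function pointers indexed by direction and port offset. This is only an illustration of the idea: the handler names and their placeholder return values are hypothetical, and the real VxD builds the offset in assembler from ECX rather than from an is_write flag.

```c
#include <assert.h>
#include <stdint.h>

#define COM1_BASE 0x3F8u
#define NUM_PORTS 8u

typedef uint8_t (*PortHandler)(uint8_t value);

/* Placeholder handlers; the return values only make the dispatch visible. */
static uint8_t read_data(uint8_t value)   { (void)value; return 0xD0; }
static uint8_t read_other(uint8_t value)  { (void)value; return 0x00; }
static uint8_t write_data(uint8_t value)  { return value; }
static uint8_t write_other(uint8_t value) { (void)value; return 0xFF; }

/* Row 0 holds the read handlers, row 1 the write handlers. */
static const PortHandler io_table[2][NUM_PORTS] = {
    { read_data,   read_other,  read_other,  read_other,
      read_other,  read_other,  read_other,  read_other  },
    { write_data,  write_other, write_other, write_other,
      write_other, write_other, write_other, write_other },
};

/* One indexed call replaces a chain of compares and branches. */
static uint8_t dispatch_io(uint16_t port, int is_write, uint8_t value)
{
    assert(port >= COM1_BASE && port < COM1_BASE + NUM_PORTS);
    return io_table[is_write ? 1 : 0][port - COM1_BASE](value);
}
```

A C compiler will turn the indexed call into an indirect call rather than the jmp the VxD uses, so the return path is not identical, but the table layout is the same idea.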

If the calling VM doesn't own the port, we need to decide what to do. If no one owns the port, we can assign it to the calling VM. It would probably be better to assign the port to the first VM that accesses the data port; instead, it is assigned to the first app to hit the port at all. We then initialize the port to the values we were holding in our instance data. If the app has written those values (while another app owned the port), it expects the hardware to be in that configuration.

If someone else owns the port, we fake it, provided it is not a data read/write, by reflecting it back to the port-specific function, which handles this. The one exception is I/O to 3F8h when it is set to be the baud rate instead of the data port; that is handled in-line. If we have a data I/O and someone else owns the port, we have to decide who gets it. If the owner app used the new API, it keeps the port. This not only gives apps an incentive to use the new API but also leaves the port with the app that will free it as soon as it is done. Use a contention prompt when you think the owner may be done but are not sure.

Otherwise, we put up a contention MessageBox using SHELL_Resolve_Contention. This call puts up a box asking the user to pick between the two VMs by using their window titles to ID them (which usually both read MS-DOS Prompt). If the user picks the new one, the ownership is switched. The one that is not picked is marked as FAILED so we don't keep prompting every time it tries to read/write a byte.
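The decision made for a VM that does not own the port can be summarized as a small function. This is a sketch in C under my own naming (the enum and the flag parameters are hypothetical); it only restates the order of the tests described above and does not call any real VMM or SHELL service.

```c
#include <assert.h>

typedef enum {
    ACT_TAKE_PORT,  /* nobody owns the port: the caller becomes the owner */
    ACT_FAKE_IT,    /* serve the access from the caller's instance data   */
    ACT_REFUSE,     /* the owner keeps the port and nobody is prompted    */
    ACT_ASK_USER    /* put up the contention prompt                       */
} Action;

static Action non_owner_action(int port_has_owner, int is_data_io,
                               int owner_used_api, int caller_failed)
{
    if (!port_has_owner)
        return ACT_TAKE_PORT;
    if (!is_data_io)
        return ACT_FAKE_IT;
    /* An owner that opened the port through the new API keeps it, and a
       VM already marked FAILED is not allowed to trigger another prompt. */
    if (owner_used_api || caller_failed)
        return ACT_REFUSE;
    return ACT_ASK_USER;
}
```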

In IoRead8 all input goes through the buffer. Therefore, the first thing we do is look for bytes in the buffer. If the buffer is empty, we return a 0; otherwise, we get the data byte from the buffer, inc the read pointer to the next set of data, and return the byte. Notice that we only take a conditional jmp if the pointer wrapped. This eliminates jmps from the common code path. We only get to IoRead8 if the DLAB bit is off (it's the data byte). ComIoPortTrap handles virtualizing the low byte baud rate in 3F8h.

In IoRead9, we first test to see whether we own the port. If not, we jmp to the end of the function to return the information from our instance data. On a write to 3F9h, we save these values so we return what the app expects. If DLAB is set, we read the port and return the value. If DLAB is not set, we return the value in bIntEnb so that the app receives the value it expects.

IoReadA is completely faked. We know which IRQ we sent down to the app and return the appropriate value. If we did not send an IRQ down, we either return 001b (receive IRQ) if we have data or return nothing if we do not. IoReadB and IoReadC, on the other hand, are both quite simple. If the app owns the port, we read from the hardware. If not, we read from the instance data.

IoReadD returns the line status. It tells us whether we can read or write a byte and whether there are any errors. If the calling app owns the port, we return data from the read buffer. If the read buffer is empty, we read the actual port. But if the calling app does not own the port, we return 00011110b which tells the app that the transmit buffer is full (the app cannot write), the receive buffer is empty (the app cannot read), and all error bits are on. This seems to be the best way to get the point across to the app that it is not going to have any luck with this port.
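For reference, the value 00011110b decodes against the standard 8250 Line Status Register bits as shown in this C sketch. The bit definitions are the usual 8250 ones; the function name and its parameters are hypothetical and only restate the three cases in the text.

```c
#include <assert.h>
#include <stdint.h>

/* 8250 Line Status Register bits */
#define LSR_DATA_READY 0x01u  /* a received byte is waiting           */
#define LSR_OVERRUN    0x02u  /* overrun error                        */
#define LSR_PARITY     0x04u  /* parity error                         */
#define LSR_FRAMING    0x08u  /* framing error                        */
#define LSR_BREAK      0x10u  /* break interrupt                      */
#define LSR_THR_EMPTY  0x20u  /* transmit holding register is empty   */

/* All four error bits set, nothing to read, no room to write. */
#define LSR_NOT_OWNER  (LSR_OVERRUN | LSR_PARITY | LSR_FRAMING | LSR_BREAK)

static uint8_t read_line_status(int caller_owns_port, int read_buffer_empty,
                                uint8_t buffered_lsr, uint8_t hardware_lsr)
{
    if (!caller_owns_port)
        return (uint8_t)LSR_NOT_OWNER;
    /* Prefer the status saved with the buffered byte; fall back to the port. */
    return read_buffer_empty ? hardware_lsr : buffered_lsr;
}
```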

IoReadE is straightforward. If the calling app does not own the port, we use our instance data. If it does own the port, we get the data from the read buffer. If the read buffer is empty, we read from the hardware.

IoReadPort (used only for port F) just reads from the hardware if the calling app owns the port. If the caller does not own the port, it returns 0. This port is undefined for the 8250, so we can't virtualize it.

IoWrite

IoWrit8 copies the data to the write buffer and increments its pointer. Again, it uses two jmps if the pointer wrapped to avoid jmps when the pointer does not wrap. If the output buffer was empty, we call IrqTransmit to send the byte to the hardware.

IoWrit9, like IoRead9, is tricky. If the write is from an app that does not own the port, we copy the value to the instance data for that VM. We do this for both the interrupt enable and the high-baud registers (both of which use this port). We use the instance data for the line control register to determine whether DLAB is set. If the app owns the port, and it is writing to the interrupt enable register, we save the value in bIntEnb and then `or' it with 0011b. This forces an IRQ on receive full and transmit empty, which we need for our buffering code. We then write the byte to the hardware.
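The owner's path through IoWrit9 might look like the following in C. This is a sketch only: the structure and function names are hypothetical, and the function returns the byte that would be written to the hardware instead of doing the port write itself.

```c
#include <assert.h>
#include <stdint.h>

#define LCR_DLAB   0x80u  /* divisor latch access bit in the line control register */
#define IER_FORCED 0x03u  /* the 0011b mask from the text                          */

/* Hypothetical shadow copy of the registers involved. */
typedef struct {
    uint8_t int_enable;    /* what the app believes it wrote (bIntEnb) */
    uint8_t baud_high;     /* high byte of the baud rate divisor       */
    uint8_t line_control;  /* last value written to the line control   */
} ComShadow;

/* Returns the byte that goes out to port 3F9h for the owning VM. */
static uint8_t write_port9_owner(ComShadow *s, uint8_t value)
{
    assert(s != 0);
    if (s->line_control & LCR_DLAB) {
        s->baud_high = value;          /* DLAB set: this is the divisor */
        return value;
    }
    s->int_enable = value;             /* remember the app's own value  */
    return (uint8_t)(value | IER_FORCED);
}
```

Keeping the app's value separately from what reaches the hardware is what lets IoRead9 hand back the number the app expects.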

IoWritB and IoWritC are both quite simple. If the calling app does not own the port, we copy the value to the instance data for that VM. If the app does own the port, we write to the hardware. IoWritPort (used for ports A, D, E, and F) goes directly to the port if the calling app owns the port. Writing to these ports is undefined for the 8250, so we cannot virtualize it.

IRQ Trapping

We trap the IRQ for two reasons. First, we need to see the interrupts when the transmit buffer is empty or the receive buffer is full for our buffering. Second, we need to reflect the interrupts down to the app that owns the port if it has enabled the interrupts that come in. When the interrupt handler is called, interrupts are off. We want to turn them on as soon as possible, because there may be other IRQs. When we turn them on, our IRQ remains masked until we call VPICD_Phys_EOI, so we do not need to worry about being re-entered. Since we do not need interrupts off for any reason, the first instruction is an STI. On calls to us, the direction flag is in an unknown state. We clear it so that movs instructions will increment the pointers.

In ComHwInt we determine the correct handler to call based on the value in port 3FAh. We use this value to determine which offset in IrqTabl to jmp to. We jmp so that the ret in the called function returns directly back to WIN386.
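The value read from port 3FAh is the 8250 Interrupt Identification Register: bit 0 is clear when an interrupt is pending, and bits 1 and 2 identify the source. A C sketch of turning that value into a table index follows. The enum names are mine, and I am assuming IrqTabl is ordered by the identification code, which the text does not state.

```c
#include <assert.h>
#include <stdint.h>

/* The order matches the 8250 identification codes in bits 2..1 of the IIR. */
typedef enum {
    SRC_MODEM_STATUS = 0,  /* IIR = 000b */
    SRC_TRANSMIT     = 1,  /* IIR = 010b, transmit holding register empty */
    SRC_RECEIVE      = 2,  /* IIR = 100b, received data available         */
    SRC_LINE_STATUS  = 3,  /* IIR = 110b */
    SRC_NONE         = 4   /* bit 0 set: no interrupt pending             */
} IrqSource;

static IrqSource decode_iir(uint8_t iir)
{
    if (iir & 0x01u)
        return SRC_NONE;
    return (IrqSource)((iir >> 1) & 0x03u);
}
```

Masking with 03h also ignores the upper bits that a 16550 sets when its FIFO is enabled, and it maps the 16550 receive timeout code (1100b) to the receive handler.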

In IrqReceive we first go into a loop that reads the data port until it is empty. We loop because the 16550 has a 16-byte FIFO and we could get multiple bytes. Doing this in this loop is much faster than getting each IRQ individually. We read the status ports first so that the line status will show that we have a data byte. After reading in the data, we call VPICD_Phys_EOI, which causes the IRQ to be unmasked (remember, it has already been EOIed). It's critical to do this as soon as possible so that we can get to the next interrupt quickly. This separates talking to the port from virtualizing it.

Now we need to virtualize the IRQ down to the VM. We only do this if we are not already in the middle of reflecting an IRQ. We also make sure we have data in our buffer. Finally, we don't reflect it if the app didn't turn on that interrupt. We then call VPICD_Set_Int_Request, which reflects the IRQ immediately if it can and otherwise reflects it as soon as possible.
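The three tests that gate the reflection can be written down as one predicate. This C sketch uses hypothetical parameter names standing in for bInVmirq, the buffer state, and bIntEnb; only the Interrupt Enable Register bit is a fixed 8250 fact.

```c
#include <assert.h>
#include <stdint.h>

/* 8250 Interrupt Enable Register bit 0: received data available. */
#define IER_RX_DATA 0x01u

/* Returns 1 when a receive IRQ should be reflected into the owning VM. */
static int should_reflect_rx_irq(unsigned irqs_in_vm, unsigned bytes_buffered,
                                 uint8_t vm_int_enable)
{
    if (irqs_in_vm != 0)
        return 0;   /* the VM is still inside a reflected IRQ     */
    if (bytes_buffered == 0)
        return 0;   /* nothing for its handler to read            */
    if ((vm_int_enable & IER_RX_DATA) == 0)
        return 0;   /* the app never asked for receive interrupts */
    return 1;
}
```

Putting the cheapest and most frequently failing test first keeps the common case short, which matters in an interrupt path.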

Finally, if we have set up a callback function, we set up an event to call the app back. We need to set up an event because we received the IRQ as an asynchronous event, limiting what we can do. We may not even be in the proper VM (remember, a VxD is always running in a VM, but which particular VM it is running in can change). If a fast response is critical, you may want to use Critical_Section_Boost instead of Cur_Run_VM_Boost.

IrqTransmit works basically the same way as IrqReceive. IrqModemStatus and IrqLineStatus are used merely to reflect the interrupts down to the VM. Our driver itself doesn't care about these.

VmCallBack is very simple. We pass a parameter in EAX which is the appropriate value in port 3FAh, letting the called app know whether the callback is due to a non-empty receive buffer or an empty transmit buffer. We then put the callback address in CX:EDX and use the Simulate_Far_Call to set up the stack and Resume_Exec to make the call. Don't forget the Client_State and Nest_Exec calls; without them it will not work.

ComEoi is called when the app sends an EOI to the PIC. We have to call VPICD_Clear_Int_Request to end the IRQ in that VM.

CoIntRet is called after the IRQ handler in the VM (the handler invoked as a result of our call to VPICD_Set_Int_Request) has executed its iret. At this point we call VPICD_Set_Int_Request again if we have data in our buffers and the app wants the IRQs. We do it here so that we do not eat up the app's stack by having IRQs come in on top of each other.

Com_Api_Proc

Com_V86_API_Proc and Com_PM_API_Proc are the entry points when a real or protected mode app calls us via the int 2F call. In the initial functions, we have to convert any pointers to flat 32-bit pointers. We then jmp to Com_API_Proc. Com_API_Proc copies the values for ECX and EDX that the app passed us to ECX and EDX and then calls the appropriate function. On return, it copies EAX, ECX, and EDX back to the client area on the stack, so that, on return, the calling app gets these return values. The actual calls here are simple. ComOpen and ComClose give apps a way to ask for the port and relinquish it when they are done. This eliminates the need for a contention MessageBox and for guessing when an app is done with the port.
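For the V86 entry point, the pointer conversion is plain arithmetic: shift the segment left four bits, add the offset, and add the base at which the VM's memory appears in the flat address space (the CB_High_Linear field of the VM control block). The C sketch below shows only that arithmetic; the function name is hypothetical, and in a real VxD you would normally let a VMM service such as Map_Flat do this, since it also copes with protected-mode selectors.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a V86 segment:offset pair into a flat linear address.
   high_linear is the base of the VM's address space as seen from ring 0. */
static uint32_t v86_to_flat(uint32_t high_linear, uint16_t seg, uint16_t off)
{
    return high_linear + ((uint32_t)seg << 4) + (uint32_t)off;
}
```

A protected-mode caller cannot be handled this way, because the selector's base has to be looked up in the descriptor table first. That is why there are two separate entry points in front of the common Com_API_Proc.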

ComRead and ComWrite essentially copy their data from and into the buffers and return. Doing read/writes of blocks of data is faster than emulating on a byte-by-byte basis and avoids buffer overruns.

VM Creation and Destruction

ComVmTerminate is called every time a VM is terminated. If the terminating VM owns the port, it obviously will not need it any more, so we clear the ownership and call-back address.

ComVmCreate is called every time a VM is created (except the system VM). On creation, we set the instance data to 1200,n,8,1.

ComInt21 and ComInt23_24 are used to determine when to take away ownership of a port. If a program exits, we want to take away its ownership. An app can end with an int 23 or int 24. It can also end with an int 21, function 4Ch, 31h, or 00h. We also take away ownership on an EXEC call.
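The int 21h test comes down to looking at the function number in AH. Here is a sketch in C; the function name is hypothetical, and I am assuming the EXEC call in question is the standard DOS function 4Bh.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if this int 21h function should make the VM give up the port. */
static int int21_releases_port(uint8_t ah)
{
    switch (ah) {
    case 0x00:  /* terminate program                  */
    case 0x31:  /* terminate and stay resident        */
    case 0x4B:  /* EXEC (assumed to be this function) */
    case 0x4C:  /* terminate with return code         */
        return 1;
    default:
        return 0;
    }
}
```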

The Total VxD

When you first look at the total VxD it may seem overwhelming. But if you break it into its component pieces, it becomes easy. The trick is to build the pieces one at a time.

First, build the core code that will talk to the hardware. Once you get this to work, decide which is more critical, the new API or the emulation, and build in that part. Then, build the other. As you do this, you need to keep a couple of things in mind. First, it is absolutely critical that your VxD performs all communication to the physical hardware. Do not let even the smallest part of it be handled directly by an application. For example, port 3FFh is undefined for the 8250. My VxD emulates it and only allows the app that owns the port to access it, rather than assuming that no one will access it. By the same token, port 3FBh is called very rarely, and I probably could have not trapped it. In that case, another VM could have written to it, changing the behavior of the port, and I would never know. Thus, you handle all of the hardware from your VxD for both speed and security reasons.

Create a new API using the direct call-in capability. It is much more efficient than trapping ports, interrupts, and so on. While you will still emulate the old API, you will have a much more efficient approach for new code. Also, try to minimize the number of times you have to make calls. Don't make calls to write one byte at a time -- have a call to write a block of data. In most situations, you can write 1 to 4 K as quickly as one byte.

Your emulation must average a certain speed, depending on what you are doing. However, if at 9600 baud the buffers in this VxD slowly fill up, its average speed is slower than 9600 baud. You either have to make your emulation faster or live with the limits. Generally you should find that there is only so much you can do to speed up emulation. Emulating a port is a big hit, and emulating an IRQ is a gigantic hit. Compared to real mode, emulation speed versus actual hardware speed is a difference in orders of magnitude. However, in this case, all is not lost. First, you can also trap software interrupts, which is faster than trapping ports and generally eliminates the need for IRQ emulation. In the example of this driver, we could trap int 14h. Unfortunately, most applications don't use int 14h, but we could be faster with those that do. Second, in the case of this VxD, while we talk to an 8250, we could emulate a 16550 with a FIFO buffer. On an IRQ, an app can read multiple bytes, eliminating the IRQs for all those bytes. By the same token, just because your VxD is written for a specific device does not mean you can't emulate another device more efficiently.

Chapter 13

Win-Link Design and Implementation Notes

Now it's time to look at how you can use VxDs to pull tricks in the real world. We'll use Win-Link as an example. As with many real-world projects, I had several reasons for writing this program.

The first part arose when I was having lunch with a number of other authors shortly before the launch of Windows 3.1. They complained that Windows was not 32-bit and was not pre-emptively multi-tasked, while OS/2 was. I immediately set about to refute this. Although little known at the time, Windows 3.1 did have support in it for 32-bit programs. Granted, it was minimal and required assembler at first, but it did exist (and it is what Win32 uses).

But that left OS/2 as the pre-emptively multi-tasked O/S. So I pointed out that the DOS boxes were pre-emptively multi-tasked under Windows. If a Windows app could talk to a DOS app in a DOS box and have the DOS app do the heavy work, then the Windows app would essentially be multi-tasked.

It made an interesting argument. Almost everyone at lunch was willing to concede that a Windows app could be multi-tasked. But it made me wonder how this could be implemented.

At the same time, there were a couple of features of Windows 3.1 that I found frustrating. When I am in a DOS box and type the name of a Windows program, it tells me that I need Windows to run it. Well, what does it think is running? When typing in the name of a Windows EXE from a DOS box, I want it to run that EXE. I also found the title of DOS boxes a little less than desirable. ALT-TABing through five windows, all called MS-DOS Prompt, usually did not tell me which DOS box was running Brief. I wanted the name of the program. And while I was at it, I had one more pet peeve: You can only print from one DOS box or Windows at a time. The DOS boxes don't spool their printing; they are dedicated to it until the printing completes. Yet Windows has a nice spooler. Everything was there; I just wanted the DOS boxes to print to the Windows spooler. Then all the DOS boxes could print simultaneously and do it quickly to the spooler.

Out of this came Win-Link, so named because it linked Windows and DOS applications. Win-Link is essentially two programs in one. First, it provides Interprocess Communication between Windows and DOS boxes as well as shared memory. Second, it extends the User Interface of Windows by (1) launching Windows applications (and additional DOS boxes) from a DOS box, (2) listing the running program as the title of a Windows DOS box, and (3) sending all printer output from DOS boxes to the Windows spooler.

Implementing this was a killer. First of all, a number of the major concepts had not been tried before. While everything should have worked, there was often only one implementation that actually did. In addition, there were a myriad of little details necessary to getting it right. Because the code intercepted calls in every DOS box and made asynchronous calls to Windows, every detail had to be right or the entire system would hang, or worse.

This chapter lays out the basic capabilities of the program to give you a clear picture of what the code is trying to accomplish. Then it details the specific logic used to implement each of these pieces, building on the previous pieces where appropriate. Finally, it walks through and explains the actual code. This chapter does not try to teach you anything general about writing VxDs. Instead, by concentrating on the specifics of a piece of real-world code that pulls a number of interesting hacks, you can learn from it by example.

The System

How does a Windows or DOS app know which DOS box it wants to send a message to? When a DOS box is launched, there is no way to identify it, so each DOS app must register itself with Win-Link when it starts up and unregister itself when it is exiting. An application can also make a call to get the VM handle for an application based on its ID. Therefore, a Windows or DOS application can launch a DOS application and keep polling until it finds the registered application (it needs to keep polling because the new DOS box needs enough time slices to start up and execute the app to the point it registers itself).

We know how a Windows app can launch a DOS box. However, how does a DOS app launch another DOS box (as opposed to spawning a process)? We add a call allowing a DOS app to launch another DOS app. The parameters are similar to spawning, but instead of spawning in the same VM, Win-Link starts a new VM that runs the app.

Next we need a way to pass messages back and forth. On the Windows side we already have a system, so we merely give DOS boxes a way to call PostMessage. In the other direction, and between DOS boxes, we have our own message queue. It has three calls: MsgPost to post a message to a VM, MsgPeek to look at a message sent to a VM, and MsgRead to read a message posted to a VM. Unlike Windows messages, these messages can't send pointers, because the VMs are in different address spaces. So we provide two ways to pass blocks of data between VMs. MsgMemCopy copies data from memory in one VM to memory in another VM. MsgMemCopy automatically knows whether each of the VMs is in V86 or protected mode and interprets the segment/selector appropriately. There are also calls to allocate and free LDT/GDT selectors for memory in a VM. While real-mode DOS applications cannot access these selectors, Windows apps as well as protected-mode DOS apps can. So a DOS app can pass the Windows app an LDT selector for some of its memory. Then both applications can access the memory. These calls give applications a way to communicate with each other between VMs.
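The difference between the three queue calls is easiest to see in code. The following C sketch is a stand-in for the per-VM queue, not Win-IPC's implementation: the message layout, the queue size, and the lower-case function names are my own assumptions, and unlike the COM buffers earlier it refuses a post when the queue is full.

```c
#include <assert.h>
#include <stdint.h>

#define MSG_SLOTS 8u   /* hypothetical; one slot is kept free to tell full from empty */

typedef struct {
    uint32_t id;       /* message number                            */
    uint32_t param;    /* a plain value, never a pointer (see text) */
} Msg;

typedef struct {
    Msg      slot[MSG_SLOTS];
    unsigned get;      /* next message to read */
    unsigned put;      /* next free spot       */
} MsgQueue;

/* Post a message; returns 0 if the queue is full. */
static int msg_post(MsgQueue *q, Msg m)
{
    unsigned next = (q->put + 1u) % MSG_SLOTS;
    if (next == q->get)
        return 0;
    q->slot[q->put] = m;
    q->put = next;
    return 1;
}

/* Look at the next message without removing it; returns 0 if empty. */
static int msg_peek(const MsgQueue *q, Msg *out)
{
    assert(out != 0);
    if (q->get == q->put)
        return 0;
    *out = q->slot[q->get];
    return 1;
}

/* Remove and return the next message; returns 0 if empty. */
static int msg_read(MsgQueue *q, Msg *out)
{
    if (!msg_peek(q, out))
        return 0;
    q->get = (q->get + 1u) % MSG_SLOTS;
    return 1;
}
```

A DOS app that polls can call the peek operation in its idle loop and only do a read once it is ready to act on the message.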

Two other sets of calls are provided to DOS applications. Win-Link provides a call to let a DOS application set its window title. For example, when Brief is running, having B as the title is preferable to MS-DOS Prompt; Brief - [filename.c] is even nicer. Win-Link also provides a set of calls for printing. While DOS printer output is captured fairly efficiently, all Win-Link can show for a print job is the name of the application printing the job. By adding a call to open the job, the application can display the name of the document being printed in the Windows spooler. Also, Win-Link generally has to guess when a print job has ended. This can be fixed by adding a call at the end of a job.

Finally, there are the DOS calls Win-Link intercepts. Win-Link intercepts all EXEC calls. On these calls Win-Link determines whether the program being executed is a Windows application. If so, Win-Link checks it against a list of files to execute as DOS apps. If the application is not on that list, Win-Link executes the program from Windows instead of from DOS.
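One plausible way to make the "is this a Windows application" test is to look for a new-executable header: the old MZ header stores the file offset of the new header at 3Ch, and a 16-bit Windows program has the signature NE there. The C sketch below shows that check on an in-memory copy of the start of the file. This is my assumption about the test, not a description of Win-Link's actual code, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the image starts with an MZ header that points at an NE header. */
static int looks_like_ne(const uint8_t *image, size_t len)
{
    uint32_t off;

    assert(image != 0);
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;

    /* Little-endian dword at 3Ch: file offset of the new header. */
    off = (uint32_t)image[0x3C]
        | ((uint32_t)image[0x3D] << 8)
        | ((uint32_t)image[0x3E] << 16)
        | ((uint32_t)image[0x3F] << 24);

    if (off >= len || len - off < 2)
        return 0;
    return image[off] == 'N' && image[off + 1] == 'E';
}
```

A check of this kind matches every program that carries an NE header, not only Windows programs, so it cannot be the whole answer by itself.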

The exception list is there for two reasons. There is no way to differentiate between bound OS/2 applications and Windows applications, so any bound OS/2 app must be on the exception list. Also, some applications have a complete DOS app as their Windows DOS stub program, and you may wish to run the DOS stub.

Win-Link intercepts all output sent to LPT1 via int 17h. We do not intercept print I/O directly to the port, nor do we intercept printers on other ports. But all output written to LPT1 at the DOS level eventually gets to int 17h, so that output is intercepted. Printing a file performed via the PRINT command or programmatically using PRINT's int 2Fh calls is also intercepted. But printing a file is intercepted at the command level, so that just the file name is passed to Win-Link, which is much more efficient than intercepting the calls to int 17h. When a file prints, the file name is the job name in the Windows print spooler. When a program prints through int 17h, the name of the program is the name of the job. When a program uses the Win-Link call to name a print job, it will be the name the program gave it.

EXEC, TERMINATE, and some other calls are tracked to determine the name of the program running in the DOS box. This name is then matched against a list, which expands predefined names to different names. For example, B changes to Brief. This name is then set as the title of the window for the DOS box.
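
The name expansion can be pictured as a small lookup table. The sketch below is hypothetical: only the B to Brief entry comes from the text, and the second entry and the function name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct TitleMap {
    const char *prog;   /* program name found in the DOS box */
    const char *title;  /* title to show on the window       */
};

static const struct TitleMap title_map[] = {
    { "B",       "Brief" },           /* example from the text */
    { "COMMAND", "MS-DOS Prompt" },   /* assumed entry         */
};

/* Returns the expanded title, or the program name itself if no
   expansion is defined for it. */
const char *title_for(const char *prog)
{
    size_t i;
    for (i = 0; i < sizeof title_map / sizeof title_map[0]; i++)
        if (strcmp(prog, title_map[i].prog) == 0)
            return title_map[i].title;
    return prog;
}
```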

The Approach

Win-Link is composed of three parts: (1) Win-Link, a Windows application, (2) Win-IPC, a VxD, and (3) raw.drv, a printer driver. Win-Link and Win-IPC provide the functionality we need. A VxD cannot make Windows calls and a Windows app cannot make VxD calls, so the two programs work together. Raw.drv is needed for printing because many printer drivers in Windows do not implement the PASSTHROUGH escape call.

The primary data structure is called VMDATA and is defined in both win_link.h and win_ipc.inc. One of these structures exists for each VM, including the system VM. They are set up in a linked list so that Win-Link or Win-IPC can walk through every VM's instance of the structure. This gives the VxD full access, with little effort, to any VM's data. In addition, the first element is an LDT selector:offset that points to the structure and is valid in the system VM. This provides an easy way for Win-IPC to give Win-Link a pointer to the structure for any VM.

In general, Win-IPC or Win-Link changes values in this structure and then sends a message to the other telling it what to look at in the structure. Following is a brief description of each element of the structure.

VmData struc
      VmLdt         dd    0
      VmHandle      dd    0
      PrnSem        dd    0
      MsgSem        dd    0
      TimeHdl       dd    0
      LinkNext      dd    0
      LdtNext       dd    0
      pPsp          dd    0
      MsgGet        dd    0              ; Next Message to read
      MsgPut        dd    0              ; Next free spot
      MsgLast       dd    0              ; Next == free -> empty
      PrntNum       dw    0
      hDc           dw    0
      iPrnErr       dw    0
      iStr          dw    0
      hWnd          dw    0
      wFlags        dw    0
      BufCnt        dw    0
      PrntBuf       db    SIZE_PRNT_BUF dup (0)
      sxtra         db    0, 0
      MsgArr        db    ((size DosMsg) * MAX_DOS_MSG) dup (?)
      sPsp          db    9 dup (0), 0
      sProgName     db    31 dup (' '), 0
      sTitle        db    80 dup (0)
      sExec         db    129 dup (0), 0
      sCmdLine      db    129 dup (0), 0
      sPrntStr      db    129 dup (0), 0
VmData ends

VmLdt is an LDT selector:offset pointer to this VM's structure. The LDT pointer is only valid in the context of the system VM (not the VM this structure is for).
VmHandle is the hVM, as defined by WIN386 for this VM. This value is needed by a number of the VxD functions.
PrnSem and MsgSem are semaphores created for the life of the VM. PrnSem is used for handling int 17h printing, and MsgSem is used to implement an internal SendMessage mechanism (the public interface only supports PostMessage). These semaphores exist for the life of the VM because they are frequently used.
TimeHdl is used when a time-out is set while intercepting int 17h printing. This value is non-zero only when a timer event has been set.
LinkNext is a flat 32-bit offset to the next VM's VMDATA structure. This value can be used in Win-IPC in any VM to walk to the next VM's structure.
LdtNext is a selector:offset LDT pointer valid in the system VM only. This value can be used in Win-Link to walk to the next VM's structure.
pPsp is a flat 32-bit offset to the PSP of the application presently running in that VM. As a flat 32-bit pointer it is only accessed by Win-IPC.
MsgGet, MsgPut, and MsgLast are flat 32-bit offsets into MsgArr. They are used to track the queue of messages posted to DOS VMs. MsgGet is the location of the next message to read. MsgPut is the location where the next message will be written (that is, an available location); when MsgGet equals MsgPut, the queue is empty. MsgLast points to the last entry in MsgArr.
PrntNum is the number of bytes presently in PrntBuf. When this value exceeds SIZE_PRINT_BLOCK, the data in PrntBuf is written to the spooler.
hDc is the printer DC for the data presently being redirected from int 17h to the Windows print spooler. This value is 0 if there is presently nothing to print (and therefore no DC open).
iPrnErr is the value returned when an app in a VM calls int 17h to get the LPT status.
iStr is the listbox index of this VM's print job. Each print job is listed in the Win-Link dialog box, and this value is used to delete the job when it has completed printing.
hWnd is the handle to the Window for this DOS box. Determining this is not an exact science, and the handle may be wrong. It is also initially 0 until a guess can be made as to its value.
wFlags is a bitbag of a number of flags. These flags set which of the interception capabilities (such as exec'ing Windows apps from DOS or redirecting print output to the spooler) are on.
BufCnt is used when data is sent to the print spooler. The first two bytes of the buffer passed to the spooler are the length of the data in the rest of the buffer. Therefore, we don't pass the address of PrntBuf. Instead, we set BufCnt to the value of PrntNum and pass the address of BufCnt.
PrntBuf holds the data intercepted from int 17h. If every byte intercepted by Win-IPC were posted to Win-Link, the overhead of the message posting would bring the system to its knees. Therefore, once 1K of data has been intercepted, Win-Link is notified to write the data to the spooler.
MsgArr holds the messages posted to DOS VMs. These messages are held until read by the app in a DOS VM. This is a static array; once it is out of space, no more messages may be posted until some are read.
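
The three queue pointers describe a fixed-size circular queue. The following C sketch models one under stated assumptions: the DosMsg layout and MAX_DOS_MSG value are invented, and the "full" test (one slot is always left unused, so that equal pointers can only mean empty) is an assumption about how the ambiguity is resolved, not something the text specifies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DOS_MSG 4                 /* assumed; small for illustration */

struct DosMsg { unsigned msg; long param; };   /* hypothetical layout */

struct MsgQueue {
    struct DosMsg  arr[MAX_DOS_MSG];
    struct DosMsg *get;    /* next message to read         */
    struct DosMsg *put;    /* next free slot               */
    struct DosMsg *last;   /* last slot in arr             */
};

void mq_init(struct MsgQueue *q)
{
    q->get = q->put = q->arr;          /* get == put means empty */
    q->last = &q->arr[MAX_DOS_MSG - 1];
}

/* Advance a pointer by one slot, wrapping after the last slot. */
static struct DosMsg *mq_next(struct MsgQueue *q, struct DosMsg *p)
{
    return p == q->last ? q->arr : p + 1;
}

/* Returns 1 if the message was queued, 0 if the queue is full. */
int mq_post(struct MsgQueue *q, struct DosMsg m)
{
    struct DosMsg *n = mq_next(q, q->put);
    if (n == q->get)
        return 0;                      /* would look empty: refuse */
    *q->put = m;
    q->put = n;
    return 1;
}

/* Returns 1 and fills *out if a message was waiting, else 0. */
int mq_read(struct MsgQueue *q, struct DosMsg *out)
{
    if (q->get == q->put)
        return 0;
    *out = *q->get;
    q->get = mq_next(q, q->get);
    return 1;
}
```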

The following elements are used to pass data for certain messages. This data is only considered valid between the time when the message is sent to when it is processed. The data is placed here instead of in the message because pointers cannot be passed in a message.

sPsp is a zero-terminated string of the program name in the selected PSP in this VM. This string is pulled from the MCB of the PSP and is here because Win-Link cannot access pPsp.
sProgName is the name of the VM set by the Register call. A DOS app registers itself to name a VM; another DOS or Windows app then finds the hVM of the registered DOS app by searching for the named VM.
sTitle is the title to set for this VM's window. This is the value pulled from sPsp or passed when a DOS app sets its title. Translations made by Win-Link (such as B to Brief) are handled by Win-Link when it receives the message telling it to use this value.
sExec is the file presently being exec'ed. This is used if a program is a Windows executable and a message is then passed to Win-Link to exec the program. Win-Link determines whether the program is on the list of programs not to exec from Windows. This is also used when DOS apps are launched by creating a new DOS box.
sCmdLine is the command line for sExec. The command lines are kept separate because at times Win-IPC and Win-Link need to know only the file name.
sPrntStr is the name of a file sent to PRINT to be printed.

Handling VM Creation

Before getting into how we implement any specific piece of Win-IPC/Win-Link, we need to discuss what we do on VM creation. Creation is the platform on which we can provide all our capabilities.

When the system VM is created, we call _Allocate_Device_CB_Area to reserve space for the VMDATA structure in every VM's control block, and we hook each interrupt we need to intercept (17h, 21h, 23h, 24h, and 2Fh).

BeginProc WinIpc_Sys_Critical_Init
; Allocate per-VM instance data
        VMMCall _Allocate_Device_CB_Area, <size VmData, 0>
        cmp     eax, 0
        je      short scil0                 ; No memory - do nothing
        mov     [CbvmData], eax
        and     [SysFlags], not MEM_OFF
; Set up the System VM data
        mov     eax, ebx
        call    GetVmData
        mov     [esi.VmHandle], ebx
        VMMcall Get_Sys_VM_Handle           ; Save System VM
        mov     [SysVM], ebx
scil0:  clc
        ret
EndProc WinIpc_Sys_Critical_Init

BeginProc WinIpc_Dev_Init
; Hook interrupts
        mov     eax, 17h                    ; Sit on int 17
        mov     esi, OFFSET32 WinIpc_Int_17
        VMMcall Hook_V86_Int_Chain
        mov     eax, 21h                    ; Sit on int 21
        mov     esi, OFFSET32 WinIpc_Int_21
        VMMcall Hook_V86_Int_Chain
        mov     eax, 23h                    ; Sit on int 23
        mov     esi, OFFSET32 WinIpc_Int_23
        VMMcall Hook_V86_Int_Chain
        mov     eax, 24h                    ; Sit on int 24
        mov     esi, OFFSET32 WinIpc_Int_24
        VMMcall Hook_V86_Int_Chain
        mov     eax, 2Fh                    ; Sit on int 2F
        mov     esi, OFFSET32 WinIpc_Int_2F
        VMMcall Hook_V86_Int_Chain
        clc
        ret
EndProc WinIpc_Dev_Init

For each additional VM created we do a little more. First, we need to initialize VMDATA by performing the following steps:

We zero out all the data (thereby handling all elements that need to be set to 0).
We set the VmHandle (it is EBX on entry) and MsgGet, MsgPut, and MsgLast. We can now accept messages posted to this VM.
We set pPsp to the PSP for the VM. This way we know that pPsp is valid in the rest of our code.
We create the MsgSem and PrnSem semaphores. This allows us to assume these exist in the rest of our code as well as avoid the processor overhead of constantly creating and freeing them.
We create an LDT selector:offset to point to the VMDATA structure that is good in the system VM.
We insert this VM's VMDATA structure into the linked list of all the VMs' VMDATA structures. We do this for both LinkNext and LdtNext.
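
The last step, inserting the new structure directly after the system VM's entry, can be sketched in C as follows. The struct is a cut-down, hypothetical stand-in for VMDATA with a single link field, and the walk stops either at a NULL link or when it comes back to the head, because the text does not say which form the list takes.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for VMDATA: a handle and one link field. */
struct VmData {
    unsigned long  vm_handle;
    struct VmData *link_next;
};

/* Insert vm immediately after the system VM's entry. */
void vm_list_insert(struct VmData *sys, struct VmData *vm)
{
    vm->link_next  = sys->link_next;   /* new entry inherits the old tail */
    sys->link_next = vm;               /* head now points at the new entry */
}

/* Count entries, starting at the system VM's entry. */
size_t vm_list_count(const struct VmData *sys)
{
    size_t n = 0;
    const struct VmData *p = sys;
    do {
        n++;
        p = p->link_next;
    } while (p != NULL && p != sys);
    return n;
}
```

Doing the insertion last means a structure only becomes visible to list walkers once every other field has been filled in.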

BeginProc WinIpc_VM_Create
        test    [SysFlags], MEM_OFF
        jnz     vmcl0                ; Turned off - do nothing
; Get & zero-fill VmData
        mov     eax, ebx
        call    GetVmData
        mov     edi, esi
        xor     eax, eax
        mov     ecx, (size VmData) / 4
        rep     stosd
; Init VmData
        mov     [esi.VmHandle], ebx
        lea     ecx, [esi].MsgArr
        mov     [esi].MsgGet, ecx
        mov     [esi].MsgPut, ecx
        mov     eax, MAX_DOS_MSG - 1
        mov     edx, size DosMsg
        mul     edx
        add     eax, ecx
        mov     [esi].MsgLast, eax
; Get the PSP (via SDA) location
        Push_Client_State
        VMMcall Begin_Nest_Exec
        mov     [ebp.Client_AX], 5D06h
        mov     eax, 21h
        VMMcall Exec_Int
        movzx   edx, [ebp.Client_DS]
        shl     edx, 4
        movzx   eax, [ebp.Client_SI]
        add     edx, eax
        add     edx, [ebx.CB_High_Linear]
        add     edx, 10h
        mov     [esi].pPsp, edx
        VMMcall End_Nest_Exec
        Pop_Client_State
; Set up Msg semaphore
        xor     ecx, ecx
        VMMcall Create_Semaphore
        jc      vmcl0
        mov     [esi].MsgSem, eax
; Set up Prn semaphore
        VMMcall Create_Semaphore
        jc      vmcl0
        mov     [esi].PrnSem, eax
; Create LDT so Win-Link can access structure
SizeVmData EQU (size VmData)
        VMMcall _BuildDescriptorDWORDs, <esi, SizeVmData, RW_Data_Type,\
                        D_GRAN_BYTE, 0>
        VMMcall _Allocate_LDT_Selector, <[SysVm], edx, eax, 1, 0>
        rol     eax, 16
        mov     [esi.VmLdt], eax
; Build linked-list
; Do this last so we are only in the list if 1) We are all
; filled in & 2) We were able to set up semaphores, etc.
        mov     edi, esi
        mov     eax, [SysVm]
        call    GetVmData
        mov     eax, [esi.LinkNext]
        mov     [edi.LinkNext], eax
        mov     [esi.LinkNext], edi
        mov     eax, [esi.LdtNext]
        mov     [edi.LdtNext], eax
        mov     eax, [edi.VmLdt]
        mov     [esi.LdtNext], eax
; ... see next listing
; We now send a msg to set the title. We do this here
; so we get the message before another VM is created; we
; just grab the first free VM in Windows.

        PostPm  [SysVm], [SysWnd], MSG_DOS_TITLE, 0, [edi.VmLdt]
vmcl0:  clc
        ret
EndProc WinIpc_VM_Create
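
Two address computations in this listing are easy to misread, so here they are restated as a C sketch. The function names are hypothetical. The first mirrors the shl/add sequence that turns the V86 DS:SI returned by int 21h function 5D06h into a flat address inside the VM (plus the 10h offset the listing adds); the second mirrors the rol eax, 16 that packs a selector into a selector:offset pointer with an offset of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Flat address of a V86 segment:offset pair inside a VM whose memory
   starts at high_linear (the CB_High_Linear value in the listing). */
uint32_t v86_to_flat(uint16_t seg, uint16_t off, uint32_t high_linear)
{
    return ((uint32_t)seg << 4) + off + high_linear;
}

/* Address the listing stores in pPsp: the SDA address plus 10h. */
uint32_t ppsp_from_sda(uint16_t sda_seg, uint16_t sda_off,
                       uint32_t high_linear)
{
    return v86_to_flat(sda_seg, sda_off, high_linear) + 0x10;
}

/* selector:offset pointer with the selector in the high word. */
uint32_t make_far_ptr(uint16_t selector, uint16_t offset)
{
    return ((uint32_t)selector << 16) | offset;
}
```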

At this point we still have two remaining tasks before we are fully ready for the new VM. The easy one is setting the title of the DOS box. The difficult one is determining the handle of the window for this VM, and we can't set the title until we know the hWnd. Be warned that the method covered here is not completely foolproof. It seems to work about 98 percent of the time. It runs into trouble largely when a bunch of DOS boxes are launched in a row, so that we have several hVM <-> hWnd resolutions pending.

Implementation

We start by posting a message to Win-Link telling it to set the title for this VM: Win-IPC posts MSG_DOS_TITLE to Win-Link. However, if hWnd is NULL, Win-Link (in the function DosTitle) performs some special processing. This processing exists only for this first call to DosTitle:

; We now send a msg to set the title.
	PostPm [SysVm], [Syswnd], MSG_DOS_TITLE, 0, [edi.VmLdt]
vmcl0:  clc
	ret
EndProc WinIpc_VM_Create

If we are running under Windows 3.1, we set a hook and post a message back to Win-IPC. We cover what this does in a moment because it has no effect until we complete the rest of the processing in DosTitle.

We next walk through all windows whose class is tty (the class of all DOS box windows). We also check that this window is a DOS box, although this may be merely paranoia on my part. Once we find a tty window, we check whether it is already registered to another of our VMs. If so, we keep looking. If not, we assume that it belongs to this VM. If you are following along in the code, you'll notice that we also passed in a NULL text string, so DosTitle goes on to set the title on a potentially wrong hWnd. However, because the string is NULL, the text will not be set. DosTitle actually is two separate functions wrapped in one for historical reasons. I originally attempted to get the hWnd by other means.

// We walk the list of top windows looking for one of class tty
  hWnd = FindWindow ("tty", NULL);
  while (hWnd) {
// See if its a DOS box
    GetClassName (hWnd, sBuf, 5);
    if (lstrcmp(sBuf,"tty")) goto NextWin;
    if (!IsWinOldApTask(GetWindowTask(hWnd))) goto NextWin;
// See if we already have this one
    fpVmOn = fpVmData;
    do {
      if (fpVmOn->hWnd == hWnd) goto NextWin;
      if (!(fpVmOn = fpVmOn->LdtNext)) break;
    } while (fpVmOn != fpVmData);
// We have it! Stop looking so the failure case below is skipped
    fpVmData->hWnd = hWnd;
    break;
// Get the next window
NextWin:
    hWnd = GetWindow (hWnd, GW_HWNDNEXT);
  }
// We failed (ran out of windows without finding one)
  if (!hWnd) fpVmData->hWnd = (HWND) -1;

Now we have an hVM <-> hWnd pairing. But this was merely a guess. This is where the hook comes in. We have hooked all messages being sent to any window; a very expensive hook but quite necessary. We then posted a message to Win-IPC. The message causes _MsgShellEvent in Win-IPC to be called. In _MsgShellEvent we make a VxD call to SHELL_Event. SHELL_Event allows us to send a Windows message to a DOS box window by specifying its hVM, which we do know. So we post a message with a constant in uMsg to identify the message and the selector to VmData in wParam (we make use of the fact that all our LDT pointers have an offset of 0). In our hook filter proc we look for any message with this message number. When we see it, we set that hWnd as the hWnd for our VM. Finally, we post a message to ourselves. When we receive this message we remove the hook. Once the hook is removed, we no longer impose any overhead on the system. We have the correct hWnd unless someone else sent the same message number between the time we installed the hook and the time SHELL_Event got the message back to us. We now have our hWnd and are initialized for the VM just created.

// ... in DosTitle
  if (uVer >= 0x030A) {
    if (iHookCnt++ == 0)
      hhookMsgFilterHook = SetWindowsHook (WH_GETMESSAGE,
       (HOOKPROC) lpfnMsgFilterProc);
    PostMessage (hDlg, MSG_EVENT_ON, 0, fpVmData->VmHandle);
  }

// ... In main DlgProc
  case MSG_EVENT_ON:
    dShellEvent (lParam);
    break;
  case MSG_EVENT_OFF:
    if (--iHookCnt == 0)
      UnhookWindowsHook (WH_GETMESSAGE,
       (HOOKPROC) lpfnMsgFilterProc);
    break;

// HOOK Call-backs
LRESULT CALLBACK __export __loadds MsgFilterFunc (int nCode, WORD
 wParam, DWORD lParam)
{
  if (((MSG __far *) lParam)->message == 0x6969)
    HandleEvent (lParam);
  return 0;
}

void __loadds HandleEvent (long lParam)
{
  VMDATA __far *pVmData;
  pVmData = PTR (((MSG __far *) lParam)->wParam, 0);
  if (!SelOk ((void __far *) pVmData, sizeof (VMDATA)))
    return;
  pVmData->hWnd = ((MSG __far *) lParam)->hwnd;
  PostMessage (hMainDlg, MSG_EVENT_OFF, 0, 0);
}

; WIN_IPC.386 dShellEvent
_MsgShellEvent proc
        push    ebx
        mov     eax, [ebp.Client_ECX]
        mov     ebx, eax
        call    GetVmData
        mov     ecx, 6969h
        movzx   eax, word ptr [esi.VmLdt + 2]
        xor     esi, esi
        xor     edx, edx
        VxDcall SHELL_Event
        pop     ebx
        ret
_MsgShellEvent endp

Registering DOS Apps

We now need to determine which VM is running our DOS app. To do this, Win-IPC provides a Register call in which a DOS app passes a name that is stored in its VM's VmData structure. Another app can then make a Query call, and Win-IPC will walk the VmData structs to find the one with the matching name.

The implementation of this is simple enough that no code is shown; it can be found in the source code on the book's disk. However, it is a critical piece: you can't talk to a DOS app until you know its hVM, and the Register/Query calls provide a means to determine the hVM.

Internal Message Passing

Message posting is the most difficult part of the system. This section discusses how Win-Link and Win-IPC post and send messages to each other. The next section will discuss how applications can post messages, and that functionality makes use of the basic message passing. However, this section only discusses the internal messaging used by Win-Link and Win-IPC.

Win-Link to Win-IPC

When messages pass from Win-Link to Win-IPC, a Windows application is calling a VxD. This is always safe; if it weren't, Windows would not be receiving any time slices. All messages from Win-Link to Win-IPC are sent as opposed to posted. This is because it is much easier to send than to post and there is no need for posted messages. All parameters are passed in registers. Win-Link then calls the far-call address it received when it initially called int 2Fh with AX=1684h. This calls the entry point in Win-IPC with these registers set.

; EAX: uMsg = Message to post to Win-IPC
; ECX: lParam1 = first long param
; EDX: lParam2 = second long param

CallVxd MACRO uMsg, lParam1, lParam2
        mov     ecx, lParam1
        mov     edx, lParam2
        mov     eax, uMsg
        xor     ebx, ebx
        call    dword ptr [WinIpcAddr]
ENDM

This gets a message to WinIpc_PM_API_Proc in Win-IPC. A jump table is used to go to the handler for the specific message passed in. Because this is also the entry point other Windows applications use to call Win-IPC, the procedure first checks that the passed-in message number is one a Windows application is allowed to use. It does this by using the message number as an offset into PmOkTable, which is a table of bytes. If a byte is 0, the message is not legal; if it is -1, it is legitimate. At the same time the procedure also makes sure that the message number is within the range of handled messages.

BeginProc WinIpc_PM_API_Proc
        movzx   eax, [ebp.Client_AX]
        cmp     eax, [NumPmOk - 1]
        ja      short pap20             ; out of range - error
        and     eax, 0FFh
        mov     al, [PmOkTable + eax]
        cmp     al, 0
        je      short pap20             ; not legal from a Windows app
pap10:  call    DefMsgProc
        ret
pap20:  mov     [ebp.Client_AX], ERR_UNKNOWN_MSG
        ret                     ; exit error
EndProc WinIpc_PM_API_Proc

DefMsgProc is even simpler. It first looks to see if Win-IPC is on. If the flag MEM_OFF in SysFlags is set, then Win-IPC is turned off. In this case, DefMsgProc does nothing and refuses to handle any messages. Otherwise, DefMsgProc jumps to the appropriate handler from MsgDispTable. This is a quick way to get to the correct handler. We jump instead of call because that saves us a ret when we are done.

DefMsgProc  proc
        test    [SysFlags], MEM_OFF     ; Are we running?
        jz      short dmp20
        mov     [ebp.Client_AX], ERR_NO_VM_MEMORY
        ret

dmp20:  movzx   eax, [ebp.Client_AX]    ; Get the message
        jmp     [MsgDispTable + 4 * eax]
DefMsgProc  endp

Whichever function is called then executes and returns. When it returns, the return goes back to Win-Link, with the return value passed in AX.
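
The combination of a permission table and a dispatch table can be sketched in C. Everything here is illustrative: the handlers are stand-ins, the error values are assumed, and the function name pm_api is hypothetical. The point is the order of the checks: range first, then permission, and only then the indexed call.

```c
#include <assert.h>
#include <stddef.h>

enum { ERR_NONE = 0, ERR_UNKNOWN_MSG = 1 };   /* assumed values */

typedef long (*MsgHandler)(long p1, long p2);

/* Stand-in handlers; the real ones do the actual work. */
static long msg_add(long a, long b)   { return a + b; }
static long msg_first(long a, long b) { (void)b; return a; }

/* One entry per message number in both tables. */
static const MsgHandler  msg_disp_table[] = { msg_add, msg_first, msg_add };
static const signed char pm_ok_table[]    = { -1,      0,         -1      };
#define NUM_PM_OK (sizeof pm_ok_table / sizeof pm_ok_table[0])

/* Returns a status code; the handler's result goes to *result. */
int pm_api(unsigned msg, long p1, long p2, long *result)
{
    assert(result != NULL);
    if (msg >= NUM_PM_OK)          /* outside the handled range */
        return ERR_UNKNOWN_MSG;
    if (pm_ok_table[msg] == 0)     /* not allowed from a Windows app */
        return ERR_UNKNOWN_MSG;
    *result = msg_disp_table[msg](p1, p2);
    return ERR_NONE;
}
```

Keeping the status separate from the handler's result mirrors the AX (status) and DX (value) convention used by the VxD entry point.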

Win-IPC to Win-Link

We want to post, rather than send, messages to Win-Link whenever possible, because we can be in Win-IPC while Windows is in a non-reentrant state. As a matter of fact, almost any time we are in Win-IPC, Windows, and therefore Win-Link, is in a non-reentrant state. This means we cannot make a call to Win-Link from Win-IPC. There is one exception to this rule. PostMessage in Windows was specifically designed to be fully re-entrant. So the one connection we have from Win-IPC to Win-Link is the ability to call PostMessage.

There is still one minor concern. We do not want to call PostMessage if the Windows VM is in the critical section or has interrupts off. This is not an absolute requirement, but it is part of being a good neighbor. Taking the time to post a message while a Windows app (or DLL, more likely) is in a critical section can delay that application enough to cause it major harm -- and bring the system down. We also have to wait until the Windows VM can be scheduled. An immediate call would go into the current VM, which quite possibly is not the Windows VM. Therefore, when LinkMsgProc returns, the message may not yet have been posted. So we have to get a temporary structure to hold our message until we can post it to Windows. Otherwise, the message could be overwritten as soon as LinkMsgProc returned.

PostMessage

The function LinkMsgProc is used for both posting and sending messages. The following code is an abbreviated version showing just those parts relevant to PostMessage. The parameter checking is not displayed here, either. For a full discussion of the code, see the discussion of SendMessage that follows.

LinkMsgProc proc
; Get a VmMsg struct
dmp70:  movzx   ecx, word ptr [VmMsgAlloc] ; loop counts in ECX
	mov     edi, [VmMsgOff]
	mov     eax, [ebp.Client_EBX]
dmp80:  xchg    [edi.Handle], eax
	cmp     eax, 0
	je      short dmp90
	xchg    [edi.Handle], eax
	add     edi, size VmMsg
	loop    dmp80
	mov     [ebp.Client_AX], ERR_MSG_FULL
	ret

; edi points to a VMMSG struct
dmp90:  mov     eax, [ebp.Client_EAX]   ; save message
	mov     [edi.lParam1], eax
	mov     eax, [ebp.Client_EDX]
	mov     [edi.lParam2], eax
	mov     eax, [ebp.Client_ECX]
	mov     [edi.lWndMsg], eax
	mov     [edi.VmOff], esi

; lets generate the call-back
	mov     eax, Low_Pri_Device_Boost
	push    ebx
	mov     ebx, [esi.VmHandle]
	mov     ecx, PEF_Wait_For_STI or PEF_Wait_Not_Crit
	mov     edx, edi
	mov     esi, OFFSET32 HandleCallBack
	VMMcall Call_Priority_VM_Event
	pop     ebx

	mov     edx, [edi.Rtn]		; rtn regs & Client_regs
	mov     [ebp.Client_EDX], edx
	mov     eax, ERR_NONE
	mov     [ebp.Client_EAX], eax
	ret
LinkMsgProc  endp

This code has not necessarily posted a message. It has merely saved it in the structure and set up a call to HandleCallBack. If the Windows VM had interrupts on and was not in a critical section, HandleCallBack was called before Call_Priority_VM_Event returned. Either way, HandleCallBack has been, or shortly will be, executed.
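
The slot search at the top of LinkMsgProc claims a free VmMsg entry by exchanging the caller's handle into the Handle field and checking whether the old value was 0. The C sketch below shows the same idea with C11 atomics. Note that it swaps in a compare-and-exchange in place of the listing's xchg and xchg-back pair; the struct layout, pool size, and function names are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_SLOTS 4                    /* assumed pool size */

struct VmMsg {
    atomic_ulong handle;               /* 0 means the slot is free */
    long         lparam1, lparam2;
};

static struct VmMsg slots[NUM_SLOTS];  /* zero-initialized: all free */

/* Claim the first free slot for owner (must be non-zero).
   Returns NULL when every slot is in use. */
struct VmMsg *vmmsg_alloc(unsigned long owner)
{
    size_t i;
    assert(owner != 0);
    for (i = 0; i < NUM_SLOTS; i++) {
        unsigned long expected = 0;
        /* Succeeds only if the slot still holds 0, so two callers
           can never claim the same slot. */
        if (atomic_compare_exchange_strong(&slots[i].handle,
                                           &expected, owner))
            return &slots[i];
    }
    return NULL;
}

/* Mark a slot available again. */
void vmmsg_free(struct VmMsg *m)
{
    atomic_store(&m->handle, 0);
}
```

The slot has to stay claimed until the deferred callback has run, because the callback reads the saved parameters out of it.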

HandleCallBack first pushes the client state so it can modify the VM's registers. It then moves the message values to the client registers on the stack. These are the values the registers will have when Resume_Exec is called. HandleCallBack then sets up a nested execution call to _dMsgProc in Win-Link. This code makes a call to PostMessage to get the message posted. On return from Resume_Exec, the message is posted, assuming that there was room in the queue for it. Finally, the VMMSG struct is marked as free and the client registers are taken off the stack. When HandleCallBack returns, it has returned the VM to its original state.

HandleCallBack proc
        Push_Client_State
	mov     edi, edx                ; Get pointer
	mov     eax, [edi.lParam1]      ; Set up registers
	mov     [ebp.Client_EAX], eax
	mov     eax, [edi.lParam2]
	mov     [ebp.Client_EDX], eax
	mov     eax, [edi.lWndMsg]
	mov     [ebp.Client_ECX], eax
	mov     [ebp.Client_EBX], edi

	mov     edx, [SysCallBack]
	mov     cx, dx                  ; Call the sucker
	shr     edx, 16
	VMMcall Begin_Nest_Exec
	VMMcall Simulate_Far_Call
	VMMcall Resume_Exec
	VMMcall End_Nest_Exec
	mov     eax, [ebp.Client_EAX]	; save rtn value
	mov     [edi.Rtn], eax
	mov     [edi.Handle], 0		; Mark VmMsg avail
	Pop_Client_State
	ret
HandleCallBack endp

Win-Link

On the Win-Link side, the message has to be posted via the Windows PostMessage API. This is not as trivial as merely passing our parameters to PostMessage. Unfortunately, in a number of sent messages we need to pass two DWORDs as well as a WORD. Since the standard Windows message does not have this capacity, we have to build it in. Because we use the same code to post and send, we must build it into post as well. To do so, Win-Link maintains another array of message structs that hold the incoming message. The actual message posted to Win-Link is a pointer to this structure.

_dMsgProc proc far
	push    si
	push    ds
	push    bp
	push	0
	mov	bp, sp

	push    ax
	push    cx
	mov     ax, _DATA
	mov     ds, ax
	mov     cx, NUM_MSG
	mov     si, offset _DATA:MsgData

mp10:   mov     ax, 0FFFFh
	xchg    ds:[si.InUse], ax
	cmp     ax, 0
	je      mp20
	add     si, size VXDMSG
	loop    mp10
	pop     cx
	pop     ax
	jmp     mp30
mp20:
	pop     cx
	pop     ax
	mov     dword ptr ds:[si.mWnd], ecx
	mov     dword ptr ds:[si.mwParam], eax
	mov     ds:[si.mlParam], edx
	mov     ds:[si.mEDI], ebx

	push    ds:[MainWnd]
	push    MSG_WIN_IPC
	push    0
	push    ds
	push    si

	call    PostMessage

mp30:   add     sp, 2
	pop     bp
	pop     ds
	pop     si
	ret
_dMsgProc endp

This pushes the message into the Windows message queue. We have to look at what happens when it pops out the other end.

For this we look at the function MainDlgProc in win_link.c. Again, we abbreviate it to show just the PostMessage code. We find that we post a plain old Windows message, so we go back into the message queue.

  case MSG_WIN_IPC:
    pVxdMsg = (VXDMSG _far *) lParam;

    //Lots of SendMessage code...

    PostMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, pVxdMsg->wParam,
     pVxdMsg->lParam);
    pVxdMsg->InUse = 0;
    break;

This is not necessarily the best way to handle a post; but it works.

SendMessage to Win-Link

To get SendMessage to Win-Link working, we merely add two additional pieces to the puzzle. First, in LinkMsgProc we block on a semaphore after posting the message. This semaphore is then unblocked by a call Win-Link makes after the message has been processed. Because of this semaphore, it is critical that we do not send a message from the Windows VM. If we do, we will block the Windows VM, and if the Windows VM is blocked it will never execute the code to unblock the semaphore.

The second addition to the code involves returning a value. The main reason to call SendMessage instead of PostMessage is that you need to know the return value from SendMessage. So we start with LinkMsgProc again. We create a semaphore, block on it after scheduling the event that calls HandleCallBack, and destroy the semaphore once we have been unblocked. We create and destroy the semaphore on a per-message basis for two reasons. First, there can be multiple SendMessages, so we can't use a single semaphore. Second, a SendMessage is a pretty rare event, so the overhead is not a killer.

The handle to the semaphore is included in the message structure. The handle is needed by Win-Link to make a call back to Win-IPC, telling it to unblock that semaphore. We first check to see whether IPC is turned on or off. If it is turned off we do not accept any messages. Then we check to see whether we are sending a message from a Windows app to a Windows app. There is no reason for that to go through us, so we don't allow it. Next we get the VmData struct for the receiving VM. GetVmData returns a pointer to VmData in ESI. This also assures us that we are sending a message to a VM that exists.

We now check to make sure we have an address to call in the Windows VM to get to PostMessage. The flag IPC_OFF should be set if this is NULL, but I like to be paranoid in cases like this. We then go into the code we saw before to get a VMMSG struct. This struct holds our passed-in message parameters, the semaphore we use to block, and the return value from the SendMessage call. This data is allocated to this message until the semaphore is unblocked at the end of LinkMsgProc.

LinkMsgProc proc
; We have a message to post/send.
; We can't send a msg from Windows to Windows!!

dmp40:  test    [SysFlags], IPC_OFF	; Are we running?
	jz      short dmp50
	mov     [ebp.Client_AX], ERR_NO_WIN_APP
	ret
dmp50:
	cmp     ebx, [SysVm]		; Win Msg to WinMsg?
	jne     short dmp60
	cmp     ebx, [ebp.Client_EBX]
	jne     short dmp60
	mov     [ebp.Client_AX], ERR_WIN_TO_WIN
	ret
dmp60:
	mov     eax, [ebp.Client_EBX]	; Get destination VM
	call    GetVm
	jc      short dmp65
	call    GetVmData

	cmp     [SysCallBack], 0
	jne     short dmp70
dmp65:  mov     [ebp.Client_AX], ERR_UNKNOWN_VM
	ret

; Get a VmMsg struct
dmp70:	movzx   ecx, word ptr [VmMsgAlloc] ; loop counts in ECX
	mov     edi, [VmMsgOff]
	mov     eax, [ebp.Client_EBX]
dmp80:  xchg    [edi.Handle], eax
	cmp     eax, 0
	je      short dmp90
	xchg    [edi.Handle], eax
	add     edi, size VmMsg
	loop    dmp80
	mov     [ebp.Client_AX], ERR_MSG_FULL
	ret

Here is where the code starts to differ because we are sending a message. First we create a semaphore, and its handle is stored in our VMMSG structure. Following that, we set up the rest of the structure and then set up an event to call HandleCallBack, just as we did in PostMessage.

dmp90:  test    [ebp.Client_EAX], FLAG_SEND_MSG ; send?
	jz      short dmp110
	xor     ecx, ecx		; Set up a semaphore
	VMMcall Create_Semaphore
	jnc     short dmp100
	mov     [ebp.Client_AX], ERR_NO_SEMAPHORE
	ret

dmp100: mov     [edi.SendSem], eax
dmp110: mov     eax, [ebp.Client_EAX]	; save message
	mov     [edi.lParam1], eax
	mov     eax, [ebp.Client_EDX]
	mov     [edi.lParam2], eax
	mov     eax, [ebp.Client_ECX]
	mov     [edi.lWndMsg], eax
	mov     [edi.VmOff], esi

; lets generate the call-back
	mov     eax, Low_Pri_Device_Boost
	push    ebx
	mov     ebx, [esi.VmHandle]
	mov     ecx, PEF_Wait_For_STI or PEF_Wait_Not_Crit
	mov     edx, edi
	mov     esi, OFFSET32 HandleCallBack
	VMMcall Call_Priority_VM_Event
	pop     ebx
	mov     edx, [edi.Rtn]		; rtn regs & Client_regs

The rest of the function is send-specific. The semaphore is blocked to stop LinkMsgProc from returning until after the semaphore is unblocked. In the meantime, before or after the semaphore is blocked, HandleCallBack calls Win-Link, which processes the message. When the message has been processed, Win-Link makes a call to Win-IPC, passing the semaphore and return value. This call in Win-IPC sets the return value in the VMMSG struct and clears the semaphore.

The end result of this is that when Wait_Semaphore returns, the return value of the SendMessage is in EDI.Rtn. All that is left to do is to destroy the semaphore, free up the VMMSG struct, and return the result from SendMessage.

Note that the value is returned in DX. AX is always the status returned from the call so that you can differentiate between a 1 returned from SendMessage and an error code of 1.

	test    [ebp.Client_EAX], FLAG_SEND_MSG ; send?
	jz      short dmp130

dmp120: mov     eax, [edi.SendSem]
	mov     ecx, Block_Svc_Ints or Block_Enable_Ints
	VMMcall Wait_Semaphore		; block until sent

	mov     eax, [edi.SendSem]
	VMMcall Destroy_Semaphore	; destroy it
	mov     edx, [edi.Rtn]		; rtn regs & Client_regs
	mov     [edi.Handle], 0		; Mark VmMsg avail
dmp130: mov     [ebp.Client_EDX], edx
	mov     eax, ERR_NONE
	mov     [ebp.Client_EAX], eax
	ret
LinkMsgProc endp
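
The text notes that HandleCallBack may run, and Win-Link may signal, either before or after LinkMsgProc reaches Wait_Semaphore. A semaphore created with a token count of 0 copes with both orders. The toy model below, in C, only does the bookkeeping (nothing really blocks) and assumes ordinary counting-semaphore behaviour, in which a signal with no waiter is remembered as a token; the type and function names are hypothetical.

```c
#include <assert.h>

/* Toy counting semaphore: tracks tokens and whether a waiter is parked. */
struct ToySem {
    int tokens;
    int waiter_blocked;
};

void toy_create(struct ToySem *s, int tokens)
{
    s->tokens = tokens;
    s->waiter_blocked = 0;
}

/* Returns 1 if the caller would have to block, 0 if it proceeds at once. */
int toy_wait(struct ToySem *s)
{
    if (s->tokens > 0) {
        s->tokens--;            /* a signal already arrived: use it up */
        return 0;
    }
    s->waiter_blocked = 1;      /* park until a signal arrives */
    return 1;
}

/* Wakes a parked waiter if there is one, otherwise banks a token. */
void toy_signal(struct ToySem *s)
{
    if (s->waiter_blocked)
        s->waiter_blocked = 0;
    else
        s->tokens++;
}
```

In the first order (wait, then signal) the sender parks and is released by the signal; in the second (signal, then wait) the wait returns immediately. Either way the return value has been stored before the sender continues.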

So what happens differently in HandleCallBack? Nothing! There is a different code path for a SendMessage to a VM other than the system VM, but a SendMessage to the system VM is identical to a PostMessage. The same goes for _dMsgProc in Win-Link. Which brings us to MainDlgProc. I have shown the full code for handling a message from Win-IPC, but the part executed when we send a message from Win-IPC to Win-Link itself is the part that fills in the SendDlg struct and passes that. So all the messages we send to Win-Link are sent from the MSG_WIN_IPC case back to MainDlgProc, with all the variables passed in a struct that lParam points to. The return value to be passed back is set in that struct. At this point the message has been processed, but we still need to go back to Win-IPC, pass the return value, and clear the semaphore. So when the internal SendMessage call returns, we call dPostMsg, passing the return value and a pointer to the VMMSG struct that is holding the sent message on the Win-IPC side. This call sets the return value in VMMSG and clears the semaphore. Finally, the VXDMSG struct is freed.

  case MSG_WIN_IPC:
    pVxdMsg = (VXDMSG _far *) lParam;

    if (pVxdMsg->wFlags & 0x0001) {

      if (pVxdMsg->hWnd != hDlg) {
	lRtn = SendMessage (pVxdMsg->hWnd, pVxdMsg->uMsg,
	 pVxdMsg->wParam, pVxdMsg->lParam);
      }else{
	SendDlg.lParam = pVxdMsg->lParam;
	SendDlg.wParam = pVxdMsg->wParam;
	SendDlg.lRtn = 0;
	SendMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, 0,
	 (long) (LPVOID) &SendDlg);
	lRtn = SendDlg.lRtn;
      }
      dPostMsg (_MSG_SEND_RTN, lRtn, pVxdMsg->lEDI);
    }else{
      PostMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, pVxdMsg->wParam,
       pVxdMsg->lParam);
    }
    pVxdMsg->InUse = 0;
    break;

The message MSG_SEND_RTN works its way through the dispatching code and ends up at _MsgSendRtn. _MsgSendRtn checks to make sure the passed-in pointer is good, then places the return value in VMMSG and clears (signals) the semaphore. This causes the Wait_Semaphore in LinkMsgProc to return, which in turn lets the original SendMessage call return.

_MsgSendRtn proc
; Check edi (points to VmMsg, good handle)
	mov     edi, [ebp.Client_EDX]
	mov     ecx, [VmMsgOff]
	cmp     edi, ecx
	jb      short msr10

	mov     eax, size VmMsg
	mul     [VmMsgAlloc]
	add     eax, ecx
	cmp     edi, eax
	jae     short msr10
	cmp     ebx, [edi.Handle]
	jne     short msr10

; Its ok - save the rtn value & turn semaphore off
	mov     eax, [ebp.Client_ECX]
	mov     [edi.Rtn], eax
	mov     eax, [edi.SendSem]
	VMMcall Signal_Semaphore

msr10:  ret
_MsgSendRtn endp
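
The pointer check at the top of _MsgSendRtn can be restated in C. The following is only a minimal sketch of the idea, not code from Win-IPC: the VmMsgSlot type and the slot_is_valid name are hypothetical, and the slot-boundary test is an extra check that the assembly listing does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VmMsg record; only the fields the check needs. */
typedef struct {
    uint32_t handle;   /* VM handle of the sender, 0 when the slot is free */
    uint32_t rtn;      /* return value filled in by the receiver */
} VmMsgSlot;

/* Returns 1 if p points at a slot inside table[0..count) owned by vm. */
static int slot_is_valid(const VmMsgSlot *table, size_t count,
                         const VmMsgSlot *p, uint32_t vm)
{
    uintptr_t base, end, addr;

    assert(table != NULL);
    base = (uintptr_t)table;
    end  = base + count * sizeof(VmMsgSlot);
    addr = (uintptr_t)p;

    if (addr < base || addr >= end)
        return 0;                       /* outside the array */
    if ((addr - base) % sizeof(VmMsgSlot) != 0)
        return 0;                       /* extra check: not on a slot boundary */
    return p->handle == vm;             /* slot must belong to the calling VM */
}
```

Because the handle field is zeroed when a slot is released, a stale pointer from an earlier send fails the last comparison even though it still lies inside the array.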

We have thus sent a message from Win-IPC to Win-Link. Definitely not a trivial undertaking, but not terribly complicated or convoluted.

Other Design Considerations

PostMessage is coded to be totally re-entrant, but it does have one blind spot: PostMessage itself is not re-entrant. In other words, you can call PostMessage when any other code in Windows is being executed, but you cannot call PostMessage when PostMessage is executing.

The only time this comes up is when you post a message in an interrupt handler in your VxD and while the message is being posted, another interrupt comes in so that you post again. Using PostMessage under these conditions causes the first message to disappear. This is not a good idea anyway -- you would probably max out the message queue under such a design.

You need to make sure that any memory touched by Win-Link while in _dMsgProc is locked down in physical memory. Again, because we can call this at any time, the code and data used cannot be swapped out to disk. If it were, you would use whatever happened to be there instead or fault, depending on the state of the system at the time. That is why Win-Link locks down its code and data when it starts. It is not necessary to lock the entire program down (I did it because Win-Link is small model), but it is critical that every byte of code and data that you touch at this time is locked down.

Message Passing Between VMs

Message passing between applications takes three forms: (1) DOS app to Windows app, (2) Windows app to DOS app, and (3) DOS app to DOS app (between different VMs). After trying several approaches to this kind of message passing, I settled on allowing only posting, not sending. This eliminates all the re-entrancy problems that send messages cause. In addition, Win-IPC does not call a DOS box with a message. A DOS box has to poll. This is less efficient but is a lot safer. And with shared memory you can add code to set a flag before posting a message.

DOS to DOS, Windows to DOS

Posting to a DOS app involves three functions, MsgPost, MsgPeek, and MsgRead in WinIpc.asm. To post a message to a DOS app, the calling app calls MsgPost. MsgPost then places the passed parameters in a DOSMSG struct that is held in an array in the VmData for the receiving VM. This is an array of a set size (just like the message queue of a Windows app), so the first test is to make sure that space exists in the array. If it does not, the post fails. If there is room, the message is stored in the structure and the write pointer MsgPut is advanced to the next slot, wrapping back to the start of the array once it passes MsgLast. We have now posted the message to the queue.

If the DOS app receiving the message has already called MsgRead, it is blocked on a semaphore, and signaling the semaphore frees it. If MsgRead has not been called yet, nothing more happens until the app calls it to read the message; because we have already signaled the semaphore, the Wait_Semaphore call in MsgRead returns instantly.

Finally, we boost the execution priority of the receiving VM. The theory behind this is that this VM has been waiting for the message. We now want to give it a boost so it can get started processing the message. Depending on your application, you may prefer not to include this step. It gives you a faster response but makes Windows freeze for a moment. In the following code fragment I have removed the part that handles messages posted to a Windows app. This is the code that handles posting to a DOS app.

MsgPost proc
mp10:   call    GetVmData                  ; ESI = VmData of dest VM
; Do we have room in the message array???
; NO if Write == Read-1 OR (Read == MsgArr
; AND Write == last element)
	mov     eax, size DosMsg
	mov     edi, [esi].MsgGet
	sub     edi, eax
	cmp     edi, [esi].MsgPut	; Write == Read-1?
	je      short mp90		; YES

	lea     edx, [esi].MsgArr
	cmp     [esi].MsgGet, edx	; Read == MsgArr?
	jne     short mp20		; NO
	mov     eax, [esi].MsgLast
	cmp     eax, [esi].MsgPut	; AND Write == last
	je      short mp90

; OK we can store it
mp20:   mov     edi, [esi].MsgPut
	mov     ax, [ebp.Client_CX]
	mov     [edi].dWnd, ax
	mov     ax, word ptr [ebp.Client_ECX+2]	; message is in the high word of ECX
	mov     [edi].dMsg, ax
	mov     ax, [ebp.Client_DI]
	mov     [edi].dwParam, ax
	mov     eax, [ebp.Client_EDX]
	mov     [edi].dlParam, eax

; inc free, roll it if past end
	add     [esi].MsgPut, size DosMsg
	mov     eax, [esi].MsgLast
	cmp     [esi].MsgPut, eax
	jbe     short mp30
	lea     eax, [esi].MsgArr
	mov     [esi].MsgPut, eax

; Signal read we have a message
mp30:   mov     eax, [esi.MsgSem]
	VMMcall Signal_Semaphore

; Boost the execution priority of the guy we call
; so it gets the message ASAP.
	mov	eax,Low_Pri_Device_Boost
	VMMCall Adjust_Exec_Priority

mp40:   mov     [ebp.Client_EAX], ERR_NONE
	ret

mp90:   mov     [ebp.Client_EAX], ERR_MSG_FULL
	ret
MsgPost endp

We now have a message in the queue for a DOS VM. There are two calls to handle getting the message to the DOS app. The first call is MsgPeek. When a DOS app calls MsgPeek, it gets a copy of the next message in the queue. If there is no message, Release_Time_Slice is called and a no-message error is returned. This call assumes MsgPeek is only called in an idle loop. If you make this call to check for an abort message, you might want to remove the Release_Time_Slice.

MsgPeek proc
	mov     eax, ebx
	call    GetVmData                  ; ESI = VmData of VM

; do we have one?
	mov     edi, [esi].MsgGet
	cmp     edi, [esi].MsgPut
	je      short mpk90

; Lets fill it in
	mov     eax, [ebp].Client_EDX
	call    V86ToPmPtr
	mov     esi, edi
	mov     edi, eax
	mov     ecx, (size DosMsg) / 2
	rep     movsw

	mov     [ebp.Client_EAX], ERR_NONE
	ret

mpk90:  VMMcall Release_Time_Slice
	mov     [ebp.Client_EAX], ERR_NO_MSG
	ret
MsgPeek endp

The second call is MsgRead. Although MsgPeek will return the contents of the next message, MsgRead actually removes a message from the queue. The first step is to wait on the semaphore, which blocks until MsgPost is called, putting a message in the queue and signaling the semaphore. Next, the message is filled in and the pointer MsgGet is incremented to the next location in the queue. The message is then returned.

MsgRead proc
	mov     eax, ebx
	call    GetVmData                  ; ESI = VmData of VM

;Lets block if there are no messages
	mov     eax, [esi.MsgSem]
	mov     ecx, Block_Svc_Ints or Block_Enable_Ints
	VMMcall Wait_Semaphore

;Lets fill it in
mr10:   Save    <esi>
	mov     eax, [ebp].Client_EDX
	call    V86ToPmPtr
	mov     esi, [esi].MsgGet
	mov     edi, eax
	mov     ecx, (size DosMsg) / 2
	rep     movsw
	Restore <esi>

; inc next, roll it if past end
	add     [esi].MsgGet, size DosMsg
	mov     eax, [esi].MsgLast
	cmp     [esi].MsgGet, eax
	jbe     short mr20
	lea     eax, [esi].MsgArr
	mov     [esi].MsgGet, eax

mr20:   mov     [ebp.Client_EAX], ERR_NONE
	ret
MsgRead endp
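
The queue logic shared by MsgPost, MsgPeek, and MsgRead may be easier to follow in C. This is only a sketch under stated assumptions: it uses array indices where the assembly uses pointers, the names MsgQueue, DosMsgSketch, and QUEUE_SLOTS are hypothetical, and queue_read returns an error on an empty queue where the real MsgRead blocks on a semaphore.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_SLOTS 8   /* assumed size; the real array length is not shown */

typedef struct {
    uint16_t wnd, msg, wparam;
    uint32_t lparam;
} DosMsgSketch;

typedef struct {
    DosMsgSketch arr[QUEUE_SLOTS];
    unsigned put;   /* next slot to write, plays the role of MsgPut */
    unsigned get;   /* next slot to read, plays the role of MsgGet  */
} MsgQueue;

static void queue_init(MsgQueue *q)
{
    memset(q, 0, sizeof *q);
}

/* Post: returns 0 on success, -1 when the queue is full. */
static int queue_post(MsgQueue *q, const DosMsgSketch *m)
{
    unsigned next;

    assert(q->put < QUEUE_SLOTS && q->get < QUEUE_SLOTS);
    next = (q->put + 1) % QUEUE_SLOTS;
    if (next == q->get)
        return -1;          /* advancing put would make the queue look empty */
    q->arr[q->put] = *m;
    q->put = next;
    return 0;
}

/* Peek: copies the oldest message without removing it; -1 if empty. */
static int queue_peek(const MsgQueue *q, DosMsgSketch *out)
{
    if (q->get == q->put)
        return -1;
    *out = q->arr[q->get];
    return 0;
}

/* Read: like peek, but also advances get so the slot can be reused. */
static int queue_read(MsgQueue *q, DosMsgSketch *out)
{
    if (queue_peek(q, out) != 0)
        return -1;
    q->get = (q->get + 1) % QUEUE_SLOTS;
    return 0;
}
```

One slot is always left unused so that "put equals get" can only mean empty. That is the same trade-off behind the two-part full test in MsgPost, and it means a queue of N slots holds at most N-1 messages.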

DOS to Windows

Posting a message from a DOS app to a Windows app piggybacks on the internal message passing system. The DOS app needs to know the handle of the Windows app it is posting to. Then it just calls our internal PostMessage routine, passing it the message parameters. The message is then passed to Win-Link, which posts the message. The following code shows just the to-Windows part of MsgPost.

MsgPost proc
	mov     eax, [ebp.Client_EBX]
	call    GetVm
	jc      short mp05
	cmp     eax, [SysVm]
	jne     short mp10

	PostPm  [SysVm], [ebp.Client_CX], [ebp.Client_ECX+2],\
			[ebp.Client_DI], [ebp.Client_EDX]
	mov     [ebp.Client_EAX], ERR_NONE
	ret

mp05:   mov     [ebp.Client_EAX], ERR_UNKNOWN_VM
	ret
mp10:   ; post to DOS app code ...
MsgPost endp

Shared Memory and Copying Data Between VMs

Posting messages has a couple of disadvantages: It has a high overhead, it has a high latency (slow response time), and it has a queue limit. Most of all, you cannot pass pointers, just data in the registers themselves. Therefore we need calls to let us share memory. This capability comes to us in three calls, which let us copy data from one VM to another and give us pointers in one VM to data in another VM. Unfortunately, the pointer trick only works in protected-mode apps. A protected-mode app can get a pointer to data in a real-mode app, but, because a real-mode app uses segments instead of selectors, this is a one-way street. The real-mode DOS app cannot get a pointer to memory in a Windows application.

Copying Memory

The MsgMemCopy function copies data from any VM to any other VM. It assumes that a pointer into any VM other than the system VM is a real-mode segment:offset address. The code for this is very simple: The pointers are converted to flat 32-bit pointers and the data is then copied. The function V86ToPmPtr converts the pointer/VM pairs to the flat 32-bit offsets.

V86ToPmPtr proc
	Save    <edx>
	cmp     ebx, [SysVm]
	jne     short vtp20

	Save    <ecx>
	push    eax
	shr     eax, 16
	VMMcall _SelectorMapFlat <[SysVm], EAX, 0>
	pop     edx

	cmp     eax, -1
	je      short vtp10

	and     edx, 0FFFFh
	add     eax, edx
	Restore <ecx,edx>
	clc
	ret
vtp10:
	Restore <ecx,edx>
	stc
	ret
vtp20:
	movzx   edx, ax
	shr     eax, 12
	and     eax, 0FFFF0h
	add     eax, edx
	add     eax, [ebx.CB_High_Linear]

	Restore <edx>
	clc
	ret
V86ToPmPtr endp
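
The real-mode branch of V86ToPmPtr (the code at vtp20) is plain segment arithmetic. As a rough sketch in C, with the hypothetical name v86_to_flat and the VM's CB_High_Linear value passed in as an ordinary parameter:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a packed real-mode pointer (segment in the high word, offset in
   the low word) to a flat address, given the VM's high linear base. */
static uint32_t v86_to_flat(uint32_t seg_off, uint32_t high_linear)
{
    uint32_t seg = seg_off >> 16;
    uint32_t off = seg_off & 0xFFFFu;
    uint32_t linear = (seg << 4) + off;     /* segment * 16 + offset */

    assert(linear <= 0x10FFEFu);            /* largest address V86 can form */
    return high_linear + linear;
}
```

The assembly reaches the same result with one shift: shifting the packed pointer right by 12 leaves the segment multiplied by 16, and the mask 0FFFF0h discards the offset bits that slid into the low nibble.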

GetVm performs a very simple function. If the passed-in value in EAX is 0, GetVm returns the system VM in EAX. Otherwise, it leaves EAX alone, assuming it is the handle to a VM. In debug mode GetVm also validates the VM handle. It thus converts any VM handle passed in by our callers, for whom a handle of 0 means the system VM, into a real VM handle.

GetVm  proc
	or      eax, eax
	jnz     short gv10
	mov     eax, [SysVm]
gv10:   Save	<ebx>
	mov     ebx, eax
	VMMcall Validate_VM_Handle
	Restore <ebx>
	ret
GetVm  endp

This function is not affected by what VM is currently running. However, the memory at both ends of this copy had better be locked down. The error-checking code has been removed from the following to make the sample clearer.

MsgMemCopy  proc
	Save    <ebx>

; Get the params
	mov     eax, [ebp.Client_EBX]
	call    GetVm
	mov     ebx, eax

	mov     eax, [ebp.Client_ESI]
	call    V86ToPmPtr
	mov     esi, eax

	mov     eax, [ebp.Client_EDX]
	call    GetVm
	mov     ebx, eax

	mov     eax, [ebp.Client_EDI]
	call    V86ToPmPtr
	mov     edi, eax

	mov     ecx, [ebp.Client_ECX]
	Restore <ebx>

; Copy the dwords
	Save    <ecx>
	shr     ecx, 2
	rep     movsd
	Restore <ecx>
	and     ecx, 03h
	jz      short mmc30
	rep     movsb

mmc30:  mov     [ebp.Client_EAX], ERR_NONE
	ret
MsgMemCopy  endp
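
The copy loop at the end of MsgMemCopy moves whole dwords first and then the remaining zero to three bytes. A minimal C sketch of that split, using the hypothetical name copy_dwords_then_bytes, might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes: whole 32-bit words first, then the 0..3 leftover bytes. */
static void copy_dwords_then_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t dwords = n >> 2;     /* shr ecx, 2  */
    size_t tail   = n & 3u;     /* and ecx, 03h */

    assert(dst != NULL && src != NULL);
    while (dwords--) {          /* rep movsd */
        uint32_t w;
        memcpy(&w, s, sizeof w);    /* memcpy sidesteps alignment concerns in C */
        memcpy(d, &w, sizeof w);
        s += 4;
        d += 4;
    }
    while (tail--)              /* rep movsb */
        *d++ = *s++;
}
```

In portable C a single memcpy does the same job; the split is shown only because it mirrors what the assembly does by hand.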

Ldt and Gdt Pointers

The pairs of calls to create and free LDT and GDT pointers are MsgMemLdt and MsgMemFreeLdt, and MsgMemGdt and MsgMemFreeGdt. The GDT calls are similar except that you do not need to specify the VM in which the selector will be used.

MsgMemLdt first verifies that the VM where the memory is located is good. It then calls V86ToPmPtr to get the flat offset of the memory location. It next tests the limit. Because we are returning a 16:16 pointer, we have to ensure that the limit does not exceed 64K. Finally, we verify that the VM that will use the returned LDT pointer is legit.

We use the pair of calls _BuildDescriptorDWORDs and _Allocate_LDT_Selector to create an LDT pointer from the passed-in parameters.

MsgMemLdt proc
	Save    <ebx>

	mov     eax, [ebp.Client_EDX]
	call    GetVm
	jc      short mm130
	mov     ebx, eax

; Get flat address
	mov     eax, [ebp.Client_EDI]
	call    V86ToPmPtr
	jc      short mm130
	mov     esi, eax

; Get the limit
	mov     edi, [ebp.Client_ECX]
	test    edi, 0FFF00000h
	jnz     short mm130

	mov     eax, [ebp.Client_EBX]
	call    GetVm
	jc      short mm130
	mov     ebx, eax

; Create it
	VMMcall _BuildDescriptorDWORDs <esi, edi, RW_Data_Type, D_GRAN_BYTE, 0>
	VMMcall _Allocate_LDT_Selector <ebx, edx, eax, 1, 0>
	Restore <ebx>
	mov     [ebp.Client_AX], ax
	ret

mm130:  Restore <ebx>
	mov     [ebp.Client_EAX], 0
	ret
MsgMemLdt endp
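
It can help to see what a pair of descriptor DWORDs actually contains. The following C sketch packs a base, a limit, an access byte, and the four flag bits into the standard 386 segment descriptor layout. It is an illustration of that layout only, not the VMM's implementation of _BuildDescriptorDWORDs; the Descriptor type and the build_descriptor name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t lo, hi; } Descriptor;

/* access: the type byte (present bit, DPL, S bit, segment type).
   flags:  the four bits above the limit (granularity, default size, 0, AVL). */
static Descriptor build_descriptor(uint32_t base, uint32_t limit,
                                   uint8_t access, uint8_t flags)
{
    Descriptor d;

    assert(limit <= 0xFFFFFu);      /* the limit field is only 20 bits wide */
    assert(flags <= 0xFu);

    d.lo = (limit & 0xFFFFu)                /* limit bits 15..0  */
         | ((base & 0xFFFFu) << 16);        /* base bits 15..0   */
    d.hi = ((base >> 16) & 0xFFu)           /* base bits 23..16  */
         | ((uint32_t)access << 8)          /* access byte       */
         | (limit & 0xF0000u)               /* limit bits 19..16 */
         | ((uint32_t)flags << 20)          /* G, D, 0, AVL      */
         | (base & 0xFF000000u);            /* base bits 31..24  */
    return d;
}
```

With byte granularity the limit is the offset of the last valid byte, which is why a pointer handed back as 16:16 is only useful for limits that fit in the 16-bit offset.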

Freeing an LDT selector is even easier. Again, because a VM handle of 0 needs to be converted, we call GetVm. Then we call _Free_LDT_Selector to free the selector.

Whether you use LDTs or GDTs, the free call is critical. There are only 8K GDT selectors in the entire system and only 8K LDT selectors in each VM. If you have a leak where you allocate and don't free pointers, you will bring the system to its knees sooner or later.

MsgMemFreeLdt proc
	Save    <ebx>

	mov	eax,[ebp.Client_EBX]
	call	GetVm
	jc	short mf120
	mov     ebx, eax

	movzx   edx, word ptr [ebp.Client_EDX]
	VMMcall _Free_LDT_Selector <ebx, edx, 0>
	Restore <ebx>
	mov     [ebp.Client_EAX], 0
	ret

mf120:  Restore <ebx>
	mov     [ebp.Client_EAX], ERR_UNKNOWN_VM
	ret
MsgMemFreeLdt endp

Launching a DOS box

A Windows app can launch a DOS or Windows app with no help from us. The trick is for a DOS app to launch a Windows app or another DOS box.

This is painfully easy. The DOS app sends a message to Win-Link, which calls DosExec in Win-Link. This call passes a file to exec and a run parameter. This file can be a DOS or Windows app. Win-Link will then call WinExec to launch the app. The app is launched in the mode specified. If the mode is SW_HIDE, the app is launched but you will not even see an icon for it.

void DosExec (HWND hDlg,LONG lParam) {
  BYTE _far *fpsFile;
  SENDDLG _far *fpSendDlg;
  VMDATA _far *fpVmData;

  fpSendDlg = (SENDDLG _far *) lParam;
  fpVmData = (VMDATA _far *) fpSendDlg->lParam;
  fpsFile = fpVmData->sExec;

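  // WinExec returns a value of 32 or less on failure; 0 means success to
  // our caller, matching ExecFile
  if (WinExec (fpsFile, fpSendDlg->wParam) <= 32)
    fpSendDlg->lRtn = -1L;
  else
    fpSendDlg->lRtn = 0L;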

  *(fpVmData->sExec) = 0;
}

Launching Windows Applications from DOS

We now come to the initial instigation for the Win-Link program. In a window at the DOS prompt you type the name of a Windows app and it returns saying, "This program requires Microsoft Windows".

And the initial thought I always had was: What do you think is running? Granted this was partially a problem with wording -- I have seen some applications that will sense if Windows is running and, if it is, give you a better message. But still, Windows is running and I want it to start up my Windows app, even if I type the command from the DOS command line. So we will now go through this process.

The first step is to intercept the int 21h call to exec a DOS program. (Note: all the following code fragments show just the necessary parts to catch the DOS exec. I have also removed the special case code for win.com.) If you type win at the DOS command prompt, Win-Link has some special handling. This is a remnant from Windows 3.0 days when Windows would let you start Windows in a DOS box.

The first thing we do is open the .EXE file. We have to be careful here because if share is loaded and this is a Windows EXE that is already running, we will get a sharing violation. So we also have an int 24h hooker to catch the violation. This stops it from appearing in the DOS box.

BeginProc WinIpc_Int_24
        mov     eax, ebx
	call    GetVmData

	test    [esi.wFlags], I24_ON
	jz      short i24_10

	mov     [ebp.Client_AL], 3
	clc
	ret

i24_10: stc
	ret
EndProc WinIpc_Int_24

If the open fails, we check the return code. If it is a sharing violation, we pass it on to Win-Link to try and exec because the odds are pretty good that it's a Windows app. If it is a different error we pass it on to DOS for a try.

If the open succeeded, we next read the file to see if it is in the New Executable format. Unfortunately, all this tells us is that the file is not a plain real-mode program; there is no way to tell from the signature alone whether it is a Windows or an OS/2 application.

If the file does not have the NE signature, we pass it on to DOS. Up to this point our hit has been minimal. Yes we did an open, but DOS will open the same file again so all we did is get it in the cache sooner.

BeginProc WinIpc_Int_21
i21_70: cmp     [ebp.Client_AX], 4B00h  ; EXEC, func 0?
	jne     i21_160

	test    [SysFlags], EXEC_OFF    ; EXEC off?
	jnz     i21_160

	Push_Client_State
	VMMcall Begin_Nest_Exec

	push    edi                     ; local vars
	push    esi                     ; local vars
	sub     esp, size DiStk
	mov     edi, esp
	mov     [edi.hVm], ebx

	movzx   edx, [ebp.Client_ES]    ; get offset to cmd line
	shl     edx, 4
	movzx   eax, [ebp.Client_BX]
	add     edx, eax
	add     edx, [ebx.CB_High_Linear]

	movzx   eax, word ptr [edx+4]
	shl     eax, 4
	movzx   edx, word ptr [edx+2]
	add     edx, eax
	add     edx, [ebx.CB_High_Linear]
	mov     [edi.pCmd], edx

	movzx   edx, [ebp.Client_DS]    ; get offset to file name
	shl     edx, 4
	movzx   eax, [ebp.Client_DX]
	add     edx, eax
	add     edx, [ebx.CB_High_Linear]
	mov     [edi.pFn], edx

	or      [esi.wFlags], I24_ON
	mov     eax, 3D20h              ; open file
	VxDint  21h
	jnc     short i21_110           ; NO error on open
	and     [esi.wFlags], not I24_ON
	cmp     al, 5                   ; file locked?
	jne     i21_150                 ; NO - leave it to DOS
	jmp     i21_120
i21_110:
	and     [esi.wFlags], not I24_ON
	mov     [edi.hFile], ax
	mov     ebx, eax

	mov     eax, 3F00h              ; read MZ
	mov     ecx, 2
	lea     edx, [edi]+RwBuf
	VxDint  21h
	jc      i21_140
	cmp     word ptr [edi.RwBuf], 5A4Dh
	jne     i21_140

	mov     eax, 4200h              ; seek to offset
	xor     ecx, ecx
	mov     edx, 3Ch
	VxDint  21h
	jc      i21_140

	mov     eax, 3F00h              ; read offset
	mov     ecx, 4
	lea     edx, [edi]+RwBuf
	VxDint  21h
	jc      i21_140

	movzx   edx, word ptr [edi.RwBuf] ; Seek to new EXE
	movzx   ecx, word ptr [edi.RwBuf+2]
	mov     eax, 4200h
	VxDint  21h
	jc      i21_140

	mov     eax, 3F00h              ; read NE
	mov     ecx, 2
	lea     edx, [edi]+RwBuf
	VxDint  21h
	jc      i21_140
	cmp     word ptr [edi.RwBuf], 454Eh
	jne     i21_140

	mov     bx, [edi.hFile]         ; close file
	mov     eax, 3E00h
	VxDint  21h

Ok, we may have a Windows app, so we copy the file name and command line into our structure and send a message to Win-Link. Win-Link will return a 0 if it launched the program successfully. In that case we return, eating the interrupt call. This will return the DOS box back to the DOS prompt.

If Win-Link returns nonzero, then it could not launch the app. In that case we return with carry set and the interrupt is passed on to DOS. DOS then attempts to launch the application.

i21_120:mov     ebx, [edi.hVm]
	mov     eax, ebx
	call    GetVmData

	push    edi
	push    esi

	mov     edi, [edi].pFn
	lea     esi, [esi].sExec        ; copy fn
	xchg    esi, edi
	mov     ecx, 128 / 4
	rep     movsd

	pop     esi
	pop     edi
	push    edi
	push    esi

	mov     edi, [edi].pCmd         ; copy command line
	lea     esi, [esi].sCmdLine
	xchg    esi, edi
	mov     ecx, 128 / 4
	rep     movsd

	pop     esi
	pop     edi

	mov     edx, [esi.VmLdt]
	Save    <edi,esi>
	SendPm  [SysVm], [Syswnd], MSG_WIN_EXEC, 0, edx

	Restore <esi,edi>
	mov     ebx, [edi.hVm]

	cmp     edx, 0                  ; WinExec OK?
	jne     short i21_150

	add     esp, size DiStk
	pop     esi
	pop     edi
	VMMcall End_Nest_Exec
	Pop_Client_State

	clc                             ; return done
	ret
i21_140:
	mov     bx, [edi.hFile]         ; close file
	mov     eax, 3E00h
	VxDint  21h
i21_150:
	mov     ebx, [edi.hVm]
	add     esp, size DiStk
	pop     esi
	pop     edi
	VMMcall End_Nest_Exec
	Pop_Client_State
i21_160:
	stc                             ; return continue chain
	ret
EndProc WinIpc_Int_21
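
The sequence of reads and seeks in the handler above boils down to three checks: an MZ signature at the start of the file, a dword at offset 3Ch that locates the new header, and an NE signature at that location. Here is a sketch of the same test in C, run against an in-memory copy of the file rather than through int 21h; the function name has_ne_header and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the image starts with "MZ" and the header located by the
   dword at offset 3Ch starts with "NE"; returns 0 otherwise. */
static int has_ne_header(const unsigned char *img, size_t len)
{
    uint32_t ne_off;

    assert(img != NULL);
    if (len < 0x40 || rd16(img) != 0x5A4Du)         /* "MZ" */
        return 0;
    ne_off = rd32(img + 0x3C);
    if (ne_off > len || len - ne_off < 2)           /* header must be in range */
        return 0;
    return rd16(img + ne_off) == 0x454Eu;           /* "NE" */
}
```

As the text notes, a positive result only says the file carries a New Executable header. It does not say which operating system the program was built for.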

When the message is sent to Win-Link, it processes it in ExecFile. We first look to see if this file is in a list of files that are not to be launched. This list includes bound OS/2 apps, apps that have a real DOS program as their stub, and any other EXEs that have an NE header that you do not wish to launch. These files are tracked by file name only, not the full path. So we compare just the file name.

We then find the drive and directory of the file being executed. We know the directory because command.com walks the path and, for each attempt, passes EXEC the fully qualified name of the file to run. We set that drive and directory as the default drive and directory, so that an application is run from its own directory. Experience has shown me that this is the best choice.

Now we're ready to try it. We call LoadModule because we only want to launch Windows apps and not DOS apps. A DOS app should stay in its own VM. LoadModule can only exec a Windows app. LoadModule gives us a return value which we then pass back as our return value. Obtaining this return value is the reason we needed a SendMessage instead of a PostMessage.

Finally, we restore the default drive and directory.

void ExecFile (HWND hDlg,WORD wParam,DWORD lParam) {
  BYTE _far *fpsBase, _far *fpsFile;
  SENDDLG _far *fpSendDlg;
  VMDATA _far *fpVmData;
  FARPROC lpProcAbout;
  int iNum;
  LOADMOD LoadMod;
  WORD wCmdShow[2];
  BYTE sBuf[FILE_MAX+2], sCwd[FILE_MAX+2];

  fpSendDlg = (SENDDLG _far *) lParam;
  fpVmData = (VMDATA _far *) fpSendDlg->lParam;
  fpsBase = fpVmData->sExec;
  fpsFile = fStrEnd (fpsBase);
  while ((fpsFile >= fpsBase) && (*fpsFile != '\\') &&
   (*fpsFile != '/') && (*fpsFile != ':'))
    fpsFile--;
  fpsFile++;

// see if in our no-no list
  fpSendDlg->lRtn = 0L;
  if ((iNum = (int) SendDlgItemMessage (hDlg, DLG_NO_EXEC,
   LB_FINDSTRING, 0, (LONG) fpsFile)) >= 0) {
    SendDlgItemMessage (hDlg, DLG_NO_EXEC, LB_GETTEXT,
     iNum, (LONG) (LPSTR) sBuf);
    if (!fStriCmp (sBuf, fpsFile)) fpSendDlg->lRtn = 0xFFFFFFFFL;
  }
  if (fpSendDlg->lRtn != 0)  {
    *(fpVmData->sExec) = 0;
    return;
  }

// Save the current dir & set the current dir to the dir
// the program is in. After the exec - we restore the cur dir
  _getdcwd (toupper (*fpsBase) - 'A' + 1, sCwd, FILE_MAX);
  fStrnCpy (sBuf, fpsBase, FILE_MAX);
  iNum = Min (fpsFile - fpsBase, FILE_MAX);
  if ((iNum > 3) && (sBuf[iNum-1] =='\\'))
    iNum--;
  sBuf [iNum] = 0;

// Set default drive & dir
  _dos_setdrive (toupper (sBuf[0]) - 'A' + 1, (unsigned *) &iNum);
  _chdir (sBuf);
  fpsFile = fpVmData->sCmdLine;
  *(fpsFile + (*fpsFile) + 1) = 0;

  LoadMod.wEnvSeg = 0;
  LoadMod.dwRes = 0;
  LoadMod.lpCmdLine = fpsFile + 1;
  LoadMod.lpCmdShow = wCmdShow;
  wCmdShow[0] = 2;
  wCmdShow[1] = SW_SHOWNORMAL;

  if (LoadModule (fpsBase, &LoadMod) <= (HINSTANCE) 32)
    fpSendDlg->lRtn = -1L;
  *(fpVmData->sExec) = 0;

// Back to the old drive & dir
  _dos_setdrive (toupper (sCwd[0]) - 'A' + 1, (unsigned *) &iNum);
  _chdir (sCwd);
}
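
The first thing ExecFile does is split the fully qualified path into a directory part and a file-name part by scanning back from the end for a backslash, slash, or colon. A stand-alone sketch of that scan in portable C follows. The name dos_basename is hypothetical, and unlike the loop above it never steps in front of the start of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the file-name part of a DOS path: the text after
   the last '\\', '/' or ':', or the whole string if none is present. */
static const char *dos_basename(const char *path)
{
    const char *p;

    assert(path != NULL);
    p = path + strlen(path);
    while (p > path && p[-1] != '\\' && p[-1] != '/' && p[-1] != ':')
        p--;
    return p;
}
```

The difference between the returned pointer and the start of the path is the length of the drive-and-directory prefix, which is the quantity ExecFile uses to build the directory string it passes to _chdir.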

The DOS Box Title, Print Intercepting, and Everything

We have covered a significant part of Win-Link. Unfortunately (or fortunately depending on your point of view), this book is not titled The Complete Guide to the Win-Link Sources.

The DOS box title tracking is fairly straightforward. Whenever Win-IPC believes that the running application has changed it sets the title to the string found in the memory arena for the currently selected PSP.

The one weird thing here is you can't track the set PSP call because there are usually TSRs or device drivers that temporarily change it.

You will find that the title constantly changes as you sit at the DOS prompt.

The print intercepting is probably the most complicated part of the entire program. It involves intercepting various interrupts, time-outs, and its own printer driver. A thorough discussion of it could be a book by itself. And, unfortunately, I do not have permission to include the sources to the raw printer driver. However, all RAW.DRV does is properly implement the PASSTHROUGH escape command; most 3.1 printer drivers also do that.

The rest of Win-Link is pretty dull. There is the code to handle the dialog box and the other details of a standard Windows program. I hope by explaining how a commercial program works, I have provided a different viewpoint into VxDs than you get from sample programs. I also hope that if you ever have to write a program like this that the code presented here will give you a head start. I can tell you from experience that attacking this for the first time is not the best way to learn about VxDs.
