Virtual device drivers (VxDs) are not just for people writing drivers for hardware devices, any more than DOS device drivers are. A VxD is Windows' way of letting you do almost anything you want. If you miss the DOS world where you have direct access to the hardware, can interface to vital CPU functions, or can take over parts of the operating system, then welcome to VxDs, where you can do a lot of the same under Windows.
A VxD is code and data that runs at ring 0 in 32-bit flat model as part of the Windows 386 virtual machine manager (VMM). In fact, the VMM (WIN386.EXE) is primarily a number of standard VxDs combined in a single file. VxDs only operate when Windows runs in 386 Enhanced mode.
VMM is not really a part of Windows; instead, it is a preemptive, multitasking kernel that controls multiple virtual machines. Once VMM has initialized, the Windows Graphical User Interface, composed of KRNL386.EXE, GDI.EXE, USER.EXE, and all of the supporting drivers, is loaded into the System VM (the initial virtual machine created when VMM is started). However, VMM could just as easily load COMMAND.COM into the System VM and, with the assistance of a VxD and some helper hot-keys, give you a multitasking DOS instead of the fancy Windows GUI.
Because VxDs operate at ring 0, the operating-system level of protection, the CPU allows the code to execute any 386 instruction. At higher ring levels, access to memory addresses or I/O ports can be restricted from the VM, allowing the VMM or a VxD to process the exception as it wishes. Of course, certain instructions executed by the VM always cause processor exceptions, but a VxD can simulate the functionality of that instruction for the VM, allowing it to operate as if it has sufficient privilege.
With this power comes responsibility. Although a VxD can play with the Interrupt Descriptor Table (IDT) entries directly, this is something that should probably be avoided. Besides, the VMM provides enough functionality to get as close to the IDT as needed, so why reinvent the wheel?
A VxD is always active, unlike any other part of Windows. When a DOS box is running in exclusive mode, the primary code executing apart from the DOS box itself is that of VxDs responding to IRQs, faulting instructions, and trapped I/O or page faults in the DOS box.
A VxD is the only program with unobstructed access to the hardware. If a VxD performs I/O to a port, it communicates directly with the physical port, without restrictions. If a VxD owns a hardware interrupt, the VxD receives the IRQ directly from the Virtual Programmable Interrupt Controller Driver (VPICD), without ring transitions. For example, an interrupt service routine for a non-owned interrupt in a VM sees a virtualized interrupt through events scheduled by the VPICD, whereas a VxD has a more direct path for interrupt servicing. Where code communicating with hardware in a VM may be restricted or slowed by ring transitions and access permission lookups, a VxD is unrestricted and extremely fast.
VxDs operate in 32-bit flat model, the 386 equivalent of small model. All of the segment registers are fixed to the same base address. The CS and DS selector values differ, due to access and execution restrictions (code versus data), but point to the same memory. Because a VxD is in 32-bit flat model, all offsets to code and data are 32-bit; therefore, you can access any part of the address space (4 gigabytes) with just an offset.
A VxD is also given priority on all actions in a VM. A VxD can intercept and/or generate interrupts (hardware or software), trap port I/O, and even restrict access to specific regions of memory. VxDs can determine whether to allow such access to occur, provide simulation, terminate (or nuke) the VM, or simply ignore the request.
Because VxDs utilize the base components of the 80386 processor, it is important that you have a working knowledge of 386 architecture.
For a good description of 80386/80486 system architecture, see Hummel, Robert L. (1992), PC Magazine Programmer's Technical Reference: The Processor and Coprocessor, Emeryville, CA: Ziff-Davis Press.
A misbehaving MS-DOS application will usually crash the DOS virtual machine. A misbehaving Windows application may affect the operation of other Windows applications. However, a misbehaving VxD will crash the entire Windows operating system. Because a VxD is part of the WIN386 kernel, the VxD is active during critical processing of the Windows operating system. The smallest, most subtle bug can have devastating effects on the operating system. Thorough testing of virtual device drivers is absolutely necessary. Do not simply test how the VxD operates under stringent configurations; instead, expand your testing to include all possible permutations of end-user system configurations you can design (limited only by a testing hardware budget of course!).
VxDs were originally designed to handle hardware device contention between multiple processes and to translate or buffer data transfers from a VM to hardware devices. When two or more programs attempt to access the same device, some method of contention management must be used. You can use a VxD to allow each process to act as though it has exclusive access to the device. For example, a Virtual Printer Device (VPD) would provide the process with a virtual printer port, and characters written to the port would be written to a print spooler. The VxD would then send the job to the printer when it becomes available. Windows 3.X does not operate in this fashion, but the Win-Link VxD provides this functionality (see Chapter 13 for more information). Another method would be to assign the physical device to only one process at a time, so that when a process attempts to access the device while it is in use, the VxD does not pass the request to the actual hardware, and the process operates as though the hardware did not exist. The Virtual COMM Device (VCD) uses this method.
Recently, the use of VxDs has been expanded to include interprocess communication (demonstrated in the Win-Link example). Some VxDs now also implement a truly virtual device, providing the necessary mechanisms to allow a virtual machine to see a device that may not actually exist in hardware. VxDs can also implement client-server hardware management, providing an interface to a VM that virtualizes I/O to the device and translates this information to commands to be sent across a network to a hardware server.
Note: Due to problems in Windows 3.x, you will need to make sure that your real-mode initialization segment is not exactly 4k, 8k, 12k, or 16k in size. Additionally, real-mode initialization segments greater than 8k (or 12k in Windows 3.1) must be a multiple of 4. Real-mode initialization segments cannot be greater than 12k under Windows 3.0 or greater than 16k under Windows 3.1. Using code segments greater than these restrictions will cause problems and will eventually hang VMM. These problems were reported on the CompuServe WinSDK forum and confirmed by Developer Support Engineers. Avoid these problems with real-mode initialization by adding the necessary boundary checks in your code.
    Declare_Virtual_Device VSIMPLED, VSIMPLED_Major_Ver,\
                           VSIMPLED_Minor_Ver,\
                           VSIMPLED_Control_Proc,\
                           VSIMPLED_Device_ID,\
                           Undefined_Init_Order,\
                           VSIMPLED_V86_API_Proc,\
                           VSIMPLED_PM_API_Proc
This declaration dispatches the system control events to the VSIMPLED_Control_Proc. This procedure must be declared in a VxD_LOCKED_CODE segment, which handles system event control such as the initialization dispatch, VM control events (creation or suspension of VMs), device focus changes, and system shutdown notifications. Defining it in any other segment will cause problems.
    VXD_LOCKED_CODE_SEG

    ;VSIMPLED_Control_Proc
    ;
    ;Description:
    ;   This is the entry point for system control calls from VMM.
    ;   Control for system messages is dispatched through the
    ;   Control_Dispatch macro in VMM.INC.

    BeginProc VSIMPLED_Control_Proc

        Control_Dispatch Sys_Critical_Init, VSIMPLED_Sys_Critical_Init
        Control_Dispatch Device_Init, VSIMPLED_Device_Init
        clc                     ;messages we do not handle succeed
        ret

    EndProc VSIMPLED_Control_Proc

    VXD_LOCKED_CODE_ENDS
VXD_LOCKED_CODE_SEG and VXD_LOCKED_CODE_ENDS are macros that define a segment of 32-bit code in a page-locked segment. Defining this segment as "page-locked" is necessary because the calls are dispatched during critical processing of the VMM. This procedure cannot be included in the initialization code segments, because it would be discarded after VMM completed its startup procedures and system failure would occur when the VMM attempted to dispatch a control message to the VxD during later processing.
The BeginProc and EndProc macros define the beginning and end of a specific VxD entry point. These macros define the procedure name of a VxD, declare it callable by other VxDs, align the procedure for "fast-calling", declare the procedure as public for access outside of this module, or additionally define the procedure as an asynchronous service callable from another VxD at interrupt time. The valid parameters to the BeginProc macro are PUBLIC, HIGH_FREQ, SERVICE, and ASYNC_SERVICE, and their functionality is described in the following table:
PUBLIC | Procedure is callable from an external module |
HIGH_FREQ | Aligns this procedure on a DWORD boundary. Useful for procedures called frequently such as hardware interrupt procedures or I/O trapping routines. |
SERVICE | Procedure can be called from another VxD. Requires an exported service table. |
ASYNC_SERVICE | Same as SERVICE, but the VxD routine can be called during interrupt procedures. VxD services that do not specify this option and are called at interrupt time will cause debug traces when using the debug version of VMM (WIN386.EXE). If you declare a service to be asynchronous be sure that it is atomic or can be interrupted while processing the request. |
If you are replacing an existing VxD, such as the Virtual Comm Device (VCD), you should use the value specified in VMM.INC. The replacement VCD would then have a device ID of VCD_Device_ID. Otherwise, assuming that your VxD does not provide an external API or services, you can use the predefined value of Undefined_Device_ID.
Sys_Critical_Init may also be used to hook your VxD in front of certain handlers, such as GP fault or NMI processing. Sys_Critical_Init is an optional procedure, and you should only define this procedure if you have specific initialization to perform. Below is a sample Sys_Critical_Init handler as used in the VSIMPLED Sample:
    ;VSIMPLED_Sys_Critical_Init
    ;
    ;Description:
    ;   On entry, interrupts are disabled. Critical initialization
    ;   for this VxD should occur here. For example, we can read
    ;   settings from VMM's cached copy of the SYSTEM.INI and set up
    ;   our VxD as appropriate.
    ;
    ;   This procedure is called when the VSIMPLED_Control_Proc
    ;   dispatches the Sys_Critical_Init notification from VMM.
    ;   We can notify VMM of failure by returning with carry set
    ;   or carry clear will suggest success.

    BeginProc VSIMPLED_Sys_Critical_Init

        clc
        ret

    EndProc VSIMPLED_Sys_Critical_Init
MAKEFILE |
    !IFDEF DEBUG
    DEFS=-DDEBUG
    !ENDIF

    .asm.obj:
        masm5 -p -w2 -Mx $(DEFS) $*;

    .asm.lst:
        masm5 -l -p -w2 -Mx $(DEFS) $*;

    OBJS=vsimpled.obj

    all: vsimpled.386

    vsimpled.obj: vsimpled.asm

    vsimpled.386: vsimpled.def $(OBJS)
        link386 /NOI /NOD /NOP /MAP @<<
    $(OBJS)
    vsimpled.386
    vsimpled.map
    vsimpled.def
    <<
        addhdr vsimpled.386
        mapsym32 vsimpled

    clean:
        del *.386
        del *.obj
        del *.map
        del *.sym
VSIMPLED.ASM |
    page 60, 132
    ;
    title VSIMPLED - A simple virtual device driver example
    ;
    ;(C)Copyright Woodruff Software Systems, 1993
    ;Title:   VSIMPLED.386 - Sample virtual device driver
    ;Module:  VSIMPLED.ASM - Core code
    ;Version: 1.00
    ;Date:    November 24, 1992
    ;Author:  Bryan A. Woodruff
    ;
    ;Change log:
    ;   DATE      REVISION  DESCRIPTION  AUTHOR
    ;   11/24/92  1.00      Wrote it.    BryanW
    ;
    ;Functional Description:
    ;   Provides a minimal virtual device driver interface.
    ;
        .386p

    ; INCLUDES & EQUATES
    ;
        .XLIST
        INCLUDE VMM.Inc
        INCLUDE Debug.Inc
        .LIST

    VSIMPLED_Major_Ver  equ 01h
    VSIMPLED_Minor_Ver  equ 00h
    VSIMPLED_Device_ID  equ Undefined_Device_ID

    ; VIRTUAL DEVICE DECLARATION

    Declare_Virtual_Device VSIMPLED, VSIMPLED_Major_Ver,\
                           VSIMPLED_Minor_Ver, VSIMPLED_Control_Proc,\
                           VSIMPLED_Device_ID, Undefined_Init_Order,,,

    ; ICODE

    VxD_ICODE_SEG

    ;VSIMPLED_Sys_Critical_Init
    ;
    ;Description:
    ;   On entry, interrupts are disabled. Critical initialization
    ;   for this VxD should occur here. For example, we can read
    ;   settings from VMM's cached copy of the SYSTEM.INI and
    ;   set up our VxD as appropriate.
    ;
    ;   This procedure is called when the VSIMPLED_Control_Proc
    ;   dispatches the Sys_Critical_Init notification from VMM.
    ;
    ;   We can notify VMM of failure by returning with carry set
    ;   or carry clear will suggest success.

    BeginProc VSIMPLED_Sys_Critical_Init

        Trace_Out "VSIMPLED: Sys_Critical_Init"
        clc
        ret

    EndProc VSIMPLED_Sys_Critical_Init

    ;
    ; VSIMPLED_Device_Init
    ;
    ;Description:
    ;   This is a non-system critical initialization procedure.
    ;   IRQ virtualization, I/O port trapping and VM control
    ;   block allocation can occur here.
    ;   Again, the same return value applies.
    ;   CLC for success, STC for error notification.

    BeginProc VSIMPLED_Device_Init

        Trace_Out "VSIMPLED: Device_Init"
        clc
        ret

    EndProc VSIMPLED_Device_Init

    VxD_ICODE_ENDS

    VxD_LOCKED_CODE_SEG

    ; NONPAGEABLE CODE
    ;
    ;VSIMPLED_Control_Proc
    ;
    ;DESCRIPTION:
    ;   Dispatches VMM control messages to the appropriate handlers.
    ;ENTRY:
    ;   EAX = Message
    ;   EBX = VM associated with message
    ;EXIT:
    ;   Carry clear if no error (or if not handled by the VxD)
    ;   or set to indicate failure if the message can be failed.
    ;USES:
    ;   All registers.

    BeginProc VSIMPLED_Control_Proc

        Control_Dispatch Sys_Critical_Init, VSIMPLED_Sys_Critical_Init
        Control_Dispatch Device_Init, VSIMPLED_Device_Init
        clc
        ret

    EndProc VSIMPLED_Control_Proc

    VxD_LOCKED_CODE_ENDS

        END

    ; End of File: vsimpled.asm
VSIMPLED.DEF |
    LIBRARY     VSIMPLED
    DESCRIPTION 'Win386 VSIMPLED Sample Device (Version 3.10)'
    EXETYPE     DEV386

    SEGMENTS
        _LTEXT PRELOAD NONDISCARDABLE
        _LDATA PRELOAD NONDISCARDABLE
        _ITEXT CLASS 'ICODE' DISCARDABLE
        _IDATA CLASS 'ICODE' DISCARDABLE
        _TEXT  CLASS 'PCODE' NONDISCARDABLE
        _DATA  CLASS 'PCODE' NONDISCARDABLE

    EXPORTS
        VSIMPLED_DDB @1
Note: WDEB386 and the debug version of WIN386.EXE are provided with VxD-Lite included on the accompanying disk.
The VSIMPLED device displays trace information at each initialization phase. Before the GUI starts, break into the debugger by using the appropriate hot-key (Control-D for Soft-ICE/W or a Control-C from the terminal keyboard for WDEB386) and unassemble the VSIMPLED_Sys_Critical_Init procedure:
    Registration # SIW012345
    :ALTSCR OFF
    :LINES 50
    :i1here on
    :wc
    :X
    VSIMPLED: Sys_Critical_Init
    Break Due to Hot Key
    D800:00001A20  MOV     CX,0040
    :u VSIMPLED_Sys_Critical_Init
    VSIMPLED_Sys_Critical_Init
    0028:8029478C  CALL    [Log_Proc_Call]
    0028:80294792  PUSHFD
    0028:80294793  PUSHAD
    0028:80294794  MOV     ESI,VSIMPLED_DDB+38(800FEA2C)
    0028:80294799  CALL    [Out_Debug_String]
    0028:8029479F  POPAD
    0028:802947A0  POPFD
    :g
    VSIMPLED: Device_Init
    VMM Version 03.10 - Build Rev 00000103
    Break Due to Hot Key
    0028:800110A6  CMP     AX,0030
    :u VSIMPLED_Sys_Critical_Init
    VSIMPLED_Sys_Critical_Init
    0028:8029478C  INVALID
    0028:8029478E  INVALID
    0028:80294790  INVALID
    0028:80294796  INVALID
    0028:80294798  INVALID
    :g
Re-enter the debugger when the Windows GUI has completed initialization and unassemble the same procedure. You will find that the address is invalid because the initialization code and data segments were discarded after the device initialization was completed.
For more information on VMM's debugging services and debugging techniques, see Chapter 11, "Using the Debugging Services".
The Virtual Machine Manager is a single-threaded, non-reentrant, preemptive multi-tasking, event-driven operating system. This operating system is often referred to as "WIN386" or "VMM". VMM provides an interface layer to VxDs for event scheduling, memory management, descriptor table management, and other vital system services. The VMM creates, runs, and destroys virtual machines (VMs). On startup, the VMM creates the System VM for the Windows GUI. The System VM interfaces to the SHELL VxD in VMM to create new virtual machines or DOS boxes -- each new VM starts operation in Virtual 8086 (V86) mode. Because a VxD is a part of the VMM, it runs within whatever VM is active when it is called. Consequently, when a DOS VM calls a VxD, the VxD runs in protected mode in the context of the calling VM.
To write a VxD, you must have a clear understanding of how the VMM works.
There are two types of event lists: the global event list and VM-specific event lists. The global event list is the event list for the VMM. As each VM is created, VMM creates an event list for specific events of that VM. Prior to returning control to a VM, VMM processes any events in the global event list, any pending NMI events (a special form of a global event), and then the VM event list as shown in Figure 2.1. Note that VM-specific events are only processed for the active VM.
Figure 2.1: VMM Event Processing Order
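The processing order described above can be modeled in a few lines of portable C. This is only a sketch of the ordering rule, not VMM's implementation: the `Event` structure, the `EventKind` names, and `drain_events` are all hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical event kinds, named only for this sketch. */
typedef enum { EV_GLOBAL, EV_NMI, EV_VM } EventKind;

typedef struct {
    EventKind kind;
    int vm_id;   /* owning VM for EV_VM events; ignored otherwise */
    int tag;     /* caller-chosen label so the order can be observed */
} Event;

/*
 * Writes into out[] the tags of the events that would run before
 * control returns to active_vm, in processing order: global events,
 * then NMI events, then events that belong to the active VM.
 * Events queued for other VMs are skipped (they stay pending).
 * out[] must have room for count entries. Returns the number written.
 */
size_t drain_events(const Event *pending, size_t count,
                    int active_vm, int *out)
{
    static const EventKind order[] = { EV_GLOBAL, EV_NMI, EV_VM };
    size_t written = 0;

    for (size_t pass = 0; pass < sizeof order / sizeof order[0]; pass++) {
        for (size_t i = 0; i < count; i++) {
            if (pending[i].kind != order[pass])
                continue;
            if (pending[i].kind == EV_VM && pending[i].vm_id != active_vm)
                continue;
            out[written++] = pending[i].tag;
        }
    }
    return written;
}
```

With a VM event for VM 2, an NMI event, a VM event for VM 1, and a global event queued in that order, and VM 1 active, the model runs the global event, then the NMI event, then VM 1's event; the event for VM 2 waits until that VM becomes active.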
When VM events are created, the execution priority of the VM can be adjusted. This is also known as a "boost". The boost can be temporary (automatically removed by VMM) or can be specifically removed by the VxD when all of the necessary event processing for that VM is completed. The execution priority of a VM is used by the primary scheduler (execution priority scheduling) to determine the active VM. (See the section on Scheduling for more detail.)
When all events from the global event list and active VM event list have been processed, the primary scheduler walks the VM list searching for the VM with the highest execution priority. The VM with the highest execution priority becomes the active VM. VMM returns to the active VM until it is reactivated by interrupt or fault processing.
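As a rough model of the primary scheduler's walk, the following C sketch picks the VM with the highest execution priority and shows how applying and removing a boost changes the outcome. The `VmEntry` type and both function names are invented for this example, and the boost values used in the usage note are placeholders rather than the constants from VMM.INC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scheduler entry; the real VMM structures differ. */
typedef struct {
    int     id;
    int32_t exec_priority;   /* sum of all boosts currently applied */
} VmEntry;

/* Apply a boost (positive) or remove one (negative). */
void adjust_exec_priority(VmEntry *vm, int32_t boost)
{
    vm->exec_priority += boost;
}

/*
 * Walk the VM list and return the entry with the highest execution
 * priority, or NULL for an empty list. On a tie the earlier entry wins.
 */
VmEntry *pick_active_vm(VmEntry *list, size_t count)
{
    VmEntry *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (best == NULL || list[i].exec_priority > best->exec_priority)
            best = &list[i];
    }
    return best;
}
```

If VM 2 holds a small "currently running" boost, it is selected. Boosting VM 3 by a larger amount makes VM 3 the active VM, and removing that boost hands execution back to VM 2.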
When a VxD is processing an event, asynchronous VMM services may be called and new events generated as the result of IRQ handling. When an IRQ is generated by the PIC, the handlers installed into the IDT by VPICD (Virtual PIC Device) call the Hw_Int_Proc for the IRQ. During non-virtualized IRQ processing, the default VPICD handlers then schedule VM events for interrupt simulation. VxDs must be aware that VPICD handles interrupts while events are processed, and disabling interrupts during event processing may be necessary for VxDs performing critical hardware processing. IRQ handling is detailed in Chapter 7.
Because a VM does not continue executing until all events in the global event list and VM event list have been dispatched, the results of event processing in a VxD can become stacked in the VM. For example, a VxD processing a global timeout event may schedule an asynchronous call to a procedure in a VM. During this processing, the VxD may request that the VM resume execution. Before resuming execution of the VM, VMM processes any remaining events on the event list. If this includes an interrupt event scheduled by VPICD, the VxD may request a simulated interrupt in the VM. Finally, when VMM returns to the VM, the actual results of the event processing are executed in the reverse of the order in which they were pushed onto the VM's stack: the interrupt service is processed first, before the callback scheduled by the timeout event.
When a VM is boosted, its order is changed in the queue. Normally, the active VM has a boost of Cur_Run_VM_Boost in its execution priority. Devices that require a VM to become active as the result of I/O or interrupt processing may use a device boost of High_Pri_Device_Boost to force the VM to become active. This is typically implemented using the Call_Priority_VM_Event service. Using this service, VMM adjusts the execution priority of the specified VM, and a callback is invoked when the VM has been activated. The VxD can then continue its processing for the VM. Figures 2.2 and 2.3 demonstrate the effect in the scheduling queue of changing the execution priority. The following code example demonstrates the technique of boosting a VM's execution priority:
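The stacking effect is ordinary last-in, first-out behavior, which a small C model makes concrete. The `VmCallStack` type and the two functions are hypothetical; they stand in for the nested frames that event processing builds on the VM's stack and are not VMM services.

```c
#include <stddef.h>

#define MAX_NESTED 8

/* Hypothetical stand-in for frames that event processing pushes
 * onto a VM's stack before the VM is allowed to run again. */
typedef struct {
    int    frames[MAX_NESTED];
    size_t depth;
} VmCallStack;

/* Record one more call the VM must make. Returns 0 on success,
 * -1 if the model's fixed-size stack is full. */
int push_call(VmCallStack *s, int tag)
{
    if (s->depth == MAX_NESTED)
        return -1;
    s->frames[s->depth++] = tag;
    return 0;
}

/* When the VM resumes, the most recently pushed call runs first.
 * Returns its tag, or -1 when nothing is left. */
int run_next(VmCallStack *s)
{
    if (s->depth == 0)
        return -1;
    return s->frames[--s->depth];
}
```

Pushing the timeout callback first and the simulated interrupt second means `run_next` yields the interrupt first and the timeout callback afterward, matching the order described in the text.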
    // Example of calling priority VM event in 'C'
    DWORD dwEventHandle;
    static PEVENTPROC pEventProc=NULL;

    if (!pEventProc)
        pEventProc=vmmwrapThunkEventProc(BoostEventProc);
    dwEventHandle=vmmCallPriorityVMEvent(hVM,High_Pri_Device_Boost,
                                         PEF_Wait_Not_Crit,dwRefData,pEventProc,0);

    // BoostEventProc - handler for VM event callback
    VOID BoostEventProc(DWORD hVM, DWORD dwRefData, PCRS_32 pCRS){
        TRACEMSGPARAM("VM #EAX is now active\r\n", hVM);
    } // end of BoostEventProc()
Figure 2.2: Scheduler queue prior to device boost
Figure 2.3: Scheduler queue after device boost.
The time-slice priorities are also used to determine how long the execution priority of a VM will be boosted. The boost value is constant -- that is, changing the time-slice priorities does not affect the amount of execution priority boost that a VM receives. When the next time-slice occurs and the VM's time-slice period has been exhausted, the VM is unboosted and the next VM in the time-slice scheduler's queue receives the execution priority boost.
The time-slice scheduler's execution priority boost for a VM is low compared to other high-priority event processing. Thus, the high-priority VM remains active until it is unboosted or until another VM of higher priority is found in the primary scheduler's queue.
<Push any C parameters> int Dyna_Link_Int dd VxD-ID SHL 16 + VxD_Service <Clean up C parameters> |
When the IDT dispatches the software interrupt to VMM, the dynalink routine patches the int 20h and the following dword with an indirect call to the VxD service handler. Stack parameters to the service are passed with the 'C' calling convention. VxDJmp is similar to VxDCall, with the exception that stack parameters cannot be used and the resulting code jumps to the VxD service handler, avoiding the extra cycles involved when the service call is followed by a return instruction.
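To make the layout of a dynalink call site concrete, here is a small C sketch that builds the dword and the six bytes that precede patching: the two-byte `int 20h` instruction (opcode CD 20) followed by the little-endian dword holding the device ID in the high word and the service number in the low word. The function names are invented for this illustration, and the ID and service values in the usage note are arbitrary.

```c
#include <stdint.h>

/* Device ID in the high word, service number in the low word
 * (the "VxD-ID SHL 16 + VxD_Service" expression from the text). */
uint32_t dynalink_id(uint16_t device_id, uint16_t service)
{
    return ((uint32_t)device_id << 16) | service;
}

uint16_t dynalink_device(uint32_t id)  { return (uint16_t)(id >> 16); }
uint16_t dynalink_service(uint32_t id) { return (uint16_t)(id & 0xFFFFu); }

/* Fill out[] with the unpatched call site: int 20h, then the dword
 * stored least significant byte first, as the 386 stores it. */
void emit_dynalink(uint8_t out[6], uint16_t device_id, uint16_t service)
{
    uint32_t id = dynalink_id(device_id, service);

    out[0] = 0xCD;                      /* INT imm8 opcode */
    out[1] = 0x20;                      /* interrupt number 20h */
    out[2] = (uint8_t)(id & 0xFF);
    out[3] = (uint8_t)((id >> 8) & 0xFF);
    out[4] = (uint8_t)((id >> 16) & 0xFF);
    out[5] = (uint8_t)((id >> 24) & 0xFF);
}
```

For a device ID of 0001h and service 0002h the dword is 00010002h, and the call site bytes are CD 20 02 00 01 00. Six bytes is also the size of an indirect near call through a 32-bit address, which is why the patch described above fits in place.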
Under some 386 'C' compilers, you cannot generate the appropriate in-line assembly instructions to duplicate this interface and/or load the registers required by the service. Consequently, you need to use .ASM thunks to provide a 'C' callable interface. Similarly, replacement VxDs (for example, a replacement VCD) may require register-parameter passing, and an assembly language front-end is necessary. The VDDVGA sample was written in 'C' and demonstrates the techniques required to interface to some of these services.
Note: The complete VDDVGA sample sources written in 'C' can be found on the enclosed diskette in the C\VDDVGA directory. The VMM "wrapper" for VxDs written in 'C' can be found in the C\VMMWRAP directory. For more information on writing VxDs in 'C' see Chapter 10.
Note that the critical section does not halt scheduling of VMs; that is, other VMs may be scheduled while the critical section is claimed. If a second VM attempts to claim the critical section, the VM is suspended until the current critical section owner has released the critical claim. When a VM claims a critical section, the execution priority of the VM is adjusted by the predefined value of Critical_Section_Boost; the execution priority is restored when the critical section is released.
The critical section allows a VxD to prevent multiple VMs from entering the same piece of code. If two VMs are executing and interfacing to the same TSR and the TSR cannot handle multiple VMs calling simultaneously because it maintains global non-instanced data for the specific procedure, a VxD may wrap the V86 interrupt chain and claim a critical section prior to reflecting the interrupt to the VM. It releases the critical section when the interrupt has returned. This prevents two VMs from simultaneously entering the same interrupt routine in the TSR. The following example demonstrates hooking the V86 interrupt, watching for a specific signature, and claiming a critical section around the API call:
    ;Hook the V86 interrupt (Int 60h)
    BeginProc VSIMPLED_Sys_Critical_Init
        pushad
        mov     eax,60h
        mov     esi,OFFSET32 VSIMPLED_Int60_Hook
        VMMCall Hook_V86_Int_Chain
        popad
        clc
        ret
    EndProc VSIMPLED_Sys_Critical_Init

    ;Watches for the API signature. If found, claims
    ;a critical section and hooks the "back-end".
    BeginProc VSIMPLED_Int60_Hook, High_Freq
        cmp     [ebp.Client_AX],4257h
        jne     SHORT VIH_Exit
        pushad
        ;Claim the critical section but allow interrupts
        ;to be serviced if we block.
        mov     ecx,Block_Svc_Ints or Block_Enable_Ints
        VMMCall Begin_Critical_Section
        ;Hook the back end of the Int60 call.
        xor     eax,eax
        xor     edx,edx
        mov     esi,OFFSET32 VSIMPLED_Int60_Complete
        VMMCall Call_When_VM_Returns
        popad
    VIH_Exit:
        stc                     ;always chain
        ret
    EndProc VSIMPLED_Int60_Hook

    ;Completes the Int 60h handling by releasing the
    ;critical section and returning.
    BeginProc VSIMPLED_Int60_Complete, High_Freq
        VMMCall End_Critical_Section
        ret
    EndProc VSIMPLED_Int60_Complete
When it suspends a VM, a VxD causes the VM to be removed from the active queue and added to the inactive queue. The primary scheduler does not activate this VM until it is resumed. If a VxD suspends a VM that is currently active, an immediate task switch occurs and the execution path in the VxD halts at the Suspend_VM call. To see this, try using debug traces to "wrap" the call to the Suspend_VM service. The debug trace in front of this call is displayed, and a task switch occurs when the active VM is placed in the inactive queue (the VM with the highest priority becomes the active VM), after which global events and VM events are processed. When the suspended VM has been resumed, the debug trace after the Suspend_VM call in the VxD is displayed, as the execution path of the VM continues.
VMM provides services (Wait_Semaphore and Signal_Semaphore) that allow VxDs to block and unblock VMs, based on events occurring in the VxD that decrement a token count by signaling the semaphore. A VM waiting on a semaphore resumes when the token count is less than or equal to zero. Additionally, it is possible to specify that certain events can be processed in a blocked VM. The following list describes the flags associated with the Wait_Semaphore service:
Block_Enable_Ints | Forces interrupts to be enabled and serviced even if interrupts are disabled in the blocked VM. (Only relevant if Block_Svc_Ints or Block_Svc_If_Ints_Locked is specified.) |
Block_Poll | Causes the primary scheduler to not switch away from the blocked VM unless another VM has higher priority. |
Block_Svc_Ints | Service interrupts in the VM even if the virtual machine is blocked. |
Block_Svc_If_Ints_Locked | Same as Block_Svc_Ints with the additional requirement that the VMStat_V86IntsLocked flag is set. |
Figure 2.4 shows the flow control possible using the semaphore services. For example, a VxD can signal or wait on semaphores in response to API calls from both the V86 VM (DOS application) and from the PM VM (Windows application), allowing the VxD to control a data transfer channel through the VxD. Note: A complete sample demonstrating semaphore usage and DOS to Windows communication can be found on the enclosed diskette in the ASM\SEMAPHOR directory.
Figure 2.4: Possible design of semaphore implementation.
Asynchronous VMM Services | |
---|---|
Begin_Reentrant_Execution | Get_Time_Slice_Info |
Call_Global_Event | Get_VM_Exec_Time |
Call_Priority_VM_Event | Get_VMM_Reenter_Count |
Call_VM_Event | Get_VMM_Version |
Cancel_Global_Event | List_Allocate |
Cancel_VM_Event | List_Attach |
Close_VM | List_Attach_Tail |
Crash_Cur_VM | List_Deallocate |
End_Reentrant_Execution | List_Get_First |
Fatal_Error_Handler | List_Get_Next |
Fatal_Memory_Error | List_Insert |
Get_Crit_Section_Status | List_Remove |
Get_Crit_Status_No_Block | List_Remove_First |
Get_Cur_VM_Handle | Schedule_Global_Event |
Get_Execution_Focus | Schedule_VM_Event |
Get_Last_Updated_System_Time | Signal_Semaphore |
Get_Last_Updated_VM_Exec_Time | Test_Cur_VM_Handle |
Get_Next_VM_Handle | Test_Debug_Installed |
GetSetDetailedVMError | Test_Sys_VM_Handle |
Get_System_Time | Update_System_Clock |
Get_Sys_VM_Handle | Validate_VM_Handle |
Asynchronous Debugging Services | |
Clear_Mono_Screen | Is_Debug_Chr |
Debug_Convert_Hex_Binary | Log_Proc_Call |
Debug_Convert_Hex_Decimal | Out_Debug_Chr |
Debug_Test_Cur_VM | Out_Debug_String |
Debug_Test_Valid_Handle | Out_Mono_Chr |
Disable_Touch_1st_Meg | Out_Mono_String |
Enable_Touch_1st_Meg | Queue_Debug_String |
Get_Mono_Chr | Set_Mono_Cur_Pos |
Get_Mono_Cur_Pos | Test_Reenter |
In_Debug_Chr | Validate_Client_Ptr |
Asynchronous VxD Services | |
BlockDev_Command_Complete | VPICD_Get_Complete_Status |
BlockDev_Send_Command | VPICD_Get_IRQ_Complete_Status |
DOSMGR_Get_DOS_Crit_Status | VPICD_Get_Status |
PageFile_Read_Or_Write | VPICD_Phys_EOI |
VPICD_Call_When_Hw_Int | VPICD_Physically_Mask |
VPICD_Clear_Int_Request | VPICD_Physically_Unmask |
VPICD_Convert_Handle_To_IRQ | VPICD_Set_Auto_Masking |
VPICD_Convert_Int_To_IRQ | VPICD_Set_Int_Request |
VPICD_Convert_IRQ_To_Int | VPICD_Test_Phys_Request |
VPICD_Force_Default_Behavior | VTD_Update_System_Clock |
VPICD_Force_Default_Owner | |
The VMM implements two memory managers. The V86MMGR VxD manages memory for V86-mode applications, including Expanded Memory Specification (EMS) and Extended Memory Specification (XMS), and the Memory Manager (MMGR) provides services such as GDT/LDT management, global heap management, physical memory management, protected mode address translation, and V86 page management, including V86 address mapping and allocation.
If you are writing a virtual display device or writing a VxD for a device requiring contiguous physical memory (such as devices using DMA transfers), you need to implement some form of memory management. Additionally, certain memory management implementations in your VxD such as memory mapped devices may require knowledge of the way the 80386 implements memory management using page tables.
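As background for the page-table knowledge mentioned above: with 4 KB pages, the 80386 splits a 32-bit linear address into a 10-bit page directory index, a 10-bit page table index, and a 12-bit offset within the page. The C sketch below performs that split and counts how many pages a buffer touches, which is the kind of arithmetic needed before locking or mapping a region. The type and function names are hypothetical and not part of any VMM interface.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u
#define PAGE_SIZE  (1u << PAGE_SHIFT)      /* 4096 bytes */

typedef struct {
    uint32_t dir_index;    /* bits 31..22: page directory entry */
    uint32_t table_index;  /* bits 21..12: page table entry */
    uint32_t offset;       /* bits 11..0:  byte within the page */
} LinearParts;

/* Break a linear address into the three fields the 386 paging unit uses. */
LinearParts split_linear(uint32_t linear)
{
    LinearParts p;

    p.dir_index   = linear >> 22;
    p.table_index = (linear >> PAGE_SHIFT) & 0x3FFu;
    p.offset      = linear & (PAGE_SIZE - 1u);
    return p;
}

/* Number of 4 KB pages touched by the byte range that starts at
 * linear and is length bytes long. A zero-length range touches none. */
uint32_t pages_spanned(uint32_t linear, uint32_t length)
{
    uint64_t end_rounded;

    if (length == 0)
        return 0;
    end_rounded = (uint64_t)(linear & (PAGE_SIZE - 1u)) + length
                  + (PAGE_SIZE - 1u);
    return (uint32_t)(end_rounded >> PAGE_SHIFT);
}
```

For example, linear address 00403025h decomposes into directory index 1, table index 3, and offset 025h, and a two-byte buffer starting at 1FFFh spans two pages because it straddles a page boundary.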
While each VM has its own memory and linear address space, any VM that is presently executing is also mapped into the first megabyte of the linear address space. The MMGR performs this mapping on each task switch by updating the page tables to reflect the new mapping of the lower linear address space. Figure 3.1 shows a possible memory configuration with multiple VMs.
Figure 3.1: VMM Memory Map
    //Allocate part of VM control block for VDD usage
    dwVidCBOff=vmmAllocateDeviceCBArea(sizeof(VDDCB),0);
    if (dwVidCBOff==NULL){
        vmmDebugout("VDD ERROR: Could not allocate control block area!\r\n");
        vddFatalMemoryError();
        return FALSE;
    }
    pSysVMCB=(PVDDCB)(hVM+dwVidCBOff);

For each VM, VMM allocates a control block containing vital information; it is located at offset zero from the VM handle. VMM's control block has the following structure:

    //VM control block structure (VMM)
    typedef struct tagVMMCB{
        DWORD CB_VM_Status;
        DWORD CB_High_Linear;
        DWORD CB_Client_Pointer;
        DWORD CB_VMID;
    }VMMCB, *PVMMCB;
Thus, given a VM handle, a VxD can obtain the VM's ID using the following method:
DWORD dwVMID; dwVMID=((PVMMCB)hVM)->CB_VMID; |
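The relationship between a VM handle, VMM's own fields, and a device's private area can be tried out in portable C by letting an ordinary buffer stand in for the control block. This is a sketch under that assumption: `VmmCb` mirrors the four DWORD fields listed above, while `device_cb_area` and `vm_id_from_handle` are hypothetical helpers, not services exported by VMM.

```c
#include <stdint.h>
#include <string.h>

/* Same four fields as the VMM control block shown above. */
typedef struct {
    uint32_t CB_VM_Status;
    uint32_t CB_High_Linear;
    uint32_t CB_Client_Pointer;
    uint32_t CB_VMID;
} VmmCb;

/* A VM handle is the address of the VM's control block, so a device's
 * private area is simply "handle + offset", where the offset is the
 * value the device obtained when it reserved its area. */
void *device_cb_area(void *vm_handle, uint32_t cb_offset)
{
    return (uint8_t *)vm_handle + cb_offset;
}

/* VMM's own fields start at offset zero from the handle. */
uint32_t vm_id_from_handle(const void *vm_handle)
{
    VmmCb cb;

    memcpy(&cb, vm_handle, sizeof cb);
    return cb.CB_VMID;
}
```

In a test harness the "handle" can be the address of a zeroed array large enough for the VMM fields plus the device area. Because the same offset applies to every VM, a VxD stores the offset once and adds it to whichever VM handle it is given.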
The low memory (interrupt vector table, BIOS & DOS data, and so forth) for each VM is located in high linear address space along with the rest of the memory for that VM. It is preferable to access VM memory using the high linear addresses, as these will not change. If a task switch occurs during memory reads or writes to a low linear address, your VxD may access an invalid address.
Client_Ptr_Flat is a macro that sets up a call to the Map_Flat service:
Client_Ptr_Flat esi,DS,DX |
which expands to:
    push    eax
    mov     ax,Client_DS*100h + Client_DX
    VMMCall Map_Flat
    mov     esi,eax
    pop     eax
The actual address mapping magic is performed in VMM's Map_Flat service. The following algorithm is used by Map_Flat to map the pointer to a 32-bit flat offset:
    mov  esi,[ebp.Client_EDX]
    mov  eax,[ebp.Client_DS]
    if (VM is V86 mode)
        shl   eax,4
        movzx esi,si            ;zero high order offset
        add   eax,esi
        add   eax,[ebx.CB_High_Linear]
    else (VM is prot. mode)
        if (!32-bit)
            movzx esi,si
        eax = _Selector_Map_Flat( hVM, [ebp.Client_DS], 0 )
        if (eax != -1)
            add eax,esi
            if (eax < 1 MB + 64KB)
                add eax,[ebx.CB_High_Linear]
    endif
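The V86 branch of the algorithm above is plain arithmetic and can be checked in C. The sketch below covers only that branch (the protected-mode branch depends on descriptor lookups that cannot be modeled here); the function names are invented, and the high-linear base used in the usage note is an arbitrary illustrative value.

```c
#include <stdint.h>

/* Real-mode style translation: segment * 16 plus the low 16 bits of
 * the offset (the high half of the client's register is ignored,
 * matching the movzx in the algorithm). */
uint32_t v86_to_linear(uint16_t segment, uint32_t offset)
{
    return ((uint32_t)segment << 4) + (offset & 0xFFFFu);
}

/* Rebase the low address to the VM's permanent high linear address
 * by adding the VM's CB_High_Linear value. */
uint32_t v86_to_high_linear(uint16_t segment, uint32_t offset,
                            uint32_t cb_high_linear)
{
    return v86_to_linear(segment, offset) + cb_high_linear;
}
```

B800:0010 maps to linear 000B8010h, and FFFF:FFFF maps to 0010FFEFh, the top of the region just above 1 MB that V86 code can reach. With a high-linear base of 81400000h, B800:0010 becomes 814B8010h, an address that remains valid for that VM even when a different VM is currently mapped into the first megabyte.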
The translation APIs are often used when accessing memory specified through V86 or PM APIs. Dual-mode (combination V86 and PM) APIs accessing application-provided buffers can be easily implemented using the Map_Flat service as demonstrated here:
;VSIMPLED_Get_Info, PMAPI, RMAPI
;
;DESCRIPTION:
;   This function is used to get information about the
;   VSIMPLED configuration.
;ENTRY:
;   Client_ES = selector/segment of VSIMPLEDINFO structure
;   Client_BX = offset of VSIMPLEDINFO structure
;EXIT:
;   IF carry clear
;       success
;       Client_AX = non-zero
;       Client_ES:BX ->filled in VSIMPLEDINFO structure
;   ELSE carry set
;       Client_AX = 0
;USES:
;   Flags, EAX, EBX, ECX, ESI, EDI
BeginProc VSIMPLED_API_Get_Info
        Assert_Client_Ptr ebp
        Trace_Out "VSIMPLED_API_Get_Info: called"
        Client_Ptr_Flat edi, ES, BX
        cmp     edi, -1
        je      SHORT GI_Fail
        lea     esi, [gVxDInfo]
        mov     ecx, size VSIMPLEDINFO
        cld
        shr     ecx, 1
        rep     movsw
        adc     cl, cl
        rep     movsb
        mov     [ebp.Client_AX],1       ;success
        clc
        ret
GI_Fail:
        Debug_Out "VSIMPLED_API_Get_Info: FAILED!!"
        mov     [ebp.Client_AX],0       ;failed
        stc
        ret
EndProc VSIMPLED_API_Get_Info
        VMMCall _HeapAllocate, <cbSize,dwFlags>
        or      eax,eax
        jz      SHORT Alloc_Failed
        mov     pDataBlock,eax
VMM allocates the memory on a doubleword boundary, but the cbSize parameter does not have to be dword aligned. The VxD is responsible for making sure that it stays within the bounds of the memory block, because VMM does not provide protection against accessing memory beyond the allocated range. The memory allocated by this service is fixed, and frequent allocating and freeing of memory may fragment the heap. Also, the memory block is not page-locked and may not be present when accessed. The PageSwap VxD resolves the not-present fault so your VxD can continue with memory accesses.
If you require page-locked memory and are using the heap management services, you can use the _LinPageLock service. This avoids the possibility of VMM discarding the physical memory between accesses by a VxD. However, because physical memory is a limited resource, you should only use this service in cases where page-locked memory is vital to your implementation.
_HeapGetSize, _HeapReAllocate, and _HeapFree are used to determine the block size and to reallocate and free the memory block, respectively. Using _HeapReAllocate may cause the address of the block to change, so VxDs must not rely on the address remaining constant. _HeapReAllocate can preserve the contents of the old block by copying them to the new block. The following flags are defined for use with this service:
HeapNoCopy | Do not copy the contents of the existing block. |
HeapZeroInit | Initialize the new bytes in the heap to zero. |
HeapZeroReInit | Fill all bytes in the block with zero. |
MMGR also provides low-level memory management services, allowing a VxD to allocate memory within a physical address range, to perform allocations within physical boundary constraints (not crossing 64k or 128k boundaries), and to allocate memory visible to all VMs or to only a single VM. Additionally, the page-fault handler for the allocated pages can be redirected to a specific handler in your VxD. (See the next section for more information on hooked pages.)
Allocation of pages with physical boundary restrictions and/or physical address limitations can only be performed during initialization. The following example demonstrates allocating a buffer for use with a DMA device:
;VSIMPLED_Allocate_DMA_Buffer
;
;DESCRIPTION:
;   This function allocates a buffer suitable for DMA transfers.
;   It attempts to allocate enough contiguous pages to hold the
;   requested size. If the request fails, the size is halved
;   until all allocation attempts have failed.
;ENTRY:
;   EAX = Desired size (in KB) of the DMA buffer to allocate.
;         This size cannot exceed 64.
;EXIT:
;   IF carry clear
;       EAX = memory handle of the memory block allocated
;       EBX = _physical address_ of memory block
;       ECX = actual size in _bytes_ of memory block allocated
;       EDX = _ring 0 linear address_ of memory block
;   ELSE carry set
;       EAX = EBX = ECX = EDX = 0
;USES:
;   Flags, EAX, EBX, ECX, EDX
BeginProc VSIMPLED_Allocate_DMA_Buffer
        cmp     eax,64
        jle     SHORT ADB_Start
        Debug_Out "Requested size #EAX too big!"
        mov     eax,64
ADB_Start:
        add     eax,3                   ;round up to get
        shr     eax,2                   ;# of pages
ADB_Allocate_DMA_Buffer_Loop:
        mov     ebx,eax                 ; EBX = # of pages to allocate
                                        ; (examples:         3      7     11
                                        ;                  12K    28K    44K)
        dec     eax                     ; # pages - 1      10b   110b  1010b
        bsr     cx, ax                  ; max power of 2     1      2      3
        inc     cl                      ; shift cnt          2      3      4
        mov     eax, 1
        shl     eax, cl                 ; mask + 1        100b  1000b 10000b
        dec     eax                     ; mask             11b   111b  1111b
                                        ; alignment        16K    32K    64K
        mov     ecx, ebx
        Trace_Out "pages=#ECX alignment=#EAX"
        ; EAX = alignment mask for allocation
        ; ECX = number of pages to allocate
        push    ecx
        VMMcall _PageAllocate, <ecx,PG_SYS,0, eax,\
                                0, 0FFFh, ebx,\
                                <PageUseAlign + PageContig + PageFixed>>
        pop     ecx
        or      eax, eax
        jnz     short ADB_Success
        Trace_Out "Allocation failed! pages=#ECX"
        mov     eax, ecx
        shr     eax, 1
        jnz     short ADB_Allocate_DMA_Buffer_Loop
        xor     ebx, ebx
        xor     ecx, ecx
        stc
        ret
ADB_Success:
        shl     ecx,12                  ; pages-->bytes
        ;Returns:
        ;   EAX = memory handle of the memory block allocated
        ;   EBX = _physical address_ of memory block
        ;   ECX = size in _bytes_ of memory block allocated
        ;   EDX = _ring 0 linear address_ of memory block
        clc                             ; success
        ret
EndProc VSIMPLED_Allocate_DMA_Buffer
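The alignment mask computed in the listing above is easier to verify in a high-level form. The following C sketch reproduces only the size and mask arithmetic, not the allocation itself. The helper names `kb_to_pages` and `dma_align_mask` are hypothetical; the mask is expressed in pages, so a mask of 3 means 16 KB alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a size in KB to a count of 4 KB pages, rounding up
 * (mirrors "add eax,3 / shr eax,2" in the listing). */
uint32_t kb_to_pages(uint32_t kb)
{
    return (kb + 3u) >> 2;
}

/* Hypothetical helper: alignment mask for a buffer of `pages` pages,
 * computed as (smallest power of two >= pages) - 1. The listing
 * limits requests to 64 KB, so at most 16 pages are expected. */
uint32_t dma_align_mask(uint32_t pages)
{
    uint32_t pow2 = 1;

    assert(pages >= 1 && pages <= 16);
    while (pow2 < pages)
        pow2 <<= 1;
    return pow2 - 1u;
}
```

With the example sizes from the listing, 3, 7, and 11 pages produce masks of 3, 7, and 15, which correspond to 16 KB, 32 KB, and 64 KB alignment. A buffer aligned this way never straddles a 64 KB physical boundary.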
To hook V86 pages, a range of pages is first assigned to the VxD:
//Buffer used for reserving pages
DWORD aVMPagesBuf[9];

vmmGetDeviceV86PagesArray(NULL,&aVMPagesBuf,NULL);
if (aVMPagesBuf[0xA0/32] & 0xFF00FFFF){
    vmmDebugOut("VDD ERROR: Pages already allocated\r\n");
    vmmFatalError(szVDD_Str_CheckVidPgs);
    return FALSE;
}
if (!vmmAssignDeviceV86Pages(0xA0,16,NULL,NULL)){
    vmmDebugOut("VDD ERROR: Could not allocate pages\r\n");
    vmmFatalError(szVDD_Str_CheckVidPgs);
    return FALSE;
}
if (!vmmAssignDeviceV86Pages(0xB8,8,NULL,NULL)){
    vmmDebugOut("VDD ERROR: Could not allocate pages\r\n");
    vmmFatalError(szVDD_Str_CheckVidPgs);
    return FALSE;
}
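The hard-coded mask 0xFF00FFFF in the listing above tests pages A0h through AFh and B8h through BFh in one step. A more general check can be sketched as follows, assuming (as the listing's `0xA0/32` indexing suggests) that the array holds one bit per V86 page, with page n stored in bit n mod 32 of DWORD n / 32. The function name `v86_pages_free` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define V86_PAGE_COUNT 0x110u   /* pages 0..10Fh: 1 MB + 64 KB */

/* Hypothetical helper: returns true when none of the `count` pages
 * starting at page number `first` is marked as assigned in the
 * nine-DWORD bitmap. */
bool v86_pages_free(const uint32_t bitmap[9], uint32_t first,
                    uint32_t count)
{
    for (uint32_t page = first; page < first + count; page++) {
        if (page >= V86_PAGE_COUNT)
            return false;       /* outside the V86 address space */
        if (bitmap[page / 32] & (1u << (page % 32)))
            return false;       /* page is already assigned */
    }
    return true;
}
```

A VxD could call such a helper once per range, for example `v86_pages_free(aVMPagesBuf, 0xA0, 16)`, rather than deriving a new mask constant by hand for each range.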
The V86 pages are then directed to a page fault handler:
//Put an .ASM front end on the page-fault procedure.
pVDD_PFault=VMWRAP_ThunkV86PHProc(VDD_PFault);
if (pVDD_PFault==NULL){
    vmmDebugout("VDD ERROR: Could not thunk VDD_PFault!\r\n");
    vmmFatalError();
    return FALSE;
}
//Hook graphics pages
for (i=0; i<16; i++)
    vmmHookV86Page(0xA0+i,pVDD_PFault);
//Hook text pages
for (i=0; i<8; i++)
    vmmHookV86Page(0xB8+i,pVDD_PFault);
During the Create_VM message processing, the V86 pages are marked as not available (not present and not writeable), using the _ModifyPageBits service:
vmmModifyPageBits(hVM,0xA0,16,~P_AVAIL,NULL,PG_HOOKED,NULL);
vmmModifyPageBits(hVM,0xB8,8, ~P_AVAIL,NULL,PG_HOOKED,NULL);
Note that it is necessary to specify the PG_HOOKED in the type parameter of the _ModifyPageBits service when clearing any of the PG_PRES, PG_USER or PG_WRITE bits.
After the initialization is complete, any read or write access of the hooked pages causes a page fault. The page fault handler is called with the faulting page number and the handle of the VM that caused the fault. It is the responsibility of the page fault handler to map memory into the page to resolve the fault or terminate the virtual machine. To map physical memory into the faulting page, use the following code:
//dwPhysPage is the physical page allocated using
//_PageAllocate with PG_HOOKED
vmmPhysIntoV86(dwPhysPage,hVM,uFaultPage,nPages,0);
Under some circumstances (such as low memory or another memory mapping error), it may be preferable to let the VM continue rather than crash it. In these cases, the system null page is assigned to this linear page:
vmmMapIntoV86(VMM_GetNulPageHandle(),hVM,uFaultPage,1,0,0);
The system null page is guaranteed to contain invalid information for any given VM. Do not rely on its contents for further processing in your VxD.
The VDD uses these techniques to allow multiple VMs to access the
video display hardware and maintain separate virtual displays
for virtual machines.
It is also possible to simulate ROM in a virtual machine using
hooked pages.
When the page fault occurs, map the pages using _PhysIntoV86
and clear the P_WRITE bit using
_ModifyPageBits.
Note, however, that when the VM restarts, the instruction causing the
fault also restarts.
If the VM was performing a write operation, a page fault would occur
immediately.
To resolve this loop, you would need to modify the VM client registers
to point the IP to the instruction following the faulting instruction.
Note: A sample VxD demonstrating these hooked memory techniques can be found in the C/VMEMTRAP directory on the enclosed diskette. Also, C/VDDVGA is a good source of memory management sample code.
A linear address in a paging operating system such as VMM is decoded as shown in Figure 3.2. Each page table entry (PTE) is 4 bytes in length and contains the access bits and physical address of the page. To examine the PTEs of the first megabyte of the active virtual machine, use page numbers in the range 0 to 10Fh. Page numbers of other virtual machines are computed using the CB_High_Linear field in the control block of the respective VM.
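The decoding can be illustrated with a short C sketch. The field widths are those of the 80386 two-level paging scheme (10-bit directory index, 10-bit table index, 12-bit offset); the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t dir_index;     /* bits 31..22: entry in the page directory */
    uint32_t table_index;   /* bits 21..12: entry in the page table     */
    uint32_t offset;        /* bits 11..0:  byte within the 4 KB page   */
} LinearParts;

/* Split a 32-bit linear address into its three paging fields. */
LinearParts decode_linear(uint32_t linear)
{
    LinearParts p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.offset      = linear & 0xFFFu;
    return p;
}

/* Page number as used by the VMM page services: address / 4096. */
uint32_t page_number(uint32_t linear)
{
    return linear >> 12;
}
```

For the text-mode video address B8010h, this yields directory index 0, table index B8h, and offset 10h, and the page number is B8h.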
Given a pointer to a memory block in a VM, a VxD can use the Map_Flat service to translate this address to a flat offset. Shifting this address right by 12 gives you the page number. To determine if pages in a hooked V86 range have been accessed or if data has been written to these pages use the following code:
        VMMCall _CopyPageTable, <guHookedPagesStart,\
                                 guNumHookedPages,\
                                 <OFFSET32 aPageBuf>,0>
        mov     ecx,guNumHookedPages
Check_Accessed_Or_Dirty:
        ; ECX counts down from the last page; entries are DWORDs
        test    dword ptr aPageBuf[ecx*4-4],P_ACC or P_DIRTY
        jz      SHORT Next_Page
        Trace_Out "Page #ECX of hooked range is dirty or has been accessed"
Next_Page:
        loop    Check_Accessed_Or_Dirty
Figure 3.2: Decoding a linear address to a physical address
        VMMCall _BuildDescriptorDWORDs,<dwLinAddr,cbSize,\
                                        RW_Data_Type,0,0>
        VMMCall _Allocate_GDT_Selector,<edx,eax,0>
The following equates are useful when building descriptor double-words:
;Common definitions for segment and control descriptors
D_PRES          segment is present in memory
D_NOTPRES       segment not present
D_DPL0          descriptor privilege level definitions
D_DPL1
D_DPL2
D_DPL3
D_SEG           segment descriptor (application type)
D_CTRL          control descriptor (system type)
D_GRAN_BYTE     limit in byte granularity
D_GRAN_PAGE     limit in page granularity
D_DEF16         default operation size is 16 bits (code)
D_DEF32         default operation size is 32 bits (code)
;Definitions specific to segment descriptors
D_CODE          code segment
D_DATA          data segment
D_RX            if code, readable
D_X             if code, executable only
D_W             if data, writeable
D_R             if data, read only
D_ACCESSED      segment accessed bit
;Useful segment definitions
RW_Data_Type    present R/W data segment
R_Data_Type     read-only data segment
Code_Type       code segment
For example, if an MS-DOS device driver maintains an input buffer, it may be useful to have the buffered input directed to the VM that was active when the buffer was filled. In this case, the VxD would query the device driver for the buffer address and maximum size and add an instance data area as shown here:
//Define instance data for instance data manager
INSTDATASTRUC Instance_Area={
    NULL,NULL,NULL,NULL,ALWAYS_Field};

//Specify instanced area as provided by DOS driver.
Instance_Area.dwInstLinAddr=pInputBuffer;
Instance_Area.dwInstSize=dwBufferSize;
if (!VMM_AddInstanceItem(&Instance_Area,0))
    goto DI_FatalError;
//Allocate a global V86 data area of 512 bytes
gdwGlobalArea=vmmAllocateGlobalV86DataArea(512,GVDADWordAlign);
if (gdwGlobalArea==NULL){
    vmmDebugout("Failed to allocate global V86 data area!\r\n");
    return FALSE;
}
vmmTraceOutParam("Allocated global area at #EAX\r\n",gdwGlobalArea);
The _Allocate_Global_V86_Data_Area service accepts the following flags:
GVDADWordAlign | Aligns the block on a doubleword boundary. |
GVDAHighSysCritOK | Informs the service that the VxD can handle a block that is allocated from high MS-DOS memory, such as UMBs or XMS. (Win 3.1 only) |
GVDAInquire | Returns the size in bytes of the largest block that can be allocated, given the requested alignment restrictions. (Win 3.1 only) |
GVDAInstance | Creates an instance data block, allowing the VxD to maintain separate blocks for each VM. |
GVDAPageAlign | Aligns the block on a page boundary. |
GVDAParaAlign | Aligns the block on a paragraph boundary. |
GVDAReclaim | Unmaps the physical pages in the block when mapping the system null page into the block. The physical pages are added to the free list when this value is specified. Only applies to blocks allocated on a page boundary. If this flag is not specified, it is up to the virtual device to reclaim these pages. |
GVDAWordAlign | Aligns the block on a word boundary. |
GVDAZeroInit | Fills the allocated block with zeros. |
In the VMEMTRAP sample, an unassigned V86 area is located and assigned
to the virtual device.
Pages are allocated for each new VM and "instanced" pages are simulated,
using hooked V86 pages and a page-fault handler.
Using the _Allocate_Global_V86_Data_Area service with
the GVDAInstance flag accomplishes the same thing in a single
service call,
with the exception that a specific V86 range cannot be specified.
The
VMEMTRAP sample on the enclosed diskette is designed
to demonstrate the techniques necessary to manage contention of
memory mapped devices.
_Allocate_Global_V86_Data_Area has limitations. For example, you cannot hook the page fault handler or modify the page bits of the V86 linear range returned by this service. Windows 3.x does not provide an interface to allow VxDs to monitor access of these pages other than viewing the page table entry access bits. A virtual device must provide an additional interface to manage VM contention of these pages using software interrupts or the VxD's API.
An unsupported method of providing page protection is to modify the page table entries (PTEs) directly and hook the Invalid_Page_Fault handler. The PTE contains the page frame address in the upper 20 bits (4k page aligned), and the lower 12 bits provide access restriction and accessed and/or dirty information.
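To make the PTE layout concrete, the following C sketch splits an entry into its frame address and flag bits. The bit positions are those defined by the 80386 hardware; the macro and function names are hypothetical and are not the equates used by the VMM include files. The sketch only manipulates values and does not touch live page tables.

```c
#include <stdbool.h>
#include <stdint.h>

/* 80386 page table entry bits (hardware-defined positions). */
#define PTE_PRESENT   0x001u
#define PTE_WRITE     0x002u
#define PTE_USER      0x004u
#define PTE_ACCESSED  0x020u
#define PTE_DIRTY     0x040u
#define PTE_FRAME     0xFFFFF000u   /* upper 20 bits: page frame address */

/* Physical address of the 4 KB page frame the entry points to. */
uint32_t pte_frame(uint32_t pte)
{
    return pte & PTE_FRAME;
}

/* True if the page has been read or written since the bits were cleared. */
bool pte_touched(uint32_t pte)
{
    return (pte & (PTE_ACCESSED | PTE_DIRTY)) != 0;
}

/* Return a copy of the entry with write access removed. */
uint32_t pte_write_protect(uint32_t pte)
{
    return pte & ~PTE_WRITE;
}
```

For an entry value of 000B8067h, the frame address is B8000h, the page is present, writable, user-accessible, accessed, and dirty, and the write-protected form of the entry is 000B8065h.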
Entry 0 in the page directory contains the physical address of the page table for the V86 address space of the active VM. By modifying these page table entries, you can modify the access rights to a given page in V86 address space.
You must use caution when accessing the page tables directly. Modifying not-present page tables or incorrectly modifying page access bits will cause the system to crash. In other words, "Ok, here's your weapon, first point it at your foot before pulling the trigger!"
Page protection is risky business when it is not directly supported by the host operating system, but some implementations require such information about how a VM is behaving. Take note: anything that you do now to provide this mechanism may not be supported in future releases of Windows. Use this information at your own risk and version-bind your code to the Microsoft Windows 3.1 VMM.
Figure 3.3: Possible design of TSR to VxD communication
For example, Int 21h commonly uses buffers referenced by DS:DX. The DOSMGR virtual device provides automatic buffer translation for most of these APIs by hooking Int 21h and translating the protected mode addresses so that DOS can understand the request without additional work required by the protected-mode application. Additionally VNETBIOS provides buffer mapping for NetBIOS data packets using V86MMGR services. These buffers are updated as the result of interrupt processing.
V86MMGR provides two types of services: buffer mapping and buffer translation. The mapping services update the page tables in all VMs so that the buffer is in global V86 space. The translation services copy a buffer to a V86 copy buffer and use the copy buffer's address to communicate with the DOS device driver code. The mapping services should be used only when the buffers will be updated asynchronously. Do not use the mapping services in place of the translation services to avoid copying the buffer's data; it is faster to copy data to and from a translation buffer than to map a buffer into multiple virtual machines.
V86MMGR does not directly support the mapping or translation of buffers referenced by pointers within a structure. The VxD is responsible for translating or mapping the buffer using V86MMGR services; it updates the structure to contain a valid V86 pointer and then passes the call to the DOS device driver.
When a VxD requires V86MMGR services, it must inform V86MMGR how many pages are required by using the V86MMGR_Set_Mapping_Info service. This service call must be made during initialization, preferably during Sys_Critical_Init processing. Alternatively, the VxD can call this service during Device_Init, if the VxD has an Init_Order less than V86MMGR_Init_Order.
When a call to the DOS device has been intercepted by the VxD, the VxD should determine whether the call is from V86 mode or protected mode. When a V86 call is trapped, buffer translation is not necessary, but mapping for asynchronously updated buffers may be necessary if the buffer is not located in global V86 address space. The _TestGlobalV86Mem service can be used to determine this.
To map pages to DOS addressable memory, a VxD calls V86MMGR_Map_Pages with the linear address and number of bytes to map. The returned linear address is guaranteed to be in the first megabyte and in global V86 address space. A map handle is also returned by this service. When the mapping region is no longer required, it is freed using the V86MMGR_Free_Page_Map_Region service with the map handle that was returned by V86MMGR_Map_Pages.
To translate a protected-mode buffer to V86 addressable memory, a VxD calls V86MMGR_Allocate_Buffer with the linear address of the buffer to translate and the number of bytes to allocate. If specified, this service copies data to the new buffer. Translation buffers are allocated in a "stack" fashion. In other words, the last buffer allocated must be the first buffer freed. When the translation buffer is no longer required, the V86MMGR_Free_Buffer service is used.
The following code fragment demonstrates how a software interrupt buffer is translated from a protected-mode to a real-mode driver:
; On entry Client_DS:Client_DX points to a buffer that is
; filled asynchronously and needs to be mapped globally.
; Eat the PM interrupt and reflect it to V86 mode.
; When the DOS device driver has completed the data
; transfer, the pages must be unmapped using the
; V86MMGR_Free_Page_Map_Region service.
BeginProc PM_Translate
        pushad
        test    [ebx.CB_VM_Status], VMStat_PM_Exec
        jz      SHORT PT_Bail
        VMMCall Simulate_Iret
        Client_Ptr_Flat esi, DS, DX
        movzx   ecx,[ebp.Client_CX]
        VxDCall V86MMGR_Map_Pages
        mov     hPageMap,esi
        shl     edi,12                  ; EDI = segment:offset form
        shr     di,12                   ; of the mapped address
        ; Simulate the interrupt to V86
        Push_Client_State
        VMMCall Begin_Nest_V86_Exec
        mov     [ebp.Client_DX],di
        shr     edi,16
        mov     [ebp.Client_DS],di
        mov     eax,Trapped_INT
        VMMCall Exec_Int
        VMMCall End_Nest_Exec
        Pop_Client_State
        clc
        jmp     SHORT PT_Exit
PT_Bail:
        Debug_out "Failure: Call not from protected mode!"
        stc
PT_Exit:
        popad
        ret
EndProc PM_Translate
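The two shift instructions that follow the V86MMGR_Map_Pages call convert the returned linear address into a real-mode segment:offset pair. The following C sketch shows the same conversion; the type and function names are hypothetical, and the result is only meaningful for addresses in the first megabyte.

```c
#include <stdint.h>

typedef struct {
    uint16_t segment;   /* linear address / 16                    */
    uint16_t offset;    /* remainder, always in the range 0..15   */
} SegOff;

/* Equivalent of "shl edi,12 / shr di,12": the segment ends up in the
 * high word and the offset in the low word of EDI. */
SegOff linear_to_segoff(uint32_t linear)
{
    SegOff p;
    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xFu);
    return p;
}
```

For example, the linear address B8017h becomes B801:0007. Shifting the segment left by four and adding the offset recovers the original address.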
V86MMGR provides a number of macros to define a script for use with the V86MMGR_Xlat_API service. A VxD defines a translation script in its data segment using these translation macros and calls the V86MMGR service to execute the script. This provides the VxD with a way to reduce the code size of V86 translation services and to use the optimized routines in V86MMGR.
The translation scripts are terminated by Xlat_API_Exec_Int or Xlat_API_Jmp_To_Proc. When the V86MMGR_Xlat_API service executes one of these commands, control returns to the VxD after the command has been executed. The following sample code demonstrates the use of these macros to translate a null-terminated string for a call to a DOS device driver:
; This code demonstrates a simple translation of a NULL
; terminated string in DS:SI to a local V86 buffer.
VxD_DATA_SEG
Xlat_ASCIIZ_Script:
        Xlat_API_ASCIIZ ds, si
        Xlat_API_Exec_Int 60h
VxD_DATA_ENDS

VxD_CODE_SEG
BeginProc Translate_Int60h_Buffer
        mov     edx,OFFSET32 Xlat_ASCIIZ_Script
        VxDJmp  V86MMGR_Xlat_API
EndProc Translate_Int60h_Buffer
VxD_CODE_ENDS
A VxD can export an API to protected-mode and V86 mode applications, extending the capabilities of a Windows or MS-DOS driver using supervisor code. For example, the VCD provides an interface to the Windows communications driver (COMM.DRV) to acquire a COM port. The COMM driver queries the VCD for the availability of a given port. If the port is in use by an MS-DOS application, the VCD returns failure. This API allows the COMM.DRV to provide intelligent information regarding the availability of COM ports to the calling application and provides a mechanism to manage device contention.
A VxD declares the API support by defining API procedure entry points in the DDB (see Chapter 1). In the following example, VSIMPLED_V86_API_Proc and VSIMPLED_PM_API_Proc procedures are the entry points for the API from V86 mode and protected mode, respectively. Additionally, the VxD must declare the device ID, as supplied by Microsoft.
Declare_Virtual_Device VSIMPLED,\
                       VSIMPLED_MAJOR_VER,\
                       VSIMPLED_MINOR_VER,\
                       VSIMPLED_Control_Proc,\
                       VSIMPLED_Device_ID,\
                       Undefined_Init_Order,\
                       VSIMPLED_V86_API_Proc,\
                       VSIMPLED_PM_API_Proc
An application acquires the entry point of the VxD by using Int 2Fh with AX=1684h and BX=VxD_Device_ID:
; Obtain the VxD entry point, if NULL, VxD is not present.
        xor     di,di                   ; preset ES:DI to NULL so an
        mov     es,di                   ; unanswered call can be detected
        mov     ax,1684h                ; get VxD API entry point
        mov     bx,VSIMPLED_Device_ID
        int     2fh
        mov     word ptr dwVxDEntry[0],di
        mov     word ptr dwVxDEntry[2],es
When this entry point is called by the application, the call is dispatched to the VxD, where it processes the request and returns control to the calling application.
Prior to requesting the VxD entry point from VMM, the application should first determine whether Windows/386 (VMM) is present. A Windows application can use the GetWinFlags() API. A DOS application needs to use the Int 2Fh, AX=1600h interface to determine whether VMM is present:
        mov     ax,1600h                ;Enhanced Windows Check
        int     2fh
        test    al,7fh                  ;VMM (Win386) present?
        jz      Not_Win386
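The test against 7Fh treats both 00h and 80h in AL as "not present", because neither value has any of the low seven bits set. A C sketch of the same decision, operating on the AL value returned by the interrupt, might look like this. The function name is hypothetical, and the sketch does not issue the interrupt itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors "test al,7fh / jz Not_Win386": returns true when the AL
 * value returned by Int 2Fh, AX=1600h indicates that VMM is running. */
bool enhanced_windows_present(uint8_t al)
{
    return (al & 0x7Fu) != 0;
}
```

A DOS application would pass the AL value obtained from its own Int 2Fh wrapper and skip the AX=1684h entry point request when the function returns false.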
VMM determines the operation mode of the VM by testing the status flags in the VM control block. It determines whether the call was made from V86 or protected mode and then dispatches the call at ring 0 to the appropriate handler, as declared in the DDB.
typedef struct tagCRS_32{
    DWORD Client_EDI;
    DWORD Client_ESI;
    DWORD Client_EBP;
    DWORD dwReserved_1;         //ESP at pushall
    DWORD Client_EBX;
    DWORD Client_EDX;
    DWORD Client_ECX;
    DWORD Client_EAX;
    DWORD Client_Error;         //DWORD error code
    DWORD Client_EIP;
    WORD  Client_CS;
    WORD  wReserved_2;          //(padding)
    DWORD Client_EFlags;
    DWORD Client_ESP;
    WORD  Client_SS;
    WORD  wReserved_3;          //(padding)
    WORD  Client_ES;
    WORD  wReserved_4;          //(padding)
    WORD  Client_DS;
    WORD  wReserved_5;          //(padding)
    WORD  Client_FS;
    WORD  wReserved_6;          //(padding)
    WORD  Client_GS;
    WORD  wReserved_7;          //(padding)
    DWORD Client_Alt_EIP;
    WORD  Client_Alt_CS;
    WORD  wReserved_8;          //(padding)
    DWORD Client_Alt_EFlags;
    DWORD Client_Alt_ESP;
    WORD  Client_Alt_SS;
    WORD  wReserved_9;          //(padding)
    WORD  Client_Alt_ES;
    WORD  wReserved_10;         //(padding)
    WORD  Client_Alt_DS;
    WORD  wReserved_11;         //(padding)
    WORD  Client_Alt_FS;
    WORD  wReserved_12;         //(padding)
    WORD  Client_Alt_GS;
    WORD  wReserved_13;         //(padding)
} CRS_32, *PCRS_32;
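The assembly listings in this chapter refer to 16-bit fields such as Client_AX and Client_DX, which overlay the low words of the 32-bit fields shown above. The following C sketch reproduces the first part of the structure with fixed-width types and shows one way to read and write those low words without disturbing the high words. The type name `ClientRegs` and the two helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading portion of the client register structure, in the same
 * order as the listing above. */
typedef struct {
    uint32_t Client_EDI, Client_ESI, Client_EBP, reserved_1;
    uint32_t Client_EBX, Client_EDX, Client_ECX, Client_EAX;
    uint32_t Client_Error, Client_EIP;
    uint16_t Client_CS, reserved_2;
    uint32_t Client_EFlags, Client_ESP;
    uint16_t Client_SS, reserved_3;
    uint16_t Client_ES, reserved_4;
    uint16_t Client_DS, reserved_5;
} ClientRegs;

/* Store a value in Client_AX, the low word of Client_EAX. */
void set_client_ax(ClientRegs *crs, uint16_t value)
{
    crs->Client_EAX = (crs->Client_EAX & 0xFFFF0000u) | value;
}

/* Read Client_DX, the low word of Client_EDX. */
uint16_t get_client_dx(const ClientRegs *crs)
{
    return (uint16_t)(crs->Client_EDX & 0xFFFFu);
}
```

With this layout, Client_EAX sits at offset 28 and Client_DS at offset 60 from the start of the structure, which can be confirmed with `offsetof`.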
The parameters to the API call, as set by the calling application, are contained in the CRS, and the current VM handle is in EBX.
A VxD usually defines a jump table to the specific API functions that perform the requested action and return the results to the API handler that reflects the results in the CRS. The following example code demonstrates how functions are dispatched from a VxD API procedure entry point:
; DEVICE DATA
VxD_DATA_SEG
DOSXFER_PM_Call_Table LABEL DWORD
        dd      OFFSET32 DOSXFER_Get_Version
        dd      OFFSET32 DOSXFER_PM_Enable_CallBacks
        dd      OFFSET32 DOSXFER_PM_Copy_Data
Max_DOSXFER_PM_Service equ ($ - DOSXFER_PM_Call_Table) / 4
VxD_DATA_ENDS

; EXPORTED API
BeginProc DOSXFER_PM_API_Proc, PUBLIC
        Trace_Out "In DOSXFER_PM_API_Proc"
        VMMCall Test_Sys_VM_Handle
IFDEF DEBUG
        jz      SHORT @f
        Debug_Out "DOSXFER_PM_API_Proc not from SYS VM"
@@:
ENDIF
        jnz     SHORT DOSXFER_PM_Call_Bad
        movzx   eax,[ebp.Client_DX]             ; function in DX
        cmp     eax,Max_DOSXFER_PM_Service
        jae     SHORT DOSXFER_PM_Call_Bad
        and     [ebp.Client_EFLAGS],NOT CF_Mask ; clear carry
        call    DOSXFER_PM_Call_Table[eax*4]    ; call service
        jc      SHORT DOSXFER_PM_API_Failed
        ret
DOSXFER_PM_Call_Bad:
IFDEF DEBUG
        Debug_Out "Invalid function #EAX on DOSXFER_PM_API_Proc"
ENDIF
DOSXFER_PM_API_Failed:
        or      [ebp.Client_EFLAGS],CF_Mask     ; set carry
        ret
EndProc DOSXFER_PM_API_Proc
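The dispatch pattern itself is independent of assembly language. The following C sketch shows the same structure: a bounds check on the function number, a table of handlers, and a carry flag in the client's flags that reports success or failure. Everything here is a toy model; the `Client` structure, the handler names, and the version value 0100h are hypothetical stand-ins and do not correspond to the DOSXFER sample's actual behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CF_MASK 0x0001u     /* carry flag: bit 0 of EFLAGS */

/* Toy stand-in for the client register structure. */
typedef struct { uint32_t eax, edx, eflags; } Client;

typedef bool (*ApiHandler)(Client *c);  /* returns false on failure */

static bool api_get_version(Client *c)      { c->eax = 0x0100u; return true; }
static bool api_enable_callbacks(Client *c) { (void)c; return true; }
static bool api_copy_data(Client *c)        { (void)c; return false; }

static const ApiHandler call_table[] = {
    api_get_version, api_enable_callbacks, api_copy_data
};
#define MAX_SERVICE (sizeof call_table / sizeof call_table[0])

/* Dispatch on the function number in DX; set carry on any failure. */
void api_proc(Client *c)
{
    uint32_t fn = c->edx & 0xFFFFu;

    if (fn >= MAX_SERVICE) {            /* invalid function number */
        c->eflags |= CF_MASK;
        return;
    }
    c->eflags &= ~CF_MASK;              /* assume success */
    if (!call_table[fn](c))
        c->eflags |= CF_MASK;           /* handler reported failure */
}
```

Computing the table size from the table itself, as both versions do, means that adding a handler never requires updating a separate limit constant.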
Modification of the client registers is made easy using these structure definitions:
;Copy the data structure to the VM and return the results
;of the function.
;EBX = VM handle, EBP = -> CRS
        Client_Ptr_Flat edi,ES,DI
        lea     esi,gDataStruc
        mov     ecx,size DATASTRUCT
        shr     ecx,1
        rep     movsw
        adc     cl,cl
        rep     movsb
        mov     [ebp.Client_CX],size DATASTRUCT
        mov     [ebp.Client_AX],1               ; SUCCESS!
        and     [ebp.Client_EFlags],NOT CF_Mask ; clc
A VxD may also update a buffer referenced in the CRS by obtaining a flat address using the mapping services discussed in Chapter 3.
;Determine the execution mode of the VM.
        test    [ebx.CB_VM_Status],VMStat_PM_Exec
        jz      SHORT API_VM_In_V86
        test    [ebx.CB_VM_Status],VMStat_PM_Use32
        jz      SHORT API_VM_In_PM16
API_VM_In_PM32:
        Debug_Out "VM calling from 32-bit protected mode."
        ret
API_VM_In_V86:
        Debug_Out "VM calling from V86 mode."
        ret
API_VM_In_PM16:
        Debug_Out "VM calling from 16-bit protected mode."
        ret
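The same two-step test can be written as a small C classification function. The bit values assigned to the two status flags below are placeholders chosen for illustration; real code should take VMStat_PM_Exec and VMStat_PM_Use32 from the VMM include files. The enumeration and function names are hypothetical.

```c
#include <stdint.h>

/* Placeholder bit values (assumption): use the definitions from the
 * VMM include files in real code. */
#define VMSTAT_PM_EXEC   0x00000020u
#define VMSTAT_PM_USE32  0x00000080u

typedef enum { MODE_V86, MODE_PM16, MODE_PM32 } VmMode;

/* Classify the VM's execution mode from its CB_VM_Status field. */
VmMode vm_exec_mode(uint32_t cb_vm_status)
{
    if (!(cb_vm_status & VMSTAT_PM_EXEC))
        return MODE_V86;        /* not executing a protected-mode app */
    if (!(cb_vm_status & VMSTAT_PM_USE32))
        return MODE_PM16;       /* protected mode, 16-bit             */
    return MODE_PM32;           /* protected mode, 32-bit             */
}
```

Note that the 32-bit flag is only examined once the protected-mode flag is known to be set, matching the order of the tests in the assembly listing.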
Note: In Windows 3.x, calling VxD procedures through VxD API calls from 32-bit code segments in the System VM can cause unexpected results when the offset of the return address of the calling routine is greater than 0xFFFF. This is a problem with the way that VMM determines the "32-bitness" of the calling application. The System VM is flagged for 16-bit protected mode operation, because KRNL386.EXE is responsible for the switch to protected mode when the Windows GUI is started. When 32-bit segments are allocated within the System VM and code within these segments calls VxD APIs, VMM determines that the calling application is 16-bit because of the VM flags. The return address is assumed to be 16 bits and is truncated. This is also a problem for protected-mode software interrupts hooked by a VxD. The only current work-around is to guarantee that the code calling the VxD has a return address with an offset no greater than 0xFFFF.
Callbacks can be used to simulate DOS devices that return a pointer to a jump table by allocating a global V86 table and stuffing the address of the callback, allocated using the Allocate_V86_Call_Back service, into this table. A segment and offset are returned that direct any calls to this routine to the VxD's callback procedure. The CRS reflects the current state of the VM when the callback entry point was called by the VM. A VxD can also provide a "chaining" interface to hooked software interrupts by using these services.
A VxD with "carnal" knowledge of a DOS device driver can intercept calls to this device by using the Install_V86_Break_Point service. This service patches the memory at the requested address with a call to the break point. When the break point is executed, the VxD can process the VM request as necessary and then return control by "bumping" the IP to the next instruction or by using Simulate_Far_Jmp to move the Client_CS:Client_IP to the correct address.
The nested execution services of VMM provide a controlled environment in which a VxD can cause a redirection of the execution path in a VM. A VxD saves the client registers, begins a nested execution block forcing a VM into V86 or protected mode, calls the necessary services to set up stack frames, and then resumes the VM execution. When the VM returns, the nested execution block is ended and the client registers are restored. Using this technique, a VxD can force the execution of code in TSRs, DOS applications, and even Windows procedures.
When calling routines in a VM other than the current VM, you may need to schedule a VM event to force a specific VM to become active. You may also need to determine the execution status of the VM and wait for critical sections to be completed, interrupts to be enabled, and so on. In these cases, you can use the Call_Priority_VM_Event service and begin the nested execution when the event is processed.
Note that when a VxD simulates calls to a VM and the execution has returned to the VxD, the VxD must copy the results from the CRS before restoring the client's state:
;Simulate a software interrupt to the current VM
        Push_Client_State
        VMMCall Begin_Nest_V86_Exec
        mov     [ebp.Client_AX], 4257h  ; specific function
        mov     [ebp.Client_BX], 4C57h  ; subfunction
        mov     eax, 60h
        VMMCall Simulate_Int
        VMMCall Resume_Exec
        VMMCall End_Nest_Exec
        movzx   eax,[ebp.Client_AX]     ; get return value
        Pop_Client_State
What magic occurs in this code that allows a VxD to simulate an interrupt call in a VM? The Push_Client_State macro allocates space on the stack and copies the current CRS to this block. Begin_Nest_V86_Exec modifies the VM state so that the execution block occurs in V86 mode. Simulate_Int builds an IRET frame and modifies the client's stack and CS:(E)IP to call the interrupt handler. Resume_Exec forces VMM to complete event processing and then resumes the execution of the VM. When the VM completes the execution block, control returns to the VxD and the End_Nest_Exec restores the VM's execution state. The Pop_Client_State macro restores the client's registers, as saved on the stack.
To call Windows functions, you must use a helper application or DLL to provide the procedure address to the VxD. The VxD can then use the nested execution services to simulate a far call to the procedure in the System VM. If a VM context switch is required (if the current VM is other than the System VM), the VxD must schedule a VM event to call the procedure. The following code sample calls the Windows PostMessage() function from a VxD assuming the PostMessage function pointer was obtained from the application or DLL:
;VSIMPLED_NotifyApp
;
;This routine notifies the Windows application through a
;call to the PostMessage() API.
;ENTRY:
;   EDX: contains the lParam of the message
;USES:
;   FLAGS
BeginProc VSIMPLED_NotifyApp, High_Freq
        VMMCall Test_Sys_VM_Handle
        je      SHORT VSIMPLED_PostEvent
NA_Schedule:
        push    ebx
        mov     eax, High_Pri_Device_Boost
        VMMCall Get_Sys_VM_Handle
        mov     ecx, PEF_Wait_For_STI OR PEF_Wait_Not_Crit
        mov     esi, OFFSET32 VSIMPLED_PostEvent
        xor     edi, edi
        VMMCall Call_Priority_VM_Event
        pop     ebx
        ret
EndProc VSIMPLED_NotifyApp

;VSIMPLED_PostEvent
;
;Called by the priority VM event dispatch routine or
;directly if System VM was already active.
;
;ENTRY:
;   EBX: The system VM handle
;   EBP: Client register structure
;   EDX: Reference data
;USES:
;   EAX, EDX, FLAGS
BeginProc VSIMPLED_PostEvent
        Trace_Out "In VSIMPLED_PostEvent"
        cmp     lpPostMessage, 0        ; Q: ptr == NULL?
        je      SHORT PE_Exit           ;    Y: can't call
        Push_Client_State
        VMMCall Begin_Nest_Exec
        mov     ax, NotifyWnd           ; handle to window
        VMMCall Simulate_Push
        mov     ax, NotifyMsg           ; notification msg
        VMMCall Simulate_Push
        xor     ax, ax
        VMMCall Simulate_Push           ; wParam is NULL
        mov     eax, edx
        shr     eax, 16
        VMMCall Simulate_Push           ; lParam is ref data
        mov     eax, edx
        VMMCall Simulate_Push
        movzx   edx, WORD PTR [lpPostMessage]
        mov     cx, WORD PTR [lpPostMessage + 2]
        VMMCall Simulate_Far_Call       ; call PostMessage()
        VMMCall Resume_Exec
        VMMCall End_Nest_Exec
        Pop_Client_State
PE_Exit:
        ret
EndProc VSIMPLED_PostEvent
Figure 5.1: Possible design of calling a TSR directly (at ring 0) from a VxD
This method makes some assumptions of the way TSRs are loaded in the system:
    ;VCALLTSR_Sys_Critical_Init
    ;DESCRIPTION:
    ;   Allocates necessary GDT selectors.
    ;ENTRY:
    ;   EBX = handle to Sys_VM
    ;   EDX = reference data from real-mode init
    ;EXIT:
    ;   Carry clear if no error, otherwise set if failure.
    ;USES:
    ;   Flags
    BeginProc VCALLTSR_Sys_Critical_Init
        Trace_Out "VCALLTSR: Sys_Critical_Init"
        pushad
        ; Note:
        ;   An assumption is made that CS:0 is the base of the TSR.
        ;   Since we don't have a segment size, we'll assume 1 page,
        ;   but this could be handled by using a pointer to a structure
        ;   within the TSR obtained from Exec_Int instead of using
        ;   Real_Mode_Init to gather the information.
        mov     eax, edx
        movzx   edx, ax                 ; low word = entry offset
        mov     dwTSR_Ring0_EIP, edx
        shr     eax, 16                 ; high word = real-mode segment
        shl     eax, 4                  ; segment * 16 = linear base
        push    eax                     ; save address
        ; 16-bit ring 0 code selector over the TSR
        VMMCall _BuildDescriptorDWORDs, <eax, <P_SIZE - 1>,\
                <Code_Type + D_DPL0>,\
                D_DEF16,\
                BDDExplicitDPL>
        VMMCall _Allocate_GDT_Selector, <edx, eax, 0>
        or      eax, eax
        jnz     SHORT SCI_GotCSSel
        pop     eax
        jmp     SHORT SCI_Failure
    SCI_GotCSSel:
        mov     dwTSR_Ring0_CS, eax
        pop     eax                     ; restore address
        ; 16-bit ring 0 read/write data selector, same base
        VMMCall _BuildDescriptorDWORDs, <eax, <P_SIZE - 1>,\
                <RW_Data_Type + D_DPL0>,\
                D_DEF16,\
                BDDExplicitDPL>
        VMMCall _Allocate_GDT_Selector, <edx, eax, 0>
        or      eax, eax
        jz      SHORT SCI_Failure
        mov     dwTSR_Ring0_DS, eax
        ; 32-bit code selector used to switch back to flat model
        VMMCall _BuildDescriptorDWORDs, <<OFFSET32 VCT_Switch>,\
                VCT_Switch_Size,\
                <Code_Type + D_DPL0>,\
                D_DEF32,\
                BDDExplicitDPL>
        VMMCall _Allocate_GDT_Selector, <edx, eax, 0>
        or      eax, eax
        jz      SHORT SCI_Failure
        mov     wTSR_Switch_To_Flat_CS, ax
        mov     eax, 500                ; 500 ms timeout
        xor     edx, edx                ; no data
        mov     esi, OFFSET32 VCALLTSR_TimeOut
        VMMCall Set_Global_Time_Out
        mov     hTimeout, esi
        popad
        clc
        ret
    SCI_Failure:
        ; Free any allocated selectors and exit
        mov     eax, dwTSR_Ring0_CS
        or      eax, eax
        jz      SHORT SCI_Failure_TryDS
        VMMCall _Free_GDT_Selector, <eax, 0>
    SCI_Failure_TryDS:
        mov     eax, dwTSR_Ring0_DS
        or      eax, eax
        jz      SHORT SCI_Failure_TryFlat
        VMMCall _Free_GDT_Selector, <eax, 0>
    SCI_Failure_TryFlat:
        movzx   eax, wTSR_Switch_To_Flat_CS
        or      eax, eax
        jz      SHORT SCI_Failure_Exit
        VMMCall _Free_GDT_Selector, <eax, 0>
    SCI_Failure_Exit:
        popad
        stc
        ret
    EndProc VCALLTSR_Sys_Critical_Init
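The address arithmetic at the top of VCALLTSR_Sys_Critical_Init can be modelled in a few lines of C. This is only an illustrative user-mode sketch, not VxD code, and the names `tsr_entry` and `decode_ref_data` are hypothetical. It assumes, as the listing does, that the high word of the reference data holds the TSR's real-mode segment and the low word holds the entry offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two values the listing derives from EDX. */
typedef struct {
    uint32_t linear_base;   /* segment * 16, used as the descriptor base */
    uint32_t entry_offset;  /* low word, stored in dwTSR_Ring0_EIP */
} tsr_entry;

/* Mirrors: movzx edx, ax / shr eax, 16 / shl eax, 4 */
tsr_entry decode_ref_data(uint32_t ref_data)
{
    tsr_entry e;
    e.entry_offset = ref_data & 0xFFFFu;
    e.linear_base  = (ref_data >> 16) << 4;
    return e;
}
```

For a reference value of 12340010h, the sketch yields a linear base of 12340h and an entry offset of 0010h.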
When the timeout procedure is called, the stack frames are created to call the TSR code directly. When the TSR returns, the VxD unwinds the stack frames to get back to the 32-bit flat model:
    ;VCALLTSR_TimeOut
    ;DESCRIPTION:
    ;   Event handler for global timeout. Calls TSR code directly
    ;   from ring 0.
    ;ENTRY:
    ;   EBX = Current VM handle
    ;   ECX = additional ms since timeout
    ;   EDX = reference data
    ;   EBP = &CRS
    ;EXIT:
    ;   Reschedules time-out.
    ;USES:
    ;   All registers.
    BeginProc VCALLTSR_TimeOut
        pushad
        mov     hTimeout, 0             ; clear handle
        Trace_Out "Setting up stack frames to call TSR."
        ; This stack frame is so we can get back to flat model.
        push    cs                      ; save CS
        mov     eax, OFFSET32 VCALLTSR_Back_To_Flat
        push    eax                     ; save EIP
        ; This stack frame will get us back to 32-bit code in
        ; the VxD and is addressable via 16:16 for the TSR.
        push    ds                      ; save off DS
        push    dwTSR_RETF_From_16
        ; This is the stack frame used to get us to the TSR
        ; code. Additionally, DS is set up with a R/W pointer
        ; to the same base address.
        mov     eax, dwTSR_Ring0_DS
        mov     ds, ax
        push    cs:dwTSR_Ring0_CS
        push    cs:dwTSR_Ring0_EIP
        retf                            ; go to the TSR
    VCT_Switch:
        pop     ds                      ; restore DS
        retf                            ; return to flat
    VCT_Switch_Size equ ($ - VCT_Switch) - 1
    VCALLTSR_Back_To_Flat:
        Trace_Out "Back in flat model. Return from TSR"
        ; Reschedule time out event
        mov     eax, 500                ; 500 ms timeout
        xor     edx, edx                ; no data
        mov     esi, OFFSET32 VCALLTSR_TimeOut
        VMMCall Set_Global_Time_Out
        mov     hTimeout, esi
        popad
        ret
    EndProc VCALLTSR_TimeOut
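I/O protection is a powerful feature provided by the 80386 and 80486 processors. When the current privilege level (CPL) is less than or equal to the I/O privilege level (IOPL), the I/O-sensitive instructions (IN, INS, OUT, OUTS, CLI, and STI) can be executed without restriction. Otherwise, the processor checks port accesses against the I/O permission map (IOPM) in the task state segment (TSS) and raises a general protection (GP) fault when the bit for the port is set.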
VMM keeps a copy of the IOPM for each VM (it is associated with the TSS and other task information). VxDs can enable or disable access to ports by modifying the IOPM using VMM services. Also, it is possible to trap ports in one VM and allow access to the hardware directly in another VM.
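The per-VM permission map can be pictured as a bitmap with one bit per port. The following C model is a sketch for illustration only: the `iopm` type and its helper functions are hypothetical names, not VMM services, and the model ignores the TSS and the fault path entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IOPM_PORTS 65536u

/* One bit per I/O port; a set bit means the access is trapped. */
typedef struct {
    uint8_t bits[IOPM_PORTS / 8];
} iopm;

void iopm_allow_all(iopm *m)
{
    memset(m->bits, 0, sizeof m->bits);
}

void iopm_trap(iopm *m, uint16_t port)
{
    m->bits[port >> 3] |= (uint8_t)(1u << (port & 7u));
}

void iopm_allow(iopm *m, uint16_t port)
{
    m->bits[port >> 3] &= (uint8_t)~(1u << (port & 7u));
}

int iopm_is_trapped(const iopm *m, uint16_t port)
{
    assert(m != NULL);
    return (m->bits[port >> 3] >> (port & 7u)) & 1u;
}
```

Keeping one such map per VM is what makes it possible to trap a port in one VM while another VM touches the hardware directly.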
The Install_IO_Handler and Install_Mult_IO_Handlers services install handlers that are called when the GP fault handler has determined that I/O to the associated port has caused the fault. VMM provides the Enable_Local_Trapping, Enable_Global_Trapping, Disable_Local_Trapping, and Disable_Global_Trapping services to modify the IOPM of virtual machines to enable and disable access to the I/O ports.
I/O trapping is the primary method used to manage device contention. By allowing only one VM access to a hardware device address space, the VxD can manage accesses by other VMs. For cases of contention, a VxD can do one of the following: simulate the device I/O and submit the actual hardware request when the hardware is free; ignore the hardware access and return as though the hardware did not exist; or crash the VM attempting to access the hardware.
A VxD can simulate hardware that does not exist by virtualizing the device using a finite state machine (or other similar method) and returning the appropriate information to the requesting application.
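As a rough illustration of the finite-state-machine approach, the C sketch below models an imaginary device with a one-byte "set value" command. Everything here is hypothetical (the `vdev` type, the command byte 01h, and the two-state protocol); a real VxD would drive the same kind of state machine from its I/O trap handlers.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { VDEV_IDLE, VDEV_WAIT_ARG } vdev_state;

typedef struct {
    vdev_state state;
    uint8_t    value;   /* last argument latched by the "set" command */
} vdev;

#define VDEV_CMD_SET 0x01u   /* hypothetical command byte */

/* Handle a byte written by the VM to the virtual device's port. */
void vdev_out(vdev *d, uint8_t data)
{
    assert(d != 0);
    switch (d->state) {
    case VDEV_IDLE:
        if (data == VDEV_CMD_SET)
            d->state = VDEV_WAIT_ARG;   /* next byte is the argument */
        break;                          /* unknown commands are ignored */
    case VDEV_WAIT_ARG:
        d->value = data;
        d->state = VDEV_IDLE;
        break;
    }
}

/* Handle a byte read by the VM: return the latched value. */
uint8_t vdev_in(const vdev *d)
{
    return d->value;
}
```

Writing 01h followed by 5Ah latches 5Ah, so a later read returns 5Ah; a stray byte written while idle leaves the latched value alone.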
The Install_IO_Handler and Install_Mult_IO_Handlers services associate a callback (or table of callbacks) with an I/O port (or table of I/O ports). By default, global trapping is enabled, so any access to a trapped port causes a fault and the associated callback procedure is called.
An I/O table has the following format:
    VxD_IDATA_SEG
    Begin_VxD_IO_Table VTRAPIOD_Port_Table
        VxD_IO  TRAP_IO_IDX,  VTRAPIOD_IO_Index_Reg
        VxD_IO  TRAP_IO_DATA, VTRAPIOD_IO_Data_Reg
    End_VxD_IO_Table VTRAPIOD_Port_Table
    VTRAPIOD_Port_Table_Entries equ (($-VTRAPIOD_Port_Table)-\
            (SIZE VxD_IOT_Hdr)) / (SIZE VxD_IO_Struc)
    VxD_IDATA_ENDS
This table uses offsets from the base I/O address as the port address. When the base address of the hardware has been determined, the VxD can update the I/O table and install the handlers:
    ;VTRAPIOD_Device_Init
    ;
    ;DESCRIPTION:
    ;   Non critical system initialization procedure.
    ;ENTRY:
    ;   EBX = Sys VM Handle
    ;EXIT:
    ;   CLC if everything's A-OK, otherwise STC
    ;USES:
    ;   Flags.
    BeginProc VTRAPIOD_Device_Init
        Trace_Out "VTRAPIOD: Device_Init"
        pushad
        ; Build an I/O port table for Install_Mult_IO_Handlers
        ; using the base address.
        mov     ecx, VTRAPIOD_Port_Table_Entries
        mov     esi, OFFSET32 VTRAPIOD_Port_Table
        mov     edx, VTRAPIOD_Base_IO
    DI_Install_IO_Handlers:
        mov     edi, esi                ; save a copy in EDI
        add     esi, (size VxD_IOT_Hdr)
    DI_Bump_IO_Loop:
        add     [esi.VxD_IO_Port], dx   ; add port base to offset
        add     esi, (size VxD_IO_Struc)
        loop    DI_Bump_IO_Loop
        ; Tell VMM to trap ports.
        VMMCall Install_Mult_IO_Handlers
    ifdef DEBUG
        jnc     SHORT DI_Exit
        Debug_Out "VTRAPIOD: cannot trap ports!!"
    endif
    DI_Exit:
        popad
        ret
    EndProc VTRAPIOD_Device_Init
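The rebasing step performed by DI_Bump_IO_Loop is easy to check in isolation. The C sketch below mirrors it with a hypothetical `io_entry` type; the `handler_id` field merely stands in for the handler address that a real I/O table entry carries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t port;        /* offset before rebasing, absolute port after */
    int      handler_id;  /* stands in for the handler address */
} io_entry;

/* Add the detected base address to every entry in the table. */
void rebase_ports(io_entry *table, size_t count, uint16_t base)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        table[i].port = (uint16_t)(table[i].port + base);
}
```

With offsets 0 and 1 and a base of 300h, the table ends up describing ports 300h and 301h. Note that the step is not idempotent: running it twice adds the base twice, so it belongs in one-time initialization.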
When an I/O port within the given range has been accessed, the fault handler dispatches to the associated I/O handler. For this example, the index register simply stores the index if valid (on write) or returns the current index (on read):
    ;VTRAPIOD_IO_Index_Reg
    ;
    ;DESCRIPTION:
    ;   Handles IO trapping.
    ;   This is a virtual R/W index register.
    ;ENTRY:
    ;   EBX = VM Handle.
    ;   ECX = Type of I/O
    ;   EDX = Port number
    ;   EBP = Pointer to client register structure
    ;EXIT:
    ;   EAX = data input or output depending on type of I/O
    ;USES:
    ;   FLAGS
    BeginProc VTRAPIOD_IO_Index_Reg, High_Freq
        Dispatch_Byte_IO Fall_Through, <SHORT IIR_Out>
        mov     al, bIndex              ; read: return current index
        clc
        ret
    IIR_Out:
        cmp     al, VTRAPIOD_Max_Index  ; write: ignore invalid index
        ja      SHORT IIR_Exit
        mov     bIndex, al
    IIR_Exit:
        clc
        ret
    EndProc VTRAPIOD_IO_Index_Reg
The one drawback with this simple I/O trapping interface is that there is a single global virtual device. Multiple VMs can simultaneously (well, almost simultaneously) access this device and may inadvertently affect the processing of another VM by switching the index register while a different VM is updating an indexed data register. This is commonly referred to as device contention, and this VxD must be improved to properly handle contention between VMs. The next section discusses this topic in greater detail.
Note: The VTRAPIOD sample in the ASM\VTRAPIOD directory of the enclosed diskette demonstrates I/O trapping and dispatching techniques.
To avoid these problems, a VxD implements one of the following methods of device contention:
To implement this form of device contention, all I/O ports for the hardware device are trapped. When a VM accesses a trapped port, the handler routine checks to see whether the device has been assigned to a VM. If a contention is detected, the VxD may display a warning message using the Shell VxD's API and then return with carry set for all reads and writes to the hardware. If there is no current owner, the VxD assigns the device to the VM and disables the I/O trapping for the VM using the Disable_Local_Trapping service. When the VM terminates or when the hardware is explicitly released by the VM, the VxD re-enables the trapping for the VM, using the Enable_Local_Trapping service, and clears the owner status of the hardware.
The following sample code is contention management in its simplest form:
    ;VCONTEND_Check_Owner
    ;
    ;DESCRIPTION:
    ;   Checks the current VM owner; if none, assigns
    ;   device to VM. If the VM is an owning VM, returns
    ;   carry clear, otherwise it returns carry set.
    ;ENTRY:
    ;   EBX = VM Handle.
    ;EXIT:
    ;   CLC if owner OK, or STC if contention
    ;USES:
    ;   FLAGS
    BeginProc VCONTEND_Check_Owner, High_Freq
        push    eax
        mov     eax, hOwnerVM
        or      eax, eax
        jz      SHORT CO_Assign_To_VM
        cmp     eax, ebx
        jne     SHORT CO_Failure
    CO_Success:
        pop     eax
        clc
        ret
    CO_Assign_To_VM:
        mov     hOwnerVM, ebx
        jmp     SHORT CO_Success
    CO_Failure:
        pop     eax
        stc
        ret
    EndProc VCONTEND_Check_Owner

    ;VCONTEND_IO_Index_Reg
    ;
    ;DESCRIPTION:
    ;   Handles IO trapping.
    ;   This is a virtual R/W index register.
    ;ENTRY:
    ;   EBX = VM Handle.
    ;   ECX = Type of I/O
    ;   EDX = Port number
    ;   EBP = Pointer to client register structure
    ;EXIT:
    ;   EAX = data input or output depending on type of I/O
    ;USES:
    ;   FLAGS
    BeginProc VCONTEND_IO_Index_Reg, High_Freq
        call    VCONTEND_Check_Owner
        jc      SHORT IIR_Exit          ; contention: return carry set
        Dispatch_Byte_IO Fall_Through, <SHORT IIR_Out>
        mov     al, bIndex
        clc
        ret
    IIR_Out:
        cmp     al, VCONTEND_Max_Index
        ja      SHORT IIR_Exit
        mov     bIndex, al
        clc
    IIR_Exit:
        ret
    EndProc VCONTEND_IO_Index_Reg
Note that with this method of contention management, the hardware remains in the state the last owning VM left it in. You may decide to define an initial state for a VM in the VM control block and update the state when the VM releases the hardware. When a VM acquires the hardware, the state would be copied from the VM's control block to the hardware.
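The save-and-restore idea can be sketched as a small C model. The types `hw_state`, `vm_block`, and `device` are hypothetical stand-ins for the hardware registers, the per-VM control block data, and the VxD's global owner variable; the sketch assumes the whole device state fits in a structure that can simply be copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t index; } hw_state;       /* register image */
typedef struct { hw_state saved; } vm_block;      /* per-VM saved state */
typedef struct { vm_block *owner; hw_state regs; } device;

/* Returns 1 if vm owns the device afterwards, 0 on contention. */
int device_acquire(device *d, vm_block *vm)
{
    assert(d != NULL && vm != NULL);
    if (d->owner == vm)
        return 1;
    if (d->owner != NULL)
        return 0;
    d->owner = vm;
    d->regs = vm->saved;      /* load this VM's state into the hardware */
    return 1;
}

/* Save the hardware state back to the VM and clear the owner. */
void device_release(device *d, vm_block *vm)
{
    if (d->owner != vm)
        return;
    vm->saved = d->regs;
    d->owner = NULL;
}
```

With this arrangement, a VM that re-acquires the device finds the registers as it left them rather than as the previous owner left them.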
VxDs can use these techniques to translate common hardware interfaces to new or improved hardware interfaces and maintain the backward compatibility of the older platforms for MS-DOS applications.
To fully virtualize a hardware interface, your VxD may need to incorporate IRQ virtualization and/or DMA virtualization. These topics are covered in Chapters 7 and 8, respectively.
Note: The VCONTEND sample in the ASM\VCONTEND directory on the enclosed diskette demonstrates the virtualization of a simple hardware interface and manages contention between multiple virtual machines.
The Virtual Programmable Interrupt Controller Device (VPICD) provides an interface to hook (virtualize) IRQs, query information about the state of a hooked IRQ, simulate hardware interrupts to VMs, share interrupts, and handle interrupts in the System VM with a single ISR interface using the bimodal interrupt interface.
During initialization, the VPICD configures the PICs (slave and master), hooks the IDT entries, and establishes default handling for non-virtualized IRQs. The PICs are virtualized to all VMs. When a VM masks an interrupt, it is communicating with the VPICD and does not perform I/O directly to the PIC. VPICD provides services to affect the physical state of the PICs. It is strongly recommended that VxDs use this interface to change the physical state of a virtualized IRQ.
IRQ virtualization is recommended for hardware devices that use hardware interrupts as a form of communication with device drivers. There are several reasons for this recommendation:
The default hardware interrupt procedure (Hw_IntProc) simulates an interrupt to the current VM if the IRQ is unowned. When the IRQ is global, VPICD simulates the interrupt to the current critical section owner or, if there is no critical section owner, to the current VM. Also, interrupts simulated for global IRQs are nested in the VM until the nesting has been "unwound", but non-owned interrupts are always simulated to the current VM in all circumstances. When an interrupt is simulated to a VM (by a default IRQ handler or using the VPICD_Set_Int_Request service), the VM priority is boosted and the VM's IRET is hooked so that the IRET procedure is notified when the interrupt has been completed. These events only occur when the IRQ is not nested.
End-of-Interrupt results when the VM issues an EOI to the virtual PIC. The default EOI handler clears the virtual interrupt request and performs a physical EOI using the VPICD_Clear_Int_Request and VPICD_Phys_EOI services respectively.
By default each unowned or global interrupt procedure has a timeout of 500 ms. A VM timeout is scheduled to watch the interrupt processing time in a VM. If the ISR in the VM does not service the interrupt within the specified timeout period, VPICD continues execution as though the ISR had issued an IRET. The timeout is canceled when the VM issues an IRET (or the last IRET in a nested block).
VPICD simulates a level-triggered PIC. That is, when a virtual EOI occurs another interrupt will be simulated immediately unless the virtual interrupt request has been cleared by the VPICD_Clear_Int_Request service.
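The level-triggered behaviour can be illustrated with a deliberately small C model. The `virq` type and its functions are hypothetical, and the model ignores masking, the VM's interrupt flag, nesting, and timeouts; it shows only that an EOI re-delivers the interrupt while the request is still set.

```c
#include <assert.h>

typedef struct {
    int request;     /* virtual interrupt request is set */
    int in_service;  /* VM has not yet issued EOI */
    int delivered;   /* number of simulated interrupts so far */
} virq;

static void virq_try_deliver(virq *q)
{
    if (q->request && !q->in_service) {
        q->in_service = 1;
        q->delivered++;
    }
}

/* Model of setting the request (compare VPICD_Set_Int_Request). */
void virq_set_request(virq *q)   { assert(q != 0); q->request = 1; virq_try_deliver(q); }

/* Model of clearing the request (compare VPICD_Clear_Int_Request). */
void virq_clear_request(virq *q) { q->request = 0; }

/* Model of the VM issuing EOI to the virtual PIC. */
void virq_eoi(virq *q)           { q->in_service = 0; virq_try_deliver(q); }
```

In this model, an EOI issued without first clearing the request produces a second delivery, which is why an EOI procedure normally clears the request.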
    VPICD_IRQ_Descriptor STRUC
    VID_IRQ_Number          dw  ?
    VID_Options             dw  0
    VID_Hw_Int_Proc         dd  ?
    VID_Virt_Int_Proc       dd  0
    VID_EOI_Proc            dd  0
    VID_Mask_Change_Proc    dd  0
    VID_IRET_Proc           dd  0
    VID_IRET_Time_Out       dd  500
    VPICD_IRQ_Descriptor ENDS
Some of the elements of this structure require further detail:
The following sample code demonstrates the use of VPICD services to virtualize an IRQ:
    ;INIT DATA
    VxD_IDATA_SEG
    VIRQD_IRQ_Descriptor VPICD_IRQ_Descriptor <,,\
            OFFSET32 VIRQD_Hw_Int_Proc,,\
            OFFSET32 VIRQD_EOI_Proc,,,>
    VxD_IDATA_ENDS

    ;INIT CODE
    VxD_ICODE_SEG
    ;VIRQD_Device_Init
    ;
    ;DESCRIPTION:
    ;   Non critical system initialization procedure.
    ;ENTRY:
    ;   EBX = Sys VM handle
    ;EXIT:
    ;   CLC if everything's A-OK, otherwise STC
    ;USES:
    ;   Flags.
    BeginProc VIRQD_Device_Init
        Trace_Out "VIRQD: Device_Init"
        push    eax
        push    edi
        mov     edi, OFFSET32 VIRQD_IRQ_Descriptor
        mov     [edi.VID_IRQ_Number], VIRQD_Interrupt
        VxDCall VPICD_Virtualize_IRQ
    ifdef DEBUG
        jnc     SHORT @F
        Debug_Out "VIRQD: Unable to virtualize IRQ"
        jmp     SHORT DI_Exit
    else
        jc      SHORT DI_Exit
    endif
    @@:
        mov     hVirtIRQ, eax           ; save the IRQ handle
    DI_Exit:
        pop     edi
        pop     eax
        ret
    EndProc VIRQD_Device_Init
    VxD_ICODE_ENDS
When the hardware interrupt occurs, the following procedures simulate the interrupt to the current VM and clear the interrupt when the ISR issues an EOI to the virtual PIC:
    ;==================================
    ; HARDWARE INTERRUPT PROCEDURES
    ;==================================
    VxD_LOCKED_CODE_SEG
    ;VIRQD_Hw_Int_Proc
    ;
    ;DESCRIPTION:
    ;   Hardware interrupt handler. Called by VPICD.
    ;ENTRY:
    ;   EAX = IRQ handle
    ;   EBX = current VM handle
    ;EXIT:
    ;   CLC if processed, STC otherwise.
    ;USES:
    ;   Flags.
    BeginProc VIRQD_Hw_Int_Proc, High_Freq
        Trace_Out "<i"
        VxDCall VPICD_Set_Int_Request
        clc
        ret
    EndProc VIRQD_Hw_Int_Proc

    ;VIRQD_EOI_Proc
    ;
    ;DESCRIPTION:
    ;   Hardware interrupt handler. Called by VPICD.
    ;ENTRY:
    ;   EAX = IRQ handle
    ;   EBX = current VM handle
    ;EXIT:
    ;   Nothing.
    ;USES:
    ;   Nothing.
    BeginProc VIRQD_EOI_Proc, High_Freq
        Trace_Out "i>"
        VxDCall VPICD_Clear_Int_Request
        VxDCall VPICD_Phys_EOI
        ret
    EndProc VIRQD_EOI_Proc
    VxD_LOCKED_CODE_ENDS
Note that services called during the processing of the Hw_Int_Proc procedure must be declared asynchronous (see Chapter 2 for a complete list of asynchronous services). If a VxD requires the use of a non-asynchronous service to continue interrupt processing, the VxD must schedule a global event to continue. The debug version of WIN386.EXE notifies you when you attempt to call a non-asynchronous service during interrupt processing. Heed the warnings of VMM, lest your ignorance cause the system to crash.
Note that the VxD cannot assume that subsequent calls to other callback procedures specified in the IRQ descriptor structure are the result of an interrupt for the associated hardware device. The VxD should set a flag when it has simulated an interrupt to a VM and test against this flag when notifications from VPICD are processed. When the VxD processes the EOI_Proc it should clear the flag, perform the necessary EOI procedures, and then return.
When a VxD requests an interrupt for a VM using the VPICD_Set_Int_Request service, the interrupt simulation may not occur immediately. There are several conditions that do not allow an interrupt to be simulated immediately:
Note that using VPICD_Set_Int_Request does not guarantee that an interrupt will be simulated to a VM. For example, if a VM has masked and never unmasks the IRQ, the interrupt will not be simulated. Additionally, a call to VPICD_Clear_Int_Request before the interrupt has been simulated prevents the VM from receiving the interrupt.
The example also does not demonstrate proper techniques when processing hardware interrupts for device contention management. The VIRQD_Hw_Int_Proc should be expanded to first determine whether an owner VM exists and then simulate the interrupt to that VM, as follows:
    ;VIRQD_Hw_Int_Proc
    ;
    ;DESCRIPTION:
    ;   Hardware interrupt handler. Called by VPICD.
    ;   Simulates the interrupt to the hardware owner or
    ;   to the current VM if unowned.
    ;ENTRY:
    ;   EAX = IRQ handle
    ;   EBX = current VM handle
    ;EXIT:
    ;   CLC if processed, STC otherwise.
    ;USES:
    ;   EBX, Flags.
    BeginProc VIRQD_Hw_Int_Proc, High_Freq
        Trace_Out "<i"
        cmp     hOwnerVM, 0
        je      SHORT HIP_SetIt         ; unowned: use current VM
        mov     ebx, hOwnerVM           ; owned: target the owner
    HIP_SetIt:
        VxDCall VPICD_Set_Int_Request
        clc
        ret
    EndProc VIRQD_Hw_Int_Proc
A Hw_Int_Proc for servicing an interrupt directly might be similar to this:
    ;VIRQD_Hw_Int_Proc
    ;
    ;DESCRIPTION:
    ;   Hardware interrupt handler. First, EOI the PIC
    ;   so we avoid missing another IRQ generated by the
    ;   device. Call a procedure elsewhere in the VxD to
    ;   service the hardware device and then return.
    ;ENTRY:
    ;   EAX = IRQ handle
    ;   EBX = current VM handle
    ;   Interrupts are disabled.
    ;EXIT:
    ;   CLC if processed, STC otherwise.
    ;USES:
    ;   EBX, Flags.
    BeginProc VIRQD_Hw_Int_Proc, High_Freq
        Trace_Out "<i>"
        VxDCall VPICD_Phys_EOI
        call    VIRQD_Service_Hardware
        clc
        ret
    EndProc VIRQD_Hw_Int_Proc
In this example, VIRQD_Hw_Int_Proc does not set the interrupt request for the VM. The VIRQD_Service_Hardware procedure may set an interrupt request to the owning VM when a threshold has been reached. This depends strictly on the requirements of your hardware and the maximum amount of CPU load you wish to generate. The VxD could also use some other form of communication to a driver in a VM, such as nested execution or updating global memory buffers.
Additionally, the VIRQD_EOI_Proc would not perform a physical EOI of the PIC. Its only requirement would be to clear the interrupt request status for the VM if simulated interrupts are used to communicate with the VM's device driver. Note that interrupt simulation is an expensive procedure. Ring transitions and VM context switches are often a result of interrupt simulation, and reducing simulated interrupt generation will help reduce the total burden on the CPU.
The following services are available through the PM API of the VPICD to install and remove bimodal interrupt handlers:
The VPICD API can only be accessed via the protected mode API entry point. It is not available to V86 VMs. To access the VPICD API, a VM obtains the API entry point:
    VPICD_Device_ID         EQU 0003h
    VPICD_API_Get_Ver       EQU 0000h
    VPICD_Install_Handler   EQU 0001h
    VPICD_Remove_Handler    EQU 0002h
    VPICD_Call_At_Ring0     EQU 0003h

        xor     di, di
        mov     es, di
        mov     ax, 1684h               ; get API entry point
        mov     bx, VPICD_Device_ID     ; of the VPICD
        int     2fh
        mov     word ptr lpVPICDEntry, di
        mov     word ptr lpVPICDEntry + 2, es
        mov     ax, es
        or      ax, di
        jz      SHORT No_VPICD_API      ; ES:DI = 0:0 means no API
Under Windows 3.0, the VPICD entry point will be NULL, because it does not support any API functionality. If the entry point is not NULL, VPICD's version can be obtained:
    Get_VPICD_Version:
        mov     ax, VPICD_API_Get_Ver
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error
        cmp     ax, 30Ah                ; require version 3.10 or later
        jb      SHORT VPICD_Error
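The version word returned in AX packs the major version in the high byte and the minor version in the low byte, so 030Ah denotes version 3.10. The C helpers below are a hypothetical sketch of that decoding, with names chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

unsigned ver_major(uint16_t ax) { return (unsigned)(ax >> 8); }
unsigned ver_minor(uint16_t ax) { return (unsigned)(ax & 0xFFu); }

/* Returns 1 if the packed version is at least major.minor. */
int version_at_least(uint16_t ax, unsigned major, unsigned minor)
{
    assert(major <= 0xFFu && minor <= 0xFFu);
    return ax >= (uint16_t)((major << 8) | minor);
}
```

Because the major version occupies the more significant byte, comparing the packed words directly gives the same ordering as comparing major and minor separately.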
A DLL installs and removes a bimodal IRQ handler using the VPICD_Install_Handler and VPICD_Remove_Handler functions, respectively:
    Install_Bimodal_Handler:
        les     di, lpBIS               ; pointer to BIS struct.
        mov     ax, VPICD_Install_Handler
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error

    Remove_Bimodal_Handler:
        les     di, lpBIS               ; pointer to BIS struct.
        mov     ax, VPICD_Remove_Handler
        call    dword ptr lpVPICDEntry
        jc      SHORT VPICD_Error
In these routines, the Bimodal_Int_Struc (BIS) is referenced. This structure has the following format:
    Bimodal_Int_Struc STRUC
    BIS_IRQ_Number          dw  ?
    BIS_VM_ID               dw  0
    BIS_Next                dd  ?
    BIS_Reserved1           dd  ?
    BIS_Reserved2           dd  ?
    BIS_Reserved3           dd  ?
    BIS_Reserved4           dd  ?
    BIS_Flags               dd  0
    BIS_Mode                dw  0
    BIS_Entry               dw  ?
    BIS_Control_Proc        dw  ?
                            dw  ?
    BIS_User_Mode_API       dd  ?
    BIS_Super_Mode_API      dd  ?
    BIS_User_Mode_CS        dw  ?
    BIS_User_Mode_DS        dw  ?
    BIS_Super_Mode_CS       dw  ?
    BIS_Super_Mode_DS       dw  ?
    BIS_Descriptor_Count    dw  ?
    Bimodal_Int_Struc ENDS
The field definitions of this structure are detailed as follows:
BIS_IRQ_Number | VPICD installs a bimodal interrupt for the IRQ specified by this field when the VPICD_Install_Handler API is called. |
BIS_VM_ID | Contains the current VM ID when the interrupt handler specified by BIS_Entry is called. |
BIS_Next | Currently not used by the Windows 3.1 VPICD. |
BIS_Flags | Must be set to zero. |
BIS_Mode | Set to 0 to indicate user mode or 4 to indicate supervisor mode. This value can be used as an offset to obtain the appropriate user-mode or supervisor-mode BIS API handler. (Set by VPICD when calling the procedures defined by the BIS_Entry and BIS_Control_Proc offsets.) For example: mov bx, es:[di.BIS_Mode] (mode 0 = user, 4 = super), followed by call es:[bx][di.BIS_User_Mode_API]. |
BIS_Entry | Specifies the offset of the ISR from the CS specified in the BIS_User_Mode_CS field. When VPICD calls the interrupt handler for interrupt servicing, ES:DI points to this structure. (Filled by caller for the call to VPICD_Install_Handler.) |
BIS_Control_Proc | Specifies the offset of the control procedure from the CS specified in the BIS_User_Mode_CS field. The control procedure is currently not used by the Windows 3.1 VPICD, but should point to a dummy control procedure that performs a far return. (Filled by the caller for VPICD_Install_Handler.) |
BIS_User_Mode_API | Specifies the far address of the user-mode API procedure entry point. (Filled by VPICD after a call to VPICD_Install_Handler API.) |
BIS_Super_Mode_API | Specifies the far address of the supervisor mode API procedure entry point. (Filled by VPICD after a call to the VPICD_Install_Handler API.) |
BIS_User_Mode_CS | Specifies the selector of the user-mode code segment of the interrupt handler. The BIS_Entry and BIS_Control_Proc offsets must be relative to the code selector specified by this field. (Filled by caller for VPICD_Install_Handler.) |
BIS_User_Mode_DS | Specifies the selector of the user-mode data segment of the interrupt handler. The Bimodal_Int_Struc structure should be located in this segment. (Filled by caller for VPICD_Install_Handler.) |
BIS_Super_Mode_CS | VPICD stores the GDT alias of the user-mode CS selector in this field after a call to VPICD_Install_Handler. |
BIS_Super_Mode_DS | VPICD stores a GDT alias of the user-mode DS selector in this field after a call to VPICD_Install_Handler. |
BIS_Descriptor_Count | Specifies the number of EBIS_Sel_Struc structures immediately following the Bimodal_Int_Struc structure. VPICD creates a GDT alias for each of the selectors in the structures that follow. |
    EBIS_Sel_Struc STRUC
    EBIS_User_Mode_Sel      dw  ?
                            dw  ?
    EBIS_Super_Mode_Sel     dw  ?
    EBIS_Sel_Struc ENDS

EBIS_User_Mode_Sel | User-mode selector. |
EBIS_Super_Mode_Sel | GDT alias of the selector, created by VPICD after a call to VPICD_Install_Handler. |
VPICD automatically creates GDT aliases for the ISR code and data segments as specified in BIS_User_Mode_CS and BIS_User_Mode_DS, respectively. Additionally, the caller can request that VPICD create GDT aliases for a number of selectors specified by BIS_Descriptor_Count. The user-mode selectors are filled in an array of the EBIS_Sel_Struc structures immediately following the Bimodal_Int_Struc. The associated GDT aliases are returned in the EBIS_Super_Mode_Sel element of each of the EBIS_Sel_Struc structures. For example, the Windows 3.1 COMM driver uses this functionality to create GDT aliases of the receive and transmit queues.
A DLL creates a Bimodal_Int_Struc and fills the appropriate fields. When the IRQ occurs, VPICD calls the ISR directly at ring 0, regardless of the current VM. On entry to the ISR, the CS is set to the GDT alias of the ISR code segment and ES:DI is set to the GDT alias of the Bimodal_Int_Struc. If this structure is located in the data segment, you can make the data addressable by moving ES into DS.
The ISR executes at ring 0 (CPL=0) through a 16-bit GDT code segment alias. As with calling TSR code directly from a VxD, the provided stack is a Use32 segment, and parameter passing must reference the stack using the 32-bit registers (ESP and EBP). The ISR cannot switch to a different stack unless a ring 0 stack selector is created. Note that a DLL cannot legally create such a selector.
The ISR must return from the procedure with a far return and carry clear if the IRQ was serviced or carry set if the IRQ was not serviced. When the ISR is called directly by VPICD, it must not manipulate the PIC directly. Instead, VPICD provides services through the BIS_Super_Mode_API procedure to perform these operations:
    BIH_API_EOI         EQU 0000h
    BIH_API_Mask        EQU 0001h
    BIH_API_Unmask      EQU 0002h
    BIH_API_Get_Mask    EQU 0003h
    BIH_API_Get_IRR     EQU 0004h
    BIH_API_Get_ISR     EQU 0005h
    BIH_API_Call_Back   EQU 0006h
BIH_API_EOI | Equivalent to calling VPICD_Phys_EOI. |
BIH_API_Mask | Equivalent to calling the VPICD_Physically_Mask service. |
BIH_API_Get_IRR | Equivalent to calling the VPICD_Test_Phys_Request service. Returns carry set if the physical interrupt request is set. |
BIH_API_Get_ISR | Retrieves the in-service state of the IRQ. Returns with carry set if the IRQ is in service. |
BIH_API_Call_Back | Uses the Call_Priority_VM_Event service to schedule an event for the target VM specified in BX. When the event callback is processed, VPICD will use nested execution services to simulate a far call to the address specified by CX:DX. |
The BIH_API_Call_Back procedure is useful for calling routines that do not have GDT aliases or that must be executed in a specific VM. A common use of this service is to call a routine in the driver that posts a message using the PostMessage() Windows API.
Note: VMM schedules event services to process the callback in the specified VM. The callback is not executed synchronously. A driver should not post more than one event without notification that the event has been processed. If multiple events are posted without verifying that outstanding callbacks already exist, the VMM event services may run out of resources and crash the system.
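One simple way to honour the rule about outstanding callbacks is a pending flag that is set when a callback is requested and cleared by the callback routine itself. The C model below is a sketch under that assumption; `cb_guard` and its functions are hypothetical names, and in a real ISR the test-and-set would have to be safe against re-entry (for example, by running with interrupts disabled).

```c
#include <assert.h>

typedef struct {
    int pending;   /* non-zero while a callback is outstanding */
    int dropped;   /* requests skipped because one was pending */
} cb_guard;

/* Returns 1 if the caller may schedule a callback now, 0 otherwise. */
int cb_try_post(cb_guard *g)
{
    assert(g != 0);
    if (g->pending) {
        g->dropped++;
        return 0;
    }
    g->pending = 1;
    return 1;
}

/* Called by the callback routine once it has run in the target VM. */
void cb_done(cb_guard *g)
{
    g->pending = 0;
}
```

The ISR would request BIH_API_Call_Back only when `cb_try_post` returns 1, so at most one event is queued at any time.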
The Virtual DMA Device (VDMAD) provides services that allow a VxD to take control of a DMA channel. A VxD using these services can intercept the DMA requests and modify the VM state causing the VM to believe that the request completed. Also, it is possible to translate or modify the VM's request before the physical state of the DMA controller is updated. Additionally, by using these services, a VxD can add another level of hardware contention management or indirectly replace portions of VDMAD's default handling.
All DMA channels are virtualized by VDMAD to map DMA requests by drivers to the physical hardware. VDMAD validates the memory region supplied by the driver, and if necessary, allocates the region from an internal DMA buffer.
Certain restrictions imposed by the DMA controller require the region management of VDMAD:
VDMAD breaks up requests into partial DMA transfers to satisfy these requirements. DMA buffers submitted using the auto-init mode of the DMA controller cannot be broken; consequently, these requests must be submitted with regions adhering to the restrictions.
For this reason, auto-init-mode DMA requires special memory management on behalf of the device driver.
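Splitting a request into partial transfers can be sketched in C. The example assumes the classic ISA restriction that a single transfer on an 8-bit channel must not cross a 64 KB physical boundary; that boundary size and the function names are assumptions made for illustration, not a description of VDMAD's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_PAGE 0x10000u   /* assumed 64 KB boundary for 8-bit channels */

/* Length of the first chunk at phys that stays within one 64 KB page. */
uint32_t dma_first_chunk(uint32_t phys, uint32_t len)
{
    uint32_t room = DMA_PAGE - (phys & (DMA_PAGE - 1u));
    return len < room ? len : room;
}

/* Number of partial transfers needed to cover the whole region. */
unsigned dma_chunk_count(uint32_t phys, uint32_t len)
{
    unsigned n = 0;
    while (len > 0) {
        uint32_t c = dma_first_chunk(phys, len);
        assert(c > 0);
        phys += c;
        len  -= c;
        n++;
    }
    return n;
}
```

A 200h-byte region starting at 1FF00h straddles the boundary at 20000h and therefore needs two partial transfers. An auto-init buffer cannot be split this way, which is why the driver must allocate it so that a single chunk suffices.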
Note that this discussion does not cover advanced DMA topics, such as bus-mastering devices and DMA controllers supporting scatter-gather.
After the VM has unmasked the channel, VDMAD attempts to lock the memory region, as programmed by the VM. If it is unsuccessful, VDMAD buffers the DMA transfer and modifies the DMA controller's physical state.
VDMAD uses the VPICD_Hw_Int_Proc service to provide a watchdog event to poll for the DMA controller's terminal count when non-auto-init-mode DMA transfers are requested. When the DMA controller has completed the request, the necessary buffers are updated (if a read operation was requested and buffers were allocated) and the VM's virtual DMA state is updated to reflect the completed transfer.
A VxD can modify the DMA controller's virtual and physical states using the VDMAD_Set_Virt_State and VDMAD_Set_Phys_State services. These services are normally used with the handle of a DMA channel that the VxD has virtualized:
    ;Tell VDMAD that we want to know about this DMA controller.
        xor     eax, eax
        mov     [gdwDMAHandle], eax
        movzx   eax, gbDMAchannel
        mov     esi, OFFSET32 VSIMPLED_Virtual_DMA_Trap
        VxDCall VDMAD_Virtualize_Channel
        mov     [gdwDMAHandle], eax     ; mov leaves carry intact
        jc      SHORT VDC_Exit_Failure
When a VM has changed the virtualized DMA controller's mask state, VDMAD calls the supplied procedure, in this case VSIMPLED_Virtual_DMA_Trap. The VxD can modify the virtual state of the VM and then call the default handler, VDMAD_Default_Handler, to allow VDMAD to continue the region management as follows:
    ;VSIMPLED_Virtual_DMA_Trap
    ;
    ;DESCRIPTION:
    ;   Forces DMA_block_mode and then calls the default DMA handler.
    BeginProc VSIMPLED_Virtual_DMA_Trap, High_Freq
        VxDCall VDMAD_Get_Virt_State
        test    dl, DMA_requested
        jz      SHORT VDT_Exit
        test    dl, DMA_masked
        jnz     SHORT VDT_Exit
        ; Force block mode DMA, channel is requested and
        ; unmasked by the VM.
        and     dl, NOT (DMA_mode_mask)
        or      dl, DMA_block_mode
        xor     dh, dh
        VxDCall VDMAD_Set_Virt_State
    VDT_Exit:
        VxDCall VDMAD_Default_Handler
        ret
    EndProc VSIMPLED_Virtual_DMA_Trap
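The mask-and-set step that forces block mode is ordinary bit manipulation. The C sketch below assumes the mode bits follow the 8237 mode register layout, with the transfer mode in bits 7 and 6 and block mode encoded as 10b; the constant names and values are assumptions for illustration and should be checked against the DDK include files.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, following the 8237 mode register layout (bits 7-6). */
#define MODE_MASK   0xC0u
#define MODE_BLOCK  0x80u

/* Clear the transfer-mode field and select block mode,
   leaving the remaining mode bits untouched. */
uint8_t force_block_mode(uint8_t mode)
{
    uint8_t result = (uint8_t)((mode & ~MODE_MASK) | MODE_BLOCK);
    assert((result & MODE_MASK) == MODE_BLOCK);
    return result;
}
```

Under these assumptions, a mode byte of 45h (single mode) becomes 85h, with the low six bits preserved.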
If necessary, a VxD can handle the actual DMA buffer translation and program the physical state of the DMA controller. This type of virtualization requires the use of the VDMAD buffer copy and region management services (listed in Appendix A).
Additionally, a VxD can translate the DMA request to a replacement interface, such as those supplied by the PCMCIA hardware implementations. Again, the VxD must virtualize the DMA channel and process the notifications from VDMAD.
Although some of the buffer management details are discussed in the next section, you should investigate the VDMAD sources provided in the Microsoft Windows 3.1 Device Driver Kit for code samples and to develop a better understanding of the operation of VDMAD.
To request a DMA buffer from VDMAD and copy information from a VM to this buffer, the VxD uses the VDMAD_Request_Buffer and VDMAD_Copy_To_Buffer services:
    ;Request a buffer from VDMAD and copy from VM
    ;On entry, EAX is DMA handle, EBX is VM handle.
        VxDCall VDMAD_Get_Virt_State
        push    edx                     ; save mode for later
        push    ebx                     ; save VM for later
        ;ESI = linear address
        ;ECX = count
        ;DL/DH = mode/flags
        test    dl, DMA_requested
        jnz     SHORT Buffer_New
        test    dl, DMA_masked
        jnz     SHORT Buffer_CleanUp
        VxDCall VDMAD_Request_Buffer
        jc      SHORT Error_No_Buffer
        ;EDX now contains the physical address of the DMA buffer,
        ;so test the mode byte saved on the stack instead of DL.
        test    byte ptr [esp+4], DMA_type_read
        jz      SHORT Dont_Copy
        ;EBX = buffer handle
        ;ESI = linear region
        ;ECX = size
        ;EDI = offset
        xor     edi, edi
        VxDCall VDMAD_Copy_To_Buffer
        jc      SHORT Error_Copy
To prepare the hardware state, the VxD updates the region information and programs the physical state to the DMA controller. The VxD starts DMA transfer by unmasking the channel:
    Dont_Copy:
        pop     ebx
        VxDCall VDMAD_Set_Region_Info
        VxDCall VDMAD_Set_Phys_State
        ;Unmask the DMA channel to begin the transfer
        VxDCall VDMAD_UnMask_Channel
Note that these code fragments are very simple and incomplete. For instance, the VxD does not check to see whether the region can be locked by using the VDMAD_Lock_DMA_Region service before requesting the buffer from VDMAD.
When a DMA channel is unmasked using the VDMAD_UnMask_Channel service, the ownership of the DMA channel is assigned to the requesting VM. VDMAD sets up the watchdog event to modify the virtual channel state when the terminal count is reached for non-auto-init-mode transfers. When the watchdog event determines that the channel has reached terminal count, VDMAD virtually masks it. If the operation was a DMA write operation, the buffer is copied to the VM's linear address, as supplied with VDMAD_Set_Region_Info. The virtual count register is updated, the channel is physically masked, and the channel owner is set to NULL.
    ;Define hot keys for ctrl-pgup and ctrl-pgdn
        mov     al, 49h                 ; page-up
        mov     ah, ExtendedKey_B
        ShiftState <SS_Toggle_mask + SS_Either_Ctrl>, <SS_Ctrl>
        mov     cl, CallOnPress + CallOnRepeat + Local_Key
        mov     esi, OFFSET32 VSIMPLED_Hot_Key_Handler
        xor     edx, edx
        xor     edi, edi
        VxDCall VKD_Define_Hot_Key
        jc      SHORT Exit_Failure
        mov     ghhkCtrlPgUp, eax

        mov     al, 51h                 ; page-down
        mov     ah, ExtendedKey_B
        ShiftState <SS_Toggle_mask + SS_Either_Ctrl>, <SS_Ctrl>
        mov     cl, CallOnPress + CallOnRepeat + Local_Key
        mov     esi, OFFSET32 VSIMPLED_Hot_Key_Handler
        xor     edx, edx
        xor     edi, edi
        VxDCall VKD_Define_Hot_Key
        jc      SHORT Exit_Failure
        mov     ghhkCtrlPgDn, eax
To disable these keys by default, use the VKD_Local_Disable_Hot_Key service during the Sys_VM_Init and VM_Critical_Init message processing:
    VSIMPLED_Sys_VM_Init LABEL NEAR
    BeginProc VSIMPLED_VM_Critical_Init
        mov     eax, ghhkCtrlPgUp
        VxDCall VKD_Local_Disable_Hot_Key
        mov     eax, ghhkCtrlPgDn
        VxDCall VKD_Local_Disable_Hot_Key
        clc
        ret
    EndProc VSIMPLED_VM_Critical_Init
Once a hot key has been enabled in a VM, the VxD receives a notification from VKD whenever the hot key is pressed and processes it accordingly:
    BeginProc VSIMPLED_Hot_Key_Handler
        push    eax

        ;Turn off hot key mode in case we're going
        ;to expand this to force keys. Don't want
        ;to be in hot key mode when forcing keys to a VM.
        VxDCall VKD_Cancel_Hot_Key_State

        cmp     al, 49h
        jne     SHORT HK_PgDn

        ;Ctrl-PgUp pressed...
        Trace_Out "Control-PgUp pressed in VM #EBX"
        jmp     SHORT HK_Exit

    HK_PgDn:
        ;Ctrl-PgDn pressed...
        Trace_Out "Control-PgDn pressed in VM #EBX"

    HK_Exit:
        pop     eax
        ret
    EndProc VSIMPLED_Hot_Key_Handler
    ;This code snippet just forces PgDn and PgUp
    ;to the VM in place of Ctrl-PgDn and Ctrl-PgUp.
    ForceKey_Buffer_Down label byte
        db      51h, 0D1h
    ForceKey_Buffer_Down_Len equ $-ForceKey_Buffer_Down

    ForceKey_Buffer_Up label byte
        db      49h, 0C9h
    ForceKey_Buffer_Up_Len equ $-ForceKey_Buffer_Up

    BeginProc VSIMPLED_Hot_Key_Handler
        push    eax

        ;Don't want to be in hot key mode
        ;when forcing keys to a VM.
        VxDCall VKD_Cancel_Hot_Key_State

        cmp     al, 49h
        jne     SHORT HK_PgDn

        ;Ctrl-PgUp pressed...
        Trace_Out "Control-PgUp pressed in VM #EBX"
        mov     ecx, ForceKey_Buffer_Up_Len
        lea     esi, ForceKey_Buffer_Up
        jmp     SHORT HK_ForceEm

    HK_PgDn:
        ;Ctrl-PgDn pressed...
        Trace_Out "Control-PgDn pressed in VM #EBX"
        mov     ecx, ForceKey_Buffer_Down_Len
        lea     esi, ForceKey_Buffer_Down

    HK_ForceEm:
        VxDCall VKD_Force_Keys
    IFDEF DEBUG
        jnc     SHORT @F
        Debug_Out "VKD_Force_Keys failed!"
    @@:
    ENDIF
        pop     eax
        ret
    EndProc VSIMPLED_Hot_Key_Handler
Using the force keys service is quite simple, but determining which scan codes to send is probably the most time-consuming part of using this interface. To make determining the scan codes simpler, I have created a simple utility that watches INT 9h and displays the keystrokes to the screen until you press the <ESC> key. The code for the KEYDISP utility can be found on the accompanying disk in the ASM\KEYDISP directory.
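The two force-key buffers shown earlier (51h, 0D1h and 49h, 0C9h) follow the usual PC keyboard convention: the break code is the make code with bit 7 set. As a minimal sketch of that relationship, the hypothetical helper below (build_key_tap is an illustrative name, not a VKD service) builds such a pair for a non-extended key. It does not handle keys that need a prefix byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a VKD service): writes the make code and the
 * matching break code for one press-and-release of a non-extended key.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t build_key_tap(uint8_t make_code, uint8_t *buf, size_t buf_len)
{
    assert((make_code & 0x80u) == 0);       /* make codes have bit 7 clear */
    if (buf_len < 2)
        return 0;
    buf[0] = make_code;                      /* key pressed */
    buf[1] = (uint8_t)(make_code | 0x80u);   /* key released: bit 7 set */
    return 2;
}
```

Calling build_key_tap(0x51, buf, 2) fills the buffer with the same two bytes as ForceKey_Buffer_Down.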
The concept of writing VxDs in 'C' has been widely misunderstood. Writing VxDs in 'C' is not impossible; on the contrary, you can do it without a great deal of grief. Forget everything anyone has ever told you about writing VxDs in 'C' and open your mind. VxDs written in 'C' are the wave of the future, not just a passing fad.
VMM does not look in the object code of VxDs for magical embedded notations to determine whether the code was generated by a 'C' compiler or the magical MASM 5.10B assembler. When a good 386 32-bit 'C' compiler generates the necessary code, the LINK386 linker will link the objects and generate a proper executable, which can be called a VxD.
The main hurdle to overcome when writing VxDs in 'C' is that a great portion of VMM services require either parameter passing using registers or that the mystical dynalinking macro must be used to generate the code to call VxD or VMM services. Additionally, services declared by VxDs are created with tables hidden by the VMM.INC macros and the actual procedure entry points are renamed with a new prefix. But that doesn't mean that it's time to give up and return to assembly, only that you may not be able to write all of your VxD in 'C'. Some assembly may be required: I affectionately refer to this as MASM-tape. I'll provide the MASM-tape on the accompanying disk and some instruction and you can begin writing VxDs in 'C' almost immediately, assuming you have the rest of the necessary tools. I have been successful using the WATCOM C/386 V9.5 compiler to generate flat 32-bit code. The samples included on the diskette were created using this compiler.
The limitations and restrictions of writing a VxD in 'C' include the following:
    // code and data segment directives for init code
    #pragma code_seg("_ITEXT", "ICODE")
    #pragma data_seg("_IDATA", "ICODE")

    // code and data segment directives for pageable code
    #pragma code_seg("_TEXT", "PCODE")
    #pragma data_seg("_DATA", "PCODE")

    // code and data segment directives for locked code
    #pragma code_seg("_LTEXT", "LCODE")
    #pragma data_seg("_LDATA", "LCODE")
When developing the samples in 'C' for this book, I experienced problems with the WATCOM C/386 compiler using the #pragma code_seg directive and was forced to use command line options to define the segment and class names (see the sample makefiles for more information). Also, some 'C' compilers may not support multiple segment declarations in a single module. You may be required to create one module for initialization code and data, another for locked code and data and another for pageable code and data.
EXPORTS VSIMPLED_DDB @1 |
In order to maintain compatibility with this naming convention, the compiler must not generate the 'C'-style underscore prefix. The WATCOM C/386 compiler provides an option for disabling this naming convention.
The DDB structure, as defined using 'C', is as follows:
    #define DDK_Version 0x30A

    typedef struct tagVxD_Desc_Block
    {
        DWORD DDB_Next;                 // VMM reserved field
        WORD  DDB_SDK_Version;          // VMM reserved field
        WORD  DDB_Req_Device_Number;    // Required device number
        BYTE  DDB_Dev_Major_Version;    // Major device number
        BYTE  DDB_Dev_Minor_Version;    // Minor device number
        WORD  DDB_Flags;                // Flags init calls complete
        BYTE  DDB_Name[8];              // Device name
        DWORD DDB_Init_Order;           // Initialization Order
        DWORD DDB_Control_Proc;         // Offset of control procedure
        DWORD DDB_V86_API_Proc;         // Offset of API procedure
        DWORD DDB_PM_API_Proc;          // Offset of API procedure
        DWORD DDB_V86_API_CSIP;         // CS:IP of API entry point
        DWORD DDB_PM_API_CSIP;          // CS:IP of API entry point
        DWORD DDB_Reference_Data;       // Ref. data from real mode
        DWORD DDB_Service_Table_Ptr;    // Pointer to service table
        DWORD DDB_Service_Table_Size;   // Number of services
    } DDB;
The following example declares a DDB within a 'C' module:
    #include <vmm.h>
    #include "vsimpled.h"

    #pragma data_seg("_LDATA", "CODE")

    /*
     * V I R T U A L   D E V I C E   D E C L A R A T I O N
     */
    DDB VSIMPLED_DDB = {
        NULL,                           // must be NULL
        DDK_Version,                    // DDK_Version
        VSIMPLED_Device_ID,             // Device ID
        VSIMPLED_Major_Ver,             // Major Version
        VSIMPLED_Minor_Ver,             // Minor Version
        NULL,
        "VSIMPLED",
        Undefined_Init_Order,
        (DWORD) vmmwrapVxDControlProc,
        NULL, NULL, NULL, NULL, NULL, NULL, NULL };
To provide an interface to the register parameters for VxD control procedures, an assembly wrapper is necessary. This procedure creates a 'C' stack frame and calls the associated procedure as defined in a dispatch table:
    // This table is used by the vmmwrapVxDControlProc defined
    // in "VMMWRAP.ASM". It lists the messages and associated
    // dispatch functions; it must be terminated with -1 and NULL.
    DISPATCHINFO alpVxDDispatchProcs[] = {
        Create_VM,          VSIMPLED_Create_VM,
        Sys_Critical_Init,  VSIMPLED_Sys_Critical_Init,
        Device_Init,        VSIMPLED_Device_Init,
        -1,                 NULL };
When the VxD control procedure is called by VMM, the vmmwrapVxDControlProc (provided by VMMWRAP.ASM) walks this table and dispatches the system message to the associated procedure. Note that vmmwrapVxDControlProc uses a linear search algorithm; consequently, the least-frequent system events should be located at the end of the table. Some of the dispatch functions have slightly different prototypes, not listed here because the sample sources demonstrate their use and the VMMWRAP.ASM code is well documented.
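The table walk described above can be pictured in 'C' as follows. This is only a sketch of a linear search over a table terminated by -1 and NULL, not the VMMWRAP.ASM code itself. The type names, the simplified handler signature, and the message numbers in sample_table are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified handler type (assumption); the real dispatch functions
 * receive VM handles and register pointers. */
typedef int (*DISPATCHPROC)(void);

typedef struct {
    long         lMessage;   /* system control message; -1 ends the table */
    DISPATCHPROC pProc;      /* handler; NULL on the terminating entry */
} DISPATCHENTRY;

/* Linear search: returns the handler for msg, or NULL if it is not listed. */
DISPATCHPROC find_handler(const DISPATCHENTRY *table, long msg)
{
    for (; table->lMessage != -1; ++table)
        if (table->lMessage == msg)
            return table->pProc;
    return NULL;
}

/* Sample handlers and table; the message numbers 1 and 7 are made up. */
int on_device_init(void) { return 1; }
int on_create_vm(void)   { return 2; }

const DISPATCHENTRY sample_table[] = {
    { 1,  on_device_init },
    { 7,  on_create_vm   },
    { -1, NULL           }
};
```

Because every lookup starts at the first entry, a message near the end of the table costs one comparison per entry ahead of it, which is why ordering by frequency matters.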
The following code excerpt demonstrates a VxD initialization procedure as written in 'C':
    #pragma data_seg("_IDATA", "ICODE")

    /* I C O D E */

    BOOL VSIMPLED_Device_Init(DWORD VM, PSTR pcmdTail, PCRS_32 pCRS)
    {
        /* Description:
         *    This is a non-system critical initialization procedure.
         *    IRQ virtualization, I/O port trapping, and VM control
         *    block allocation can occur here.
         *    Again, the same return value applies... TRUE for success,
         *    FALSE for error notification.
         * Parameters:
         *    DWORD VM - System VM handle
         *    PSTR pcmdTail - pointer to WIN.COM's command tail
         *    PCRS_32 pCRS - pointer to System VM client register structure
         *
         * History:   Date       Author      Comment
         *            3/ 9/93    BryanW      Wrote it.
         */
        vmmTraceOut("VSIMPLED_Device_Init\r\n");
        return TRUE;
    } // end of VSIMPLED_Device_Init()
VMMWRAP.ASM defines a large number of 'C' callable routines that convert stack parameters into the correct register parameter interfaces used by the various services and return the results of the service call. For example, the VMM service List_Create uses the ECX, EAX, and ESI registers to define a node size and flags and to return a handle to the list. It then becomes necessary to provide a C-callable interface:
    ;DWORD PASCAL vmmListCreate(UINT uNodeSize, UINT uFlags);
    ;DESCRIPTION:
    ;   Creates a new list structure.
    ;PARAMETERS:
    ;   UINT uNodeSize
    ;   UINT uFlags
    ;       Specifies the creation flags, it can be a
    ;       combination of the following values:
    ;       LF_Alloc_Error, LF_Async, LF_Use_Heap
    ;RETURN VALUE:
    ;   DWORD
    ;       handle to the list or NULL if failure
    BeginProc vmmListCreate, PUBLIC
    uFlags      equ [ebp + 8]
    uNodeSize   equ [ebp + 12]
        push    ebp
        mov     ebp, esp
        push    esi
        push    ecx
        mov     ecx, uNodeSize
        mov     eax, uFlags
        VMMCall List_Create
        pop     ecx
        mov     eax, esi
        pop     esi
        jnc     SHORT VLC_Exit
        xor     eax, eax
    VLC_Exit:
        pop     ebp
        ret     8
    EndProc vmmListCreate
A VxD in 'C' can then call this service as follows:
    //Create a list with elements of the type NODE
    hList = vmmListCreate (sizeof(NODE), 0);
A thunk is created "on the fly" by a thunking procedure. Given a procedure address, a thunking procedure copies the base code, patches the necessary offsets, and returns a pointer to this piece of code. An advantage to using flat model code here is that a VxD can reference code and data with the same offset. Creating executable code with a simple heap allocation is easy, because selector restrictions are not an issue. For example, the following will create a procedure thunk for a generic VMM event callback:
    ;EVENTPROC PASCAL vmmwrapThunkEventProc(EVENTPROC pProc);
    ;DESCRIPTION:
    ;   Creates a procedure thunk for VxD generic event callbacks.
    ;PARAMETERS:
    ;   DWORD pProc
    ;       pointer to callback procedure, must have the form:
    ;       VOID CDECL EventProc( DWORD hVM, DWORD dwRefData, PCRS_32 pCRS);
    ;RETURN VALUE:
    ;   EVENTPROC
    ;       pointer to thunk or NULL if failure
    BeginProc vmmwrapThunkEventProc, PUBLIC
    pCRS    equ [ebp]
    pProc   equ [ebp + 8]
        push    ebp
        mov     ebp, esp
        call    Allocate_Procedure_Thunk
        jc      SHORT VEProc_Failure
        jmp     SHORT VEProc_CreateThunk

    ; Begin thunk code
    EventThunk label byte
        push    pCRS
        push    edx                     ; dwRefData
        push    ebx                     ; hVM
        call    $
    EventThunkCallAddr equ $ - EventThunk
        add     esp, 12                 ; fixup for CDECL
        ret
    EventThunkSize equ $ - EventThunk
    ; End thunk code

    VEProc_CreateThunk:
        push    ecx
        push    edi
        push    esi

        ; Copy the thunk...
        lea     esi, EventThunk
        mov     edi, eax
        mov     ecx, EventThunkSize
        cld
        shr     ecx, 1
        rep     movsw
        adc     cl, cl
        rep     movsb

        ; Fix it up...
        push    eax
        add     eax, EventThunkCallAddr
        mov     esi, eax
        sub     esi, 4
        sub     eax, pProc
        neg     eax
        mov     dword ptr [esi], eax
        pop     eax

        pop     esi
        pop     edi
        pop     ecx
        jmp     SHORT VEProc_Exit

    VEProc_Failure:
        xor     eax, eax

    VEProc_Exit:
        pop     ebp
        ret     4
    EndProc vmmwrapThunkEventProc
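The "Fix it up" step in the listing above patches the operand of the near call inside the copied thunk. A near call stores a 32-bit displacement relative to the address of the instruction that follows it, so the value written is the target address minus the address just past the call. The sketch below restates that arithmetic in 'C'. The function names and the addresses used in the usage note are illustrative assumptions, not part of VMMWRAP.ASM.

```c
#include <assert.h>
#include <stdint.h>

/* Patch the rel32 operand of a near CALL (opcode E8h) in a copied thunk.
 *   thunk      - bytes of the copied thunk
 *   thunk_addr - linear address the copy will execute at
 *   call_end   - offset of the first byte after the CALL instruction
 *   target     - linear address of the procedure to call */
void patch_call_rel32(uint8_t *thunk, uint32_t thunk_addr,
                      uint32_t call_end, uint32_t target)
{
    uint32_t rel = target - (thunk_addr + call_end);  /* wraps modulo 2^32 */
    unsigned i;

    assert(call_end >= 5);              /* opcode byte plus 4-byte operand */
    for (i = 0; i < 4; ++i)             /* store little-endian, as x86 expects */
        thunk[call_end - 4 + i] = (uint8_t)(rel >> (8 * i));
}

/* Recover the call target from a patched thunk (inverse of the above). */
uint32_t call_target(const uint8_t *thunk, uint32_t thunk_addr,
                     uint32_t call_end)
{
    uint32_t rel = 0;
    unsigned i;

    for (i = 0; i < 4; ++i)
        rel |= (uint32_t)thunk[call_end - 4 + i] << (8 * i);
    return thunk_addr + call_end + rel;
}
```

For a thunk copied to 80010000h whose call ends at offset 6, a target of 80020000h yields a displacement of 0000FFFAh. The subtraction wraps, so a target below the thunk works the same way.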
To avoid page faults while executing thunk code, allocate a non-pageable memory block for a thunk table on the first call to Allocate_Procedure_Thunk. To simplify thunk allocation management, the allocation routine uses a fixed, maximum thunk size; this routine could be improved to be more memory efficient. The actual thunk code is embedded in the specific thunk allocation procedure. After the memory allocation for the thunk has been performed, the thunk code is copied and patched with the correct offset to the caller's provided procedure address. Thunks should be created only once per procedure, as follows:
    /* NOTE!!! pVMEMTRAP_PFault is a global pointer to the
     * Page_Fault procedure thunk.
     */
    if (!pVMEMTRAP_PFault)
    {
        // Create the thunk only once.
        pVMEMTRAP_PFault = vmmwrapThunkV86PHProc(VMEMTRAP_PFault);
        if (!pVMEMTRAP_PFault)
        {
            vmmDebugOut("Could not allocate Page_Fault thunk!\r\n");
            return FALSE;
        }
    }
    vmmHookV86Page(wPage, pVMEMTRAP_PFault);
    return TRUE;
    #define VSIMPLED_Get_Version  (((VSIMPLED_Device_ID) << 16) + 0x0000)
    #define VSIMPLED_Get_Info     (((VSIMPLED_Device_ID) << 16) + 0x0001)

    DWORD CDECL I_VSIMPLED_Get_Version(VOID);
    BOOL CDECL I_VSIMPLED_Get_Info(PINFOSTRUCT);

    SERVICETABLE VSIMPLED_ServiceTable = {
        I_VSIMPLED_Get_Version,
        I_VSIMPLED_Get_Info };
The service table must be located in the locked data segment. The DDB should contain a pointer to the service table and the number of services declared.
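A VxD service identifier carries the device ID in its high 16 bits and the service's position in the service table in its low 16 bits. The following sketch shows that packing and unpacking. The helper names are hypothetical, and the device ID in the usage note is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: pack a device ID and a service-table index
 * into one 32-bit service identifier, and take it apart again. */
uint32_t make_service_id(uint16_t device_id, uint16_t index)
{
    return ((uint32_t)device_id << 16) | index;
}

uint16_t service_device(uint32_t service_id)
{
    return (uint16_t)(service_id >> 16);
}

uint16_t service_index(uint32_t service_id)
{
    return (uint16_t)(service_id & 0xFFFFu);
}
```

With a made-up device ID of 1234h, the second service in the table gets the identifier 12340001h. Note that in a 'C' macro the shift must be parenthesized, because + binds more tightly than <<.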
If your VxD is replacing a standard VxD, such as the Virtual Display Driver, a service interface already exists. To support this interface and to allow the VxD service procedures to be written in 'C', the service entry points are thunked using a macro such as the following, which provides an interface to the register parameters:
    Service_Thunk MACRO Service_Name, Type
        IFNB <Type>
            IFIDNI <Type>, <ASYNC_SERVICE>
    BeginProc Service_Name, ASYNC_SERVICE
            ELSE
            %OUT ERROR: Service_Thunk <Type> parameter must be ASYNC_SERVICE or undefined
            .err
            ENDIF
        ELSE
    BeginProc Service_Name, SERVICE
        ENDIF
        EXTRN   _&Service_Name:NEAR
        IFDEF DEBUG
        Debug_Out "In &Service_Name"
        ENDIF
        pushad
        pushfd
        push    esp
        cCall   _&Service_Name
        add     esp, 4
        popfd
        popad
        ret
    EndProc Service_Name
    ENDM
The service thunks are defined as follows using the macro:
    VxD_CODE_SEG
    Service_Thunk VDD_Get_ModTime
    Service_Thunk VDD_Set_HcurTrk
    Service_Thunk VDD_Msg_ClrScrn
    Service_Thunk VDD_Msg_ForColor
    Service_Thunk VDD_Msg_BakColor
    Service_Thunk VDD_Msg_TextOut
    Service_Thunk VDD_Msg_SetCursPos
    Service_Thunk VDD_Query_Access
    ; New services for 3.1
    Service_Thunk VDD_Check_Update_Soon
    VxD_CODE_ENDS
The service table is defined as usual:
    .xlist
    INCLUDE VMM.INC
    PUBLIC VDD_Service_Table
    Create_VDD_Service_Table EQU True
    INCLUDE VDD.INC
    .list
Finally, a service procedure written in 'C' uses a pointer reference to the registers, as provided by the thunk, to access the parameters:
    /*
     * VOID VDD_PIF_State
     * Description:
     *    Informs VDD about PIF bits for newly created VM.
     * Parameters:
     *    PREGS pRegs
     *       pRegs -> ebx = VM handle
     *       pRegs -> ax = PIF bits
     * Return (VOID):
     *    Nothing.
     */
    VOID CDECL VDD_PIF_State(PREGS pRegs, PVDDCB pVMCB)
    {
        if (vmmTestSysVMHandle(pRegs -> ebx))
        {
            wPIFSave = (WORD) pRegs -> eax;
        }
        else
        {
            pVMCB = (PVDDCB) (pRegs -> ebx + dwVidCBoff);
            if (pVMCB -> VDD_PIF != (WORD) pRegs -> eax)
            {
                pVMCB -> VDD_PIF = (WORD) pRegs -> eax;
                VDD_TIO_SetTrap( pRegs -> ebx, pVMCB);
            }
        }
    } // VDD_PIF_State()
VSDINIT.C

    /* Module: vsdinit.c
     * Purpose:
     *    Init code and data for VSIMPLED.
     * Development Team:
     *    Bryan A. Woodruff
     * History:   Date       Author      Comment
     *            3/14/93    BryanW      Wrote it.
     *
     * Copyright (c) 1993 Woodruff Software Systems.
     * All Rights Reversed.
     */

    #include <vmm.h>
    #include "vsimpled.h"

    #pragma data_seg("_IDATA","ICODE")

    /* I C O D E
     * BOOL VSIMPLED_Sys_Critical_Init
     * Description:
     *    On entry, interrupts are disabled. Critical initialization
     *    for this VxD should occur here. For example, we can read
     *    settings from VM's cached copy of the SYSTEM.INI and
     *    set up our VxD as appropriate.
     *
     *    This procedure is called when the VxD_Control_Proc
     *    dispatches the Sys_Critical_Init notification from VMM.
     *
     *    We can notify VMM of success or failure by returning TRUE or
     *    FALSE.
     *
     * Parameters:
     *    DWORD hVM         System VM handle
     *    DWORD dwRefData   reference data passed from real-mode init
     *    PSTR pCmdTail     pointer to WIN.COM's command tail
     *    PCRS_32 pCRS      pointer to System VM client register structure
     *
     * History:   Date       Author      Comment
     *            3/ 9/93    BryanW      Wrote it.
     */
    BOOL CDECL VSIMPLED_Sys_Critical_Init( DWORD hVM, DWORD dwRefData,
                                           PSTR pCmdTail, PCRS_32 pCRS)
    {
        vmmDebugOut("VSIMPLED_Sys_Critical_Init\r\n");
        return TRUE;
    } // end of VSIMPLED_Sys_Critical_Init()

    /* BOOL VSIMPLED_Device_Init
     * Description:
     *    This is a non-system critical initialization procedure.
     *    IRQ virtualization, I/O port trapping, and VM control
     *    block allocation can occur here.
     *    Again, the same return value applies: TRUE for success,
     *    FALSE for error notification.
     * Parameters:
     *    DWORD hVM         System VM handle
     *    PSTR pCmdTail     pointer to WIN.COM's command tail
     *    PCRS_32 pCRS      pointer to System VM client register structure
     *
     * History:   Date       Author      Comment
     *            3/ 9/93    BryanW      Wrote it.
     */
    BOOL CDECL VSIMPLED_Device_Init( DWORD hVM, PSTR pCmdTail, PCRS_32 pCRS)
    {
        vmmTraceOut("VSIMPLED_Device_Init\r\n");
        return TRUE;
    } // end of VSIMPLED_Device_Init()

    // End of File: vsdinit.c
VSIMPLED.C

    /*
     * Module: vsimpled.c
     * Purpose:
     *    A simple VxD written in 'C'.
     * Development Team:
     *    Bryan A. Woodruff
     * History:   Date       Author      Comment
     *            3/ 9/93    BryanW      Wrote it.
     *
     * Copyright (c) 1993 Woodruff Software Systems.
     * All Rights Reversed.
     */

    #include <vmm.h>
    #include "vsimpled.h"

    #pragma data_seg("_LDATA", "CODE")

    // V I R T U A L   D E V I C E   D E C L A R A T I O N
    DDB VSIMPLED_DDB = {
        NULL,                           // must be NULL
        DDK_Version,                    // DDK_Version
        VSIMPLED_Device_ID,             // Device ID
        VSIMPLED_Major_Ver,             // Major Version
        VSIMPLED_Minor_Ver,             // Minor Version
        NULL,
        "VSIMPLED",
        Undefined_Init_Order,
        (DWORD) vmmwrapVxDControlProc,
        NULL, NULL, NULL, NULL, NULL, NULL, NULL };

    // This table is used by the vmmwrapVxDControlProc.
    // It lists the messages and associated dispatch functions.
    // It must be terminated with -1 and NULL.
    DISPATCHINFO alpVxDDispatchProcs[] = {
        Sys_Critical_Init,  VSIMPLED_Sys_Critical_Init,
        Device_Init,        VSIMPLED_Device_Init,
        Create_VM,          VSIMPLED_Create_VM,
        -1,                 NULL };

    /* BOOL CDECL VSIMPLED_Create_VM( DWORD hVM, PCRS_32 pCRS )
     * Description:
     *    Notification when VMs (other than system VM) are created.
     * Parameters:
     *    hVM    VM handle
     *    pCRS   pointer to client register structure
     *
     * History:   Date       Author      Comment
     *            3/ 9/93    BryanW      Wrote it.
     */
    BOOL CDECL VSIMPLED_Create_VM( DWORD hVM, PCRS_32 pCRS)
    {
        vmmTraceOut("VSIMPLED_Create_VM\r\n");
        return TRUE;
    } // end of VSIMPLED_Create_VM()

    // End of File: vsimpled.c
Debugging services are some of the most important, but least used, services of the VMM. The debugging services provide important feedback during the operation of your VxD. The debug version of WIN386, through the debugger interface, provides key information that can help you track down even the most difficult bugs. A better understanding of the debug services and VMM's debugging interface can save you time and frustration.
Debug trace strings are useful when you are tracking the last action before a crash or watching the execution path of code. Trace_Out is particularly well-suited to this. Debug_Out is most commonly used when an assertion fails or some other unexpected event occurs.
In Windows 3.1, the Mono_Out and Mono_Out_At macros call the Out_Mono_String service to display a string on the monochrome display. The Out_Mono_String service offers you a fast memory write so you don't have to wait for the serial port when using the WDEB386 debugger. This is excellent for high-frequency debug strings in such places as interrupt handlers.
The Queue_Out macro calls the Queue_Debug_String service, which queues a message string until it is retrieved by the lq command from the debugger interface. This is useful when multiple debug traces are occurring and scrolling from view. The Queue_Out macro lets you record events and display them at your convenience.
Assert_VM_Handle | Verifies that the provided register or memory location contains a valid VM handle. |
Assert_Cur_VM_Handle | Verifies that the provided register or memory location contains the current VM handle. |
Assert_Client_Ptr | Verifies that the provided register or memory location points to the client register structure of the current VM. |
Assert_Ints_Disabled | Verifies that interrupts are disabled. |
Assert_Ints_Enabled | Verifies that interrupts are enabled. |
    VMM DEBUG INFORMATIONAL SERVICES
    [A] System time
    [B] Time-slice information/profile
    [C] Dyna-link service profile information
    [D] Reset dyna-link profile counts
    [E] I/O port trap information
    [F] Reset I/O profile counts
    [G] Turn procedure call trace logging on
    [H] V86 interrupt hook information
    [I] PM interrupt hook information
    [J] Reset PM and V86 interrupt profile counts
    [K] Display event lists
    [L] Display device list
    [M] Display V86 break points
    [N] Display PM break points
    [O] Display interrupt profile
    [P] Reset interrupt profile counts
    [Q] Display GP fault profile
    [R] Reset GP fault profile counts
    [S] Toggle Adjust_Exec_Priority Log AND DISPLAY
    [T] Reset Adjust_Exec_Priority Log info
    [U] Toggle verbose device call trace
    [V] Fault Hook information
    Enter selection or [ESC] to exit:
The information available through this interface is quite extensive and specific to VMM. For example, the time slice command displays the following:
    # VMs scheduled = 02
    # idle VMs = 01
    Time-Slice focus VM = 804A1000
    Scheduled VM = 804A1000
    Time slice size = 00000014
    Timer period = 14
    804A1000 background
Additionally, the following dot (.) commands are available in the debug version of VMM:
    .VM [#]        Displays complete VM status
    .VC [#]        Displays the current VMs control block
    .VH            Displays the current VM handle
    .VR [#]        Displays the registers of the current VM
    .VS [#]        Displays the current VM's virtual mode stack
    .VL            Displays a list of all valid VM handles
    .T             Toggles the trace switch
    .S [#]         Displays short logged exceptions starting at #
    .SL [#]        Displays long logged exceptions
    .LQ            Display Queue outs from most recent
    .DS            Dumps the protected mode stack with labels
    .HE [handle]   Displays Heap information
    .ME [handle]   Displays Memory information
    .MV            Displays VM Memory information
    .MS PFTaddr    Display PFT info
    .MF            Display Free List
    .MI            Display Instance data info
    .ML LinAddr    Display Page table info for given linear address
    .MP PhysAddr   Display ALL Linear addrs that map the given addr
    .MD            Change debug MONO paging display
    .MO            Set a page out of all present pages
    .VMM           Menu VMM state information
    .<dev name>    Display device specific info
One of the most useful commands is the exception tracing option. To turn tracing on, use the .T command:
    start tracing
    stop tracing
    exceptions logged = 00000C9D
    00000C9D:  OUT 804A1000 02 EI VMM 800E097E
    00000C9C: 0050 804A1000 02 EI VMM 800E097E
    00000C9B: 0006 804A1000 02 EI V86 2586:2230
    00000C9A:  OUT 804A1000 02 DI V86 C803:0A05
    00000C99: 0006 804A1000 03 EI V86 2586:2230
    00000C98:  OUT 804A1000 03 DI V86 FFFF:0BEB
    00000C97: 0006 804A1000 04 DI V86 265F:14A0
    00000C96:  OUT 804A1000 04 EI V86 D800:04A1
    00000C95: 001A 804A1000 04 EI V86 D800:04A1 INT 1A 00000004
    00000C94:  OUT 804A1000 04 EI V86 D800:04A1
    00000C93: 001A 804A1000 04 EI V86 D800:04A1 INT 1A 0000008C
    00000C92:  OUT 804A1000 04 EI V86 0486:0EF0
    00000C91: 0050 804A1000 04 EI V86 0486:0EF0 INT 50 00000308
    00000C90:  OUT 804A1000 04 DI V86 1024:0F3C
    00000C8F: 0013 804A1000 03 EI V86 FFFF:0BEB INT 13 00000308
    00000C8E:  OUT 804A1000 02 DI V86 B1AD:0031
    00000C8D: 002A 804A1000 02 DI V86 B1AD:0031 INT 2A 00008200
    00000C8C:  OUT 804A1000 02 DI V86 C803:0A05
    00000C8B: 0006 804A1000 02 EI V86 2586:2230
    00000C8A:  OUT 804A1000 02 DI V86 B1AD:0031
    00000C89: 002A 804A1000 02 DI V86 B1AD:0031 INT 2A 00008200
The exception log shows 0xC9D exceptions during the short period that the system is allowed to run. To display details about an exception, use the .sl command:
    #.sl c8b
    stop tracing
    Show exception 00000C8B
    00000C8B: 0006 804A1000 02 EI V86 2586:2230
    V86 Fault 0006  VM_Handle = 804A1000  00000C8B
    AX=00007000 CS=2586 IP=00002230 FS=0000
    BX=00000005 SS=0BCC SP=00000190 GS=0000 TIME=00000096:1930
    CX=0000001A DS=9E9B SI=0000003F BP=0000201A
    DX=0000001A ES=0000 DI=00004000 FL=00033202
This fault occurred in V86 mode and was an invalid opcode (exception 6). To learn why an invalid opcode occurred, we need to look at the disassembly:
    #u &2586:2230
    &2586:00002230 6380fc90 arpl word ptr [bx+si+90fc],ax
Obviously, an arpl is not a valid V86 instruction. This arpl instruction is really a V86 break point. To demonstrate that this assumption is valid and to find the owner, we can use the M command (Display V86 break points) in the VMM debugging interface:
    CS:IP       Hit Count   Ref Data    Procedure
    2586:2230   00002D76    00000031    @Resume_Exec + 2a
The owner of this break point is the Resume_Exec service, which probably means that this fault was generated as the result of V86 nested execution in the VM.
As you can see, using the debug version of WIN386 is essential to tracking down problems with your VxD. Some additional helpful debugging tips:
The following program virtualizes the COM1 port. One of the biggest problems with WIN386 today is the multitude of hardware cards, mostly used for communication of one type or another (modem, fax, network, tape, and so forth), that attempt to run without a VxD. I chose this topic in the hope that, by focusing on this particular problem, more hardware vendors will provide VxDs for their cards.
This driver does not fully replace the VCD. It virtualizes the COMM port and can be used instead of the VCD by DOS apps. However, it does not include the calls required to support Windows COMM drivers, so it cannot be used by Windows programs that talk to the Windows COMM API.
We can fully virtualize all of the ports except for the actual data port. Because we cannot virtualize the actual data port, we have to make sure that only one application can talk on the line at any given time. If two try to talk at the same time, we have to let the user decide which application can use the port.
We also need to reflect interrupts into the proper VM, which is an expensive operation, so we want to make sure that we only do it if absolutely necessary. We can establish this by watching the value that the application writes to the Interrupt Enable Register and by trapping when the application does an EOI. Also, since emulation has so much overhead, we need to define a new interface that is directly callable from DOS, Windows, and other VxDs, is designed to allow block I/O (which is much faster than handling things on a byte-by-byte basis), and implements an open and close on the port so that we know when an app is done with the port. This eliminates the need to handle contention problems.
So, while we emulate to support existing applications, we also create a new API that works a lot more efficiently in a WIN386 world. If you write the only code that touches your card, then you should consider creating just the new interface. In this case, you still want to trap on your ports, so that other applications cannot write to them by mistake.
Finally, when we are reflecting interrupts to a VM, we want to be careful not to use up all of its stack. Therefore, rather than simulating another IRQ when the VM does an EOI, we wait until its IRQ handler does an iret, completely unwinding the stack, before we send in another one. We use ComIret, which is called after the VM does an iret, to emulate the next pending IRQ.
When VPICD receives an interrupt, it masks the interrupt off and sends an EOI. It then reflects the IRQ to our VxD. When we do a VPICD_Phys_EOI, the VPICD unmasks the interrupt. This has two important ramifications. First, another interrupt can then occur immediately, and we can see it as soon as we unmask it. Second, if we never EOI, the interrupt is never unmasked, and we never see it again.
If the read and write pointers point to the same location, the buffer is empty. There is no buffer overrun check, because data is lost either way: if we check and refuse to store, we lose new data; if we ignore the problem, we lose old data. The result is the same: the program still runs but data is lost. (Granted, we lose more data this way, but if we lose any data, we are generally in trouble.) Skipping the check eliminates the performance hit of checking the buffer size on each read and write.
The read buffer needs three bytes for each data byte received. For each data byte, we first read the two status registers and store them. We then read the data byte and store it. We read the status bytes first so that the line status shows the data byte. By saving all three bytes, the calling application can get the status for each data byte.
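To make the trade-off concrete, here is a minimal circular-buffer sketch in 'C' with the same property: read and write indexes that are equal mean "empty", and writes never check for overrun. It is not the driver's buffer code. The buffer size, the names, and the choice to return 0 from an empty read are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u    /* hypothetical size; a power of two so wrap is a mask */

typedef struct {
    uint8_t  data[RING_SIZE];
    unsigned rd;        /* next slot to read */
    unsigned wr;        /* next slot to write */
} RING;

/* Equal indexes mean the buffer is empty. */
int ring_empty(const RING *r)
{
    return r->rd == r->wr;
}

/* No overrun check: a full lap silently discards unread data. */
void ring_put(RING *r, uint8_t b)
{
    r->data[r->wr] = b;
    r->wr = (r->wr + 1u) & (RING_SIZE - 1u);
}

/* Returns 0 when the buffer is empty. */
uint8_t ring_get(RING *r)
{
    uint8_t b;

    if (ring_empty(r))
        return 0;
    b = r->data[r->rd];
    r->rd = (r->rd + 1u) & (RING_SIZE - 1u);
    return b;
}
```

Writing RING_SIZE bytes without reading brings the write index back around to the read index, so the buffer looks empty again and the whole lap is lost. That is the "we lose more data this way" case from the text, accepted in exchange for a put routine with no compare and no branch.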
bInVmirq is a count of how many IRQs sent to the VM have not yet returned. Sending several at once is not a problem, as long as we don't overflow the VM's stack. This count should never go over 2.
bIntEnb holds the value of the Interrupt Enable Register as set by the VM that owns the port. Regardless of the value set, the hardware always has bits 0111b set. If the app in the VM has not set these bits, we do not want the performance hit of emulating an IRQ. Therefore, we use the values in bIntEnb to see whether we need to reflect an IRQ.
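The test against bIntEnb amounts to a bit mask check. The sketch below uses the standard 8250 Interrupt Enable Register bit assignments. The function name and the idea of passing the interrupt cause as a mask are assumptions for illustration, not the driver's actual interface.

```c
#include <assert.h>
#include <stdint.h>

/* 8250 Interrupt Enable Register bits */
#define IER_RX_DATA       0x01u   /* received data available */
#define IER_TX_EMPTY      0x02u   /* transmitter holding register empty */
#define IER_LINE_STATUS   0x04u   /* receiver line status */
#define IER_MODEM_STATUS  0x08u   /* modem status */

/* Reflect a hardware interrupt into the VM only if the VM itself enabled
 * that interrupt source. vm_ier plays the role of bIntEnb; cause is one
 * of the IER_* masks above. */
int should_reflect_irq(uint8_t vm_ier, uint8_t cause)
{
    return (vm_ier & cause) != 0;
}
```

A VM that enabled only the receive interrupt gets receive IRQs reflected, while transmit-empty interrupts are handled entirely inside the VxD.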
Next we take over the eight COM1 I/O ports. If we cannot take over all of them, we return with carry set, which tells WIN386 not to load our VxD. If we don't own all of the ports, we are in conflict with another VxD (this is why VCD will fail to load if you load this VxD).
Following that, we take over IRQ4. In a commercial VxD, both the port numbers and the IRQ should be able to be overridden by values in system.ini. You can read system.ini by using Get_Profile_String. This allows you to change settings if the board is reconfigured. Once we have both the ports and the IRQ, we know we can run.
Now, we hook interrupts 21h, 23h, and 24h, so that we can take ownership of the port away from a VM if it terminates. While interrupts 23h and 24h do not guarantee that an app has terminated, an app can terminate in this manner.
Finally, we initialize the COM hardware, turning the interrupts on and enabling the transmit and receive interrupts.
First we call Emulate_Non_Byte_IO. If we get a request for non-byte I/O (word, dword, string), this macro breaks it into byte-sized calls. Since I don't foresee anyone actually using these calls, I use the emulate macro. If an app is likely to do a string of 512 bytes, you will want to handle it yourself. The overhead of Emulate_Non_Byte_IO is significant.
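Conceptually, breaking a word or dword access into byte-sized calls looks like the following 'C' sketch: a multi-byte OUT to port P becomes byte writes to P, P+1, and so on, low byte first. This is not the macro's implementation. The callback type and the fake_ports recorder exist only to make the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*BYTE_OUT)(uint16_t port, uint8_t value);

/* Split an OUT of 1, 2, or 4 bytes into byte-sized writes on
 * consecutive ports, least significant byte first. */
void emulate_out(uint16_t port, uint32_t value, unsigned size, BYTE_OUT out8)
{
    unsigned i;

    assert(size == 1 || size == 2 || size == 4);
    for (i = 0; i < size; ++i)
        out8((uint16_t)(port + i), (uint8_t)(value >> (8 * i)));
}

/* Stand-in for the byte handler: records writes to eight fake ports. */
uint8_t fake_ports[8];

void record_out(uint16_t port, uint8_t value)
{
    fake_ports[port & 7u] = value;
}
```

Each byte costs a full pass through the trap handler, which is where the overhead mentioned above comes from.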
Next, we clear the direction flag. (If we don't, we will get an annoying, time-consuming intermittent bug.)
Then, if we don't take the jmp, we build the jmp vector offset. This takes into account the sizes of the read and write tables, as well as the specific values of ECX for reads and writes. We then jmp to the proper function, so that the ret from that function will take us indirectly back to WIN386. Any call, taken jmp, or ret flushes the instruction prefetch queue on the 386 and 486, so we want to minimize these. Conditional jmps that are not taken do not flush it. That's why ComIoPortTrap has a single jmp for the common code path throughout this code. Generally, emulation code is never fast enough, so you do everything you can to speed it up.
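The idea of one computed jump through a combined table can be sketched in 'C' with an array of function pointers: eight read handlers followed by eight write handlers, indexed by the port offset plus a read/write bias. This is an illustration only. The real code derives the offset from ECX, and the stub handlers, names, and table layout here are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define COM1_BASE 0x3F8u
#define NUM_PORTS 8u

typedef uint8_t (*PORT_HANDLER)(uint8_t al);

/* Stub handlers standing in for the per-port routines. */
uint8_t read_stub(uint8_t al)  { (void)al; return 0xFFu; }  /* pretend read */
uint8_t write_stub(uint8_t al) { return al; }               /* echo the byte */

/* One table: eight read handlers, then eight write handlers. */
const PORT_HANDLER port_vectors[2 * NUM_PORTS] = {
    read_stub,  read_stub,  read_stub,  read_stub,
    read_stub,  read_stub,  read_stub,  read_stub,
    write_stub, write_stub, write_stub, write_stub,
    write_stub, write_stub, write_stub, write_stub
};

/* Build the table index from the port number and the direction. */
unsigned vector_index(uint16_t port, int is_write)
{
    assert(port >= COM1_BASE && port < COM1_BASE + NUM_PORTS);
    return (is_write ? NUM_PORTS : 0u) + (unsigned)(port - COM1_BASE);
}

/* A single indirect call replaces a chain of compares and branches. */
uint8_t dispatch_port(uint16_t port, int is_write, uint8_t al)
{
    return port_vectors[vector_index(port, is_write)](al);
}
```

A read of port 3FDh lands on entry 5 and a write to port 3F9h on entry 9, each reached with one indirect transfer.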
If the calling VM doesn't own the port, we need to decide what to do. If no one owns the port, we can assign it to the calling VM. It would probably be better to assign the port to the first VM that accessed the data port; instead it is assigned to the first app to hit the port at all. We then initialize the port to the values we were holding in our instance data. If the app has written those values (while another app owned the port), it expects the hardware to be in a certain configuration. If someone else owns the port, we fake it, providing it is not a data read/write, by reflecting it back to the port-specific function which handles this. The one exception is I/O to 3F8h, when it is set to be the baud rate instead of the data port. That is handled in-line. If we have a data I/O and someone else owns the port, we have to decide who gets it. If the owner app used the new API, they keep the port. This not only gives apps an incentive to use the new API but leaves the API with the app that will free up its use as soon as it is done. Use a contention prompt when you think the owner may be done but are not sure.
Otherwise, we put up a contention MessageBox using SHELL_Resolve_Contention. This call puts up a box asking the user to pick between the two VMs by using their window titles to ID them (which usually both read MS-DOS Prompt). If the user picks the new one, the ownership is switched. The one that is not picked is marked as FAILED so we don't keep prompting every time it tries to read/write a byte.
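The ownership rules from the last two paragraphs can be summarized as a small decision function. The sketch below is a restatement of those rules in 'C', not the VxD's code. The type names, the use of 0 for "no owner", and the separate vm_marked_failed flag are assumptions made for illustration.

```c
#include <assert.h>

typedef enum {
    ACCESS_GRANTED,      /* caller owns (or now owns) the port */
    ACCESS_FAKED,        /* answer from instance data, leave hardware alone */
    ACCESS_PROMPT_USER,  /* put up the contention box */
    ACCESS_REFUSED       /* owner keeps the port, no prompt */
} ACCESS;

typedef struct {
    int owner_vm;            /* 0 means no VM owns the port */
    int owner_used_new_api;  /* nonzero if the owner opened it via the new API */
} PORT_STATE;

ACCESS resolve_access(PORT_STATE *port, int vm, int is_data_io,
                      int vm_marked_failed)
{
    assert(vm != 0);
    if (port->owner_vm == vm)
        return ACCESS_GRANTED;
    if (port->owner_vm == 0) {           /* unowned: first toucher gets it */
        port->owner_vm = vm;
        port->owner_used_new_api = 0;
        return ACCESS_GRANTED;
    }
    if (!is_data_io)                     /* status/control access: fake it */
        return ACCESS_FAKED;
    if (port->owner_used_new_api || vm_marked_failed)
        return ACCESS_REFUSED;           /* do not prompt again */
    return ACCESS_PROMPT_USER;
}
```

If the user picks the new VM at the prompt, the caller would then switch owner_vm. If not, it would mark the losing VM as failed so that later data accesses return ACCESS_REFUSED without another prompt.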
In IoRead8, all input goes through the buffer. Therefore, the first thing we do is look for bytes in the buffer. If the buffer is empty, we return a 0; otherwise, we get the data byte from the buffer, inc the read pointer to the next set of data, and return the byte. Notice that we only take a conditional jmp if the pointer wrapped. This eliminates jmps from the common code path. We only get to IoRead8 if the DLAB bit is off (it's the data byte). ComIoPortTrap handles virtualizing the low byte baud rate in 3F8h.
In IoRead9, we first test to see whether we own the port. If not, we jmp to the end of the function to return the information from our instance data. On a write to 3F9h, we save these values so we return what the app expects. If DLAB is set, we read the port and return the value. If DLAB is not set, we return the value in bIntEnb so that the app receives the value it expects.
IoReadA is completely faked. We know which IRQ we sent down to the app and return the appropriate value. If we did not send an IRQ down, we either return 001b (receive IRQ) if we have data or return nothing if we do not. IoReadB and IoReadC, on the other hand, are both quite simple. If the app owns the port, we read from the hardware. If not, we read from the instance data.
IoReadD returns the line status. It tells us whether we can read or write a byte and whether there are any errors. If the calling app owns the port, we return data from the read buffer. If the read buffer is empty, we read the actual port. But if the calling app does not own the port, we return 00011110b which tells the app that the transmit buffer is full (the app cannot write), the receive buffer is empty (the app cannot read), and all error bits are on. This seems to be the best way to get the point across to the app that it is not going to have any luck with this port.
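To see why 00011110b says "give up" to the app, it helps to decode the line status bits. The sketch below uses the standard 8250 line status register layout; the constant and function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 8250 line status register bits (names are illustrative). */
enum {
    LSR_DATA_READY = 0x01,   /* a received byte is waiting          */
    LSR_OVERRUN    = 0x02,
    LSR_PARITY     = 0x04,
    LSR_FRAMING    = 0x08,
    LSR_BREAK      = 0x10,
    LSR_THR_EMPTY  = 0x20    /* the transmitter can accept a byte   */
};
#define LSR_ERRORS (LSR_OVERRUN | LSR_PARITY | LSR_FRAMING | LSR_BREAK)

bool lsr_can_read(uint8_t lsr)   { return (lsr & LSR_DATA_READY) != 0; }
bool lsr_can_write(uint8_t lsr)  { return (lsr & LSR_THR_EMPTY) != 0; }
bool lsr_all_errors(uint8_t lsr) { return (lsr & LSR_ERRORS) == LSR_ERRORS; }

/* Value handed to a VM that does not own the port. */
uint8_t lsr_for_non_owner(void)
{
    uint8_t lsr = LSR_ERRORS;    /* data-ready and THR-empty stay clear */
    assert(lsr == 0x1E);         /* 00011110b, the value described above */
    return lsr;
}
```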
IoReadE is straightforward. If the calling app does not own the port, we use our instance data. If it does own the port, we get the data from the read buffer. If the read buffer is empty, we read from the hardware.
IoReadPort (used only for port F) just reads from the hardware if the calling app owns the port. If the caller does not own the port, it returns 0. This port is undefined for the 8250, so we can't virtualize it.
IoWrit9, like IoRead9, is tricky. If the write is from an app that does not own the port, we copy the value to the instance data for that VM. We do this for both the interrupt enable and the high-baud registers (both of which use this port). We use the instance data for the line control register to determine whether DLAB is set. If the app owns the port, and it is writing to the interrupt enable register, we save the value in bIntEnb and then OR it with 0011b. This forces the receive-data and transmit-empty interrupts on, which we need for our buffering code. We then write the byte to the hardware.
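The write path for this port can be sketched as follows. This is an illustration in C, not the driver's code: `ComInstance` and `write_port_9` are hypothetical, and storing the high-baud byte in the instance data even for the owner is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LCR_DLAB 0x80    /* divisor latch access bit in the line control register */

typedef struct {         /* per-VM instance data; field names are hypothetical */
    uint8_t lcr;         /* last line control value the app wrote             */
    uint8_t int_enb;     /* interrupt enables as the app believes them to be  */
    uint8_t baud_hi;     /* high byte of the baud rate divisor                */
} ComInstance;

/* Handle a write to base+1. Returns true when *hw_value must go to the port. */
bool write_port_9(ComInstance *vm, bool owns_port, uint8_t value,
                  uint8_t *hw_value)
{
    assert(vm != NULL && hw_value != NULL);
    if (vm->lcr & LCR_DLAB) {
        vm->baud_hi = value;          /* divisor latch, high byte */
        *hw_value = value;
    } else {
        vm->int_enb = value;          /* remember what the app asked for */
        *hw_value = value | 0x03;     /* always keep receive and transmit IRQs on */
    }
    return owns_port;                 /* non-owners never touch the hardware */
}
```

A read of the same port would return `int_enb` rather than the hardware value, so the app never sees the two bits that were forced on.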
IoWritB and IoWritC are both quite simple. If the calling app does not own the port, we copy the value to the instance data for that VM. If the app does own the port, we write to the hardware. IoWritPort (used for ports A, D, E, and F) goes directly to the port if the calling app owns the port. Writing to these ports is undefined for the 8250, so we cannot virtualize it.
In ComHwInt we determine the correct handler to call based on the value in port 3FAh. We use this value to determine which offset in IrqTabl to jmp to. We jmp so that the ret in the called function returns directly back to WIN386.
In IrqReceive we first go into a loop that reads the data port until it is empty. We loop because the 16550 has a 16-byte FIFO and we could get multiple bytes. Doing this in this loop is much faster than getting each IRQ individually. We read the status ports first so that the line status will show that we have a data byte. After reading in the data, we call VPICD_Phys_EOI, which causes the IRQ to be unmasked (remember, it has already been EOIed). It's critical to do this as soon as possible so that we can get to the next interrupt quickly. This separates talking to the port from virtualizing it.
Now we need to virtualize the IRQ down to the VM. We only do this if we are not already in the middle of reflecting an IRQ. We also make sure we have data in our buffer. Finally, we don't reflect it if the app didn't turn on that interrupt. We then call VPICD_Set_Int_Request, which attempts to reflect the IRQ immediately; if it cannot, it reflects it as soon as possible.
Finally, if we have set up a callback function, we set up an event to call the app back. We need to set up an event because we received the IRQ as an asynchronous event, limiting what we can do. We may not even be in the proper VM (remember, a VxD is always running in a VM, but which particular VM it is running on can change). If a fast response is critical, you may want to use Critical_Section_Boost instead of Cur_Run_VM_Boost.
IrqTransmit works basically the same way as IrqReceive. IrqModemStatus and IrqLineStatus are used merely to reflect the interrupts down to the VM. Our driver itself doesn't care about these.
VmCallBack is very simple. We pass a parameter in EAX which is the appropriate value in port 3FAh, letting the called app know whether the callback is due to a non-empty receive buffer or an empty transmit buffer. We then put the callback address in CX:EDX and use Simulate_Far_Call to set up the stack and Resume_Exec to make the call. Don't forget the Push_Client_State/Pop_Client_State and Begin_Nest_Exec/End_Nest_Exec calls; without them it will not work.
ComEoi is called when the app sends an EOI to the PIC. We have to call VPICD_Clear_Int_Request to end the IRQ in that VM.
ComIntRet is called after the VM's IRQ handler (the one invoked as a result of our VPICD_Set_Int_Request call) has executed its iret. At this point we call VPICD_Set_Int_Request again if we have data in our buffers and the app wants the IRQs. We do it here so that we do not eat up the app's stack by having IRQs come in on top of each other.
ComRead and ComWrite essentially copy their data from and into the buffers and return. Doing read/writes of blocks of data is faster than emulating on a byte-by-byte basis and avoids buffer overruns.
ComVmCreate is called every time a VM is created (except the system VM). On creation, we set the instance data to 1200,n,8,1.
ComInt21 and ComInt23_24 are used to determine when to take away ownership of a port. If a program exits, we want to take away its ownership. An app can end with an int 23 or int 24. It can also end with an int 21, function 4Ch, 31h, or 00h. We also take away ownership on an EXEC call.
First, build the core code that will talk to the hardware. Once you get this to work, decide which is more critical, the new API or the emulation, and build in that part. Then, build the other. As you do this, you need to keep a couple of things in mind. First, it is absolutely critical that your VxD performs all communication to the physical hardware. Do not let even the smallest part of it be handled directly by an application. For example, port 3FFh is undefined for the 8250. My VxD emulates it and only allows the app that owns the port to access it, rather than assuming that no one will access it. By the same token, port 3FBh is called very rarely, and I probably could have not trapped it. In that case, another VM could have written to it, changing the behavior of the port, and I would never know. Thus, you handle all of the hardware from your VxD for both speed and security reasons.
Create a new API using the direct call-in capability. It is much more efficient than trapping ports, interrupts, and so on. While you will still emulate the old API, you will have a much more efficient approach for new code. Also, try to minimize the number of times you have to make calls. Don't make calls to write one byte at a time -- have a call to write a block of data. In most situations, you can write 1 to 4 K as quickly as one byte.
Your emulation must average a certain speed, depending on what you are doing. However, if at 9600 baud the buffers in this VxD slowly fill up, its average speed is slower than 9600 baud. You either have to make your emulation faster or live with the limits. Generally you should find that there is only so much you can do to speed up emulation. Emulating a port is a big hit, and emulating an IRQ is a gigantic hit. Compared to real mode, emulation speed versus actual hardware speed is a difference in orders of magnitude. However, in this case, all is not lost. First, you can also trap software interrupts, which is faster than trapping ports and generally eliminates the need for IRQ emulation. In the example of this driver, we could trap int 14h. Unfortunately, most applications don't use int 14h, but we could be faster with those that do. Second, in the case of this VxD, while we talk to an 8250, we could emulate a 16550 with a FIFO buffer. On an IRQ, an app can read multiple bytes, eliminating the IRQs for all those bytes. By the same token, just because you are written for a specific device does not mean you can't emulate another device more efficiently.
Now it's time to look at how you can use VxDs to pull tricks in the real world. We'll use Win-Link as an example. As with many real-world projects, I had several reasons for writing this program.
The first part arose when I was having lunch with a number of other authors shortly before the launch of Windows 3.1. They complained that Windows was not 32-bit and was not pre-emptively multi-tasked, while OS/2 was. I immediately set about to refute this. Although little known at the time, Windows 3.1 did have support in it for 32-bit programs. Granted it was minimal and required assembler at first but it did exist (and it is what Win32 uses).
But that left OS/2 as the pre-emptively multi-tasked O/S. So I pointed out that the DOS boxes were pre-emptively multi-tasked under Windows. If a Windows app could talk to a DOS app in a DOS box and have the DOS app do the heavy work, then the Windows app would essentially be multi-tasked.
It made an interesting argument. Almost everyone at lunch was willing to concede that a Windows app could be multi-tasked. But it made me wonder how this could be implemented.
At the same time, there were a couple of features of Windows 3.1 that I found frustrating. When I am in a DOS box and type the name of a Windows program, it tells me that I need Windows to run it. Well, what does it think is running? When typing in the name of a Windows EXE from a DOS box, I want it to run that EXE. I also found the title of DOS boxes a little less than desirable. ALT-TABing through five windows, all called MS-DOS Prompt, usually did not tell me which DOS box was running Brief. I wanted the name of the program. And while I was at it, I had one more pet peeve: You can only print from one DOS box or Windows at a time. The DOS boxes don't spool their printing; they are dedicated to it until the printing completes. Yet Windows has a nice spooler. Everything was there; I just wanted the DOS boxes to print to the Windows spooler. Then all the DOS boxes could print simultaneously and do it quickly to the spooler.
Out of this came Win-Link, so named because it linked Windows and DOS applications. Win-Link is essentially two programs in one. First, it provides Interprocess Communication between Windows and DOS boxes as well as shared memory. Second, it extends the User Interface of Windows by (1) launching Windows applications (and additional DOS boxes) from a DOS box, (2) listing the running program as the title of a Windows DOS box, and (3) sending all printer output from DOS boxes to the Windows spooler.
Implementing this was a killer. First of all, a number of the major concepts had not been tried before. While everything should have worked, there was often only one implementation that actually did. In addition, there were a myriad of little details necessary to getting it right. Because the code intercepted calls in every DOS box and made asynchronous calls to Windows, every detail had to be right or the entire system would hang, or worse.
This chapter lays out the basic capabilities of the program to give you a clear picture of what the code is trying to accomplish. Then it details the specific logic used to implement each of these pieces, building on the previous pieces where appropriate. Finally, it walks through and explains the actual code. This chapter does not try to teach you anything general about writing VxDs. Instead, by concentrating on the specifics of a piece of real-world code that pulls a number of interesting hacks, you can learn from it by example.
We know how a Windows app can launch a DOS box. However, how does a DOS app launch another DOS box (as opposed to spawning a process)? We add a call allowing a DOS app to launch another DOS app. The parameters are similar to spawning, but instead of spawning in the same VM, Win-Link starts a new VM that runs the app.
Next we need a way to pass messages back and forth. On the Windows side we already have a system, so we merely give DOS boxes a way to call PostMessage. In the other direction, and for between DOS boxes, we have our own message queue. It has three calls: MsgPost to post a message to a VM, MsgPeek to look at a message sent to a VM, and MsgRead to read a message posted to a VM. Unlike Windows messages, these messages can't send pointers, because they are in different address spaces. So we provide two ways to pass blocks of data between VMs. MsgMemCopy copies data from memory in one VM to memory in another VM. MsgMemCopy automatically knows whether each of the VMs is in V86 or protected mode and interprets the segment/selector appropriately. There are also calls to allocate and free LDT/GDT selectors for memory in a VM. While real-mode DOS applications cannot access these selectors, Windows apps as well as protected-mode DOS apps can. So a DOS app can pass the Windows app an LDT selector for some of its memory. Then both applications can access the memory. These calls give applications a way to communicate with each other between VMs.
Two other sets of calls are provided to DOS applications. Win-Link provides a call to let a DOS application set its window title. For example, when Brief is running, having B as the title is preferable to MS-DOS Prompt. Brief - [filename.c] is even nicer. Win-Link also provides a set of calls for printing. While DOS printer output is captured fairly efficiently, all Win-Link can show for a print job is the name of the application printing the job. By adding a call to open the job, the application can display the name of the document being printed in the Windows spooler. Also, Win-Link generally has to guess when a print job has ended. This can be fixed by adding a call at the end of a job.
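The post, peek, and read calls map naturally onto a small circular queue per VM. The sketch below is in C and is not Win-IPC's implementation: the `DosMsg` layout, the capacity, and the lower-case function names are assumptions, and the per-VM semaphore that the real code uses is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DOS_MSG 8            /* assumed capacity */

typedef struct {                 /* hypothetical message layout */
    unsigned      msg;
    unsigned long param;
} DosMsg;

typedef struct {
    DosMsg slot[MAX_DOS_MSG];
    size_t get;                  /* next message to read                  */
    size_t put;                  /* next free slot; get == put means empty */
} MsgQueue;

static size_t next_slot(size_t i) { return (i + 1 == MAX_DOS_MSG) ? 0 : i + 1; }

void msg_init(MsgQueue *q) { assert(q != NULL); q->get = q->put = 0; }

/* Post a message; fails when the queue is full. */
bool msg_post(MsgQueue *q, DosMsg m)
{
    assert(q != NULL);
    size_t n = next_slot(q->put);
    if (n == q->get) return false;
    q->slot[q->put] = m;
    q->put = n;
    return true;
}

/* Look at the oldest message without removing it. */
bool msg_peek(const MsgQueue *q, DosMsg *out)
{
    assert(q != NULL && out != NULL);
    if (q->get == q->put) return false;
    *out = q->slot[q->get];
    return true;
}

/* Remove and return the oldest message. */
bool msg_read(MsgQueue *q, DosMsg *out)
{
    if (!msg_peek(q, out)) return false;
    q->get = next_slot(q->get);
    return true;
}
```

Because the messages are copied by value, nothing in the queue depends on either VM's address space, which is why pointers have to travel by the separate memory-copy and selector calls.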
Finally, there are the DOS calls Win-Link intercepts. Win-Link intercepts all EXEC calls. On these calls Win-Link determines whether the program being executed is a Windows application. If so, Win-Link checks it against a list of files to execute as DOS apps. If the application is not on that list, Win-Link executes the program from Windows instead of from DOS.
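The EXEC decision reduces to two tests. The following C sketch assumes the caller has already determined whether the file looks like a Windows executable; `route_exec`, `ExecTarget`, and the case-insensitive comparison are illustrative choices, not Win-Link's actual code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RUN_IN_DOS, RUN_IN_WINDOWS } ExecTarget;

/* Case-insensitive comparison, since DOS file names ignore case. */
static bool same_name(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b;
}

/* Decide where an EXEC'd program should run. */
ExecTarget route_exec(const char *prog, bool is_windows_exe,
                      const char *const *exceptions, size_t n_exceptions)
{
    assert(prog != NULL);
    if (!is_windows_exe)
        return RUN_IN_DOS;                    /* ordinary DOS program */
    for (size_t i = 0; i < n_exceptions; i++)
        if (same_name(prog, exceptions[i]))
            return RUN_IN_DOS;                /* on the exception list */
    return RUN_IN_WINDOWS;                    /* hand it to Windows */
}
```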
The exception list is there for two reasons. There is no way to differentiate between bound OS/2 applications and Windows applications, so any bound OS/2 app must be on the exception list. Also, some applications have a complete DOS app as their Windows DOS-stub program, and you may wish to run the DOS stub.
Win-Link intercepts all output sent to LPT1 via int 17h. We do not intercept print I/O directly to the port, nor do we intercept printers on other ports. But all output written to LPT1 at the DOS level eventually gets to int 17h, so that output is intercepted. Printing a file via the PRINT command, or programmatically using PRINT's int 2Fh calls, is also intercepted. But printing a file is intercepted at the command level, so that just the file name is passed to Win-Link, which is much more efficient than intercepting the calls to int 17h. When a file prints, the file name is the job name in the Windows print spooler. When a program prints through int 17h, the name of the program is the name of the job. When a program uses the Win-Link call to name a print job, it will be the name the program gave it.
EXEC, TERMINATE, and some other calls are tracked to determine the name of the program running in the DOS box. This name is then matched against a list, which expands predefined names to different names. For example, B changes to Brief. This name is then set as the title of the window for the DOS box.
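The name expansion is a plain table lookup. A minimal sketch in C, assuming an exact-match table and assuming that a program with no entry keeps its own name as the title; `TitleMap` and `expand_title` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *prog;    /* name seen on EXEC      */
    const char *title;   /* title to show instead  */
} TitleMap;

/* Return the title for a program name. */
const char *expand_title(const char *prog, const TitleMap *map, size_t n)
{
    assert(prog != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(prog, map[i].prog) == 0)
            return map[i].title;
    return prog;         /* assumption: unlisted names are used unchanged */
}
```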
The primary data structure is called VMDATA and is in both win_link.h and win_ipc.inc. One of these structures exists for each VM, including the system VM. These are set up in a linked-list so that Win-Link or Win-IPC can walk through all the VM's instances of the structure. This gives the VxD full access, with little effort, to any VM data. In addition, the first element is a LDT selector:offset that points to the structure, valid in the system VM. This provides an easy way for Win-IPC to give Win-Link a pointer to the structure for any VM.
In general, Win-IPC or Win-Link changes values in this structure and then sends a message to the other telling it what to look at in the structure. Following is a brief description of each element of the structure.
    VmData      struc
    VmLdt       dd  0
    VmHandle    dd  0
    PrnSem      dd  0
    MsgSem      dd  0
    TimeHdl     dd  0
    LinkNext    dd  0
    LdtNext     dd  0
    pPsp        dd  0
    MsgGet      dd  0           ; Next Message to read
    MsgPut      dd  0           ; Next free spot
    MsgLast     dd  0           ; Next == free -> empty
    PrntNum     dw  0
    hDc         dw  0
    iPrnErr     dw  0
    iStr        dw  0
    hWnd        dd  0
    Bufcnt      dw  0
    PrntBuf     db  SIZE_PRNT_BUF dup (0)
    sxtra       db  0, 0
    MsgArr      db  ((size DosMsg) * MAX_DOS_MSG) dup (?)
    sPsp        db  9 dup (0), 0
    sProgName   db  31 dup (' '), 0
    sTitle      db  80 dup (0)
    sExec       db  129 dup (0), 0
    sCmdLine    db  129 dup (0), 0
    sPrntStr    db  129 dup (0), 0
    VmData      ends

When creating the system VM, we call _Allocate_Device_CB_Area to reserve space for the VMDATA structure in each VM's control block, and we hook each interrupt we need to intercept (17h, 21h, 23h, 24h, and 2Fh).

    BeginProc WinIpc_Sys_Critical_Init

            ; Allocate per-VM instance data
            VMMcall _Allocate_Device_CB_Area, <size VmData, 0>
            cmp     eax, 0
            je      short scil0             ; No memory - do nothing
            mov     [CbvmData], eax
            and     [SysFlags], not MEM_OFF

            ; Set up the System VM data
            mov     eax, ebx
            call    GetVmData
            mov     [esi.VmHandle], ebx
            VMMcall Get_Sys_VM_Handle       ; Save System VM
            mov     [SysVM], ebx

    scil0:  clc
            ret

    EndProc WinIpc_Sys_Critical_Init

    BeginProc WinIpc_Dev_Init

            ; Hook interrupts
            mov     eax, 17h                ; Sit on int 17
            mov     esi, OFFSET32 WinIpc_Int_17
            VMMcall Hook_V86_Int_Chain

            mov     eax, 21h                ; Sit on int 21
            mov     esi, OFFSET32 WinIpc_Int_21
            VMMcall Hook_V86_Int_Chain

            mov     eax, 23h                ; Sit on int 23
            mov     esi, OFFSET32 WinIpc_Int_23
            VMMcall Hook_V86_Int_Chain

            mov     eax, 24h                ; Sit on int 24
            mov     esi, OFFSET32 WinIpc_Int_24
            VMMcall Hook_V86_Int_Chain

            mov     eax, 2Fh                ; Sit on int 2F
            mov     esi, OFFSET32 WinIpc_Int_2F
            VMMcall Hook_V86_Int_Chain

            clc
            ret

    EndProc WinIpc_Dev_Init
For each additional VM created we do a little more. First, we need to initialize VMDATA by performing the following steps:
    BeginProc WinIpc_VM_Create

            test    [SysFlags], MEM_OFF
            jnz     vmcl0                   ; Turned off - do nothing

            ; Get & zero-fill VmData
            mov     eax, ebx
            call    GetVmData
            mov     edi, esi
            xor     eax, eax
            mov     ecx, (size VmData) / 4
            rep     stosd

            ; Init VmData
            mov     [esi.VmHandle], ebx
            lea     ecx, [esi].MsgArr
            mov     [esi].MsgGet, ecx
            mov     [esi].MsgPut, ecx
            mov     eax, MAX_DOS_MSG - 1
            mov     edx, size DosMsg
            mul     edx
            add     eax, ecx
            mov     [esi].MsgLast, eax

            ; Get the PSP (via SDA) location
            Push_Client_State
            VMMcall Begin_Nest_Exec
            mov     [ebp.Client_AX], 5D06h
            mov     eax, 21h
            VMMcall Exec_Int
            movzx   edx, [ebp.Client_DS]
            shl     edx, 4
            movzx   eax, [ebp.Client_SI]
            add     edx, eax
            add     edx, [ebx.CB_High_Linear]
            add     edx, 10h
            mov     [esi].pPsp, edx
            VMMcall End_Nest_Exec
            Pop_Client_State

            ; Set up Msg semaphore
            xor     ecx, ecx
            VMMcall Create_Semaphore
            jc      vmcl0
            mov     [esi].MsgSem, eax

            ; Set up Prn semaphore
            VMMcall Create_Semaphore
            jc      vmcl0
            mov     [esi].PrnSem, eax

            ; Create LDT so Win-Link can access structure
    SizeVmData EQU (size VmData)
            VMMcall _BuildDescriptorDWORDs, <esi, SizeVmData, RW_Data_Type, \
                    D_GRAN_BYTE, 0>
            VMMcall _Allocate_LDT_Selector, <[SysVm], edx, eax, 1, 0>
            rol     eax, 16
            mov     [esi.VmLdt], eax

            ; Build linked-list
            ; Do this last so we are only in the list if 1) We are all
            ; filled in & 2) We were able to set up semaphores, etc.
            mov     edi, esi
            mov     eax, [SysVm]
            call    GetVmData
            mov     eax, [esi.LinkNext]
            mov     [edi.LinkNext], eax
            mov     [esi.LinkNext], edi
            mov     eax, [esi.LdtNext]
            mov     [edi.LdtNext], eax
            mov     eax, [edi.VmLdt]
            mov     [esi.LdtNext], eax

            ; ... see next listing

            ; We now send a msg to set the title. We do this here
            ; so we get the message before another VM is created; we
            ; just grab the first free VM in Windows.
            PostPm  [SysVm], [SysWnd], MSG_DOS_TITLE, 0, [edi.VmLdt]

    vmcl0:  clc
            ret

    EndProc WinIpc_VM_Create

At this point we still have two remaining tasks before we are fully ready for the new VM. The easy one is setting the title of the DOS box. The difficult one is determining the handle of the window for this VM, and we can't set the title until we know the hWnd. Be warned that the method covered here is not completely foolproof. It seems to work about 98 percent of the time. It runs into trouble largely when a bunch of DOS boxes are launched in a row, so that we have several hVM <-> hWnd resolutions pending.

            ; We now send a msg to set the title.
            PostPm  [SysVm], [SysWnd], MSG_DOS_TITLE, 0, [edi.VmLdt]

    vmcl0:  clc
            ret

    EndProc WinIpc_VM_Create
If we are running under Windows 3.1, we set a hook and post a message back to Win-IPC. We cover what this does in a moment because it has no effect until we complete the rest of the processing in DosTitle.
We next walk through all windows whose class is tty (the class of all DOS box windows). We also check that this window is a DOS box, although this may be merely paranoia on my part. Once we find a tty window, we check whether it is already registered to another of our VMs. If so, we keep looking. If not, we assume that it belongs to this VM. If you are following along in the code, you'll notice that we also passed in a NULL text string, so at this point we may be about to set the title on a potentially wrong hWnd. However, because the string is NULL, the text will not be set. DosTitle actually is two separate functions wrapped in one for historical reasons. I originally attempted to get the hWnd by other means.

    // We walk the list of top windows looking for one of class tty
    hWnd = FindWindow ("tty", NULL);
    while (hWnd)
        {
        // See if it's a DOS box
        GetClassName (hWnd, sBuf, 5);
        if (lstrcmp (sBuf, "tty"))
            goto NextWin;
        if (! IsWinOldApTask (GetWindowTask (hWnd)))
            goto NextWin;

        // See if we already have this one
        fpVmOn = fpVmData;
        do
            {
            if (fpVmOn->hWnd == hWnd)
                goto NextWin;
            if (! (fpVmOn = fpVmOn->LdtNext))
                break;
            } while (fpVmOn != fpVmData);

        // We have it!
        fpVmData->hWnd = hWnd;
        return;     // done; skip the failure path below

        // Get the next window
    NextWin:
        hWnd = GetWindow (hWnd, GW_HWNDNEXT);
        }

    // We failed
    fpVmData->hWnd = (HWND) -1;
Now we have a hVM == hWnd pairing. But this was merely a guess. This is where the hook comes in. We have hooked all messages being sent to any window; a very expensive hook but quite necessary. We then posted a message to Win-IPC. The message causes _MsgShellEvent in Win-IPC to be called. In _MsgShellEvent we make a VxD call to SHELL_Event. SHELL_Event allows us to send a Windows message to a DOS box window by specifying its hVM, which we do know. So we post a message with a constant in uMsg to ID the message and the selector to VmData (we make use of the fact that all our LDT pointers have an offset of 0) in wParam. In our hook filter proc we look for any message with this message number. When we see it, we set that hWnd as the hWnd for our VM. Finally, we post a message to ourselves. When we receive this message we remove the hook. Once the hook is removed, we no longer impose any overhead on the system. We have the correct hWnd unless someone else sent the same message number between the time we installed the hook and the time SHELL_Event got the message back to us. We now have our hWnd and are initialized for the VM just created.
    // ... in DosTitle
    if (uVer >= 0x030A)
        {
        if (iHookCnt++ == 0)
            hhookMsgFilterHook = SetWindowsHook (WH_GETMESSAGE,
                                     (HOOKPROC) lpfnMsgFilterProc);
        PostMessage (hDlg, MSG_EVENT_ON, 0, fpVmData->VmHandle);
        }

    // ... In main DlgProc
    case MSG_EVENT_ON:
        dShellEvent (lParam);
        break;

    case MSG_EVENT_OFF:
        if (--iHookCnt == 0)
            UnhookWindowsHook (WH_GETMESSAGE, (HOOKPROC) lpfnMsgFilterProc);
        break;

    // HOOK Call-backs
    LRESULT CALLBACK __export __loadds MsgFilterFunc (int nCode, WORD wParam,
                                                      DWORD lParam)
    {
        if (((MSG __far *) lParam)->message == 0x6969)
            HandleEvent (lParam);
        return 0;
    }

    void __loadds HandleEvent (long lParam)
    {
        VMDATA far *pVmData;

        pVmData = PTR (((MSG __far *) lParam)->wParam, 0);
        if (! SelOk ((void far *) pVmData, sizeof (VMDATA)))
            return;
        pVmData->hWnd = ((MSG __far *) lParam)->hwnd;
        PostMessage (hMainDlg, MSG_EVENT_OFF, 0, 0);
    }

    ; WIN_IPC.386 dShellEvent
    MsgShellEvent proc

            push    ebx
            mov     eax, [ebp.Client_ECX]
            mov     ebx, eax
            call    GetVmData
            mov     ecx, 6969h
            movzx   eax, word ptr [esi.VmLdt + 2]
            xor     esi, esi
            xor     edx, edx
            VxDcall SHELL_Event
            pop     ebx
            ret

    MsgShellEvent endp
The implementation of this is simple enough that no code is shown, but can be found in the source code on the book's disk. However, it is a critical piece; you can't talk to a DOS app until you know its hVM, and the Register/Query calls provide a means to determine the hVM.
    ; EAX: uMsg    = Message to post to Win-IPC
    ; ECX: lParam1 = first long param
    ; EDX: lParam2 = second long param
    CallVxd MACRO   uMsg, lParam1, lParam2
            mov     ecx, lParam1
            mov     edx, lParam2
            mov     eax, uMsg
            xor     ebx, ebx
            call    dword ptr [WinIpcAddr]
            ENDM

This gets a message to WinIpc_PM_API_Proc in Win-IPC. A jump table is used to go to the handler for the specific message passed in. Because this is also the entry point other Windows applications use to call Win-IPC, the procedure first checks that the passed-in message is a legitimate number for a Windows application. It does this by using the message number as an offset into the table PmOkTable, which is a table of bytes. If a byte is 0, then the message is not legal; if it is -1, it is legitimate. At the same time the procedure also makes sure that the message number is within the range of handled messages.

    BeginProc WinIpc_PM_API_Proc

            movzx   eax, [ebp.Client_AX]
            cmp     eax, [NumPmOk - 1]
            ja      short pap20             ; Out of range - error
            and     eax, 0FFh
            mov     al, [PmOkTable + eax]
            cmp     al, 0
            je      short pap20             ; Not legal from Windows - error

    pap10:  call    DefMsgProc
            ret

    pap20:  mov     [ebp.Client_AX], ERR_UNKNOWN_MSG
            ret                             ; exit error

    EndProc WinIpc_PM_API_Proc

DefMsgProc is even simpler. It first looks to see if Win-IPC is on. If the flag MEM_OFF in SysFlags is set, then Win-IPC is turned off. In this case, DefMsgProc does nothing and refuses to handle any messages. Otherwise, DefMsgProc jmps to the appropriate handler from MsgDispTable. This is a quick way to get to the correct message handler. We jump instead of call because that saves us a ret when we are done.

    DefMsgProc proc

            test    [SysFlags], MEM_OFF     ; Are we running?
            jz      short dmp20
            mov     [ebp.Client_AX], ERR_NO_VM_MEMORY
            ret

    dmp20:  movzx   eax, [ebp.Client_AX]    ; Get the message
            jmp     [MsgDispTable + 4 * eax]

    DefMsgProc endp
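The validate-then-dispatch pattern used by these two procedures can be sketched in C with a byte table and an array of function pointers. Everything here is illustrative: the three handlers, their return values, and the error code are placeholders, not Win-IPC's real messages.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_UNKNOWN_MSG (-1)     /* placeholder, not the real error value */
#define NUM_MSG 3

typedef int (*MsgHandler)(long p1, long p2);

/* Placeholder handlers standing in for the real message procedures. */
static int on_register(long a, long b) { (void)a; (void)b; return 10; }
static int on_query(long a, long b)    { (void)a; (void)b; return 20; }
static int on_internal(long a, long b) { (void)a; (void)b; return 30; }

/* 0 = not callable from a Windows app, 0xFF (-1 as a byte) = allowed. */
static const unsigned char pm_ok[NUM_MSG]    = { 0xFF, 0xFF, 0x00 };
static const MsgHandler    dispatch[NUM_MSG] = { on_register, on_query,
                                                 on_internal };

/* Entry point for Windows callers: range check, permission check, dispatch. */
int pm_api(unsigned msg, long p1, long p2)
{
    if (msg >= NUM_MSG)  return ERR_UNKNOWN_MSG;   /* outside the table   */
    if (pm_ok[msg] == 0) return ERR_UNKNOWN_MSG;   /* not legal from here */
    assert(dispatch[msg] != NULL);
    return dispatch[msg](p1, p2);
}
```

The range check has to come before either table is indexed; otherwise an arbitrary message number from an application would be used as an offset into the jump table.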
Whichever function is called then executes and returns. When it returns, the return goes back to Win-Link, with the return value passed in AX.
There is still one minor concern. We do not want to call PostMessage if the Windows VM is in the critical section or has interrupts off. This is not an absolute requirement, but it is part of being a good neighbor. Taking the time to post a message while a Windows app (or DLL, more likely) is in a critical section can delay that application enough to cause it major harm -- and bring the system down. We also have to wait until the Windows VM can be scheduled. An immediate call would go into the current VM, which quite possibly is not the Windows VM. Therefore, when LinkMsgProc returns, the message may not yet have been posted. So we have to get a temporary structure to hold our message until we can post it to Windows. Otherwise, the message could be overwritten as soon as LinkMsgProc returned.
    LinkMsgProc proc

            ; Get a VmMsg struct
    dmp70:  mov     cx, [VmMsgAlloc]
            mov     edi, [VmMsgOff]
            mov     eax, [ebp.Client_EBX]
    dmp80:  xchg    [edi.Handle], eax
            cmp     eax, 0
            je      short dmp90
            xchg    [edi.Handle], eax
            add     edi, size VmMsg
            loop    dmp80
            mov     [ebp.Client_AX], ERR_MSG_FULL
            ret

            ; edi points to a VMMSG struct
    dmp90:  mov     eax, [ebp.Client_EAX]   ; save message
            mov     [edi.lParam1], eax
            mov     eax, [ebp.Client_EDX]
            mov     [edi.lParam2], eax
            mov     eax, [ebp.Client_ECX]
            mov     [edi.lWndMsg], eax
            mov     [edi.VmOff], esi

            ; lets generate the call-back
            mov     eax, Low_Pri_Device_Boost
            push    ebx
            mov     ebx, [esi.VmHandle]
            mov     ecx, PEF_Wait_For_STI or PEF_Wait_Not_Crit
            mov     edx, edi
            mov     esi, OFFSET32 HandleCallBack
            VMMcall Call_Priority_VM_Event
            pop     ebx

            mov     edx, [edi.Rtn]          ; rtn regs & Client_regs
            mov     [ebp.Client_EDX], edx
            mov     eax, ERR_NONE
            mov     [ebp.Client_EAX], eax
            ret

    LinkMsgProc endp
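The slot search at dmp80 claims a free VMMSG structure by exchanging the caller's handle into the Handle field and checking whether the old value was 0. A similar idea can be sketched in C11; note that this sketch swaps in a compare-and-swap from `<stdatomic.h>` in place of the listing's pair of xchg instructions, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_SLOTS 4                  /* assumed pool size */

typedef struct {
    atomic_ulong  handle;            /* 0 = free, otherwise the owner's handle */
    unsigned long lparam1, lparam2, wndmsg;
} VmMsgSlot;

/* Claim the first free slot for `owner`; returns NULL when the pool is full. */
VmMsgSlot *slot_claim(VmMsgSlot *pool, size_t n, unsigned long owner)
{
    assert(pool != NULL && owner != 0);
    for (size_t i = 0; i < n; i++) {
        unsigned long expected = 0;
        /* Succeeds only if the slot was free; a busy slot is left untouched. */
        if (atomic_compare_exchange_strong(&pool[i].handle, &expected, owner))
            return &pool[i];
    }
    return NULL;
}

/* Mark a slot as available again. */
void slot_release(VmMsgSlot *slot)
{
    assert(slot != NULL);
    atomic_store(&slot->handle, 0);
}
```

Claiming the slot in a single atomic step matters because the callback that frees it can run at a different time from the code that allocates it.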
This code has not necessarily posted a message. It has merely saved it in the structure and set up a call to HandleCallBack. If the Windows VM had interrupts on and was not in a critical section, HandleCallBack was called before Call_Priority_VM_Event returned. Either way, HandleCallBack has been, or shortly will be, executed.
HandleCallBack first pushes the client state so it can modify the VM's registers. It then moves the message values to the client registers on the stack. These are the values the registers will have when Resume_Exec is called. HandleCallBack then sets up a nested execution call to _dMsgProc in Win-Link. This code makes a call to PostMessage to get the message posted. On return from Resume_Exec, the message is posted, assuming that there was room in the queue for it. Finally, the VMMSG struct is marked as free and the client registers are taken off the stack. When HandleCallBack returns, it has returned the VM to its original state.
    HandleCallBack proc

            Push_Client_State
            mov     edi, edx                ; Get pointer
            mov     eax, [edi.lParam1]      ; Set up registers
            mov     [ebp.Client_EAX], eax
            mov     eax, [edi.lParam2]
            mov     [ebp.Client_EDX], eax
            mov     eax, [edi.lWndMsg]
            mov     [ebp.Client_ECX], eax
            mov     [ebp.Client_EBX], edi

            mov     edx, [SysCallBack]
            mov     cx, dx                  ; Call the sucker
            shr     edx, 16
            VMMcall Begin_Nest_Exec
            VMMcall Simulate_Far_Call
            VMMcall Resume_Exec
            VMMcall End_Nest_Exec

            mov     eax, [ebp.Client_EAX]   ; save rtn value
            mov     [edi.Rtn], eax
            mov     [edi.Handle], 0         ; Mark VmMsg avail
            Pop_Client_State
            ret

    HandleCallBack endp

    _dMsgProc proc far

            push    si
            push    ds
            push    bp
            push    0
            mov     bx, sp
            push    ax
            push    cx

            mov     ax, _DATA
            mov     ds, ax
            mov     cx, NUM_MSG
            mov     si, offset _DATA:MsgData
    mp10:   mov     ax, 0FFFFh
            xchg    ds:[si.InUse], ax
            cmp     ax, 0
            je      mp20
            add     si, size VXDMSG
            loop    mp10

            pop     cx
            pop     ax
            jmp     mp30

    mp20:   pop     cx
            pop     ax
            mov     dword ptr ds:[si.mWnd], ecx
            mov     dword ptr ds:[si.mwParam], eax
            mov     ds:[si.mlParam], edx
            mov     ds:[si.mEDI], ebx

            push    ds:[MainWnd]
            push    MSG_WIN_IPC
            push    0
            push    ds
            push    si
            call    PostMessage

    mp30:   add     sp, 2
            pop     bp
            pop     ds
            pop     si
            ret

    _dMsgProc endp
This pushes the message into the Windows message queue. We have to look at what happens when it pops out the other end.
For this we look at the function MainDlgProc in win_link.c. Again, we abbreviate it to show just the PostMessage code. We find that we post a plain old Windows message, so we go back into the message queue.

    case MSG_WIN_IPC:
        pVxdMsg = (VXDMSG _far *) lParam;

        // Lots of SendMessage code...

        PostMessage (pVxdMsg->hwnd, pVxdMsg->uMsg,
                     pVxdMsg->wParam, pVxdMsg->lParam);
        pVxdMsg->InUse = 0;
        break;

This is not necessarily the best way to handle a post, but it works.
The second addition to the code involves returning a value. The main reason to call SendMessage instead of PostMessage is that you need to know the return value from SendMessage. So we start with LinkMsgProc again. We add a semaphore, block on it after scheduling the event that calls HandleCallBack, and destroy the semaphore once we have unblocked. We create and destroy the semaphore on a per-message basis for two reasons. First, there can be multiple SendMessages, so we can't use a single semaphore. Second, a SendMessage is a pretty rare event, so the overhead is not a killer.
The handle to the semaphore is included in the message structure. The handle is needed by Win-Link to make a call back to Win-IPC, telling it to unblock that semaphore. We first check to see whether IPC is turned on or off. If it is turned off we do not accept any messages. Then we check to see whether we are sending a message from a Windows app to a Windows app. There is no reason for that to go through us, so we don't allow it. Next we get the VmData struct for the receiving VM. GetVmData returns a pointer to VmData in ESI. This also assures us that we are sending a message to a VM that exists.
We now check to make sure we have an address to call in the Windows VM to get to PostMessage. The flag IPC_OFF should be set if this is NULL, but I like to be paranoid in cases like this. We then go into the code we saw before to get a VMMSG struct. This struct holds our passed-in message parameters, the semaphore we use to block, and the return value from the SendMessage call. This data is allocated to this message until the semaphore is unblocked at the end of LinkMsgProc.
LinkMsgProc proc ; We have a message to post/send. ; We can't send a msg from Windows to Windows!! dmp40: test [SysFlags], IPC_OFF ; Are we running? jz short dmp50 mov [ebp.Client_AX], ERR_NO_WIN_APP ret dmp50: cmp ebx, [SysVm] ; Win Msg to WinMsg? jne short dmp60 cmp ebx, [ebp.Client_EBX] jne short dmp60 mov [ebp.Client_AX], ERR_WIN_TO_WIN ret dmp60: mov eax, [ebp.Client_EBX] ; Get destination VM call GetVm jc short dmp65 call GetVmData cmp [SysCallBack], 0 jne short dmp70 dmp65: mov [ebp.Client_AX], ERR_UNKNOWN_VM ret ; Get a VmMsg struct dmp70: mov cx, [VmMsgAlloc] mov edi, [VmMsgOff] mov eax, [ebp.Client_EBX] dmp80: xchg [edi.Handle], eax cmp eax, 0 je short dmp90 xchg [edi.Handle], eax add edi, size VmMsg loop dmp80 mov [ebp.Client_AX], ERR_MSG_FULL ret |
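For readers more at home in C, the slot-claiming loop at dmp80 can be sketched roughly as follows. This is only an illustration of the swap-and-check idea, not Win-IPC code: the names `Slot`, `slot_claim`, `slot_release`, and `NUM_SLOTS` are made up for the example, and C11 atomics stand in for the XCHG instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_SLOTS 4            /* hypothetical pool size */

typedef struct {
    atomic_uint handle;        /* 0 = free, otherwise the owner's handle */
    unsigned long param;       /* stand-in for the message payload */
} Slot;

static Slot pool[NUM_SLOTS];

/* Claim the first free slot for `owner`. Returns NULL if all are taken. */
Slot *slot_claim(unsigned owner)
{
    assert(owner != 0);        /* 0 is reserved to mean "free" */
    for (size_t i = 0; i < NUM_SLOTS; i++) {
        /* Swap our handle in; the old value says who held the slot. */
        unsigned prev = atomic_exchange(&pool[i].handle, owner);
        if (prev == 0)
            return &pool[i];   /* it was free, so it is ours now */
        /* It was taken: swap the previous owner's handle back in.
           Like the listing, this leaves a short window in which the
           slot shows the wrong owner. */
        atomic_exchange(&pool[i].handle, prev);
    }
    return NULL;
}

/* Mark a slot free again, as "mov [edi.Handle], 0" does. */
void slot_release(Slot *s)
{
    atomic_store(&s->handle, 0);
}
```

On a processor that has a compare-and-swap instruction, a single compare-exchange would avoid the swap-back step; the sketch keeps the two exchanges only to stay close to the listing.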
Here is where we start to differentiate because we are sending a message. First we create a semaphore, and this value is stored in our VMMSG structure. Following that, we set up the rest of the structure and then set up an event to call HandleCallBack, just as we did in PostMessage.
dmp90: test [ebp.Client_EAX], FLAG_SEND_MSG ; send? jz short dmp110 xor ecx, ecx ; Set up a semaphore VMMcall Create_Semaphore jnc short dmp100 mov [ebp.Client_AX], ERR_NO_SEMAPHORE ret dmp100: mov [edi.SendSem], eax dmp110: mov eax, [ebp.Client_EAX] ; save message mov [edi.lParam1], eax mov eax, [ebp.Client_EDX] mov [edi.lParam2], eax mov eax, [ebp.Client_ECX] mov [edi.lWndMsg], eax mov [edi.VmOff], esi ; lets generate the call-back mov eax, Low_Pri_Device_Boost push ebx mov ebx, [esi.VmHandle] mov ecx, PEF_Wait_For_STI or PEF_Wait_Not_Crit mov edx, edi mov esi, OFFSET32 HandleCallBack VMMcall Call_Priority_VM_Event pop ebx mov edx, [edi.Rtn] ; rtn regs & Client_regs |
The rest of the function is send-specific. The semaphore is blocked to stop LinkMsgProc from returning until after the semaphore is unblocked. In the meantime, before or after the semaphore is blocked, HandleCallBack calls Win-Link, which processes the message. When the message has been processed, Win-Link makes a call to Win-IPC, passing the semaphore and return value. This call in Win-IPC sets the return value in the VMMSG struct and clears the semaphore.
The end result of this is that when Wait_Semaphore returns, the return value of the SendMessage is in EDI.Rtn. All that is left to do is to destroy the semaphore, free up the VMMSG struct, and return the result from SendMessage.
Note that the value is returned in DX. AX is always the status returned from the call so that you can differentiate between a 1 returned from SendMessage and an error code of 1.
test [ebp.Client_EAX], FLAG_SEND_MSG ; send? jz short dmp130 dmp120: mov eax, [edi.SendSem] mov ecx, Block_Svc_Ints or Block_Enable_Ints VMMcall Wait_Semaphore ; block until sent mov eax, [edi.SendSem] VMMcall Destroy_Semaphore ; destroy it mov edx, [edi.Rtn] ; rtn regs & Client_regs mov [edi.Handle], 0 ; Mark VmMsg avail dmp130: mov [ebp.Client_EDX], edx mov eax, ERR_NONE mov [ebp.Client_EAX], eax ret LinkMsgProc endp |
So what happens differently in HandleCallBack? Nothing! There is a different code path for a SendMessage to a VM other than the system VM, but a SendMessage to the system VM is identical to a PostMessage. The same goes for _dMsgProc in Win-Link. Which brings us to MainDlgProc. I have shown the full code for handling a message from Win-IPC, but the part executed when we send a message from Win-IPC to Win-Link is the part that creates the SendDlg struct and passes that. So all the messages we send to Win-Link are sent from the MSG_WIN_IPC case back to MainDlgProc, with all the variables passed in a struct that lParam points to. The return value to be passed back is set in that struct. At this point the message has been processed, but we still need to go back to Win-IPC, pass the return value, and clear the semaphore. So when the internal SendMessage call returns, we call dPostMsg, passing the return value and a pointer to the VMMSG struct that is holding the sent message on the Win-IPC side. This call sets the return value in VMMSG and clears the semaphore. Finally, the VXDMSG struct is freed.
case MSG_WIN_IPC: pVxdMsg = (VXDMSG _far *) lParam; if (pVxdMsg->wFlags & 0x0001) { if (pVxdMsg->hWnd != hDlg) { lRtn = SendMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, pVxdMsg->wParam, pVxdMsg->lParam); }else{ SendDlg.lParam = pVxdMsg->lParam; SendDlg.wParam = pVxdMsg->wParam; SendDlg.lRtn = 0; SendMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, 0, (long) (LPVOID) &SendDlg); lRtn = SendDlg.lRtn; } dPostMsg (_MSG_SEND_RTN, lRtn, pVxdMsg->lEDI); }else{ PostMessage (pVxdMsg->hWnd, pVxdMsg->uMsg, pVxdMsg->wParam, pVxdMsg->lParam); } pVxdMsg->InUse = 0; break; |
The message MSG_SEND_RTN works its way through the dispatching code and ends up at _MsgSendRtn. _MsgSendRtn checks to make sure the passed-in pointer is good, then places the return value in VMMSG and clears (signals) the semaphore. This causes the Wait_Semaphore in LinkMsgProc to return, completing the original SendMessage call.
_MsgSendRtn proc ; Check edi (points to VmMsg, good handle) mov edi, [ebp.Client_EDX] mov ecx, [VmMsgOff] cmp edi, ecx jb short msr10 mov eax, size VmMsg mul [VmMsgAlloc] add eax, ecx cmp edi, eax jae short msr10 cmp ebx, [edi.Handle] jne short msr10 ; Its ok - save the rtn value & turn semaphore off mov eax, [ebp.Client_ECX] mov [edi.Rtn], eax mov eax, [edi.SendSem] VMMcall Signal_Semaphore msr10: ret _MsgSendRtn endp |
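The pointer check at the top of _MsgSendRtn is worth a second look, because the pointer comes from another VM and cannot be trusted. A rough C equivalent, under the assumption of a simple fixed table, might look like this. `VmMsgEntry` and `entry_is_valid` are hypothetical names, and the alignment test is an extra check that the listing does not perform.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t handle;    /* VM that owns this entry, 0 when free */
    uint32_t rtn;       /* return value to hand back */
} VmMsgEntry;           /* hypothetical stand-in for VmMsg */

/* Returns 1 if address `p` is an entry inside `table` owned by `caller`. */
int entry_is_valid(const VmMsgEntry *table, size_t count,
                   uintptr_t p, uint32_t caller)
{
    uintptr_t base = (uintptr_t)table;
    uintptr_t end  = base + count * sizeof(VmMsgEntry);

    if (p < base || p >= end)
        return 0;                         /* outside the table */
    if ((p - base) % sizeof(VmMsgEntry) != 0)
        return 0;                         /* not on an entry boundary
                                             (extra check, not in the listing) */
    /* Only now is it safe to look at the entry itself. */
    return ((const VmMsgEntry *)p)->handle == caller;
}
```

The order matters: the range test comes first so that the handle is never read through a pointer that lies outside the table.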
We have thus sent a message from Win-IPC to Win-Link. It is definitely not a trivial undertaking, but it is not terribly complicated or convoluted either.
The only time this comes up is when you post a message in an interrupt handler in your VxD and while the message is being posted, another interrupt comes in so that you post again. Using PostMessage under these conditions causes the first message to disappear. This is not a good idea anyway -- you would probably max out the message queue under such a design.
You need to make sure that any memory touched by Win-Link while in _dMsgProc is locked down in physical memory. Again, because we can call this at any time, the code and data used cannot be swapped out to disk. If it were, you would either use whatever happened to be there instead or fault, depending on the state of the system at the time. That is why Win-Link locks down its code and data when it starts. It is not necessary to lock the entire program down (I did it because Win-Link is small model), but it is critical that every byte of code and data that you touch at this time is locked down.
If the DOS app receiving the message has already called MsgRead, it is blocked on a semaphore. We signal the semaphore to free it up. If MsgRead has not been called yet, the message waits in the queue until it is. Because we already signaled the semaphore, when MsgRead calls Wait_Semaphore it returns instantly.
Finally, we boost the execution priority of the receiving VM. The theory behind this is that this VM has been waiting for the message. We now want to give it a boost so it can get started processing the message. Depending on your application, you may prefer not to include this step. It gives you a faster response but makes Windows freeze for a moment. In the following code fragment I have removed the part that handles messages posted to a Windows app. This is the code that handles posting to a DOS app.
MsgPost proc mp10: call GetVmData ; ESI = VmData of dest VM ; Do we have room in the message array??? ; NO if Write == Read-1 OR (Read == MsgArr ; AND Write == last element) mov eax, size DosMsg mov edi, [esi].MsgGet sub edi, eax cmp edi, [esi].MsgPut ; Write == Read-1? je short mp90 ; YES lea edx, [esi].MsgArr cmp [esi].MsgGet, edx ; Read == MsgArr? jne short mp20 ; NO mov eax, [esi].MsgLast cmp eax, [esi].MsgPut ; AND Write == last je short mp90 ; OK we can store it mp20: mov edi, [esi].MsgPut mov ax, [ebp.Client_CX] mov [edi].dWnd, ax mov ax, word ptr [ebp.Client_ECX+2] mov [edi].dMsg, ax mov ax, [ebp.Client_DI] mov [edi].dwParam, ax mov eax, [ebp.Client_EDX] mov [edi].dlParam, eax ; inc free, roll it if past end add [esi].MsgPut, size DosMsg mov eax, [esi].MsgLast cmp [esi].MsgPut, eax jbe short mp30 lea eax, [esi].MsgArr mov [esi].MsgPut, eax ; Signal read we have a message mp30: mov eax, [esi.MsgSem] VMMcall Signal_Semaphore ; Boost the execution priority of the guy we call ; so it gets the message ASAP. mov eax,Low_Pri_Device_Boost VMMcall Adjust_Exec_Priority mp40: mov [ebp.Client_EAX], ERR_NONE ret mp90: mov [ebp.Client_EAX], ERR_MSG_FULL ret MsgPost endp |
We now have a message in the queue for a DOS VM. There are two calls to handle getting the message to the DOS app. The first call is MsgPeek. When a DOS app calls MsgPeek, it gets a copy of the next message in the queue. If there is no message, Release_Time_Slice is called and a no-message error is returned. This call assumes MsgPeek is only called in an idle loop. If you make this call to check for an abort message, you might want to remove the Release_Time_Slice.
MsgPeek proc mov eax, ebx call GetVmData ; ESI = VmData of VM ; do we have one? mov edi, [esi].MsgGet cmp edi, [esi].MsgPut je short mpk90 ; Lets fill it in mov eax, [ebp].Client_EDX call V86ToPmPtr mov esi, edi mov edi, eax mov ecx, (size DosMsg) / 2 rep movsw mov [ebp.Client_EAX], ERR_NONE ret mpk90: VMMcall Release_Time_Slice mov [ebp.Client_EAX], ERR_NO_MSG ret MsgPeek endp |
The second call is MsgRead. Although MsgPeek will return the contents of the next message, MsgRead actually removes a message from the queue. The first step is to wait on the semaphore, which blocks until MsgPost is called, putting a message in the queue and signaling the semaphore. Next, the message is filled in and the pointer MsgGet is incremented to the next location in the queue. The message is then returned.
MsgRead proc mov eax, ebx call GetVmData ; ESI = VmData of VM ;Lets block if there are no messages mov eax, [esi.MsgSem] mov ecx, Block_Svc_Ints or Block_Enable_Ints VMMcall Wait_Semaphore ;Lets fill it in mr10: Save <esi> mov eax, [ebp].Client_EDX call V86ToPmPtr mov esi, [esi].MsgGet mov edi, eax mov ecx, (size DosMsg) / 2 rep movsw Restore <esi> ; inc next, roll it if past end add [esi].MsgGet, size DosMsg mov eax, [esi].MsgLast cmp [esi].MsgGet, eax jbe short mr20 lea eax, [esi].MsgArr mov [esi].MsgGet, eax mr20: mov [ebp.Client_EAX], ERR_NONE ret MsgRead endp |
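Taken together, MsgPost, MsgPeek, and MsgRead implement a small circular queue in which MsgPut chases MsgGet around a fixed array. The C sketch below models just that bookkeeping, with array indices in place of pointers and without the semaphore or the VM address translation. All names (`MsgQueue`, `QueuedMsg`, `queue_post`, and so on) and the queue size are assumptions made for the example.

```c
#include <stddef.h>

#define QUEUE_SLOTS 4   /* hypothetical size; one slot is always left empty */

typedef struct {
    unsigned short wnd, msg, wparam;
    unsigned long  lparam;
} QueuedMsg;            /* stand-in for DosMsg */

typedef struct {
    QueuedMsg arr[QUEUE_SLOTS];
    size_t get;         /* next entry to read, like MsgGet */
    size_t put;         /* next entry to write, like MsgPut */
} MsgQueue;

/* Advance an index, rolling over past the last element. */
static size_t next_index(size_t i)
{
    return (i + 1 == QUEUE_SLOTS) ? 0 : i + 1;
}

int queue_empty(const MsgQueue *q) { return q->get == q->put; }

/* Full when one more write would make put catch up with get. This one
   test covers both cases spelled out in the MsgPost comments. */
int queue_full(const MsgQueue *q) { return next_index(q->put) == q->get; }

/* Like MsgPost: returns 0 on success, -1 when the queue is full. */
int queue_post(MsgQueue *q, const QueuedMsg *m)
{
    if (queue_full(q))
        return -1;
    q->arr[q->put] = *m;
    q->put = next_index(q->put);
    return 0;
}

/* Like MsgPeek: copies the next message but leaves it queued. */
int queue_peek(const MsgQueue *q, QueuedMsg *out)
{
    if (queue_empty(q))
        return -1;
    *out = q->arr[q->get];
    return 0;
}

/* Like MsgRead without the blocking: copies and removes the message. */
int queue_read(MsgQueue *q, QueuedMsg *out)
{
    if (queue_peek(q, out) != 0)
        return -1;
    q->get = next_index(q->get);
    return 0;
}
```

Keeping one slot permanently empty is what lets "get equals put" mean empty without a separate count, which is why a queue declared with four slots holds at most three messages.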
MsgPost proc mov eax, [ebp.Client_EBX] call GetVm jc short mp05 cmp eax, [SysVm] jne short mp10 PostPm [SysVm], [ebp.Client_CX], [ebp.Client_ECX+2],\ [ebp.Client_DI], [ebp.Client_EDX] mov [ebp.Client_EAX], ERR_NONE ret mp05: mov [ebp.Client_EAX], ERR_UNKNOWN_VM ret mp10: ; post to DOS app code ... MsgPost endp |
V86ToPmPtr proc Save <edx> cmp ebx, [SysVm] jne short vtp20 Save <ecx> push eax shr eax, 16 VMMcall _SelectorMapFlat <[SysVm], EAX, 0> pop edx cmp eax, -1 je short vtp10 and edx, 0FFFFh add eax, edx Restore <ecx,edx> clc ret vtp10: Restore <ecx,edx> stc ret vtp20: movzx edx, ax shr eax, 12 and eax, 0FFFF0h add eax, edx add eax, [ebx.CB_High_Linear] Restore <edx> clc ret V86ToPmPtr endp |
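The V86 branch of V86ToPmPtr (label vtp20) is plain real-mode address arithmetic plus the base at which the VM's memory is mapped. As a sketch in C, with the hypothetical parameter `high_linear` playing the role of CB_High_Linear:

```c
#include <stdint.h>

/* Convert a packed seg:off pointer (segment in the high word, offset in
   the low word) to a flat address: base + segment * 16 + offset. */
uint32_t v86_to_flat(uint32_t seg_off, uint32_t high_linear)
{
    uint32_t segment = seg_off >> 16;
    uint32_t offset  = seg_off & 0xFFFFu;
    return high_linear + (segment << 4) + offset;
}

/* The same computation written the way the listing does it: shifting
   right by 12 and masking leaves segment * 16 in one step. */
uint32_t v86_to_flat_asm_style(uint32_t seg_off, uint32_t high_linear)
{
    return high_linear + ((seg_off >> 12) & 0xFFFF0u) + (seg_off & 0xFFFFu);
}
```

Both functions give the same result for every input; the second only shows why the listing shifts by 12 rather than by 16 followed by 4.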
GetVm performs a very simple function. If the passed-in value in EAX is 0, GetVm returns the system VM in EAX. Otherwise, it leaves EAX alone, assuming it is the handle to a VM. In debug mode GetVm also validates the VM handle. This gives the rest of the code one place that maps a handle of 0 to the system VM and, in debug mode, checks that the handle is valid.
GetVm proc or eax, eax jnz short gv10 mov eax, [SysVm] gv10: Save <ebx> mov ebx, eax VMMcall Validate_VM_Handle Restore <ebx> ret GetVm endp |
This function is not affected by what VM is currently running. However, the memory at both ends of this copy had better be locked down. The error-checking code has been removed from the following to make the sample clearer.
MsgMemCopy proc Save <ebx> ; Get the params mov eax, [ebp.Client_EBX] call GetVm mov ebx, eax mov eax, [ebp.Client_ESI] call V86ToPmPtr mov esi, eax mov eax, [ebp.Client_EDX] call GetVm mov ebx, eax mov eax, [ebp.Client_EDI] call V86ToPmPtr mov edi, eax mov ecx, [ebp.Client_ECX] Restore <ebx> ; Copy the dwords Save <ecx> shr ecx, 2 rep movsd Restore <ecx> and ecx, 03h jz short mmc30 rep movsb mmc30: mov [ebp.Client_EAX], ERR_NONE ret MsgMemCopy endp |
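The copy at the end of MsgMemCopy moves as many whole dwords as it can and then mops up the remaining bytes. A minimal C sketch of the same split, assuming non-overlapping buffers (`copy_dwords_then_bytes` is a name invented for this example):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `len` bytes: whole 32-bit words first (the REP MOVSD part),
   then the 0 to 3 leftover bytes (the REP MOVSB part). */
void copy_dwords_then_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = len >> 2;      /* same as "shr ecx, 2" */
    size_t tail  = len & 3u;      /* same as "and ecx, 03h" */

    while (words--) {
        uint32_t w;
        memcpy(&w, s, sizeof w);  /* memcpy keeps unaligned access defined */
        memcpy(d, &w, sizeof w);
        s += 4;
        d += 4;
    }
    while (tail--)
        *d++ = *s++;
}
```

In portable C a plain memcpy would do the whole job; the split is shown only because it mirrors what the listing does with the byte count in ECX.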
MsgMemLdt first verifies that the VM where the memory is located is good. It then calls V86ToPmPtr to get the flat offset of the memory location. It next tests the limit. Because we are returning a 16:16 pointer, we have to ensure that the limit does not exceed 64K. Finally, we verify that the VM that will use the returned LDT pointer is legit.
We use the pair of calls _BuildDescriptorDWORDs and _Allocate_LDT_Selector to create an LDT pointer from the passed-in parameters.
MsgMemLdt proc Save <ebx> mov eax, [ebp.Client_EDX] call GetVm jc short mm130 mov ebx, eax ; Get flat address mov eax, [ebp.Client_EDI] call V86ToPmPtr jc short mm130 mov esi, eax ; Get the limit mov edi, [ebp.Client_ECX] test edi, 0FFF00000h jnz short mm130 mov eax, [ebp.Client_EBX] call GetVm jc short mm130 mov ebx, eax ; Create it VMMcall _BuildDescriptorDWORDs <esi, edi, RW_Data_Type, D_GRAN_BYTE, 0> VMMcall _Allocate_LDT_Selector <ebx, edx, eax, 1, 0> Restore <ebx> mov [ebp.Client_AX], ax ret mm130: Restore <ebx> mov [ebp.Client_EAX], 0 ret MsgMemLdt endp |
Freeing an LDT is even easier. Again, because a VM handle of 0 needs to be converted we call GetVm. Then we call _Free_LDT_Selector to free the LDT.
Whether you use LDT or GDT selectors, the free call is critical. There are only 8K GDT selectors in the entire system and only 8K LDT selectors in each VM. If you have a leak where you allocate selectors and don't free them, you will bring the system to its knees sooner or later.
MsgMemFreeLdt proc Save <ebx> mov eax,[ebp.Client_EBX] call GetVm jc short mf120 mov ebx, eax movzx edx, word ptr [ebp.Client_EDX] VMMcall _Free_LDT_Selector <ebx, edx, 0> Restore <ebx> mov [ebp.Client_EAX], 0 ret mf120: Restore <ebx> mov [ebp.Client_EAX], ERR_UNKNOWN_VM ret MsgMemFreeLdt endp |
This is painfully easy. The DOS app sends a message to Win-Link, which calls DosExec in Win-Link. This call passes a file to exec and a run parameter. This file can be a DOS or Windows app. Win-Link will then call WinExec to launch the app. The app is launched in the mode specified. If the mode is SW_HIDE, the app is launched but you will not even see an icon for it.
void DosExec (HWND hDlg,LONG lParam) { BYTE _far *fpsFile; SENDDLG _far *fpSendDlg; VMDATA _far *fpVmData; fpSendDlg = (SENDDLG _far *) lParam; fpVmData = (VMDATA _far *) fpSendDlg->lParam; fpsFile = fpVmData->sExec; if (WinExec (fpsFile, fpSendDlg->wParam) <= 32) fpSendDlg->lRtn = 0L; else fpSendDlg->lRtn = -1L; *(fpVmData->sExec) = 0; } |
And the initial thought I always had was: What do you think is running? Granted, this was partially a problem with wording -- I have seen some applications that sense whether Windows is running and, if it is, give you a better message. But still, Windows is running and I want it to start up my Windows app, even if I type the command from the DOS command line. So we will now go through this process.
The first step is to intercept the int 21h call to exec a DOS program. (Note: all the following code fragments show just the necessary parts to catch the DOS exec. I have also removed the special case code for win.com.) If you type win at the DOS command prompt, Win-Link has some special handling. This is a remnant from the Windows 3.0 days, when Windows would let you start Windows in a DOS box.
The first thing we do is open the .EXE file. We have to be careful here because if share is loaded and this is a Windows EXE that is already running, we will get a sharing violation. So we also have an int 24h hook to catch the violation. This stops it from appearing in the DOS box.
BeginProc WinIpc_Int_24 mov eax, ebx call GetVmData test [esi.wFlags], I24_ON jz short i24_10 mov [ebp.Client_AL], 3 clc ret i24_10: stc ret EndProc WinIpc_Int_24 |
If the open fails, we check the return code. If it is a sharing violation, we pass it on to Win-Link to try to exec because the odds are pretty good that it's a Windows app. If it is a different error we pass it on to DOS for a try.
If the open succeeded, we next read the header to see if it is a New Executable format file. Unfortunately, all this tells us is that it is not a real-mode program; there is no way to tell whether it's a Windows or an OS/2 application.
If the file does not have the NE signature, we pass it on to DOS. Up to this point our hit has been minimal. Yes we did an open, but DOS will open the same file again so all we did is get it in the cache sooner.
BeginProc WinIpc_Int_21 i21_70: cmp [ebp.Client_AX], 4B00h ; EXEC, func 0? jne i21_160 test [SysFlags], EXEC_OFF ; EXEC off? jnz i21_160 Push_Client_State VMMcall Begin_Nest_Exec push edi ; local vars push esi ; local vars sub esp, size DiStk mov edi, esp mov [edi.hVm], ebx movzx edx, [ebp.Client_ES] ; get offset to cmd line shl edx, 4 movzx eax, [ebp.Client_BX] add edx, eax add edx, [ebx.CB_High_Linear] movzx eax, word ptr [edx+4] shl eax, 4 movzx edx, word ptr [edx+2] add edx, eax add edx, [ebx.CB_High_Linear] mov [edi.pCmd], edx movzx edx, [ebp.Client_DS] ; get offset to file name shl edx, 4 movzx eax, [ebp.Client_DX] add edx, eax add edx, [ebx.CB_High_Linear] mov [edi.pFn], edx or [esi.wFlags], I24_ON mov eax, 3D20h ; open file VxDint 21h jnc short i21_110 ; NO error on open and [esi.wFlags], not I24_ON cmp al, 5 ; file locked? jne i21_150 ; NO - leave it to DOS jmp i21_120 i21_110: and [esi.wFlags], not I24_ON mov [edi.hFile], ax mov ebx, eax mov eax, 3F00h ; read MZ mov ecx, 2 lea edx, [edi]+RwBuf VxDint 21h jc i21_140 cmp word ptr [edi.RwBuf], 5A4Dh jne i21_140 mov eax, 4200h ; seek to offset xor ecx, ecx mov edx, 3Ch VxDint 21h jc i21_140 mov eax, 3F00h ; read offset mov ecx, 4 lea edx, [edi]+RwBuf VxDint 21h jc i21_140 movzx edx, word ptr [edi.RwBuf] ; Seek to new EXE movzx ecx, word ptr [edi.RwBuf+2] mov eax, 4200h VxDint 21h jc i21_140 mov eax, 3F00h ; read NE mov ecx, 2 lea edx, [edi]+RwBuf VxDint 21h jc i21_140 cmp word ptr [edi.RwBuf], 454Eh jne i21_140 mov bx, [edi.hFile] ; close file mov eax, 3E00h VxDint 21h |
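The header test in the listing boils down to three reads: the MZ signature at offset 0, the dword at offset 3Ch, and the NE signature at the offset that dword gives. The following C sketch performs the same test on a file image already in memory, which is an assumption made to keep the example self-contained; the listing reads the file through int 21h instead. The function names are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian reads, so the sketch does not depend on host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the image has an MZ header whose field at 0x3C points
   to an "NE" signature, otherwise 0. */
int has_ne_header(const uint8_t *image, size_t size)
{
    if (size < 0x40 || rd16(image) != 0x5A4D)   /* "MZ" */
        return 0;

    uint32_t ne_off = rd32(image + 0x3C);       /* offset of the new header */
    if (ne_off > size - 2)
        return 0;                               /* would read past the end */

    return rd16(image + ne_off) == 0x454E;      /* "NE" */
}
```

As with the assembly version, a positive result only says that the file carries a New Executable header, not which system it was built for.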
Ok, we may have a Windows app, so we copy the file name and command line into our structure and send a message to Win-Link. Win-Link will return a 0 if it launched the program successfully. In that case we return, eating the interrupt call. This will return the DOS box back to the DOS prompt.
If Win-Link returns non zero, then it could not launch the app. In that case we return with carry set and the interrupt is passed on to DOS. DOS then attempts to launch the application.
i21_120: mov ebx, [edi.hVm] mov eax, ebx call GetVmData push edi push esi mov edi, [edi].pFn lea esi, [esi].sExec ; copy fn xchg esi, edi mov ecx, 128 / 4 rep movsd pop esi pop edi push edi push esi mov edi, [edi].pCmd ; copy command line lea esi, [esi].sCmdLine xchg esi, edi mov ecx, 128 / 4 rep movsd pop esi pop edi mov edx, [esi.VmLdt] Save <edi,esi> SendPm [SysVm], [Syswnd], MSG_WIN_EXEC, 0, edx Restore <esi,edi> mov ebx, [edi.hVm] cmp edx, 0 ; WinExec OK? jne short i21_150 add esp, size DiStk pop esi pop edi VMMcall End_Nest_Exec Pop_Client_State clc ; return done ret i21_140: mov bx, [edi.hFile] ; close file mov eax, 3E00h VxDint 21h i21_150: mov ebx, [edi.hVm] add esp, size DiStk pop esi pop edi VMMcall End_Nest_Exec Pop_Client_State i21_160: stc ; return continue chain ret EndProc WinIpc_Int_21 |
When the message is sent to Win-Link, it processes it in ExecFile. We first look to see if this file is in a list of files that are not to be launched. This list includes bound OS/2 apps, apps that have a real DOS program as their stub, and any other EXEs that have an NE header that you do not wish to launch. These files are tracked by file name only, not the full path. So we compare just the file name.
We then find the drive and directory of the file being executed. We know this is the directory the file is in because command.com walks the path itself and, for each attempt, passes EXEC the fully qualified file name to run. We set that drive and directory as the default drive and directory. This way an application is run from its own directory. Experience has shown me that this is the best drive to use.
Now we're ready to try it. We call LoadModule because we only want to launch Windows apps and not DOS apps. A DOS app should stay in its own VM. LoadModule can only exec a Windows app. LoadModule gives us a return value which we then pass back as our return value. Obtaining this return value is the reason we needed a SendMessage instead of a PostMessage.
Finally, we restore the default drive and directory.
void ExecFile (HWND hDlg,WORD wParam,DWORD lParam) { BYTE _far *fpsBase, _far *fpsFile; SENDDLG _far *fpSendDlg; VMDATA _far *fpVmData; FARPROC lpProcAbout; int iNum; LOADMOD LoadMod; WORD wCmdShow[2]; BYTE sBuf[FILE_MAX+2], sCwd[FILE_MAX+2]; fpSendDlg = (SENDDLG _far *) lParam; fpVmData = (VMDATA _far *) fpSendDlg->lParam; fpsBase = fpVmData->sExec; fpsFile = fStrEnd (fpsBase); while ((fpsFile >= fpsBase) && (*fpsFile != '\\') && (*fpsFile != '/') && (*fpsFile != ':')) fpsFile--; fpsFile++; /* see if in our no-no list */ fpSendDlg->lRtn = 0L; if ((iNum = (int) SendDlgItemMessage (hDlg, DLG_NO_EXEC, LB_FINDSTRING, 0, (LONG) fpsFile)) >= 0) { SendDlgItemMessage (hDlg, DLG_NO_EXEC, LB_GETTEXT, iNum, (LONG) (LPSTR) sBuf); if (!fStriCmp (sBuf, fpsFile)) fpSendDlg->lRtn = 0xFFFFFFFFL; } if (fpSendDlg->lRtn != 0) { *(fpVmData->sExec) = 0; return; } /* Save the current dir & set the current dir to the dir the program is in. After the exec - we restore the cur dir */ _getdcwd (toupper (*fpsBase) - 'A' + 1, sCwd, FILE_MAX); fStrnCpy (sBuf, fpsBase, FILE_MAX); iNum = Min (fpsFile - fpsBase, FILE_MAX); if ((iNum > 3) && (sBuf[iNum-1] =='\\')) iNum--; sBuf [iNum] = 0; /* Set default drive & dir */ _dos_setdrive (toupper (sBuf[0]) - 'A' + 1, (unsigned *) &iNum); _chdir (sBuf); fpsFile = fpVmData->sCmdLine; *(fpsFile + (*fpsFile) + 1) = 0; LoadMod.wEnvSeg = 0; LoadMod.dwRes = 0; LoadMod.lpCmdLine = fpsFile + 1; LoadMod.lpCmdShow = wCmdShow; wCmdShow[0] = 2; wCmdShow[1] = SW_SHOWNORMAL; if (LoadModule (fpsBase, &LoadMod) <= (HINSTANCE) 32) fpSendDlg->lRtn = -1L; *(fpVmData->sExec) = 0; /* Back to the old drive & dir */ _dos_setdrive (toupper (sCwd[0]) - 'A' + 1, (unsigned *) &iNum); _chdir (sCwd); } |
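The loop at the top of ExecFile that backs up from the end of the path to find the bare file name can be isolated into a small helper. This is a sketch in plain C with near pointers, and `dos_base_name` is a name made up for it; unlike the loop in the listing, it never steps the pointer below the start of the string.

```c
#include <string.h>

/* Return a pointer to the file-name part of a DOS path: the text after
   the last '\\', '/', or ':' (or the whole string if none is present). */
const char *dos_base_name(const char *path)
{
    const char *p = path + strlen(path);

    /* Walk back while the character before p is not a separator. */
    while (p > path && p[-1] != '\\' && p[-1] != '/' && p[-1] != ':')
        p--;
    return p;
}
```

For example, `dos_base_name("C:\\WINDOWS\\NOTEPAD.EXE")` points at `NOTEPAD.EXE`, which is the string that would be compared against the list of files not to launch.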
The DOS box title tracking is fairly straightforward. Whenever Win-IPC believes that the running application has changed it sets the title to the string found in the memory arena for the currently selected PSP.
The one weird thing here is you can't track the set PSP call because there are usually TSRs or device drivers that temporarily change it.
You will find that the title constantly changes as you sit at the DOS prompt.
The print intercepting is probably the most complicated part of the entire program. It involves intercepting various interrupts, time-outs, and its own printer driver. A thorough discussion of it could be a book by itself. And, unfortunately, I do not have permission to include the sources to the raw printer driver. However, all RAW.DRV does is properly implement the PASSTHROUGH escape command; most 3.1 printer drivers also do that.
The rest of Win-Link is pretty dull. There is the code to handle the dialog box and the other details of a standard Windows program. I hope by explaining how a commercial program works, I have provided a different viewpoint into VxDs than you get from sample programs. I also hope that if you ever have to write a program like this that the code presented here will give you a head start. I can tell you from experience that attacking this for the first time is not the best way to learn about VxDs.