`through VMM-Based “Out-of-the-Box”
`Semantic View Reconstruction
`
`
`XUXIAN JIANG
`North Carolina State University
`XINYUAN WANG
`George Mason University
`and
`DONGYAN XU
`Purdue University
`
`An alarming trend in recent malware incidents is that they are armed with stealthy techniques
`to detect, evade, and subvert malware detection facilities of the victim. On the defensive side, a
`fundamental limitation of traditional host-based antimalware systems is that they run inside the
`very hosts they are protecting (“in-the-box”), making them vulnerable to counter detection and
`subversion by malware. To address this limitation, recent solutions based on virtual machine (VM)
`technologies advocate placing the malware detection facilities outside of the protected VM (“out-of-
`the-box”). However, they gain tamper resistance at the cost of losing the internal semantic view of
`the host, which is enjoyed by “in-the-box” approaches. This poses a technical challenge known as
`the semantic gap.
`In this article, we present the design, implementation, and evaluation of VMwatcher—an “out-
`of-the-box” approach that overcomes the semantic gap challenge. A new technique called guest
`view casting is developed to reconstruct internal semantic views (e.g., files, processes, and ker-
`nel modules) of a VM nonintrusively from the outside. More specifically, the new technique casts
`semantic definitions of guest OS data structures and functions on virtual machine monitor (VMM)-
`level VM states, so that the semantic view can be reconstructed. Furthermore, we extend guest
`view casting to reconstruct details of system call events (e.g., the process that makes the system
`
`This work was supported in part by the US National Science Foundation (NSF) under Grants
`CNS-0716376, CNS-0716444 and CNS-0546173. Any opinions, findings, and conclusions or recom-
`mendations expressed in this material are those of the authors and do not necessarily reflect the
`views of the NSF.
`Authors’ addresses: Xuxian Jiang, Department of Computer Science, North Carolina State Uni-
`versity, 890 Oval Drive, Raleigh, NC 27695; email: jiang@cs.ncsu.edu. Xinyuan Wang, Depart-
`ment of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030;
`email: xwangc@gmu.edu. Dongyan Xu, Department of Computer Science and CERIAS, Purdue
`University, 305 N. University Street, West Lafayette, IN 47907; email: dxu@cs.purdue.edu.
`Permission to make digital or hard copies of part or all of this work for personal or classroom use
`is granted without fee provided that copies are not made or distributed for profit or commercial
`advantage and that copies show this notice on the first page or initial screen of a display along
`with the full citation. Copyrights for components of this work owned by others than ACM must be
`honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
`to redistribute to lists, or to use any component of this work in other works requires prior specific
`permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn
`Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org.
© 2010 ACM 1094-9224/2010/02-ART12 $10.00
`DOI 10.1145/1698750.1698752 http://doi.acm.org/10.1145/1698750.1698752
`
`ACM Transactions on Information and System Security, Vol. 13, No. 2, Article 12, Publication date: February 2010.
`
`
`
`
`
call as well as the system call number, parameters, and return value) in the VM, enriching the
semantic view. With the semantic gap effectively narrowed, we identify three unique malware
detection and monitoring capabilities: (i) view comparison-based malware detection and its
demonstration in rootkit detection; (ii) “out-of-the-box” deployment of off-the-shelf antimalware
software with improved detection accuracy and tamper resistance; and (iii) nonintrusive system
call monitoring for malware and intrusion behavior observation. We have implemented a
proof-of-concept VMwatcher prototype on a number of VMM platforms. Our evaluation experiments
with real-world malware, including elusive kernel-level rootkits, demonstrate VMwatcher’s
practicality and effectiveness.
Categories and Subject Descriptors: D.4.6 [Operating Systems]: Security and Protection—Invasive
software; K.6.5 [Management of Computing and Information Systems]: Security and
Protection
`General Terms: Security
`Additional Key Words and Phrases: Malware detection, rootkits, virtual machines
`ACM Reference Format:
Jiang, X., Wang, X., and Xu, D. 2010. Stealthy malware detection and monitoring through
VMM-based “out-of-the-box” semantic view reconstruction. ACM Trans. Info. Syst. Sec. 13, 2,
Article 12 (February 2010), 28 pages.
`DOI = 10.1145/1698750.1698752 http://doi.acm.org/10.1145/1698750.1698752
`
`1. INTRODUCTION
Internet malware (e.g., rootkits, worms, and bots) is becoming increasingly
stealthy and elusive: It tries to hide its presence from detection facilities
and even detects and subverts any existing antimalware software in the
compromised system. For example, a detailed analysis of an Agobot variant [Agobot
2004] has revealed that the malware contains malicious logic to detect and
remove more than 105 antivirus processes in the victim machine.
The threat described above is partly attributed to a fundamental
limitation on the defensive side: Most host-based antimalware systems are
installed and executed inside the very hosts that they are monitoring and
protecting (Figure 1(a)). Although such “in-the-box” deployment provides an
antimalware system with a native, semantic-rich view of the host, it also
makes the antimalware system visible, tangible, and potentially subvertable to
advanced malware residing in the host.
`To address this problem, there have recently been a number of solutions
`[Dunlap et al. 2002; Garfinkel and Rosenblum 2003; Joshi et al. 2005] that ad-
`vocate placing the intrusion detection facilities outside of the (virtual) machine
`being monitored. Based on virtual machine technologies [Barham et al. 2003;
`Dike 2002], such an “out-of-the-box” approach significantly improves the tam-
`per resistance of intrusion detection facilities. A virtual machine (VM) achieves
`strong isolation and confines processes running inside the VM such that, even
`if they are compromised by malware, it will be hard, if not impossible, to com-
`promise systems outside of the VM.
`However, a dilemma exists in switching from the in-the-box approach to
`the out-of-the-box approach: It is well known that there exists a “semantic
`gap” [Chen and Noble 2001] between the view of the VM from the outside and
`the view from the inside—the latter being seen by the traditional, in-the-box
`
`
`Fig. 1. Malware detection in traditional “in-the-box” approach and in VMwatcher approach.
`
`antimalware systems. For example, instead of seeing semantic-level objects,
`such as processes, files, and kernel modules, we only see memory pages, reg-
`isters, and disk blocks from outside the VM, making out-of-the-box malware
`detection difficult. In other words, the out-of-the-box approach gains tamper
`resistance at the cost of losing the internal semantic view of the host enjoyed
`by the in-the-box approaches.
`The previously described dilemma motivates us to explore the possibility
`of gaining the advantages of both camps, namely enabling tamper-resistant
`malware detection without losing the semantic view. In this article, we present
`the design, implementation, and evaluation of VMwatcher—a VMM-based, out-
`of-the-box approach that overcomes the semantic gap challenge. More specifi-
`cally, VMwatcher instantiates the general virtual machine introspection (VMI)
`[Garfinkel and Rosenblum 2003] methodology in a nonintrusive manner, so
`that it can inspect the low-level VM states and events without perturbing the
VM’s execution. A new technique called guest view casting is developed to
systematically reconstruct the VM’s internal semantic view (e.g., files, directories,
processes, and kernel-level modules) for out-of-the-box malware detection.
Furthermore, we extend guest view casting to reconstruct details of system call
events in the VM (e.g., the calling process as well as the system call
number, parameters, and return value). The new technique is based on the key
observation that the guest OS of a VM provides all necessary semantic
definitions of guest OS data structures, functions, and system calls to construct
the VM’s semantic view. As such, we can cast these definitions on the
VMM-level observations and externally derive the semantic view of the target VM
(Figure 1(b)).
VMwatcher enables new malware detection and monitoring capabilities that
were previously difficult or impossible to achieve. In this article, we identify and
demonstrate three such capabilities: (i) view comparison-based stealthy malware
detection, which compares a VM’s semantic views obtained from both inside and
outside to detect possible discrepancies; (ii) out-of-the-box execution of
unmodified, off-the-shelf antimalware software with improved detection accuracy,
which is an extreme test of VMwatcher’s semantic gap-narrowing technique and,
interestingly, further enables cross-platform malware scanning, where antimalware
software developed for one platform can be readily used for another platform;
and (iii) nonintrusive system call monitoring in a production or honeypot VM,
which elevates the tamper resistance of malware behavior observation and
experimentation.
`
`
We have implemented a VMwatcher prototype on a number of VMM platforms
and evaluated it with a collection of real-world malware instances (e.g.,
kernel- and user-level rootkits). Experiments with these elusive rootkits
demonstrate VMwatcher’s unique capability of view comparison-based malware
detection. The VMwatcher prototype also supports out-of-the-box deployment of a
variety of off-the-shelf antimalware software such as Symantec AntiVirus and
Microsoft Windows Defender.
`The rest of this article is organized as follows: Section 2 presents the design
`of VMwatcher, followed by the implementation details in Section 3. We present
`evaluation results in Section 4 and discuss possible limitations in Section 5.
`Section 6 discusses related work, and Section 7 concludes this article.
`
`2. VMWATCHER OVERVIEW
`
`2.1 Design Goals and Assumption
`Figure 1 illustrates the key difference between the traditional in-the-box
`approach and the VMwatcher approach for malware detection. VMwatcher
`achieves stronger tamper resistance by moving malware monitoring facili-
`ties out of the VM being monitored. VMwatcher is based on two key enabling
`techniques: (i) nonintrusive VM introspection for the procurement of low-level
`(VMM-level) VM states and system call events, without deploying any facility
`inside the VM (Section 2.2.1) and (ii) guest view casting for external reconstruc-
`tion of VM internal semantic view (Section 2.2.2). VMwatcher has the following
`three design goals:
`
`—First, VMwatcher should not perturb the system state of the target VM. This
`will prevent VMwatcher from affecting the normal execution of the VM and
`causing adverse side effects (e.g., system inconsistency [Joshi et al. 2005]) in
`the VM. This goal is realized by our technique for nonintrusive inspection and
`analysis of low-level VM observations. Nonintrusiveness also makes it hard
`for internal malicious processes to infer (external) VMwatcher activities.
—Second, VMwatcher should significantly narrow the semantic gap such that
the same malware detection system that runs inside the VM can also run
outside of the VM. As will be shown, this goal is critical to the new malware
detection capabilities. The goal is realized by our guest view casting technique
for external reconstruction of the VM semantic view. Based on the reconstructed
view, antimalware systems can perform file or memory scanning operations
as if they were inside the VM.1
`—Third, VMwatcher should be generic and applicable to a number of exist-
`ing VMMs. Currently there exist two mainstream virtualization approaches:
`full virtualization and paravirtualization. Full virtualization (as in VMware
`[VMware 2008] and QEMU [Bellard 2005]) transparently supports legacy
`OSs without modifying the guest OS code; while paravirtualization (as in
`
1 We need to point out that some hooking-based features of antimalware systems are hard to support
by VM introspection. Certain high-level events (e.g., Windows API calls or hooks), which are of
interest to some antivirus software, may not be captured from low-level VMM observations.
`
`
`Xen [Barham et al. 2003] and User-Mode Linux [Dike 2002]) is less trans-
`parent as it needs to modify the guest OS source code. VMwatcher aims at
`supporting VMMs in both categories.
We also note that different VMMs choose to implement VMs at different levels,
imposing varying complexity on VMwatcher. More specifically, the lower
the virtualization level, the wider the semantic gap it creates and,
consequently, the greater the challenge for VMwatcher to bridge the semantic gap.
For example, because of its system call-level virtualization, User-Mode Linux
(UML) preserves much of the semantic information (e.g., processes) and thus
leads to a much narrower semantic gap than VMware, Xen, and QEMU.
—Assumption on trusted VMM. In this article, we assume a trusted VMM that
achieves VM isolation: A malware instance may compromise arbitrary entities
and facilities inside the VM, including the guest OS kernel itself. However, it
cannot break out of the VM and corrupt the underlying VMM. This assumption
is based on the observation that the code base of a VMM is much smaller
`and more stable than the legacy OS code. Furthermore, the VMM provides
`a more limited interface (which can be further hardened and validated) to
`untrusted VMs in the form of virtualized underlying physical resources. We
`note that this assumption is consistent with that of many other VM-based
`security research efforts [Dunlap et al. 2002; Garfinkel et al. 2003; Garfinkel
`and Rosenblum 2003; Joshi et al. 2005; Koju et al. 2005]. We will discuss
`possible attacks (e.g., VM fingerprinting) in Section 5.
`
`2.2 Enabling Techniques
`2.2.1 Nonintrusive Virtual Machine Introspection. VMwatcher follows the
`VM introspection methodology to capture low-level VM states and events exter-
`nally. For open-source VMMs such as Xen, QEMU, and UML, we develop VM
introspection extensions to obtain the full VM state, which includes the VM’s
registers, memory, and disk, and to capture system calls made by processes in the
`VM. To achieve nonintrusiveness, we follow the principle of passive, read-only
`observation without inflicting any influence on the VM—this is important, as
`such an influence would lead to undesirable consequences such as inconsistency
`in the VM’s system state or perturbation in the VM’s execution.
For closed-source VMMs, we have only limited access to VMM-level
observations. For example, with Microsoft Virtual PC, we are not able to read VM
registers (e.g., the control register CR3) or monitor virtual interrupts. Without
a VMM’s source code, VMwatcher has to rely on whatever low-level VM state
abstraction is exposed by the VMM. Details of our nonintrusive VM introspection
technique will be presented in Section 3.1.
`
`2.2.2 Guest View Casting. Given the VMM-level observations of a running
`VM, our second technique, guest view casting, will externally reconstruct the
`internal semantic view of the VM. We observe that the guest OS data structure
`definitions (e.g., files and directories) and function semantics (e.g., semantics of
`file system drivers) can be used as “templates” to interpret low-level VM states.
`As such, we can cast the guest data structure and function definitions on the
`
`
`VMM-level VM observations to derive the VM’s semantic view. For example,
`given a “live” virtual disk of a running VM, the guest functions, such as guest
`device drivers and related file system drivers, allow us to reconstruct semantic
`information such as files and directories from the “raw” bits and bytes on the
`virtual disk. Similarly, by casting guest memory data structures (e.g., process
`control blocks) and functions to the physical memory pages allocated to a VM by
`the VMM, we can identify each individual running process with its attributes,
`such as process id and name, and derive semantic information about each loaded
`kernel module inside the VM.
`Guest view casting further performs high-fidelity restoration of seman-
`tic objects, so that the restored objects can be presented to an antimalware
`system in exactly the same way as inside the VM. For example, Tripwire
`[Kim and Spafford 1994] assumes a standard UNIX-like file system layout and
`calculates the checksums of files and directories to identify possible changes;
`McAfee VirusScan examines file directories and attempts to spot any exist-
`ing malware in these directories. As such, guest view casting needs to further
`“package” the objects (e.g., files and directories) in the reconstructed semantic
`view and seamlessly present these objects to the antimalware system in their
`native, manipulable form.
`
`2.2.3 Guest System Call Reconstruction. Guest view casting reconstructs
`a VM’s semantic state. For malware monitoring and detection, it is also desir-
`able to capture and interpret the system call events that occur inside the VM.
`To this end, we extend the guest view casting technique to enable guest system
`call reconstruction. More specifically, processes (including the malicious ones)
`inside the VM make system calls by executing system call instructions (e.g.,
`sysenter/sysexit). Such an instruction will be captured by the VMM. Our VM
`introspection technique will then be invoked to further acquire low-level con-
`text information relevant to the system call (e.g., register values and memory
`contents). This context information will be interpreted by our extended guest
`view casting technique, using system call semantics as the casting “templates.”
`Guest system call reconstruction generates detailed system call information, in-
`cluding the process making the system call and the system call number, param-
`eters, and return value. The reconstruction of guest system calls is performed
`in real time and from outside the VM, which improves the tamper resistance
`of existing system call-based monitoring and detection systems.
`
`2.3 New Malware Detection and Monitoring Capabilities
`VMwatcher enables a number of useful malware detection and monitoring ca-
`pabilities. The first capability is view comparison-based detection of elusive
`malware. We have seen an increasing number of elusive malware instances
`that hide themselves (including the related files and processes) by subverting
`antimalware processes running inside the system. For view comparison, we cor-
`roborate an internal view (generated from inside the VM) with an external view
`(generated from outside the VM by VMwatcher) of the same objects of inter-
`est and detect the existence of hidden malware based on any view discrepancy
`exhibited. We note that view comparison can be performed either on the full
`
`
`semantic views of a VM, or on more focused, customized views (e.g., a list of
`files/processes satisfying a certain condition) generated by a malware detection
`function. As an example, running the ls command inside a Linux VM can pro-
`vide an internal view of those files under the current directory. With VMwatcher,
`we can run the same ls command outside of the VM and obtain an external view
`of the files under the same directory. Any difference between the two ls results
`will immediately lead to the detection of hidden files.
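
The view comparison just described can be sketched in a few lines. The following is only an illustrative sketch, not the VMwatcher implementation: the function name `compare_views` and the two directory listings are hypothetical, and the sketch assumes both views have already been reduced to lists of object names taken at about the same time.

```python
def compare_views(internal_view, external_view):
    """Compare two views of the same objects (e.g., file names in a directory).

    Entries present only in the external view are candidates for objects
    hidden from software running inside the VM.
    """
    internal = set(internal_view)
    external = set(external_view)
    return {
        "hidden": sorted(external - internal),         # seen only from outside
        "internal_only": sorted(internal - external),  # seen only from inside
    }


# Hypothetical listings of the same directory (illustrative data only).
inside_vm = ["bin", "etc", "home", "tmp"]
outside_vm = ["bin", "etc", "home", "rootkit.ko", "tmp"]

result = compare_views(inside_vm, outside_vm)
print(result["hidden"])
```

In this sketch, any non-empty `hidden` list corresponds to the discrepancy the text describes. Entries that appear only in the internal view are reported separately because, under the assumptions of this sketch, they may simply reflect the two listings being taken at slightly different moments.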
`View comparison is not limited to a VM’s persistent states, such as disk
`files. It can also be performed on the VM’s volatile states such as running pro-
`cesses, loaded kernel modules, or even current statistics of a NIC device. We
`find this capability highly valuable, especially when detecting advanced kernel-
`level rootkits that hide running processes or kernel modules (Section 4.1). We
`point out that view comparison would be infeasible without VMwatcher: If sep-
`arated by a semantic gap, the internal and external views of a VM would not
`be directly comparable.
`The second capability is out-of-the-box execution of off-the-shelf antimalware
`systems, which improves the detection accuracy as well as tamper resistance
`of these systems. Moreover, since the guest OS of a VM may be different from
`the host OS, it is possible to perform cross-platform malware detection, where
`antimalware software developed for one platform (e.g., Windows) can be readily
`used for another platform (e.g., Linux). We will present one such experiment in
`Section 4.2.
`The third capability is nonintrusive system call monitoring for malware be-
`havior observation. With VMwatcher’s guest system call reconstruction tech-
`nique, it is possible to monitor system calls made by any process inside a VM,
`without installing any logging module in the VM or modifying the guest OS.
`This capability has direct applications in a number of scenarios, such as system
`call-based anomaly detection [Provos 2003], forensic analysis [King and Chen
`2003], and malware experimentation [Jiang and Xu 2004; Jiang et al. 2005].
`
`3. IMPLEMENTATION
`We have implemented a proof-of-concept VMwatcher prototype on top of four
`existing VMMs: VMware, QEMU, Xen, and UML.2 The prototype is able to
`reconstruct semantic views of a variety of VMs, including Windows 2000/XP,
`Red Hat Linux 7.2/8.0/9.0, and Fedora Core 1/2/3/4. In the following, we describe
`VMwatcher implementation in detail.
`
`3.1 VMM-Level State and Event Procurement
`As mentioned in Section 2.1, VMwatcher is designed to be generically applicable
`to various VMMs. Table I lists the VMM-level VM state and event observation
`offered by the four VMMs. The open-source VMMs—QEMU, Xen, and UML—
allow full access to low-level VM states and events. The closed-source VMware
typically exposes only the raw disk blocks and raw memory pages allocated to a
`
2 In our current prototype, guest system call reconstruction is supported on VMware and QEMU
because of their convenient system call instruction interception.
`
`
Table I. VMM-Level Observation of VM States and Events

                                                 Full virtualization    Paravirtualization
VMM-level observation                            VMware     QEMU        Xen        UML
Raw VM disk image                                √          √           √          √
Raw VM memory image                              √          √           √          √
Other VM hardware states (e.g., registers)       √          √           √          √
VM-related low-level events (e.g., interrupts)   √          √           √          √
`
`VM. We recently obtained the source code of VMware Workstation 6.0 through
`the VMware Academic Program. As a result, our current VMwatcher prototype
`is able to access full VM states and events.
`For the procurement of the VM’s raw disk and memory states, we need to
`access a VM’s raw disk and memory while they are being modified. To ensure
`state consistency, a VMM usually grants exclusive access (e.g., with a write
`lock) to virtualized resources (e.g., memory or disk) to a VM. As a result, it
`could prevent any external process from accessing them. More specifically, the
`file lock in Windows imposed by VMware is mandatory, which means that any
`external process such as VMwatcher is not able to read the locked file. There
`are two possible solutions: One is to follow the same approach taken by cur-
`rent system back-up software, which utilizes the volume shadow copy service
`(more details in [Microsoft 2003]) of Windows to access the locked files. In other
`words, we can create a shadow copy of the locked file and instruct VMwatcher
`to access the shadow copy for VM state procurement. The other approach is to
`develop a device driver that essentially subverts the host Windows kernel and
`allows VMwatcher to read the locked file directly through the device driver. Our
`prototype takes the first approach, which follows the nonintrusive principle, as
`it will not modify the locked file. On UNIX platforms, the file lock is advisory
`by default, which means that we can ignore the lock and just read the locked
`file.
The previously described strategy resolves the “read–write” conflict between
running VMs and VMwatcher when both are simultaneously accessing the
same disk file in the host domain. Note that for a running VM, a file
emulating its virtual disk means a root file system or a hard disk partition, while
for VMwatcher, it is considered the externally observable VM disk state. We also
note that VMware, QEMU with KQEMU [Bellard 2006] support, and UML
generate a temporary memory file to emulate the allocated raw physical memory
for a VM, which allows for external simultaneous access by VMwatcher.
However, Xen and QEMU without KQEMU support do not create such a memory file.
As such, we need to extend them to export a VM’s physical memory pages. In our
prototype, VMwatcher takes advantage of the libxc library [Xen 2004] to access
the memory of a Xen-based VM (or DomU) by mapping the VM’s physical memory
into VMwatcher’s own address space with the xc_map_foreign_range() API and then
reading the content through the mapped memory. Similarly, we build our own library for
QEMU, which essentially allows for external VMwatcher access to the allocated
physical memory pages for a QEMU-based VM.
`For the capture of a VM’s low-level events, we first leverage a VMM’s capabil-
`ity of intercepting system call instructions. Such capability is readily available
`
`
`in a number of VMMs, such as VMware and QEMU. Upon the capture of a
`system call instruction, VMwatcher will be invoked to collect relevant low-level
`context information about this system call, such as the values of registers (CR3,
`ESP, EAX, etc.) and certain memory contents in the virtual address space of
`the process that makes the system call. In particular, the procurement of the
`memory contents is guided by certain register values, which supply the virtual
`addresses of the corresponding memory contents. With the low-level register
`and memory states, we extend the guest view casting technique by casting sys-
`tem call semantics to the low-level context information. The extended technique
`will reconstruct detailed semantic information about the system call event
`(Section 3.2.3).
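
To make the casting of system call semantics onto low-level context more concrete, the following is a minimal sketch rather than the prototype's code. It assumes the conventional Linux i386 register usage (EAX holds the system call number; EBX, ECX, and EDX hold the first three arguments) and a tiny excerpt of the i386 system call numbering. The names `reconstruct_syscall`, `read_cstring`, and `fake_read_virtual`, as well as the register dictionary and the fake guest memory, are hypothetical stand-ins for whatever the VMM extension actually provides.

```python
# Small excerpt of the Linux i386 system call numbering:
# number -> (name, kinds of the first arguments).
SYSCALLS = {
    3: ("read", ("int", "ptr", "int")),
    4: ("write", ("int", "ptr", "int")),
    5: ("open", ("str", "int", "int")),
    6: ("close", ("int",)),
}
ARG_REGISTERS = ("ebx", "ecx", "edx")  # first three arguments on i386


def read_cstring(read_virtual, cr3, addr, limit=256):
    """Read a NUL-terminated string from the calling process's address space."""
    out = bytearray()
    for i in range(limit):
        b = read_virtual(cr3, addr + i, 1)
        if not b or b == b"\x00":
            break
        out += b
    return out.decode("ascii", errors="replace")


def reconstruct_syscall(regs, read_virtual):
    """Cast system call semantics onto captured registers and memory."""
    number = regs["eax"]
    name, kinds = SYSCALLS.get(number, ("sys_%d" % number, ()))
    args = []
    for reg, kind in zip(ARG_REGISTERS, kinds):
        value = regs[reg]
        if kind == "str":  # the register holds a virtual address, so follow it
            value = read_cstring(read_virtual, regs["cr3"], value)
        args.append(value)
    return {"cr3": regs["cr3"], "number": number, "name": name, "args": args}


# Hypothetical guest memory: page-table base -> (start address, bytes).
GUEST_MEMORY = {0x1000: (0x08048000, b"/etc/passwd\x00")}


def fake_read_virtual(cr3, addr, size):
    """Stand-in for a VMM-level read of guest virtual memory."""
    base, data = GUEST_MEMORY[cr3]
    start = addr - base
    return data[start:start + size]


event = reconstruct_syscall(
    {"cr3": 0x1000, "eax": 5, "ebx": 0x08048000, "ecx": 0, "edx": 0},
    fake_read_virtual,
)
print(event)
```

The sketch keeps the CR3 value in the reconstructed event because it identifies the address space, and hence the process, that issued the call. Return values are not covered here; they would require observing the matching return from the system call.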
`
`3.2 Semantic View Reconstruction
`Based on raw VM disk and memory states, VMwatcher uses the guest view
`casting technique to extract high-level semantic information (e.g., files and
`processes) and then present them seamlessly to antimalware software. In the
`following, we describe our casting methods for disk and memory state recon-
`struction and for guest system call reconstruction.
`
3.2.1 Disk State Reconstruction. It is straightforward to reconstruct the
semantic view from the raw virtual disk blocks of a VM if we understand how
files and directories are organized in the virtual disk. Particularly, our method
casts the corresponding device drivers and file system drivers of the guest OS
for disk semantic view reconstruction. For Linux, the casting is convenient as
the device drivers and file system drivers are likely part of the open-source
Linux kernel. However, this is not the case for Windows, because
the Windows kernel does not have the corresponding file system drivers for the
Linux root file systems. For the VMwatcher prototype, we have written Windows
device drivers to interpret Linux file systems (ext2/ext3 root file systems).
`
3.2.2 Memory State Reconstruction. It is a more challenging task to
reconstruct the semantic view of the volatile VM memory. The challenge is that
it requires accurate casting of guest memory data structures and functions to
understand how the physical memory pages are utilized. Note that the cast
guest memory data structures and functions are specific to a VM kernel.
`For ease of presentation, we focus our discussion on Linux for the current
`32-bit architecture (which implies an addressable memory range [0, 4G-1]). In
`Linux, the 4G memory space of a process is split between user space (the bottom
`3GB memory) and kernel space (the top 1GB memory), and the Linux kernel is
`mapped into every user-level process starting at virtual address 0xC0000000.
`Based on the physical memory layout, the first Linux kernel page (with virtual
`address 0xC0000000) is located in the first physical memory page (with phys-
`ical address 0x00000000). This provides the starting point for our guest view
`casting method: If we can access the memory file containing the raw memory of
`a running VM, offset 0 in the memory file will correspond to the current mem-
`ory address 0xC0000000 inside the VM. Next, we utilize the exported symbol
`
`
`Fig. 2. Guest view casting for volatile VM memory state (Linux).
`
information3 and apply guest view casting to identify and reconstruct guest
data structures of interest. Figure 2 shows how guest view casting is applied to
reconstruct the volatile kernel memory state of a Linux VM. More specifically,
every process in Linux is represented by a process control block (defined as
task_struct), and all running processes are linked by a doubly linked list. The
head of this list is kept in a structure called init_task_union, which is exported
and can be identified by querying the System.map file. Following this pointer,
we can further parse the raw memory image and traverse the doubly linked list
to reconstruct detailed semantic information about each running process (e.g.,
its page table and memory layout in the mm_struct data structure).
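
As a rough illustration of this traversal, the sketch below walks a process list inside a small synthetic memory image. It is not the prototype's code, and several things in it are assumptions: the field offsets (`OFF_NEXT`, `OFF_PID`, `OFF_COMM`) are hypothetical and do not match any particular kernel's task_struct layout, the image is fabricated by the helper `build_demo_image`, and only the forward pointer of the doubly linked list is followed. The one detail taken from the text is that kernel virtual address 0xC0000000 corresponds to offset 0 in the memory file.

```python
import struct

KERNEL_BASE = 0xC0000000  # kernel virtual address that maps to offset 0

# Hypothetical field offsets inside a process control block. Real offsets
# depend on the exact guest kernel version and build configuration.
OFF_NEXT, OFF_PID, OFF_COMM, COMM_LEN = 0, 4, 8, 16


def kva_to_offset(kva):
    """Translate a kernel virtual address to an offset in the memory file."""
    return kva - KERNEL_BASE


def read_u32(image, kva):
    """Read a little-endian 32-bit value at a kernel virtual address."""
    return struct.unpack_from("<I", image, kva_to_offset(kva))[0]


def list_processes(image, init_task_kva):
    """Walk the circular process list starting at the exported head."""
    procs, seen = [], set()
    task = init_task_kva
    while task not in seen:  # also guards against a corrupted, looping list
        seen.add(task)
        pid = read_u32(image, task + OFF_PID)
        off = kva_to_offset(task + OFF_COMM)
        comm = image[off:off + COMM_LEN].split(b"\x00", 1)[0].decode("ascii")
        procs.append((pid, comm))
        task = read_u32(image, task + OFF_NEXT)
    return procs


def build_demo_image():
    """Fabricate a tiny memory image holding three linked process entries."""
    image = bytearray(4096)
    tasks = [(0x100, 0, b"swapper"), (0x200, 1, b"init"), (0x300, 742, b"sshd")]
    for i, (off, pid, comm) in enumerate(tasks):
        next_off = tasks[(i + 1) % len(tasks)][0]
        struct.pack_into("<II16s", image, off, KERNEL_BASE + next_off, pid, comm)
    return bytes(image), KERNEL_BASE + 0x100


image, head = build_demo_image()
processes = list_processes(image, head)
print(processes)
```

In a real setting, the head address would come from the guest's System.map entry for init_task_union and the offsets from the guest kernel's own structure definitions, which is exactly the "template" role the text assigns to guest data structure definitions.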
`From the same memory image, we can also cast and reconstruct a number of
`other important kernel data structures (e.g., the system call table, the interrupt
`descriptor table, and the kernel module list) and identify the areas that contain
`core kernel instructions or instructions in the loadable kernel modules. It is
`worth mentioning that a user-level memory address (<3G) is usually a virtual
`memory address of a process running in the VM. Since VMwatcher is running
`outside of the VM, it needs to translate the virtual memory address into the
`corresponding physical memory address, which can then be used to access the
`VMM-level memory state.
`We note that existing hardware has the capability of automating the traver-
`sal of page table for address translation. However, it implicitly assumes that the
`
3 For some commercial OSs, the locations of these symbols may not be provided. VMwatcher will
perform a full scan of the raw memory and identify the symbols by looking for certain “signatures”
[bugcheck 2006] that are unique to kernel-level data structures of interest. For example, we use
0x03001b0000000000 to identify potential process instances in the Windows XP raw memory file.
`
`
`virtual address being translated belongs to the current process whose page table
`base is in CR3. For the virtual address of an arbitrary process, VMwatcher will
`have to externally identify and walk through the page table of that process to
obtain the corresponding physical address and read its content. The code for this
operation is shown in the following text in function vmwatcher_vir_mem_read32,
where addr is the virtual address to be accessed; task points to the process control
block (assuming the task_struct data structure in Figure 2) of the process of
interest; pde and pte refer to a page directory entry and a page table entry
associated with the process, resp