Copyright ⓒ 2017 KSII
`
`VirtAV: an Agentless Runtime Antivirus
`System for Virtual Machines
`
Hongwei Tang, Shengzhong Feng, Xiaofang Zhao and Yan Jin
1 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Shenzhen, 518055 - China
[e-mail: tanghongwei@ict.ac.cn, sz.feng@siat.ac.cn]
2 Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences
Shenzhen, 518055 - China
3 University of Chinese Academy of Sciences
Beijing, 100049 - China
4 Institute of Computing Technology, Chinese Academy of Sciences
Beijing, 100190 - China
[e-mail: zhaoxf@ict.ac.cn, jinyan@ncic.ac.cn]
`*Corresponding author: Hongwei Tang
`
`Received August 9, 2016; revised May 4, 2017; accepted July 10, 2017;
`published November 30, 2017
`
`
`
`Abstract
`
Antivirus protection is important to the security of virtual machines (VMs). According to where
the antivirus system resides, the existing approaches can be categorized into three classes:
internal, external and hybrid. The internal approach is susceptible to attacks and may cause
the antivirus storm and rollback vulnerability problems. The external approach, in which
antivirus systems are built upon virtual machine introspection (VMI) technology, cannot find
and prohibit viruses promptly. Although the hybrid approach performs virus scanning outside
the virtual machine, it is still vulnerable to attacks since it depends completely on an agent
and hooks in the guest operating system to deliver events. To solve these problems, we propose
VirtAV, an agentless runtime antivirus system based on in-memory signature scanning, which
scans each piece of binary code about to execute in guest VMs on the VMM side to detect and
prevent viruses. As an external approach, VirtAV does not rely on any hooks or agents in the
guest OS and exposes no attack surface to the outside world, so it guarantees its own security
to the greatest extent. In addition, it solves the antivirus storm problem and the rollback
vulnerability problem in virtualization environments. We implemented a prototype based on the
Qemu/KVM hypervisor and the ClamAV antivirus engine. Experimental results demonstrate that
VirtAV is able to detect both user-level and kernel-level virus programs inside Windows and
Linux guests, whether or not they are packed. In terms of performance, the overhead of VirtAV
on the guest is acceptable. In particular, VirtAV has little impact on the performance of
common desktop applications, such as video playing, web browsing and the Microsoft Office
suite.
`
`
`
`Keywords: agentless, antivirus, antivirus storm, virtual machine, virus signature
`
`
`
`A preliminary version of this paper appeared in IEEE ICACT 2016, Feb 2-4, Korea. This version includes a
`detailed description on the basic idea, design and implementation of VirtAV.
`
https://doi.org/10.3837/tiis.2017.11.026
`
`ISSN : 1976-7277
`
`
`KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS VOL. 11, NO. 11, November 2017
`
`
`1. Introduction
`
Antivirus protection is necessary for modern computers. As virtualization technology has
been extensively adopted to improve resource utilization, more and more applications are
deployed in VMs. Currently three antivirus approaches, i.e., the internal approach, the
external approach and the hybrid approach, are provided in industry and academia. In the
internal approach, antivirus software is installed in the VM and often uses
signature-matching [1][2] technology to search for viruses in files [3][4]. When all guest VMs
on a host machine schedule virus scanning or signature database updating simultaneously, the
host is overloaded and the performance of the VMs degrades dramatically. This ‘antivirus
storm’ problem [5] is inevitable because a large number of concurrent resource-intensive
operations stress both computing and I/O resources. In addition, snapshot and rollback are
widely used to facilitate VM fault tolerance and system maintenance. However, when a VM rolls
back to a previously saved snapshot, the signature database installed inside it may be out of
date [6][7]. In this case, the antivirus software may be unable to detect newer viruses in
time, which leads to a security vulnerability peculiar to virtualization environments. We call
it the ‘rollback vulnerability’ problem. Moreover, antivirus software is highly susceptible
to attacks when exposed in the VM under protection [8].
The external approach is inspired by the fact that the VMM has supervisory privilege over
guest VMs. Antivirus software is moved entirely out of the VM [8][9][10] and gets the
necessary information from the guest OS using VMI techniques [11][40][46]. Existing work
following this approach can only detect viruses in guest files or memory, and cannot prohibit
them properly.
The hybrid approach is usually composed of two parts: hooks in the guest OS to monitor
events and capture information, and a scanning engine deployed in a dedicated VM [5][12][13].
The most widely used antivirus product from VMware and Trend Micro follows this approach
[13]. Although this approach can avoid the antivirus storm problem and the rollback
vulnerability problem, the hooks or agents in the guest OS are still vulnerable to attacks.
Furthermore, some research shows that over 80% of modern viruses appear to use packing
techniques to evade detection by antivirus software. There is also evidence that more than
50% of new viruses are generated by simply packing existing ones [48][49]. The traditional
approach widely adopted by antivirus software is to use unpacking routines to recover the
original virus program before scanning it [3]. The limitation of this approach is that it
needs a specific unpacker for each packer, so only viruses packed with known packers can be
unpacked and detected.
Motivated by the aforementioned observations, this paper proposes VirtAV, an agentless
antivirus system based on in-memory signature scanning. It can be classified as an external
approach since signature scanning and database updating are performed completely outside the
guest VMs under protection. The main contributions are listed as follows.
• We designed an agentless approach for runtime antivirus protection of VMs, which
actively scans guest binary code on the VMM side, transparently performs virus
detection and promptly prohibits any virus found. Such an approach guarantees the security
of the antivirus system to the greatest extent because no attack surface is exposed to the
outside world. It avoids the antivirus storm and the rollback vulnerability problems.
Moreover, it can also detect packed viruses by relying on the virus to unpack itself in
memory.
• We proposed the memory signature of a virus, a variant of the file signature used in
traditional antivirus software, which can uniquely identify a virus when it has been loaded
or even partially loaded into memory. Such an approach helps the antivirus system detect
and prohibit a virus in memory before it is activated.
• We implemented a prototype system based on the Qemu/KVM hypervisor and the
open-source scanning engine of ClamAV. Functional verification reveals that VirtAV
is independent of guest operating systems, and is able to detect both non-packed and
packed viruses running in user space or kernel space inside guest VMs. Moreover,
benchmarks of typical desktop applications, such as Windows booting, video playing
and a synthetic office workload, are evaluated in both single-VM and multiple-VM
environments. The performance results show that the overhead of VirtAV is reasonable
and acceptable. From the results, we can further conclude that the antivirus storm is
eliminated by VirtAV in the multiple-VM environment.
The rest of the paper is organized as follows. Section 2 gives a brief introduction to the
Qemu/KVM hypervisor and the ClamAV antivirus system, and reviews related work on security
protection of virtual machines. Section 3 discusses the main design principles of VirtAV.
Section 4 gives the implementation details of VirtAV. Section 5 presents the experimental
results from the viewpoints of function and performance. Finally, Section 6 concludes the
paper and mentions future work.
`
`2. Background and Related Work
`
`2.1 Qemu/KVM Hypervisor
`
The Qemu/KVM hypervisor is a full virtualization solution based on Linux for the x86 platform
with CPU virtualization extensions (Intel VT or AMD-V). It provides an isolated virtualized
hardware environment for virtual machines running unmodified Linux or Windows as the guest
OS. Guest code (user-level processes or the kernel-level OS) runs natively on the CPU in
non-root mode, with the exception of sensitive instructions or operations, such as accessing
an I/O port, which are trapped to KVM and emulated by KVM and Qemu. Each virtual machine has
an independent memory address space that starts from physical address 0x0. On the host, EPT
(from Intel, or NPT from AMD) is used by the MMU to translate a guest physical memory address
(GPA) into a host physical memory address (HPA) when a CPU operating in non-root mode
accesses guest memory. Moreover, if the corresponding host memory page has not been allocated
or mapped, or the access rights on the page are insufficient, an EPT violation is generated
and captured by KVM.
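
The translation and violation behaviour described above can be illustrated with a small model. The following is only a sketch: `ToyEpt`, `EptViolation` and the flat dictionary are hypothetical names chosen for illustration, and real EPT/NPT structures are multi-level page tables walked by hardware, not Python dictionaries.

```python
PAGE_SIZE = 4096


class EptViolation(Exception):
    """Raised when a guest-physical access cannot be satisfied (toy model)."""


class ToyEpt:
    """Hypothetical stand-in for EPT/NPT: guest frame -> (host frame, permissions)."""

    def __init__(self):
        self.entries = {}  # guest frame number -> (host frame number, perms)

    def map_page(self, gfn, hfn, perms):
        """Map one guest frame to a host frame; perms is a subset of 'rwx'."""
        self.entries[gfn] = (hfn, perms)

    def translate(self, gpa, access):
        """Translate a GPA to an HPA for an access of type 'r', 'w' or 'x'."""
        gfn, offset = divmod(gpa, PAGE_SIZE)
        entry = self.entries.get(gfn)
        if entry is None:
            # Page not allocated or mapped yet: the hypervisor would be notified.
            raise EptViolation(f"GPA {gpa:#x}: page not mapped")
        hfn, perms = entry
        if access not in perms:
            # Mapping exists, but the access rights are insufficient.
            raise EptViolation(f"GPA {gpa:#x}: '{access}' access not permitted")
        return hfn * PAGE_SIZE + offset
```

In this model, a page mapped with "rw" only raises `EptViolation` on an instruction fetch ('x'), which is the kind of hardware-level event a hypervisor can intercept without any help from the guest.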
`
`2.2 ClamAV
`
ClamAV is the most widely used open-source antivirus system, which supports detection of
both non-polymorphic and polymorphic viruses. For non-polymorphic viruses, signatures are in
simple string format, and the Boyer-Moore (BM) algorithm is used to detect this kind of
virus. For polymorphic viruses, signatures may contain regular expressions and wildcards,
and the Aho-Corasick (AC) algorithm [2] is adopted.
Specifically, the implementation of the AC algorithm in ClamAV uses a trie to store the
finite-state machine (FSM) constructed from the signatures. In addition, three helper
functions, goto, failure and output, are defined accordingly. The goto() function on a given
state indicates which state to move to if the next input character matches any predefined
input of the state; otherwise, the FSM resorts to the failure() function. The output()
function summarizes and outputs the target patterns matched in the end. In the initialization
phase, the fixed string parts of each polymorphic signature are loaded from the database and
used to build the trie. In the scanning phase, ClamAV scans an input file byte by byte and
detects occurrences of signatures. In general, the fixed parts of signatures are matched in
the trie first, and after that wildcards and regular expressions are processed. Take a
signature with a wildcard as an example. The fixed string is split by an asterisk wildcard
into two substrings, in the “substring1*substring2” format. The two substrings are matched
individually by the AC FSM. When both parts are found, ClamAV further verifies the order and
gap between them. ClamAV uses a 256-element array for each node, with each element
corresponding to one possible byte value. The integer value of the input byte is used as the
index into the array, and if the element is present, a matching character is found.
Furthermore, ClamAV provides several common unpackers, such as MEW, Upack, Petite, yC, FSG,
AsPack, WWPack and NsPack, to recover the original programs from packed viruses obfuscated
with known packing algorithms.
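
To make the goto/failure/output description concrete, the following is a minimal textbook Aho-Corasick sketch with one 256-slot transition array per node. It is not ClamAV's code: the class `AhoCorasick` and the helper `match_wildcard` are hypothetical names, and the wildcard check only verifies the order of the two parts, not any gap constraint.

```python
from collections import deque


class AhoCorasick:
    """Textbook Aho-Corasick automaton over bytes (illustrative, not ClamAV)."""

    def __init__(self, patterns):
        self.goto = [[-1] * 256]   # per-node array indexed by byte value, -1 = absent
        self.fail = [0]            # failure link of each state
        self.output = [[]]         # patterns recognised on reaching each state
        for pat in patterns:
            self._insert(pat)
        self._build_failure_links()

    def _insert(self, pat):
        state = 0
        for byte in pat:
            if self.goto[state][byte] == -1:
                self.goto.append([-1] * 256)
                self.fail.append(0)
                self.output.append([])
                self.goto[state][byte] = len(self.goto) - 1
            state = self.goto[state][byte]
        self.output[state].append(pat)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep the root as failure target.
        queue = deque(s for s in self.goto[0] if s != -1)
        while queue:
            state = queue.popleft()
            for byte in range(256):
                nxt = self.goto[state][byte]
                if nxt == -1:
                    continue
                queue.append(nxt)
                f = self.fail[state]
                while f != 0 and self.goto[f][byte] == -1:
                    f = self.fail[f]
                cand = self.goto[f][byte]
                self.fail[nxt] = cand if cand != -1 else 0
                # Inherit the patterns that end at the failure target.
                self.output[nxt] = self.output[nxt] + self.output[self.fail[nxt]]

    def search(self, data):
        """Return (start_offset, pattern) for every occurrence in data."""
        hits, state = [], 0
        for pos, byte in enumerate(data):
            while state != 0 and self.goto[state][byte] == -1:
                state = self.fail[state]       # failure transition
            nxt = self.goto[state][byte]       # goto transition
            state = nxt if nxt != -1 else 0
            for pat in self.output[state]:     # output function
                hits.append((pos - len(pat) + 1, pat))
        return hits


def match_wildcard(data, first, second):
    """Check a 'first*second' signature: both parts found, second after first."""
    hits = AhoCorasick([first, second]).search(data)
    first_ends = [start + len(first) for start, pat in hits if pat == first]
    second_starts = [start for start, pat in hits if pat == second]
    return bool(first_ends) and bool(second_starts) and min(first_ends) <= max(second_starts)
```

For example, `match_wildcard(b"xxABCyyyDEFzz", b"ABC", b"DEF")` is true, while the same parts in reverse order in the input do not match.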
`
`2.3 Related Work on Intrusion Protection of Virtual Machines
`
Virtualization technology brings opportunities for security and is extensively used to improve
the attack resistance of intrusion or malware detection and protection tools [8][26].
`
`A. VMI-based intrusion protection
Security tools can be installed outside the VM that needs to be protected, and monitor the
guest via special interfaces, such as VMI [46]. Because they are isolated from the VM, the
security tools are immune to attacks even if the VM is compromised by attackers. However,
lacking guest OS-level semantics (known as the semantic gap [40][46]), the security tools
outside the VM have to sacrifice accuracy and effectiveness. Livewire [8] is a general IDS
framework adopting VMI technology, which uses a priori knowledge about the guest OS data
structures to interpret the hardware-level view in OS-level semantics. In particular, to
detect signatures of malicious programs, it performs a full scan of all the host memory.
However, this methodology cannot find and prohibit viruses in time. VMwatcher [9]
reconstructs semantic views of the guest OS in a non-intrusive manner, and provides objects
with the guest OS view to external malware detection tools. Just like Livewire, VMwatcher
only provides malware detection and cannot intervene in the execution of the guest. [12]
proposed a hybrid approach enabling security tools to perform active monitoring while
retaining isolation to ensure security. It hooks the guest OS to monitor events of interest
and handles the events in an isolated VM. It solves the semantic gap problem in a simple
manner, but it heavily depends on the integrity of the guest OS. [20] described a framework
that enables security monitoring applications to be placed in the untrusted guest VM without
sacrificing the security guarantees. The monitoring application itself is protected by the
VMM and should be self-contained to ensure its integrity. This method imposes restrictions on
the monitoring applications and even requires commodity tools to be rebuilt to adapt to the
framework. [27] proposed a mechanism to verify the integrity of user and kernel code at
runtime at page granularity. [50] described a framework that combines intrusion monitoring,
evidence preservation, in-depth log analysis and decision making on the handling of
suspicious events for guest VMs.
In addition, there are also advances in VMI research. VMI requires privileged access to the
virtual machine monitor, so it is usually not provided in public cloud environments. CloudVMI
[38] virtualizes the VMI interface and allows cloud users to introspect their own VMs. For
live VM monitoring by VMI, VMI tools and guest VMs run concurrently, which might cause race
conditions and data inconsistency. TxIntro [47] leverages the atomicity of hardware
transactional memory to ensure concurrent and consistent VMI. To reduce the overhead of VMI
and improve its generality across different OSes, [43] presents an approach that redirects
the system calls of monitoring tools to a dummy process in the guest VM, which collects guest
OS states natively in a lightweight manner. [45] presented the observation that the
definition and layout of critical information in the kernel data structures for processes
remain stable as kernel versions evolve. The generality of VMI is improved by reconstructing
the process list based on only partial information, which is believed to be sufficient for
intrusion detection. [52] introduced hypervisor introspection (HVI) to detect hypercall-based
attacks with hardware support, such as nested virtualization and EPT protection.
`
B. Non-VMI-based intrusion protection
`
To solve the antivirus storm problem, VMware and Trend Micro [5][13] proposed a so-called
agentless antivirus framework, vShield Endpoint. In that framework, the primary causes of the
storm, namely signature scanning and database updating, are offloaded to a dedicated VM
called the secure virtual appliance (SVA). In addition, a lightweight agent (called ‘EPsec
Thin Agent’) is deployed in the guest OS; its main duties are monitoring file operations in
the guest OS and forwarding files to the SVA to be scanned. However, the framework is not
truly agentless, and the agent is exposed to attacks from malicious users or malware.
Moreover, the agent depends on the guest OS. When the agent is not running, for example in
the booting or shutdown phase of the guest OS, the guest VM loses antivirus protection.
[24] presented an external approach to malware analysis which can hide the analyzer from
the target. With the help of hardware virtualization technology such as Intel VT, it offers
both instruction-level and system-call-level tracing of the target in the guest VM. [25]
described a Qemu-based system that dynamically analyzes Windows kernel-level code and
extracts malicious behaviors from rootkits. [37] introduced an event-logging-based
reliability and security monitoring framework for virtual machines, which relies on hardware
invariants to provide an isolated root of trust, so that the events and states of guests
cannot be modified by attackers and failures inside guests. [41] presented out-of-VM
user-mode process execution monitoring which supports existing user-mode process monitoring
tools such as strace, ltrace and gdb. The suspect process to be monitored is moved out of the
production VM to the monitor VM, where the user-mode monitoring tools run side by side with
the process. This approach removes the semantic gap and can directly intercept process
execution at the granularity of user-level function calls. CIVIC [44] creates a live replica
of the production VM, and the inspection or analysis operations are performed on the replica.
It leaves the production VM unmodified, without any impact or side effects during monitoring.
HyperCoffer [42] is a hardware-software co-designed framework that guards the privacy and
integrity of tenants’ VMs in cloud environments. By extending processor virtualization with
memory encryption and integrity checking to secure data communication with off-chip memory,
it can protect guest VMs against an untrusted hypervisor and physical attacks. [51] described
a network-based computer worm detection system for virtualization environments, which runs
on the hypervisor, isolated from the guest VM. [6] proposed an audit-based approach to
protect against VM rollback attacks, which logs all suspend/resume and migration operations
and audits the log to check for malicious rollback behaviours. Furthermore, there is also
research [21][22][23] focusing on malware analysis with the help of virtualization
techniques.
`
`
`
`Basic Idea and Challenges
`
The overall goal of VirtAV is, first of all, to provide antivirus protection for VMs without
introducing the antivirus storm problem and the rollback vulnerability problem. In addition,
as an antivirus agent in the guest OS introduces security vulnerabilities and weak points,
VirtAV is expected to be an agentless solution. Furthermore, for VirtAV to be a practical
solution, the negative performance impact on the guest VMs should be as low as possible.
The basic idea of VirtAV is to scan guest binary code in memory at runtime on the VMM side
to detect and prevent viruses. As is well known, any binary code must pass through memory to
be executed; in a virtualization environment, this is the host memory. Furthermore, the VMM
can inspect any location of the host memory, so signature-based virus detection in memory is
feasible. In this way, concurrent file-based virus scanning in co-located VMs is avoided, and
the antivirus storm problem is solved. Based on this idea, we have to deal with the following
key challenges.
`
A. Recognition of viruses in memory
`
Traditionally, a file signature is used to distinguish a specific virus from other viruses
and non-virus files. In general, periodic file scanning and event-driven file scanning
(triggered, for example, by launching a program or downloading a file from the Internet) are
the two common policies in commodity antivirus products. However, file signatures are no
longer applicable to recognizing viruses in memory on the VMM side. On the one hand,
file-signature-based virus detection needs an understanding of OS-level semantics, which
cannot be obtained by the VMM. The VMM is unaware of when the guest is launching a program
or downloading a file. On the other hand, binary code is loaded into memory on demand, in
units of pages. As a result, we usually get only a partial view of the binary code in memory
even when it has already started running. Accordingly, recognition of viruses in memory is
one of the key challenges. Subsection 3.2 provides the details of how this problem is solved.
`
B. Virus scanning trigger strategy
`
Unlike file-based antivirus products, VirtAV scans for viruses after the executable is
loaded into memory. Selecting the right moment to perform virus scanning is crucial to the
effectiveness of antivirus protection. A piece of binary code should be scanned before it is
scheduled to run on the CPU. Otherwise, executing the binary code, or even part of it, might
damage the system. However, the VMM cannot intercept guest OS-level events, for example,
process scheduling. Instead, it can only catch hardware-level events, such as EPT violations.
Therefore, the strategy regarding when and how to trigger virus scanning is another
challenge. Subsection 3.3 describes how VirtAV utilizes memory virtualization to solve this
problem.
`
C. Detection of packed viruses
To bypass virus scanning, viruses usually pack themselves and unpack at runtime.
Therefore, antivirus software needs to deal with a lot of packers and be prepared for new
ones every day. Nearly all antivirus products employ unpacking engines [29]. It is difficult
or even impossible to unpack an executable with only a partial memory view of it on the VMM
side. So we rely on the viruses to unpack themselves, and scan the decoded binary code.
Supporting the detection of packed viruses is therefore also a challenge, and the related
design is described in Subsections 3.2 and 3.3.
`
`
`
`D. Performance guarantee
`
To provide runtime antivirus protection, VirtAV detects viruses on the critical path of
binary code execution. Moreover, execution is paused while VirtAV is scanning for viruses,
which means some performance overhead is inevitable. To be a practical antivirus solution,
VirtAV should reduce the overhead and provide acceptable performance to guest VMs. The
solution is provided in Subsection 3.5.
`
`3. Design and Implementation
`
Basically, VirtAV is an agentless antivirus system for VMs which provides antivirus
protection by extending KVM and Qemu. It identifies viruses by memory signature and
dynamically scans binary code in host memory to detect viruses. It interposes synchronous
virus scanning operations on the critical path of program execution in the guest VM, and it
ensures that any code executed by a vCPU has been examined. It is transparent to the guest OS
and requires no modifications or hooks in the guest OS. The design and implementation are
discussed in the following subsections.
`
`3.1 Overall Architecture
`
The overall architecture of VirtAV is shown in Fig. 1. VirtAV is built upon the KVM/Qemu
hypervisor and is composed of four parts: 1) VirtAV-engine embedded in Qemu, 2) a
centralized signature database deployed on the host Linux, 3) VirtAV-stub in KVM, and 4)
VirtAV-cleaner on the host Linux.
VirtAV-engine: In essence, VirtAV can be integrated with any signature-based scanning
engine from different antivirus solutions. In our prototype system, we use the open-source
scanning engine of ClamAV [3] and integrate it with Qemu; the result is designated as
VirtAV-engine. The core of VirtAV-engine is the AC finite-state machine for signature
pattern matching, which supports exact match, wildcard match and regular expression match
[3]. It runs in the same context as Qemu’s vCPU thread and scans the host memory footprints
of executables in guest VMs to search for memory signatures of viruses. Furthermore, the vCPU
thread is paused while the engine scans host memory.
Memory Signature Database: We propose the memory signature to recognize viruses. The
definition of memory signatures is presented in Subsection 3.2. A new database (‘.msdb’) for
memory signatures is created in ClamAV. Moreover, the various databases of ClamAV are shared
by all the VirtAV-engines on the same host. In particular, the database is independent of any
guest VM and can be kept up to date by the host, so that both the rollback vulnerability and
the updating storm are avoided.
VirtAV-stub: To trigger virus scanning at the right moment, we extended the memory
virtualization module in KVM to trap specific memory access events. Once an event is trapped,
the vCPU is paused and VirtAV-stub transfers the binary code in the corresponding page frame
to VirtAV-engine for scanning. When a virus is found, VirtAV takes actions according to
predefined policies; by default, it kills the guest process and isolates or removes the
executable. To kill the process, it manipulates the guest page table entry of the process to
revoke the execute permission on the guest page frame using VMI techniques. After the vCPU
resumes, the process is automatically killed by the guest OS due to a general protection
fault.
VirtAV-cleaner: It is an auxiliary utility for virus treatment. It locates the executable of
the virus according to information provided by VirtAV-stub, and takes actions to isolate or
remove the virus. It is built upon libguestfs [14], which provides interfaces to access and
modify the guest file system from outside the VM.
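
The trap, scan and revoke sequence described for VirtAV-stub can be sketched as a toy simulation. Everything below is hypothetical and for illustration only: `GuestPage`, `ToyStub` and `on_execute_trap` are invented names, the scan is a plain substring search instead of the AC engine, and the prototype performs these steps inside KVM and Qemu, not in Python.

```python
from dataclasses import dataclass, field


@dataclass
class GuestPage:
    data: bytes
    executable: bool = True   # execute permission in the simulated guest page table
    scanned: bool = False


@dataclass
class ToyStub:
    signatures: dict                              # virus name -> fixed byte pattern
    pages: dict = field(default_factory=dict)     # guest frame number -> GuestPage
    log: list = field(default_factory=list)       # (frame number, virus name) hits

    def scan(self, page):
        """Stand-in for the scanning engine: return a virus name or None."""
        for name, pattern in self.signatures.items():
            if pattern in page.data:
                return name
        return None

    def on_execute_trap(self, gfn):
        """Handle a trapped instruction fetch; the vCPU is assumed paused here."""
        page = self.pages[gfn]
        virus = self.scan(page)
        page.scanned = True
        if virus is not None:
            page.executable = False   # assumed default policy: revoke execute permission
            self.log.append((gfn, virus))
        return page.executable        # True means the guest code may run
```

In this sketch, a page carrying a known pattern is never allowed to execute, while a clean page is marked as scanned and released to the guest.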
`
`
`
`
`
[Figure 1: several Virtual Machine/QEMU instances, each containing a guest operating system
and a VirtAV-engine, run on top of Linux/KVM, which hosts VirtAV-stub, VirtAV-cleaner and the
virus memory signature DB.]

Fig. 1. Overall architecture of VirtAV
`
`3.2 Memory Signature of Virus
`
To find viruses accurately and promptly in memory, VirtAV identifies a virus using its memory
signature, which is slightly different from a file signature. A file signature is usually
picked from the text section of the virus’ executable. However, the way to generate a memory
signature differs from that of a file signature, for the following reason.
In modern operating systems, such as Windows and Linux, memory management is based on
paging, which divides the contiguous system memory into fixed-size page frames. Page frames
are assigned and mapped to processes’ virtual address spaces on demand. When a process is
created, the OS establishes mappings between the process’ virtual memory areas and the
sections of the executable and libraries. At that time, page frames are not yet assigned to
the process, and sections are not loaded into memory either. When the process starts to run,
specifically at the moment the CPU attempts to fetch instructions from the virtual memory
area mapped to the text section, a page fault is generated because the page frame is not
present. Then the OS assigns a page frame to the process and loads a block of contents from
the executable. The text section may span more than one page frame and may therefore be
loaded in multiple steps. So we cannot rely on scanning the whole executable to detect a
virus, because part of the virus could have been executed and may have infected the system
before the part containing the file signature is loaded.
There is no essential difference between the memory signature and the file signature of a
virus: both are small pieces of the virus’ binary code that can be used to distinguish the
virus from benign programs and other viruses. The only difference is that a memory signature
consists of a collection of sub-signatures, each extracted from a corresponding page of the
text section of the virus, as shown in Fig. 2. A single sub-signature can identify the virus
uniquely, so VirtAV can detect a virus with only one page of its text section loaded into
memory. Generally, the page where the entry point of the virus is located (which we call “the
entrypoint page”) is loaded and executed first, and the sub-signature from the entrypoint
page is usually the first to be detected by VirtAV. In this case, other pages are not loaded
because the virus is prevented from running once found. Just like file signatures,
sub-signatures may contain fixed hexadecimal strings, wildcards, or even regular expressions.
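
The idea that any single sub-signature suffices can be sketched as follows. This is an illustrative data structure only: `MemorySignature` and `scan_page` are hypothetical names, the byte patterns are made up, and Python regular expressions stand in for the signature syntax of the actual engine.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MemorySignature:
    virus_name: str
    sub_signatures: tuple   # one compiled bytes pattern per page of the text section


def scan_page(page, signatures):
    """Return the name of the first virus whose sub-signature occurs in the page."""
    for sig in signatures:
        for sub in sig.sub_signatures:
            if sub.search(page):
                return sig.virus_name
    return None


# Made-up example: two sub-signatures, one with a two-byte wildcard gap.
TOY_SIGNATURE = MemorySignature(
    virus_name="Toy.Virus",
    sub_signatures=(
        re.compile(rb"\x55\x8b\xec.{2}\xe8", re.DOTALL),   # e.g. from the entrypoint page
        re.compile(re.escape(b"\x90\x90\xcc\xcc")),        # e.g. from a later page
    ),
)
```

A page containing only the second pattern is still attributed to the virus, which mirrors the claim that detection does not depend on which page of the text section happens to be resident.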
We need not load the virus into memory to extract its memory signature. Instead, the memory
signature is calculated by statically analyzing the on-disk virus executable. Take PE-format
executables in Windows as an example. A PE file is composed of a DOS header, an NT header,
section headers and several sections, such as the code section (.text) and the data section
(.data) [15], as shown in Fig. 2. Furthermore, Table 1 shows the most relevant information in
the PE headers for extracting the memory signature. The raw length of the binary code in the
text section is given by VirtualSize, while SizeOfRawData indicates the section size in the
executable. After being loaded into memory, the length of the text section might change
because the section is re-aligned to SectionAlignment, which is usually set to the page size
(e.g. 4096 bytes) of the system. Specifically, if the size of the raw binary code in the last
page is less than 4096 bytes, the remainder of the page is filled with zeros.
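
The re-alignment arithmetic can be illustrated with a few lines of code. This is a sketch under the assumption that SectionAlignment is a multiple of the 4096-byte page size; `align_up` and `text_section_layout` are hypothetical helper names, and the sample size is invented.

```python
PAGE_SIZE = 0x1000  # 4096 bytes, the page size assumed in the text


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def text_section_layout(virtual_size, section_alignment=PAGE_SIZE):
    """Return (in-memory size, number of pages, zero-filled bytes at the end)."""
    in_memory_size = align_up(virtual_size, section_alignment)
    pages = in_memory_size // PAGE_SIZE
    zero_fill = in_memory_size - virtual_size
    return in_memory_size, pages, zero_fill


# A text section with VirtualSize 0x2A10 occupies three pages in memory,
# and the last 0x5F0 bytes of the third page are zero-filled.
layout = text_section_layout(0x2A10)
```

Here `layout` evaluates to `(0x3000, 3, 0x5F0)`, so a memory signature for such a section would carry up to three sub-signatures, one per page.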
`
[Figure 2: layout of a PE executable (DOS header: MS-DOS Header and DOS Stub; NT header: PE
Flag, PE File Header and PE Optional Header; section header: Sections Table; sections: .text
and .data). When loaded into memory, the .text section occupies a sequence of memory pages,
and Sub-Signature1, Sub-Signature2, ..., Sub-SignatureN are each taken from one of these
pages.]

Fig. 2. Memory signature for virus in PE format
`
Table 1. Part of information in PE Headers [30]

| Item | Meaning | PE Header |
|---|---|---|
| ImageBase | The preferred virtual address of the first byte of the executable when it is loaded in memory. | PE Optional Header |
| SectionAlignment | The alignment of sections loaded in memory, in bytes. The default value is the page size of the system. | PE Optional Header |
| FileAlignment | The alignment of the raw data of sections in the executable, in bytes. | PE Optional Header |
| Name | The name of the section, such as .text, .data, .bss, etc. | Section Header |
| VirtualSize | The size of the section before aligned to FileAlignment. | Section Header |
| VirtualAddress | The address of the first byte of this section when loaded into memory. This value is relative to the image base. | Section Header |
| SizeOfRawData | The size of the section in the executable when aligned to FileAlignment. If VirtualSize < SizeOfRawData, the remainder is filled with zeroes. | Section Header |
| PointerToRawData | The offset of the first byte of the section from the start of the executable. The value is a multiple of FileAlignment. | Section Header |
`
`
`
`
`
The process of calculating the memory signature of a virus is described as follows (shown in
Fig. 3).

1) Locate the section header in the section table for the “.text” section, and get the start
and end offsets of the section from the start of the executable. The start offset is
indicated by PointerToRawData, and the end offset is PointerToRawData + SizeOfRawData.
2) Iteratively copy the binary code of the executable into 4096-byte temporary files,
sequentially from the start offset to the end offset. Specifically, the first byte of the
text section is copied to the offset (ImageBase + VirtualAddress) % 0x1000 of the