`International Conference on
Measurement and Modeling of
`Proceedings
`
`PERFORMANCE EVALUATION REVIEW
`SPECIAL ISSUE
`
VOLUME 25 NO. 1 JUNE 1997
`
`1997 ACM SIGMETRICS
`INTERNATIONAL CONFERENCE ON
`MEASUREMENT AND MODELING
`OF COMPUTER SYSTEMS
`
`PROCEEDINGS
`
`1^ '
`I
`I
`
`Iz O F i
`" ••
`"
`L/
`' 5 f997
`
`i
`
`June 15-18, 1997
`Seattle, Washington, U.S.A.
`
`
`The Association for Computing Machinery
`1515 Broadway
New York, N.Y. 10036
`
Copyright © 1997 by Association for Computing Machinery, Inc. (ACM). Permission to make
`digital or hard copies of part or all of this work for personal or classroom use is granted without
`fee provided that the copies are not made or distributed for profit or commercial advantage and
`that copies bear this notice and the full citation on the first page. Copyrights for components of
`this work owned by others than ACM must be honored. Abstracting with credit is permitted. To
`copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific
`permission and/or a fee. Request permission to republish from: Publications Dept. ACM, Inc. Fax
`+1 (212) 869-0481 or <permissions@acm.org>
`For other copying of articles that carry a code at the bottom of the first or last page or screen
`display, copying is permitted provided that the per-copy fee indicated in the code is paid through
`the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
`
`ACM ISBN: 0-89791-909-2
`
`Additional copies may be ordered prepaid from:
`
`ACM Order Department
`P.O. Box 12114
`Church Street Station
`New York, N.Y. 10257
`
`Phone 1-800-342-6626
`(U.S.A. and Canada)
`+1-212-626-0500
`(All other countries)
`Fax: +1-212-944-1318
`E-mail: acmpubs@acm.org
`
`ACM European Service Center
`108 Cowley Road
Oxford OX4 1JF UK
`
`Phone: +44-1-865-382338
`Fax: +44-1-865-381338
`E-mail: acm_europe@acm.org
`URL: http://www.acm.org
`
`ACM Order Number: 488970
`
`Printed in the U.S.A.
`
`
`TABLE OF CONTENTS
`
Organizing Committees ... iii
Referees ... iv
Message from the General Chair ... v
Message from the Program Chair ... vi

Keynote Address
(Chair: Albert Greenberg, AT&T Labs - Research)

Architecture and Performance of Large Internets,
Based on Terrestrial and Satellite Infrastructure ... 1
Hans-Werner Braun, Teledesic Corporation

Network Measurement and Modeling
(Chair: Don Towsley, University of Massachusetts)

Analyzing Stability in Wide-Area Network Performance ... 2
Hari Balakrishnan, Mark Stemm, University of California at Berkeley
Srinivasan Seshan, IBM
Randy H. Katz, University of California at Berkeley

Performance Issues of Enterprise Level Web Proxies ... 13
Carlos Maltzahn, Kathy J. Richardson, Digital Equipment Corporation
Dirk Grunwald, University of Colorado

A New Method for Analysing Feedback-Based Protocols with
Applications to Engineering Web Traffic over the Internet ... 24
D.P. Heyman, AT&T Labs
T.V. Lakshman, Bell Labs
Arnold L. Neidhardt, Bellcore

Network Resource Management
(Chair: Carey Williamson, University of Saskatchewan)

Queue Management for Explicit Rate Based Congestion Control ... 39
Qingming Ma, Carnegie Mellon University
K.K. Ramakrishnan, AT&T Labs Research
`
`
`
TCP over ATM: ABR or UBR? ... 52
Teunis J. Ott and Neil Aggarwal, Bellcore

Scalable Reliable Multicast Using Multiple Multicast Groups ... 64
Sneha K. Kasera, Jim Kurose and Don Towsley, University of Massachusetts

Parallel Computer Systems
(Chair: David Wood, University of Wisconsin-Madison)

Performance Debugging Shared Memory Parallel Programs
Using Run-Time Dependence Analysis ... 75
(Best Student Paper Award)
Ramakrishnan Rajamony and Alan L. Cox, Rice University
`
`Preprototyping SIMD Coprocessors Using Virtual Machine Emulation and Trace Compilation ... 88
`Martin C. Herbordt, Owais Kidwai, University of Houston
`Charles C. Weems, University of Massachusetts
`
`Hot Topics: Network Support for Video
(Chair: K.K. Ramakrishnan, AT&T Labs - Research)
`
`Memory Management
`(Chair: Anna Karlin, University of Washington)
`
Informed Multi-Process Prefetching and Caching ... 100
Andrew Tomkins, R. Hugo Patterson and Garth Gibson, Carnegie Mellon University

Adaptive Page Replacement Based on Memory Reference Behavior ... 115
Gideon Glass and Pei Cao, University of Wisconsin-Madison

Managing Server Load in Global Memory Systems ... 127
Geoffrey M. Voelker, University of Washington
Herve A. Jamrozik, Amazon.com
Mary K. Vernon, University of Wisconsin-Madison
Henry M. Levy and Edward D. Lazowska, University of Washington

Modeling
(Chair: Gisli Hjalmtysson, AT&T Labs - Research)
`
Size-Limited Batch Movement in Product-Form Closed Discrete-Time Queueing Networks ... 139
Michael E. Woodward, Loughborough University
`
`Bounding of Performance Measures for a Threshold-based Queueing System with Hysteresis ... 147
`Leana Golubchik, Columbia University
`John C.S. Lui, The Chinese University of Hong Kong
`
Using Real-Time Queueing Theory to Control Lateness in Real-Time Systems ... 158
John P. Lehoczky, Carnegie Mellon University
`
`
`
`Network Modeling
`(Chair: James Salehi, Hewlett Packard Labs)
`
Cache Behavior of Network Protocols ... 169
Erich Nahum, David Yates, Jim Kurose and Don Towsley, University of Massachusetts

Second Moment Resource Allocation in Multi-Service Networks ... 181
Edward W. Knightly, Rice University

On the Characterization of VBR MPEG Streams ... 192
Marwan Krunz, University of Arizona
Satish K. Tripathi, University of Maryland

Benchmarking
(Chair: Margaret Martonosi, Princeton University)

File System Aging-Increasing the Relevance of File System Benchmarks ... 203
Keith A. Smith and Margo I. Seltzer, Harvard University

Operating System Benchmarking in the Wake of Lmbench: A Case Study of the
Performance of NetBSD on Intel x86 Architecture ... 214
Aaron B. Brown and Margo I. Seltzer, Harvard University

Work in Progress
(Chair: John Zahorjan, University of Washington)

Computing and Switching Platforms
(Chair: Mary K. Vernon, University of Wisconsin-Madison)

The Utility of Exploiting Idle Workstations for Parallel Computation ... 225
Anurag Acharya, Guy Edjlali and Joel Saltz, University of Maryland

A Performance Evaluation of Cluster-Based Architectures ... 237
Xiaohan Qin and Jean-Loup Baer, University of Washington

Design and Evaluation of a DRAM-based Shared Memory ATM Switch ... 248
Tzi-Cker Chiueh, Srinidhi Varadarajan, State University of New York at Stony Brook

Storage Systems
(Chair: John C.S. Lui, Chinese University of Hong Kong)

Efficient Retrieval of Composite Multimedia Objects in the JINSIL Distributed System ... 260
Junehwa Song, University of Maryland
Asit Dan and Dinkar Sitaram, IBM

File Server Scaling with Network-Attached Secure Disks ... 272
Garth A. Gibson, David F. Nagle, Khalil Amiri, Fay W. Chang, Eugene M. Feinberg,
Howard Gobioff, Chen Lee, Berend Ozceri, Erik Riedel, David Rochberg, and Jim Zelenka,
Carnegie Mellon University
`
`
`
Group-Guaranteed Channel Capacity in Multimedia Storage Servers ... 285
Athanassios K. Tsiolis and Mary K. Vernon, University of Wisconsin-Madison

TUTORIAL ABSTRACTS ... 298
AUTHOR INDEX ... 302
`
`
`
`File Server Scaling with Network-Attached Secure Disks
`
Garth A. Gibson†, David F. Nagle*, Khalil Amiri*, Fay W. Chang†, Eugene M. Feinberg*, Howard Gobioff†,
Chen Lee†, Berend Ozceri*, Erik Riedel*, David Rochberg†, Jim Zelenka†

*Department of Electrical and Computer Engineering
†School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213-3890
garth+nasd@cs.cmu.edu
http://www.cs.cmu.edu/Web/Groups/NASD/
`
`Abstract
By providing direct data transfer between storage and client, network-attached
storage devices have the potential to improve scalability for existing
distributed file systems (by removing the server as a bottleneck) and
bandwidth for new parallel and distributed file systems (through network
striping and more efficient data paths). Together, these advantages influence
a large enough fraction of the storage market to make commodity
network-attached storage feasible. Realizing the technology's full potential
requires careful consideration across a wide range of file system, networking
and security issues. This paper contrasts two network-attached storage
architectures: (1) Networked SCSI disks (NetSCSI) are network-attached storage
devices with minimal changes from the familiar SCSI interface, while (2)
Network-Attached Secure Disks (NASD) are drives that support independent
client access to drive object services. To estimate the potential performance
benefits of these architectures, we develop an analytic model and perform
trace-driven replay experiments based on AFS and NFS traces. Our results
suggest that NetSCSI can reduce file server load during a burst of NFS or AFS
activity by about 30%. With the NASD architecture, server load (during burst
activity) can be reduced by a factor of up to five for AFS and up to ten for
NFS.
`
`1 Introduction
`Users are increasingly using distributed file systems to access
`data across local area networks; personal computers with hundred-
`plus MIPS processors are becoming increasingly affordable; and
`the sustained bandwidth of magnetic disk storage is expected to
`exceed 30 MB/s by the end of the decade. These trends place a
`pressing need on distributed file system architectures to provide
`
`
clients with efficient, scalable, high-bandwidth access to stored data. This
paper discusses a powerful approach to fulfilling this need. Network-attached
storage provides high bandwidth by directly attaching storage to the network,
avoiding file server store-and-forward operations and allowing data transfers
to be striped over storage and switched-network links.

This research was sponsored by DARPA/TTO through ARPA Order D306 under
contract N00174-96-0002 and in part by an ONR graduate fellowship. The project
team is indebted to generous contributions from the member companies of the
Parallel Data Consortium: Hewlett-Packard, Symbios Logic Inc., Data General,
Compaq, IBM Corporation, EMC Corporation, Seagate Technology, and Storage
Technology Corporation. The views and conclusions contained in this document
are those of the authors and should not be interpreted as representing the
official policies, either expressed or implied, of any supporting organization
or the U.S. Government.

Permission to make digital/hard copy of part or all of this work for personal
or classroom use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage, the copyright notice, the
title of the publication and its date appear, and notice is given that copying
is by permission of ACM, Inc. To copy otherwise, to republish, to post on
servers, or to redistribute to lists, requires prior specific permission
and/or a fee.
SIGMETRICS '97 Seattle, WA, USA
© 1997 ACM 0-89791-909-2/97/0006...$3.50

`The principal contribution of this paper is to demonstrate the
`potential of network-attached storage devices for penetrating the
markets defined by existing distributed file system clients, specifically
the Network File System (NFS) and Andrew File System
`(AFS) distributed file system protocols. Our results suggest that
`network-attached storage devices can improve overall distributed
`file system cost-effectiveness by offloading disk access, storage
`management and network transfer and greatly reducing the amount
`of server work per byte accessed.
`We begin by charting the range of network-attached storage
devices that enable scalable, high-bandwidth storage systems. Specifically,
we present a taxonomy of network-attached storage —
server-attached disks (SAD), networked SCSI (NetSCSI) and
network-attached secure disks (NASD) — and discuss the distributed
`file system functions offloaded to storage and the security models
`supportable by each.
`With this taxonomy in place, we examine traces of requests
on NFS and AFS file servers, measure the operation costs of
commonly used SAD implementations of these file servers and
`develop a simple model of the change in manager costs for NFS
`and AFS in NetSCSI and NASD environments. Evaluating the
`impact on file server load analytically and in trace-driven replay
`experiments, we find that NASD promises much more efficient
`file server offloading in comparison to the simpler NetSCSI. With
`this potential benefit for existing distributed file server markets,
`we conclude that it is worthwhile to engage in detailed NASD
`implementation studies to demonstrate the efficiency, throughput
`and response time of distributed file systems using network-
`attached storage devices.
`In Section 2, we discuss related work. Section 3 presents our
`taxonomy of network-attached storage architectures. In Section 4,
`we describe the NFS and AFS traces used in our analysis and
`replay experiments and report our measurements of the cost of
each server operation in CPU cycles. Section 5 develops an
analytic model to estimate the potential scaling offered by
server-offloading in NetSCSI and NASD based on the collected traces and
`the measured costs of server operations. The trace-driven replay
`experiment and the results are the subject of Section 6. Finally,
`Section 7 presents our conclusions and discusses future directions.
`
`272
`
`HPE, Exh. 1011, p. 7
`
`
`
`2 Related Work
Distributed file systems provide remote access to shared file storage in a
networked environment [Sandberg85, Howard88, Minshall94]. A principal measure
of a distributed file system's cost is the computational power required from
the servers to provide adequate performance for each client's work [Howard88,
Nelson88]. While microprocessor performance is increasing dramatically and raw
computational power would not normally be a concern, the work done by a file
server is data- and interrupt-intensive and, with the poorer locality typical
of operating systems, faster microprocessors will provide much less benefit
than their cycle time trends promise [Ousterhout91, Anderson91, Chen93].
Typically, distributed file systems employ client caching to reduce this
server load. For example, AFS clients use local disk to cache a subset of the
global system's files. While client caching is essential for high performance,
increasing file sizes, computation sizes, and workgroup sharing are all
inducing more misses per cache block [Ousterhout85, Baker91]. At the same
time, increased client cache sizes are making these misses more bursty.
When the post-client-cache server load is still too large, it can either be
distributed over multiple servers or satisfied by a custom-designed high-end
file server. Multiple-server distributed file systems attempt to balance load
by partitioning the namespace and replicating static, commonly used files.
This replication and partitioning is too often ad-hoc, leading to the
"hotspot" problem familiar in multiple-disk mainframe systems [Kim86] and
requiring frequent user-directed load balancing. Not surprisingly,
custom-designed high-end file servers more reliably provide good performance,
but can be an expensive solution [Hitz90, Drapeau94].
Experience with disk arrays suggests another solution. If data is striped over
multiple independent disks of an array, then a high-concurrency workload will
be balanced with high probability as long as individual accesses are small
relative to the unit of interleaving [Livny87, Patterson88, Chen90].
Similarly, striping file storage across multiple servers provides parallel
transfer of large files and balancing of high concurrency workloads
[Hartman93]; striping of metadata promises further load-balancing [Dahlin95].
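
To make the striping arithmetic concrete, the following minimal C sketch shows
the round-robin mapping from a logical block to a (disk, block) pair that such
interleaving implies; the stripe-unit size, disk count, and names are
illustrative assumptions, not parameters of any of the cited systems.

#include <stdio.h>

#define NUM_DISKS   8    /* independent disks in the array (illustrative) */
#define STRIPE_UNIT 64   /* blocks per stripe unit (illustrative) */

/* Map a logical block number to (disk, physical block). Consecutive stripe
 * units rotate across the disks, so many small concurrent accesses spread
 * evenly over the array with high probability. */
static void map_block(long lbn, int *disk, long *pbn)
{
    long unit = lbn / STRIPE_UNIT;    /* which stripe unit holds the block */
    long off  = lbn % STRIPE_UNIT;    /* offset within that unit */
    *disk = (int)(unit % NUM_DISKS);  /* round-robin placement */
    *pbn  = (unit / NUM_DISKS) * STRIPE_UNIT + off;
}

int main(void)
{
    int disk;
    long pbn;
    map_block(100000L, &disk, &pbn);
    printf("logical block 100000 -> disk %d, block %ld\n", disk, pbn);
    return 0;
}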
Scalability prohibits the use of a single shared-media network; however, with
the emergence of switched network fabrics based on high-speed point-to-point
links, striped storage can scale bandwidth independent of other traffic in the
same fabric [Arnould89, Siu95, Boden95]. Unfortunately, current
implementations of Internet protocols demand significant processing power to
deliver high bandwidth — we observe as much as 80% of a 233 MHz DEC Alpha
consumed by UDP/IP receiving 135 Mbps over 155 Mbps ATM (even with adaptor
support for packet reassembly). Improving this bandwidth depends on interface
board designs [Steenkiste94, Cooper90], integrated layer processing for
network protocols [Clark89], direct application access to the network
interface [vonEicken92, Maeda93], copy avoiding buffering schemes
[Druschel93, Brustoloni96], and routing support for high-performance
best-effort traffic [Ma96, Traw95]. Perhaps most importantly, the protocol
stacks resulting from these research efforts must be deployed widely. This
deployment is critical because the comparable storage protocols, SCSI, and
soon, Fibre Channel, provide cost-effective hardware implementations routinely
included in client machines. For comparison, a 175 MHz DEC Alpha consumes
less than 5% of its processing power fetching 100 Mbps from a 160 Mbps SCSI
channel via the UNIX raw disk interface.
`
To exploit the economics of large systems resulting from the cobbling together
of many client purchases, the xFS file system distributes code, metadata and
data over all clients, eliminating the need for a centralized storage system
[Dahlin95]. This scheme naturally matches increasing client performance with
increasing server performance. Instead of reducing the server workload,
however, it takes the required computational power from another, frequently
idle, client. Complementing the advantages of filesystems such as xFS, the
network-attached storage architectures presented in this paper significantly
reduce the demand for server computation and eliminate file server machines
from the storage data path, reducing the coupling between overall file system
integrity and the security of individual client machines.
As distributed file system technology has improved, so have the storage
technologies employed by these systems. Storage density increases, long a
predictable 25% per year, have risen to 60% increases per year during the 90s.
Data rates, which were constrained by storage interface definitions until the
mid-80s, have increased by about 40% per year in the 90s [Grochowski96]. The
acceptance, in all but the lowest cost market, of SCSI, whose interface
exports the abstraction of a linear array of fixed-size blocks provided by an
embedded controller [ANSI86], catalyzed rapid deployment of technology
advances, resulting in an extremely competitive storage market.
The level of indirection introduced by SCSI has also led to transparent
improvements in storage performance such as RAID; transparent failure
recovery; real-time geometry-sensitive scheduling; buffer caching; read-ahead
and write-behind; compression; dynamic mapping; and representation migration
[Patterson88, Gibson92, Massiglia94, StorageTek94, Wilkes95, Ruemmler91,
Varma95]. However, in order to overcome the speed, addressability and
connectivity limitations of current SCSI implementations [Sachs94, ANSI95],
the industry is turning to high-speed packetized interconnects such as Fibre
Channel at up to 1 Gbps [Benner96]. The disk drive industry anticipates the
marginal cost for on-disk Fibre Channel interfaces, relative to the common
single-ended SCSI interface in use today, to be comparable to the marginal
cost for high-performance differential SCSI (a difference similar to the cost
of today's Ethernet adapters) while their host adapter costs are expected to
be comparable to high-performance SCSI adapters [Anderson95].
The idea of simple, disk-like network-based storage servers whose functions
are employed by higher-level distributed file systems has been around for a
long time [Birrell80, Katz92]. The Mass Storage System Reference Model
(MSSRM), an early architecture for hierarchical storage subsystems, has
advocated the separation of control and data paths for almost a decade
[Miller88, IEEE94]. Using a high-bandwidth network that supports direct
transfers for the data path is a natural consequence [Kronenberg86,
Drapeau94, Long94, Lee95, Menasce96, VanMeter96]. The MSSRM has been
implemented in the High Performance Storage System (HPSS) [Watson95] and
augmented with socket-level striping of file transfers [Berdahl95,
Wiltzius95], over the multiple network interfaces found on mainframes and
supercomputers.¹
`
¹Following Van Meter's [VanMeter96] definition of network-attached
peripherals, we consider only networks that are shared with general local
area network traffic and not single-vendor systems whose interconnects are
fast, isolated local area networks [Horst95, IEEE92].
`
`273
`
`HPE, Exh. 1011, p. 8
`
`
`
`• ••••••••••••
`
`File Server
`
`Controller
`
`• •
`
`wifM
`
`aifi D'
`
`Controller
`
`J'
`
`t
`iAy'L
`[fhfj' >
`
`(Packetized) SCSI
`
`frt;
`
`• a
`
`N
`•f
`
`M
`' vm^;,4|^|ii'
`'"; ^JT
`'
`•n }#W
`
`Backplane Bus
`
`Network File System Protocol
`
`[Network Protocol
`
`[Local File System | ^
`
`[Network Driver w.
`
`SCSI Driver
`®
`
`4.
`:•©
`:t
`©
`[System Memory | [SCSI Interface
`©
`
`[Network Interface p
`
`Local Area Network © |
`
`®
`
`Figure 1: Server-attached disks (SAD) are the familiar local area network distributed file
`systems. A client wanting data from storage sends a message to the file server (1), which sends a
`message to storage (2), which accesses the data and sends it back to the file server (3), which
`finally sends the requested data back to the client (4). Server-integrated disk (SID) is logically the
`same except that hardware and software in the file server machine may be specialized to the file
`service function.
`
Striping data across multiple storage servers with independent ports into a
scalable local area network has been advocated as a means of obtaining
scalable storage bandwidth [Hartman93]. If the storage servers of this
architecture are network-attached devices, rather than dedicated machines
between the network and storage, efficiency is further improved by avoiding
store-and-forward delays through the server.
Our notion of network-attached storage is consistent with these projects.
However, our analysis focuses on the evolution of commodity storage devices
rather than niche-market, very high-end systems, and on the interaction of
network-attached storage with common distributed file systems. Because all
prior work views the network-based storage as a function provided by an
additional computer, instead of the storage device itself, cost-effectiveness
has never been within reach. Our goal is to chart the way network-attached
storage is likely to appear in storage products, estimate its scalability
implications, and characterize the security and file system design issues in
its implementation.
3 Taxonomy of Network-Attached Storage
Simply attaching storage to a network underspecifies network-attached
storage's role in distributed file systems' architectures. In the following
subsections, we present a taxonomy for the functional composition of
network-attached storage. Case 0, the base case, is the familiar local area
network with storage privately connected to file server machines — we call
this server-attached disks. Case 1 represents a wide variety of current
products, server-integrated disks, that specialize hardware and software into
an integrated file server product. In Case 2, the obvious network-attached
disk design, network SCSI, minimizes modifications to the drive command
interface, hardware and software. Finally, Case 3, network-attached secure
disks, leverages the rapidly increasing processor capability of disk-embedded
controllers to restructure the drive command interface.
3.1 Case 0: Server-Attached Disks (SAD)
This is the system familiar to office and campus local area networks as
illustrated in Figure 1. Clients and servers share a network and storage is
attached directly to general-purpose workstations that provide distributed
file services.
`
3.2 Case 1: Server-Integrated Disks (SID)
Since file server machines often do little other than service distributed
file system requests, it makes sense to construct specialized systems that
perform only file system functions and not general-purpose computation. This
architecture is not fundamentally different from SAD. Data must still move
through the server machine before it reaches the network, but specialized
servers can move this data more efficiently than general-purpose machines.
Since high performance distributed file service benefits the productivity of
most users, this architecture occupies an important market niche [Hitz90,
Hitz94]. However, this approach binds storage to a particular distributed
file system, its semantics, and its performance characteristics. For example,
most server-integrated disks provide NFS file service, whose inherent
performance has long been criticized [Howard88]. Furthermore, this approach
is undesirable because it does not enable distributed file system and storage
technology to evolve independently. Server striping, for instance, is not
easily supported by any of the currently popular distributed file systems.
Binding the storage interface to a particular distributed file system hampers
the integration of such new features [Birrell80].
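
The server's place on the data path can be made concrete with a small sketch.
The following illustrative C fragment (the descriptors, transfer size, and
names are our assumptions, not taken from any particular server) shows the
store-and-forward read path common to SAD and SID: every byte crosses the
server's memory once on its way from disk to client.

#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

#define XFER 65536  /* per-copy buffer size; an arbitrary choice */

/* Serve one client read in a SAD/SID server: disk -> server buffer ->
 * network. Server memory bandwidth and CPU are consumed in proportion to
 * the bytes moved; this per-byte server work is what network-attached
 * storage removes from the data path. */
ssize_t serve_read(int disk_fd, int client_sock, size_t len)
{
    char buf[XFER];
    size_t total = 0;
    while (total < len) {
        size_t want = len - total;
        if (want > sizeof buf)
            want = sizeof buf;
        ssize_t n = read(disk_fd, buf, want);         /* steps 2-3: from storage */
        if (n <= 0)
            break;
        if (send(client_sock, buf, (size_t)n, 0) < 0) /* step 4: to the client */
            break;
        total += (size_t)n;
    }
    return (ssize_t)total;
}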
3.3 Case 2: Network SCSI (NetSCSI)
The other end of the spectrum is to retain as much as possible of SCSI, the
current dominant mid- and high-level storage device protocol. This is the
natural evolution path for storage devices; Seagate's Barracuda FC is already
providing packetized SCSI through Fibre Channel network ports to directly
attached hosts [Seagate96]. NetSCSI is a network-attached storage architecture
that makes minimal changes to the hardware and software of SCSI disks. File
manager software translates client requests into commands to disks, but
rather than returning data to the file manager to be forwarded, the NetSCSI
disks send data directly to clients, similar to the support for third-party
transfers already supported by SCSI [Drapeau94]. The efficient data transfer
engines typical of fast drives ensure that the drive's sustained bandwidth is
available to clients. Further, by eliminating the file manager from the data
path, its workload per active client decreases. However, the use of
third-party transfer changes the drive's role in the overall security of a
distributed file system.
`274
`
`HPE, Exh. 1011, p. 9
`
`
`
[Figure 2 shows the NetSCSI architecture: the file manager (network, file
system, access control and security modules) connected to the NetSCSI disks
by a private peripheral channel, with clients and disks also attached to the
local area network.]

Figure 2: Network SCSI (NetSCSI) is a network-attached disk architecture
designed for minimal changes to the disk's command interface. However,
because the network port on these disks may be connected to a hostile,
broader network, preserving the integrity of on-disk file system structure
requires a second port to a private (file manager-owned) network or
cryptographic support for a virtual private channel to the file manager. If a
client wants data from a NetSCSI disk, it sends a message (1) to the
distributed file system's file manager, which processes the request in the
usual way, sending a message over the private network to the NetSCSI disk
(2). The disk accesses data, transfers it directly to the client (3), and
sends its completion status to the file manager over the private network (4).
Finally, the file manager completes the request with a status message to the
client (5).
`
While it is not unusual for distributed file systems to employ a security
protocol between clients and servers (e.g. Kerberos authentication), disk
drives do not yet participate in this protocol.
We identify four levels of security within the NetSCSI model: (1)
accident-avoidance with a second private network between file manager and
disk, both locked in a physically secure room; (2) data transfer
authentication with clients and drives equipped with a strong cryptographic
hash function; (3) data transfer privacy with both clients and drives using
encryption; and (4) secure key management with a secure coprocessor.
Figure 2 shows the simplest security enhancement to NetSCSI: a second network
port on each disk. Since SCSI disks execute every command they receive
without an explicit authorization check, without a second port even
well-meaning clients can generate erroneous commands and accidentally damage
parts of the file system. The drive's second network port provides protection
from accidents while allowing SCSI command interpreters to continue following
their normal execution model. This is the architecture employed in the SIOF
and HPSS projects at LLNL [Wiltzius95, Watson95]. Assuming that file manager
and NetSCSI disks are locked in a secure room, this mechanism is acceptable
for the trusted network security model of NFS [Sandberg85].
Because file data still travels over the potentially hostile general network,
NetSCSI disks are likely to demand greater security than simple accident
avoidance. Cryptographic protocols can strengthen the security of NetSCSI. A
strong cryptographic hash function, such as SHA [NIST94], computed at the
drive and at the client would allow data transfer authentication (i.e., the
correct data was received only if the sender and receiver compute the same
hash on the data).
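
A minimal sketch of this check, using OpenSSL's SHA1 as a stand-in for the
SHA function cited above (message formats and any key handling, which a
deployed protocol would need, are omitted):

#include <string.h>
#include <openssl/sha.h>

/* Drive side: digest the outgoing data block before sending it. */
void drive_digest(const unsigned char *data, size_t len,
                  unsigned char digest[SHA_DIGEST_LENGTH])
{
    SHA1(data, len, digest);
}

/* Client side: returns 1 if the received data hashes to the drive's digest,
 * i.e., sender and receiver computed the same hash over the same bytes. */
int client_verify(const unsigned char *data, size_t len,
                  const unsigned char drive_digest[SHA_DIGEST_LENGTH])
{
    unsigned char local[SHA_DIGEST_LENGTH];
    SHA1(data, len, local);
    return memcmp(local, drive_digest, SHA_DIGEST_LENGTH) == 0;
}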
For some applications, data transfer authentication is insufficient, and
communication privacy is required. To provide privacy, a NetSCSI drive must
be able to encrypt and decrypt data. NetSCSI drives can use cryptographic
protocols to construct private virtual channels over the untrusted network.
However, since keys will be stored in devices vulnerable to physical attack,
the servers must still be stored in physically secure environments. If we go
one step further and equip NetSCSI disks with secure coprocessors [Tygar95],
then keys can be protected and all data can be encrypted when outside the
secure coprocessor, allowing the disks to be used in a variety of physically
open environments. There are now a variety of secure coprocessors [NIST94a,
Weingart87, White87, National96] available, some of which promise
cryptographic accelerators sufficient to support single-disk bandwidths.
3.4 Case 3: Network-Attached Secure Disks (NASD)
With network-attached secure disks, we relax the constraint of minimal change
from the existing SCSI interface and implementation. Instead we focus on
selecting a command interface that reduces the number of client-storage
interactions that must be relayed through the file manager, offloading more
of the file manager's work without integrating file system policy into the
disk.
Common, data-intensive operations, such as reads and writes, go straight to
the disk, while less-common ones, including namespace and access control
manipulations, go to the file manager. As opposed to NetSCSI, where a
significant part of the processing for security is performed on the file
manager, NASD drives perform most of the processing to enforce the security
policy. Specifically, the cryptographic functions and the enforcement of
manager decisions are implemented at the drive, while policy decisions are
made in the file manager. Because clients directly request access to data in
their files, a NASD drive must have sufficient metadata to map and authorize
the request to disk sectors. Authorization, in the form of a time-limited
capability applicable to the file's map and contents, should be provided by
the file manager to protect higher-level file systems' control over storage
access policy. The storage mapping metadata, however, could be provided
dynamically [VanMeter96a] by the file manager or could be maintained by the
drive. While the latter approach asks distributed file system authors to
surrender detailed control over the layout of the files they create, it
enables smart drives to better exploit detailed knowledge of their own
resources to optimize data layout, read-ahead, and cache management
[deJonge93, Patterson95, Golding95]. This is precisely the type of
value-added opportunity that nimble storage vendors can exploit for market
and customer advantage. With mapping metadata at the drive controlling the
layout of files, a NASD drive exports a namespace of file-like objects.
Because control o
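
Although this paper does not define a capability format, the idea of
manager-issued, drive-enforced, time-limited capabilities can be illustrated
with a hypothetical sketch; every field, name, and the toy keyed checksum
below are assumptions for illustration only, and a real implementation would
use a proper cryptographic MAC.

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define RIGHT_READ  0x1u
#define RIGHT_WRITE 0x2u
#define MAC_LEN     20

struct nasd_capability {
    uint64_t object_id;    /* which drive object may be accessed        */
    uint32_t rights;       /* RIGHT_READ and/or RIGHT_WRITE             */
    uint64_t expiry;       /* seconds since epoch; bounds the lifetime  */
    uint8_t  mac[MAC_LEN]; /* file manager's seal over the fields above */
};

/* Toy stand-in for a real keyed MAC: mixes the non-MAC capability fields
 * with a secret the manager shares with the drive. Illustrative only; it
 * offers no real cryptographic strength. */
static void compute_mac(const struct nasd_capability *c,
                        const uint8_t secret[MAC_LEN], uint8_t out[MAC_LEN])
{
    const uint8_t *p = (const uint8_t *)c;
    size_t n = offsetof(struct nasd_capability, mac);
    memcpy(out, secret, MAC_LEN);
    for (size_t j = 0; j < n; j++)
        out[j % MAC_LEN] ^= p[j];
}

/* Drive-side enforcement: the file manager made the policy decision when it
 * issued the capability; the drive only verifies and enforces it. */
int drive_authorize(const struct nasd_capability *c, uint64_t object_id,
                    uint32_t wanted, const uint8_t secret[MAC_LEN])
{
    uint8_t mac[MAC_LEN];
    compute_mac(c, secret, mac);
    if (memcmp(mac, c->mac, MAC_LEN) != 0)  return 0; /* forged or altered  */
    if (c->object_id != object_id)          return 0; /* wrong object       */
    if ((c->rights & wanted) != wanted)     return 0; /* missing rights     */
    if ((uint64_t)time(NULL) > c->expiry)   return 0; /* capability expired */
    return 1;
}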