IPR2018-00150, No. 1018-18 Exhibit - Pang et al, 80211 User Fingerprinting 2007 (P.T.A.B. Nov. 3, 2017)

802.11 User Fingerprinting
`
`Jeffrey Pang∗, Ben Greenstein†, Ramakrishna Gummadi‡, Srinivasan Seshan∗, David Wetherall†§
`∗Carnegie Mellon University, †Intel Research Seattle, ‡University of Southern California, §University of Washington
`jeffpang@cs.cmu.edu, benjamin.m.greenstein@intel.com, gummadi@usc.edu, srini@cmu.edu, djw@cs.washington.edu
`
`ABSTRACT
`The ubiquity of 802.11 devices and networks enables anyone to
`track our every move with alarming ease. Each 802.11 device
`transmits a globally unique and persistent MAC address and thus
`is trivially identiﬁable. In response, recent research has proposed
`replacing such identiﬁers with pseudonyms (i.e., temporary, un-
`linkable names). In this paper, we demonstrate that pseudonyms
`are insufﬁcient to prevent tracking of 802.11 devices because im-
`plicit identiﬁers, or identifying characteristics of 802.11 trafﬁc, can
`identify many users with high accuracy. For example, even with-
`out unique names and addresses, we estimate that an adversary can
`identify 64% of users with 90% accuracy when they spend a day
`at a busy hot spot. We present an automated procedure based on
`four previously unrecognized implicit identiﬁers that can identify
`users in three real 802.11 traces even when pseudonyms and en-
`cryption are employed. We ﬁnd that the majority of users can be
`identiﬁed using our techniques, but our ability to identify users is
`not uniform; some users are not easily identiﬁable. Nonetheless,
`we show that even a single implicit identiﬁer is sufﬁcient to distin-
`guish many users. Therefore, we argue that design considerations
`beyond eliminating explicit identiﬁers (i.e., unique names and ad-
`dresses), must be addressed in order to prevent user tracking in
`wireless networks.
`Categories and Subject Descriptors:
`C.2.1 Computer-Communication Networks: Network Architecture
`and Design
`General Terms: Measurement, Security
`Keywords: privacy, anonymity, wireless, 802.11
`
`1. INTRODUCTION
`The alarming ease with which third parties can track our ev-
`ery move has drawn the concern of the popular media [1, 2], the
`United States government [22, 40], and technical standards bod-
`ies [17]. The fear is that we are sacriﬁcing our location privacy
`due to the ubiquity of wireless devices that disclose our locations,
`identities, or both. Though this fear has focused on large scale
`wireless systems, such as cellular phone networks, the capability
`to track user location in such systems has typically been limited
`to service providers that are legally bound to protect our privacy.
`In contrast, the low cost of 802.11 hardware and ease of access to
`network monitoring software (all that is required for someone to
`locate others nearby and eavesdrop on their traffic) enable anyone
`to track users. Furthermore, although the popular press raised
`awareness about tracking threats posed by emerging wireless
`technologies, such as RFID [13], no such campaign has been waged to
`educate users about 802.11 devices and networks, which pose the
`same threats today.
`
`Permission to make digital or hard copies of all or part of this work for
`personal or classroom use is granted without fee provided that copies are
`not made or distributed for profit or commercial advantage and that copies
`bear this notice and the full citation on the first page. To copy otherwise, to
`republish, to post on servers or to redistribute to lists, requires prior specific
`permission and/or a fee.
`MobiCom’07, September 9–14, 2007, Montréal, Québec, Canada.
`Copyright 2007 ACM 978-1-59593-681-3/07/0009 ...$5.00.
`
`The best practices for securing 802.11 networks, embodied in
`the 802.11i standard [16], provide user authentication, service au-
`thentication, data conﬁdentiality, and data integrity. However, they
`do not provide anonymity, a property essential to prevent location
`tracking. For example, it is trivial to track an 802.11 device today
`since each device advertises a globally unique and persistent MAC
`address with every frame that it transmits. To mask this identiﬁer,
`researchers have proposed applying pseudonyms [14, 18] (i.e., tem-
`porary, unlinkable names) by having users periodically change the
`MAC addresses of their 802.11 devices.
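
As a rough illustration of the pseudonym proposal, the sketch below draws a fresh random MAC address for each session. It assumes that a pseudonym is simply a random unicast address with the locally administered bit set; the function name `random_pseudonym_mac` is ours, and the cited proposals [14, 18] may select addresses differently.

```python
import secrets

def random_pseudonym_mac() -> str:
    """Return a random unicast, locally administered MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the locally administered bit (0x02) and clear the multicast bit (0x01)
    # in the first octet, so the address is unicast and not vendor-assigned.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)

# One fresh pseudonym per wireless session (three hypothetical sessions here).
session_macs = [random_pseudonym_mac() for _ in range(3)]
```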
`In this paper, we demonstrate that pseudonyms are insufﬁcient
`to provide anonymity in 802.11. Even without a unique address,
`characteristics of users’ 802.11 trafﬁc can identify them implicitly
`and track them with high accuracy. An example of such an im-
`plicit identiﬁer is the IP address of a service that a user frequently
`accesses, such as his or her email server. In a population of sev-
`eral hundred users, this address might be unique to one individual;
`thus, the mere observation of this IP address would indicate the
`presence of that user. Of course, in a wireless network that em-
`ploys link-layer encryption, IP addresses would not be visible to
`an eavesdropper. However, other implicit identiﬁers would remain
`and these identiﬁers can be used in combination to identify users
`accurately.
`This paper quantiﬁes how well a passive adversary can track
`users with four implicit identiﬁers visible to commodity hardware.
`We thereby place a lower bound on how accurately users can be
`identiﬁed implicitly, as more implicit identiﬁers and more capable
`adversaries exist in practice. We make the following contributions:
`
`• We identify four previously unrecognized implicit identiﬁers:
`network destinations, network names advertised in 802.11
`probes, differing conﬁgurations of 802.11 options, and sizes
`of broadcast packets that hint at their contents.
`
`• We develop an automated procedure to identify users. This
`procedure allows us to quantify how much information im-
`plicit identiﬁers, both alone and in combination, reveal about
`several hundred users in three empirical 802.11 traces.
`
`IA1018
`
`Page 1 of 12
`
`

`• Our evaluation shows that users emit highly discriminating
`implicit identiﬁers, and, thus, even a small sample of network
`trafﬁc can identify them more than half (56%) of the time in
`public networks, on average. Moreover, we will almost never
`mistake them as the source of other network trafﬁc (1% of the
`time). Since adversaries will obtain multiple trafﬁc samples
`from a user over time, this high accuracy in trafﬁc classi-
`ﬁcation enables them to track many users with even higher
`accuracy in common wireless networks. For example, an ad-
`versary can identify 64% of users with 90% accuracy when
`they spend a day at a busy hot spot that serves 25 concurrent
`users each hour.
`
`• To our knowledge, we are the ﬁrst to show with empirical
`evidence that design considerations beyond eliminating ex-
`plicit identiﬁers, such as unique names and addresses, must
`be addressed to protect anonymity in wireless networks.
`
`In Section 2 we illustrate the power of implicit identiﬁers with
`several real examples. Section 3 covers related work. Section 4 ex-
`plains our experimental methodology. Section 5 describes our em-
`pirical 802.11 traces. Section 6 analyzes how well 802.11 users can
`be identiﬁed using each implicit identiﬁer individually. Section 7
`examines how accurately an adversary can track people using these
`implicit identiﬁers in public, home, and enterprise networks. We
`conclude in Section 8.
`
`2. THE IMPLICIT IDENTIFIER PROBLEM
`How signiﬁcantly do implicit identiﬁers erode location privacy?
`Consider the seemingly innocuous trace of 802.11 trafﬁc collected
`at the 2004 SIGCOMM conference, now anonymized and archived
`for public use [31]. Interestingly, hashing real MAC addresses to
`pseudonyms is also the best practice for anonymizing traces such
`as this. Unfortunately, implicit identiﬁers remain and they are suf-
`ﬁcient to identify many SIGCOMM attendees. For example:
`Implicit identiﬁers can identify us uniquely. One particular at-
`tendee’s laptop transmitted requests for the network names “MIT,”
`“StataCenter,” and “roofnet,” identifying him or her as someone
`probably from Cambridge, MA. This occurred because the default
`behavior of a Windows laptop is to actively search for the user’s
`preferred networks by name, or Service Set Identiﬁer (SSID). The
`SSID “therobertmorris” perhaps identiﬁes this person uniquely [26].
`A second attendee requested “University of Washington” and “djw.”
`The last SSID is unique in the SIGCOMM trace and suggests that
`this person may be University of Washington Professor David J.
`Wetherall, one of our coauthors. More distressingly, Wigle [39],
`an online database of 802.11 networks observed around the world,
`shows that there is only one “djw” network in the entire Seattle
`area. Wigle happens to locate this network within 192 feet of David
`Wetherall’s home.
`Implicit identifiers remain even when countermeasures are employed.
`Another SIGCOMM attendee transferred 512 MB of data
`via BitTorrent (this user contacted hosts on the typical BitTorrent
`port, 6881). A request for the SSID “roofnet” [32] from the same
`MAC address suggests that this user is from Cambridge, MA. Sup-
`pose that this user had been more stealthy and changed his or her
`MAC address periodically. In this particular case, since the user
`had not requested the SSID during the time he or she had been
`downloading, the MAC address used in the SSID request would
`have been different from the one used in BitTorrent packets. There-
`fore, we would not be able to use the MAC address to explicitly link
`“roofnet” to this poor network etiquette. However, the user does access
`the same SSH and IMAP server nearly every hour and was the
`only user at SIGCOMM to do so. Thus, this server’s address is an
`implicit identiﬁer, and knowledge of it enables us to link the user’s
`sessions together.
`Now suppose that the network employed a link-layer encryption
`scheme, such as WPA, that obscures network addresses. Even then,
`we could link this user’s sessions together by exploiting the fact
`that, of the 341 users that sent 802.11 broadcast packets, this was
`the only one that sent broadcast packets of sizes 239, 245, and 257
`bytes and did so repeatedly throughout the entire conference.
`Furthermore, the identical 802.11 capabilities advertised in each
`session’s management frames improve our confidence in this linkage
`because these capabilities differentiate 802.11 cards and drivers.
`Prior research has shown that peer-to-peer file sharing traffic can
`be detected through encryption [42]. Thus, even if pseudonyms and
`link-layer encryption were employed, we could still implicate
`someone in Cambridge.
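
The linkage argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session records, pseudonym addresses, and capability strings are invented, and the real analysis combines evidence probabilistically rather than by exact matching. Only the three broadcast sizes come from the text.

```python
from collections import defaultdict

# Hypothetical per-session observations under changing pseudonyms:
# (pseudonym MAC, advertised capabilities, broadcast packet sizes seen).
sessions = [
    ("02:00:00:00:00:01", ("rates:1,2,5.5,11", "short-preamble"), {239, 245, 257}),
    ("02:00:00:00:00:02", ("rates:1,2,5.5,11", "short-preamble"), {92, 239, 245, 257}),
    ("02:00:00:00:00:03", ("rates:6,12,24", "long-preamble"), {92, 300}),
]

TELLTALE_SIZES = frozenset({239, 245, 257})  # broadcast sizes cited in the text

def link_sessions(sessions, telltale=TELLTALE_SIZES):
    """Group pseudonyms whose sessions show all telltale sizes and identical capabilities."""
    groups = defaultdict(list)
    for mac, capabilities, sizes in sessions:
        if telltale <= sizes:  # every telltale size was observed in this session
            groups[capabilities].append(mac)
    return dict(groups)

linked = link_sessions(sessions)
```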
`Implicit identiﬁers are exposed by design ﬂaws. These exam-
`ples illustrate three shortcomings of the 802.11 protocol beyond
`exposing explicit identiﬁers, none of which is trivially ﬁxed. These
`shortcomings afﬂict not only 802.11 but many wireless protocols,
`including Bluetooth and ZigBee.
`Identifying information exposed at higher layers of the network
`stack is not adequately masked. For example, even with encryption,
`packet sizes can be identifying. Padding, decoy transmissions, and
`delays may hide information exposed by size and timing channels,
`but increase overhead. For example, Sun et al. [34] found that 8 to
`16 KB of padding is required to hide the identity of web objects.
`The performance penalty due to this overhead would be especially
`acute in wireless networks due to the shared nature of the medium.
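
To make the cost of padding concrete, the following sketch pads every packet up to a fixed bucket boundary and reports the resulting byte overhead. Bucketed padding and the 512-byte bucket are assumptions chosen for illustration; this is not the scheme evaluated by Sun et al. [34].

```python
def padded_size(size: int, bucket: int) -> int:
    """Round a packet size up to the next multiple of `bucket` bytes."""
    if size <= 0 or bucket <= 0:
        raise ValueError("size and bucket must be positive")
    return -(-size // bucket) * bucket  # ceiling division

def padding_overhead(sizes, bucket):
    """Fraction of extra bytes sent when every packet is padded to a bucket boundary."""
    original = sum(sizes)
    padded = sum(padded_size(s, bucket) for s in sizes)
    return (padded - original) / original

# The three broadcast sizes from Section 2 become indistinguishable once padded,
# at the price of more than doubling the bytes on the air.
sizes = [239, 245, 257]
padded = [padded_size(s, 512) for s in sizes]
overhead = padding_overhead(sizes, 512)
```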
`Identifying information during service discovery is not masked.
`802.11 service discovery can not be encrypted since no shared keys
`exist prior to association. This raises the more general problem
`of how two devices can discover each other in a private manner,
`which is expensive to solve [4]. This problem arises not only when
`searching for access points, but also when clients want to locate
`devices in ad hoc mode, such as when using a Microsoft Zune to
`share music or a Nintendo DS to play games with friends.
`Identifying information exposed by variations in implementation
`and conﬁguration is not masked. Each 802.11 implementation typ-
`ically supports different 802.11 features (e.g., supported rates) and
`has different timing characteristics. This problem is difﬁcult to
`solve due to the inherent ambiguity of human speciﬁcations and
`manufacturers’ and network implementers’ desire for ﬂexibility to
`meet differing constraints.
`Balancing the costs involved in rectifying these shortcomings
`with the incentives necessary for deployment is itself a challenge.
`Nonetheless, rectifying these ﬂaws at the protocol level is impor-
`tant so that users need not limit their activities in order to protect
`their location privacy. By measuring the magnitude with which
`each ﬂaw contributes to the implicit identiﬁer problem, our study
`provides insight into the proper trade-offs to make when correcting
`these design ﬂaws in future wireless protocols. In the short term,
`our study may give guidance to individuals that are willing to pro-
`actively hide their identity in existing wireless networks.
`In the remainder of this paper, we examine how these shortcom-
`ings impact the location privacy of a large number of users in differ-
`ent 802.11 networks and demonstrate that the examples described
`in this section are not isolated anomalies.
`
`3. RELATED WORK
`The challenge of hiding a user’s identity has been examined in
`three different contexts: location privacy, identity hiding designs,
`and the study of other implicit identiﬁers. In this section, we de-
`scribe the previous work in each of these areas.
`Location Privacy. Location privacy has recently received signiﬁ-
`cant attention, most notably in the RFID [13] and pervasive com-
`puting [7] ﬁelds. The concern is that location-aware applications,
`which use GPS and other positioning technologies, might reveal
`this information in undesirable ways. However, location privacy
`is threatened even by devices that do not explicitly track location.
`Since 802.11 users usually associate with access points that are less
`than tens of meters away, knowing the access point that a user is as-
`sociated with gives away a coarse estimate of his location, such as
`his home or workplace. Moreover, systems that can employ multi-
`ple monitoring locations can use wireless signal strength to obtain
`an even more accurate estimate of a user’s location [6, 35]. An
`added complication is that wireless devices are rapidly becoming
`integral parts of our daily lives. A resulting trend, which is evident
`from examining databases of access point locations [39], is the in-
`creasing availability of service, which is increasing the number of
`location tracking opportunities. Unfortunately, identifying individ-
`ual users is often trivial since the 802.11 devices that they use are
`uniquely named by their MAC addresses.
`Identity Hiding. Pseudonyms are widely used in systems such as
`the GSM cellular phone network [15] to hide user identities.
`Gruteser et al. [14] and Jiang et al. [18] proposed using pseudonyms
`within 802.11 networks, and Stajano et al. [41] proposed a similar
`mechanism for Bluetooth. Using pseudonyms is a necessary ﬁrst
`step to make tracking in these networks more difﬁcult. However,
`we show that it is insufﬁcient to protect location privacy because
`implicit identiﬁers can be sufﬁcient to track users in many real sce-
`narios.
`Implicit Identiﬁers. Fingerprinting devices using implicit identi-
`ﬁers is not a new concept. For example, Franklin et al. [11] showed
`that it is possible to ﬁngerprint device drivers using the timing of
`802.11 probes. In contrast, our work attempts to pin down actual
`user identities rather than selecting among a few dozen drivers.
`Kohno et al. [21] showed that devices could be ﬁngerprinted us-
`ing the clock skew exposed by TCP timestamps. We introduce new
`implicit identiﬁers that are useful in identifying users and, in con-
`trast to TCP timestamps, three of our identiﬁers are still visible in
`wireless networks using link-layer encryption. Moreover, Kohno et
`al. note that one limitation of their work is that an adversary can not
`passively obtain timestamps from devices running the most preva-
`lent operating system, Windows XP. For example, in two of our
`empirical traces, only 32% and 15% of the users sent TCP timestamps.
`All our identifiers have at least 55% coverage.
`Padmanabhan and Yang [29] explored ﬁngerprinting users with
`“clickprints,” or the paths that users take through a website. Their
`techniques rely on data from many user sessions collected at ac-
`tual web servers. Our techniques can be employed passively by
`anyone with a wireless card without even associating to a network.
`These three research efforts complement ours, since the procedure
`we develop for identifying users enables an adversary to use these
`implicit identiﬁers in combination with ours, yielding even more
`accurate user ﬁngerprints. None of these previous efforts offer a
`formal method to combine multiple pieces of evidence. Moreover,
`to our knowledge, we are the first to evaluate how well users
`are identiﬁed by implicit identiﬁers observed in empirical wireless
`data.
`Implicit identiﬁers also reveal identity in other contexts. Security
`tools like nmap [12] and p0f [28] leverage differences in network
`stack behaviors to determine a device’s operating system. Keystroke
`dynamics have been shown to accurately identify users [24, 33].
`The timing and sizes of Web transfers often uniquely identify
`websites, even when transmitted over encrypted channels [8, 34].
`Finally, there has been a large body of research in identifying appli-
`cations from implicit identiﬁers in encrypted trafﬁc [19, 20, 25, 42,
`43]. Like many of these techniques which succeed in classifying
`applications accurately, we use a Bayesian approach.
`
`4. EXPERIMENTAL SETUP
`This section describes the evaluation criteria we use to determine
`how well several implicit identiﬁers can be used to track users.
`The Adversary. Strong adversaries, such as service providers and
`large monitoring networks, obviously pose a large threat to our lo-
`cation privacy. However, the signiﬁcance of the threat posed by
`802.11 is that anyone that wishes to track users can do so.
`Therefore, we consider an adversary that runs readily available
`monitoring software, such as tcpdump [37], on one or more lap-
`tops or on less conspicuous commodity 802.11 devices [3]. We
`further restrict adversaries by assuming that their devices listen
`passively. That is, they never transmit 802.11 frames, not even
`to associate with a network. This means that the adversary can not
`be detected by other radios. The adversary deploys monitoring de-
`vices in one or more locations in order to observe 802.11 trafﬁc
`from nearby users. By considering a weak adversary, we place a
`lower bound on the accuracy with which users can be tracked, as
`stronger adversaries would be strictly more successful.
`The Environments. An adversary’s tracking accuracy will depend
`on the 802.11 networks he or she is monitoring. Since implicit
`identiﬁers are not perfectly identifying, it will be more difﬁcult to
`distinguish users in more populous networks. In addition, different
`networks employ different levels of security, making some implicit
`identiﬁers invisible to an adversary. We consider the three domi-
`nant forms of wireless deployments today: public networks, home
`networks, and enterprise networks.
`Public networks, such as hot spots or metro-area networks [27],
`are typically unencrypted at the link-layer. Although many public
`networks employ access control—for example, to allow access to
`only a provider’s customers—most do so via authentication above
`the link-layer (e.g., through a web page) and by using MAC address
`ﬁltering thereafter. Very few use 802.11i-compliant protocols that
`also enable encryption. Hence, identifying features at the network,
`link, and physical layers would be visible to an eavesdropper in
`such an environment. Unfortunately, this is the most common type
`of network today due to the challenge of secure key distribution.
`Home and small business networks are small, but detecting when
`speciﬁc users are present is increasingly challenging due to the
`high density of access points in urban areas [5]. In addition, these
`networks are more likely to employ link-layer encryption, such
`as WEP or WPA, because the set of authorized users is typically
`known and is small. In cases where link-layer encryption is em-
`ployed, an eavesdropper will not be able to view the payloads of
`data packets. However, features that are derived from frame sizes
`or timing, which are not masked by encryption, or from 802.11
`management frames, which are always sent in the clear, remain
`visible.
`Finally, security conscious enterprise networks are likely to em-
`ploy link-layer encryption. Moreover, if the only authorized de-
`vices on the network are provided by the company, there will be
`less diversity in the behavior of wireless cards. For example, Intel
`Corporation issues similar corporate laptops to its employees. We
`consider an enterprise network where only one type of wireless card
`and conﬁguration is in use, so users can not be identiﬁed by differ-
`ences in device implementation. However, features derived from
`the networks that users visit or the applications and services they
`run remain visible.
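
The three environments can be summarized as a small lookup of which implicit identifiers remain observable. This is a sketch of our reading of the descriptions above, assuming that home networks do enable link-layer encryption; `netdests`, `ssids`, and `bcast` are the names used later in the paper, while `options` is a placeholder name for the 802.11 option-configuration identifier.

```python
# Which implicit identifiers a passive eavesdropper can observe in each setting.
ALL_IDENTIFIERS = {"netdests", "ssids", "options", "bcast"}

def visible_identifiers(link_encryption: bool, uniform_hardware: bool) -> set:
    """Return the identifiers left visible under the given network properties."""
    visible = set(ALL_IDENTIFIERS)
    if link_encryption:
        visible.discard("netdests")  # IP addresses and ports sit inside encrypted payloads
    if uniform_hardware:
        visible.discard("options")   # identical cards and drivers advertise identical options
    return visible

ENVIRONMENTS = {
    "public": visible_identifiers(link_encryption=False, uniform_hardware=False),
    "home": visible_identifiers(link_encryption=True, uniform_hardware=False),
    "enterprise": visible_identifiers(link_encryption=True, uniform_hardware=True),
}
```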
`The Monitoring Scenario. We assume that users use different
`pseudonyms during each wireless session in each of these environ-
`ments, as Gruteser et al. [14] propose. As a result, explicit iden-
`tiﬁers can not link their sessions together. Sessions can vary in
`length, so we assume that every hour, each user will have a differ-
`ent pseudonym. We deﬁne a trafﬁc sample to be one user’s network
`trafﬁc observed during one hour.
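
The definition of a traffic sample can be sketched as a grouping of captured frames by source address and hour. The record layout `(timestamp_seconds, source_mac, frame_length)` and the short MAC strings are assumptions made for illustration.

```python
from collections import defaultdict

def traffic_samples(frames):
    """Group frames into samples keyed by (source MAC, hour index).

    `frames` is an iterable of (timestamp_seconds, source_mac, frame_length)
    tuples; each sample holds the frame lengths seen for one user in one hour.
    """
    samples = defaultdict(list)
    for timestamp, mac, length in frames:
        hour = int(timestamp // 3600)
        samples[(mac, hour)].append(length)
    return dict(samples)

frames = [
    (10.0, "aa:aa", 120),
    (3599.9, "aa:aa", 60),
    (3600.0, "aa:aa", 239),  # falls in the next hour, so it starts a new sample
    (20.0, "bb:bb", 1500),
]
samples = traffic_samples(frames)
```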
`Although it is possible for users to change their MAC addresses
`more frequently, this is unlikely to be very useful in practice be-
`cause other features, such as received signal strength, can link
`pseudonyms together at these timescales [6, 35]. Moreover, chang-
`ing a device’s MAC address forces a device to re-associate with
`the access point and, thus, disrupts active connections. In addition,
`it may require users to revisit a web page to re-authenticate
`themselves, since MAC addresses are tied to user accounts in many
`public networks. Users are unlikely to tolerate these annoyances
`multiple times per session.
`Of course, the ability to link trafﬁc samples together does not
`help an adversary detect a user’s presence unless the adversary is
`also able to link at least one sample to that user’s identity. In Sec-
`tion 2, we showed that identity can sometimes be revealed by cor-
`relating implicit identiﬁers with out-of-band information, such as
`that provided by the Wigle [39] location database. However, if the
`adversary knows the user he wishes to track, he can likely obtain a
`few trafﬁc samples known to come from that user’s device. For ex-
`ample, an adversary could obtain such samples by physically track-
`ing a person for a short time. We assume the adversary is able to
`obtain this set of training samples either before, during, or after
`the monitoring period. Our results show that on average, only 1 to
`3 training samples are sufﬁcient to track users with each implicit
`identiﬁer (see Section 6.2.3). The monitor itself collects samples
`that the adversary wants to test, which we call validation samples.
`Evaluation Criteria. There are a number of questions an adver-
`sary may wish to answer with these validation samples. Who was
`present? When was user U present? Which samples came from
`user U? Essential to answering all these questions is the ability to
`classify samples by the user who generated them. In other words,
`given a validation sample, the adversary needs to answer the fol-
`lowing question for one or more users U:
`
`Question 1 Did this trafﬁc sample come from user U?
`
`Section 6 evaluates how well an adversary can answer this question
`with each of our implicit identiﬁers.
`To demonstrate how well implicit identiﬁers can be used for
`tracking, we also evaluate the accuracy in answering the following:
`
`Question 2 Was user U here today?
`
`This question is distinct from Question 1 because an adversary can
`observe many trafﬁc samples at any given time, any one of which
`may be from the target user U. In addition, a single afﬁrmative
`answer to Question 1 does not necessitate an affirmative answer to
`Question 2 because an adversary may want to be more certain by
`obtaining multiple positive samples. Section 7 details the interac-
`tion between these questions and evaluates how many users can
`be tracked with high accuracy in each of the 802.11 networks de-
`scribed above.
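
The relationship between the two questions can be illustrated with a simple threshold rule: answer Question 2 affirmatively only if at least k samples receive an affirmative answer to Question 1. The sketch below assumes that samples are classified independently, which is a simplification of ours and not the analysis of Section 7. The rates of 56% and 1% are the averages quoted in the introduction; the sample counts are hypothetical.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def presence_decision_rates(k, target_samples, other_samples, tpr, fpr):
    """Return (detection, false alarm) probabilities for the k-positive-samples rule.

    detection:   user U is present and emits `target_samples` samples.
    false alarm: user U is absent and `other_samples` samples come from other users.
    """
    detect = prob_at_least(k, target_samples, tpr)
    false_alarm = prob_at_least(k, other_samples, fpr)
    return detect, false_alarm

# Hypothetical day: U is active for 8 hours; 24 other users are active in each hour.
detect, false_alarm = presence_decision_rates(
    k=2, target_samples=8, other_samples=8 * 24, tpr=0.56, fpr=0.01
)
```

Raising k lowers the false-alarm probability at the cost of missing users who are present only briefly, which is why a single affirmative sample need not settle Question 2.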
`
`5. WIRELESS TRACES
`We evaluate the implicit identiﬁers of users in three 802.11 traces.
`We consider sigcomm, a 4 day trace taken from one monitoring
`point at the 2004 SIGCOMM conference [31], ucsd, a trace of all
`802.11 trafﬁc in U.C. San Diego’s computer science building on
`November 17, 2006 [10], and apt, a 19 day trace monitoring all
`networks in an apartment building, which we collected. All traces
`were collected with tcpdump-like tools and only contain informa-
`tion that can be collected using standard wireless cards in monitor
`mode. The ucsd trace is the union of observations from multiple
`monitoring points. IP and MAC addresses are anonymized but are
`consistent throughout each trace (i.e., there is a unique one-to-one
`mapping between addresses and anonymized labels). Link-layer
`encryption (i.e., WEP or WPA) was not employed in either the
`sigcomm or ucsd network and neither trace recorded application
`packet payloads. In our analysis, we show that implicit identiﬁers
`remain even when we emulate link layer encryption and that we
`do not need packet payloads to identify users accurately. The apt
`trace only recorded broadcast management packets due to privacy
`concerns; hence, we only use it to study the one implicit identiﬁer
`that is extracted from these packets.
`We distinguish unique users by their MAC address since it is not
`currently common practice to change it. To simulate the effect of
`using pseudonyms, we assume that every user has a different MAC
`address each hour. Hence, we have one sample per user for each
`hour that they are active. To simulate the training samples collected
`by an adversary, we split each trace into two temporally contiguous
`parts. Samples from the ﬁrst part are used as training samples and
`the remainder are validation samples. We choose a training period
`in each trace long enough to proﬁle a large number of users. For
`the sigcomm trace, the training period covers the time until the
`end of the ﬁrst full day of the conference. For the ucsd trace, the
`training period covers the time until just before noon. We skip one
`hour between the training and validation periods so user activities
`at the end of the training period are less likely to carry over to the
`validation period. For the apt trace, the training period covers the
`ﬁrst 5 days. We consider a user to be present during an hour if and
`only if she sends at least one data or 802.11 probe packet during
`that time; i.e., if the user is actively using or searching for a wireless
`network.1
`Table 1 shows the relevant statistics about each trace. Note that
`since we can only compute accuracy for users that were present in
`both the training and validation data, those are the only users that
`we proﬁle. Therefore, results in this paper refer to ‘Proﬁled Users’
`as the total user count and not ‘Total Users.’
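
The temporal split and the notion of a profiled user can be sketched as follows. The sample layout `(timestamp_seconds, user, features)` and the example values are assumptions for illustration; only the one-hour gap and the rule that a profiled user must appear in both periods come from the text.

```python
def split_samples(samples, training_end, gap=3600):
    """Split (timestamp, user, features) samples into training and validation sets.

    Samples before `training_end` are training data, samples in the following
    `gap` seconds are skipped, and all later samples are validation data.
    """
    training = [s for s in samples if s[0] < training_end]
    validation = [s for s in samples if s[0] >= training_end + gap]
    return training, validation

def profiled_users(training, validation):
    """Users seen in both periods, the only ones for whom accuracy can be computed."""
    return {s[1] for s in training} & {s[1] for s in validation}

samples = [
    (1000, "alice", "a1"),
    (5000, "bob", "b1"),
    (7300, "alice", "a2"),   # inside the skipped hour
    (11000, "alice", "a3"),
    (12000, "carol", "c1"),
]
training, validation = split_samples(samples, training_end=7200)
profiled = profiled_users(training, validation)
```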
`
`6. IMPLICIT IDENTIFIERS
`In this section, we describe four novel implicit identiﬁers and
`evaluate how much information each one reveals. Our results show
`that (1) many implicit identiﬁers are effective at distinguishing in-
`dividual users and others are effective at distinguishing groups of
`users; (2) a non-trivial fraction of users are trackable using any one
`highly discriminating identiﬁer; (3) on average, only 1 to 3 train-
`ing samples are required to leverage each implicit identiﬁer to its
`full effect; and (4) at least one implicit identiﬁer that we examine
`accurately identiﬁes users over multiple weeks.
`1We ignore samples that only contain other 802.11 management
`frames, such as power management polls. Including samples with
`these frames would not appreciably change the characteristics of
`the sigcomm workload, but would double the number of total
`“users” in the ucsd workload. This is because many devices ob-
`served in the ucsd trace were never actively using the network; we
`ignore these idle devices.
`
`                                  sigcomm                 ucsd                    apt
`                                  training    validation  training    validation  training    validation
`Duration (hours)                  37          54          10          11          119         345
`Total Samples                     1974        3391        587         1240        638         1473
`Frames Per Sample (median)        289         284         1227        1128        57          92
`Total Users                       377         412         225         371         97          196
`Profiled Users                    337         337         153         153         39          39
`Samples Per Profiled User (mean)  5.5         9.1         3.1         4.7         14.7        32.2
`Users Per Hour (mean)             53          64          59          113         5           4
`Table 1: Summary of relevant workload statistics and parameters. The duration reports only hours with at least one active user.
`
`6.1 Identifying Trafﬁc Characteristics
`Network Destinations. We ﬁrst consider netdests, the set of IP
`<address, port> pairs in a trafﬁc sample, excluding pairs that are
`known to be common to all users, such as the address of the local
`network’s DHCP server. There are several reasons to believe that
`this set is relatively unique to each user. It is well known that the
`popularity of web sites has a Zipf distribution [9], so many sites are
`visited by a small number of users. In fact, in the sigcomm and
`ucsd training data, each <address, port> pair is visited by 1.15
`and 1.20 users on average, respectively. The set of sites that a user
`visits is even more likely to be unique. In addition, users are likely
`to visit some of the same sites repeatedly over time. For example,
`a user generally has only one email server and a set of bookmarked
`sites they check often [36].
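
A minimal sketch of the netdests feature follows, together with the "users per pair" average used above to argue that destinations are distinguishing. The flows use documentation-only IP addresses and are invented; the function names are ours.

```python
from collections import Counter

def netdests(flows, common):
    """Set of (address, port) pairs in one sample, minus pairs common to all users."""
    return {pair for pair in flows if pair not in common}

def mean_users_per_pair(netdests_by_user):
    """Average number of distinct users that contact each (address, port) pair."""
    counts = Counter()
    for pairs in netdests_by_user.values():
        counts.update(pairs)  # each user's pairs form a set, so a user counts once per pair
    return sum(counts.values()) / len(counts)

common = {("192.0.2.1", 67)}  # e.g., the local network's DHCP server
netdests_by_user = {
    "alice": netdests([("192.0.2.1", 67), ("198.51.100.7", 993), ("203.0.113.5", 80)], common),
    "bob": netdests([("192.0.2.1", 67), ("203.0.113.5", 80), ("198.51.100.9", 22)], common),
}
average = mean_users_per_pair(netdests_by_user)
```

An average close to 1, as reported for the sigcomm and ucsd training data, means that most pairs are contacted by a single user.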
`An adversary could obtain network addresses in any wireless
`network that does not enable link layer encryption. Even if users
`sent all their trafﬁc through VPNs, the case for several users in
`the sigcomm trace, the IP addresses of the VPN servers would be
`revealing. No application or network level conﬁdentiality mecha-
`nisms, such as SSL or IPSec, would mask this identiﬁer either.
`SSID Probes. Next we consider ssids, the set of SSIDs in 802.11
`probes observed in a trafﬁc sample. Windows XP and OS X add
`the SSID of a network to a preferred networks list when the client
`ﬁrst associates with the network. To simplify future associations,
`subsequent attempts to discover any network will try to locate this
`network by transmitting the SSID in a probe request. As we ob-
`served in Section 2, SSID names can be distinguishing.2 In addi-
`tion, probes are never encrypted because active probing must be
`able to occur before association and key agreement.
`There are two practical issues that limit the use of ssids as an
`implicit identiﬁer. First, the preferred networks list changes each
`time a user adds a network, and thus a proﬁle may degrade over
`time. Second, clients transmit the SSIDs on their preferred net-
`works lists only when attempting to discover service. Therefore,
`clients may not probe for distinguishing SSIDs very often. While
`this is true, our results show that when distinguishing SSIDs are
`probed for, they can often uniquely identify a user. Since all users
`in the monitoring area are likely to use the SSIDs of the networks
`being monitored, these SSIDs are not distinguishing and we do not
`include them in the ssids set.
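The construction of the ssids set described above can be sketched as follows. This is a minimal illustration under stated assumptions: the probed SSIDs are assumed to be already extracted from probe requests as strings, empty SSIDs are dropped on the assumption that they carry no network name, and `candidate_users` is a deliberately naive matcher (any shared SSID), not the classifier used in this paper. All names and data are hypothetical.

```python
def ssids_feature(probed_ssids, monitored_ssids):
    """Set of probed SSIDs, excluding empty names and the monitored networks."""
    return {s for s in probed_ssids if s and s not in monitored_ssids}

def candidate_users(sample_ssids, profiles):
    """Users whose stored profile shares at least one SSID with the sample."""
    return sorted(u for u, prof in profiles.items() if sample_ssids & prof)

# Hypothetical data: SSIDs of the monitored network and two user profiles.
monitored = {"sigcomm"}
profiles = {
    "alice": ssids_feature(["sigcomm", "alice-home", "cafe"], monitored),
    "bob": ssids_feature(["sigcomm", "cafe", "bob-lab"], monitored),
}

# A later traffic sample that probes for one distinguishing SSID.
sample = ssids_feature(["", "sigcomm", "alice-home"], monitored)
print(candidate_users(sample, profiles))
```

Removing the monitored SSIDs first matters: otherwise every user in range would share at least one element with every profile, and the naive matcher above would return everyone.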
`
`² A recent patch [23] to Windows XP allows a user to disable active
`probing, but it remains enabled by default because disabling it
`would break association in networks where the access point does
`not announce itself. In addition, revealing probes or beacons are
`still required for devices to discover each other in ad hoc mode.
`
`Application             Port   Number of Sizes
`wireless driver or OS   NA     14
`DHCP                    67     14
`sunrpc                  111    1
`NetBIOS                 138    7
`groove-dpp              1211   1
`Microsoft Office v.X    2222   1
`FileMaker Pro           5003   7
`X Windows               6000   1
`Table 2: A list of the most unique broadcast packets observed in
`the sigcomm trace. The third column shows the number of packet
`sizes that were emitted by at most 2 users.
`
`Broadcast Packet Sizes. We now consider bcast, the set of 802.11
`broadcast packet sizes in each traffic sample. Many applications
`broadcast packets to advertise their existence to other machines on
`the local network. Due to the nature of this function, these packets
`often contain naming information. For example, in our traces, we
`observed many Windows machines broadcasting NetBIOS naming
`advertisements and applications such as FileMaker and Microsoft
`Ofﬁce advertising themselves.
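The bcast set defined above, and the kind of count reported in the third column of Table 2 (packet sizes emitted by at most 2 users), can be sketched as follows. This is only an illustration: the record layout `(user, port, size)`, the function names, and the packet sizes are hypothetical and are not values from the sigcomm trace.

```python
from collections import defaultdict

def bcast_by_user(packets):
    """Map each user to the set of broadcast packet sizes they emitted."""
    sizes = defaultdict(set)
    for user, _port, size in packets:
        sizes[user].add(size)
    return dict(sizes)

def rare_sizes_per_port(packets, max_users=2):
    """Per port, count the packet sizes emitted by at most `max_users` users."""
    users = defaultdict(set)  # (port, size) -> users who sent that size
    for user, port, size in packets:
        users[(port, size)].add(user)
    counts = defaultdict(int)
    for (port, _size), who in users.items():
        if len(who) <= max_users:
            counts[port] += 1
    return dict(counts)

# Hypothetical broadcast packets: (user, destination port, size in bytes)
packets = [
    ("u1", 138, 243), ("u2", 138, 243), ("u3", 138, 243),  # size seen from 3 users
    ("u1", 138, 251),                                      # size seen from 1 user
    ("u2", 67, 342), ("u3", 67, 342),                      # size seen from 2 users
    ("u3", 5003, 96),                                      # size seen from 1 user
]

print(bcast_by_user(packets))
print(rare_sizes_per_port(packets))
```

Note that `bcast_by_user` ignores the port entirely, so it would still be computable by an observer who sees only frame lengths.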
`Since these packets vary in length, their sizes can reveal infor-
`mation about their content even if the content itself is encrypted.
`Packet sizes alone appear to distinguish users almost as well as
`<application, size> tuples. For
