Toward a Synergy Between P2P and Grids
`
`Domenico Talia and Paolo Trunfio • University of Calabria
`
Peer-to-peer (P2P) networks and
grids are distributed computing
`models that enable decentral-
`ized collaboration by integrating com-
`puters into networks in which each can
`consume and offer services. P2P is a
`class of self-organizing systems or
`applications that takes advantage of dis-
`tributed resources — storage, processing,
`information, and human presence —
`available at the Internet’s edges. A grid
`is a geographically distributed compu-
`tation platform comprising a set of het-
`erogeneous machines that users can
`access through a single interface.
`Both are hot research topics
`because they offer promising para-
`digms for developing efficient dis-
`tributed systems and applications.
Unlike the classic client–server model,
in which roles are well separated, P2P
and grid networks can assign each
node a client or server role according
to the operations it is to perform
on the network, even if some nodes
act more as servers than as clients in
current implementations.
`In analyzing both models, we dis-
`cover that grids are, in essence, P2P
`systems. Although many aspects of
`today’s grids are based on hierarchical
`services, this is an implementation
`detail that should be removed in the
near future. As grids used for complex
applications increase from tens to thousands
of nodes, we should decentralize
`their functionalities to avoid bottle-
`necks. The P2P model could thus help
`to ensure grid scalability: designers
`could use the P2P philosophy and tech-
`niques to implement nonhierarchical
`decentralized grid systems.
Contrary to common practice and
perception, the grid and P2P models
share several features and have more
in common than is generally
recognized. As Ian Foster and Adriana
Iamnitchi point out (dsl.cs.uchicago.edu),
a broader recognition
of key commonalities could accelerate
progress in both communities. It
is time to consider how to integrate
`these two models. A synergy between
`the two research communities, and
`the two computing models, could
`start with identifying the similarities
`and differences between them.
`
`Basics
`In the past few years, P2P has attract-
`ed enormous media attention and
`gained popularity by supporting two
`main classes of applications:
`
`• file sharing, in which peers share
`files with each other (Napster and
`Gnutella for music, for example)
• highly parallel computing, in which
an (inherently) parallel application
runs on available nodes (SETI@home
and FightAIDS@home, for example).
`
`Apart from these well-known systems,
`the P2P model is emerging as a new
`distributed paradigm because of its
`potential to harness the computing,
`storage, and communication power of
`hosts in the network to make their
`underutilized resources available to oth-
`ers. P2P shares this goal with the Grid,
`which was designed to provide access
`to remote computing resources for
`high-performance applications, data-
`intensive applications, or both.
`Although originally intended for
`advanced scientific applications, grid
`computing has emerged as a para-
`digm for coordinated resource shar-
`ing and problem solving in dynamic,
`multi-institutional, virtual organiza-
`tions in industry and business. Grid
`computing can be seen as an answer
`to drawbacks such as overloading,
`failure, and low QoS, which are inher-
`ent to centralized service provision-
`ing in client–server systems. Such
`problems can occur in the context of
`high-performance computing, for
`example, when a large set of remote
`users accesses a supercomputer.
IEEE INTERNET COMPUTING, JULY • AUGUST 2003
Published by the IEEE Computer Society
1089-7801/03/$17.00©2003 IEEE

Grid nodes typically make their own
resources available at the same time
they are accessing resources on other
nodes. The grid model thus removes
`the definite distinction between client
`and server machines. However, current
`grid environments delegate specific
`management or coordination func-
`tions to certain nodes that are required
`to take “major responsibility.” Some
`recently developed P2P systems also
`require nodes to act as servers, at least
`when joining the network.
P2P comprises several kinds of applications
with different design goals, such
as anonymity (typically in file-sharing
applications), scalability (typically in
highly parallel computing applications),
or availability (in both application classes).
Moreover, P2P systems are based on
several different designs:

• systems such as Napster use centralized
resource indexes,
• systems such as Gnutella use flooding-based
search,
• some experimental systems such as
Gridella use structures with distributed
resource indexes, and
• hybrid networks, such as the super-peer
model (described later), combine
the P2P and client–server models.

As mentioned before, the identification
of similarities and differences between
grid and P2P systems is a good starting
point for finding a convergence.

Similarities and Differences
In analyzing the P2P and grid models,
we must consider several significant
aspects and issues. Here we discuss
some of the main issues that determine
features of distributed computing models.
The techniques that the P2P and grid
models use to handle those issues are
key to finding a common foundation.

Security
Security is a central theme in grids,
and several efforts are devoted to integrating
relevant mechanisms for
authentication, authorization, integrity,
and confidentiality in grid platforms.
Nevertheless, such mechanisms
are designed mainly for “closed communities,”
in which designers have
devoted some effort to letting users
participate without accounts or trust
relationships. By their nature, such
security mechanisms allow anonymity
of neither users nor resources.
In contrast, P2P systems originate
in “open communities,” in which users
share more generic goals (such as
retrieving music from the Internet),
rather than specific objectives (such as
participating in high-energy physics
simulations). For this reason, security
mechanisms in the most widespread
P2P systems generally don’t address
authentication and content validation,
but rather offer protocols that assure
anonymity and censorship resistance.
Although the two models currently
handle security differently, it should be
interesting to analyze how to exploit
the approaches to create a security
model for P2P grids.
`
`Connectivity
Grids generally include powerful
machines that are statically connected
through high-performance networks
with high levels of availability. On the
other hand, the number of accessible
nodes is generally low because access
to grid resources is bound to rigorous
accounting mechanisms.
Conversely, P2P systems are composed
mainly of common desktop
computers that are connected intermittently
to the network, remaining
available for a limited time with reduced
reliability. The number of
nodes connected in a P2P network at
a given time is much greater than in
a grid. Thus, the grid connectivity
approach is still too rigid in how it
admits new nodes and handles user access
and accounting; it could benefit from
the more flexible connectivity models
used in P2P networks today.
`
`Access Services
`Access to remote resources was the
`main motivation for building grids,
`and it remains the primary goal today.
`Grid toolkits provide secure services
`for submitting batch jobs or executing
`interactive applications on remote
`machines; they also include mecha-
`nisms for efficiently sharing and mov-
`ing data across nodes.
`Current P2P systems do not support
`mechanisms for explicitly allocating
`remote cycles and storage, but they do
`provide protocols for sharing and
`exchanging data among nodes. P2P
`job-submission models and P2P job
`scheduling might thus be very attrac-
`tive topics for research into applying
`the P2P approach to grid scheduling
`and job management.
`
Resource Discovery and Presence Management
`Resource discovery in grid environ-
`ments is based mainly on centralized
`or hierarchical models. In the Globus
`Toolkit (www.globus.org/toolkit), for
`instance, a user or an application can
`directly gain information about a
`given node’s resources by querying a
`server application running on it or
`running on a node that retrieves and
`publishes information about a given
`organization’s node set. Because such
`information systems are built to
`address the requirements of organiza-
`tional-based grids, they do not deal
`with more dynamic, large-scale dis-
`tributed environments, in which use-
`ful information servers are not known
`a priori. The number of queries in
`such environments quickly makes a
`client–server approach ineffective.
Resource discovery includes, in part,
the issue of presence management
(discovery of the nodes that are currently
available in a grid), because global
mechanisms are not yet defined for it.
`On the other hand, the presence-man-
`agement protocol is a key element in
`P2P systems: each node periodically
`notifies the network of its presence, dis-
`covering its neighbors at the same time.
`Future grid systems should implement
`a P2P-style decentralized resource dis-
`covery model that can support grids as
`open resource communities.
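
The presence-management idea described above (each node periodically announces itself and learns about its neighbors from the announcements it receives) can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `PresenceTable` class, the `announce` helper, the `timeout` value, and the logical clock passed as `now` are hypothetical names introduced here, not part of any P2P protocol or grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class PresenceTable:
    """One node's view of which peers are currently alive (hypothetical)."""
    timeout: float = 3.0  # assumed: silence longer than this drops a peer
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, peer_id: str, now: float) -> None:
        # Record (or refresh) the time at which peer_id last announced itself.
        self.last_seen[peer_id] = now

    def alive(self, now: float) -> list:
        # Peers whose most recent heartbeat falls within the timeout window.
        return sorted(p for p, t in self.last_seen.items() if now - t <= self.timeout)


def announce(sender: str, tables: dict, now: float) -> None:
    # Simulated broadcast: every other node hears the sender's heartbeat.
    for node_id, table in tables.items():
        if node_id != sender:
            table.heartbeat(sender, now)


tables = {name: PresenceTable() for name in ("a", "b", "c")}
for name in tables:
    announce(name, tables, now=0.0)

# Node "c" goes silent; "a" and "b" keep announcing.
for t in (2.0, 4.0):
    announce("a", tables, now=t)
    announce("b", tables, now=t)

view_of_a = tables["a"].alive(now=4.0)
```

In this run node "a" stops listing "c" once its last heartbeat is older than the timeout, without any central registry being consulted. A real system would replace the simulated broadcast with messages to a limited set of neighbors.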
`
`Fault Tolerance
`The dynamic nature of grids necessi-
`tates some level of fault tolerance —
`especially for highly distributed code,
`such as parameter-sweep applications,
`which can fork numerous similar, inde-
`pendent jobs on many nodes.
`Beyond simple checkpointing and
`restarting, reliability and fault toler-
`ance are largely unexplored in grid
`models and tools. The Globus infor-
`mation system allows fault detection,
`for instance, but developers must
`implement fault tolerance at the
`application level. For greater reliabil-
`ity, designers of fault-tolerance mech-
`anisms and policies for grids should
`consider using decentralized P2P
`algorithms, which avoid centralized
`services that can represent critical
`failure points.
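
To make the point about application-level fault tolerance concrete, here is a minimal sketch of a parameter-sweep driver that reruns a failed job on a different node. It relies on the jobs being independent, as the text notes for parameter-sweep applications. The names `run_sweep`, `healthy_node`, and `failed_node` are hypothetical, nodes are modeled as plain Python callables, and the sketch does not show a decentralized P2P algorithm, only the retry logic a developer would otherwise have to write by hand.

```python
def run_sweep(params, nodes, max_attempts=3):
    """Run one independent job per parameter, retrying on another node after a failure."""
    results = {}
    for i, p in enumerate(params):
        for attempt in range(max_attempts):
            # Rotate through the node list so each retry targets a different node.
            node = nodes[(i + attempt) % len(nodes)]
            try:
                results[p] = node(p)
                break
            except RuntimeError:
                # Jobs are independent, so rerunning one elsewhere is safe.
                continue
        else:
            # Every attempt failed; record the gap instead of aborting the sweep.
            results[p] = None
    return results


def healthy_node(x):
    return x * x


def failed_node(x):
    raise RuntimeError("node unreachable")


sweep = run_sweep(params=[1, 2, 3, 4], nodes=[healthy_node, failed_node])
```

Jobs first assigned to the failing node are transparently rerun on the healthy one, so the sweep still completes.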
`
`Where We Should Go
`Despite the interest in P2P and grid net-
`works, few noteworthy research efforts
`are currently devoted to finding com-
`monalities and synergies between them.
In a significant exception, Fox and colleagues
have sketched a P2P architecture
for grid-connected resources
(www.communitygrids.iu.edu), but much more
remains to be done by members of both
communities.
`We believe a P2P approach is need-
`ed both to
`
`• implement grid tools and services,
`and
`• design and develop grid applica-
`tions that must access and coordi-
`nate remote resources and services.
`
Two core Globus Toolkit components,
the monitoring and discovery service
(MDS) and the replica management service,
could be effectively redesigned
using a P2P approach, for example. If
we view current grids as federations of
smaller grids managed by diverse organizations,
we can rethink the Globus
MDS for a large-scale grid by adopting
the super-peer network model
(www-db.stanford.edu/~byang/pubs/superpeer.pdf).
In this approach, each super peer
`operates as a server for a set of clients
`and as an equal among other super
`peers. This topology provides a useful
`balance between the efficiency of cen-
`tralized search and the autonomy, load
`balancing, and robustness of distributed
`search. In a grid information service
`based on the super-peer model, each
`participating organization would con-
`figure one or more of its nodes to oper-
`ate as super peers. Nodes within each
`organization would exchange monitor-
`ing and discovery messages with a ref-
`erence super peer, and super peers from
`different organizations would exchange
`messages in a P2P fashion.
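
The super-peer information service outlined above can be sketched in a few lines: local nodes register with their organization's super peer in client–server style, and super peers forward queries to one another as equals. This is an illustrative model only. The `SuperPeer` class, its methods, the `ttl` hop limit, and the organization and node names are assumptions made for the example and do not correspond to the Globus MDS interface.

```python
class SuperPeer:
    """Hypothetical super peer: indexes its own organization's nodes and
    forwards queries to neighboring super peers."""

    def __init__(self, org):
        self.org = org
        self.index = {}      # node name -> set of resource tags (local nodes only)
        self.neighbors = []  # other super peers, treated as equals

    def register(self, node, resources):
        # Client-server side: a local node reports to its reference super peer.
        self.index[node] = set(resources)

    def local_matches(self, resource):
        return sorted(f"{self.org}/{n}" for n, r in self.index.items() if resource in r)

    def query(self, resource, ttl=2, seen=None):
        # P2P side: answer locally, then forward to neighbors until the TTL runs out.
        seen = set() if seen is None else seen
        if self.org in seen:
            return []  # this super peer has already answered the query
        seen.add(self.org)
        hits = self.local_matches(resource)
        if ttl > 0:
            for peer in self.neighbors:
                hits += peer.query(resource, ttl - 1, seen)
        return hits


org_a, org_b, org_c = SuperPeer("orgA"), SuperPeer("orgB"), SuperPeer("orgC")
org_a.neighbors = [org_b]
org_b.neighbors = [org_a, org_c]
org_c.neighbors = [org_b]

org_a.register("n1", ["cpu"])
org_b.register("n7", ["cpu", "storage"])
org_c.register("n3", ["storage"])

found = org_a.query("storage")
```

A query submitted to one organization's super peer reaches resources in the other organizations without a global index. The hop limit bounds how far it travels, which is the trade-off between centralized efficiency and distributed robustness that the text describes.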
`Grid applications should be de-
`signed according to a decentralized
`model. This can require additional
`effort to develop because of the cur-
`rent lack of P2P–grid middleware, but
`P2P–grid tools and services could
`greatly simplify such tasks in the
`future we envision.
`
`Aligning Technologies
`The grid community recently initiated a
`development effort to align grid tech-
`nologies with Web services: the Open
`Grid Services Architecture (OGSA) lets
`developers integrate services and re-
`sources across distributed, heteroge-
`neous, dynamic environments and com-
`munities. The OGSA model adopts the
`Web Services Description Language
`(WSDL) to define the concept of a grid
`service using principles and technologies
`from both the grid and Web services
`communities. Web services and the
`OGSA both seek to enable interoper-
`ability between loosely coupled services,
`independent of implementation, loca-
`tion, or platform.
OGSA provides an opportunity to
integrate P2P and the Grid. The architecture
defines standard mechanisms
`for creating, naming, and discovering
`persistent and transient grid-service
`instances. It will be an interesting chal-
`lenge to determine how to use OGSA-
`oriented grid protocols to build P2P
`applications. By implementing service
`instances in a P2P manner within such
`a framework, developers can provide
`P2P service configuration and deploy-
`ment on the grid infrastructure. A peer
`could thus invoke a grid service by
`exchanging a specified sequence of
`messages with a service instance,
`which might invoke another grid ser-
`vice published by another peer through
`an associated grid service interface.
`Developers and users could exploit
`the many contact points between P2P
`and grid networks by recognizing P2P’s
`relevance to corporations and public
`organizations rather than viewing it as
`just a home computing technology.
`They also could exploit P2P protocols
`and models to face grid-computing
`issues such as scalability, connectivity,
`and resource discovery. A synergy
`between P2P and grids could lead to
`new highly distributed systems in which
`each computer contributes to solving a
`problem or implementing a system
`while also using services offered by
`other computers in the network. Enter-
`prises, public institutions, and private
`companies could find it both useful and
`profitable to develop distributed appli-
`cations on a world-wide Grid.
`
`Domenico Talia is a professor of computer sci-
`ence at the University of Calabria. His
`research interests include grid computing,
`parallel systems, and data mining. He
`received a Laurea degree in physics from the
`University of Calabria. Talia is a member of
`the IEEE Computer Society and the ACM.
`Contact him at talia@deis.unical.it.
`
`Paolo Trunfio is a PhD student in computer engi-
`neering at the University of Calabria. His
`research interests include grid computing,
`peer to peer, and parallel systems. He
`received a Laurea degree in computer engi-
`neering from the University of Calabria.
`Contact him at trunfio@deis.unical.it.
`



