INTELLECTUAL VENTURES EX. 2013
EMC v. Intellectual Ventures
IPR2016-01106

104 • Philip H. Enslow Jr.
CONTENTS

INTRODUCTION
  Motivations
  Multiple-Computer Systems
  Definition of a Multiprocessor
MULTIPROCESSOR HARDWARE SYSTEM ORGANIZATIONS
  Time-Shared/Common-Bus Systems
  Crossbar Switch Systems
  Multiport Memory Systems
  Comparison of the Three Basic System Organizations
MULTIPROCESSOR OPERATING SYSTEMS
  Operating System Facilities Provided
  Organization of Multiprocessor Operating Systems
PAST, PRESENT, AND FUTURE OF MULTIPROCESSING
  Development of Multiprocessing
  Current Multiprocessors
  System Performance
  Cost Effectiveness
  Future Trends
FURTHER READINGS
REFERENCES
This paper is concerned with improvements at the level of system organization; it deals specifically with a special class of system organizations—systems known as multiprocessors.
Multiple-Computer Systems

Of course, not all multiple-computer systems are multiprocessors. An obvious example of a multiple-computer system that is not a multiprocessor is a system with a stand-alone peripheral or satellite processor. Perhaps less obvious examples are the various forms of coupled systems (both loosely and closely coupled) such as the IBM ASP (Attached Support Processor) System and others having direct electrical connections. Specific examples and a complete discussion of the evolution of multiple-computer systems, as well as an introduction to parallelism in the basic uniprocessor, are given in Enslow [9].

Computing Surveys, Vol. 9, No. 1, March 1977
Naturally, there are many similarities between multiple-computer systems and multiprocessors, since both are motivated by the same basic goal—the support of simultaneous operations in the system; in fact the distinctions are often not clear-cut, as is exemplified by the frequent use of the term "multiprocessing" in instances where it is not appropriate. However, there is an important difference between multiple-computer systems and multiprocessors, and it is based on the extent and degree of sharing [8]: A multiple-computer system consists of several separate and discrete computers (even though there may be direct communication between them), whereas a multiprocessor is a single computer with multiple processing units.
Definition of a Multiprocessor

A multiprocessor is defined in the American National Standard Vocabulary for Information Processing as "a computer employing two or more processing units under integrated control." That definition is good as far as it goes, but it does not go far enough. Certainly the requirement that a multiprocessor have "integrated control" is extremely important, for a multiprocessor must have a single integrated operating system; however, the concepts of sharing and interaction, which are at the core of the techniques of multiprocessing, are not included in the ANSI definition.
With respect to the hardware, a multiprocessor must have the capability for the direct sharing of main memory by all processors (arithmetic/logic unit and control unit only) and the sharing of input/output devices by all memory and processor combinations (Figure 1). Although there may be some qualifications on the sharing of all resources of a particular type (one example—private memory—is discussed below), the basic concept of total sharing is still valid.

FIGURE 1. Basic multiprocessor organization.

The important aspect of "interaction" is the level at which it occurs. In multiple-computer systems the physical unit of interaction is usually the complete file or data set. In a true multiprocessor the level of interaction allowed must be more flexible and in fact must be allowed to descend to even the smallest physical unit. Interaction must be possible with files, data sets, and even data elements. From the control point of view, interaction must be possible between complete jobs and tasks as well as between individual job steps.

It is the combination of these expanded concepts of sharing and of interaction at all levels that completely characterizes the hardware and software required to provide a "true" multiprocessor, which can now be defined by the following system characteristics:
• A multiprocessor contains two or more processors of approximately comparable capabilities.
• All processors share access to common memory.
• All processors share access to input/output channels, control units, and devices.
• The entire system is controlled by one operating system providing interaction between processors and their programs at the job, task, step, data set, and data element levels.
MULTIPROCESSOR HARDWARE SYSTEM ORGANIZATIONS

The key to the classification of multiprocessor systems is the interconnection subsystem, and the factors that are most important are the topology and the operation of this central portion of the system. There have been a number of very good taxonomies prepared which focus on interconnection networks [1, 5, 18]; however, these reviews have been concerned primarily with the details of interconnection itself rather than with the characterization of the systems in which particular interconnection subsystems are embedded. Examining the nature of the processor-to-memory switch in a manner similar to Conway [6], this author has identified three fundamentally different system organizations used in multiprocessors:
• Time-shared or common bus
• Crossbar switch matrix
• Multiport memories

Although the entire scope of interconnection schemes is much larger and certainly much more complex than the coverage presented here, these categories nonetheless form a useful base for a discussion of the organization of multiprocessor systems, and each of these interconnection techniques is presented below in the context of providing the central portion of such systems.
It should also be noted that there are several other system organizations that have been utilized to achieve parallelism. Some of these are: asymmetrical or nonhomogeneous systems (e.g., CDC 6000 series); array and vector processors; pipeline processors; and associative processors.

FIGURE 2. Time-shared/common bus system organization—single bus.
The first of these other systems may be quite close to a "true" multiprocessor if the operating system supports the proper levels of interaction. The latter three examples of system organization are discussed in other papers in this special issue. In addition, there is one other class of system organization that exhibits many of the characteristics of a multiprocessor, and that is fault-tolerant systems; however, the motivation for the development of such systems and the goals of their design are quite different from those of a true multiprocessor system.
Time-Shared/Common-Bus Systems

The simplest interconnection system for either single or multiple processors is a common communication path connecting all of the functional units. This technique has been used to assemble some simple multiprocessors (Figure 2)—"simple" in that the interconnection subsystem can be merely a multiconductor cable. Such an interconnection system is often a totally passive unit having no active components such as switches or amplifiers. Transfer operations are controlled completely by the bus interfaces of the sending and receiving units. The unit wishing to initiate a transfer, a processor or an I/O unit, must first determine the availability status of the bus, then address the destination unit, determine its availability and capability to receive the transfer, tell the destination what to do with the data being transferred, and finally initiate the transfer. A receiving unit only has to recognize its address and respond to the control signals from the sender. These are the basic concepts, although the entire operation is not actually that simple. (The single bus in the PDP-11, the Unibus, has 56 lines to provide the control lines and data paths necessary to transfer words of only 16 bits.) It is possible to simplify this process somewhat by the use of a centralized bus controller/arbiter, but such an approach does have negative effects on system reliability and flexibility.
The hardware changes required to add or remove functional units are usually quite minimal. Often all that is required is to physically attach or detach the unit. The units in the system are required to know what other units are present and to know their unit and internal location addresses, but that requirement is basically a software problem. The quantity and types of functional units are transparent to the interconnection subsystem. This type of interconnection subsystem is, by its very nature, quite reliable, and its cost is relatively low; however, it does introduce a single critical component in the system that can cause a system failure as a result of a malfunction in any of the bus interface circuits.

Of course these benefits of simplicity and low cost do not accrue without entailing other limitations—in particular the serious limitation on overall system performance that results from having only one path for all transfers—since the total overall transfer rate within the system is limited by the bandwidth and speed of this single path. Interconnection techniques that overcome this weakness add to the complexity of the system.
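The single-path ceiling can be made concrete with back-of-envelope arithmetic. The numbers below are assumptions for illustration, not figures from the survey:

```python
# Illustration of the single-path bottleneck: aggregate demand beyond the
# bus rate is simply unattainable, no matter how many processors attach.
# BUS_RATE and the per-processor demand are assumed example values.

BUS_RATE = 1_000_000          # transfers/second the single path can carry

def attainable(n_processors, demand_per_proc):
    total_demand = n_processors * demand_per_proc
    return min(total_demand, BUS_RATE)

one = attainable(1, 400_000)   # demand 400,000: bus not yet the bottleneck
four = attainable(4, 400_000)  # demand 1,600,000: capped at the bus rate
```

With four processors the system delivers only 62.5% of the demanded transfers; in practice the shortfall is worse, since each attached unit also loads the bus electrically and adds arbitration traffic.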
The first step in solving this problem might be to provide two one-way paths (Figure 3), since this addition does not appreciably increase system complexity or diminish reliability. On the other hand, a single transfer operation in such a system usually requires the use of both buses, so not much is actually gained.

FIGURE 3. Time-shared/common bus system organization—unidirectional buses.

FIGURE 4. Time-shared/common bus system organization—multiple two-way buses.
The next step would be to provide multiple two-way buses (Figure 4) to allow multiple simultaneous transfers; however, the complexity of the system would be greatly increased. No longer would the interconnection subsystem be a totally passive unit; logic, switching, and other control functions would now have to be associated with each point at which functional units were attached to the transfer buses.
An example of a system utilizing separate time-shared buses for memory access and for input/output transfers is the MIT/IL ACGN computer, a fault-tolerant system designed for space applications (Figure 5). Another system utilizing a set of multiple transfer buses is the Plessey System 250, which was developed for communications applications¹ (Figure 6). The 250 has one bus per processor in the system. Other examples of systems employing the time-shared bus technique for interconnection are the IBM STRETCH, the Univac LARC, the CDC 6600 (for transfers between main memory and the peripheral processors), and several minicomputer systems such as the Lockheed SUE.

A recently developed system utilizing a series of multiple and separate buses is the PLURIBUS minicomputer multiprocessor [16, 22]. The basic processor is the Lockheed SUE minicomputer. There are three types of buses—processor, memory, and input/output. There are seven processor buses with two processors and two 4K memories attached to each (Figure 7(a)); there are also two memory buses, each with two 8K memory units (Figure 7(b)); and, finally, there is one input/output bus plus an input/output bus extension (Figure 7(c)). These buses are all interconnected to form a single system of 14 processors (Figure 8).

¹ The System 250 is also well known for the hardware included in the system to support the direct implementation of protection through the use of capabilities.
FIGURE 5. MIT/IL ACGN computer.

FIGURE 6. Plessey System 250—medium system configuration [25].

FIGURE 7. PLURIBUS bus structures [16].

Crossbar Switch Systems

If the number of buses in a time-shared bus system is increased, a point is reached at which there is a separate path available for each memory unit (Figure 9). The interconnection subsystem is then a "nonblocking" crossbar. The adjective "nonblocking" is usually omitted, since one characteristic of the crossbar switches used in multiprocessor systems is that they are "complete" with respect to the memory units (i.e., there is a separate bus associated with each memory, and the maximum number of transfers that can take place simultaneously is limited by the number of memory boxes and the bandwidth-speed product of the buses rather than by the number of paths available).

The important characteristics of a system utilizing a crossbar interconnection matrix are the extreme simplicity of the switch-to-functional-unit interfaces and the ability to support simultaneous transfers for all memory units. To provide these features requires major hardware capabilities in the switch. Not only must each crosspoint be capable of switching parallel transmissions, but it must also be capable of resolving multiple requests for access to the same memory module occurring during a single memory cycle. These conflicting requests are usually handled on a predetermined priority basis, e.g., input/output has highest priority in all conflicts, processor number 2 has primary access priority to memory 2, etc. The result of the inclusion of such a capability is that the hardware required to implement the switch can become quite large and complex. An example that has been cited is a system with twenty-four 32-bit processors and 32 memory units (Lehman, cited in [2]). The number of circuits required in the switch matrix for this system would be two to three times the number required for an IBM System 360 Model 75. Although large-scale integration (LSI) can reduce the size of the switch, it will have little effect on its complexity.
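The per-module conflict resolution just described can be sketched as follows. The request format and the exact priority rule (I/O wins all conflicts; processor i has primary access to memory i, as in the example above) are illustrative assumptions:

```python
# Sketch of fixed-priority conflict resolution in a crossbar: requests to
# different memories proceed simultaneously on their separate buses, while
# simultaneous requests to the same memory are granted by a predetermined
# priority. Requesters are modeled as ('io', k) or ('cpu', k) tuples.

def arbitrate(requests, n_memories):
    """requests: list of (requester, memory) pairs arriving in one cycle.
    Returns a dict mapping each contested memory to the winning requester."""
    granted = {}
    for mem in range(n_memories):
        contenders = [r for r, m in requests if m == mem]
        if not contenders:
            continue                          # no request: module idle

        def priority(req):
            kind, k = req
            if kind == 'io':
                return 0                      # I/O has highest priority
            return 1 if k == mem else 2       # cpu i is primary for memory i

        granted[mem] = min(contenders, key=priority)
    return granted

# Two conflicts in one cycle: memory 2 goes to cpu 2; memory 5 to the I/O unit.
g = arbitrate([(('cpu', 0), 2), (('cpu', 2), 2),
               (('io', 0), 5), (('cpu', 1), 5)], n_memories=8)
```

Losing requesters simply retry on the next memory cycle; in the hardware this logic is replicated at every crosspoint column, which is why the switch grows so large.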
A characteristic of somewhat lesser general importance, but one which can be significant in specific instances, is the capability to expand the size of the system merely by increasing the capacity of the switch. Usually there are no changes required in any of the functional units because of the very simple interfaces utilized, and the switch may be designed so that its capacity can be increased simply by adding additional modules of crosspoints. One effect of LSI on the crossbar interconnection system is the feasibility of designing crossbar matrices for a larger capacity than initially required, equipping them only for the present requirements. Expansion would then be facilitated, requiring only the addition of the missing crosspoints. This discussion of "easy" expansion has addressed only the hardware. The modification of the operating system to support the larger system may prove to be difficult if it was not properly designed for expansion; however, this is true for all multiprocessor system organizations regardless of the interconnection technique employed.

FIGURE 8. PLURIBUS prototype system [16].

FIGURE 9. Crossbar (nonblocking) switch system organization.

FIGURE 10. Crossbar switch system organization with I/O crossbar switch matrix.

FIGURE 11. Burroughs multiple interpreter system organization.
In order to provide the flexibility required in access to the input/output devices, a natural extension of the crossbar switch concept is to use a similar switch on the device side of the I/O processor or channel (Figure 10). The hardware required for the implementation is quite different and not nearly so complex, because controllers and devices are normally designed to recognize their own unique addresses. The effect is the same as if there were a primary bus associated with each I/O channel and crossbuses for each controller/device.

A system utilizing a variation of the crossbar is the Burroughs Multi-Interpreter System [7, 23]. For the purpose of this discussion, which is concerned primarily with interconnection systems, it is sufficient to note that the basic building block of the system is a microprogrammed "interpreter" (Figure 11). The microprograms in these units can be changed dynamically so that a single interpreter can function as a FORTRAN translation machine at one time, later change to an ALGOL execution machine, then change to function as an input/output processor, etc. It is obvious that the interconnecting switch for this system must be extremely flexible. As can be seen in the figure, the switch resembles the earlier diagram of a multiple-shared-bus system; however, it can also be considered as a crossbar if there are enough independent paths for all memories to be accessed simultaneously. This is an excellent example of the blending together of these two concepts and the major characteristic that differentiates them: the crossbar provides nonblocking simultaneous memory access, and the multiple-shared bus provides flexibility in the routing of interconnection paths.

FIGURE 12. RCA 215 multiprocessor. (b) Signal distribution unit.
FIGURE 13. C.mmp—the Carnegie-Mellon multi-miniprocessor (16 × 16 crossbar interconnect, processor-to-memory only; interprocessor interrupt bus).
There are a number of examples of systems utilizing crossbar interconnection systems. The first true multiprocessor, the Burroughs D-825 [AN/GYK-3(V)], had the switch distributed among the memory modules, with five cables entering each switch module (three of these were normally used for processors, with the other two for input/output). The switch module interfaces had the logic circuitry necessary to accommodate and queue simultaneous memory access requests. The D-825 also utilized a crossbar switch matrix for connection to selected input/output devices. The organization of the D-825 then is quite similar to that shown in Figure 10. A system employing a classic, separately identifiable crossbar matrix is the RCA 215 (Figure 12). In this particular system the switch is designated the signal distribution unit (SDU). Figure 12(b) illustrates the assignment of memory access priorities within the SDU.
A major research project involving a crossbar interconnection system is the C.mmp, the Carnegie-Mellon multi-miniprocessor (Figure 13). The scope of this research project encompasses the investigation of economical techniques for interconnection as well as in-depth studies of the operating system and of overall system performance [27-29]. The processor units utilized in the system are various models of the DEC PDP-11. The specific features of the implementation of this processor exhibit some deviations from the classic design for a multiprocessor. The first of these is that each processor has associated with it a block of dedicated private memory. This block is used to support the dedicated memory locations used in the PDP-11 interrupts and traps; however, it would have been possible to send the traps through the crossbar. Another feature is a separate unit functioning as the address translator for all accesses to the shared memory, since the address space of the main memory greatly exceeds that of the PDP-11 itself. Finally, each input/output device is associated with a single processor and cannot be shared. This again is an accommodation to the UNIBUS structure of the basic PDP-11. The UNIBUS is also used for access to a special bus that supports interprocessor communication.
FIGURE 14. Ramo-Wooldridge RW-400 system—the "polymorphic computer."
A short historical digression: The earliest known system employing a crossbar-type interconnection switch was the Ramo-Wooldridge RW-400, the "Polymorphic Computer" system, developed for the U.S. Air Force for large command and control installations (Figure 14). The primary emphasis in this system design was on attaining very high system availability, particularly as seen from each control position. Perhaps the most important feature of the system was the real-time interaction of users through large display and control consoles; the purpose of the "central exchange" shown in the figure was to permit the consoles to be connected to any computer to change functions and to provide backup. It can be seen that the system does not fit the definition of a multiprocessor offered earlier, for it is not possible to share memory. The RW-400 was thus only another multiple-computer system; however, it served an important role in the development of the crossbar concept.
Multiport Memory Systems

If the control, switching, and priority arbitration logic that is distributed throughout the crossbar switch matrix is concentrated at the interface to the memory units, a multiport memory system is the result (Figure 15). This system organization is well suited to both uni- and multiprocessor system organizations and is used in both. The method often utilized to resolve memory access conflicts is to assign permanently designated priorities to each memory port; the system can then be configured as necessary at each installation to provide the appropriate priority access to various memory boxes for each functional unit (Figure 16). Except for the priority associated with each, all of the ports are usually electrically and operationally identical. In fact, the ports are often merely a row of identical cable connectors, and electrically it makes no difference whether an I/O or central processor is attached. Specifically, a system utilizing 8-port memory units may have any mixture of processor and I/O units, subject to the restrictions that there must be at least one of each and the total is eight or less. The priority for memory access associated with each processor or input/output channel is then established by the selection of the connector used for cabling that unit to the memory.

FIGURE 16. Multiport-memory system organization—assignment of memory port priorities.
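The configure-by-cabling idea above can be sketched briefly. The numbering convention (port 0 as highest priority) and the class names are assumptions for illustration:

```python
# Sketch of a multiport memory: the arbitration logic lives at the memory
# unit, and a functional unit's access priority is fixed simply by which
# numbered port (cable connector) it is plugged into at installation time.

class MultiportMemory:
    def __init__(self, n_ports=8):
        # Identical connectors; priority is just the port index (0 = highest).
        self.ports = [None] * n_ports

    def cable(self, port, unit_name):
        # Electrically it makes no difference whether a CPU or an I/O
        # unit is attached; only the chosen connector matters.
        self.ports[port] = unit_name

    def grant(self, requesting_ports):
        """Among ports requesting in the same cycle, lowest port number wins."""
        winner = min(requesting_ports)
        return self.ports[winner]

mem = MultiportMemory()
mem.cable(0, "io-channel-1")   # I/O given the high-priority port on site
mem.cable(3, "cpu-a")
mem.cable(5, "cpu-b")
who = mem.grant([3, 0, 5])     # all three request in the same cycle
```

Reassigning priorities at an installation amounts to re-plugging the cables, with no change to the functional units themselves, which is exactly the flexibility the text attributes to this organization.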
The flexibility possible in configuring the system also makes it possible to designate portions of memory as "private" to certain processors, I/O units, or combinations thereof (Figure 17). This type of system organization can have definite advantages in increasing security against unauthorized access and may also permit the storage of recovery routines in memory areas that are not susceptible to modification by other processors; however, there are also serious disadvantages in system recovery if the other processors are not able to access control and status information in a memory block associated with a failed processor. One system that utilizes the private memory concept is the PRIME system, designed at the University of California at Berkeley [4].

FIGURE 17. Multiport-memory system organization—including private memories.

The multiport memory system organization also can support nonblocking access to the memory if a "fully connected" topology is utilized. Since each word access is a separate operation, it also permits the exploitation of interleaved memory addresses for access by a single processor; however, for multiple processors, interleaving may actually degrade memory performance by increasing the number of memory access conflicts that occur as all processors cycle through all memory following a sequence of consecutive addresses. Interleaving also results in the effective loss of more than one module of memory when there is a failure. With multiple processors it is often preferable to utilize the property of "locality of reference" and not attempt to increase the effective memory speed by interleaving.
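The contrast between the two address mappings can be shown directly. The module count and size below are assumed example values:

```python
# Sketch contrasting the two mappings discussed above. With interleaving,
# consecutive addresses rotate across modules, so processors streaming
# consecutive addresses all sweep through every module; with banked
# ("locality") mapping, each working set can sit in its own module.

N_MODULES, MODULE_SIZE = 8, 4096   # assumed: eight modules of 4K words

def interleaved(addr):
    # Low-order address bits select the module.
    return addr % N_MODULES, addr // N_MODULES

def banked(addr):
    # High-order address bits select the module, preserving locality.
    return addr // MODULE_SIZE, addr % MODULE_SIZE

# One processor fetching consecutive addresses touches modules 0,1,2,3,...
p1_modules = [interleaved(a)[0] for a in range(0, 4)]

# Under banked mapping the same kind of run stays inside a single module.
p2_modules = [banked(a)[0] for a in range(8192, 8196)]
```

With several processors doing the interleaved sweep at once, every module sees requests from every processor, which is the source of the extra conflicts; a single module failure under interleaving also punches holes throughout the address space rather than removing one contiguous bank.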
There are a number of examples of multiport memory systems. Because of its flexibility and low cost for uniprocessor organizations, it is the most commonly found organization in large systems. The larger Honeywell systems utilize system controllers, each with eight ports for connection to processors, input/output multiplexers, front-end processors, and bulk storage subsystems (Figure 18). The UNIVAC 1108 has a Multiple Modular Access (MMA) unit associated with each memory bank of 65K words (Figure 19). The MMA has five priority-ordered ports. Another example of a multiported memory system utilized to interconnect multiple processors is the IBM System 360 Model 67 (Figure 20).
Comparison of the Three Basic System Organizations

A number of factors can be considered in comparing the three basic organizations described above or evaluating their use in specific applications. The most obvious are cost, reliability, flexibility, growth potential, and system throughput and transfer capacity.

Time-shared bus:
• Lowest overall system cost for hardware.
• Least complex (the interconnection bus may be totally passive).
• Very easy to physically modify the hardware system configuration by adding or removing functional units.
• Overall system capacity limited by the bus transfer rate (this may be a severe restriction on overall system performance).
• Failure of the bus is a catastrophic system failure.
• Expanding the system by the addition of functional units may degrade overall system performance (throughput).
• The system efficiency attainable (based on the simultaneous use of all available units) is the lowest of all three basic interconnection systems.
• This organization is usually appropriate for smaller systems only.

Crossbar:
• This is the most complex interconnection system.
FIGURE 18. Honeywell multiprocessor system organization.

FIGURE 19. UNIVAC 1108 multiprocessor system.

FIGURE 20. IBM System 360 Model 67.

• The functional units are the simplest and cheapest, since the control and switching logic is in the switch. (The interfaces to the switch are simple and usually require no bus couplers.)
• Because a basic switching matrix is required to assemble any functional units into a working configuration, this organization is usually cost-effective for multiprocessors only.
• There is a potential for the highest total transfer rate.
• System expansion (addition of functional units) usually improves overall performance.
• There is the highest potential for system efficiency.
• There is a potential for system expansion without reprogramming of the operating system.
• Theoretically, expansion of the system is limited only by the size of the switch matrix, which can often be modularly expanded within initial design or other engineering limitations.
• The reliability of the switch, and therefore the system, can be improved by segmentation and/or redundancy within the switch.
• It is usually quite easy to partition the system to remove malfunctioning units or to establish separate systems.
Multiport memory:

Requires the most expensive memory units since most of the control and switching circuitry is included in the memory unit.

The characteristics of the functional units permit a relatively low-cost uniprocessor to be assembled from them.

There is a potential for a very high total transfer rate in the overall system.

The size and configuration options possible are determined (limited) by the number and type of memory ports available; this design decision is made quite early in the overall design process and is difficult to modify.

A large number of cables and connectors are required.
All of the above characteristics are self-explanatory except those dealing with the overall system transfer capacity and the resulting system throughput. The time-shared bus organization obviously places the
most severe limitations on the total transfer capability of the system, since it is limited by the bandwidth and speed of the single transfer bus. Consideration should be given to the number of units or modules attached to the bus, for each attachment loads the bus and reduces its speed while increasing the number of control signals on the bus. For example, since there can be only one memory transfer proceeding at any one time, there are advantages to using large memory modules rather than small ones. Also, as additional active functional units such as processors and input/output controllers are added to the bus, the number of requests for bus access increases, resulting in an increase in the number of conflicts that must be resolved; the overall effect is a smaller number of effective transfers. (A similar result is obtained with a heavily loaded telephone exchange.) Many of these problems are avoided in the crossbar because of the nature of the distribution of the conflict resolution circuitry. In the crossbar system there is also the extremely important advantage of having separate paths to each memory unit so that all memory modules are available for simultaneous use.
`
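The contrast described above (one transfer at a time on a shared bus, versus simultaneous transfers to distinct memory modules through a crossbar) can be sketched with a toy simulation. The uniform random-request model and all the names below are assumptions made for illustration; they are not from the paper:

```python
import random

def simulate(n_processors, n_memories, crossbar, cycles=10000, seed=1):
    """Return the average number of completed memory transfers per cycle.

    Each cycle, every processor requests a uniformly random memory module.
    - Shared bus: at most ONE transfer system-wide per cycle; all other
      requests are conflicts that must wait.
    - Crossbar: separate paths mean one transfer per distinct memory
      module per cycle.
    """
    rng = random.Random(seed)
    done = 0
    for _ in range(cycles):
        requests = [rng.randrange(n_memories) for _ in range(n_processors)]
        if crossbar:
            done += len(set(requests))  # distinct modules proceed in parallel
        else:
            done += 1                   # single bus serializes everything
    return done / cycles

bus = simulate(4, 4, crossbar=False)
xbar = simulate(4, 4, crossbar=True)
print(bus)   # 1.0 by construction
print(xbar)  # noticeably higher, since distinct modules overlap
```

Under this model the shared bus is pinned at one effective transfer per cycle regardless of how many processors are attached, while the crossbar's effective rate grows with the number of distinct modules addressed, which is the comparison the paragraph above is making.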
The multiport memory organization centralizes the conflict resolution function at the interface to the memory, and, although it is possible to configure a fully connected topology with a multiport system, the problem
