throbber
S—OOSO PCT
`
`OPTIMIZATION OF MULTI-STAGE HIERARCHICAL NETWORKS FOR
`
`PRACTICAL ROUTING APPLICATIONS
`
`Venkat Konda
`
`5
`
`CROSS REFERENCE TO RELATED APPLICATIONS
`
`This application is Continuation In Part Application to and claims priority of the
`
`US. Provisional Patent Application Serial No. 61/ 531,615 entitled "OPTIMIZATION
`
`OF MULTI—STAGE HIERARCHICAL NETWORKS FOR PRACTICAL ROUTING
`
`APPLICATIONS" by Venkat Konda assigned to the same assignee as the current
`
`10
`
`application, filed September 7, 2011.
`
`This application is related to and incorporates by reference in its entirety the US
`
`Application Serial No. 12/ 530,207 entitled "FULLY CONNECTED GENERALIZED
`
`MULTI—STAGE NETWORKS" by Venkat Konda assigned to the same assignee as the
`
`current application, filed September 6, 2009, the US. Provisional Patent Application
`
`15
`
`Serial No. 60/905,526 entitled "LARGE SCALE CROSSPOINT REDUCTION WITH
`
`NONBLOCKING UNICAST & MULTICAST IN ARBITRARILY LARGE MULTI-
`
`STAGE NETWORKS" by Venkat Konda assigned to the same assignee as the current
`
`application, filed March 6, 2007, and the US. Provisional Patent Application Serial No.
`
`60/940, 383 entitled "FULLY CONNECTED GENERALIZED MULTl-STAGE
`
`20
`
`NETWORKS" by Venkat Konda assigned to the same assignee as the current application,
`
`filed May 25, 2007.
`
`This application is related to and incorporates by reference in its entirety the US
`
`Application Serial No. 12/601,273 entitled "FULLY CONNECTED GENERALIZED
`
`BUTTERFLY FAT TREE NETWORKS" by Venkat Konda assigned to the same
`
`25
`
`assignee as the current application, filed November 22, 2009, the US. Provisional Patent
`
`Application Serial No. 60/940, 387 entitled "FULLY CONNECTED GENERALIZED
`
`BUTTERFLY FAT TREE NETWORKS" by Venkat Konda assigned to the same
`
`assignee as the current application, filed May 25, 2007, and the US Provisional Patent
`
`-1-
`
`Page 1 of 102
`
`FLEX LOGIX EXHIBIT 1006
`
`Page 1 of 102
`
`FLEX LOGIX EXHIBIT 1006
`
`

`

`S—OOSO PCT
`
`Application Serial No. 60 / 940, 390 entitled "FULLY CONNECTED GENERALIZED
`
`MULTI—LINK BUTTERFLY FAT TREE NETWORKS" by Venkat Konda assigned to
`
`the same assignee as the current application, filed May 25, 2007
`
`This application is related to and incorporates by reference in its entirety the US
`
`Application Serial No. 12/ 601,274 entitled "FULLY CONNECTED GENERALIZED
`
`MULTI—LINK MULTI—STAGE NETWORKS" by Venkat Konda assigned to the same
`
`assignee as the current application, filed November 22, 2009, the US. Provisional Patent
`
`Application Serial No. 60 / 940, 389 entitled "FULLY CONNECTED GENERALIZED
`
`REARRANGEABLY NONBLOCKING MULTI-LINK MULTI-STAGE NETWORKS"
`
`10
`
`by Venkat Konda assigned to the same assignee as the current application, filed May 25,
`
`2007, the US. Provisional Patent Application Serial No. 60/ 940, 391 entitled "FULLY
`
`CONNECTED GENERALIZED FOLDED MULTI—STAGE NETWORKS" by Venkat
`
`Konda assigned to the same assignee as the current application, filed May 25, 2007 and
`
`the US. Provisional Patent Application Serial No. 60/940, 392 entitled "FULLY
`
`15
`
`CONNECTED GENERALIZED STRICTLY NONBLOCKING MULTI—LINK MULTI-
`
`STAGE NETWORKS" by Venkat Konda assigned to the same assignee as the current
`
`application, filed May 25, 2007.
`
`20
`
`25
`
`This application is related to and incorporates by reference in its entirety the US
`
`Application Serial No. 12/ 601,275 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED GENERALIZED NETWORKS" by Venkat Konda assigned to the same
`
`assignee as the current application, filed November 22, 2009, and the US. Provisional
`
`Patent Application Serial No. 60 / 940, 394 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED GENERALIZED NETWORKS" by Venkat Konda assigned to the same
`
`assignee as the current application, filed May 25, 2007.
`
`This application is related to and incorporates by reference in its entirety the PCT
`
`Application Serial No. PCT/ U810 / 52984 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED GENERALIZED AND PYRAMID NETWORKS WITH LOCALITY
`
`EXPLOITATION" by Venkat Konda assigned to the same assignee as the current
`
`application, filed October 16, 2010, the US. Provisional Patent Application Serial No.
`
`-2-
`
`Page 2 of 102
`
`Page 2 of 102
`
`

`

`S—OOSO PCT
`
`61/ 252, 603 entitled "VLSI LAYOUTS OF FULLY CONNECTED NETWORKS
`
`WITH LOCALITY EXPLOITATION" by Vcnkat Konda assigncd to thc samc assigncc
`
`as the current application, filed October 16, 2009, and the US. Provisional Patent
`
`Application Serial No. 61 / 252, 609 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED GENERALIZED AND PYRAMID NETWORKS" by Venkat Konda
`
`assigned to the same assignee as the current application, filed October 16, 2009.
`
`BACKGROUND OF INVENTION
`
`10
`
`15
`
`20
`
`25
`
`Multi-stage interconnection networks such as Benes networks and butterfly fat
`
`tree networks are widely useful in telecommunications, parallel and distributed
`
`computing. However VLSI layouts, known in the prior art, of these interconnection
`
`networks in an integrated circuit are inefficient and complicated.
`
`Other multi—stage interconnection networks including butterfly fat tree networks,
`
`Banyan networks, Batcher—Banyan networks, Baseline networks, Delta networks, Omega
`
`networks and Flip networks have been widely studied particularly for self—routing packet
`
`switching applications. Also Benes Networks with radix of two have been widely studied
`
`and it is known that Benes Networks of radix two are shown to be built with back to back
`
`baseline networks which are rearrangeably nonblocking for unicast connections.
`
`The most commonly used VLSI layout in an integrated circuit is based on a two—
`
`dimensional grid model comprising only horizontal and vertical tracks. An intuitive
`
`interconnection network that utilizes two—dimensional grid model is 2D Mesh Network
`
`and its variations such as segmented mesh networks. Hence routing networks used in
`
`VLSI layouts are typically 2D mesh networks and its variations. However Mesh
`
`Networks require large scale cross points typically with a growth rate of 0(N2) where N
`
`is the number of computing elements, ports, or logic elements depending on the
`
`application.
`
`Multi—stage interconnection network with a growth rate of 0(N >< log N) requires
`
`significantly small number of cross points. US. Patent 6,185,220 entitled “Grid Layouts
`
`-3-
`
`Page 3 of 102
`
`Page 3 of 102
`
`

`

`S—OOSO PCT
`
`of Switching and Sorting Networks” granted to Muthukrishnan et al. describes a VLSI
`
`layout using existing VLSI grid model for Benes and Butterfly networks. US. Patent
`
`6,940,308 entitled “Interconnection Network for a Field Programmable Gate Array”
`
`granted to Wong describes a VLSI layout where switches belonging to lower stage of
`
`Benes Network are laid out close to the logic cells and switches belonging to higher
`
`stages are laid out towards the center of the layout.
`
`Due to the inefficient and in some cases impractical VLSI layout of Benes and
`
`butterfly fat tree networks on a semiconductor chip, today mesh networks and segmented
`
`mesh networks are widely used in the practical applications such as field programmable
`
`gate arrays (FPGAs), programmable logic devices (PLDs), and parallel computing
`
`interconnects. The prior art VLSI layouts of Benes and butterfly fat tree networks and
`
`VLSI layouts of mesh networks and segmented mesh networks require large area to
`
`implement the switches on the chip, large number of wires, longer wires, with increased
`
`power consumption, increased latency of the signals which effect the maximum clock
`
`speed of operation. Some networks may not even be implemented practically on a chip
`
`due to the lack of efficient layouts.
`
`Fully connected Benes and butterfly fat tree networks are an over kill for certain
`
`practical routing applications and need to be optimized to significantly improve area,
`
`power and performance of the routing network.
`
`SUMMARY OF INVENTION
`
`Significantly optimized multi-stage networks, useful in wide target applications,
`
`with VLSI layouts (or floor plans) using only horizontal and vertical links to route large
`
`scale sub-integrated circuit blocks having inlet and outlet links, and laid out in an
`
`integrated circuit device in a two-dimensional grid arrangement of blocks, (for example
`
`in an FPGA where the sub-integrated circuit blocks are Lookup Tables, or memory
`
`blocks, or DSP blocks) arc prcscntcd. The optimized multi-stagc networks in cach block
`
`employ several rings of stages of switches with inlet and outlet links of sub—integrated
`
`10
`
`15
`
`20
`
`25
`
`Page 4 of 102
`
`Page 4 of 102
`
`

`

`S—OOSO PCT
`
`circuit blocks connecting to rings from either left—hand side only, or from right—hand side
`
`only, or from both left—hand side and right—hand side.
`
`The optimized multi-stage networks with their VLSI layouts employ shuffle
`
`exchange links where outlet links of cross links from switches in a stage of a ring in one
`
`sub—integrated circuit block are connected to either inlet links of switches in the another
`
`stage of a ring in another sub-integrated circuit block or inlet links of switches in the
`
`another stage of a ring in the same sub-integrated circuit block so that said cross links are
`
`either vertical links or horizontal and vice versa.
`
`The VLSI layouts exploit spatial locality so that different sub-integrated circuit
`
`10
`
`blocks that are spatially nearer are connected with shorter shuffle exchange links
`
`compared to the shuffle exchange links between spatially farther sub—integrated circuit
`
`blocks. The optimized multi—stage networks provide high routability for broadcast,
`
`unicast and multicast connections, yet with the benefits of significantly lower cross points
`
`hence smaller area, lower signal latency, lower power and with significant fast
`
`15
`
`compilation or routing time.
`
`The optimized multi-stage networks VComb (N1,N2,d,s) & VD_Comb(N1,N2 ,d,S)
`
`according to the current invention inherit the properties of one or more, in addition to
`
`additional properties, generalized multi—stage and pyramid networks V (N1, N 2 , d, s) &
`
`VP (N1 , N2 , d , s) , generalized folded multi—stage and pyramid networks
`
`Vfald (N1 ,N2 , d, S) & VWW (N1 ,N2 , d ,S), generalized butterfly fat tree and butterfly fat
`
`pyramid networks Vbfl (N1, N2 , d,s) & bep (N1, N2,d, s) , generalized multi—link multi—
`
`stage and pyramid networks lemk (N1,N2,d,s) & Vm
`
`link—p
`
`(N1,N2,61, s), generalized
`
`folded multi-link multi-stage and pyramid networks VMHW (N1 , N2 , d, s) &
`
`VMd_mlmk_p (N1 , N2 , d , S), generalized multi-link butterfly fat tree and butterfly fat
`
`pyramid networks le,,k_bfl (N1 ,N2,61 ,S) & lefn,_bfp (N1 ,N2,d, S), generalized hypercube
`
`networks Vmbe (N1 , N2 , d, s) , and generalized cube connected cycles networks
`
`20
`
`25
`
`VCCC (N1 , N2, d, S) for s = 1,2,3 or any number in general.
`
`Page 5 of 102
`
`Page 5 of 102
`
`

`

`S—OOSO PCT
`
`BRIEF DESCRIPTION OF DRAWINGS
`
`FIG. 1A is a diagram 100A of an exemplary partial multi-stage hierarchical
`
`network corresponding to one block with 4 inputs and 2 outputs of a computational block
`
`connecting only from left—hand side, to route practical applications such as FPGA routing
`
`of hardware designs in accordance with the invention.
`
`FIG. 1B is a diagram 100B of an exemplary partial multi—stage hierarchical
`
`network corresponding to one block with 8 inputs and 4 outputs of a computational block
`
`connecting from both left-hand side and right—hand side, to route practical applications
`
`such as FPGA routing of hardware designs in accordance with the invention.
`
`10
`
`15
`
`20
`
`FIG. 2A is a diagram 200A, in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block.
`
`FIG. 2B is a diagram 200B, in an embodiment of, a stage in a ring of multi—stage
`
`hierarchical network corresponding to one block.
`
`FIG. 2C is a diagram 200C, in an cmbodimcnt of, a stagc in a ring of multi-stagc
`
`hierarchical network corresponding to one block.
`
`FIG. 2D is a diagram 200D, in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block.
`
`FIG. 2E is a diagram 200E, in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block.
`
`FIG. 3A is a diagram 300A, in an embodiment of, all the connections between
`
`two successive stages of two different rings in the same block or in two different blocks
`
`of a multi—stagc hicrarchical nctwork.
`
`FIG. 3B is a diagram 300B, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`25
`
`multi—stage hierarchical network.
`
`Page 6 of 102
`
`Page 6 of 102
`
`

`

`S—OOSO PCT
`
`FIG. 4 is a diagram 400, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi—stage hierarchical network.
`
`FIG. 5 is a diagram 500, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi-stage hierarchical network.
`
`FIG. 6 is a diagram 600, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi-stage hierarchical network.
`
`FIG. 7 is a diagram 700,
`
`is an embodiment of hop wire connection chart
`
`corresponding to a block of multi-stage hierarchical network.
`
`FIG. 8 is a diagram 800, is an embodiment of 2D—grid of blocks with each block
`
`corresponding to a partial multi—stage network to implement an exemplary multi-stage
`
`hierarchical network, in accordance with the invention.
`
`FIG. 9A is a diagram 900A, in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 9B is a diagram 900B, in an embodiment of, a stage in a ring of multi—stage
`
`hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 9C is a diagram 900C, in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 9D is a diagram 900D, in an embodiment of, a stage in a ring of multi-stage
`
`hicrarchical nctwork corrcsponding to onc block, with dclay optimizations.
`
`FIG. 9E is a diagram 900E. in an embodiment of, a stage in a ring of multi-stage
`
`hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 10A is a diagram 1000A, in an embodiment of, a stage in a ring of multi-
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`-7-
`
`10
`
`15
`
`20
`
`25
`
`Page 7 of 102
`
`Page 7 of 102
`
`

`

`S—OOSO PCT
`
`FIG. 10B is a diagram 1000B, in an embodiment of, a stage in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 10C is a diagram 1000C, in an embodiment of, a stage in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 10D is a diagram 1000D, in an embodiment of, a stage in a ring of multi-
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 10E is a diagram 1000B, in an embodiment of, a stage in a ring of multi-
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 10F is a diagram 1000F, in an embodiment of, a stage in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 11A is a diagram 1100A, in an embodiment of, a stage in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 11B is a diagram 1100B, in an embodiment of, a stage in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 11C is a diagram 1100C, in an cmbodimcnt of, a stagc in a ring of multi—
`
`stage hierarchical network corresponding to one block, with delay optimizations.
`
`FIG. 12 is a diagram 1200, in an embodiment, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi—stage hierarchical network with delay optimizations.
`
`FIG. 13 is a diagram 1300, in one embodiment, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi—stage hierarchical network with delay optimizations.
`
`FIG. 14 is a diagram 1400, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi-stage hierarchical network with delay optimizations.
`
`10
`
`15
`
`20
`
`25
`
`Page 8 of 102
`
`Page 8 of 102
`
`

`

`S—OOSO PCT
`
`FIG. 15 is a diagram 1500, in an embodiment of, all the connections between two
`
`successive stages of two different rings in the same block or in two different blocks of a
`
`multi—stage hierarchical network with delay optimizations.
`
`FIG. 16A1 is a diagram 1600Al of an exemplary prior art implementation of a
`
`two by two switch; FIG. 16A2 is a diagram 1600A2 for programmable integrated circuit
`
`prior art implementation of the diagram 1600Al of FIG. 16Al; FIG. 16A3 is a diagram
`
`1600A3 for one-time programmable integrated circuit prior art implementation of the
`
`diagram 1600A1 of FIG. 16A1; FIG. 16A4 is a diagram 1600A4 for intcgratcd circuit
`
`placement and route implementation of the diagram 1600A1 of FIG. 16A1.
`
`10
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`Fully connected multi-stage hierarchical networks are an over kill in every
`
`dimension such as area, power, and performance for certain practical routing applications
`
`and need to be optimized to significantly improve savings in area, power and
`
`performance of the routing network. The present invention discloses several
`
`15
`
`embodiments of the optimized multi—stage hierarchical networks for practical routing
`
`applications along with their VLSI layout (floor plan) feasibility and simplicity.
`
`The multi—stage hierarchical networks considered for optimization in the current
`
`invention include: generalized multi—stage networks V(N1, N2,64] ,s) , generalized folded
`
`multi—stage networks Vfold (N1 , N2 , d, s) , generalized butterfly fat tree networks
`
`20
`
`[in
`Vbfl(N1,N2,d,s) , generalized multi-link multi-stage networks Vm k (N1,N2 , d, S) ,
`
`generalized folded multi-link multi—stage networks ijldwlmk (N1 , N2 , d , S) , generalized
`
`multi-link butterfly fat tree networks Vm,1.nk_bfl (N1 , N2, d, s) , generalized hypercube
`
`networks Vmbe (N1 , N2 , d, s) , and generalized cube connected cycles networks
`
`V (N1 , N2 , d, S) for s = 1,2,3 or any numbcr in gcncral. Altcrnativcly the optimized
`
`25
`
`multi—stage hierarchical networks disclosed in this invention inherit the properties of one
`
`or more of these networks, in addition to additional properties that may not be exhibited
`
`these networks.
`
`Page 9 of 102
`
`Page 9 of 102
`
`

`

`S—OOSO PCT
`
`The optimized multi-stage hierarchical networks disclosed are applicable for
`
`practical routing applications, with several goals such as: 1) all the signals in the design
`
`starting from an inlet link of the network to an outlet link of the network need to be setup
`
`without blocking. These signals may consist of broadcast, unicast and multicast
`
`connections; Each routing resource may need to be used by only one signal or
`
`connection; 2) physical area consumed by the routing network to setup all the signals
`
`needs to be small; 3) power consumption of the network needs to be small, after the
`
`signals are setup. Power may be both static power and dynamic power; 4) Delay of the
`
`signal or a connection needs to be small after it is setup through a path using several
`
`routing resources in the path. The smaller the delay of the connections will lead to faster
`
`performance of the design. Typically delay of the critical connections determines the
`
`performance of the design on a given network; 5) Designs need to be not only routed
`
`through the network (i.e., all the signals need to be setup from inlet links of the network
`
`to the outlet links of the network), but also the routing needs to be in faster time using
`
`efficient routing algorithms; 6) Efficient VLSI layout of the network is also critical and
`
`can greatly influence all the other parameters including the area taken up by the network
`
`on the chip, total number of wires, length of the wires, delay through the signal paths and
`
`hence the maximum clock speed of operation.
`
`The different varieties of multi—stage networks described in various embodiments
`
`in the current invention have not been implemented previously on the semiconductor
`
`chips. The practical application of these networks includes Field Programmable Gate
`
`Array (FPGA) chips. Current commercial FPGA products such as Xilinx’s Vertex,
`
`Altera’s Stratix, Lattice’s ECPX implement island-style architecture using mesh and
`
`segmented mesh routing interconnects using either full crossbars or sparse crossbars.
`
`These routing interconnects consume large silicon area for crosspoints, long wires, large
`
`signal propagation delay and hence consume lot of power.
`
`The cu1re11t invention discloses the optimization of multi—stage hierarchical
`
`networks for practical routing applications of numerous types of multi-stage networks.
`
`The optimizations disclosed in the current invention are applicable to including the
`
`10
`
`15
`
`20
`
`25
`
`-10-
`
`Page 10 of 102
`
`Page 10 of 102
`
`

`

`S—OOSO PCT
`
`numerous generalized multi-stage networks disclosed in the following patent
`
`applications:
`
`1) Strictly and rearrangeably nonblocking for arbitrary fan—out multicast and
`
`unicast for generalized multi-stage networks V(N1, N2 , d, s) with numerous connection
`
`topologies and the scheduling methods are described in detail in the US Application
`
`Serial No. 12/ 530,207 that is incorporated by reference above.
`
`2) Strictly and rearrangeably nonblocking for arbitrary fan-out multicast and
`
`unicast for generalized butterfly fat tree networks be, (N1 , N2 , d, S) with numerous
`
`connection topologies and the scheduling methods are described in detail in the US
`
`Application Serial No. 12/ 601,273 that is incorporated by reference above.
`
`3) Rearrangeably nonblocking for arbitrary fan-out multicast and unicast, and
`
`strictly nonblocking for unicast for generalized multi-link multi-stage networks
`
`mek (N1 , N2 , d, s) and generalized folded multi-link multi-stage networks
`
`0
`Vf
`
`,d_m,mk (N1, N2 , d, S) with numerous connection topologies and the scheduling methods
`
`are described in detail in the US Application Serial No. 12/ 601,274 that is incorporated
`
`by reference above.
`
`4) Strictly and rearrangeably nonblocking for arbitrary fan—out multicast and
`
`unicast for generalized multi—link butterfly fat tree networks lemk_bfl (N1 ,N2 , d ,s) with
`
`numerous conncction topologics and thc schcduling methods are dcscribcd in detail in the
`
`US Application Serial No. 12/ 601,273 that is incorporated by reference above.
`
`5) Strictly and rearrangeably nonblocking for arbitrary fan-out multicast and
`
`unicast for generalized folded multi-stage networks VfoM (N1 , N2 ,d , S) with numerous
`
`connection topologies and the scheduling methods are described in detail in the US
`
`Application Serial No. 12/ 601,274 that is incorporated by reference above.
`
`6) Strictly nonblocking for arbitrary fan—out multicast and unicast for gcncralizcd
`
`multi—link multi—stage networks szmk(N12N22 51,3) and generalized folded multi—link
`
`10
`
`15
`
`20
`
`25
`
`-11-
`
`Page 11 of 102
`
`Page 11 of 102
`
`

`

`S—OOSO PCT
`
`multi—stage networks anldww (N1 , N2 , d, s) with numerous connection topologies and
`
`the scheduling methods are described in detail in the US Application Serial No.
`
`12/ 601,274 that is incorporated by reference above.
`
`7) VLSI layouts of numerous types of multi—stage networks are described in the
`
`US Application Serial No. 12/ 601,275 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED NETWORKS" that is incorporated by reference above.
`
`8) VLSI layouts of numerous types of multi-stage networks are described in the
`
`PCT Application Serial No. PCT/ U810 / 52984 entitled "VLSI LAYOUTS OF FULLY
`
`CONNECTED GENERALIZED AND PYRAMID NETWORKS WITH LOCALITY
`
`EXPLOITATION" that is incorporated by reference above.
`
`In addition the optimization with the VLSI layouts disclosed in the current
`
`invention are also applicable to generalized multi-stage pyramid networks
`
`VP (N1, N2 , d, s) , generalized folded multi-stage pyramid networks V/old—p (N1,1V2,d,S),
`
`gcncralizcd buttcrfly fat pyramid nctworks VW (N1 , N2 , d, s) , gcncralizcd multi—link
`
`multi-stage pyramid networks Vm
`
`link—p
`
`(N1 , N2 , d, s) , generalized folded multi-link multi-
`
`stage pyramid networks Vfoli,,_,,m.nk_p (N1, N2 , d, s) , generalized multi—link butterfly fat
`
`pyramid networks lemmfl, (N1 , N2 , d, s), generalized hypercube networks
`
`Vmbe (N1 , N2 , d, s) and generalized cube connected cycles networks VCCC(N1, N2 , (1,3)
`
`for s = 1,2,3 or any number in general.
`
`Finally the current invention discloses the optimizations and VLSI layouts of
`
`multi-stage hierarchical networks VComb (N1 ,N2 , d, s) and the optimizations and VLSI
`
`layouts of multi—stage hierarchical networks V040,“, (N1, N2 , d, s) for practical routing
`
`applications (particularly to set up broadcast, unicast and multicast connections), where
`
`“Comb” denotes the combination of and “D-Comb” denotes the delay optimized
`
`combination of any of the generalized multi-stage networks V(N1, N2 , d ,s) , generalized
`
`folded multi-stage networks Vfold(Z\/'1,N2 , d, s) , generalized butterfly fat tree networks
`
`10
`
`15
`
`20
`
`25
`
`-12-
`
`Page 12 of 102
`
`Page 12 of 102
`
`

`

`S—OOSO PCT
`
`Vbfl (N1 ,N2 , d , S) , generalized multi-link multi-stage networks me (N1 , N2 , d , S) ,
`
`generalized folded multi-link multi—stage networks Vfoldwlmk (N1 , N2 , d , S) , generalized
`
`multi—link butterfly fat tree networks lemk_bfl (N1 , N2, d, S) , generalized multi-stage
`
`pyramid networks VP (N1 , N2 , d ,S) , generalized folded multi—stage pyramid networks
`
`Vfold—p (N1 , N2 , d, S), gcncralizcd buttcrfly fat pyramid nctworks Vbfi, (Nl , N2 , d, S),
`
`generalized multi-link multi-stage pyramid networks leinkfi, (N1 , N2, d, S) , generalized
`
`folded multi—link multi—stage pyramid networks Vfo,d—mlz'nk—p (N1 , N2, d, S), generalized
`
`multi—link butterfly fat pyramid networks szmabfp (N1, N2 , d, S), generalized hypercube
`
`networks Vmbe (N1 , N2 , d, S), and generalized cube connected cycles networks
`
`10
`
`(‘z‘t‘
`V (N1,N2,d,S) for s = 1,2,3 or any number in general.
`
`Multi—stage hierarchical network chmb (N1 , N2 , d, S) :
`
`15
`
`20
`
`25
`
`Referring to diagram 100A in FIG. 1A, in one embodiment, an exemplary partial
`
`multi—stage hierarchical network VComb (N1 , N2 ,d, S) where N1 = 200; N2 = 400; d = 2;
`
`and s = 1 corresponding to one computational block, with each computational block
`
`having 4 inlet links namely 11, 12, I3, and I4; and 2 outlet links namely 01 and 02. And
`
`for each computational block the corresponding partial multi-stage hierarchical network
`
`VComb (N1 , N2,61 ,S) 100A consists of two rings 110 and 120, where ring 110 consists of
`
`“m+l” stages namely (ring 1, stage 0), (ring 1, stage 1),
`
`(ring 1, stage “m-l”), and
`
`(ring 1, stage “m”), and ring 120 consists of “n+1” stages namely (ring 2, stage 0), (ring
`
`2, stage 1),
`
`(ring 2, stage “n-1”), and (ring 2, stage “n”), where “m” and n are
`
`positive integers.
`
`Ring 110 has inlet links Ri(1,1) and Ri(1,2), and has outlet links Bo(1,1) and
`
`Bo(1,2). Ring 120 has inlet links Fi(2,1) and Fi(2,2), and outlet links Bo(2,1) and
`
`Bo(2,2). And hence the partial multi—stage hierarchical network VComb (N1 , N2 , d , S) 100A
`
`consists of 4 inlet links and 4 outlet links corresponding to the two rings 110 and 120.
`
`Outlet link 01 of the computational block is connected to inlet link Ri(1, 1) of ring 110
`
`and also inlet link of Fi(2, 1) of ring 120. Similarly outlet link 02 of the computational
`
`-13-
`
`Page 13 of 102
`
`Page 13 of 102
`
`

`

`S—OOSO PCT
`
`block is connected to inlet link Ri(1,2) of Ring 110 and also inlet link of Fi(2,2) of Ring
`
`120. And outlet link Bo(1,1) of Ring 110 is connected to inlet link 11 of the
`
`computational block. Outlet link Bo(l ,2) ofRing l 10 is connected to inlet link T2 ofthe
`
`computational block. Similarly outlet link Bo(2, 1) of Ring 120 is connected to inlet link
`
`13 of the computational block. Outlet link Bo(2,2) of Ring 120 is connected to inlet link
`
`14 of the computational block. Since in this embodiment outlet link 01 of the
`
`computational block is connected to both inlet link Ri(1,1) of ring 110 and inlet link
`
`Fi(2,1) of ring 120; and outlet link 02 of the computational block is connected to both
`
`inlet link Ri(1,2) of ring 110 and inlet link Fi(2,2) of ring 120, the partial multi—stage
`
`hierarchical network VComb(N1,N2, d,s) 100A consists of 2 inlet links and 4 outlet links.
`
`The two dimensional grid 800 in FIG. 8 illustrates an exemplary arrangement of
`
`100 blocks arranged in 10 rows and 10 columns, in an embodiment. Each row of 2D—grid
`
`consisting of 10 block numbers namely the first row consists of the blocks (1,1), (1,2),
`
`(1,3),
`
`, (1,9), and (1,10). The second row consists ofthe blocks (2,1), (2,2), (2,3),
`
`,
`
`(2,9), and (2,10). Similarly 2D—grid 800 consists of 10 rows of each with 10 blocks and
`
`finally the tenth row consists ofthe blocks (10,1), (10,2), (10,3),
`
`, (10,9), and (10,10).
`
`Each block of 2D-grid 800, in one embodiment, is part of the die area of a semiconductor
`
`integrated circuit, so that the complete 2D-grid 800 of 100 blocks represents the complete
`
`die of the semiconductor integrated circuit. In one embodiment, each block of 2D-grid
`
`800 consists of one of the partial multi-stage hierarchical network
`
`VCDmb (N1 , N2 ,d, s) 100A with 2 inlet links and 4 outlet links and the corresponding
`
`computational block with 4 inlet links and 2 outlet links. For example block (1,1) of 2D—
`
`grid 800 consists of one of the partial multi-stage hierarchical network
`
`chmb (N1 , N2 ,d, s) 100A with 2 inlet links and 4 outlet links and the corresponding
`
`computational block with 4 inlet links and 2 outlet links. Similarly each of the 100 blocks
`
`of 2D-grid 800 has a separate partial multi-stage hierarchical network
`
`VComb (N1 , N2 ,d, s) 100A with 2 inlet links and 4 outlet links and the corresponding
`
`computational block with 4 inlet links and 2 outlet links. Hence the complete multi—stage
`
`hierarchical network VCmb(N1,N2, d, s) corresponding to 2D-grid 800 has N1 = 200 inlet
`
`links and N2 = 400 outlet links. And there are 100 computational blocks each one
`
`-14-
`
`10
`
`15
`
`20
`
`25
`
`30
`
`Page 14 of 102
`
`Page 14 of 102
`
`

`

`S—OOSO PCT
`
`corresponding to one of the blocks with each computational block having 4 inlet links and
`
`2 outlet links. Also the 2D-grid 800 is organized in the fourth quadrant of the 2D-Plane.
`
`In other embodiments the 2D-grid 800 may be organized as either first quadrant, or
`
`second quadrant or third quadrant of the 2D—P1ane.
`
`Referring to partial multi-stage hierarchical network VComb (N1 ,N2, d, s) 100A in
`
`FIG. 1A, the stage (ring 1, stage 0) consists of 4 inputs namely Ri(l,1), Ri(1,2), Ui(1,l),
`
`and Ui(l,2); and 4 outputs Bo(l,1), Bo(1,2), Fo(1,1), and Fo(1,2). The stage (ring 1, stage
`
`0) also consists of eight 2:1 multiplexers (A multiplexer is hereinafter called a “mux”)
`
`namely R(l,1), R(l,2), F(1,1), F(1,2), U(1,1), U(1,2), B(1,1), and B(1,2). The 2:1 Mux
`
`R(1,1) has two inputs namely Ri(1,1) and Bo(1,l) and has one output Ro(1,1). The 2:1
`
`Mux R(1,2) has two inputs namely Ri(1,2) and Bo(1,2) and has one output Ro(l,2). The
`
`2:1 Mux F(1,1) has two inputs namely Ro(1,l) and Ro(1,2) and has one output Fo(1,1).
`
`The 2:1 Mux F(1,2) has two inputs namely Ro(1,1) and Ro(1,2) and has one output
`
`Fo(1,2).
`
`The 2:1 Mux U(1,1) has two inputs namely Ui(1,1) and Fo(1,1) and has one
`
`output Uo( 1,1). The 2:1 Mux U( 1,2) has two inputs namely Ui(1,2) and Fo(1,2) and has
`
`one output Uo(l,2). The 2:1 Mux B(1,l) has two inputs namely Uo(l,1) and Uo(l,2) and
`
`has one output Bo(1,1). The 2:1 Mux B(1,2) has two inputs namely Uo(1, 1) and Uo(1,2)
`
`and has one output Bo(1,2).
`
`Thc stagc (ring 1, stagc 1) consists of 4 inputs namcly Ri(1,3), Ri(1,4), Ui(l,3),
`
`and Ui(l,4); and 4 outputs Bo(1,3), Bo(1,4), Fo(1,3), and Fo(1,4). The stage (ring 1, stage
`
`1) also consists of eight 2:1 Muxes namely R(l,3), R(l,4), F(l,3), F(l,4), U(l,3), U(l,4),
`
`B(1,3), and B(1,4). The 2:1 Mux R(1,3) has two inputs namely Ri(1,3) and Bo(1,3) and
`
`has one output Ro(1,3). The 2:1 Mux R(1,4) has two inputs namely Ri(l,4) and Bo(l,4)
`
`and has one output Ro(1,4). The 2:1 Mux F(1,3) has two inputs namely Ro(1,3) and
`
`Ro(1,4) and has one output Fo(1,3). The 2:1 Mux F(1,4) has two inputs namely Ro(l,3)
`
`and Ro(1,4) and has one output Fo(1,4).
`
`The 2:1 Mux U(1,3) has two inputs namely Ui(1,3) and Fo(1,3) and has one
`
`output Uo(l,3). The

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket