`
`United States Patent
`Konda
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 9,509,634 B2
`Nov. 29, 2016
`
`USOO9509634B2
`
`(54) FAST SCHEDULING AND OPTMIZATION OF
`MULT-STAGE HERARCHICAL
`NETWORKS
`
`(71) Applicant: Konda Technologies Inc., San Jose,
`CA (US)
`
`(72) Inventor: Venkat Konda, San Jose, CA (US)
`
`(73) Assignee: Konda Technologies Inc., San Jose,
`CA (US)
`
`(*) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 131 days.
`
`(21) Appl. No.: 14/329,876
`
`(22) Filed:
`(65)
`
`(2013.01)
`
`Jul. 11, 2014
`Prior Publication Data
`US 2015/0049768 A1
`Feb. 19, 2015
`Related U.S. Application Data
`(60) Provisional application No. 61/846,083, filed on Jul.
`15, 2013.
`(51) Int. Cl
`itouL 2/933
`(52) U.S. Cl
`CPC
`
`H04L 49/1515 (2013.01); H04L 49/102
`• us
`(2013.01)
`
`- - - - - - - - -
`(58) Field of Classification Search
`CPC
`HO4L 49/1515. HO4L 49/102
`See application file for complete search history.
`References Cited
`
`(56)
`
`U.S. PATENT DOCUMENTS
`5,345,444 A * 9, 1994 Cloonan ............. H04L 12,5601
`370,381
`5,541,914 A * 7/1996 Krishnamoorthy ... H04L 49/254
`37O/427
`
`ck
`
`5,654,695 A * 8/1997 Olnowich .......... H04Q 11/0478
`340,223
`6,091,723 A * 7/2000 Even ................... HO4L 49,1507
`340,221
`1/2002 Lee ....................... H04,319
`6,335,930 B1
`6,469,540 B2 * 10/2002 Nakaya ............ HO3K 19, 17728
`326/38
`7.440,449 B2 * 10/2008 Carson ...................... GO6T 7.20
`257/499
`2003/01 17946 A1* 6/2003 Fontana ................ HO4L 12/437
`370,216
`2/2011 Konda ................ GO6F 17,5077
`326,41
`2012/0269.190 A1* 10/2012 Konda .................. GO6F 17,509
`370,388
`
`2011 0037498 A1
`
`* cited by examiner
`y
`
`Primary Examiner — Rasheed Gidado
`
`ABSTRACT
`(57)
`Significantly optimized multi-stage networks with schedul
`ing methods for faster Scheduling of connections, useful in
`wide target applications, with VLSI layouts using only
`horizontal and Vertical links to route large scale Sub-inte
`grated circuit blocks having inlet and outlet links, and laid
`out in an integrated circuit device in a two-dimensional grid
`arrangement of blocks are presented. The optimized multi
`stage networks in each block employ several slices of rings
`of stages of switches with inlet and outlet links of sub
`integrated circuit blocks connecting to rings from either
`9.
`9.
`ring
`left-hand side only, or from right-hand side only, or from
`both left-hand side and right-hand side; and employ multi
`drop links where outlet links of cross links from switches in
`a stage of a ring in one Sub-integrated circuit block are
`connected to either inlet links of Switches in the another
`stage of a ring in the same or another Sub-integrated circuit
`block.
`
`20 Claims, 30 Drawing Sheets
`
`
`
`9.
`
`Page 1 of 68
`
`FLEX LOGIX EXHIBIT 1032
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 1 of 30
`
`US 9,509,634 B2
`
`3
`
`
`
`£
`s
`s
`CO
`
`.
`
`s
`r
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`(2)od r \, ( (.) n.
`
`Computational
`Block
`
`Page 2 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 2 of 30
`
`US 9,509,634 B2
`
`g
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Computational
`Block
`
`Page 3 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 3 of 30
`
`US 9,509,634 B2
`
`i
`
`i
`
`
`
`A efies
`a fully "Z 301S
`(+Axxx)
`
`X, afes
`fully a gos
`
`Lu, afeS
`fu& eos
`
`(kty".
`
`(d+61)a
`
`O
`
`W
`
`-
`
`Page 4 of 68
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`Nov.29, 2016
`
`Sheet 4 of 30
`Sheet 4 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
`FIG.1C2
`
`
`
`i
`
`Slice2,Ring2,Stage0
`i
`
`NS
`(eto
`
`(zz'Wea
`
`(Ve Heg
`
`(Z'z'2)oa
`
`('e'2ea
`
`FIG.1C1
`
`
`
`Oo
`oDOax
`wo
`
`s
`
`>&w
`1.
`as
`80
`.9
`O
`
`Page 5 of 68
`
`i
`Slice2,Ring1,
`
`Stage0
`
`Page 5 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 5 of 30
`
`US 9,509,634 B2
`
`vOOOL
`
`porOW
`
`€5001
`
`tOLOld
`
`lu,aBes
`
`‘)Bury‘|,eas“Uswiz"yD
`
`‘¢Bury‘|Bons
`
`uUl,,OBEIS
`
`“;Sury‘Zzsong
`
`(,4uz't' Leg
`
`(z+x2'L ‘og
`
`(Lexz"Zr
`
`Cexber
`
`‘ZBury‘Zaos
`
`
`
`A,sabes
`
`gg:=(Zewz'ba
`
`(L+wz' og
`
`culeLtd
`
`)
`
`(L+Ae'eteded
`
`(LeA'2'2)C
`
`(Z4AZet)
`
`S(eke
`
`Page 6 of 68
`
`Page 6 of 68
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 6 of 30
`
`US 9,509,634 B2
`
`HRH
`
`s!
`
`SO I "OIDH
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`l
`A.
`
`=
`
`f
`
`Page 7 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 7 of 30
`
`US 9,509,634 B2
`
`
`
`F.G. 2C
`
`200C
`
`200D
`
`Page 8 of 68
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`Nov.29, 2016
`
`Sheet 8 of 30
`Sheet 8 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
`FIG. 2E
`FIG. 2E
`
`2
`
`Ri(k,2m+1)
`Fo(k,2m+1)
`|
`
`is
`Ef a
`a is O
`
`
`
`M
`
`Page 9 of 68
`
`Page 9 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 9 Of 30
`
`US 9,509,634 B2
`
`FIG. 3A
`
`39A
`
`Ring "x", Stage "p+1"
`
`Ring "y", Stage"q"
`
`Ring "y", Stage "q+1"
`
`
`
`Page 10 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 10 of 30
`
`US 9,509,634 B2
`
`FG. 3B
`
`Ring "x", Stage "p+1"
`
`Ring "x", Stage "p
`
`ii
`
`...)
`
`Ring "y", Stage "q"
`
`Ring "y", Stage "q+1"
`
`
`
`s --
`
`3.
`d
`
`Page 11 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 11 of 30
`
`US 9,509,634 B2
`
`FIG. 3C
`
`39C
`
`Ring "x", Stage "p"
`
`Ring "x", Stage "p+1"
`
`
`
`
`
`Bo(x,2p+2)
`
`g
`
`Š iD
`
`C
`
`Hop(2.1)
`
`s
`
`Ring "y", Stage "q"
`
`Ring "y", Stage "q+1"
`
`
`
`Page 12 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 12 of 30
`
`US 9,509,634 B2
`
`FIG. 3D
`
`Ring "x", Stage "p+1"
`
`39D
`
`Ring "x", Stage "p"
`
`
`
`Page 13 of 68
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`Nov.29, 2016
`
`Sheet 13 of 30
`Sheet 13 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
`FIG. 3E
`
`
`
`a
`
`Foly.2q+3)
`>
`
`Ring “x”, Stage “p”
`
`
`Bofy,2q+1) Q
`Boly,2q+3)
`
`
`E > GC»
`q
`Gi
`. C
`Uily.2q+4)
`rue
`oL
`
`5
`lett
`Ci
`Ci
`31
`
`
`
`
`Uoly.2q+1
`
`Uoly,2q+3
`
`Ui(y,2q+3)
`
`
`
`Boly,2q+2)
`
`y2a%2)
`3
`
`Page 14 of 68
`
`Page 14 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 14 of 30
`
`US 9,509,634 B2
`
`FIG. 4A
`
`39A
`
`
`
`Ring "x", Stage "p"
`
`Ring "y", Stage "q"
`
`Page 15 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 15 Of 30
`
`US 9,509,634 B2
`
`FG. 4B
`
`Ring "x", Stage "p"
`
`39B
`
`
`
`
`
`Page 16 of 68
`
`
`
`U.S. Patent
`
`Nov.29, 2016
`
`Sheet 16 of 30
`
`US 9,509,634 B2
`
`FIG. 5A
`
`Ring “x”, Stage “p”
`
`Ring “x’, Stage “p+1"
`
`“
`
`.
`
`I
`e p+2)
`
`Ui(QX%.2p+2
`
`Hop(t.if
`Hop(2,1)7
`
`.
`
`ne
`
`Fo(y,2q+1)
`
`Ri(x,2p+1)
`/
`
`Rilx,2p+2)
`2
`
`S
`Bo(x,2p+1)
`
`|
`pi
`Boia)S|
`
`Ring “a”, Stage "s’
`
`Ri@.2s+1)
`)
`
`O
`
`R
`
`Ri(a,2s+2)
`/
`
`_—|ome]
`
`«
`
`Roly.2q+1)
`
`)-
`
`Bo(a,2s+2)
`
`Rily,2q+1)
`(
`
`D
`
`Rily.2q+2)
`?
`
`}
`
`Bo(y,2q+2))
`
`Page 17 of 68
`
`d ‘
`~_Hop(t.2)
`Uitb,2t+2)
`oDwog2azztyxRity.2q+3=zRily.2q+3)meny
`-<SB
`
` Rily.2q+4)
`oe!P|
`
` Boly.2q+3)
`i L4
`
` Boly,2q+4)JF
`
`=faves
`—&
`
`
`Fo(x,2p+1)
`}
`
`RiGc2pt3)
`‘
`
`FO(x,2p+3)
`DD
`
`Fo(x.2p+4)
`
`Uile.2p+3)
`
`Uibx20+4)
`
`:
`
`Ci
`a
`
`e
`
`\
`
`Ring “b”, Stage “t"
`
`ep
`
`Fo(b,2t+2)
`_)
`
`Ri(x,2p+4)
`;
`
`%)
`
`6
`
`Hop(2,2)
`
`Ri(b,2t+1)
`>
`
`Ri(b,2t+2)
`
`Ui(b,2t+1)
`o
`
`<nNBé&
`
`\oom2
`
`Foly.2qt4)
`
`Uily,2q+3)
`
`Uily.2qr4)
`\
`
`a&
`
`a2
`
`cea
`
`-/
`
`q
`
`.
`
`Cc
`
`Page 17 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 17 Of 30
`
`US 9,509,634 B2
`
`FIG. 6A
`
`Ring "x", Stage "p"
`
`8.2
`
`59A
`
`
`
`
`
`
`
`
`
`
`
`Hop(1,2)
`
`y
`
`Fota,2s+1)
`
`---
`
`-
`
`s
`
`Ring "y", Stage"q"
`
`i.
`
`Page 18 of 68
`
`
`
`U.S. Patent
`
`US 9,509,634 B2
`
`=?H????,
`
`\aey
`
`Ring "a", Stage "s"
`
`| 5
`
`L
`
`
`
`Ring "y", Stage "d"
`
`!!!!
`
`) !!!
`
`Page 19 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 19 Of 30
`
`US 9,509,634 B2
`
`FIG. 7A
`
`39A
`
`Stages in 1: Ring
`
`Stages in 2"Ring
`
`L1 V1 U1 H1 K1 L2
`
`AAAAAA
`Y-y/A4
`
`L1 V3 U3 L2 H3 K3 V5
`
`Page 20 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 20 of 30
`
`US 9,509,634 B2
`
`
`
`8 "OIDH
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Page 21 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 21 of 30
`
`US 9,509,634 B2
`
`FIG. 9A
`
`90
`
`YFick.2m+1)
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`FIG. 9D
`
`900D
`
`Page 22 of 68
`
`
`
`U.S. Patent
`
`Nov.29, 2016
`
`Sheet 22 of 30
`
`US 9,509,634 B2
`
`FIG. 10A
`
`1000A
`
`FIG. 10B
`
`1099B
`
`Fo(k.2m+1)
`3
`
`Fotk2me2)
`<
`
`Ri(k,2m#1)
`|
`
`RYi(k,2m+1)
`aq om42),
`“
`
`Uitk.2me)
`
`Botk2m+1)
`
`Fo(k,2m+1)
`3
`
`Fo(k,2m+2)
`re
`
`Ui(k,2m+1)
`
`
`
`FIG. 10C=1099¢ FIG. 10D=1099D
`
`!
`Fo(k,2m+1)
`
`FIG. 10F=1090F
`
`Vitk,2m+2)
`c
`?
`
`Ri(k,2m+1)
`,
`
`+4
`
`YRi(k,2m+1)_
`
`Ri(k,2m+2)
`.
`
`Bo(k.2m+t),
`
`Bo(k,2m+2)
`
`Page 23 of 68
`
`Fo(k,2m+1)
`a
`
`Ri(k2m+1)
`}
`i
`.
`RYi(k,2m+1)
`Fo(k,2m+2)
`_
`Fe
`Uickame P22)
`_
`Bo(k,2m+1)
`itk,2met)
`B
`Ui(k,2m+2)
`aS
`:
`
`Bo(k,2m+2)
`
`Fo(k,2m+1)
`3S
`Fork 2ma?+
`2m+2)
`.
`Ui(k,2m+1)
`a
`
`UYi(k,2m+1)
`
`Page 23 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 23 of 30
`
`US 9,509,634 B2
`
`FIG 11 A
`
`11 OOA
`
`Rick,2m+1)
`
`
`
`FIG. 11C
`
`FIG. 11B uppe
`
`Page 24 of 68
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`Nov. 29, 2016
`
`Sheet 24 of 30
`Sheet 24 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
`
`
`
`
`ul+d,aBeys*.x,Bury
`
`o0etZL‘Od
`
`texdz0ry
`
`
`
`.d,aBers‘x,Bury
`
`a?Oo”f
`
`(aden
`
`(
`
`¢
`
`(rede"In,
`
`(e+dzIAd
`
`Gedo)a
`
`{
`
`(ZL}doH™;
`
`(2Zd3H-
`
`
`
`ib+b,aberg“A,Bury
`
`(e+bz4ion
`
`.b,oes‘A,Bury
`
`(L+bzA)ou
`
`(z+be‘A)ou
`
`iz+dzxJog
`
`(2+bz"Airy
`
`(abz'Alog
`
`(z+bz‘Ajog
`
`
`
`Page 25 of 68
`
`Page 25 of 68
`
`
`
`
`
`
`
`
`
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 25 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
`coAOA
`
`
`
`\b”(e+bz‘A)o4(erbe“tLebzAllg
`
`(p+bz'A)In
`
`
`
`trezAles(2be'Alig
`
`
`
`(L402ISA
`
`
`
`ab+b,aBerg°,4,,Gury«b,Bers‘A,Bury
`
`
`
`
`
`a\
`
`oerel‘Od
`
`ÇI “OICH
`
`
`
`
`
`
`
`
`
`
`Page 26 of 68
`
`(zidoH
`
`(e2)doH|
`
`~_AVZ)d0H
`
`~(pdoy(2+42%)'0
`
`F(z4dzxJog
`
`\/
`
`
`
`(L+dz'x)o4Neredzooid
`
`.d,aes“x,Gury
`
`Page 26 of 68
`
`
`
`
`
`
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`Nov. 29, 2016
`
`Sheet 26 of 30
`Sheet 26 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
` (g+bz‘A)o4
`
`
`(pede'xos(pedz'x)I4jo4(z+dzx)I4
`(g+dzx)oyL(r+dz‘xjo4(aden
`_¢we\
`(redeir(L+dZ59INA
`
`‘A,Bur!$aab,o6eIS‘A,Bury
`
`
`
`ubtb,aBeyg
`
`(+6244
`
`
`
`
`
`>)
`
`lexdz*xiga,
`
`(e+dg'xjl4
`
`(gsdz'xog
`
`(ZL)doH~~
`
`(z'z)doH
`
`onblOW
`
`Page 27 of 68
`
`d,obeyg(x,Bury
`
`Page 27 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 27 Of 30
`
`US 9,509,634 B2
`
`
`
`
`
`SI "OIDH
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Page 28 of 68
`
`
`
`U.S. Patent
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 28 of 30
`
`US 9,509,634 B2
`US 9,509,634 B2
`
` V9Old
`
`aIIV9l‘OM
`
`
`
`LI
`
`LvO09L
`
`
`
`J(z'W)dd(a0
`
`CV9Ol“Ol
`
`(.1V101d)
`
`
`
` (Z@)dod4/(L'b)doaJvV009LevooaL
`
`Z1I(Zana
`
`10110
`
`210+10
`
`
`
`
`
`PV9L‘DIAeVvol‘Old
`
`(GAY41011)Gay4011)
`
`(Z'%)d9
`
`Z10110
`
`(AY1011g)
`
`
`
`
`
`Page 29 of 68
`
`Page 29 of 68
`
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 29 Of 30
`
`US 9,509,634 B2
`
`FIG. 17
`
`y
`
`Initialize the multicast ||
`Connection list
`
`1710
`
`Get the next multicast
`Connection
`
`1720
`
`-
`
`) if it is scheduled first time of
`2) if one of the nodes used by it is
`OverSubSCribed?
`
`
`
`if one of the nodes used by it is oversubscribed,
`then rip the multiCast Connection
`
`Using A* algorithm with the look-ahead cost
`using cost-table approach based Cost function,
`1) Find the least cost path from its source outlet
`link of the computational block to all the target
`inlet links of the computational blocks;
`2) Schedule it.
`
`Update demand cost and history cost of each
`link used.
`
`Count the OSN nodes in the entire multi-stage
`network. (OSN nodes are the links that are used
`by more than one connection)
`
`
`
`
`
`
`
`
`
`Scheduling
`SuCCessfully
`completed
`
`Page 30 of 68
`
`
`
`U.S. Patent
`
`Nov. 29, 2016
`
`Sheet 30 of 30
`
`US 9,509,634 B2
`
`F.G. 18
`
`y
`
`
`
`Schedule each multiCast Connection from its source Outlet link of the
`Computational block by treating the partial multi-stage hierarchical network | 1810
`of each Computational block as a fully connected single stage network, so -
`that no external wire/link is used by more than one connection.
`
`chedule partial multi-stage hierarchical network of each computationa
`|
`block by replacing all the single stage Connections with the actual partial
`multi-stage hierarchical network, so that no internal wire/link is used by more
`than One Connection.
`
`1820
`
`Page 31 of 68
`
`
`
`US 9,509,634 B2
`
`1.
`FAST SCHEDULING AND OPTMIZATION OF
`MULT-STAGE HERARCHICAL
`NETWORKS
`
`CROSS REFERENCE TO RELATED
`APPLICATIONS
`
`10
`
`15
`
`25
`
`30
`
`35
`
`This application is related to and claims priority of the
`U.S. Provisional Patent Application Ser. No. 61/846,083
`entitled “FAST SCHEDULING AND OPTIMIZATION OF
`MULTI-STAGE HIERARCHICAL NETWORKS” by
`Venkat Konda assigned to the same assignee as the current
`application, filed Jul. 15, 2013.
`This application is Continuation In Part application to and
`claims priority of the U.S. application Ser. No. 14/199,168
`entitled “OPTIMIZATION OF MULTI-STAGE HIERAR
`CHICAL NETWORKS FOR PRACTICAL ROUTING
`APPLICATIONS” by Venkat Konda assigned to the same
`assignee as the current application, filed Mar. 6, 2014 and
`the PCT Application Ser. No. PCT/US 12/53814 entitled
`“OPTIMIZATION OF MULTI-STAGE HIERARCHICAL
`NETWORKS FOR PRACTICAL ROUTING APPLICA
`TIONS''' by Venkat Konda assigned to the same assignee as
`the current application, filed Sep. 6, 2012, and both of them
`in turn are Continuation in Part applications to the U.S.
`Provisional Patent Application Ser. No. 61/531,615 entitled
`“OPTIMIZATION OF MULTI-STAGE HIERARCHICAL
`NETWORKS FOR PRACTICAL ROUTING APPLICA
`TIONS''' by Venkat Konda assigned to the same assignee as
`the current application, filed Sep. 7, 2011.
`This application is related to and incorporates by refer
`ence in its entirety the U.S. Pat. No. 8,270,400 entitled
`FULLY CONNECTED GENERALIZED MULTI-STAGE
`NETWORKS” by Venkat Konda assigned to the same
`assignee as the current application, issued Sep. 18, 2012, the
`U.S. Provisional Patent Application Ser. No. 60/905,526
`entitled LARGE SCALE CROSSPOINT REDUCTION
`WITH NONBLOCKING UNICAST & MULTICAST IN
`ARBITRARILY LARGE MULTI-STAGE NETWORKS
`by Venkat Konda assigned to the same assignee as the
`current application, filed Mar. 6, 2007, and the U.S. Provi
`sional Patent Application Ser. No. 60/940,383 entitled
`FULLY CONNECTED GENERALIZED MULTI-STAGE
`NETWORKS” by Venkat Konda assigned to the same
`assignee as the current application, filed May 25, 2007.
`This application is related to and incorporates by refer
`ence in its entirety the U.S. Pat. No. 8,170,040 entitled
`FULLY CONNECTED GENERALIZED BUTTERFLY
`FAT TREE NETWORKS by Venkat Konda assigned to the
`same assignee as the current application, issued May 1,
`2012, the U.S. Provisional Patent Application Ser. No.
`60/940,387 entitled “FULLY CONNECTED GENERAL
`IZED BUTTERFLY FAT TREE NETWORKS by Venkat
`Konda assigned to the same assignee as the current appli
`cation, filed May 25, 2007, and the U.S. Provisional Patent
`55
`Application Ser. No. 60/940,390 entitled “FULLY CON
`NECTED GENERALIZED MULTI-LINK BUTTERFLY
`FAT TREE NETWORKS by Venkat Konda assigned to the
`same assignee as the current application, filed May 25, 2007
`This application is related to and incorporates by refer
`ence in its entirety the U.S. Pat. No. 8.363,649 entitled
`FULLY CONNECTED GENERALIZED MULTI-LINK
`MULTI-STAGE NETWORKS” by Venkat Konda assigned
`to the same assignee as the current application, issued Jan.
`29, 2013, the U.S. Provisional Patent Application Ser. No.
`65
`60/940,389 entitled “FULLY CONNECTED GENERAL
`IZED REARRANGEABLY NONBLOCKING MULTI
`
`40
`
`45
`
`50
`
`60
`
`2
`LINK MULTI-STAGE NETWORKS by Venkat Konda
`assigned to the same assignee as the current application,
`filed May 25, 2007, the U.S. Provisional Patent Application
`Ser. No. 60/940,391 entitled “FULLY CONNECTED GEN
`ERALIZED FOLDED MULTI-STAGE NETWORKS by
`Venkat Konda assigned to the same assignee as the current
`application, filed May 25, 2007 and the U.S. Provisional
`Patent Application Ser. No. 60/940,392 entitled “FULLY
`CONNECTED GENERALIZED STRICTLY NON
`BLOCKING MULTI-LINK MULTI-STAGE NET
`WORKS by Venkat Konda assigned to the same assignee
`as the current application, filed May 25, 2007.
`This application is related to and incorporates by refer
`ence in its entirety the U.S. Pat. No. 8.269,523 entitled
`VLSI LAYOUTS OF FULLY CONNECTED GENERAL
`IZED NETWORKS by Venkat Konda assigned to the same
`assignee as the current application, issued Sep. 18, 2012, the
`PCT Application Ser. No. PCT/U08/64605 entitled “VLSI
`LAYOUTS OF FULLY CONNECTED GENERALIZED
`NETWORKS” by Venkat Konda assigned to the same
`assignee as the current application, filed May 22, 2008, and
`the U.S. Provisional Patent Application Ser. No. 60/940,394
`entitled VLSI LAYOUTS OF FULLY CONNECTED
`GENERALIZED NETWORKS” by Venkat Konda assigned
`to the same assignee as the current application, filed May 25,
`2007.
`This application is related to and incorporates by refer
`ence in its entirety the U.S. application Ser. No. 13/502.207
`entitled VLSI LAYOUTS OF FULLY CONNECTED
`GENERALIZED AND PYRAMID NETWORKS WITH
`LOCALITY EXPLOITATION” by Venkat Konda assigned
`to the same assignee as the current application, filed Apr. 16.
`2012, the PCT Application Ser. No. PCT/US10/52984
`entitled VLSI LAYOUTS OF FULLY CONNECTED
`GENERALIZED AND PYRAMID NETWORKS WITH
`LOCALITY EXPLOITATION” by Venkat Konda assigned
`to the same assignee as the current application, filed Oct. 16.
`2010, the U.S. Provisional Patent Application Ser. No.
`61/252,603 entitled “VLSI LAYOUTS OF FULLY CON
`NECTED NETWORKS WITH LOCALITY EXPLOITA
`TION’ by Venkat Konda assigned to the same assignee as
`the current application, filed Oct. 16, 2009, and the U.S.
`Provisional Patent Application Ser. No. 61/252,609 entitled
`VLSI LAYOUTS OF FULLY CONNECTED GENERAL
`IZED AND PYRAMID NETWORKS by Venkat Konda
`assigned to the same assignee as the current application,
`filed Oct. 16, 2009.
`
`BACKGROUND OF INVENTION
`
`Multi-stage interconnection networks such as Benes net
`works and butterfly fat tree networks are widely useful in
`telecommunications, parallel and distributed computing.
`However VLSI layouts, known in the prior art, of these
`interconnection networks in an integrated circuit are inefli
`cient and complicated.
`Other multi-stage interconnection networks including
`butterfly fat tree networks, Banyan networks, Batcher-Ban
`yan networks, Baseline networks, Delta networks, Omega
`networks and Flip networks have been widely studied par
`ticularly for self-routing packet Switching applications. Also
`Benes Networks with radix of two have been widely studied
`and it is known that Benes Networks of radix two are shown
`to be built with back to back baseline networks which are
`rearrangeably nonblocking for unicast connections.
`The most commonly used VLSI layout in an integrated
`circuit is based on a two-dimensional grid model comprising
`
`Page 32 of 68
`
`
`
`3
`only horizontal and vertical tracks. An intuitive intercon
`nection network that utilizes two-dimensional grid model is
`2D Mesh Network and its variations such as segmented
`mesh networks. Hence routing networks used in VLSI
`layouts are typically 2D mesh networks and its variations.
`However Mesh Networks require large scale cross points
`typically with a growth rate of O(N) where N is the number
`of computing elements, ports, or logic elements depending
`on the application.
`Multi-stage interconnection network with a growth rate of
`O(Nx log N) requires significantly small number of cross
`points. U.S. Pat. No. 6,185,220 entitled “Grid Layouts of
`Switching and Sorting Networks’ granted to Muthukrishnan
`et al. describes a VLSI layout using existing VLSI grid
`model for Benes and Butterfly networks. U.S. Pat. No.
`6,940,308 entitled “Interconnection Network for a Field
`Programmable Gate Array' granted to Wong describes a
`VLSI layout where switches belonging to lower stage of
`Benes Network are laid out close to the logic cells and
`Switches belonging to higher stages are laid out towards the
`center of the layout.
`Due to the inefficient and in some cases impractical VLSI
`layout of Benes and butterfly fat tree networks on a semi
`conductor chip, today mesh networks and segmented mesh
`networks are widely used in the practical applications such
`as field programmable gate arrays (FPGAs), programmable
`logic devices (PLDS), and parallel computing interconnects.
`The prior art VLSI layouts of Benes and butterfly fat tree
`networks and VLSI layouts of mesh networks and seg
`mented mesh networks require large area to implement the
`Switches on the chip, large number of wires, longer wires,
`with increased power consumption, increased latency of the
`signals which effect the maximum clock speed of operation.
`Some networks may not even be implemented practically on
`a chip due to the lack of efficient layouts.
`Fully connected Benes and butterfly fat tree networks are
`an over kill for certain practical routing applications and
`need to be optimized to significantly improve area, power
`and performance of the routing network.
`
`5
`
`10
`
`15
`
`25
`
`30
`
`35
`
`40
`
`US 9,509,634 B2
`
`4
`connected with shorter shuffle exchange links compared to
`the shuffle exchange links between spatially farther sub
`integrated circuit blocks. The optimized multi-stage net
`works provide high routability for broadcast, unicast and
`multicast connections, yet with the benefits of significantly
`lower cross points hence Smaller area, lower signal latency,
`lower power and with significant fast compilation or routing
`time. Various scheduling methods are also disclosed to
`schedule a set of multicast connections in the multi-stage
`hierarchical network.
`The optimized multi-stage networks V. (N. N. d, s)
`& V., (N1, N2, d. S) according to the current invention
`inherit the properties of one or more, in addition to addi
`tional properties, generalized multi-stage and pyramid net
`works V(N, N, d, s) & V (N, N, d, s), generalized folded
`multi-stage and pyramid networks V (N1, N2, d, s) &
`V (N. N. d, s), generalized butterfly fat tree and
`butterfly fat pyramid networks V., (N1, N2, d, s) & V (N,
`N. d, s), generalized multi-link multi-stage and pyramid
`networks V,ii (N1, N2, d, s) & V
`(N1, N2, d. S).
`generalized folded multi-link multi-stage and pyramid net
`Works Vini (N1, N2, d. S) & Vint- (N1, N2, d. S).
`generalized multi-link butterfly fat tree and butterfly fat
`pyramid networks V,ii-, (N1, N2, d, s) & V-2 (N.
`N. d, s), generalized hypercube networks V
`(N.N., d.
`s), and generalized cube connected cycles networks V,
`(N, N, d, s) for S-1, 2, 3 or any number in general.
`
`BRIEF DESCRIPTION OF DRAWINGS
`
`FIG. 1A is a diagram 100A of an exemplary partial
`multi-stage hierarchical network corresponding to one block
`with 4 inputs and 2 outputs of a computational block
`connecting only from left-hand side, to route practical
`applications such as FPGA routing of hardware designs in
`accordance with the invention.
`FIG. 1B is a diagram 100B of an exemplary partial
`multi-stage hierarchical network corresponding to one block
`with 8 inputs and 4 outputs of a computational block
`connecting from both left-hand side and right-hand side, to
`route practical applications such as FPGA routing of hard
`ware designs in accordance with the invention.
`FIG. 1C is a diagram 100C of an exemplary partial
`multi-stage hierarchical network corresponding to one
`block, by dividing the network into two parallel and inde
`pendent slices, with 16 inputs and 4 outputs of a computa
`tional block connecting from both left-hand side and right
`hand side, to route practical applications such as FPGA
`routing of hardware designs in accordance with the inven
`tion.
`FIG. 1C1 is a diagram 100C1, FIG. 1C2 is a diagram
`100C2, FIG. 1C3 is a diagram 100C3, and FIG. 1C4 is a
`diagram 100C4 illustrate the specific details of the diagram
`100C of FIG. 1C, particularly the connections between
`different slices.
`FIG. 1C5 is a diagram 100C5 illustrate the specific details
`of the diagram 100C of FIG. 1C, particularly the internal
`connections between two Successive stages of any ring of
`any slice, in one embodiment.
`FIG. 2A is a diagram 200A, in an embodiment of a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`FIG. 2B is a diagram 200B, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`
`SUMMARY OF INVENTION
`
`Significantly optimized multi-stage networks for faster
`scheduling of connections, useful in wide target applica
`tions, with VLSI layouts (or floor plans) using only hori
`Zontal and vertical links to route large scale Sub-integrated
`circuit blocks having inlet and outlet links, and laid out in an
`integrated circuit device in a two-dimensional grid arrange
`ment of blocks, (for example in an FPGA where the sub
`integrated circuit blocks are Lookup Tables, or memory
`blocks, or DSP blocks) are presented. The optimized multi
`stage networks in each block employ several slices of rings
`of stages of switches with inlet and outlet links of sub
`integrated circuit blocks connecting to rings from either
`left-hand side only, or from right-hand side only, or from
`both left-hand side and right-hand side.
`The optimized multi-stage networks with their VLSI
`layouts employ shuffle exchange multi-drop links where
`outlet links of cross links from Switches in a stage of a ring
`in one Sub-integrated circuit block are connected to either
`inlet links of Switches in the another stage of a ring in
`another sub-integrated circuit block or inlet links of switches
`in the another stage of a ring in the same Sub-integrated
`circuit block so that said cross links are either vertical links
`or horizontal and Vice versa.
`The VLSI layouts exploit spatial locality so that different
`Sub-integrated circuit blocks that are spatially nearer are
`
`45
`
`50
`
`55
`
`60
`
`65
`
`Page 33 of 68
`
`
`
`5
`FIG. 2C is a diagram 200C, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`FIG. 2D is a diagram 200D, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`FIG. 2E is a diagram 200E, in an embodiment of a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`FIG. 2F is a diagram 200F, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block.
`FIG. 3A is a diagram 300A, in an embodiment of, all the
`connections between two successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network.
`FIG. 3B is a diagram 300B, in an embodiment of all the
`connections between two successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network.
`FIG. 3C is a diagram 300C, in an embodiment of all the
`connections between two successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network.
`FIG. 3D is a diagram 300D. in an embodiment of all the
`connections between two successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network.
`FIG. 3E is a diagram 300E, in an embodiment of, all the
`connections between two successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network.
`FIG. 4A is a diagram 400A, in an embodiment of, all the
`connections between different stages of two different rings
`in the same block or in two different blocks of a multi-stage
`hierarchical network.
`FIG. 4B is a diagram 400B, in an embodiment of all the
`connections between different stages of two different rings
`in the same block or in two different blocks of a multi-stage
`hierarchical network.
`FIG. 5A is a diagram 500A, in an embodiment of, all the
`connections with multi-drop hop wires, between two Suc
`cessive stages of two different rings in the same block or in
`two different blocks of a multi-stage hierarchical network.
`FIG. 6A is a diagram 600A, in an embodiment of, all the
`connections with multi-drop hop wires, between different
`stages of two different rings in the same block or in two
`different blocks of a multi-stage hierarchical network.
`FIG. 6B is a diagram 600B, in an embodiment of all the
`connections with multi-drop hop wires, between different
`stages of two different rings in the same block or in two
`different blocks of a multi-stage hierarchical network.
`FIG. 7A is a diagram 700A, is an embodiment of hop wire
`connection chart corresponding to a block of multi-stage
`hierarchical network, where the inter-ring connections are
`given between two Successive stages of two different rings
`as described in diagrams 300A of FIG. 3A to 300E of FIG.
`3E.
`FIG. 8 is a diagram 800, is an embodiment of 2D-grid of
`blocks with each block corresponding to a partial multi
`stage network to implement an exemplary multi-stage hier
`archical network, in accordance with the invention.
`FIG. 9A is a diagram 900A, in an embodiment of a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block, with delay optimizations.
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`US 9,509,634 B2
`
`10
`
`15
`
`6
`FIG.9B is a diagram 900B, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block, with delay optimizations.
`FIG.9C is a diagram 900C, in an embodiment of, a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block, with delay optimizations.
`FIG. 9D is a diagram 900D, in an embodiment of a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block, with delay optimizations.
`FIG.9E is a diagram 900E, in an embodiment of a stage
`in a ring of multi-stage hierarchical network corresponding
`to one block, with delay optimizations.
`FIG. 10A is a diagram 1000A, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 10B is a diagram 1000B, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 10C is a diagram 1000C, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 10D is a diagram 1000D, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 10E is a diagram 1000E, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 10F is a diagram 1000F, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 11A is a diagram 1100A, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 11B is a diagram 1100B, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 11C is a diagram 1100C, in an embodiment of a
`stage in a ring of multi-stage hierarchical network corre
`sponding to one block, with delay optimizations.
`FIG. 12 is a diagram 1200, in an embodiment, all the
`connections between two Successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network with delay optimizations.
`FIG. 13 is a diagram 1300, in one embodiment, all the
`connections between two Successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network with delay optimizations.
`FIG. 14 is a diagram 1400, in an embodiment of all the
`connections between two Successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network with delay optimizations.
`FIG. 15 is a diagram 1500, in an embodiment of all the
`connections between two Successive stages of two different
`rings in the same block or in two different blocks of a
`multi-stage hierarchical network with delay optimizations.
`FIG.16A1 is a diagram 1600A1 of an exemplary prior art
`implementation of a two by two switch; FIG. 16A2 is a
`diagram 1600A2 for programmable integrated circuit prior
`art implementation of the diagram 1600A1 of FIG. 16A1;
`FIG. 16A3 is a diagram 1600A3 for one-time programmable
`integrated circuit prior art implementation of the diagram
`1600A1 of FIG. 16A1: FIG. 16A4 is a diagram 1600A4 for
`integrated circuit placement and route implementation of the
`diagram 1600A1 of FIG. 16A1.
`FIG. 17 is high-level flowchart of a scheduling method
`1700 according to the invention, used to set up a set of
`
`Page 34 of 68
`
`
`
`US 9,509,634 B2
`
`7
`multicast connections in the complete multi-stage hierarchi
`cal network as disclosed in the current invention.
`FIG. 18 is high-level flowchart of a scheduling method
`1800 according to the invention, used to set up a set of
`m