`Vorbach
`
(10) Patent No.: US 11,061,682 B2
(45) Date of Patent: Jul. 13, 2021
`
`US011061682B2
`
(54) ADVANCED PROCESSOR ARCHITECTURE

(71) Applicants: Martin Vorbach, Lingenfeld (DE);
                 Hyperion Core, Inc., Los Gatos, CA (US)

(72) Inventor:   Martin Vorbach, Lingenfeld (DE)

(*)  Notice:     Subject to any disclaimer, the term of this
                 patent is extended or adjusted under 35
                 U.S.C. 154(b) by 0 days.

(21) Appl. No.:  15/535,697

(22) PCT Filed:  Dec. 13, 2015

(86) PCT No.:    PCT/US2015/065418
     § 371 (c)(1),
     (2) Date:   Jun. 13, 2017

(87) PCT Pub. No.:  WO2016/100142
     PCT Pub. Date: Jun. 23, 2016

(65) Prior Publication Data
     US 2018/0004530 A1    Jan. 4, 2018

(30) Foreign Application Priority Data
     Dec. 15, 2014  (EP)  14197929
     Jun. 24, 2015  (EP)  150201036

(51) Int. Cl.
     G06F 9/38  (2018.01)
     G06F 9/30  (2018.01)
     G06F 9/32  (2018.01)

(52) U.S. Cl.
     CPC  G06F 9/3855 (2013.01); G06F 9/3001
          (2013.01); G06F 9/3017 (2013.01);
          (Continued)

(58) Field of Classification Search
     CPC  G06F 9/3855
     See application file for complete search history.

`
(56) References Cited

U.S. PATENT DOCUMENTS

5,699,537 A      12/1997  Sharangpani et al.
5,923,862 A *     7/1999  Nguyen .......... G06F 9/28
                                            712/208
(Continued)

FOREIGN PATENT DOCUMENTS

EP   14185745.8    9/2014
WO   2002/071249   9/2002
(Continued)

OTHER PUBLICATIONS

Cheol-Ho Jeong, Woo-Chan Park, Tack-Don Han, Sang-Woo Kim
   and Moon-Key Lee, "In order issue out-of-order execution floating
   point coprocessor for CalmRISC32," Proceedings 15th IEEE
   Symposium on Computer Arithmetic. ARITH-15 2001, Vail, CO, USA,
   2001, pp. 195-200, doi: 10.1109/ARITH.2001.930119. *
(Continued)
Primary Examiner: Michael J Metzger
(74) Attorney, Agent, or Firm: IP Spring
`
(57) ABSTRACT

The invention relates to a method for processing instructions
out-of-order on a processor comprising an arrangement of
execution units. The inventive method comprises looking up
operand sources in a Register Positioning Table and setting
operand input references of the instruction to be issued
accordingly, checking for an Execution Unit (EXU) available
for receiving a new instruction, and issuing the instruction
to the available Execution Unit and entering a reference
of the result register addressed by the instruction to be issued
to the Execution Unit into the Register Positioning Table
(RPT).
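
The issue sequence named in the abstract (RPT lookup, EXU availability check, issue, RPT update) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `RegisterPositioningTable`, `ExecutionUnit` and `issue`, the first-free selection policy, and the convention that a register without an RPT entry is read from the register file are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# All names below are hypothetical; the abstract names only the RPT and the EXUs.
Ref = Tuple[str, object]  # ("exu", index) or ("regfile", register name)


@dataclass
class ExecutionUnit:
    index: int
    busy: bool = False
    opcode: Optional[str] = None
    inputs: List[Ref] = field(default_factory=list)  # operand input references


@dataclass
class RegisterPositioningTable:
    # Maps a register name to the index of the EXU that produces its latest value.
    position: Dict[str, int] = field(default_factory=dict)

    def lookup(self, reg: str) -> Ref:
        if reg in self.position:
            return ("exu", self.position[reg])
        # Assumption: a register with no entry is read from the register file.
        return ("regfile", reg)


def issue(rpt: RegisterPositioningTable, exus: List[ExecutionUnit],
          opcode: str, dest: str, sources: List[str]) -> Optional[int]:
    """Issue one instruction; return the chosen EXU index, or None to stall."""
    # Step 1: look up operand sources in the RPT and set the input references.
    inputs = [rpt.lookup(src) for src in sources]
    # Step 2: check for an EXU available for receiving a new instruction.
    free = next((e for e in exus if not e.busy), None)
    if free is None:
        return None
    # Step 3: issue the instruction to the available EXU.
    free.busy, free.opcode, free.inputs = True, opcode, inputs
    # Step 4: enter a reference to the result register into the RPT.
    rpt.position[dest] = free.index
    return free.index
```

With two execution units, issuing `instr r3, src0, src1` followed by `instr tgt0, r3, src2` makes the second instruction's first operand a reference to the unit that produces `r3`, and a third instruction stalls because no unit is free in this simplified model.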
`
20 Claims, 26 Drawing Sheets
`
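[Representative drawing. Instruction sequence shown:
   instr r3, src0, src1     (0221)
   instr tgt0, r3, src2     (0223)
   instr tgt1, r3, src3     (0225)
   instr r3, src4, src5     (0227)
 Further labels: src0 to src3, tgt0, tgt1, r3, rr1, rr2, rr3.
 Reference numerals: 0116, 0121, 0131, 0201, 0211 to 0213, 0221 to 0227,
 0231, 0232.]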
`
`WIZ, Inc. EXHIBIT - 1069
`WIZ, Inc. v. Orca Security LTD.
`
`
`
US 11,061,682 B2
Page 2
`
(52) U.S. Cl.
     CPC ...... G06F 9/30065 (2013.01); G06F 9/30098
          (2013.01); G06F 9/325 (2013.01); G06F
          9/327 (2013.01); G06F 9/382 (2013.01);
          G06F 9/384 (2013.01); G06F 9/3824
          (2013.01); G06F 9/3826 (2013.01); G06F
          9/3828 (2013.01); G06F 9/3836 (2013.01);
          G06F 9/3842 (2013.01); G06F 9/3844
          (2013.01); G06F 9/3846 (2013.01); G06F
          9/3848 (2013.01); G06F 9/3857 (2013.01);
          G06F 9/3861 (2013.01); G06F 9/3863
          (2013.01); G06F 9/3885 (2013.01)
(56) References Cited

U.S. PATENT DOCUMENTS

7,096,345 B1 *       8/2006  Chen ........ G06F 9/3838    712/217
7,111,152 B1 *       9/2006  Cofler ...... G06F 9/3836    712/216
2002/0188828 A1 *   12/2002  Sugimoto .... G06F 9/30083   712/215
2007/0055852 A1 *    3/2007  Hanes ....... G06F 9/3851    712/228
2013/0205123 A1 *    8/2013  Vorbach ..... G06F 9/30043   712/221
`
FOREIGN PATENT DOCUMENTS

WO   2010/142432    2/2010
WO   2010/043401    4/2010
WO   2011/079942    7/2011
WO   2012/003997    1/2012
WO   2012/123061    9/2012
WO   2012/167933   12/2012
WO   2013/098643    7/2013
`WO
`
OTHER PUBLICATIONS

"ARM7TDMI-S Data Sheet", Document No. ARM DDI 0084D,
   ARM Ltd., UK, 1998, 60 pages.
Balasubramonian, "Lecture Notes: Out-of-Order Processors",
   University of Utah, Oct. 13, 2007, 8 pages.
European Patent Office, International Search Report for
   International Patent Application No. PCT/US2015/065418,
   dated Jul. 7, 2016, 6 pages.
European Patent Office, Written Opinion for International Patent
   Application No. PCT/US2015/065418, dated Jul. 7, 2016, 8 pages.
Fog, "The microarchitecture of Intel, AMD and VIA CPUs: An
   optimization guide for assembly programmers and compiler makers",
   http://www.agner.org/optimize/microarchitecture.pdf, 1996-2017,
   233 pages.
Goulding-Hotta, et al., "The GreenDroid Mobile Application
   Processor: an Architecture for Silicon's Dark Future", University
   California, San Diego; Published by the IEEE Computer Society,
   Mar./Apr. 2011, 10 pages.
Gunadi, et al., "CRIB: Consolidated Rename, Issue, and Bypass",
   ISCA'11, Jun. 4-8, 2011, San Jose, California, USA, 2011,
   10 pages.
Rotenberg, et al., "Trace Cache: a Low Latency Approach to High
   Bandwidth Instruction Fetching", Proceedings of the 29th annual
   ACM/IEEE international symposium on Microarchitecture. IEEE
   Computer Society, 1996, 12 pages.
Thomadakis, "The Architecture of the Nehalem Processor and
   Nehalem-EP SMP Platforms", Texas A&M University, Mar. 17,
   2011, 49 pages.
Tomasulo, "An Efficient Algorithm for Exploiting Multiple
   Arithmetic Units", IBM Journal of Research and Development
   archive; vol. 11, Issue 1, Jan. 1967, pp. 25-33.
UK Intellectual Property Office, Examination Report for United
   Kingdom Patent Application No. 1711202.0, dated Aug. 20, 2018,
   5 pages.
UK Intellectual Property Office, Examination Report for United
   Kingdom Patent Application No. 1711202.0, dated Aug. 17, 2017,
   10 pages.
"Computer Organization and Architecture, Chapter 15. Control Unit
   Operation", umcs.maine.edu,
   http://aturing.umcs.maine.edu/~meadow/courses/cos335/COA15.pdf,
   Mar. 16, 2010, 9 pages.
Lazzaro, "CS 152 Computer Architecture and Engineering: Lecture
   6 - Superpipelining + Branch Prediction", UC Berkeley,
   https://inst.eecs.berkeley.edu/~cs152/sp14/lecnotes/lec3-2.pdf,
   Feb. 6, 2014, 37 pages.
Sima, Dezso, "Microarchitecture of Superscalars (3): Branch
   Prediction", Universitas Budensis, John von Neumann Faculty of
   Informatics, Fall 2007, 73 pages.

* cited by examiner
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 1 of 26    US 11,061,682 B2

[Fig. 1: drawing. Reference numerals: 0101 to 0110, 0112 to 0117, 0131.]
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 2 of 26    US 11,061,682 B2

[Fig. 1A: drawing. Labels: EXU(0,0), EXU(1,0), EXU(m,n), EXU(u,v), select.
 Reference numerals: 0121 to 0127, 0141 to 0143, 0151 to 0155, 0505, 0510,
 0512.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 3 of 26    US 11,061,682 B2

[Fig. 2: drawing. Instruction sequence shown:
   instr r3, src0, src1     (0221)
   instr tgt0, r3, src2     (0223)
   instr tgt1, r3, src3     (0225)
   instr r3, src4, src5     (0227)
 Further labels: src0 to src5, tgt0, tgt1, r3, rr1, rr2, rr3.
 Reference numerals: 0116, 0121, 0131, 0201, 0211 to 0213, 0221 to 0227,
 0231, 0232.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 4 of 26    US 11,061,682 B2

[Fig. 3: drawing with two instruction listings (0301, 0302) and register
 columns r1 to r4 annotated with instruction references #01 to #13.]

Listing 0301:

   #01: load r1
   #02: load r2
   #03: load r3
   #04: mov r4, r3
   #05: add r1, r2, r1
   #06: cmp r1, r3
   #07: movlt r4, r1
   #08: cmp r3, r2
   #09: movlt r4, r2
   #10: store r4
   #11: load r3
   #12: add r3, r3, r2
   #13: store r3

Listing 0302: identical to listing 0301 except that #07 and #09 each read
"mux lt ? r4" in place of the movlt instruction.
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 5 of 26    US 11,061,682 B2

[Drawing on this sheet: only the label "Start instruction issue" is legible.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 6 of 26    US 11,061,682 B2

[Fig. 5 and Fig. 5A: drawings. Labels: RRA, ORA0, ORA1, we1, we2, we3, wen.
 Reference numerals: 0110, 0131, 0501 to 0506, 0510 to 0512, 0550 to 0553,
 055n, 0561 to 0565, 0701, 0702, 0751, 0752, 0792.]
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 7 of 26    US 11,061,682 B2

[Fig. 6: drawing. Labels: "from 0131(n)", "to 0131(m...n)", "To 0751", RRA,
 en, we1, we2, we3, wen.
 Reference numerals: 0131(1) to 0131(3), 0504, 0550 to 0553, 055n,
 0561 to 0565, 0601(1) to 0601(3), 0602 to 0606.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 8 of 26    US 11,061,682 B2

[Fig. 7, Fig. 7A and Fig. 7B: drawings. Labels: trash_(0,0), trash_(0,1),
 trash_(0,2), trash_(m,n), set 0, set 1, set 2, set n, clear, clear all.
 Reference numerals: 0131(1) to 0131(n), 0504, 0505(u,v), 0601(1) to 0601(n),
 0603, 0604, 0701(1) to 0701(n), 0702, 0710 to 0713, 0721 to 0723, 072n,
 0751(1) to 0751(n), 0751_(u,v), 0752(u,v), 0753(1) to 0753(n),
 0761 to 0764, 0791, 0792, 0801, 0802.]
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 9 of 26    US 11,061,682 B2

[Fig. 8 and Fig. 8A: drawings. Fig. 8A shows example bit patterns at
 0861 to 0863.
 Reference numerals: 0601(n...0), 0604, 0713(n...0), 0753(n...0),
 0801 to 0814, 0821 to 0823, 0831, 0861 to 0863.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 10 of 26    US 11,061,682 B2

Fig. 9A

        MOV    R0, #5              ; R0 is current number, 5 assumed for start value
        MOV    R1, #0              ; R1 is count of number of iterations
while   SUBS   R7, R0, #1          ; repeat if R0 != 1
        BEQ    exit                ; else done
body    ADD    R1, R1, #1          ; increment number of iterations
        ANDS   R7, R0, #1          ; test whether R0 is odd
        BEQ    even
        ADD    R0, R0, R0, LSL #1  ; if odd, set R0 = R0 + (R0 << 1) + 1
        ADD    R0, R0, #1
        B      while               ; could jump directly to 'loop': (guaranteed R0 > 1)
even    MOV    R0, R0, ASR #1      ; if even, set R0 = R0 >> 1
        B      while
exit

Fig. 9B

        MOV    R0, #5              ; R0 is current number, 5 assumed for start value
        MOV    R1, #0              ; R1 is count of number of iterations
while   SUBS   R7, R0, #1          ; repeat if R0 != 1
        BEQ    exit                ; else done
body    ADD    R1, R1, #1          ; increment number of iterations
        ANDS   R7, R0, #1          ; test whether R0 is odd
        ADDNE  R0, R0, R0, LSL #1  ; if odd, set R0 = R0 + (R0 << 1) + 1
        ADDNE  R0, R0, #1
        MOVEQ  R0, R0, ASR #1      ; if even, set R0 = R0 >> 1
        B      while
exit

[Reference numerals on this sheet: 0931, 0932, 0941, 0942, 0951, 0952.]
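
The loop shown in Fig. 9A and Fig. 9B can be restated in Python to make the control flow explicit. This is only a sketch based on the comments in the listings; the function name `count_iterations` and the use of Python integers in place of the registers R0 and R1 are assumptions made for illustration, and the input is assumed to be a positive integer.

```python
def count_iterations(start: int = 5) -> int:
    """Mirror of the Fig. 9A/9B loop: r0 is the current number, r1 the count."""
    r0, r1 = start, 0
    while r0 != 1:                   # SUBS/BEQ: repeat if R0 != 1, else done
        r1 += 1                      # increment number of iterations
        if r0 & 1:                   # ANDS: test whether R0 is odd
            r0 = r0 + (r0 << 1) + 1  # if odd, set R0 = R0 + (R0 << 1) + 1
        else:
            r0 = r0 >> 1             # if even, set R0 = R0 >> 1
    return r1


print(count_iterations(5))  # start value 5, as assumed in the listings
```

The branch-based form in Fig. 9A and the conditionally executed form in Fig. 9B compute the same result; they differ only in whether the odd/even decision is expressed as a jump or as predicated instructions.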
`
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 11 of 26    US 11,061,682 B2

[Fig. 9C and Fig. 9D: assembly listings, variants of the loop of Fig. 9A and
 Fig. 9B. Both keep the same prologue and loop test (MOV R0, #5; MOV R1, #0;
 while SUBS R7, R0, #1; BEQ exit; body ADD R1, R1, #1) and compute the
 odd-case value into R2 with "ADD R2, R0, R0, LSL #1", a line commented
 "SPECULATIVE !!".
 Fig. 9C selects the result with conditional instructions:
   ANDS R7, R0, #1; ADDNE R0, R2, #1; MOVEQ R0, R0, ASR #1; B while.
 Fig. 9D uses an IF instruction testing R0 & #1, followed by
   ADD.T R0, R2, #1 and MOV.F R0, R0, ASR #1; B while.
 Comments legible in both listings: "R0 is current number, 5 assumed for
 start value", "R1 is count of number of iterations", "repeat if R0 != 1",
 "increment number of iterations", "test whether R0 is odd",
 "if odd, set R0 = R0 + (R0 << 1) + 1", "if even, set R0 = R0 >> 1",
 "else done".
 Reference numerals: 0901, 0902, 0903, 0921.]
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 12 of 26    US 11,061,682 B2

[Fig. 10: drawing. Instructions shown: SUBS R7, R0, #1; BEQ exit;
 ADD R1, R1, #1; ANDS R7, R0, #1; ADDNE R0, R0, LSL #1; ADDNE R0, R2, #1;
 MOVEQ R0, ASR #1; B while. Register labels: r0, r1, r7.
 Reference numerals: 1001 to 1005, 1011.]
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 13 of 26    US 11,061,682 B2

Fig. 11

          mov   r0, #0            ; i
i_loop:
          mov   r1, #0            ; k
k_loop:
          mul   r3, r10, r0
          add   r3, r3, r1
          ldr   r3, [r7, r3]      ; matrix C          1101
          mov   r2, #0            ; j
j_loop:
          mul   r4, r10, r0
          add   r4, r4, r2
          ldr   r4, [r8, r4]      ; matrix A          1102
          mul   r5, r12, r2
          add   r5, r5, r1
          ldr   r5, [r9, r5]      ; matrix B          1103
          mul   r4, r4, r5                            1121
          add   r3, r3, r4
          add   r2, r2, #1
          cmp   r2, r12
          bne   j_loop            ; end j_loop        1111
          mul   r4, r10, r0
          add   r4, r4, r1
          str   r3, [r7, r4]                          1104
          add   r1, r1, #1
          cmp   r1, r11
          bne   k_loop            ; end k_loop        1112
          add   r0, r0, #1
          cmp   r0, r10
          bne   i_loop            ; end i_loop        1113
exit:
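
The loop nest of the Fig. 11 listing can be restated in Python as a reading aid. This is a sketch under several assumptions that are not stated in the figure: the matrices are treated as flat Python lists indexed by element rather than by byte offset, the function name `matmul_accumulate` is hypothetical, and `range` loops stand in for the bottom-tested `cmp`/`bne` loops of the listing (which execute their body at least once). The parameters `r10`, `r11` and `r12` play the role of the registers of the same name.

```python
def matmul_accumulate(a, b, c, r10, r11, r12):
    """Flat-array rendering of the Fig. 11 loop nest; c is updated in place."""
    for i in range(r10):                  # r0 (i) counts up to r10
        for k in range(r11):              # r1 (k) counts up to r11
            acc = c[r10 * i + k]          # ldr r3, [r7, r3]  ; matrix C
            for j in range(r12):          # r2 (j) counts up to r12
                # ldr r4 (matrix A), ldr r5 (matrix B), mul r4, add r3
                acc += a[r10 * i + j] * b[r12 * j + k]
            c[r10 * i + k] = acc          # str r3, [r7, r4]
    return c


# Usage with 2 x 2 matrices stored row by row (square sizes assumed).
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
c = [0, 0, 0, 0]
print(matmul_accumulate(a, b, c, 2, 2, 2))
```

Because the listing loads the current element of matrix C before the inner loop, the sketch accumulates onto the existing contents of `c` rather than overwriting them.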
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 14 of 26    US 11,061,682 B2

[Fig. 12A and Fig. 12B: drawings. Labels: "instruction",
 "valid flag (v = valid, iv = invalid)".
 Reference numerals: 1201 to 1205, 1211 to 1220.]
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 15 of 26    US 11,061,682 B2

[Fig. 13A, Fig. 13B and Fig. 13C: drawings showing buffer contents over
 successive time steps (time t+1 to time t+5). Entries are marked v (valid)
 or iv (invalid). Entry labels: instr-2, instr-1, instr+1, instr+2, instr+3;
 mul, add, ldr, fldr (Fig. 13A); add, bne, for (Fig. 13B); subs, beq, b,
 while (Fig. 13C).
 Reference numerals: 1201, 1204, 1205, 1304 to 1306, 1316, 1321 to 1326.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 16 of 26    US 11,061,682 B2

[Fig. 13D and Fig. 13E: drawings showing buffer contents over successive
 time steps (time t to time t+4). Entries are marked v (valid) or
 iv (invalid). Entry labels: instr-2, instr-1, instr+1; mov, add, cmp, bne,
 for (Fig. 13D); mov, subs, beq, b, while, "invalid (nop)" (Fig. 13E).
 Reference numerals: 1201, 1204, 1205, 1211, 1311 to 1313, 1321, 1324, 1334,
 1335, 1342 to 1344.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 17 of 26    US 11,061,682 B2

Fig. 14A

i_loop:   mov     r0, #0                           ; i
k_loop:   mov     r1, #0                           ; k
          fldr    r3, [r3, (r10*r0)+r1)]           1401
j_loop:   mov     r2, #0
          fldr    r4, [r8, (r10*r0)+r2)]           1402
          fldr    r5, [r8, (r12*r2)+r1)]           1403
          mul     r4, r4, r5
          add     r3, r3, r4
          endfor  (r2, +#1, r12, j_loop)           1411
          fstr    r3, [r7, (r10*r0)+r1)]           1404
          endfor  (r1, +#1, r11, k_loop)           1412
          endfor  (r0, +#1, r10, i_loop)           1413
exit:

Fig. 14B

i_loop:   (nop)                                    ; mov r0, #0
k_loop:   (nop)                                    ; mov r1, #0
          fldr    r3, [r3, (r10*r0)+r1)]           1401
j_loop:   (nop)                                    ; mov r2, #0
          fldr    r4, [r8, (r10*r0)+r2)]           1402
          fldr    r5, [r8, (r12*r2)+r1)]           1403
          mul     r4, r4, r5                       1431
          add     r3, r3, r4                       1431
          endfor  (r2, +#1, r12, j_loop, #0)       1421
          fstr    r3, [r7, (r10*r0)+r1)]           1404
          endfor  (r1, +#1, r11, k_loop, #0)       1422
          endfor  (r0, +#1, r10, i_loop, #0)       1423
exit:

[Further reference numerals on this sheet: 1432, 1433.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 18 of 26    US 11,061,682 B2

Fig. 14C

i_loop:   for     (r0, +#1, r10, #0)
k_loop:   for     (r1, +#1, r11, #0)
          fldr    r3, [r3, (r10*r0)+r1)]
j_loop:   for     (r2, +#1, r12, #0)
          fldr    r4, [r8, (r10*r0)+r2)]
          fldr    r5, [r8, (r12*r2)+r1)]
          mul     r4, r4, r5
          add     r3, r3, r4
          endfor
          fstr    r3, [r7, (r10*r0)+r1)]
          endfor
          endfor
exit:
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 19 of 26    US 11,061,682 B2

[Fig. 15: drawing. Labels: Issue, Dispatch States.
 Reference numerals: 1203, 1204, 1205, 1211, 1212, 1215/1216, 1218,
 1219/1220, 1501, 1505 to 1513, 1515 to 1521.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 20 of 26    US 11,061,682 B2

[Fig. 16A and Fig. 16B: timing drawings along a time axis starting at
 "time t". Blocks shown: "fldr: 1402, 1403" (1601), "mul/add: 1431" (1602),
 "fstr: 1404" (1603), "endfor: 1421, 1422, 1423" with instances (1) (1611)
 and (2) (1612); annotation "clock cycles latency".
 Reference numerals: 1601 to 1603, 1611 to 1613.]
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 21 of 26    US 11,061,682 B2

[Fig. 16C and Fig. 16D: timing drawings along a time axis starting at
 "time t". Blocks shown: "fldr: 1402, 1403" (1601), "mul/add: 1431" (1602),
 "fstr: 1404" (1603), "endfor: 1421, 1422, 1423" with instances (1) and (2);
 annotation "clock cycles latency".
 Reference numerals: 1601 to 1603, 1611 to 1613, 1621 to 1623, 1631, 1632.]
`
`
`
`
`
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 22 of 26    US 11,061,682 B2

[Fig. 17A and Fig. 17B: drawings. Instructions shown in both figures:
   i_loop:  for  (r0, +#1, r10, i_loop, #0)
   k_loop:  for  (r1, +#1, r11, k_loop, #0)
   j_loop:  for  (r2, +#1, r12, j_loop, #0)
            fldr r3, [r3, (r10*r0)+r1)]
            fldr r4, [r8, (r10*r0)+r2)]
            fldr r5, [r8, (r12*r2)+r1)]
            mul  r4, r4, r5
            add  r3, r3, r4
            fstr r3, [r7, (r10*r0)+r1)]
 Reference numerals: 1701 to 1713.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 23 of 26    US 11,061,682 B2

[Fig. 18: drawing. Labels: cnt, clr, stp, trg(0), trg(1), trg(2), idx(0),
 idx(1), idx(2), ext(0), ext(1), ext(2), exit.
 Reference numerals: 1801(0) to 1801(2), 1802(0) to 1802(2),
 1803(0) to 1803(2), 1804(0) to 1804(2), 1805(0) to 1805(2),
 1806(1) to 1806(3), 1821 to 1823.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 24 of 26    US 11,061,682 B2

[Fig. 18A: drawing. Reference numerals: 1851 to 1857.]

Conditions shown in Fig. 18A:

   00: !ext(0)
   01: ext(0) & !ext(1)
   10: ext(0) & ext(1) & !ext(2)
   11: ext(0) & ext(1) & ext(2)
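
The four conditions legible in Fig. 18A are mutually exclusive and cover every combination of ext(0), ext(1) and ext(2), so they can be read as a small priority encoder. The Python sketch below illustrates that reading only; the function name `select_code` is hypothetical, and what the resulting 2-bit code selects is not stated on the drawing sheet.

```python
def select_code(ext0: bool, ext1: bool, ext2: bool) -> str:
    """Return the 2-bit code whose Fig. 18A condition holds for the ext flags."""
    if not ext0:
        return "00"   # !ext(0)
    if not ext1:
        return "01"   # ext(0) & !ext(1)
    if not ext2:
        return "10"   # ext(0) & ext(1) & !ext(2)
    return "11"       # ext(0) & ext(1) & ext(2)


# Enumerate all eight flag combinations.
for flags in [(a, b, c) for a in (False, True) for b in (False, True) for c in (False, True)]:
    print(flags, select_code(*flags))
```

Read this way, the code counts how many consecutive ext flags, starting from ext(0), are set.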
`
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 25 of 26    US 11,061,682 B2

[Fig. 19: drawing. Labels: "from further units (e.g. Execution Units, Loop
 Control Units, Store Units, Register File, etc)", "to further units (e.g.
 Execution Units, Store Units, Register File, etc)", reg en, buf en, read.
 Reference numerals: 1632, 1901 to 1918, 1931 to 1935.]
`
`
`
U.S. Patent    Jul. 13, 2021    Sheet 26 of 26    US 11,061,682 B2

[Fig. 20: drawing. Block labels: "m x L/S", "l x LW-Core", "h x HW-Core",
 "RegFile", "m:4 tagged arbiter", "(l + 2h):4 tagged arbiter",
 "8 x 8:1 tagged mux", "(m + l + h) x 2 8:1 tagged mux".
 Reference numerals: 2001 to 2006, 2010 to 2012.]
`