(12) United States Patent
     McCubbrey

(10) Patent No.:      US 7,587,699 B2
(45) Date of Patent:  Sep. 8, 2009

US007587699B2

(54) AUTOMATED SYSTEM FOR DESIGNING AND DEVELOPING FIELD
     PROGRAMMABLE GATE ARRAYS

(75) Inventor: David L. McCubbrey, Ann Arbor, MI (US)

(73) Assignee: Pixel Velocity, Inc., Ann Arbor, MI (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is
      extended or adjusted under 35 U.S.C. 154(b) by 282 days.

(21) Appl. No.: 11/432,186

(22) Filed: May 10, 2006

(65) Prior Publication Data
     US 2006/0206850 A1    Sep. 14, 2006

Related U.S. Application Data

(62) Division of application No. 10/441,581, filed on May 19, 2003,
     now Pat. No. 7,073,158.
(60) Provisional application No. 60/381,295, filed on May 17, 2002.

(51) Int. Cl.
     G06F 17/50 (2006.01)
(52) U.S. Cl. 716/17; 716/6; 716/9; 716/12
(58) Field of Classification Search: 716/6, 716/9, 12, 17
     See application file for complete search history.
(56) References Cited

U.S. PATENT DOCUMENTS

5,841,439 A *      11/1998  Pose et al. ........ 345/418
6,086,629 A         7/2000  McGettigan et al.
6,301,695 B1       10/2001  Burnham et al.
6,370,677 B1        4/2002  Carruthers et al.
6,457,164 B1        9/2002  Hwang et al.
6,526,563 B1        2/2003  Baxter
6,557,156 B1        4/2003  Guccione
2003/0086300 A1     5/2003  Noyes et al.
2005/0165995 A1 *   7/2005  Gemelli et al. ..... 710/305

* Cited by examiner

Primary Examiner: Thuan Do
(74) Attorney, Agent, or Firm: Jeffrey Schox

(57) ABSTRACT

An automated system and method for programming field programmable
gate arrays (FPGAs) is disclosed for implementing user-defined
algorithms specified in a high level language. The system is
particularly suited for use with image processing algorithms and can
speed up the process of implementing and testing a fully written
high-level user-defined algorithm to a matter of a few minutes, rather
than the days, weeks or even months presently required using
conventional software tools. The automated system includes an analyzer
module and a mapper module. The analyzer determines what logic
components are required and their interrelationships, and observes the
relative timing between the required components and their partial
products. The mapper module utilizes the output from the analyzer
module and determines where the required logic components must be
placed on a given target FPGA in order to reliably route, without
interference, the required interconnections between various components
and I/O.

10 Claims, 13 Drawing Sheets
`
[Representative drawing; block labels match FIG. 7: Source Code,
Analyze, Operation Graph, Map, Hardware Spec, Generate, Bitstream;
inputs: System Constraints, Target Platform.]
`
`XILINX, EX. 1002
`Page 1 of 24
`
`

`
[Drawing sheet 1 of 13. FIG. 1: FPGA 60, showing a grid of CLBs and
PSMs bordered by IOBs, programmable interconnect 65, configuration
port 70, and configuration access port (CAP) 75.]
`
`

`
[Drawing sheet 2 of 13. FIGS. 2, 3 and 4 per the Brief Description.
Legible labels: Configurable Logic, Programmable I/O, Block RAM,
Multiplier; Switch Matrix, four Slices, cin, cout; SLICE containing
two RAM/Shift/LUT blocks and two Reg blocks.]
`
`

`
[Drawing sheet 3 of 13. FIG. 5: Analyzer 26, Mapper 28, FPGA.
Flowchart 30 (FIG. 6 per the Brief Description): Consider Application
Constraints; Select Architecture; Automatically Identify Order and
Dependents in Source Code; Map Out FPGA Using Selected Architecture
and Identified Order + Dependencies. FIG. 7: Source Code, Analyze,
Operation Graph, Map, Hardware Spec, Generate, Bitstream; inputs:
System Constraints, Target Platform.]
`
`

`
`
`

`
[Drawing sheet 5 of 13. FIG. 12. No other legible labels.]
`
`

`
`
`

`
[Drawing sheet 7 of 13. FIG. 13: Master Control, Instructions, Stage 1,
Stage 2, Stage k, Serial Image Input, Serial Image Output, Image
Buffers, Image Combiner, Raster Sub-Array Processor. FIG. 15: Master
Control, Reconfiguration Instructions, Stage 1, Stage 2, Stage k,
Serial Image Input, Serial Image Output, Raster Sub-Array Processor.]
`
`

`
[Drawing sheet 8 of 13. FIG. 14: Serial Image Load & Unload, Master
Control, Instructions To All PEs.]
`
`

`
`
`

`
[Drawing sheet 11 of 13. Legible source-code fragment:
Edge = DCyl(snsr,2) - ECyl(snsr,2); Out = edge - snsr]
`
`

`
[Drawing sheet 12 of 13. Legible label: snsr.]
`
`

`
[Drawing sheet 13 of 13. No legible labels.]
`
`

`
`AUTOMATED SYSTEM FOR DESIGNING
`AND DEVELOPING FIELD
`PROGRAMMABLE GATE ARRAYS
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
This application is a divisional of U.S. patent application Ser. No.
10/441,581 filed May 19, 2003, now U.S. Pat. No. 7,073,158, entitled
“Automated System for Designing and Developing Field Programmable
Gate Arrays”, which is hereby incorporated in its entirety by this
reference.
This application claims the benefit of U.S. provisional patent
application Ser. No. 60/381,295 filed May 17, 2002 entitled “Automated
System for Designing and Developing Field Programmable Gate Arrays”,
which is hereby incorporated in its entirety by this reference.
`
`TECHNICAL FIELD
`
This invention relates in general to systems and methods for
designing, developing and programming field programmable gate arrays
(FPGAs), and in particular to automated systems and methods for
designing, developing and programming FPGAs to implement a
user-written algorithm specified in a high-level language for
processing data vectors with one, two or more dimensions, such as
often are found in image processing and other computationally intense
applications.
`
`BACKGROUND
`
There are known benefits of using FPGAs for embedded machine vision or
other image processing applications. These include processing image
data at high frame rates, converting and mapping the data and
performing image segmentation functions that were all previously
handled by dedicated, proprietary processors. FPGAs are well-known for
having a much greater power to process images, on the order of 10 to
100 times that of conventional advanced microprocessors of comparable
size. This is in part a function of the fully programmed FPGA being
set up as a dedicated circuit designed to perform specific tasks and
essentially nothing else.
Another benefit of FPGAs is their low power consumption and low
weight. FPGAs are very suitable for embedded avionic applications,
in-the-field mobile vision applications and severe-duty applications,
such as mobile vehicles, including those which are off-road, where
severe bumps and jolts are commonplace. These applications are very
demanding in that they have severe space, weight, and power
constraints. Modern FPGAs now have processing capacity on a par with
dedicated application-specific integrated circuits (ASICs), and are or
can be made very rugged.
FPGAs have grown in popularity because they can be programmed to
implement particular logic operations and reprogrammed easily as
opposed to an application specific integrated circuit (hereafter ASIC)
where the functionality is fixed in silicon. But this very generic
nature of FPGAs, deliberately made so they can be used in many
different applications, is also a drawback due to the many
difficulties associated with efficiently and quickly taking a high
level design specified by a user, and translating it into a practical
hardware design that meets all applicable timing, floor plan and power
requirements so that it will run successfully upon the target FPGA. As
is well-known, a high level user-generated design is typically
specified by a sequence of matrix array or mathematic operations,
including local pixel neighborhood operations (such as erosion,
dilation, edge detection, determination of medial axis, etc.) and
other forms of arithmetic or Boolean operations (e.g., addition,
multiplication, accumulation, exclusive-OR, etc.), lookup table and
shift register functions, and other functions like convolution,
autocorrelation, and the like. In order to be able to handle all of
this diverse logic, the individual logic blocks used in the FPGAs are
made to be fairly generic.
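
As an illustration of the local pixel neighborhood operations named above, the following is a minimal software sketch of erosion and dilation over a 3x3 window. It is not taken from the patent: the window size, the edge-clamping rule, and the names `neighborhood_op`, `erode` and `dilate` are assumptions chosen for brevity.

```python
from typing import Callable, List

Image = List[List[int]]


def neighborhood_op(img: Image, reduce_fn: Callable[[List[int]], int]) -> Image:
    """Apply reduce_fn to the 3x3 neighborhood of every pixel.

    Pixels outside the image are replaced by the nearest edge pixel
    (an assumed border rule).
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = reduce_fn(window)
    return out


def erode(img: Image) -> Image:
    """Grayscale erosion: each pixel becomes the minimum of its neighborhood."""
    return neighborhood_op(img, min)


def dilate(img: Image) -> Image:
    """Grayscale dilation: each pixel becomes the maximum of its neighborhood."""
    return neighborhood_op(img, max)
```

Subtracting the eroded image from the dilated image pixel by pixel gives a simple edge map, which is one common way the edge detection mentioned above can be composed from these two primitives.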
The problem in supporting all these applications and functions is how
to design reconfigurable hardware resources that provide the most
effective use of general purpose FPGA silicon for the specific image
processing tasks to which a given FPGA is put to use. FPGAs are by
their very nature general purpose circuits that can be programmed to
perform many different functions, such as digital signal processing
used in wireless communication, encryption and decryption for
communications over the Internet, etc.
One expected benefit of FPGAs, since they are reprogrammable, is that
they would help eliminate the cost/risk of ASIC development. One of
the few things really holding back the larger use of FPGAs in vision
applications has been the difficulty in translating desired
user-defined image processing algorithms into hardware, and the
difficulty of updating those algorithms once they are in hardware. If
there were a development system for the design and programming of
FPGAs that greatly simplified the development of an image processing
algorithm or other sequence of desired operations into the bitstream
coding required to program FPGAs, this might well open up
opportunities for wider use of FPGAs in such applications as medical,
automotive collision avoidance and commercial video.
For example, in the medical area, many medical imaging techniques have
extremely high processing requirements. FPGAs, assuming that they can
be programmed with the desired sequence of complex image processing
steps, should produce smaller, faster and less expensive versions of
existing image processing devices that presently require ASIC devices
be developed. In addition, many new applications will become possible
for the first time, because FPGAs can give speedups of one, two and
even three orders of magnitude over PCs, at a reasonable price.
Automotive vision applications that are on the horizon include
proposals to help enhance driver situational awareness. Possible
automotive vision applications include systems to assist with
lane-changes, to provide backup obstacle warnings, and to provide
forward collision warnings.
Commercial video FPGAs, if they were much easier to design, program
and test, would likely find much wider use in video transcoders,
compression, encryption and standards support, particularly in areas
like MPEG-4. Many video applications are already being done with
FPGAs, but the design, development and testing of such FPGAs is at
present very labor-intensive in terms of designer and engineering
services, which drives up unit costs and slows down the transfer of
proposed designs into actual commercial embodiments.
`
`SUMMARY
`
In light of the foregoing limitations and needs, the present invention
provides an FPGA-based image processing platform architecture that is
capable of dramatically speeding up the development of user-defined
algorithms, such as those found in imaging applications. As a
convenient shorthand reference, since the present invention is
assigned to Pixel Velocity, Inc. of Ann Arbor, Mich. (“PVI”), the
system of the present invention will at times be referred to as the
PVI system, and the methods of the present invention discussed therein
will at times be referred to as the PVI methods.
Generally, the present invention pertains to an automated system for
programming field programmable gate arrays (FPGAs) to implement a
desired algorithm for processing data vectors with one, two or more
dimensions. The PVI system automates the process of determining what
logic components are necessary and produces an optimized placement and
routing of the logic on the FPGA. With this invention, FPGA
programming development work that used to take weeks or months, in
terms of trying to implement and test a previously-created
user-defined algorithm, such as a sequence of steps to be carried out
as part of an image processing application in a machine vision system,
can now be completed in less than one day.
As is well-known, Verilog and VHDL are languages for describing
hardware structures in development systems for writing and programming
FPGAs. In the methods and systems of the present invention, Verilog is
used to develop what PVI refers to as “gateware”, which provides
specific hardware-level interfaces to things like image sensors and
other I/O. The end user invokes this functionality in much the way
predefined library functions are used in software today. The PVI
system focuses solely on the image processing domain. At the
application level, a user’s image processing algorithm is developed
and verified in C++ on a PC. An image class library and overloaded
operators are preferably provided as part of the PVI system of the
present invention to give users a way of expressing algorithms at a
high level. The PVI system uses that high level representation to
infer a “correct-by-construction” FPGA hardware image dataflow
processor automatically.
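
The idea of an image class with overloaded operators that records operations, rather than computing pixels, can be sketched as follows. The patent describes a C++ class library whose API is not given in this text; the sketch below is a Python analogue, and every name in it (`Node`, `sensor`, `dilate`, `erode`, `topo_order`) is hypothetical. The example expression loosely mirrors the fragment legible on drawing sheet 11.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """One operation in a dataflow graph (hypothetical representation)."""
    op: str
    inputs: Tuple["Node", ...] = ()
    param: int = 0

    # Overloaded operators record an operation instead of computing pixels.
    def __sub__(self, other: "Node") -> "Node":
        return Node("sub", (self, other))

    def __add__(self, other: "Node") -> "Node":
        return Node("add", (self, other))


def sensor(name: str = "snsr") -> Node:
    return Node("input:" + name)


def dilate(img: Node, radius: int) -> Node:
    return Node("dilate", (img,), radius)


def erode(img: Node, radius: int) -> Node:
    return Node("erode", (img,), radius)


def topo_order(root: Node) -> List[str]:
    """List each operation once, with every input before its consumer."""
    seen, order = set(), []

    def visit(node: Node) -> None:
        if id(node) in seen:
            return
        seen.add(id(node))
        for inp in node.inputs:
            visit(inp)
        order.append(node.op)

    visit(root)
    return order


snsr = sensor()
edge = dilate(snsr, 2) - erode(snsr, 2)
out = edge - snsr
print(topo_order(out))
```

Because the user's expression is ordinary source code, the same text can be run against a pixel-computing implementation for verification on a PC and against a graph-recording implementation like this one for analysis.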
In the method and systems of the present invention, the dedicated
image processor is derived from the user’s source code and merged with
prebuilt “gateware” automatically, as part of the process of producing
one or more low-level files that may be referred to as
hardware-gate-programming files (or HGP files for short) for
programming the FPGA(s) using known low-level software tools available
from each FPGA manufacturer. The user thus ends up with a machine that
powers up and runs their algorithm on a continuous stream of images. A
key advantage is that algorithm developers can write and verify
algorithms in a familiar and appropriate way, then produce a
“push-button” dedicated machine in only minutes, fabricated to do just
that algorithm. In other words, the PVI system of the present
invention analyzes the imaging algorithm code specified by the end
user, that is the algorithm developer, and, by applying a sequence of
steps, which are further described below, generates a
hardware-gate-programming file composed entirely of conventional
commands and instructions that can be interpreted by low-level FPGA
programming tools to produce bitstreams. These HGP files are used as a
low-level input file containing the code that specifies, to
conventional low-level programming (LLP) software tools available from
the FPGA manufacturer (that is, the bitstream generators used to hard
code the FPGAs), the required connections to be programmed into the
target FPGA. These LLP software tools are capable of reading and
acting upon the commands represented by the HGP files in order to
field-program the FPGA using conventional techniques. The method and
systems of the present invention are preferably arranged to
automatically apply, upon user command, the HGP file output they
produce to these LLP software tools, thus completing the programming
of the FPGA in a fully automatic manner.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
The drawings form an integral part of the description of the preferred
embodiments and are to be read in conjunction therewith. Like
reference numerals designate the same or similar components or
features in the various Figures, where:
FIG. 1 is a simplified block diagram of a known FPGA.
FIGS. 2, 3 and 4 are further simplified block diagrams illustrating a
known style of FPGA, where FIG. 2 shows the overall layout of the
FPGA, and also shows one of its specific sections enlarged to reveal
the arrangement details of CLBs, block RAM and multiplier logic
therein,
FIG. 3 is an enlargement of a single CLB unit showing its switch
matrix and its associated slices, which contain still further units of
configurable logic therein, and
FIG. 4 is an enlarged view of one of the slices, showing its RAM,
registers, shift registers and lookup tables, all of which are
programmable.
FIG. 5 is a simplified block diagram showing the sequence of
operations used by the system and methods of the present invention,
starting with a user-defined algorithm on the left, whose content is
entered into an analyzer module, whose output in turn is entered into
a mapper module, whose output is a low level source code that can be
used to program an FPGA.
FIG. 6 is a flowchart illustrating the overall method of the present
invention.
FIG. 7 is another simplified block diagram like that shown in FIG. 5
which represents the major steps utilized in methods of the present
invention.
FIG. 8 is a simplified layout showing a preferred serpentine
arrangement for a succession of image processing operations which have
been mapped onto a portion of the overall FPGA shown in FIG. 2.
FIG. 9 is a more detailed view of the simplified layout of FIG. 8
showing how the individual operations of the user-defined sequence may
be mapped onto CLBs typically between two separate sections of RAM
which are used as delay lines in order to ensure that proper timing is
maintained between partial products of the image processing sequence.
FIG. 10 is a simplified perspective view of a presently preferred
arrangement of printed circuit boards (PCBs), called a multi-processor
stack, wherein each of the PCBs preferably contains at least one FPGA,
and also may typically have associated therewith driver circuits,
input/output circuits, power circuits and the like in order to ensure
proper operation of the FPGA, and also has in-line connectors
represented by the elongated blocks for interconnecting the PCBs
together, and for receiving input/output signals at the top and bottom
of the stack, and also showing, on the top PCB, an image sensor and a
miniature focusing lens in the center of the top board.
FIG. 11 is a block diagram showing the interrelationship and wiring
connections between the four PCBs in the stack of FIG. 10, which
illustrates the signal flow paths between the individual PCBs and also
illustrates a workstation being connected to the microcontroller PCB,
which workstation passes the bitstream from the low level programming
tool located on the workstation to the FPGA/program
flash/RAM microcontroller, which thereafter handles the loading of the
bitstream after power up to the individual FPGAs.
FIG. 12 is a simplified perspective view of a digital camera with its
generally rectangular enclosure, having an external lens on its left
surface, which external lens is used to project a visual image onto
the image sensor located on the top PCB of the FIG. 10 stack shown
located within the camera enclosure.
FIG. 13 is a simplified block diagram of a first possible target
architecture for the system of the present invention, namely a
multi-pipeline raster sub-array.
FIG. 14 is a simplified block diagram of a second possible target
architecture for the system of the present invention, namely a
parallel array processor.
FIG. 15 is a simplified block diagram of a third possible target
architecture for the system of the present invention, namely a
pipeline raster sub-array processor.
FIG. 16 is a more detailed diagram showing some of the details of the
FIG. 15 target architecture.
FIG. 17 illustrates on the bottom half thereof a Sobel operation
dataflow produced by the analyzer module of the system of the present
invention, and on the top half thereof illustrates the mapping of that
Sobel operation dataflow onto a multi-pipeline sub-array processor.
FIG. 18 is an illustration of high-level source code, defined by an
end user, and its translation into an associated operation dataflow
diagram.
FIG. 19 is an illustration of the simplification of the FIG. 18
operation dataflow diagram by the removal of unnecessary operations.
FIG. 20 is an illustration of pipeline compensation being added to the
resulting product in FIG. 19 in order to equalize the timing between
alternate data paths.
FIG. 21 is an illustration of operator elaboration modifying the graph
when the operator is built from more than one primitive component, as
would be carried out by the mapper when presented with an image
processing sequence of the type shown on the left side of FIG. 21.
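
The pipeline compensation described for FIG. 20 can be sketched in software as a pass over the operation graph that finds, for each operation, how many delay stages the earlier-arriving inputs need so that all inputs line up. This is a minimal sketch under stated assumptions: the graph encoding, the function name `compensate`, and the latency values are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

# name -> (latency in clock cycles, list of input names).
# Entries must be listed with inputs before their consumers.
Graph = Dict[str, Tuple[int, List[str]]]


def compensate(graph: Graph):
    """Return (arrival, delays).

    arrival[name] is the cycle at which that operation's result is valid.
    delays[(src, dst)] is the number of delay stages to add on that edge
    so that every input of dst arrives on the same cycle.
    """
    arrival: Dict[str, int] = {}
    delays: Dict[Tuple[str, str], int] = {}
    for name, (latency, inputs) in graph.items():
        latest = max((arrival[i] for i in inputs), default=0)
        for i in inputs:
            if arrival[i] < latest:
                delays[(i, name)] = latest - arrival[i]
        arrival[name] = latest + latency
    return arrival, delays


# Hypothetical latencies for an edge-detection style graph.
graph: Graph = {
    "snsr": (0, []),
    "dilate": (2, ["snsr"]),
    "erode": (2, ["snsr"]),
    "edge": (1, ["dilate", "erode"]),
    "out": (1, ["edge", "snsr"]),
}
arrival, delays = compensate(graph)
```

In this example the raw sensor path reaches the final subtraction three cycles before the processed path, so the only compensation needed is a three-stage delay on the sensor input of `out`.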
`
`DETAILED DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
The present invention is illustrated and described herein in
connection with preferred embodiments, with the understanding that the
present disclosure is to be considered as an exemplification of the
principles of the invention and the associated functional
specifications required for its implementation. However, it should be
appreciated that the systems and methods of the present invention may
be implemented in still different configurations and forms, and that
other variations within the scope of the present invention are
possible based on the teachings herein.
Prior to discussing the embodiments of the present invention, it is
useful to look more closely at some of the known characteristics of
existing design, development and programming systems used to provide
hardware programming bitstreams to program FPGAs. Typically, such
design and development systems are implemented on workstations
operating under any suitable operating system, such as UNIX, Windows,
Macintosh or Linux. Such development systems typically will have
suitable applications software such as the ISE development system from
Xilinx, and C++ or Java programming compilers, to allow programs
written by users to run thereon.
Due to advancing semiconductor processing technology, integrated
circuits have greatly increased in functionality and complexity. For
example, programmable devices such as field programmable gate arrays
(FPGAs) and programmable logic devices (PLDs), can incorporate
ever-increasing numbers of functional blocks and more flexible
interconnect structures to provide greater functionality and
flexibility.
`
A typical FPGA comprises a large plurality of configurable logic
blocks (CLBs) surrounded by input-output blocks and interconnectable
through a routing structure. The first FPGA is described in U.S.
Reissue Pat. No. Re. 34,363 to Freeman, and is incorporated herein by
reference. The CLBs and routing structure of the FPGA are arranged in
an array or in a plurality of sub-arrays wherein respective CLBs and
associated portions of the routing structure are placed edge to edge
in what is commonly referred to as a tiled arrangement. Such a tiled
arrangement is described in U.S. Pat. No. 5,682,107 to Tavana et al.,
the disclosure of which is hereby incorporated by reference herein.
The CLB portion of a tile comprises a plurality of primitive cells
which may be interconnected in a variety of ways to perform a desired
logic function. For example, a CLB may comprise a plurality of lookup
tables (LUTs), multiplexers and registers. As used herein, the term
“primitive cell” normally means the lowest level of user accessible
component.
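
To make the lookup table (LUT) primitive concrete, the following minimal sketch models a LUT as a stored truth table addressed by its input bits. The four-input width and the names `make_lut` and `lut_read` are assumptions for illustration; the text above does not specify a LUT size.

```python
from typing import Callable, List


def make_lut(func: Callable[..., int], n_inputs: int = 4) -> List[int]:
    """Precompute the truth table of an n-input Boolean function.

    Entry k of the table holds the output for the input combination
    whose bits, least significant first, spell the integer k.
    """
    table = []
    for addr in range(2 ** n_inputs):
        bits = [(addr >> k) & 1 for k in range(n_inputs)]
        table.append(int(bool(func(*bits))))
    return table


def lut_read(table: List[int], *bits: int) -> int:
    """Evaluate the function by using the input bits as a table address."""
    addr = sum(bit << k for k, bit in enumerate(bits))
    return table[addr]


# Two different functions realized by the same structure with different contents.
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
and_or = make_lut(lambda a, b, c, d: (a and b) or (c and d))
```

The point of the sketch is that the logic function is determined entirely by the table contents, which is why loading different configuration data into the same cell yields different logic.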
FIG. 1 is a simplified schematic diagram of a conventional FPGA 60.
FPGA 60 includes user logic circuits such as input/output blocks
(IOBs), configurable logic blocks (CLBs), and programmable
interconnect 65, which contains programmable switch matrices (PSMs).
Each IOB and CLB can be configured through configuration port 70 to
perform a variety of functions. Programmable interconnect 65 can be
configured to provide electrical connections between the various CLBs
and IOBs by configuring the PSMs and other programmable
interconnection points (PIPs, not shown) through configuration port
70. Typically, the IOBs can be configured to drive output signals or
to receive input signals from various pins (not shown) of FPGA 60.
FPGA 60 also includes dedicated internal logic. Dedicated internal
logic performs specific functions and can only be minimally configured
by a user. For example, configuration port 70 is one example of
dedicated internal logic. Other examples may include dedicated clock
nets (not shown), power distribution grids (not shown), and boundary
scan logic (i.e. IEEE Boundary Scan Standard 1149.1, not shown).
FPGA 60 is illustrated with 16 CLBs, 16 IOBs, and 9 PSMs for clarity
only. Actual FPGAs may contain thousands of CLBs, thousands of IOBs,
and thousands of PSMs. The ratio of the number of CLBs, IOBs, and PSMs
can also vary.
FPGA 60 also includes dedicated configuration logic circuits to
program the user logic circuits. Specifically, each CLB, IOB, PSM, and
PIP contains a configuration memory (not shown) which must be
configured before each CLB, IOB, PSM, or PIP can perform a specified
function. Typically the configuration memories within an FPGA use
static random access memory (SRAM) cells. The configuration memories
of FPGA 60 are connected by a configuration structure (not shown) to
configuration port 70 through a configuration access port (CAP) 75. A
configuration port (a set of pins used during the configuration
process) provides an interface for external configuration devices to
program the FPGA. The configuration memories are typically arranged in
rows and columns. The columns are loaded from a frame register which
is in turn sequentially loaded from one or more sequential bitstreams.
(The frame register is part of the configuration structure referenced
above.) In FPGA 60, configuration access port 75 is essentially a bus
access point that provides access from configuration port 70 to the
configuration structure of FPGA 60.
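
The column-by-column loading of configuration memory through a frame register can be modeled in a few lines. This is a deliberately simplified sketch: it assumes one bit per memory cell, a frame that is exactly one column tall, and no headers, addressing or error checking, none of which is stated in the text above. The function name `load_configuration` is hypothetical.

```python
from typing import List, Sequence


def load_configuration(bitstream: Sequence[int], n_rows: int, n_cols: int) -> List[List[int]]:
    """Shift a serial bitstream into a frame register, one column at a time.

    Each time the frame register holds n_rows bits, its contents are
    written into the next column of the configuration memory.
    """
    if len(bitstream) != n_rows * n_cols:
        raise ValueError("bitstream length must equal n_rows * n_cols")
    memory = [[0] * n_cols for _ in range(n_rows)]
    frame: List[int] = []
    col = 0
    for bit in bitstream:
        frame.append(bit)              # sequential load of the frame register
        if len(frame) == n_rows:       # frame register full: write one column
            for row, value in enumerate(frame):
                memory[row][col] = value
            frame = []
            col += 1
    return memory
```

For instance, with two rows and three columns, the six-bit stream 1,0,0,1,1,1 fills the first column with 1,0, the second with 0,1 and the third with 1,1.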
FIG. 1A illustrates a conventional method used to configure FPGA 60.
Specifically, FPGA 60 is coupled to a configuration device 230 such as
a serial programmable read only memory (SPROM), an electrically
programmable read only memory (EPROM), or a microprocessor.
Configuration port 70 receives configuration data, usually in the form
of a configuration bitstream, from configuration device 230.
Typically, configuration port 70 contains a set of mode pins, a clock
pin and a configuration data input pin. Configuration data from
configuration device 230 is transferred serially to FPGA 60 through
the configuration data input pin. In some embodiments of FPGA 60,
configuration port 70 comprises a set of configuration data input pins
to increase the data transfer rate between configuration device 230
and FPGA 60 by transferring data in parallel. However, due to the
limited number of dedicated function pins available on an FPGA,
configuration port 70 usually has no more than eight configuration
data input pins. Further, some FPGAs allow configuration through a
boundary scan chain. Specific examples for configuring various FPGAs
can be found on pages 4-46 to 4-59 of “The Programmable Logic Data
Book”, published in January, 1998 by Xilinx, Inc., and available from
Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which pages
are incorporated herein by reference. Additional methods to program
FPGAs are described in U.S. Pat. No. 6,028,445 to Lawman issued Feb.
22, 2000, assigned to Xilinx, Inc. and entitled “Decoder Structure and
Method for FPGA Configuration,” the disclosure of which is hereby
incorporated by reference herein.
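
The effect of serial versus parallel configuration data pins on transfer time can be shown with a short calculation. This sketch assumes one bit per data pin per configuration clock cycle and ignores any protocol overhead; the function name and the bitstream size used in the usage note are hypothetical.

```python
import math


def config_clock_cycles(bitstream_bits: int, data_pins: int = 1) -> int:
    """Clock cycles needed to transfer a bitstream over the data input pins,
    assuming each pin carries one bit per cycle (an idealization)."""
    if bitstream_bits < 0 or data_pins < 1:
        raise ValueError("need a non-negative bit count and at least one data pin")
    return math.ceil(bitstream_bits / data_pins)
```

Under these assumptions a 1,000,000-bit bitstream takes 1,000,000 cycles over a single data pin and 125,000 cycles over eight pins, which is the upper end of the pin count mentioned above.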
U.S. Pat. No. 6,086,629 to McGettigan et al. issued Jul. 11, 2000, is
entitled “Method for Design Implementation of Routing in an FPGA Using
Placement Directives Such as Local Outputs and Virtual Buffers” (the
’629 patent), and is assigned to Xilinx, Inc. As explained therein,
when an FPGA comprises thousands of CLBs in large arrays of tiles, the
task of establishing the required multitude of interconnections
between primitive cells inside a CLB and between the CLBs becomes so
onerous that it requires software tool implementation. Accordingly,
the manufacturers of FPGAs including Xilinx, Inc., have developed
place and route software tools which may be used by their customers to
implement their respective designs. Place and route tools not only
provide the means of implementing users’ designs, but can also provide
an accurate and final analysis of static timing and dynamic power
consumption for an implemented design scheme. In fact, better place
and route software provides iterative processes to minimize timing and
power consumption as a final design implementation is approached.
Iterative steps are usually necessary to reach a final design
primarily because of the unknown impact of the placement step on
routing resources (wires and connectors) available to interconnect the
logic of a user’s design. Iterative place and route procedures can be
time consuming. A typical design implementation procedure can take
many hours of computer time using conventional place and route
software tools. Thus, as previously noted, there is an ongoing need to
provide a method for reducing design implementation time by increasing
the accuracy of static timing and dynamic power analysis during
computer-aided design procedures for FPGAs. The ’629 patent addresses
these issues of accuracy of static timing and dynamic power analyses.
However, it does not provide a streamlined method for translating
user-created algorithms into bitstreams.
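
The iterative character of placement can be illustrated with a toy improvement loop. This is not the algorithm of the ’629 patent or of any vendor tool: it is a greedy pairwise-swap heuristic that uses total Manhattan wirelength as a stand-in for routing cost, and all names and the example data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Site = Tuple[int, int]
Net = Tuple[str, str]


def wirelength(placement: Dict[str, Site], nets: List[Net]) -> int:
    """Total Manhattan distance over all two-pin nets."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in nets
    )


def improve_placement(placement: Dict[str, Site], nets: List[Net]):
    """Repeat passes of pairwise swaps, keeping only swaps that shorten wirelength.

    Stops after the first pass in which no swap helps. Returns the improved
    placement and the number of passes made.
    """
    placement = dict(placement)
    passes = 0
    improved = True
    while improved:
        improved = False
        passes += 1
        for a, b in combinations(sorted(placement), 2):
            before = wirelength(placement, nets)
            placement[a], placement[b] = placement[b], placement[a]
            if wirelength(placement, nets) < before:
                improved = True
            else:
                # Undo a swap that did not help.
                placement[a], placement[b] = placement[b], placement[a]
    return placement, passes


# Four cells in a chain, initially placed out of order along one row.
start = {"A": (0, 0), "B": (3, 0), "C": (1, 0), "D": (2, 0)}
nets = [("A", "B"), ("B", "C"), ("C", "D")]
final, passes = improve_placement(start, nets)
```

Even in this toy setting the cost of a placement is only known after it is evaluated, so several passes may be needed; real tools face the same situation with far more expensive evaluation, which is why the procedure described above can take hours.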
The ’629 patent also explains that the challenge presented to software
tools used to place a user’s design into a coarse-grained FPGA is to
make optimum use of the features other than lookup tables and
registers that are available in the FPGA architecture. These can
include fast carry chains, XOR gates for generating sums, multiplexers
for generating five-input functions, and possibly other features
available in the architecture. In order to achieve maximum density and
maximum performance of user logic in an FPGA, the software must make
use of these dedicated features where possible. The ’629 patent also
states that there is a need to densely pack the user’s design into the
architecture that will implement the design.
The ’629 patent also discusses that it is well-known to specify or
provide library elements which reflect features of the FPGA
architecture in the typical development system provided to end-users.
Several architectural features and associated timing and power
parameters can be represented by variable parameters for one library
element. For example, a lookup table library element has one variation
in which the lookup table output signal is applied to a routing line
external to the configurable logic block (CLB), and another variation
in which the lookup table output signal is applied to another internal
element of the CLB such as a five-input function multiplexer or a
carry chain control input. These two variations have different timing
parameters associated with them because the time delay for driving an
element internal to the CLB is less than the time delay for driving an
interconnect line external to the CLB.
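
One way to picture a library element with variations is as a single record whose timing depends on a parameter. In the sketch below, the class name, field names and delay figures are all hypothetical placeholders; only the relationship between them (internal drive is faster than external drive) comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LutLibraryElement:
    """A lookup table library element with two output variations."""
    drives_external_route: bool
    internal_delay_ns: float = 0.3   # hypothetical value, for illustration only
    external_delay_ns: float = 0.9   # hypothetical value, for illustration only

    @property
    def output_delay_ns(self) -> float:
        """Delay to use in timing analysis for the selected variation."""
        if self.drives_external_route:
            return self.external_delay_ns
        return self.internal_delay_ns


to_carry_chain = LutLibraryElement(drives_external_route=False)
to_routing_line = LutLibraryElement(drives_external_route=True)
```

A timing analysis that picks the delay from the selected variation, rather than from a single worst-case figure, is what allows the analysis to be accurate before routing is complete.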
If the FPGA user is using VHDL or schematic capture for design entry,
the VHDL or schematic capture design entry tool will auto-select the
library elements, but the user must still control the design entry
tool so it selects and connects the library elements properly.
Alternatively, the user may design at a higher level using macros that
incorporate the library elements. These macros will have b
