`Ace ad
`
`111111111111111111111111111111111111111111111111111111111111111111111111111
`US005982937 A
`[11] Patent Number:
`[45] Date of Patent:
`
`5,982,937
`Nov. 9,1999
`
`[54] APPARATUS AND METHOD FOR HYBRID
`COMPRESSION OF RASTER DATA
`
`06153172
`9615620
`
`Japan .
`5/1994
`5/1996 WIPO .
`
`[75]
`
`Inventor: Yigal Accad, Millbrae, Calif.
`
`[73] Assignee: Electronics for Imaging, Inc., Foster
`City, Calif.
`
`[21] Appl. No.: 08/773,656
`
`[22] Filed:
`
`Dec. 24, 1996
`
`[51]
`[52]
`[58]
`
`[56]
`
`Int. Cl.6
`....................................................... G06T 9/00
`U.S. Cl. ............................................. 382/239; 382/166
`Field of Search ..................................... 358/430, 539;
`382/166, 239, 243
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`9/1992 Miyata .................. ... ......... .......... 382/9
`5,151,949
`5,479,587 12/1995 Campbell et a!.
`...................... 395/116
`5,796,864
`8/1998 Callahan ................................. 382/166
`
`FOREIGN PATENT DOCUMENTS
`
`0286286 10/1988
`8/1993
`0597571
`0691784
`1!1996
`0712088
`5/1996
`
`European Pat. Off ..
`European Pat. Off ..
`European Pat. Off ..
`European Pat. Off ..
`
`OTHER PUBLICATIONS
`
`Search Report mailed on Apr. 15, 1998.
`
`Primary Examiner-Thomas D Lee
`Assistant Examiner-Stephen Brinich
`Attorney, Agent, or Firm-Majestic, Parsons, Siebert &
`Hsue P.C.
`
`[57]
`
`ABSTRACT
`
`From a raster page, patches of connected pixels of the same
`color are identified. Patches of at least a predetermined
`sized, typically corresponding to text or line art objects, are
`subjected to a lossless compression. Patches below the
`predetermined size, typically corresponding to image or
`photo objects, are substantially subjected to a lossy com(cid:173)
`pression. The patch predetermined size controls the mix of
`lossless and lossy compression procedures. Optimum com(cid:173)
`pression is achieved by maximizing the lossless compres(cid:173)
`sion while attaining a targeted compression ratio. Various
`features include efficient recognition and encoding of
`patches, refined treatment of the boundaries between the
`lossless- and the lossy-compressed pixels, adaptive com(cid:173)
`pression ratio control, and fail-safe compression provisions.
`
`60 Claims, 8 Drawing Sheets
`
`Rasterized Data
`
`Size>= D1
`(Type 1 Patch)
`
`Patch Recognizer
`
`Patch Type Discriminator
`
`130
`
`Size< D1
`(Type 2 Patch)
`
`160
`
`210
`
`First
`Compressed
`Layer
`
`:
`200
`'--.J
`I
`:
`1
`1
`I
`I
`I
`I
`I
`I
`I
`I
`L ____ -------------- _____ I
`300 ------ --------------- ------
`~ .----~,....-----,
`
`Second
`Compressed
`Layer
`
`I
`I
`:
`
`320
`
`To PRINT ENGINE
`
`Vedanti Systems Limited - Ex. 2017
`Page 1
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 1 of 8
`
`5,982,937
`
`10
`
`20
`
`HOST
`or
`WORKSTATION
`
`Page
`1----------1~ Description
`File
`
`60
`)
`
`Microprocessor} J
`
`62
`
`ROM
`
`~- 64
`
`RAM
`
`f-.. 66
`
`70
`
`ENGINE
`
`Printed
`Doc-
`ument
`
`80
`
`Comp(cid:173)
`ressed
`Page
`Buffer
`
`FIG. 1
`
`Vedanti Systems Limited - Ex. 2017
`Page 2
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 2 of 8
`
`5,982,937
`
`92
`
`StyleA
`
`FIG. 2
`
`90
`
`94
`
`96
`
`98
`
`410
`
`94
`
`96
`
`412
`
`Style A
`
`FIG. 4a
`
`I I I I I I I
`
`~432
`
`I I I I I I I I I I I I ~--- 440
`20
`r-.4
`
`92
`
`.........
`430
`
`1-1"-...
`1'\
`
`.
`
`v- -I'.
`. ll.
`
`I
`
`\
`
`~
`
`444
`446-
`
`I
`
`v
`
`'-'-
`
`,., 96
`
`-
`
`-.....9
`8
`
`FIG. 4b
`
`Vedanti Systems Limited - Ex. 2017
`Page 3
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 3 of 8
`
`5,982,937
`
`50
`Rasterized Data
`'
`100
`,.
`__ :J _____________ _
`
`110
`
`Patch Recognizer
`
`Patch Type Discriminator
`
`Size>= 01
`(Type 1 Patch)
`
`Size< D1
`(Type 2 Patch)
`
`t2( i, j )
`
`p 1 ( i, j )
`First
`Compressor
`
`150
`
`p2( i ,j ) blocks
`Second
`Compressor
`
`160
`
`First
`Compressed
`Layer
`
`Second
`Compressed
`Layer
`
`1
`I
`
`Second
`Decompressor
`
`320
`
`:
`200
`\._j
`I
`I
`I
`I
`I
`I
`I
`I
`I
`300
`- _-_-_-_-_-
`~ First
`Decompressor
`310 1 312
`p1 ( i, j )
`:
`I
`,.----L----,
`I
`1340
`I
`L _____ - - - - - - - - - - - - - - - - - - - - - -
`
`330
`
`To PRINT ENGINE
`
`FIG. 3
`
`Vedanti Systems Limited - Ex. 2017
`Page 4
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 4 of 8
`
`5,982,937
`
`Rasterized Data
`,,
`100~~-- ------------ ~- ~2- -------------,
`110
`
`r 134
`Patch Type Discriminator 132
`Type 2 r--------'---....,
`p2(i, D..
`In-Block Comparator:
`Patch Size
`Comparator:
`...
`BlockP2Count
`PatchPixeiCount >= 01 ? No
`>= BlockP2MinCount?
`1 No
`Yes
`Yes
`.
`treat as 1f Type 1
`
`First-Pass
`Encoded
`Type 1 p1( i, j)
`
`t2( i, j )
`
`p2( i ,j ) blocks
`
`First
`Compressor
`::
`'-..../ 160
`- Second Pass y
`Encoded p2 blocks
`Encoded p1 & t2
`1
`L--------- ---------------~-----------~
`
`Second
`"'
`Compressor
`
`150'
`
`1
`
`:
`
`~
`174
`
`Q_factor
`
`( 170
`
`1
`
`210
`~
`
`First
`Compressed
`Layer
`
`Second
`Compressed
`Layer
`
`:
`
`~---- ---------------~----,
`, r
`:
`'
`200
`220
`'-.J
`' -
`1
`I
`1
`1
`I
`1
`I
`
`Adaptive
`1
`I
`1 Cft Compression
`Ratio
`...
`(
`1
`I
`Controller
`1
`I 176~-~~--~
`t ; 172
`t-"
`Compression
`Parameters
`
`~ ---- -------------- . -----:
`300 - - - - - - - - - - - - - - - - - - - - - -
`- - - - - -
`'
`~
`First
`Decompressor
`310: 312-
`I
`:
`p1(i,j)
`I
`:
`330
`OR
`:::
`: AND
`340
`I
`L----------------------------
`54 __. r To PRINT ENGINE
`
`t-i"'-..
`1 :
`'
`320
`
`1
`
`Second
`Decompressor
`
`314./ t2(i,j)~ ,p2'(i,j)
`
`FIG. 5
`
`Vedanti Systems Limited - Ex. 2017
`Page 5
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 5 of 8
`
`5,982,937
`
`RUN LENGTH CODING
`
`TYPE-
`CODE
`(4 bit)
`0
`
`1
`
`2
`
`3
`
`4
`
`5
`
`6
`
`7
`
`8
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`+PARAMETER Total
`Bits
`
`FUNCTION
`
`RUN LENGTH
`CONFIGURATION
`
`32 bit color
`
`36
`
`p(i, j) = arbitrary color
`
`4 bit index
`
`NONE
`(run length r = 1)
`4 or 20 bits
`run length r = R
`NONE
`(run length r = 1)
`4 or 20 bits
`run length r = R
`NONE
`(run length r = 1)
`4 or 20 bits
`run length r = R
`NONE
`(run length r = 1)
`4 or 20 bits
`run length r = R
`
`8
`
`4
`
`p( i, j ) = palette color
`
`p( i, j ) = p( i, j - 1)
`
`8-24 p( i, k ) = p( i, j - 1)
`j <= k <j + R
`p( i, j ) = p( i - 1' j )
`
`4
`
`8-24 p( i, k ) = p( i - 1' k)
`j < k <j + R
`p( i, j ) = p( i - 1' j - 1)
`
`4
`
`8-24 p( i, k ) = p( i - 1' j -1)
`j <- k < j + R
`p( i, j ) = p( i - 1' j + 1)
`
`4
`
`8-24 p( i, k ) = p( i - 1, j + R)
`j <= k < j + R
`
`Reserved
`
`D
`
`D
`
`~
`lit I I .. - -· .- .;_"]]
`R
`
`~
`~:~~~:~
`----
`
`~lib
`~-t I
`I ! ·:-.::-.;"I]
`.
`. -.:
`ffiJ
`rr· - . -. -rrfil¥
`----
`
`.
`
`4 or 20 bits
`run length r = R
`
`8-24 p( i, k) = p( i -1, j)
`j <- k <j + R
`
`.fl1-n-. -. -. TI
`----
`
`Reserved
`
`4 or 20 bits
`run length r R
`
`NONE
`(run length r = 1)
`
`4 or 20 bits
`run length r = R
`
`8-24 p( i, k) = p( i- 1, j + R -1)
`j < k <j + R
`p( i, j ) = transparent
`= t2( i, j )
`
`4
`
`I I I I ·_:-.:_:-.;_~.
`
`D
`
`8-24 p( i, k ) = transparent
`= t( i, k )
`j <= k < j + R
`
`ITL-.~-~.:.TI
`
`R
`
`FIG. 6
`
`Vedanti Systems Limited - Ex. 2017
`Page 6
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 6 of 8
`
`5,982,937
`
`r- 510
`r- 520
`
`START
`
`BEGIN NEW PAGE (NxM)
`Initialize i=O
`Striplast=No-lnk
`GET NEW StripCurrent (8 rows of r-530
`...._
`pixels) and Initialize i5=0
`
`...
`
`Begin next row in StripCurrent
`row count in page: i=i+1
`row count in Strip: i5=i5+1
`
`Start with first column
`Set j=1
`
`r - 532
`
`l--- 534
`
`r--+
`
`No
`.....__
`No
`
`r-
`570
`
`No
`
`PATCH RECOGNITION
`Find next row patch by recognizing ~ 536
`a new run along row i5
`and possibly connect it to adjacent
`same-color pixels to the top or left
`of the run (see Figs. 8a & 8b)
`
`538
`Beyond last column: j>N?
`End of StripCurrent: i5 =8 ? v-- 539
`r
`DISCRIMINATE TYPE 1 or 2
`54o
`patches in Striplast
`v- 550
`
`COMPRESS & STORE
`Types 1 and 2 patches from
`Striplast appropriately
`
`ADJUST COMPRESSION RATIO:
`
`(CR) and adjust compression
`parameter (Q_factor) for next strip
`
`Process the last strip of the current
`page
`
`Discriminate Type 1 or 2 patches
`in the last strip
`
`Compress & store Types 1 and 2
`
`580
`
`582
`
`Compute to-date compression ratio v- 560
`572
`L
`END OF PAGE: i>=M ? A Striplast=StripCurrent I
`v
`v
`584
`patches in the last strip accordingly v
`v- 590
`v- 592
`
`END of document ?
`
`END
`
`FIG. 7
`
`Vedanti Systems Limited - Ex. 2017
`Page 7
`
`
`
`U.S. Patent
`
`Nov. 9,1999
`
`Sheet 7 of 8
`
`5,982,937
`
`.....
`
`r0=1
`j+r0<=N?
`p(i,j+r 0)=p(i,j) ? ~ r 0=r0 +1 I
`~0
`R=r0
`
`I
`
`600
`
`610
`
`No
`,_.
`
`41
`
`No
`
`No
`,_.
`
`41
`
`1lr
`p(i,j)=p(i-1,j)?
`
`r=1
`
`j+r<=N?
`
`.....
`
`I
`
`p(i,j+r}=p(i-1,j+r}? ~ r=r+1
`~0
`r>=r0 ?
`
`I
`
`_ ...
`i...
`
`_ ...
`
`__.._
`
`'
`
`Yes.l.
`
`.l.No
`
`C=2
`R=r
`REF=Var
`
`C=5
`R=r0
`REF=p(i-1,j)
`
`+
`
`p(i,j)=p(i,j-1)?
`
`No
`
`C=1
`R=r0
`REF=p(i,j-1)
`I
`
`+
`
`No
`
`p(i,j)=p(i-1,j-1) ?
`
`C=3
`R=r0
`REF=p(i-1,j-1)
`I
`
`I
`
`FIG. Ba
`
`Vedanti Systems Limited - Ex. 2017
`Page 8
`
`
`
`U.S. Patent
`
`Sheet 8 of 8
`
`5,982,937
`
`.
`.
`.
`
`No
`
`No
`
`Nov. 9,1999
`!
`j+r0<=N?
`p(i,j)=p(i-1 ,j+r0)?
`C=4
`R=r0
`REF=p(i-1 ,j+r0 )
`r
`
`670
`
`680
`
`..
`r=r0
`.. p(i,j)=p(i-1 ,j+r-1)?
`..
`.No
`r=r-1
`
`C=6
`~
`R=r
`REF=p(i-1 ,j+r -1)
`
`f-+
`
`I
`
`r No
`r<=1 ?
`Yesl__
`
`,
`p(i,j+r)=palette color ?
`
`t Yes
`Code=O I I Code=1
`
`No
`
`R=1
`dependents=O
`
`,
`R=1?
`
`t Yes
`I Code=2C I I Code=2C+1 I
`
`No
`
`660
`
`(1) Correct link to
`REF (if needed)
`(2) Update number
`of dependents
`
`• j=j+R
`
`Yes
`
`.No
`
`R=r0 ?
`lNo
`r0 -R=1?
`+ves
`I Code=2
`I Code=3
`R=r0-R
`REF=p(i,j-1)
`( 1) Correct link to
`REF (if needed)
`(2) Update number
`of dependents
`
`j=j+R
`...
`~.
`
`FIG. Bb
`
`Vedanti Systems Limited - Ex. 2017
`Page 9
`
`
`
`5,982,937
`
`1
`APPARATUS AND METHOD FOR HYBRID
`COMPRESSION OF RASTER DATA
`
`BACKGROUND OF THE INVENTION
`
`This invention relates generally to compression and
`decompression of data, and more particularly to determining
`different types of structures that may exist in rasterized data
`and selectively applying appropriate compression schemes
`thereto.
`In a display-oriented environment, pictorial data is pre(cid:173)
`sented in a two-dimensional page representation. A page is
`typically composed by a user on a workstation with the aid
`of a desktop publishing application. The page may contain
`text, line art (also called "graphic") and image (e.g., photo)
`objects and is usually output by the desktop publishing 15
`application in the form of a page description file as specified
`by a page description language (PDL). Before a page can be
`rendered by a rendering device such as a printer or a display
`screen, the data must be presented to the rendering device in
`the form of a rasterized page. The conversion to a rasterized
`form is accomplished by a PDL interpreter specific to the
`PDL used.
`A rasterized page is a digital representation of a page by
`means of a two-dimensional array of pixels, with each pixel
`assuming a particular color. The color has a range depending
`on the number of bits assigned to each pixel, with a larger
`number of bits producing a higher color resolution (color
`depth). In printer applications, it is expedient to classify the
`colors into four components corresponding to four basic
`inks: cyan (C), magenta (M), yellow (Y), and black (K). For
`example, commercial applications typically has a color
`resolution obtained from using 8 bits (byte) of storage
`assigned to each color component so that each pixel has 4
`bytes associated with it. This will produce approximately 4
`billion ink combinations.
`Printers, particularly laser printers, typically have a print
`engine that prints at a constant rate. Raster data must be fed
`to the print engine at a rate commensurate with the output
`rate or else a printer overrun error will occur. At the very
`least, the print engine can not be made to wait for raster data
`in the course of outputting a page. Thus, to accommodate the
`incompatibility between input data rate and print engine
`output rate, a print buffer (also referred to as a frame buffer)
`is employed to accommodate at least one rasterized page at 45
`a time.
`The two-dimensional nature of a rasterized page results in
`the memory needed to store the page increasing as the square
`of the resolution and/or the product of the linear dimensions
`of the page. For example, for a modest printer resolution,
`such as 400 dpi (dots per inches) (i.e., 157 dots per em) as
`applied to a page 8.5 inches by 11 inches (i.e., 21.6x27.9 em)
`in size, the memory required for a page amounts to as much
`as 60 Mbytes (megabytes). With the high cost of memory,
`this amount of memory could easily cost more than the sum
`of all other parts of a laser printer, and would not be
`commercially or economically viable.
`One common solution to minimize the size of the print
`buffer is to compress the raster data before storing in it. Once
`one or more pages of compressed raster data have been
`stored, they can be decompressed at a controlled rate appro(cid:173)
`priate for the print engine.
`U.S. Pat. No. 5,479,587 discloses a print buffer minimi(cid:173)
`zation method in which the raster data is compressed by
`trying different compression procedures with increasing
`compression ratios until the raster data is compressed suf(cid:173)
`ficiently to fit in a given print buffer. Each time, a compres-
`
`2
`sian procedure with a higher compression ratio is selected
`from a predefined repertoire of such procedures, ranging
`from lossless ones such as run-length encoding to lossy
`ones. Generally, lossless encoding is efficient on text and
`line art data while lossy encoding is effective on image data.
`However, this method may produce poor print quality when
`the nature of the raster page calls for lossy compression in
`order to achieve a predetermined compression ratio. This is
`because only one of the selected compression procedure is
`10 summarily applied across each strip of the page and when
`the strip contains both image data as well as text or line art
`data, the lossy compression procedure will generally blur
`sharp lines that usually delineate text or line art data or may
`introduce undesirable artifacts.
`European Patent Publication No. 0597571 discloses a
`method in which the types of objects in a page are first
`extracted and the boundary of each object determined before
`rasterization. Appropriate compression procedures are selec(cid:173)
`tively applied to each type of objects. In this way, lossless
`20 compression procedures may optimally be applied to text or
`line art objects while lossy compression procedures may be
`applied to image objects. Essentially, the method operates at
`the display list level which is an intermediate form between
`the page description file and the rasterized page. Objects and
`25 their types are determined by parsing from the high-level,
`implicitly object-defining commands of the PDL in the
`display list. This requires knowledge of the particular brand
`and version of PDL commands as well as how to reconstruct
`a certain object from these implicit manifestations. In any
`30 case, it appears all but the simplest boundaries such as
`objects enclosed in rectangular blocks are practically deter(cid:173)
`minable from such deciphering at the display list level.
`In general, the display list is interpreted by a specific PDL
`interpreter to generate raster data in page representation. The
`35 interpretation process is likened to a "black box" in which
`the display list is input at one end and out comes the raster
`data at the other end. Once the data is rasterized, it is in the
`form of an array of pixels or a bit map, and there are no
`longer any explicit and well defined objects to which indi-
`40 vidual compression procedure can be applied.
`
`50
`
`OBJECTS AND SUMMARY OF THE
`INVENTION
`Accordingly, it is a general object of the present invention
`to provide a method and apparatus for optimum data com(cid:173)
`pression of raster data with a minimum of data degradation.
`It is another object of the present invention to determine
`the different types of structures that may exist in already
`rasterized data and selectively applying appropriate com(cid:173)
`pression procedures thereto.
`It is another object of the present invention to minimize
`the memory requirement for a print or frame buffer.
`It is another object of the present invention to prescribe,
`55 given a desired compression ratio, how to balance between
`lossless and lossy compression procedures (as well as to
`control the amount of lossiness) such that compression to at
`least the desired ratio is achieved.
`These and additional objects are accomplished by the
`60 following features of the invention.
`The invention seeks to apply optimum compression pro(cid:173)
`cedures to different types of objects that may exist on a page.
`A first type of compression such as lossless, run-length
`encoding is preferably applied to data that is recognized as
`65 text or line art. However, the first type of compression may
`not be efficient when applied to data that is recognized as
`image or photo. In order to meet a targeted compression
`
`Vedanti Systems Limited - Ex. 2017
`Page 10
`
`
`
`5,982,937
`
`15
`
`4
`page buffer. These are decompressed respectively by a first
`and second decompressor before being merged to become
`the reconstructed document.
`One feature in an alternative embodiment is that the first
`compressor in a first pass also assumes the role of the patch
`recognizer. In the first pass, the first compressor applies
`run-length encoding (RLE) to all raster data. Since the RLE
`represents a run or spread of pixels with the same color,
`patches are therefore identified as an integral part of the RLE
`10 process. This feature is efficient because the processes of
`Type 1 compression and patch recognition are combined
`into one.
`According to another feature in the alternative
`embodiment, a threshold is set for deciding whether a block
`is to be JPEG compressed. If a block (8x8 pixels) has only
`a small number of Type 2 pixels (i.e. BlockP2Count is less
`than a predetermined threshold), it will not be cost effective
`to perform also the JPEG compression on that block.
`Instead, all data in the block will be treated as Type 1. In a
`preferred implementation, this is accomplished by an
`20 in-block counter.
`According to another aspect of the invention, during the
`compression procedures, the compression ratio achieved
`to-date is monitored after the completion of each strip (row
`of blocks) and a set of compression parameters is adaptively
`25 changed in order to attain the required compression ratio for
`the whole page. A number of parameters affects the com(cid:173)
`pression ratio and this includes the Q_factor parameter
`applied to the quantization matrices in the JPEG encoding.
`In a preferred embodiment, the Q_factor is allowed to
`30 change adaptively to maintain the best image quality within
`the required compression ratio. In a preferred
`implementation, this is accomplished by an adaptive com(cid:173)
`pression ratio controller.
`According to another aspect of the invention, a recom-
`pression procedure is implemented in the rare event that the
`required compression ratio is not attained. A set of param(cid:173)
`eters associated with compression ratio is adjusted. The
`compressed document is decoded portion by portion using
`the previous set of parameters and recompressed using the
`updated set of parameters.
`Additional objects, features and advantages of the present
`invention will be understood from the following description
`of the preferred embodiments, which description should be
`taken in conjunction with the accompanying drawings.
`
`35
`
`3
`ratio, a second type of compression such as transform
`encoding is preferably applied to these type of data. In the
`preferred embodiment the transform encoding is a lossy
`JPEG encoding which is applied to integral blocks of 8x8
`pixels. However, once a page has been converted from the
`page description file to rasterized page data, all history
`information regarding type and source of the various objects
`comprising the page is lost.
`One important feature of the present invention is the
`ability to analyze the page in the rasterized form. This is
`accomplished by recognizing structures in the raster data in
`the form of color patches. A patch is regarded as a spread of
`connected pixels of the same color. Once the patches are
`recognized, they are discriminated between a Type 1 or a
`Type 2 patch, depending on whether or not the patch can be
`efficiently compressed by the first type of compression
`procedure. Each patch has a size measured by the number of
`pixels therein ("PatchPixelCount"). Type 1 patch has a
`PatchPixelCount greater or equal to a predetermined
`number, D1, and Type 2 patch has a PatchPixelCount less
`than Dl. In a preferred implementation, D1 is from 6 to 8.
`The first compression procedure is then applied to Type 1
`patches and the second compression procedure is applied to
`Type 2 patches.
`Thus, even in rasterized form, text or line art objects can
`generally be recognized and distinguished from image or
`photo objects. Appropriate compression procedures can then
`be applied to each type of data to optimally attain efficient
`compression while maintaining quality.
`According to another feature of the present invention,
`provision is made for handling the arbitrary boundary
`between the two types of patches. The original raster data
`has a granularity at the pixel level. However, the preferred
`second compression procedure, block-oriented JPEG
`compression, coarsens the granularity of Type 2 data to the
`block level (8x8 pixels) which is larger by an area ratio of
`64:1 compared to that of Type 1 data associated with Type
`1 patches. The mismatch in granularity between the two
`types results in a discontinuity at the boundary between Type
`2 and Type 1 data. This problem is handled by coding the
`Type 2 patches also in their original pixel granularity as
`transparent pixels. Later, when the document is
`reconstructed, the transparent pixels form one or more
`windows for the decompressed Type 2 data to show through
`in the reconstructed document, thereby preserving the origi(cid:173)
`nal fine structure at the boundary between Type 1 and Type
`2 data.
`Also at the boundary, there will be JPEG blocks that are
`only partially filled with Type 2 patches. When JPEG 50
`compression is applied to such blocks, their unnatural and
`discontinuous structure causes the introduction of undesir(cid:173)
`able frequencies which tend to decrease the compression
`ratio. To reduce this effect, before applying the compression
`the pixel values in the unfilled portion of the block are 55
`replaced by the average value of the Type 2 pixels in the
`block.
`In a preferred embodiment, a patch recognizer scans the
`pixels within a working window row by row, and pixel by
`pixel within each row, and recognizes the patches. After the 60
`patches are parsed out, they are separated into Type 1 and
`Type 2 patches by a patch type discriminator. A first com(cid:173)
`pressor compresses Type 1 pixels from Type 1 patches as
`well as transparent pixels representing the position of Type
`2 pixels from Type 2 patches. A second compressor com- 65
`presses Type 2 pixels block by block. The two types of
`compressed codes are stored individually in a compressed
`
`40
`
`45
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1 is a schematic system block diagram illustrating a
`typical environment in which the present invention is appli(cid:173)
`cable.
`FIG. 2 is an example page containing text, graphic and
`image objects.
`FIG. 3 is a block diagram of the apparatus and method of
`hybrid compression, according to a preferred embodiment of
`the invention.
`FIG. 4a illustrates the schematic separation of the Type 1
`raster data onto a first raster layer.
`FIG. 4b illustrates the schematic separation of the Type 2
`raster data onto a second raster layer.
`FIG. 5 is a block diagram of the apparatus and method of
`hybrid compression, according to another preferred embodi(cid:173)
`ment of the invention.
`FIG. 6 is a table listing a schedule of run-length codes
`according to a preferred embodiment of the invention.
`FIG. 7 is a flow diagram illustrating the sequence of
`compressing a document, according to a preferred embodi(cid:173)
`ment of the present invention.
`
`Vedanti Systems Limited - Ex. 2017
`Page 11
`
`
`
`5,982,937
`
`5
`FIG. Sa shows a first portion of a flow diagram illustrating
`the procedure embodied in the first pass of the first com(cid:173)
`pressor shown in FIG. 5 and FIG. 7.
`FIG. Sb shows a continuation of the flow diagram that
`begins in FIG. Sa.
`
`DETAILED DESCRIPTION OF 1HE
`PREFERRED EMBODIMENTS
`
`6
`document which may contain multiple pages. The example
`shows a page 90 with text 92 at the bottom. It also has Line
`art 94 consisting of a heart design 96 overlaying a larger
`heart design with a ribbon interleaved in between. The
`smaller heart 96 also forms a frame for a photo 9S, denoted
`by "PHOTO" at the center of the page. Even though FIG. 2
`is shown schematically in black and white, the various
`objects may be in color.
`When the page 90 is rasterized, it is represented by a two
`10 dimensional array of pixels. In the preferred embodiment,
`each pixel is represented by four bytes, one byte for each of
`the primary ink colors CMYK. As explained earlier, this will
`allow each pixel to represent one of about 4 billion ink
`combinations.
`
`Determination of Structures and their Type from
`Rasterized Data
`
`20
`
`25
`
`FIG. 1 is a schematic system block diagram illustrating a
`typical environment in which the present invention is appli(cid:173)
`cable. A host computer or workstation 10 running a desktop
`publishing application program is used to generate page(cid:173)
`oriented documents which are ultimately rendered in printed
`form or displayed on a screen. To be specific, disclosure will
`be made with reference to a printer application, although it 15
`will be understood other types of rendering applications
`such as plotters, facsimile machines, and display screens are
`also applicable.
`A page-oriented document generated by the host com-
`puter or workstation 10 may typically contain a mixture of
`text, line art (alternatively known as "graphic") and image
`objects. These and other contents are specified by a page
`description language (PDL) and the document is output as a
`page description file 20. A PDL interpreter 30 specific to the
`particular PDL used interprets the page description file and
`generates the document page by page in bit map format, as
`represented schematically by a rasterized page data stream
`40. A generalized page memory 50 receives the rasterized
`data through an input 52 and relates it via an output 54 to a
`print engine 70. Since the print engine 70 typically prints
`page by page, the generalized page memory 50 serves to
`collect one or more pages of the document before sending it
`to the print engine 70. As the rate of the rasterized data
`produced from the PDL interpreter 30 is generally different
`from the intake rate of the print engine 70, the generalized
`page memory 50 also serves to buffer the two dissimilar
`rates by providing temporary storage.
`As described earlier, a page of rasterized data may require
`tens to hundreds of megabyte of memory for storage and is
`therefore impractical to store in uncompressed form. To this
`end, the generalized page memory 50 includes a compressor
`100 which receives and compresses the rasterized page data
`stream 40 a portion at a time, such as a strip 41. The
`compressed raster data is stored in a compressed page buffer
`200. Later, it is retrieved from the compressed page buffer
`200 and is decompressed by a decompressor 300 before
`being fed to the print engine 70 to output a printed document
`SO page by page. In this way, the memory requirement of the
`generalized page memory is reduced by the same ratio as the 50
`compression ratio achieved in the compression process.
`In practice, the PDL interpreter 30, and the generalized
`page memory 200 form part of a printer controller that
`operates with the print engine 70. The PDL interpreter 30,
`the compressor 100 and the decompressor 300 can be 55
`implemented either as dedicated hardware processors or as
`part of a microprocessor system 60. In the latter case, the
`microprocessor system 60 includes a microprocessor 62,
`non-volatile, read-only memory (ROM) 64 and random(cid:173)
`access memory (RAM) 66. The functions of the various 60
`components are embodied as procedures stored in ROM 64
`and are executable by the microprocessor 62. Also, RAM 66
`acts as temporary storage for the strip 41 as well as the
`compressed page buffer 200.
`FIG. 2 is an example page containing text, graphic and 65
`image objects. Although the discussion refers to a page, it
`will be understood that the page is in the context of a
`
`30
`
`35
`
`Once the page 90 has been converted from the page
`description file to rasterized page data, all history inform a(cid:173)
`tion regarding type and source of the various objects com(cid:173)
`prising the page is lost. One important feature of the present
`invention is the ability to analyze the page in the rasterized
`form and determine the different structures and their types
`therein and apply the appropriate compression procedure to
`each type of structures.
`FIG. 3 is a block diagram of the apparatus and method of
`hybrid compression, according to a preferred embodiment of
`the invention. Referring also to FIG. 1, the generalized page
`memory 50 includes the compressor 100, the compressed
`page buffer 200 and the decompressor 300. The compressor
`100 receives rasterized data strip by strip from the input 52
`into a working buffer 110. In the preferred embodiment, each
`strip is 8 rows of pixels amounting to an 8-pixel high
`working window across the page. A patch recognizer 120
`scans the pixels within the working window row by row
`(from top to bottom of the window) and pixel by pixel within
`each row (from left to right) and recognizes patches of
`connected pixels of the same color. Two pixels are con-
`40 nected if they are adjacent each other whether horizontally,
`vertically or diagonally. As will be explained later, in the
`preferred embodiment, patches in a strip are recognized after
`the recognizer 120 has actually scanned through the next
`strip. Once the patches are recognized, a patch type dis-
`45 criminator 130 discriminates between at least two type of
`patches by virtue of their size or the number of pixels
`("PatchPixelCount") in each patch.
`The nature of text or line art objects is such that they are
`typically made up of color patches of larger size, each
`having a PatchPixelCount ranging from several to thousands
`of pixels. A large patch will also be referred to as a Type 1
`patch. The pixels in it will be referred to as Type 1 pixels,
`p1(i,j) where the coordinates (i,j) represents the ith row and
`jth column of the pixel array of a page. This characteristic of
`large size color patches means there is low activity or high
`redundancy in going from one pixel to another and generally
`yields well to efficient compression by lossless compression
`procedures.
`Image or photo objects, on the other hand, are mainly
`constituted from color patches of smaller size, each with a
`PatchPixelCount typically ranging from a single to a few
`pixels. A small patch will also be referred to as a Type 2
`patch and pixels in it as Type 2 pixels, p2(i,j). This charac(cid:173)
`teristic of small size color patches means there is high
`activity or low redundancy in going from one pixel to
`another. Generally, as the patch size decreases, lossless
`compression becomes increasingly inefficient. Indeed, when
`
`Vedanti Systems Limited - Ex. 2017
`Page 12
`
`
`
`7
`a patch falls below a certain size, lossless compression may
`yield a compression ratio of less than one, since the code
`required for the encoding occupies more space than the
`patch it seeks to encode. Generally, image objects are more
`efficiently compressed by a transform coding procedure,
`particularly a lossy type. This type of compression is termed
`"lossy" because some information is lost in the
`compression-decompression cycle. However, for image
`objects, and even with a compression ratio of 10:1 the loss
`of information is not readily apparent to the human eye.
`The demarcation between a Type 1 (large) patch and a
`Type 2 (small) patch is respectively whether the number of
`pixels in a patch, i.e., PatchPixelCount, is equal or greater
`than a predetermined minimum count ("PatchP1MinCount"
`or "D1"). In other words, the patch is Type 1 if its 15
`PatchPixelCount>=D1, and the patch is Type 2 if its
`PatchPixelCount<Dl. This threshold number, D1, provides
`a parameter for adjusting the desired proportion of Type 1
`and Type 2 compression procedures being applied t