throbber
(12) United States Patent
`Mehrotra et al.
`
`US006571016B1
`US 6,571,016 B1
`May 27, 2003
`
`(10) Patent No.:
`(45) Date of Patent:
`
`(54)
`
`(75)
`
`(73)
`
`(21)
`(22)
`
`(63)
`
`(51)
`(52)
`(58)
`
`(56)
`
`INTRA COMPRESSION OF PIXEL BLOCKS
`USING PREDICTED MEAN
`
`Inventors: Sanjeev Mehrotra, Kirkland, WA (US);
`Albert S. Wang, Kirkland, WA (US)
`Assignee: Microsoft Corporation, Redmond, WA
`(US)
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`Notice:
`
`Appl. No.: 08/850,957
`Filed:
`May 5, 1997
`Related U.S. Application Data
`
`Continuation of application No. 08/623,299, filed on Mar.
`28, 1996, now Pat. No. 6,215,910.
`Int. Cl." ............................. G06K 9/36; G06K 9/46
`U.S. Cl. ................................... 382/236; 375/240.13
`Field of Search ................................. 382/236, 238,
`382/248, 253; 375/240.13, 240.16, 240.22,
`240.24
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`5,144,425 A 9/1992 Joseph ....................... 358/133
`5,155,594 A * 10/1992 Bernstein et al. ...... 375/240.14
`5,227,878 A * 7/1993 Puri et al. .............. 375/240.15
`5,351,095 A 9/1994 Kerdramvat ................. 348/699
`5,414,469 A 5/1995 Gonzales et al. ........... 348/408
`5,418,568 A 5/1995 Keith ......................... 348/390
`5,453,801 A 9/1995 Kim .......
`... 348/699
`5,473,379 A 12/1995 Horne ....
`.... 348/416
`5,502,492 A 3/1996 Jung .........
`.... 348/413
`5,512,952 A 4/1996 Iwamura ..................... 348/416
`5,521,988 A 5/1996 Li et al. ..................... 382/248
`
`5,537,155
`5,557,341
`5,560,038
`5,576,767
`5,585,852
`5,596,659
`5,604,867
`5,623,312
`5,623,313
`5,673,265
`5,694,173
`
`::
`
`7/1996
`9/1996
`9/1996
`11/1996
`12/1996
`1/1997
`2/1997
`4/1997
`4/1997
`9/1997
`12/1997
`
`O’Connell et al. ......... 348/699
`Weiss et al. ................ 348/699
`Haddock .................... 395/800
`Lee et al. ................... 348/413
`Agarwal ..................... 348/398
`Normile et al. ............. 382/253
`Harwood ............... 395/200.13
`Yan et al. ................... 348/416
`Naveen ...................... 348/416
`Gupta et al. ................ 370/432
`Kimura et al. .............. 348/423
`
`
`
`OTHER PUBLICATIONS
`“Video Coding for Low Bitrate Communication”, ITU-T,
`Draft H.263: Line Transmission of Non–Telephone Signals,
`Int’l Telecommunication Union, (May 2, 1996).
`Chaddha, N., et al., “Hierarchical Vector Quantization of
`Perceptually Weighted Block Transforms”, IEEE, pp. 3–12,
`(1995).
`* cited by examiner
`Primary Examiner—Phuoc Tran
`(74) Attorney, Agent, or Firm—Lee & Hayes, PLLC
`(57)
`ABSTRACT
`An apparatus and method for encoding video frames is
`provided. The video frames are divided into blocks for
`encoding. Encoding of the video blocks utilizes motion
`detection, motion estimation and adaptive compression, to
`obtain the desired compression for a particular bit rate.
`Adaptive compression includes intra compression (without
`regard to other frames) and inter compression (with regard
`to other frames). Intra compression, inter compression with
`motion detection, and inter compression with motion esti
`mation are performed on a block by block basis, as needed.
`Segmentation is provided to compare encoding of a block
`with encoding of its sub-blocks, and to select the best block
`size for encoding.
`
`50 Claims, 17 Drawing Sheets
`
`corresponding blocks
`
`8:3
`
`ls distance > threshold?
`
`-
`
`18
`
`Create header for Motion
`Detection (Header=0)
`do not Se?i? b|CCK
`
`|
`
`
`
`Increment N
`
`19 || compression
`
`create header for motion
`Detection (Header=1.)
`
`Perfºrm
`inter
`compressiºn
`Gr
`9|ack
`(Figure 7)
`select best compression
`for block
`
`Google Inc.
`GOOG 1038
`IPR2016-00212
`
`0001
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 1 of 17
`
`US 6,571,016 B1
`
`F/G, 7 (Prior art)
`
`
`
`108 -
`
`100
`
`
`
`112
`
`|
`||||
`
`AE/G. 2 (Prior art)
`
`
`
`
`
`
`
`202
`
`Minimize the
`amount of data
`
`Transform
`
`
`
`
`
`
`
`Skip
`
`y 204
`/
`
`
`
`, 206
`
`Minimize the
`precision of the data
`
`Minimize the # of
`bits per data item
`
`Code
`
`Repeat
`
`0002
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 2 of 17
`
`US 6,571,016 B1
`
`F/G, 3 (Related art)
`
`302
`
`_ 304
`
`Original
`Data
`
`…”
`
`
`
`- 314
`
`Playback |__
`346
`Device
`
`A/G 42
`
`
`
`0003
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 3 of 17
`
`US 6,571,016 B1
`
`4m
`
`408
`
`0004
`
`0004
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 4 of 17
`
`US 6,571,016 B1
`
`A/G 5
`
`.
`
`502
`
`egin
`
`…
`
`, 500
`/
`
`Get initial frame N
`
`Convert frame N to YUV
`(Figures 10, 11)
`
`_ 504
`~~~
`
`506
`
`Encode macroblocks in frame N
`using adaptive
`intra-compression
`
`508
`…~
`
`Decode frame N
`
`Get frame N4-1
`
`---
`
`…”
`
`Convert frame N-H1 to YUV ...”
`
`Perform Motion Detection
`on frame N-H 1
`
`Perform Motion Estimation
`On frame N-F1
`(Figure 6)
`
`510
`
`512
`
`514
`
`519
`
`|
`
`516
`
`} 518
`|
`
`- - - - - - - - - - - - - - - - - - - - - |
`
`Encode blocks in frame N+1
`(Figure 9)
`
`52O
`
`…” 522
`
`524
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`0005
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 5 of 17
`Sheet 5 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`F/G, 6.9
`@a
`
`I
`
`651
`651
`
`, 652
`652
`
`
`
`F/O 7,
`Z0
`
`
`
`/ 751
`/ 754
`
`/,r 752
`752
`
`
`
`0006
`
`0006
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 6 of 17
`
`US 6,571,016 B1
`
`, 600
`
`602
`
`Obtain a block N in a spatial |__
`604
`location
`
`Calculate distance between
`corresponding blocks
`
`… 606
`
`---
`
`|s distance > threshold?
`
`
`
`Create header for Motion
`Detection (Header=1),
`
`612
`
`Create header for Motion
`Detection (Header=0),
`do not send block
`
`
`
`Perform
`
`Perform
`
`618
`
`anº Orl
`On
`block
`(Figure 8)
`
`
`
`
`
`CO º Or)
`Orn
`619
`biock
`(Figure 7)
`
`Select best compression
`for block
`
`620
`
`.
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Increment N
`
`More blocks to encode?
`
`, 624
`…”
`
`
`
`
`
`0007
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 7 of 17
`
`US 6,571,016 B1
`
`A/G 7,
`
`-
`Perform inter compression
`
`_ 618
`...”
`
`-
`
`Find best matching macroblock
`
`-
`
`Calculate motion vector
`
`-
`
`-
`
`Obtain residual
`
`Code residual
`
`-
`
`... 702
`
`- 704
`
`- 706
`
`708
`
`710
`
`712
`
`| Perform adaptive compression
`on residual
`
`Perform no residual inter
`compression on macroblock
`
`716
`
`Perform reconstruction
`of macroblock
`
`0008
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 8 of 17
`Sheet 8 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`F/G. &
`FIG, 8,52
`
`
`
`800
`
`810
`
`0009
`
`0009
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 9 of 17
`Sheet 9 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`A/G &A
`FIG, 2%
`
`
`
`, 820
`820
`
`0010
`
`0010
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 10 of 17
`Sheet 10 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`F/G, 9.
`902
`
`F/G, 9%
`FIG, 9%»
`
`
`
`940
`
`3
`b
`
`__ 920
`
` ,,/— 920
`--3'(Q—c-CD0.00'0J
`
`930 1
`930 < .
`
`C
`d
`
`e
`f
`9
`h
`i
`
`0011
`
`0011
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 11 of 17
`
`US 6,571,016 B1
`
`A/G 9a
`
`F/G 9/
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`000
`
`3
`
`1
`
`goto 110 2nd
`stage table
`goto 111 2nd
`Stage table
`
`OO
`O1
`10
`11
`
`C
`C
`C
`e
`
`1
`1
`2
`2
`
`000
`
`f
`
`1
`
`960
`
`0012
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 12 of 17
`
`US 6,571,016 B1
`
`F/G, J (),
`
`
`
`
`
`Obtain pixel values
`from codebook
`
`
`
`Define noise to be
`added to pixel values
`
`1004
`
`1006
`
`
`
`
`
`Add noise to pixel values
`
`
`
`1010
`
`--
`
`Clip pixel values
`
`
`
`Reduce pixel values |
`
`1012
`
`
`
`
`
`
`
`
`
`
`
`Construct RGB table
`according to
`display format
`
`1014
`
`Convert YUV to
`RGB display format
`
`- 1016
`...~~~
`
`
`
`1020
`
`1022
`
`|| Obtain YUV pixel values
`from YUV codebook
`
`
`
`1024 s
`
`Create and Concatenate
`YUV Word
`
`
`
`1026
`
`Look-up RGB value
`for display
`
`0013
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 13 of 17
`
`US 6,571,016 B1
`
`A/G //
`
`1102,
`\
`*
`.
`
`1104
`
`*
`\
`
`.
`
`1100
`/ /
`.
`
`110s,
`W
`\
`w
`
`Y-high
`
`Y-|OW
`
`U-high
`
`U-low
`
`V-high
`
`V-low
`
`
`
`0014
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 14 of 17
`
`US 6,571,016 B1
`
`F/G, 7%
`
`Encode 8x8 block
`
`~~~
`
`Encode two segmented
`8x4 blocks
`
`Calculate distortion (D1) and
`rate (R1) of 8x8 block
`
`Calculate sum (D2) of
`distortions and sum (R2) of
`rates of 8x4 blocks
`
`1202
`
`1204
`
`1206
`
`1208
`
`Compare D1 +XR1 to
`D2 + AR2, where
`X is a constant
`
`_ 1210
`
`
`
`1214
`
`Encode block as two 8x4 blocks
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`0015
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 15 of 17
`Sheet 15 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`F/G, 7%
`FEE, 17.21»
`
`7 1250
`
`F/G, 7%
`FIG, 17.20
`
`
`
`1288
`1288
`
`1298
`
`1298
`
`0016
`
`0016
`
`

`
`U.S. Patent
`U.S. Patent
`
`May 27, 2003
`May 27, 2003
`
`Sheet 16 of 17
`Sheet 16 of 17
`
`US 6,571,016 B1
`US 6,571,016 B1
`
`F/G, 72/
`/FICE}, 112%]
`
`
`
`1252
`
` \
`
`\
`\ 1270
`‘- 1268
`\ 1268 W 1270
`
`0017
`
`0017
`
`

`
`U.S. Patent
`
`May 27, 2003
`
`Sheet 17 of 17
`
`US 6,571,016 B1
`
`A'/G 73.
`
`1300
`
``
`
`1302
`
`“,
`
`Get block N
`
`1304 s,
`
`--
`
`Read block header
`
`1306
`
`Was block
`encoded?
`
`
`
`1332
`~
`Perform YUV to RGB
`transform
`
`Increment N
`
`
`
`~ 1334
`
`
`
`
`
`all blocks
`decoded?
`
`_ 1310
`1308
`Calculate mean from
`intra
`decoded adjacent blocks
`
`inter or
`intra?
`
`1318
`
`Display frame
`
`Y ~
`
`1338
`
`Read tree and indices |
`
`- 1312
`
`inter
`Read motion Vector
`
`1320
`
`Decode residual
`
`1314
`
`
`
`Add to mean
`
`,”
`
`
`
`
`
`|s residual
`encoded?
`
`Read tree and indices
`
`1326
`
`1328
`
`
`
`1324
`
`Copy block from
`previous reconstructed
`frame, offset by the
`motion vector
`
`Decode residual
`
`_ 1330
`Add to previous block
`With motion Vector offset
`
`
`
`0018
`
`

`
`1
`INTRA COMPRESSION OF PIXEL BLOCKS
`USING PREDICTED MEAN
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`This application is related to U.S. Patent Applications:
`Ser. No. 08/625,650, filed Mar. 29, 1996, entitled “TABLE
`BASED LOW-LEVEL IMAGE CLASSIFICATION AND
`COMPRESSION SYSTEM" (now U.S. Pat. No. 6,404,
`923); Ser. No. 08/714,447, filed Sep. 16, 1996, entitled
`10
`“MULTIMEDIA COMPRESSION SYSTEM WITH ADDI
`TIVE TEMPORAL LAYERS" (pending); Ser. No. 08/818,
`805, entitled “METHOD AND APPARATUS FOR IMPLE
`MENTING MOTION DETECTION IN VIDEO
`COMPRESSION” (abandoned); Ser. No. 08/819,507
`15
`entitled “DIGITAL VIDEO SIGNAL ENCODER AND
`ENCODING METHOD” (now U.S. Pat. No. 6,118,817);
`Ser. No. 08/818,804, entitled “PRODUCTION OF A
`VIDEO STREAM WITH SYNCHRONIZED ANNOTA
`TIONS OVERACOMPUTER NETWORK" (now U.S. Pat.
`20
`No. 6,006,241); Ser. No. 08/819,586, entitled “METHODS
`AND APPARATUS FOR IMPLEMENTING CONTROL
`FUNCTIONS IN A STREAMED VIDEO DISPLAY SYS
`TEM” (now U.S. Pat. No. 6,014,706); Ser. No. 08/818,769,
`entitled “METHODS AND APPARATUS FOR AUTO
`25
`MATICALLY DETECTING PROTOCOLS IN A COM
`PUTER NETWORK" (now U.S. Pat. No. 5,999,979); Ser.
`No. 08/818,127, entitled “DYNAMIC BANDWIDTH
`SELECTION FOR EPFICIENT TRANSMISSION OF
`MULTIMEDIA STREAMS IN A COMPUTER NET
`30
`WORK” (now U.S. Pat. No. 6,292,834); Ser. No. 08/819,
`585, entitled “STREAMING AND DISPLAYING AVIDEO
`STREAM WITH SYNCHRONIZED ANNOTATIONS
`OVER A COMPUTER NETWORK" (now U.S. Pat. No.
`6,173,317); Ser. No. 08/818,644, entitled SELECTIVE
`RETRANSMISSION FOR EPFICIENT AND RELIABLE
`STREAMING OF MULTIMEDIA PACKETS IN A COM
`PUTER NETWORK” (now U.S. Pat. No. 5,918,002); Ser.
`No. 08/819,579, U.S. Patent Application Publication No.
`US-2001-0017941-A1, entitled METHOD AND APPARA
`40
`TUS FOR TABLE-BASED COMPRESSION WITH
`EMBEDDED CODING” (abandoned); Ser. No. 08/822,156,
`entitled “METHOD AND APPARATUS FOR COMMUNI
`CATION MEDIA COMMANDS AND DATA USING THE
`HTTP PROTOCOL” (now U.S. Pat. No. 6,128,653); Ser.
`45
`No. 08/818,826, entitled “DIGITAL VIDEO SIGNAL
`ENCODER AND ENCODING METHOD" (now U.S. Pat.
`No. 5,903,673); provisional U.S. Patent Applications: Serial
`No. 60/036,661, entitled “VCR-LIKE FUNCTIONS FOR
`RENDERING VIDEO ON DEMAND (VOD)”; Serial No.
`50
`60/036,662, entitled “METHODS AND APPARTUS FOR
`AUTODETECTING PROTOCOLS IN A COMPUTER
`NETWORK"; which are all incorporated herein by refer
`ence. U.S. patent application Ser. No. 08/623,299, filed Mar.
`28, 1996, entitled “TABLE-BASED COMPRESSION
`55
`WITH EMBEDDED CODING” (now U.S. Pat. No. 6,215,
`910)is incorporated herein by reference.
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`The present invention relates to a method and apparatus
`for compression of multimedia data. More specifically, the
`present invention relates to a method and apparatus for
`predictive compression of video frames.
`2. Description of the Related Art
`The creation of pictures or images has been a human
`activity since the beginning of humanity. However, until
`
`35
`
`60
`
`65
`
`US 6,571,016 B1
`
`5
`
`2
`recent history viewing of an image required the viewer to be
`physically present at the image. This was geographically
`cumbersome. Photography, both still and motion, broke this
`geographic constraint by allowing pictures to be captured
`and transported independent of the physical images they
`represented. Television enhanced transmission of images, by
`sending images, recorded or live, to any geographic location
`capable of receiving a radio signal. But, for the most part,
`viewers of television can only view images that are sched
`uled for transmission, rather than selecting images at will.
`With the development of computers, and more specifi
`cally computers that are linked across a network, images
`stored on one computer may be demanded by a viewer, and
`almost instantaneously provided to the viewer’s computer
`over the computer network. One computer network that is
`increasingly being used is the Internet, the well-known
`international computer network that links various military,
`government, education, nonprofit, industrial and financial
`institutions, commercial enterprises, and individuals.
`Images are typically of two types: 1) single pictures; or 2)
`moving pictures. Single pictures include photographs, com
`puter art, faxes and web pages. Moving pictures typically
`include a number of single images or frames organized into
`a particular sequence. Within a computer network, images
`are captured and stored on one computer, and then trans
`mitted over the network to another computer for viewing. An
`example of this is provided in FIG. 1, to which reference is
`now made.
`FIG. 1 illustrates a computer system 100 that includes a
`server 102 connected to a number of mass storage devices
`104. The mass storage devices 104 are used to store a
`number of video frames 120. The video frames 120 could be
`still images, or could be combined into sequences to create
`moving pictures, as described above. The sequences reside
`on the mass storage devices 104, and upon request, may be
`transmitted by the server 102 to other computers 108 via a
`network 106. In addition, the video frames 120 may be
`transferred to remote computers, such as the computer 112,
`via a network 116, using a router 110 and/or a modem 114.
`One skilled in the art should appreciate that the network 116
`could be a dedicated connection, or a dial-up connection,
`and could utilize any of a number of network protocols such
`as TCP/IP or Client/Server configurations.
`In operation, a user sitting at any of the computers 108,
`112 would request video frames 120 from the server 102,
`and the server would retrieve the video frames 120 from the
`mass storage devices 104, and transmit the frames 120 over
`the network 106. Upon receipt of the video frames 120, the
`computers 108, 112 would display the images for the
`requester.
`It should be appreciated that the computers 108, 112 may
`be positioned physically close to the server 102, or may be
`thousands of miles away. The computers 108, 112 may be
`connected to the server 102 via a direct LAN connection
`such as Ethernet or Token Ring, or may utilize plain old
`telephone service (POTS), ISDN or ADSL, depending on
`the availability of each of these services, their cost, and the
`performance required by the end user. As is typically of
`computer equipment and services, higher performance
`means more COSt.
`In most cases, the amount of data required to represent a
`video frame, or more specifically a sequence of video frames
`120 is significant. For example, a color image or frame is
`typically represented by a matrix of individual dots or pixels,
`each having a particular color defined by a combination of
`red, green and blue intensities (RGB). To create a palette of
`
`0019
`
`

`
`US 6,571,016 B1
`
`3
`
`16 million colors (i.e., true color), each of the RGB inten-
`sities are represented by an 8-bit value. So, for each pixel,
`24-bits are required to define a pixel’s color. A typical
`computer monitor has a resolution of 1024 pixels (across) by
`768 pixels (down). So, to create a full screen image for a
`computer requires 1024><768><24 bits=18,874,368 bits, or
`2,359,296 bytes of data to be stored. And that is just for one
`image.
`If a moving picture is to be displayed, a sequence of
`images are grouped, and displayed one after another, at a rate
`of approximately 30 frames per second. Thus, a 1 second,
`256 color, full screen movie could require as much as 60
`megabytes of data storage. With present technology, even
`very expensive storage systems, and high speed networks
`would be overwhelmed if alternatives were not provided. By
`way of example, as the resolution and the frame rate
`requirements of a video increase, the amount of data that is
`necessary to describe the video also increases.
`One alternative to reducing the amount of data required to
`represent images or moving pictures is to simply reduce the
`size of frames that are transmitted and displayed. One
`popular frame size is 320 pixels in width and 240 pixels in
`height, or 320x240. Thus, a 256 color frame of this size
`requires 320><240><24=1,843,200 bits, or 230 kilobytes of
`data. This is significantly less (1/10”“) than what is required
`for a full screen image. However, as frames are combined
`into moving pictures,
`the amount of data that must be
`transmitted is still significant.
`An additional solution to reducing the amount of storage
`space required for video frames involves compressing the
`data. The extent to which data is compressed is typically
`measured in terms of a compression ratio or a bit rate. The
`compression ratio is generally the number of bits of an input
`value divided by the number of bits in the representation of
`that input value in compressed code. Higher compression
`ratios are preferred over lower compression ratios. The bit
`rate is the number of bits per second of compressed data
`required to properly represent a corresponding input value.
`There are three basic methods involved in any data
`compression scheme: 1) transformation, 2) reduced preci-
`sion (quantization), and 3) minimization of number of bits
`(encoding). Each of these methods may be used
`independently, or may be combined with the other methods
`to obtain optimum compression. Although the number of
`scheme combinations is large,
`typically compression is
`accomplished by a sequential process of transformation,
`precision reduction, and coding. Coding is always the final
`stage of the process, but there are sometimes several trans-
`formation and precision reduction iterations. This process is
`summarized in FIG. 2, to which attention is now directed.
`In FIG. 2, a block 202 is shown to illustrate the step of
`transformation, a block 204 is shown to illustrate the step of
`quantization, and a block 206 is shown to illustrate the step
`of coding. The transformation block 202 transforms a data
`set into another equivalent data set that is in some way
`smaller than the original. Some transformations reduce the
`number of data items in a set. Other transformations reduce
`the numerical size of data items that allow them to be
`
`represented with fewer binary digits.
`To reduce the number of data items in a set, methods are
`used that remove redundant
`information within the set.
`
`Examples of such methods include Run-Length-Encoding
`(RLE) and LZW encoding. RLE is a pattern-recognition
`scheme that searches for the repetition of identical data
`values in a list. The data set can be compressed by replacing
`the repetitive sequence with a single data value and a length
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`value. Compression ratios obtainable from RLE encoding
`schemes vary depending on the type of data to be encoded,
`but generally range from 2:1 up to 5:1. LZW encoding
`replaces repeated sequences within a data set with particular
`codes that are smaller than the data they represent. Code-
`books are used during encoding and decoding to transform
`the data set back and forth from raw data to encoded data.
`
`Compression ratios for video images range from 2:1 to 9:1.
`Transformations that reduce the size of individual data
`
`items within a data set includes Differencing. Differencing is
`a scheme that attempts to reduce the size of individual data
`values within a data set by storing the difference between
`pixels values, rather than the actual data values for each
`pixel. In many cases the difference value is much smaller in
`magnitude than the original data value, and thus requires a
`smaller data space for storage.
`Other transformation schemes exist to transform a set of
`
`data values from one system of measurement into another,
`where the properties of the new data set facilitate the data’s
`compression. One such scheme called colorspace conver-
`sion transforms the RGB pixel values into luminance Y, and
`chrominance Cb and C, values. This is referred to as
`RGB/YUV conversion. Less important values, such as the
`C, component may be ignored without significantly affect-
`ing the image perceived by a viewer.
`Another scheme that transforms a set of data values from
`
`one system of measurement into another is the Discrete-
`Cosine-Transform. The DCT transforms a block of original
`data that typically represents color intensity (YUV) into a
`new set of values that represent cosine frequencies over the
`original block of data. Lower frequencies are stored in an
`upper left portion of the data block with higher frequencies
`stored in the rest of the block. If higher frequency compo-
`nents are ignored, an entire block of data may be represented
`by just a few data values in a block.
`It should be appreciated that each of the schemes
`described above are well known in the art, and may be
`combined, for a particular frame of data, to achieve maxi-
`mum compression. However, each of these schemes are
`applied to a single video frame, called intra-frame
`compression, which is independent of other video frames.
`For
`full motion video,
`including multicast video,
`teleconferencing, and interactive video, compressing each
`video frame separately is not sufficient, because of the large
`number of frames in even a short video sequence. Further
`compression may be achieved by taking advantage of the
`similarities between frames. In many instances, the differ-
`ence between one frame and the next is small because of the
`short
`time interval between frames. These schemes are
`
`referred to as inter-frame compression.
`One simple scheme stores only the pixels that actually
`change from one frame of the video sequence to the next.
`Said in a technical way, the scheme is to store only the pixels
`that produce a nonzero difference when subtracted from
`their corresponding pixels in a previous frame. Thus, rather
`than having to transmit all of the pixel values in a video
`block, only those pixels that have changed need to be
`transmitted.
`
`Another approach to video compression is to calculate the
`differences between corresponding pixels in consecutive
`frames and then encode the differences instead of the origi-
`nal values. This is called motion compensation. But,
`in
`motion pictures, pixel values often shift their spatial location
`from one frame to the next. To locate shifted pixels, a
`number of pixel values are grouped together to form a block.
`Then, a block within a present frame is compared to blocks
`
`0020
`
`0020
`
`

`
`US 6,571,016 B1
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`5
`in a previous frame to determine an offset such that all of the
`pixel differences are minimized. This is called motion esti
`mation. An offset is typically represented as a pair of
`numbers that specify a shift in the horizontal and vertical
`directions. This is referred to as a motion vector. If a motion
`vector can be determined for a particular block, that block
`may be encoded simply by supplying the motion vector,
`rather than by encoding the entire block.
`With each of the above transformation schemes, reduced
`precision may be used to further compress data, as shown by
`block 204. As was mentioned above, one of the chrominance
`values, C, could be ignored without significantly affecting
`the quality of the image. In addition, after performing a DCT
`transform, higher frequency components can be ignored.
`Furthermore, by calculating differences between pixel
`values, and ignoring minor differences, further compression
`may be achieved. This illustrates the repetition between the
`transformation block 202 and quantization block 204 of
`FIG. 2.
`The third block shown in FIG. 2 is the Code block 206.
`This block encodes a data set to minimize the # of bits
`required per data item. The coding process assigns a unique
`code value to data items in a set. One coding scheme that is
`used in compressing video frames is Huffman coding. Huff
`man codes assign a variable-length code to each possible
`data item, such that the values that occur most often in the
`data set have smaller length codes while the values that
`occur less frequently have longer-length codes. Huffman
`coding creates a tree structure where the leaf nodes are the
`original probabilities associated with each data value from
`the data set. Each branch in the tree is labeled with a one or
`a zero. The Huffman code assigned to each original data
`value is the set of labels along a path from the root node to
`the associated leaf node.
`The above provides a general overview of a number of
`different compression schemes for compressing video
`frames prior to transmitting the frames over a network to a
`remote computer. It should be appreciated that specific
`implementation of any of these schemes, or more accurately,
`a combination of particular ones of these schemes, requires
`significant preprocessing (encoding) of the video frames
`prior to transmission, as well as post processing (decoding)
`of the frames.
`As the complexity that is associated with compression and
`decompression increases, the efficiency with which video
`frames may be encoded and decoded drops. Stated another
`way, higher compression ratios require more processing, and
`take longer to encode/decode than do lower compression
`ratios. However, higher compression ratios allow more data
`to be delivered over a network in less time. Therefore, a
`tradeoff is generally made between obtaining a particular
`compression ratio, and obtaining a satisfactory bit rate of
`transfer. If a high compression ratio takes too long to decode,
`viewed images will appear choppy or disjunct. If an inad
`equate bit rate is obtained, a viewer will be kept waiting for
`the image, or the image will replay in slow motion.
`SUMMARY OF THE INVENTION
`What is needed is an apparatus and method that improves
`the efficiency of encoding/decoding video frames while
`maintaining a desired bit rate for a given resolution. More
`specifically, what is needed is an apparatus and method that
`incorporates several forms of motion estimation, and selects
`the best form for each block of data to be encoded.
`Accordingly, it is a feature of the present invention to
`provide a method to encode a video frame that is transmitted
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`over a communications medium. The method includes: 1)
`obtaining a video frame; 2) separating the frame into blocks;
`3) encoding a plurality of blocks using inter compression; 4)
`encoding the plurality of blocks using predictive intra com
`pression; and 5) selecting better block compression between
`the inter and predictive intra compression; wherein the steps
`of encoding the plurality of blocks is performed on a block
`by block basis, to provide optimum compression of the
`video frame for a given bit rate.
`DESCRIPTION OF THE DRAWINGS
`These and other objects, features, and advantages of the
`present invention will become better understood with regard
`to the following description, and accompanying drawings
`where:
`FIG. 1 is block diagram of a prior art computer network
`for encoding, transmission and decoding of video images.
`FIG. 2 is a prior art block diagram of encoding method
`ology for video images.
`FIG. 3 is a related art block diagram illustrating encoding,
`delivery and decoding of video images.
`FIGS. 4a and 4b illustrate a video frame divided into a
`number of macroblocks, and a macroblock divided into a
`number of blocks, each block having a number of different
`pixels.
`FIG. 5 is a process flow chart illustrating a process of
`encoding a video frame according to the present invention.
`FIG. 6a illustrates a comparison of two macroblocks in
`the same spatial location, but in two separate video frames.
`FIG. 6b is a flow chart illustrating motion compensation
`according to the present invention.
`FIG. 7a illustrates a comparison of two macroblocks in
`different spatial locations, in two separate video frames.
`FIG. 7b is a flow chart illustrating motion detection
`according to the present invention.
`FIGS. 8a and 8b illustrate predicted mean intra block
`compression according to the present invention.
`FIG. 9a is a tree representation of a classic Huffman
`decoder.
`FIG. 9b is a table used with a two-stage Huffman decoder
`according to the present invention.
`FIG. 9c is a table of a first stage decoding table used with
`a two stage Huffman decoder according to the present
`invention.
`FIG. 9d is a table of a second stage decoding table used
`with a two stage Huffman decoder according to the present
`invention.
`FIG. 9e is a table illustrating another second stage decod
`ing table used with a two stage Huffman decoder according
`to the present invention.
`FIG. 10a is a process flow diagram that illustrates the
`steps associated with preprocessing a codebook that is used
`with a color transformation of bits encoded using motion
`compensation according to the present invention.
`FIG. 10b is a process flow diagram that illustrates the
`steps associated with performing a color transformation on
`bits encoded using motion compensation according to the
`present invention.
`FIG. 11 is a block diagram of a color transformation
`performed on bits encoded using motion compensation
`according to the present invention.
`FIG. 12a is a process flow diagram illustrating a segmen
`tation process for blocks in accordance with the present
`invention.
`
`0021
`
`

`
`7
`FIG. 12B is a block diagram of a macroblock that is
`divided into smaller blocks according to the segmentation
`process of FIG. 12a.
`FIG. 12c is an encoding map tree illustrating segmenta
`tion of a block according to the segmentation process of
`FIG. 12a.
`FIG. 12d is an block diagram of a block that is segmented
`according to the segmentation process of FIG. 12a, and
`represented by the encoding map tree of FIG. 12c.
`FIG. 13 is a process flow diagram illustrating the steps of
`decoding blocks in a frame that has been encoded and
`transmitted by the motion compensation, segmentation and
`encoding methods of the present invention.
`DETAILED DESCRIPTION OF THE
`INVENTION
`The present invention provides a complex but efficient
`method and apparatus for compressing/decompressing video
`information that is distributed over a computer network.
`However, before discussing the detailed portions of the
`invention, a general overview will be provided.
`Referring to FIG. 3, a block diagram 300 of a data
`encoder/decoder system is shown. The system 300 includes
`original data 302, or data that is generally unencoded. The
`original data 302 may be a sequence of video frames, as
`described above, having a resolution of 320x240 pixels, and
`a color palette of 256 colors. The original data 302 is
`provided to an encoder 304 that encodes or compresses the
`data 302, and provides encoded data 306 as output. Although
`any suitable compression method may be used compress the
`original data 302, a preferred method includes that described
`in U.S. patent application Ser. No. 08/623,299 referenced
`above.
`The encoded data 306 is provided to a network delivery
`system 308 that accepts the encoded data 306 and generates
`as output encoded data 310. Typically, the network delivery
`system 308 is used to send encoded data 306 from one
`computer system on the network to another computer. The
`channels of communication used by the network delivery
`syste

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket