`US005225904A
`
`United States Patent
`
`[191
`
`[11] Patent Number:
`
`5,225,904
`
`
`
`Golin et al. Jul. 6, 1993
`[45] Date of Patent:
`
`[54] ADAPTIVE DIGITAL VIDEO
`COMPRESSION SYSTEM
`
`[75]
`
`Inventors: Stuart J. Golin, East Windsor; Allen
`H. Simon, Belle Mead; Brian Astle,
`Cranbury, all of N.J.; John M. Keith,
`Washington Crossing, Pa.
`
`Intel Corporation, Santa Clara. Calif.
`[73} Assignee:
`[21] Appl. No.: 802,169
`
`[22] Filed:
`
`Dec. 4, 1991
`
`Related U.S. Application Date
`
`[63}
`
`Continuation of Ser. No. 408.085, Sep. 15, 1989, Pat.
`No. 5,079,630.
`
`Int. c1.= ............................................. .. new 7/12
`[511
`[52} U.S. Cl. .................................. .. 358/133; 358/135;
`358/13
`
`[58] Field of Search ............... .. 358/133, 135, 12, 136.
`358/13 ‘
`
`[56]
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`4,745,473
`
`5/1988 Hall ................................... .. 358/133
`
`Primary Examt’ner—Victor R. Kostak
`Attorney, Agent. or Ft‘rm——Carl L. Silverman; William
`H. Murray; Frank M. Linguiti
`
`ABSTRACT
`[57]
`A full motion color digital video signal is compressed,
`formatted for transmission, recorded on compact disc
`
`media and decoded at conventional video frame rates.
`
`During compression. regions of a frame are individually
`analyzed to select optimum fill coding methods specific
`to each region. Region decoding time estimates are
`made to optimize compression thresholds. Region de-
`scriptive codes conveying the size and locations of the
`regions are grouped together in a first segment of a data
`stream. Region fill codes conveying pixel amplitude
`indications for the regions are grouped together accord-
`ing to fill code type and placed in other segments of the
`data stream. The data stream segments are individually
`variable length coded according to their respective
`statistical distributions and formatted to form data
`frames. The number of bytes per frame is dithered by
`the addition of auxiliary data determined by a reverse
`frame sequence analysis to provide an average number
`selected to minimize pauses of the compact disc during
`playback thereby avoiding unpredictable seek mode
`latency periods characteristic of compact discs. A de-
`coder includes a variable length decoder responsive to
`statistical information in the code stream for separately
`variable length decoding individual segments of the
`data stream. Region location data is derived from re-
`gion descriptive data and applied with region fill codes
`to a plurality of region specific decoders selected by
`detection of the fillcode type (e.g.. relative, absolute,
`dyad and DPCM) and decoded region pixels are stored
`in a bit map for subsequent display.
`‘
`
`37 Claims. 32 Drawing Sheets
`
`1
`
`Google Inc.
`(3000 1006
`IPR of US Pat. No. 7,974,339
`
`
`
`US. Patent
`
`July 6, 1993
`
`Sheet 1 of 32
`
`5,225,904
`
` ENCODER
`(FIG S. 2"47)
`
`RECORDING SYSTEM
`
`S4 BIT-STREAM
`‘-31.23! IO‘ BITS/SEC.
`
`
`
`I-76.1
`
`S20
`
`
`
`PROCESSOR
`
`
`(FIGS. 48-63}
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 2 of 32
`
`5,225,904
`
`$2 S1
`
`- VIDEO IN
`
`AUXIRATA
`204
`
`206*‘ '
`
`PRE-COMPRESSION
`
`PROCESS}?
`
`(FIGS. I4-I5)
`
`
`
`1I
`
`5 1
`
`s?
`
`5,0
`
`s3
`
`Eucoogg
`gs
`-
`
`FORMATIER
`
`Z5-O
`
`(FIGS. 8-13)
`
`°‘TT5J%%‘§fi”T
`RECORDER I8
`
`FIG. 2
`
` DIGITALVIDEOA
`
`m 1
`S6
`33
`(Hes Is-47)
`
`-J
`N
`:
`MONITOR
`212
`2 _
`E
`BUFFER
`cecooe
`I
`-1
`
`cormzsson
`
`234
`-
`
`
`
`-
`
`232
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 3 of 32
`
`5,225,904
`
`1-+——.o~E FRAME %—a-I
`I
`.
`1
`'
`
`FIELD 2
`
`7
`
`5:
`H 3
`
`FlE.D I
`
`NTSC
`S2 gamposng
`worse
`
`1%” osconso
`F1574 _T AND
`Iéjjij °'°‘T‘ZE°
`737,230 ewes
`
`no 5
`'
`
`Y
`255 x 240
`
`SUB-SAMPLEIZ.)
`__ YIO
`
`"G-6
`
`sazzo arras
`
`>45oo awrzs
`
`compacsssn
`
`mama»
`°°°E°
`
`He. 7
`
`:
`
`H a
`
`VIDEO "STREAM"
`v nun
`
`<45oo BYTES
`
`gum
`VARLENG
`
`COEED
`
`Fnmsicoupnzssso: IIGITAL
`
`F[(,‘_ 3 HEADERI
`DIGIEQL
`E
`VID
`
`.
`
`FORMATTED
`r. DATA FRAME
`
`BITSTREAM,
`FR“-ME?
`FM
`FlG..9 W FRAMES DITHERED
`29.97 FPS, 5l25.l2
`AVERAGE
`
`
`
`US. Patent
`
`July 6, 1993
`
`Sheet 4 of 32
`
`5,22§,904 _
`
`FRAME F1
`
`FRAME F2 FRAME F3 CNERSIZEI
`
`FIG. I0@W
`
`575- 1/
`
`$053 :§?:>°c>
`
`‘§3oE3 lE‘£'3%
`
`‘$353
`
`DDMPRESSPCN
`‘.”°“5‘55°
`
`Hate 3000
`
`VIDEO ‘PAD
`
`VIDEO
`
`«coo-
`
`VIDEO
`
`"BoRRowEo" mom
`PREVIOUS FRAME
`
` S4
`
`I350
`
`BIT'STRE-AM
`OUT '11.‘!
`CD-ROM
`REWRDER
`Ia
`
`DATA FROM
`BUFFER 212
`
`I350
`
`
`
`US. Patent
`
`July 6, 1993
`
`Sheet 5 of 32
`
`5,225,904
`
`PROCESSOR
`220
`
`
`
`" S2 FROM SOURE I2‘
`
`PROGRAMMABLE
`GRAPHICS
`
`
`
`
`WORK STATION
`
`OPERATOR
`
`TERMINAL
`
`OUT TO SUB-FRAME
`SELECTOR 224
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 6 of 32
`
`5,225,904
`
`% “ "'“°” °PE""PR IN mom LP runes I404-I4-08
`TERMINAL 22
`R
`°
`°
`azxasfia
`
`502
`
`I504
`
`[505
`
`E N D
`
`'
`
`DIGITIZE
`5::I 480 PIXELS/FRAME
`
`L529
`“ " ' ‘ ‘ ' ' ' ' ' ' " ‘
`
`SKIP ALTERNATE
`FIELDS
`
`_ _ _ _g§__Ift_ ______ " g|I‘<>:Eia-flsLTERNATE
`
`R
`
`G
`
`B
`
`256 X240 PDCE15/FRAME
`
`TIMING
`AND
`CONTROL
`
`I530
`
`M AT R I X
`
`I516
`
`Y
`
`I
`
`Q
`
`ISIS
`
`HOR
`
`HOR
`
`I52
`
`OUT TO
`
`swmca 224
`
`VERT
`
`VERT
`
`0 UNES/PETURE
`
`fiasm
`
`Y I 0
`
`m2
`
`DISC
`STORE
`
`22
`- - - -:~- —
`
`Sta? 3 OF 4 LINES
`
`1‘.-E2
`" ' ' - * " "
`
`SKIP SOF4 PIXELS
`
`"
`
`'
`
`19:54 3: so cnch
`
`Y: 256 x 240
`
`
`
`U.S. Patent
`
`July 5, 1993
`
`Sheet 7 of 32
`
`5,22s,9o4_
`
`IN FROM YIO
`SELECTOR SWITCH 224
`
`[-‘R94 TH E3:-|o
`‘N c(NTR0LR233 LD
`
`I6|0
`
`
`
`REGION SPECIFIC
`OODER
`(FIGS I7-38)
`
`sn
`
`tnE -101:
`
`
`
` INTER-FRAME
`REGION SPECIFIC
`
`SIS (FIGS. 3943.81.32)
`I620
`
`CODER
`
`
`
` AREA DEPENDENT
`ADAPTIVE QUANTIZER '
`(FlGS.4~4. 45)
`
`I630
`
`
`
` I640
`
`-
`
`STREAM SEGMENTED
`VAR. LENGTH OODER
`(FIGS. 45, 47)
`
`OUT To
`
`DIGITAL VIDEO COMPRESSOR 3339
`
`FIG. /6
`
`
`
`U.S. Patent
`
`July 5, 1993
`
`Sheet 3 of 32
`
`5,225,904
`
`I
`
`6
`
`2
`
`I720
`
`I734 Y5
`
`3735
`
`.33
`
`W "°
`
`ANY
`REGICINB
`
`SELECT
`REGQN
`
`3 03
`
`W
`
`YES
`
`ESTIMATE
`ROUGNESS,R
`
`N0
`
`"W2
`
`SP‘-‘T
`
`no
`
`1725
`
`I
`
`I
`
`Dfiabhw
`
`DETERMINE
`BI-LINEARFILL
`A=*3y*°
`
`I724
`
`ENCCDEPIXELS
`INDNIDUALLY
`WITH or-cm
`
`
`
`ao’Jr5§3é’x5§~o
`
`
`
`
`
`U.S. PatentU.S. Patent
`
`
`
`July 6, 1993July 6, 1993
`
`
`
`Sheet 9 of 32Sheet 9 of 32
`
`
`
`5,225,9045,225,904
`
`
`
`
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 10 of 32
`
`5,225,904 ,
`
`
`
`@ @
`
`2:02
`
`N0 "SLOPE" OR GRADIENT
`
`FlLL=Ax+ By + c
`A‘-=0
`
`B=O
`
`cooE:c;§soo5 2
`
`W 2, @ @
`
`.
`
`"~:::.':.:*3:;s:r2PE
`
`MI
`
`(‘-3) @ ©
`
`cone-$53104
`
`on
`
`'
`
`.
`
`”“:=L".f;’ff'é,iL‘2“
`
`2 02
`
`G2 F © <9‘
`
`F7 . 2 @ @ @
`69 ® (9
`
`2302
`
`A=0
`cooE=iE§ O"! 6
`
`a=-|
`
`6) (9 ® ”"?;‘CTix*1°B’§;f‘e°’°E
`F1623 @ @ @
`3:1!
`® @ ©
`
`O0DE=
`
`Pl 5
`
`
`
`
`
`US. PatentUS. Patent
`
`
`
`5,225,9045,225,904
`
`
`
`US. Patent
`
`July 6, 1993
`
`Sheet '12 of 32
`
`5,225,904
`
`2Il
`
`ZI4
`
`ZIB
`
`
`
`ALDO
`
`25 COMSFESSOR
`
`55
`lN
`
`EST
`
`BY"TE Lffdfl‘
`( 256 J
`
`
`
`FIG. 25
`
`Fla 26
`
`2604
`
`2605
`
`
`
`QUAD-TREE
`REG IONALIZATI ON
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 13 of 32
`
`5,225,904_
`
` 27%
`
`SPLIT 3
`
`SPLIT 2
`
`TREE CODING
`usms N005 vgwgs
`
`TREE CODING
`, USING NODE DIFFERENCES
`
`SPLIT 1(122'l28= '6)
`
`
`
`FIG. 30
`
`FIG. 3/
`
`sPI_IT/ FILL
`TREE cons
`I-IF I41 VHF 90 F 93 F I12
`H: HORIZONTAL SPLJT ACTION
`v.- VERTICAL SPLl'T ACTION
`P: FILL ACTION
`
`sI=I.JT/ I-‘ILL
`TREE CODE
`3. ‘-§;l_r-é*I§_vF§I68 ‘S1186!-;__F;:
`'
`°' '2? 75 F535 5 '5 F”
`s-sImpIa SPLIT )
`
`A-Aliernoie SPLIT
`
`
`
`
`
`U.S. PatentU.S. Patent
`
`
`
`July 6, 1993July 6, 1993
`
`
`
`Sheet 14 of 32Sheet 14 of 32
`
`
`
`5,225,9045,225,904
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 15 of 32
`
`5,225,904
`
`DETECT EDGES
`m QUADRANT5
`CF REGION
`
`-
`
`‘
`
`MULVPD’
`VF=H N13 - V24
`HF=W'HI2' H34
`
`FIG. 33A
`
`FI6‘. 338
`
`16
`
`
`
`US. Patent
`
`July 6, 1993
`
`Sheet 16 of 32
`
`5,22s,904_
`
`
`
`"SIMPLE" SPLITS
`r---““——fi
`3402
`340
`
`H>W
`
`SPL|'|'
`HORIZ.
`
`I H W
`
`>H
`
`SPLIT
`VERTICAL
`
`FIG. 34A
`
`FIG‘. 348
`
`'hLTERN ATE" SPLITS
`T
`3405
`3408
`
`H>W
`
`SPLTT
`VERTICAL
`
`W" W
`
`>H
`
`SPLIT
`HORIZ.
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 17 of 32
`
`5,225,904
`
`PIXEL mo
`FTSNEAR
`uarsnsons
`
`® ® G)
`
`.
`FILTER
`wasms
`
`Q) G) G)
`
`3502
`
`® 1:: © 3503 ® 6) ®
`C9
`G)
`G) @ 0
`FIG 3544
`FIG.35B
`
`
`
`onnn-212:0 VALUES
`
`*1
`“H 1 11° ? *f””%f”°”§”‘*f
`35oe{1I 5/\5 9
`1o'1o
`11 121313 1 17%
`
`5
`
`1
`
`35o4{1
`
`2
`
`1
`
`2
`
`1
`
`2
`
`2
`
`wara-rrzo MEDIAN (FILTER oLm=-UT)
`
`wsnewreo onpsaso muss
`H6350
`
`FIG. 355
`
`3510
`
`® ® ® EXEMFMRY
`wznemsron
`
`® ® <9
`G9 G) G)
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 13 of 32
`
`5,225,904
`
`
`
`I-76. 363
`
`I
`
`2/3
`
`PING FACTOR. D
`
`P. . 3 g
`'
`
`‘I00
`
`“.30
`
`O
`
`50
`
`DIFFERENCE SIGNAL
`
`0 O o 2‘
`E00
`255
`
`
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 19 of 32
`
`5,225,904
`
`VERI-.FLT_ _
`
`Yv
`
`
`
`702
`
`EGION
`WITHOUT
`EDGES
`
`- - -I
`I
`
`Y
`
`I
`I
`
`I HDRIZ.
`I
`FIT
`
`Ax-i-C I
`
`I I
`
`I
`
`FI6‘. 37
`
`
`
`
`MEASURE VERT
`FIT OF ay+c
`TO Yv DATA
`
`
`
`MEASURE HORIZ
`FIT OF AX4-C
`TO YH DATA
`
`'
`
`FIG‘. 38
`
`20
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 20 of 32
`
`5,225,904
`
`no
`
`3929
`
`3910 ES
`
`moons
`XoYo
`
` RTE
`
`
`
`US. Patent
`
`"July 6, 1993
`
`Sheet 21 of 32
`
`5,225,904
`
`i-vo_:" " ,
`
`P'°°§£EFo?.°3?'“°
`FIG: 40?
`
`4002
`I
`
`HOVEMENT
`
`
`
`"" PREVICIJS FRAME.C
`
`REGION or-'
`
`CURRENT FRAME
`{TA.RGET.T)
`
`LU
`
`RU
`
`BEST
`DIR
`
`L
`
`R
`
`FIG. 4/
`
`NEXT
`
`D
`
`RD
`
`FIG. 42
`
`
`
`
`
`TEST UPTO B
`DIRECTIONS
`FOR BEST DIR.
`
`
`
`
`REPR ESENTATI VE
`PIXEL C-OORDINATES
`
`
`STEP IN BEST
`DIRECTKJN
`
`LNTIL N0
`IMPROVEMENT
`
`22
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 22 of 32
`
`5,22S,904_
`
`RELATIVE FILL CODE: A + B + C -4- 76+ Yo
`
`EXAMPLE I,
`
`“I,
`
`0,
`
`Kb, Yo
`
`23
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 23 of 32
`
`5,225,904
`
`REGION
`A¥¥f§
`
`QUANTIZATION
`uHT3}
`
`was
`
`oummza
`sans
`
`FIG. 45
`
`’ 45:5
`
`<3»
`
`4518
`
`4502
`
` No
`
`4530
`
`N0
`
`LA
`" EGIO
`
`YES
`C33
`
`24
`
`5 BITS
`
`OUANTIZE
`4 BITS
`
`4528
`
`QUAN'11ZE
`3 BITS
`
`S1DRE
`mm
`
`4510
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 24 of 32
`
`5,225,904
`
`FIG. 46
`
`VAR. LENGTH GDER 1640
`
`BUFFER 232
`
`1542
`
`
`
`- SI4 IN FROM
`OUANTIZER I530
`
`DYAD FILL
`
`n—.———.a-— -an 1
`
`RELATIVE
`C-u——-pun-
`FILL
`
`TREE
`DE$RiPT|ON
`
`FIG. 47
`
`FIXED LENGTH GDDED
`
`VAR LENGTH CDDED WITH IMPLICIT TABLE
`
`VAR. LENGTH CODED W1TH COL.2 TABLES
`
`
`
`RELATIVE ABSOLUTE
`DATA
`DATA
`
`DPCM
`DATA
`
`DYAD
`DATA
`
`
`
`CODE
`TABLES
`
`THE
`DESCR.
`_
`
`STREAM FORMAT FOR EACH OF ‘(,1 AND O
`SUBFRAMES
`
`25
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 25 of 32
`
`5,22$,904_
`
`FIG. 48
`
`AND
`
`MATRIX
`
`26
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 26 of 32
`
`5,225,904
`
`‘‘%'%'=‘%’‘‘1 cm we
`
`
`
`X0 Y0 (REL OR
`
`DYAD REGIONS)
`
`FIG: 52
`
`*—
`§_
`
`4826
`
`' ' ' "".v,;gg,o
`" " ' ' ' ' ' " "
`4323
`5202
`---- a 4329 1
`Jkadezo
`
`DECODING
`IMAGE
`
`PREV.
`IMAGE
`
`GRAPHICS ""“‘"""*“"* ON
`BUFFER
`PIPELJNE BUF.
`DISPLAY
`
`27
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 27 of 32
`
`5,225,904
`
`
`
`P(1C.yJ=P'(X.5)+Ax+B_y+C T
`
`AREA
`
`mu:-’?1’E
`am FROM
`
`MAP4822
`
`RELATIVE FILL DEC-ODER £55
`
`547°
`
`3
`3 @
`
`fig;
`
`DE-QUAN1’.
`
`5472 PROCESS --
`
`F1!-L
`
`@
`
`LOCATION
`FROM
`REG ION
`TABLE 4824
`
`LATC H
`
`5402
`
`28
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 23 of 32
`
`5,225,904.
`
`
`
`P(x,y)=Ax 4- By -1- C
`
`HG. 55
`
`AREA
`
`ABSOLUTE FflJ.DECODER 4854
`
`_
`8 ® 5610
`5606
`3331??
`48mg
`NU.
`BW
`DEQUANT- ® sees W
`
`LOGIC
`
`4326
`
`Afi
`FROM MAP
`
`5572
`
`PROCESSOR
`
`IIC II
`DATA
`
`FROM
`REGION
`TABLE
`4334
`
`ADDRESS
`courmaas
`5504
`
`LATCH
`
`5502
`
`X.Y.H.W
`
`FIG. 56‘
`
`29
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 29 of 32
`
`5,225,904
`
`
`
`FIG 58
`
`DPCM FflJ.DECODER 4852
`
`DPCM
`DATA
`
`FROM
`REGKN
`‘M313
`4824
`
`CONSTANT
`(:23)
`
`5803
`
`
`
`30
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 30 of 32
`
`5,225,904
`
`nu. DATA IN
`
`W-“E5
`
`BOIO
`
`
`
`BIT
`SHIFTER
`
`FILL DATA OUT
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 31 of 32
`
`5,225,904 _
`
`|<— x.——».!
`
`FIG‘. 6‘!
`
`C T
`
`ABLE LOOK-UP (CODE; )= R,;,
`
`-
`
`A,;--C; + R; B;-= D..;+ S;
`
`32
`
`
`
`U.S. Patent
`
`July 6, 1993
`
`Sheet 32 of 32
`
`5,225,904
`
`
`
`DYAD FILL
`ozcooea
`4355
`
`ADDRESS
`GEN
`
`x,v,H,w,x.:r.
`FROM REGION
`TABLE4B24
`
`33
`
`
`
`1
`
`5,225,904
`
`2
`pression procedures to providing an optimum coding
`specific to the characteristics of the region being coded.
`In accordance with another aspect of the invention, a
`digital motion video signal is compressed using com-
`pression thresholds controlled as a function of the num-
`ber of bytes per frame and an estimate of the decoding
`time per frame of the compressed signal.
`In accordance with yet another aspect of the inven-
`tion, a video frame is split repeatedly to provide a plu-
`rality of regions to be individually encoded, and the
`split direction, vertical or horizontal, is determined by a
`comparison of distributions of_._pixel parameters associ-
`ated with the regions.
`
`BRIEF DESCRIPTION OF THE DRAWING
`
`The foregoing and further features of the invention
`are shown in the accompanying drawing in which like
`elements are denoted by like reference designators and
`in which:
`
`FIG. 1 is a block diagram ofa digital video interac-
`tive system embodying the invention providing record-
`ing and reproduction of full-motion video, multi-cItan-
`nel digital audio and auxiliary (e.g., interactive) data
`using a compact disc read-only memory {CD-ROM} as
`the recording media;
`-
`FIG. 2 is a block diagram of a digital video encoder
`used in a recording portion of the system of FIG. 1;
`FIGS. 3-9 are diagrams illustrating digital video
`signal formats at various stages of processing in the
`encoder of FIG. 2;
`FIGS. 10-12 are diagrams illustrating two methods of
`processing "oversized“ frames in the encoder of FIG. 2;
`FIG. 13 is a block diagram of a forrnatter providing
`padding and dithering for use in the encoder of FIG. 2;
`FIG. 14 is a block diagram of a pre-compression
`processor used in the encoder of FIG. 2;
`FIG. 15 is a block diagram illustrating details of a
`portion of the processor of FIG. 14;
`FIG. 16 is a block diagram ofa digital video compres-
`sor used in the encoder of FIG. 2 providing intra-frame
`and inter-frame region-specific coding, quantization by
`region area and frame-segmented variable length cod-
`3113;
`FIG. 17 is a flow chart illustrating operation of an
`intra-frame coder used in the compressor of FIG. 16 for
`compressing still video frames and the first frame of a
`motion video sequence;
`FIG. 18 is a region diagram illustrating image edge
`analysis used in the compressor of FIG. 16;
`FIG. 19 is a block diagram of a roughness estimator
`providing split/fill decisions for use in the compressor
`of FIG. 16;
`'
`_
`FIGS. 20-23 are region diagrams illustrating bi-linear
`absolute fill coding used in the compressor of FIG. 16;
`FIG. 24 is a region diagram illustrating measurement
`of boundary errors;
`FIG. 25 is a block diagram of an audio compressor
`used in the encoder of FIG. 2;
`FIG. 26 is a diagram illustrating quad-tree regional-
`ization;
`FIG. 27 is a diagram illustrating binary tree regional-
`ization of an image in the compressor of FIG. 16;
`FIGS. 28 and 29 are examples of split/fill coding
`diagrams for the regionalized image of FIG. 27;
`FIGS. 30 and 31 are examples of "tree" codes for the
`coding diagrams of FIGS. 28 and 29, respectively;
`
`ADAPTIVE DIGITAL VIDEO COMPRESSION
`SYSTEM
`
`This is a continuation of copending application Ser.
`No. 07/403,085 filed on Sep. 15, 1989 now U.S. Pat. No.
`5,079,630.
`
`FIELD OF THE INVENTION
`
`This invention relates to video signal processing gen-
`erally and particularly to systems for reducing the
`amount of digital data required to represent a digital
`video signal to facilitate uses. for example, such as the
`transmission, recording and reproduction of the digital
`video signal.
`BACKGROUND OF THE INVENTION
`
`The need for compression to facilitate recording of a
`digital video signal on relatively narrow-band media,
`such as a compact disc (CD), has been recognized. In a
`system proposed by Taltahashi et al. in U.S. Pat. No.
`4,520,401, a digital video signal is encoded using differ-
`ential pulse code modulation (DPCM) for recording on
`a digital audio disc. In the known system, luminance (Y)
`and chromirtance (R-Y, B-Y) components of a video
`frame are separately compressed using DPCM. A cir-
`cuit divides the components into picture element data
`groups of a specific number of rows or columns which
`are adjacent on a screen. A header signal is provided
`having a synchronizing signal, a picture mode identifi-
`cation signal and a picture information quantity identifi-
`cation code. The header signal is added to the beginning
`position of each of the divided picture eiement data
`groups to produce a digital video output signal having a
`signal format in which the digital luminance, the two
`kinds of digital color difference signal and the header
`signal are time sequentially multiplexed and recorded.
`In an example of the Takahashi et at. system still
`frames of digital video are recorded and updated at a
`rate of about four seconds per frame. The division of the
`compressed data into groups of lines with each group
`containing complete color information provides a psue-
`do-motion effect in that the line groups may be sequen-
`tially updated awhile displaying the previous frame
`thereby providing a partially moving picture.
`SUMMARY OF THE INVENTION
`
`The present invention is directed to meeting the need
`for a compression system for providing a compressed
`digital video signal representative of a full motion color
`video signal. which is suitable for recording or transmis-
`sion using relatively narrow band media and which may
`be decompressed at speeds at least equal to conven-
`tional video frame rates. In a specific embodiment de-
`scribed herein over one-hour of recording time has been
`achieved for compact disc read only memory (CD-
`ROM) recording media for 30 frame-per-second full
`motion color digital video recording.
`In accordance with an aspect of the invention. the
`first and second frames of a digital motion video signal
`are compressed using diflerent compression methods
`and an output signal is fortned including an identifica-
`tion code signifying each compression method.
`In accordance with another aspect of the invention,
`each frame of a digital video signal is divided to form a
`plurality of regions and each region is separately ana-
`lyzed and encoded by a selected one of several com-
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`34
`
`
`
`5,225,904
`
`3
`FIGS. 32A—.I are region diagrams illustrating edge
`distribution analysis for determining a most favorable
`region split direction;
`FIG. 3A is a flow chart for computer apparatus for
`determining a most favorable split direction in the com-
`pressor of FIG. 16 by analysis of the distribution of
`horizontal and vertical edges in a region;
`FIG. 33B is _a table listing of parameters for the appa-
`ratus of FIG. 3A;
`FIGS. 34A and 34B are diagrams illustrating two
`forms of region splitting in the compressor of FIG. 16;
`FIGS. 35A—35E are diagrams illustrating weighted
`median filtering in the compressor of FIG. 16;
`FIGS. 36A-36C are diagrams illustrating non-linear
`low-pass filtering for use in the encoder of FIG. 16;
`FIG. 37 is a diagram illustrating finding a most favor-
`able split direction by polynomial fit comparisons;
`FIG. 38 is a flow chart for computer apparatus imple-
`menting the split direction method of FIG. 3‘.-';
`FIG. 39 is a flow chart illustrating operation of an
`inter-frame coder used in the compressor of FIG. 16 for
`coding the second frame and all subsequent frames of a
`motion video sequence;
`FIG. 40 is a diagram illustrating region translation in
`the inter-frame coder of FIG. 39;
`FIGS. 41 and 42 are vector and flow chart diagrams,
`respectively,
`illustrating selection of a best
`region
`search direction in the inter-frame coder of FIG. 39;
`FIG. 43 is a diagram illustrating region translation
`and relative coding used in the inter-frame coder of 30
`FIG. 39;
`FIG. 4-4 is a table illustrating region area dependent
`adaptive quantization used in the compressor of FIG.
`16;
`FIG. 45 is a flow chart illustrating operation of the 35
`apparatus in FIG. 16 providing area dependent quanti-
`zation of FIG. 44;
`FIG. 4-6 is a block diagram of a stream segmented
`variable length coder for use in the compressor of FIG.
`16;
`FIG. 47 is a diagram illustrating the format of data
`‘‘streams‘’ provided by the compressor of FIG. 16;
`FIG. 48 is block diagram of a compressed digital
`video signal decoder used in the playback system 8 of
`FIG. 1;
`FIGS. 49, 50 and 51 are examples of table listings of
`data stored in a region location memory of the decoder
`of FIG. 48 for absolute, relative, dyad and DPCM
`coded regions of FIG. 48;
`FIG. 52 is a block diagram illustrating a memory 50
`organization for use in the decoder of FIG. 4-8;
`FIG. 53 is a diagram illustrating relative region de-
`coding of an inter-frame coded region by the decoder of
`FIG. 48;
`FIG. 54 is a block diagram of apparatus providing the 55
`relative decoding of FIG. 53;
`FIG. 55 is a diagram illustrating absolute region de-
`coding in the decoder of FIG. 48 of an intra-frame
`coded region;
`FIG. 56 is block diagram of apparatus providing the 60
`absolute decoding of FIG. 55;
`FIG. 57 is a diagram illustrating DPCM decoding of
`a region in the decoder of FIG. 48;
`FIG. 53 is block diagram of apparatus providing the
`region DPCM decoding of FIG. 57;
`FIG. 59 is a table listing of area dependent adaptive
`quantization values for "dequantizing" pixel data in the
`decoder of FIG. -I-8;
`
`40
`
`4
`FIG. 60 is a block diagram of apparatus for providing
`area dependent dequantization in the decoder of FIG.
`43;
`FIGS. 61 and 62 are diagrams illustrating dyad de-
`coding in the decoder of FIG. 4-8; and
`'
`FIG. 63 is a block diagram of a dyad decoder for use
`in the decoder of FIG. 48;
`
`I DETAILED DESCRIPTION
`The digital video interactive system of FIG. 1 com~
`prises a recording system 6 and a playback system 8.
`The recording system includes sources 10, 12 and 14
`which provide,
`respectively, a multi-channel sound
`signal SI, a color motion video signal 52 and an auxil-
`iary data signal 53. An encoder I6 encodes and com-
`bines signals 51. S2 and S3 to form a digital recording
`signal S4 (hereinafter, "bit-stream") that is recorded on
`a compact disc read-only memory (CD-ROM} disc 20
`by means of a CD-ROM recorder 18. Auxiliary data
`signal 83 may comprise interactive data associated with
`the video or audio signals or some other type of digital
`data which may be independent of the audio or video
`data.
`'
`The average data rate of the bit-stream S4 is con-
`trolled by a selection of encoding parameters to equal-
`the standard CD«-ROM record/playback bit-rate of
`about 1.2 mega-bits per second. The parameters are
`selected, as will be explained, so as to enable recording
`of up to one hour of full-motion digitally encoded color
`video, multi-channel digital audio and auxiliary data on
`CD-ROM disc 20.
`The encoding of the digital full-motion color video
`portion of the recording signal to meet the relatively
`low channel capacity of the CD-ROM disc player re-
`quires very substantial data reduction. In a specific
`example to be described. this data reduction is on the
`order of about 150:1 for an exemplary video frame rate
`of 30 FPS (frames per second). To meet this critical
`requirement, while avoiding visible "artifacts" associ-
`ated with conventional video compression techniques,
`encoder 16 converts the video signal S2 to a color frame
`sequential component form and separately subjects each
`frame of each component
`to a number of specially
`adapted processes as will be described. Briefly listed,
`these include variable sub-sampling, variable irtter-
`frame and intra-frame compression employing what
`will herein be tenned "region-specific” encoding, area
`dependent adaptive quantization, "segmented" variable
`length coding, reverse frame sequence reformatting,
`padding and frame dithering.
`-
`The selection of the individual processes, the selec-
`tion of the share of data reduction provided by each and
`the selection of variable compression parameters (e.g.,
`thresholds, operating modes and, particularly, when to
`quit compressing) represents critical choices in meeting
`the objective of encoding full motion color video for
`storage on CD-ROM digital audio tape (DAT) or other
`bandwidth limited media. Such choices depend on more
`than merely the channel capacity of the CD-ROM me-
`dia. They depend as well on variables such as the video
`frame rate, the desired spatial resolution, certain spe-
`cific characteristics of the video image content and on
`parameters of the decoder that is ultimately used for
`reconstituting the image. As will be explained, each
`individual video frame is converted to a component
`fonn and each component is divided to form a number
`of blocks (herqafter “regions") of picture eletnents
`(“pixels"). Each region is then individually “custom"
`
`S
`
`IO
`
`15
`
`20
`
`25
`
`45
`
`65
`
`35
`
`
`
`5,225,904
`
`5
`encoded. This process is hereafter referred to as "re-
`gion-specific" coding. The coding for each region is
`selected from a group of codes to enable the video
`decoder in playback system 8 to meet the strict require-
`ment of completing all decoding tasks assigned to it in
`“real time", that is, within one video frame interval (a.
`variable).
`The foregoing and other aspects of recording system
`6 are discussed in detail with reference to FIGS. 2-47
`and 61, 62. Details of playback system 8 are discussed
`later with reference to FIGS. 48-63.
`
`Encoder 16. in FIG. 2. includes input terminals 202,
`204 and 206 for receiving audio signal S] from source
`10, video signal 52 from source 12 and auxiliary data
`signal 53 from source 14, respectively. As an overview
`of the audio processing, signal S] is subjected to chan-
`nel selection and analog-to-digital (A/D) conversion,
`compressed with provisions for preventing fran'fe-to-
`frame propogation of errors and stored for later recov-
`ery as blocks of audio data to be included in each video
`frame of bit stream 54 thereby providing audio/video
`synchronization.
`In detail, audio signal S1 is applied to a channel selec-
`tor and analog-to-digital (A/D) converter unit 208
`which includes operator controls (not shown) for se-
`lecting the number of channels to be encoded and the
`channel sampling rate. One channel
`is selected for
`monophonic recording. two for stereo. four for stereo/—
`bilingual. etc. The sampling rate currently used for high
`quality audio recording is 31.25 KHZ which supports a.
`15 KHZ audio bandwidth. The rate may be halved for
`standard quality or quartered for voice grade audio
`applications.
`The data rate of the digitized audio signal 55 is re-
`duced for recording by means of an adaptive differential
`pulse code modulation (ADPCM) encoder 210 which
`encodes the sample-to-sample differences of signal 55 to
`form a compressed digital audio signal S6. Since succes-
`sive audio samples are often highly correlated, fewer
`bits are required to encode the sample differences. The
`term “adaptive" means that the encoder is of a type that
`changes the bit significance of encoded differences as a
`function of the previous encoded difference so as to
`provide fine resolution over a wide dynamic range.
`Encoder 210 may be of conventional design but it is
`highly desirable for purposes of overall audio/video
`coding that provision be made either to bypass or reset
`it on a periodic basis so as to periodically encode an
`audio sample with full resolution. Illustratively, en-
`coder 21!} (FIG. 25) is reset once every 256 bytes. Re-
`call that the audio signal is ultimately organized in a
`block form with one block of audio data included with
`each block of video data in bit stream S4. The formation
`of audio data “bloclts" is supported via buffer store 212
`which stores signal S6. Later the formatter 250 recovers
`the stored signal (S?) periodically on a frame-by-frame
`basis when the audio and video data are combined as
`will be explained. Typical audio block sizes currently
`used are 130 and 134 bytes for a video frame rate of 30
`FPS and voice grade audio. The audio block size de-
`pends on the sampling rate, the number of audio chan-
`nels to be recorded, and audio dithering within the
`formatter 250.
`.
`One reason for periodically resetting or bypassing
`DI-‘CM encoder 210 is to prevent audio errors, which
`may occur in the CD-ROM transmission system, from
`propagating from frame—to-frame. This feature also
`facilitates subsequent editing of sequences to enable any
`
`6
`frame to be chosen as an edit point. This feature is im-
`plemented as, shown irt FIG. 25 by means of a compara-
`tor 2l4 which supplies a reset signal to reset input R of
`audio A DPCM encoder 211 when the byte count of the
`compressed audio signal 56 (produced by a byte
`counter 216) exceeds the byte limit set by a byte limit
`source 218.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`VIDEO CODING OVERVIEW
`
`The principal elements providing video encoding in
`FIG. 2 comprise a pre-compression processor 220, a
`digital video compressor 230 and an output signal for-
`matter 250 which are described herein in detail with
`
`reference to FIGS. 3-47. As an overview. processor 220
`provides conversion of video signal 52 to a non-stand-
`ard format that provides a variable amount of data re-
`duction, facilitates subsequent compression and contrib-
`utes to certain features of the system relating to variable
`frame-rate processing for controlling spatial-temporal
`resolution. Some images are converted at one frame rate
`for subsequent display at an entirely different rate.
`Compressor 230 employs, broadly speaking, four
`types of processing for reducing the quantity of digital
`data to encode a frame to a specific “optimum" value.
`This value is related to the CD-ROM channel capacity
`but varies as a function of several variables including
`the frame rate, the desired spatial-temporal resolution.
`and other factors relating to error propagation and
`visual appearance. The processing “types" include in-
`tra-frame region-specific coding for still frames and for
`the first frame of a motion video sequence. Inter-frame
`region-specific coding is used for the second and subse-
`quent frames of a motion video sequence. Encoded
`frames are subjected to further data reduction by two
`processes in compressor 230 which will be referred to
`herein as “area dependent adaptive quantization" and
`"segmented stream variable length coding". These pro-
`cesses are applied to each video frame to reach the
`desired "optimum” value noted above. Some sequences
`of frames may be repeatedly compressed with a change
`of compression thresholds to reach the optimum com-
`pression value.
`From time to time, an ‘‘impossible‘‘ frame may be
`encountered which is hopelessly oversized and can not
`be reduced to the desired byte count by altering com-
`pression parameters without_ introducing noticeable
`visual artifacts. Such oversized frames receive special
`treatment in formatter 250 which combines the audio,
`video, auxiliary (e-.g.,
`interactive) and other data to
`create the recording bit-stream signal S4. Specifically,
`formatter 250 analyzes frames backwards from the last
`frame to the first and “borrows" space from short
`frames to hold the extra data of the oversized frames.
`Other functions provided by formatter 250 include add-
`ing "padding" data to undersized frames and dithering
`the number of bytes of data per frame to arrive at a
`specific average frame rate selected to keep the CD-
`ROM system operating at its maximum channel capac-
`ity and to avoid pauses during playback. Pauses are
`avoided because the recovery time (the "seek mode
`latency") of a CD-ROM player can be lengthy and
`unpredictable.
`Details of video processing are discussed in the fol-
`lowing five sections entitled “Video Pre-Compression
`Processing", “Video Compression Processing“, "Post-
`Compression Processing“, "Playback System", and
`“Video Decoding".
`
`36
`
`
`
`'7
`
`5,225,904
`
`8
`coded frame or may more accurately encode the same
`number of pixels as at the higher frame rate (30 FPS).
`FIG. 14 shows a specific implementation of pre-com-
`pression processor 220 for providing the variable sub-
`sampling and format conversion functions previously
`described. Processor 220 comprises an RGB decoder
`1402 which converts the composite video signal
`to
`RGB component form. The RGB components are ap-
`plied via anti-aliasing (2 MHz} low-pass filters (1404,
`1406 and 1408) to inputs of a programmable graphics
`workstation 1410. A suitable workstation is the "Adage
`3000 Color Raster Display System". Operator control
`unit 222 of FIG. 2 comprises a terminal unit 222' (in
`FIG. 14) which supplies a "skip list“ of fields, lines and
`pixels to workstation 1410 as well as anti-alias filter
`coefficients and sample rate control data. Data reduced
`sub-frames of Y, I and Q samples are produced by the
`work station and stored in a disc store 1412.
`FIG. 15 is a block diagram illustrating the specific
`programmed configuration of workstation 1410 for use
`in processing video signal S2 to create the non-standard
`sub-frame signal format of FIG. 5. The anti—alias filtered
`analog RGB signals provided by filters 1404-1408 are
`applied to respective
`analog-to-digital
`converters
`1502-1506 which digitize the signals at a rate selected to
`provide 512 pixels per active line interval as controlled
`by terminal 222’ coupled to the workstation timing and
`control unit 1530. The digitized RGB signals (FIG. 4)
`are sub-sampled by two banks of switches 1510 and
`1514. Switches 1510 are timed by unit 1530 to skip alter-
`nate fields'of the RGB signals. Switches 1514 skip alter-
`nate pixels, so that the resultant digitized and sub-sam-
`pled RGB signals each comprise arrays of 256x240
`pixels per frame.
`A matrix 1516 converts the sub-sampled RGB signals
`to YIQ form. The I and Q colo