(12) Patent Application Publication
     Snow et al.
(10) Pub. No.: US 2003/0091338 A1
(43) Pub. Date: May 15, 2003

(54) METHOD AND APPARATUS FOR EXTRACTING DIGITAL DATA FROM A MEDIUM

(76) Inventors: Kevin Snow, Mountain View, CA (US);
     David Clifford, San Jose, CA (US)

     Correspondence Address:
     Thomas C. Webster
     BLAKELY, SOKOLOFF, TAYLOR & ZAFMAN LLP
     Seventh Floor
     12400 Wilshire Boulevard
     Los Angeles, CA 90025-1026 (US)

(21) Appl. No.: 09/991,088
(22) Filed: Nov. 13, 2001

Publication Classification

(51) Int. Cl.⁷: H04N 5/76; H04N 5/781
(52) U.S. Cl.: 386/96; 386/98

(57) ABSTRACT
A computer-implemented method and apparatus for extracting digital data from a medium. Digital audio extraction techniques are implemented with additional features that improve the quality of the resulting audio playback files. In one embodiment of the invention, a user can extract digital audio data from a source medium and store the data as a file or, alternatively, stream the data into a memory. The file/stream can then be analyzed to determine the precise locations, in time, at which the sound levels represented by the data cross a specified threshold, particularly at locations near and between track edges. This information can be stored for reference. Subsequently, the file/stream can be accurately divided into smaller segments wherein each segment contains one or more complete and distinct tracks, and the data representing sound levels below the specified threshold can be excluded from the resulting segments. The segments may then be encoded and/or further divided into standard playback files. Another embodiment of the invention includes an audio player. The player can play the playback files, in their original order, and can automatically include sections of silence between tracks, where required, and will refrain from including sections of silence or pauses between tracks that are intended to be played end-to-end.

[Representative drawing: storage and playback apparatus 705 with storage device 730 and user display 760 showing Title: Abbey Road; Year: 1968; Artist: Beatles; Track: 1 "Come Together"]

Amazon v. Audio Pod, US Patent 9,319,720, Amazon EX-1080

Patent Application Publication   May 15, 2003   Sheet 1 of 8   US 2003/0091338 A1

[Drawing sheet 1 of 8: figure content not recoverable from the extracted text]

Patent Application Publication   May 15, 2003   Sheet 2 of 8   US 2003/0091338 A1

[Drawing sheet 2 of 8: figure content not recoverable from the extracted text]

Patent Application Publication   May 15, 2003   Sheet 3 of 8   US 2003/0091338 A1

[FIG. 3 (PRIOR ART): audio track 310, running from time 302 to time 304, undergoes data extraction to produce PCM data (files 320, 330)]

Patent Application Publication   May 15, 2003   Sheet 4 of 8   US 2003/0091338 A1

[Drawing sheet 4 of 8: figure content not recoverable from the extracted text]

Patent Application Publication   May 15, 2003   Sheet 5 of 8   US 2003/0091338 A1

[FIG. 5: flow diagram
 510: Raw audio data is read from source medium
 520: Raw audio data is stored in file or is streamed from medium (TOC data 524; raw data file or stream 420)
 530: Raw audio data is analyzed to determine sound levels (index 535)
 540: Raw audio data is divided into segments
 550: Segments are encoded
 560: Segments are spliced into discrete tracks]

Patent Application Publication   May 15, 2003   Sheet 6 of 8   US 2003/0091338 A1

[FIG. 6: flow diagram
 610: Raw audio data is read from source medium
 620: Raw audio data is stored in file or is streamed from source medium
 630: Raw audio data is analyzed to determine sound levels (index file 535)
 640: Raw audio data is encoded (data file 645)
 650: Portions of encoded audio data are played back with reference to index file]

Patent Application Publication   May 15, 2003   Sheet 7 of 8   US 2003/0091338 A1

[Drawing sheet 7 of 8: figure content not recoverable from the extracted text]

Patent Application Publication   May 15, 2003   Sheet 8 of 8   US 2003/0091338 A1

[Drawing sheet 8 of 8: figure content not recoverable from the extracted text]

METHOD AND APPARATUS FOR EXTRACTING DIGITAL DATA FROM A MEDIUM

BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The invention relates generally to the field of data extraction and encoding, and more particularly to an improved system and method of extracting digital audio data from a medium to one or more playable files.
[0003] 2. Background of the Invention
[0004] Digital Audio Extraction (“DAE”), also known generally as “ripping,” is the process of copying a track from an audio disc, usually music, to a hard drive or other storage medium by creating a file (or group of files) in any number of encoded and/or compressed formats (e.g., WAV, MP3, etc.). A wide variety of software packages that utilize DAE are now available, and the average computer user can easily “rip” any number of tracks from a CD collection to one or more files on a computer hard drive. Subsequently, these tracks can be played back with software designed to read and play extracted audio files.
[0005] Although ripping has become a common practice for many computer users, high quality audio extraction can be difficult because of the complexities inherent in the way data are stored on audio discs. Audio CD data are organized into sectors in order to ensure a constant read rate. Each sector consists of 2,352 bytes of sound data along with synchronization, error correction, and control/display bits. These sectors are further broken down into sound samples. Each sector contains 588 samples of sound for each of two stereo channels, and each sample contains two bytes (16 bits) of sound data. The standard sampling rate of CD players is 44,100 samples per second.
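The sector figures quoted in this paragraph are mutually consistent, which a short calculation can confirm. The following Python sketch is illustrative only; the constant and function names are assumptions introduced here and do not appear in the application.

```python
# Figures taken from the paragraph above; all names are hypothetical.
SAMPLES_PER_SECTOR = 588   # samples per channel in one sector
CHANNELS = 2               # stereo
BYTES_PER_SAMPLE = 2       # 16 bits
SAMPLE_RATE = 44_100       # samples per second, per channel


def sector_bytes() -> int:
    """Bytes of sound data carried by one sector."""
    return SAMPLES_PER_SECTOR * CHANNELS * BYTES_PER_SAMPLE


def sectors_per_second() -> int:
    """Number of sectors consumed by one second of playback."""
    return SAMPLE_RATE // SAMPLES_PER_SECTOR


print(sector_bytes())        # 2352
print(sectors_per_second())  # 75
```

The second result is why sector positions are expressed in units of 1/75th of a second.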
[0006] Sectors are not arranged in distinct physical units. Instead, the data in one sector are interleaved with data in other sectors so that a defect in the disc will not destroy a single sector beyond correction. In addition, each track's location, or address, is recorded in the disc's Table of Contents (“TOC”), which is stored in the “lead in” area of every disc. Accordingly, an audio disc's TOC, much like a book's, is a good resource for determining where tracks begin and end. The TOC indicates the minute, second, and sector (to 1/75th of a second) at which each track begins.
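As an illustration of the minute/second/sector addressing described above, the following Python sketch converts a TOC-style address into a sector count and a playback time. The function names are hypothetical, and the sketch deliberately ignores any fixed offset that a particular drive or command set may apply to such addresses.

```python
SECTORS_PER_SECOND = 75  # one sector is 1/75th of a second of audio


def msf_to_sector(minute: int, second: int, sector: int) -> int:
    """Convert a (minute, second, sector) address to a sector count."""
    if minute < 0 or not 0 <= second < 60 or not 0 <= sector < SECTORS_PER_SECOND:
        raise ValueError("address out of range")
    return (minute * 60 + second) * SECTORS_PER_SECOND + sector


def sector_to_seconds(sector_count: int) -> float:
    """Playback time, in seconds, corresponding to a sector count."""
    return sector_count / SECTORS_PER_SECOND


# A track starting at 3 minutes, 25 seconds, sector 40:
start = msf_to_sector(3, 25, 40)
print(start)  # 15415
```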
[0007] Extraction of audio/video content from a compact disk to a hard disk using current DAE software can be a difficult task. Every byte of a 2,352-byte sector of audio data is used strictly for audio. Essentially, no header exists; there is no information in the sector that allows for the exact positioning of a read head over a specific sector. To address an audio sector, a CD-ROM drive uses the TOC data to approximate how far out along the CD it must scan in order to find the beginning of a specified track. Drives typically reach an audio address that is within ± four sector addresses of the address being sought (±4/75ths of a second in playback time), and a read request may return any one of the nine sectors. This inexact positioning may cause undesired clicks and pops, commonly referred to as “jitter,” in extracted audio files.
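The positioning uncertainty described above can be made concrete with a small sketch. The Python below is a hedged illustration, not drive firmware: the function names and the default tolerance parameter are assumptions, with the tolerance of four sectors taken from the paragraph above.

```python
from typing import List

SECTORS_PER_SECOND = 75


def candidate_sectors(target: int, tolerance: int = 4) -> List[int]:
    """Sectors a read request might actually return when seeking `target`."""
    return list(range(target - tolerance, target + tolerance + 1))


def timing_uncertainty_seconds(tolerance: int = 4) -> float:
    """Worst-case positioning error expressed as playback time."""
    return tolerance / SECTORS_PER_SECOND


print(len(candidate_sectors(15415)))  # 9
```

With a tolerance of four sectors in either direction, a request for one sector can land on any of nine, an error of up to 4/75ths of a second.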
[0008] Graph 110 of FIG. 1 is a plot (not to scale) of the audio level (e.g., audio volume, audio intensity, audio amplitude, etc.) of an audio recording 120 over time (horizontal axis). Track divisions 130 represent where tracks (e.g., songs, of audio recording 120) begin and end. Threshold 140 represents a predetermined level threshold and lines 150 represent the points at which the sound level of audio recording 120 rises above or drops below threshold 140. For example, the level of audio recording 120 is below threshold 140 during time lapses 152 and 154. Time lapses 152 and 154 represent the dead silence that may exist at the beginnings and ends of songs on a CD, respectively. Lines 160 represent the points at which the sound level of audio recording 120 significantly drops but does not drop below threshold 140. For example, the level of audio recording 120 drops significantly during time lapse 164. Time lapse 164 represents a lull in the level of audio recording 120 that may occur between tracks, such as clapping in between songs of a live album. Finally, it is shown that a lull in the level of audio recording 120 does not exist in between tracks 3 and 4. This is an example of two tracks that blend into each other during playback without any lull in sound level.
[0009] Current DAE software can be used to extract audio recording 120, and FIG. 2 shows how current DAE programs function. A current DAE program will extract each track of audio recording 120 separately and create a pulse-code-modulation (PCM) file for each track (PCM files 201-204). These PCM files can eventually be converted to encoded file formats (encoded files 211-214) that may be read for playback of audio recording 120. These encoded file formats may be uncompressed or compressed (e.g., via MP3 or WAV file formats).
[0010] One disadvantage to current extraction techniques is that the software extracts each track from the source CD separately. First the software will read the CD TOC to determine the locations of the tracks to be extracted. Then each track will be extracted from a beginning point that may or may not be where the track actually starts and will end extraction at a point that may or may not be where the track actually ends. Again, the read head's accuracy in finding sector addresses is low, and it can only approximately find the start of a track. Given these uncertainties, one or more sectors of a track may be lost during extraction, or one or more sectors may be unintentionally added. For example, FIG. 3 illustrates some of the drawbacks of using DAE techniques currently known in the art. Audio track 310, beginning at time 302 and ending at time 304, can be extracted from an audio CD. The resulting PCM (pulse-code modulation) data file may contain missing sectors (e.g., PCM file 320), extra sectors (e.g., PCM file 330), or both missing and extra sectors (e.g., PCM file 340) at either or both ends of the file. This problem is further exacerbated when a file contains extra sectors that overlap with sectors contained in a consecutive track (e.g., overlapping sector 360 of PCM file 350). Overlapping sectors will cause increased jitter when extracted tracks are played back in their original order. Jitter is particularly noticeable between extracted tracks that are intended to blend into each other during playback (e.g., a segue, a house mix, or a live recording). These types of recordings may not have the same dead silence between tracks that typical multi-track recordings have.
[0011] The problems described above are caused because current DAE programs do not analyze the bridges between tracks to determine if there exists dead silence or just a lull in the sound, as in a live recording. Instead, current programs simply add a small amount of silence between extracted tracks during playback even though that silence may be undesirable for certain track sets. Finally, if there is some noticeable sound between tracks, there is a clear loss of sound quality during playback because current DAE techniques cannot adequately compensate for jitter.

SUMMARY OF THE INVENTION
[0012] A computer-implemented method of extracting digital audio data is described comprising: reading digital audio data from a medium; analyzing the digital audio data to determine when the levels of audio sound contained therein cross specified thresholds over time; and dividing the digital audio data into segments that each contain one or more complete tracks and exclude data representing sound levels below a specified threshold.

BRIEF DESCRIPTION OF THE DRAWINGS
[0013] A better understanding of the present invention can be obtained from the following detailed description in conjunction with the following drawings, in which:
[0014] FIG. 1 illustrates a graph of an exemplary audio recording including sound levels plotted over time.
[0015] FIG. 2 illustrates a method by which digital audio extraction is used in the current art.
[0016] FIG. 3 illustrates some of the drawbacks in using current digital audio extraction techniques including some of the causes of jitter.
[0017] FIG. 4 illustrates one embodiment of the digital audio data extraction process, wherein audio data is extracted, segmented, encoded and sliced.
[0018] FIG. 5 illustrates a flow diagram of one embodiment of the digital audio data extraction process.
[0019] FIG. 6 illustrates a flow diagram of another embodiment of the digital audio data extraction process.
[0020] FIG. 7 illustrates a CD/DVD storage and playback system as it is used in conjunction with one embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION
[0021] In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form to avoid obscuring the underlying principles of the present invention.
[0022] The process of digital audio extraction first requires a source medium that contains digital audio data. The source medium can be one of a variety of plastic disc forms (e.g., CDs, MiniDiscs, DVDs) or magnetic disk forms (e.g., floppy or hard disks). The audio data contained on the source medium is arranged in a format, e.g., pulse code modulation (PCM) format, that can be read by a standard audio player such as a CD player. In addition, the audio data is typically divided into tracks, where each track represents a distinct block of audio data such as a song. Source media can contain a multitude of tracks and additional data that indicate the start of each track relative to the data in minutes, seconds, and sectors. For example, the Table of Contents (“TOC”) of a CD contains data that indicate to a CD player the points at which CD tracks begin.
[0023] FIG. 4 shows graph 110 which is also depicted in FIGS. 1 and 2. Again, graph 110 is a plot (not to scale) of the sound level (vertical axis) of audio recording 120 over time (horizontal axis). Track divisions 130 represent where tracks, e.g., songs, of audio recording 120 begin and end. Threshold 140 represents a predetermined level threshold and lines 150 represent the points at which the sound level of audio recording 120 rises above or drops below threshold 140. For example, the level of audio recording 120 is below threshold 140 during time lapses 440, 442, and 448. Time lapses 440, 442, and 448 represent the dead silence that may exist at the beginnings and ends of songs on a CD. Lines 160 represent the points at which the sound level of audio recording 120 significantly drops but does not drop below threshold 140. For example, the level of audio recording 120 drops significantly during time lapse 444. Time lapse 444 represents a lull in the level of audio recording 120 in between tracks, such as clapping in between songs of a live album. Finally, it is shown that a lull in the level of audio recording 120 does not exist in between tracks 3 and 4. This is an example of two tracks that blend into each other during playback without any lull in sound level.
[0024] FIG. 5 shows a flow diagram of the steps taken by one embodiment of the invention to extract, analyze, and store digital audio data. It will be appreciated by those skilled in the art that the steps diagrammed in both FIGS. 5 and 6 may be followed in the indicated order or in a different order while achieving many of the same benefits of the present invention. In addition, certain steps may be skipped and/or alternate steps may be employed while complying with the underlying principles of the invention.
[0025] At block 510 the raw audio data contained on the source medium is read by a computer device such as a CD-ROM drive, and at block 520 the raw data is stored as raw file 420 in an addressable memory space such as a hard drive or RAM. In one embodiment, rather than storing the raw data as a “file” at 420, the raw data may be streamed from the CD to an application which operates on the data stream as described herein. If a CD is being ripped, this file/stream 420 contains PCM data that includes audio data and possibly additional data (e.g., TOC data) used to address specific locations within the PCM data. It will be understood that the TOC data read from the source medium can be stored as part of raw file/stream 420 or as a separate file (e.g., TOC file 524).
[0026] Audio recording 120 of FIG. 4 is an exemplary plot of the sound levels produced when the audio data of raw file/stream 420 are played. At block 530 of FIG. 5, raw file/stream 420 is analyzed to determine the points at which the sound level of audio recording 120 drops below threshold 140. Threshold 140 may be set at a sound level comparable to the threshold of human hearing. In one embodiment, each point analyzed is equivalent to a sound sample (e.g., two bytes of data for each channel), which will represent some amplitude of sound. One embodiment of the invention locates these points using the TOC data stored in raw file/stream 420. The TOC data will indicate at which minute, second, and sector each track of the raw audio data (or other types of multimedia data) begins. For example, track edge 430 includes an address representing where track two begins. Once these locations are known, samples of audio data located at predetermined distances behind and/or ahead of each track edge, relative to time, are analyzed to determine if the indicated sound levels are above or below threshold 140. For example, it will be determined that the data sectors located across time lapse 442 contain sound levels below threshold 140. The data sectors marking the locations where the indicated sound levels cross threshold 140 may be stored in an index file, e.g., index file 535. In addition, if there exist no data sectors that indicate sound levels below threshold 140 in the vicinity of a track edge, then that information is stored as well.
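One way to picture the analysis at block 530 is the following Python sketch. It is a simplified illustration under stated assumptions: samples are taken to be a single channel of signed 16-bit integers, the "sound level" of a sample is taken to be its absolute amplitude, and the function names, the `window` parameter, and the half-open span convention are all hypothetical rather than drawn from the application.

```python
from typing import List, Optional, Sequence, Tuple


def threshold_crossings(samples: Sequence[int], threshold: int) -> List[int]:
    """Indices at which the level moves from one side of the threshold to the other."""
    crossings = []
    for i in range(1, len(samples)):
        prev_loud = abs(samples[i - 1]) >= threshold
        curr_loud = abs(samples[i]) >= threshold
        if prev_loud != curr_loud:
            crossings.append(i)
    return crossings


def silent_span_near_edge(samples: Sequence[int], edge: int, window: int,
                          threshold: int) -> Optional[Tuple[int, int]]:
    """Return the half-open (start, end) run of below-threshold samples
    nearest the track edge, or None if none lies within `window` samples."""
    lo, hi = max(0, edge - window), min(len(samples), edge + window)
    quiet = [i for i in range(lo, hi) if abs(samples[i]) < threshold]
    if not quiet:
        return None  # no silence near this edge; this fact is recorded too
    start = end = min(quiet, key=lambda i: abs(i - edge))
    while start > 0 and abs(samples[start - 1]) < threshold:
        start -= 1
    while end < len(samples) and abs(samples[end]) < threshold:
        end += 1
    return (start, end)


# Toy data: loud audio, four silent samples, loud audio; track edge at index 7.
pcm = [900] * 5 + [0] * 4 + [800] * 5
print(threshold_crossings(pcm, 50))          # [5, 9]
print(silent_span_near_edge(pcm, 7, 3, 50))  # (5, 9)
```

An index file in the sense of the paragraph above could then hold, for each track edge, either the returned span or a marker that no silence was found.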
[0027] Once the raw file/stream 420 (or the streamed content) has been completely analyzed, moving on to block 540, one embodiment of the invention divides raw file/stream 420 into smaller chunks or segments of data. This division will occur at every track edge except at those edges where sound levels are maintained above threshold 140. For example, a segment division will occur at track edge 430 but not at track edge 432 or 434. This is because the sound levels represented by time lapse 442 drop below threshold 140 and the sound levels represented by time lapses 444 and 446 are maintained above threshold 140. Sections of audio data that contain sound levels below threshold 140 will be referred to herein as “silent sections.” During or subsequent to the division of raw file/stream 420 (see block 540), the silent sections, represented by time lapses 440, 442 and 448, will be excluded from the resulting segments. Accordingly, segments 450 and 455 are generated and their corresponding shorter lengths are shown in FIG. 4. In one embodiment, segments 450 and 455 are merely two smaller files similar or identical in format to raw file/stream 420. In addition, while segment 450 contains only track 1, segment 455 contains tracks 2, 3 and 4. It will be appreciated by one skilled in the art that excluding sections of data that do not benefit the end user, e.g., silent sections, will be advantageous for a variety of reasons. For example, there will be less data to encode, the resulting files will be smaller in size, and the multimedia content stored within the files will start and end more precisely.
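The division at block 540 can be sketched as taking the complement of the silent sections. The Python below is a minimal illustration, assuming the silent sections are already known as half-open sample ranges; the function names and all numeric positions are hypothetical and chosen only to mirror the arrangement described for FIG. 4.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # half-open (start, end) in samples


def segments_excluding_silence(total_samples: int,
                               silent_spans: Sequence[Span]) -> List[Span]:
    """Return the audible segments left after removing the silent sections."""
    segments, cursor = [], 0
    for start, end in sorted(silent_spans):
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_samples:
        segments.append((cursor, total_samples))
    return segments


def tracks_in_segment(segment: Span, track_starts: Sequence[int],
                      total_samples: int) -> List[int]:
    """1-based numbers of the tracks whose audio overlaps the segment."""
    seg_start, seg_end = segment
    bounds = list(track_starts) + [total_samples]
    return [n + 1 for n in range(len(track_starts))
            if bounds[n] < seg_end and bounds[n + 1] > seg_start]


# Hypothetical layout: four tracks, silence at the start, at the first
# track edge, and at the end; no silence at the other two edges.
total = 1000
starts = [0, 250, 500, 750]
silence = [(0, 20), (240, 260), (980, 1000)]

for seg in segments_excluding_silence(total, silence):
    print(seg, tracks_in_segment(seg, starts, total))
# (20, 240) [1]
# (260, 980) [2, 3, 4]
```

The first segment holds a single track and the second holds three contiguous tracks, since no silent section separates them.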
[0028] Depending on the number of tracks and silent sections contained in raw file/stream 420, a multitude of segments may result after the division at block 540 of FIG. 5. Again, each of the segments can contain one or more tracks, and if one segment contains more than one track the sound levels produced at the track edges are high enough to be heard by an end user. At block 550, segments 450 and 455 are encoded to a data format that may be read by a standard digital audio player. For example, segments 450 and 455 may be encoded to a WAV file format or they may be encoded and compressed to an MP3 file format such that a standard MP3 player may read and play them. It will be understood that the encoding step of block 550 may occur before the segmenting step of block 540 or before the analyzing step of block 530 without losing the inventive features described herein. In addition, it will be appreciated that any number of segments produced at block 540 may be encoded and a variety of encoded formats may result following block 550. As shown in FIG. 4, segment 450 is encoded to playback file 460 and segment 455 is encoded to multi-track playback file 465. It will be understood that the size of these files will be different after encoding and/or compression takes place, but the resulting length of each segment, in time, will be preserved during playback. Either playback file 460 or 465 may be played by a standard digital audio player; however, a user may desire to store one of the three tracks contained on multi-track playback file 465 separately or insert one of the three tracks into a different arrangement of tracks. Therefore, at block 560, multi-track playback files are split such that a separate file is created for each distinct track.
[0029] One embodiment of the invention contains a file slicer that can divide one or more encoded files into their separate tracks. The file slicer can separate two or more tracks from a multi-track playback file such that no pertinent data are lost. This is accomplished with the help of the TOC data found in either TOC file 524 or raw file/stream 420, depending on where the TOC data were stored at block 520. The slicer uses the TOC data to find the precise sector(s) at which each track of playback file 465 begins. Then the slicer divides the playback file into smaller playback files, wherein each smaller playback file will contain a complete and distinct track. The slicer extracts all the corresponding groups of sectors of data that are representative of each track and then stores each extracted group as a smaller playback file. Therefore, no sectors are lost. For example, playback file 465 of FIG. 4 can be sliced into smaller playback files 472, 474 and 476. Each of the smaller playback files contains a complete and distinct track. These smaller playback files may be played separately or in sequence without any noticeable loss of sound quality during the time a player is switching between tracks.
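The slicing step can be illustrated with the following Python sketch. For simplicity it operates on unencoded PCM bytes addressed by sector rather than on an encoded playback file, which is an assumption made here; the function name, its parameters, and the sector numbers in the example are likewise hypothetical.

```python
from typing import List, Sequence

SECTOR_BYTES = 2352  # bytes of sound data per sector


def slice_tracks(segment: bytes, segment_first_sector: int,
                 track_start_sectors: Sequence[int]) -> List[bytes]:
    """Cut a multi-track segment at every track start that falls inside it."""
    n_sectors = len(segment) // SECTOR_BYTES
    cuts = [s - segment_first_sector for s in sorted(track_start_sectors)
            if 0 < s - segment_first_sector < n_sectors]
    bounds = [0] + cuts + [n_sectors]
    return [segment[a * SECTOR_BYTES:b * SECTOR_BYTES]
            for a, b in zip(bounds, bounds[1:])]


# A ten-sector segment beginning at sector 100, with tracks starting at
# sectors 104 and 107 (sector 50 lies outside the segment and is ignored).
data = bytes(10 * SECTOR_BYTES)
pieces = slice_tracks(data, 100, [50, 104, 107])
print([len(p) // SECTOR_BYTES for p in pieces])  # [4, 3, 3]
print(b"".join(pieces) == data)                  # True
```

Because the pieces are contiguous and cover the whole segment, concatenating them reproduces the original data, which corresponds to the statement that no sectors are lost.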
[0030] It will be understood that the track addresses indicated by the TOC data are only compatible with the raw audio data stored in PCM format. Once encoding occurs the format of the audio data will change and the number and arrangement of sectors in the encoded data file may also change. However, one embodiment of the invention will anticipate this change and convert the address given by the TOC data to an address that is compatible with the encoded data format.
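The application does not specify how the address conversion is performed. As one possible illustration, the Python sketch below maps a sector address to a frame index in an MP3 stream, assuming MPEG-1 Layer III frames of 1,152 samples and ignoring encoder delay and padding. The function name and parameters are hypothetical.

```python
from typing import Tuple

SAMPLES_PER_SECTOR = 588      # per channel, per CD sector
SAMPLES_PER_MP3_FRAME = 1152  # per channel, MPEG-1 Layer III


def sector_to_encoded_position(track_sector: int,
                               segment_first_sector: int) -> Tuple[int, int]:
    """Map a sector address to (frame index, sample offset within frame),
    measured from the start of the encoded segment."""
    sample_offset = (track_sector - segment_first_sector) * SAMPLES_PER_SECTOR
    return divmod(sample_offset, SAMPLES_PER_MP3_FRAME)


print(sector_to_encoded_position(1096, 1000))  # (49, 0)
print(sector_to_encoded_position(1010, 1000))  # (5, 120)
```

The second example shows that a track start need not coincide with a frame boundary, so a converted address generally needs both a frame index and an offset within that frame.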
[0031] One embodiment of the present invention also includes an audio player. This audio player is capable of reading a variety of encoded file formats including those uncompressed and compressed. The audio player can play back the files generated by block 550 or 560 of the extraction process. One advantage of the audio player is that it can recognize tracks that are generated from a multi-track segment. For example, playback files 472, 474 and 476 are three separate tracks originating from segment 455. These tracks are unique because no sections of silence exist in between any two contiguous tracks. This is common for audio recordings that contain tracks that blend into each other, without any pauses of silence, or that come from a live album where a listener can still hear the sounds of the band and audience between songs. Therefore, it is advantageous to play back these tracks in their original order without any undesired silence or jitter. With reference to index file 535, the audio player can determine if two or more tracks are intended to be played end-to-end without a pause of silence, and, if so, the player moves from playing one track to the next without a pause or noticeable loss of sound quality. Furthermore, if a pause of silence is meant to exist between two contiguous tracks (e.g., between tracks 1 and 2 of FIG. 4), the player can reference index file 535 to determine exactly how long the pause of silence was in the original recording and can reinsert that amount of silence in between the tracks currently being played.
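The player's use of the index can be sketched as building a playback plan. In the Python below, the index is assumed to reduce to one silence length (in samples) per track boundary, with zero meaning the tracks are to be played end-to-end; this representation, the function name, and the file names are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

SAMPLE_RATE = 44_100  # samples per second

Step = Tuple[str, Union[str, float]]


def playback_plan(track_files: Sequence[str],
                  gap_samples: Sequence[int]) -> List[Step]:
    """Interleave tracks with the silences recorded for each boundary.

    gap_samples[i] is the original silence between track i and track i + 1;
    a value of 0 means the two tracks are played without any pause."""
    plan: List[Step] = []
    for i, name in enumerate(track_files):
        plan.append(("play", name))
        if i < len(track_files) - 1 and gap_samples[i] > 0:
            plan.append(("silence", gap_samples[i] / SAMPLE_RATE))
    return plan


# Two seconds of silence after the first track; the rest play end-to-end.
for step in playback_plan(["t1.mp3", "t2.mp3", "t3.mp3", "t4.mp3"],
                          [88_200, 0, 0]):
    print(step)
# ('play', 't1.mp3')
# ('silence', 2.0)
# ('play', 't2.mp3')
# ('play', 't3.mp3')
# ('play', 't4.mp3')
```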
[0032] As seen in FIG. 6, a variation of the audio data extraction process described above may be executed while achieving the same benefits of the present invention. The steps represented by blocks 610, 620 and 630 are identical to those of blocks 510, 520 and 530, respectively. However, at block 640, the raw audio data is not broken up into segments, but is rather encoded into a large playback file, e.g., playback file 645. It will be understood that playback file 645 may contain all or less than all of the tracks found on raw data file 420. A player, in another embodiment of the invention, may play back any number of tracks from playback file 645, in any sequence, and can use the information stored in index file 535 to determine the exact positions to begin and end playback such that there is no noticeable loss in sound quality.
[0033] In FIG. 7 a CD/DVD storage and playback system 700 is illustrated in another embodiment of the invention. The system 700 includes a storage and playback apparatus 705 which communicates over a network 740 to one or more servers 750. The steps represented by FIG. 5 or FIG. 6 may be embodied in program code stored on storage device 730 of playback apparatus 705 and may be loaded into memory 715 and executed by a microprocessor 710 contained therein. Playback apparatus 705 may be a stand-alone terminal or a client terminal connected with a network, e.g., network 740, further connected with one or more servers, e.g., server 750, via network interface card 720.
[0034] In one embodiment, a user loads each of his or her CDs into playback apparatus 705 and all of the digital audio data contained on each CD is extracted and stored into separate raw data files on storage device 730. Each raw data file may be segmented, encoded, and sliced given the methods described above. Ultimately, playback files containing individual tracks are generated, and the user may play back such files on the playback apparatus 705. The playback apparatus 705 may also include user display 760 which can display information relevant to a selected track. For example, user display 760 may display the artist name, album title, song title, track number (relative to a track set) and album release date. It will be appreciated that this CD-related information may be retrieved from data stored on the CD, from manual user input, or from a database found on a server 750.
[0035] In one embodiment, the techniques described herein may be implemented on a specialized multi-CD ripper apparatus such as that described in co-pending application entitled “MULTIMEDIA TRANSFER SYSTEM” (Ser. No. 09/717,458) which is assigned to the assignee of the present application and which is incorporated herein by reference. For example, using this system, content from multiple CDs may be concurrently processed and stored on a user's playback apparatus 705 in the manner described above.
[0036] One specific embodiment of a system for reading, slicing, encoding and splitting multimedia content as described herein is illustrated in FIG. 8. The system is illustrated as a plurality of modules which may be embodied in hardware, software, or any combination thereof. As illustrated, the multimedia content 810 is initially streamed from a CD 820 (or other medium) into memory by a media ripper module 830. The sound levels of the raw streamed content are then analyzed by a wave slicer module 840 which separates the raw stream data into segments based on its analysis (e.g., using the techniques described above). Index data, TOC data and/or other types of multimedia-related data may also be processed by the media ripper module 830 and the wave slicer module 840 during the foregoing processing stages. One or more multimedia encoder modules 850 then encode each of the multimedia segments using a particular encoding algorithm (e.g., MP3, AC-3, etc., for audio; MPEG-2, MPEG-4, RealVideo 8, etc., for video). Once encoded, each individual segment may then be logically divided by a splitter module 860 as described above. For example, if the raw multimedia data is audio data, the splitter module 860 may divide the encoded audio segments into a plurality of distinct MP3 files. As previously mentioned, audio encoding may occur before or after audio segment splitting, depending on the embodiment.
[0037] Embodiments of the present invention include various steps, which were described above. The steps may be embodied in machine-executable instructions. The instructions can be used to cause a general-purpose or special-purpose processor to perform certain steps. Alternatively, these steps may be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.
[0038] Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, propagation media or other type of media/machine-readable medium suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).
[0039] Throughout the foregoing description, for the purposes of explanation, numerous specific details were set forth in order to provide a thorough understanding of the present system. It will be apparent, however, to one skilled in the art, that the system and method may be practiced without some of these specific details. For example, while the techniques described above were employed in the context of ripping audio from CDs, the same techniques may be employed using a variety of different media (e.g., DVD audio/video). Accordingly, the scope and spirit of the invention should be judged in terms of the claims which follow.
What is claimed is:
1. A computer-implemented method comprising:
reading digital audio data from a source medium, the digital audio data divided into a plurality of tracks, each representing a discrete chunk of digital audio data and having sound levels varying as a function of a variable;
analyzing the digital audio data to determine the levels of sound as a function of the variable and determining one or more precise values of the variable at which the level of sound crosses a specified threshold; and
dividing the file into one or more segments, each segment containing one or more tracks and each segment excluding the digital audio data having levels of sound below the specified threshold.
2. The method of claim 1 wherein the source medium
contains Tabl