`(12) United States Patent
`Wang et al.
`
`(10) Patent No.: US 7,704,687 B2
`(45) Date of Patent: Apr. 27, 2010
`
`(54) DIGITAL KARYOTYPING
`
`(75) Inventors: Tian-Li Wang, Baltimore, MD (US);
`Victor Velculescu, Dayton, MD (US);
`Kenneth Kinzler, Bel Air, MD (US);
`Bert Vogelstein, Baltimore, MD (US)
`
`(73) Assignee: The Johns Hopkins University,
`Baltimore, MD (US)
`
`( * ) Notice: Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 1063 days.
`
`(21) Appl. No.: 10/705,874
`
`(22) Filed: Nov. 13, 2003
`
`(65) Prior Publication Data
`US 2004/0096892 A1    May 20, 2004
`
`6,498,013 B1       12/2002  Velculescu et al.
`2002/0048767 A1 *   4/2002  Bensimon et al. .............. 435/6
`2002/0147549 A1 *  10/2002  Yoshida et al. ............... 702/20
`2003/0124584 A1 *   7/2003  Mohammed ..................... 435/6
`2003/0186251 A1    10/2003  Dunn et al.
`2004/0219580 A1    11/2004  Dunn et al.
`
`FOREIGN PATENT DOCUMENTS
`
`WO    WO 0202805 A2 *   1/2002
`
`OTHER PUBLICATIONS
`
`New England Biolabs Technical online reference (www.neb.com/
`nebecomm/tech_reference/restriction_enzymes/overview.asp) pp.
`1-4, 2007.*
`
`Dunn, J. et al., "Genomic Signature Tags (GSTs): A System for
`Profiling Genomic DNA," Genome Research, (2002) 12:1756-
`1765.
`Wang, T. et al., "Digital karyotyping," PNAS, (Dec. 10, 2002), vol. 99,
`No. 25, pp. 16156-16161.
`Kallioniemi, A. et al., "Comparative Genomic Hybridization for
`Molecular Cytogenetic Analysis of Solid Tumors," Science, vol. 258
`(Oct. 30, 1992), pp. 818-821.
`
`Related U.S. Application Data
`(60) Provisional application No. 60/426,406, filed on Nov. 15, 2002.
`
`* cited by examiner
`
`Primary Examiner: Nancy Vogel
`Assistant Examiner: Catherine Hibbert
`
`(51) Int. Cl.
`C12Q 1/68     (2006.01)
`G06F 19/00    (2006.01)
`(388281)
`23511;]53%;
`C12P 19/34    (2006.01)
`(52) U.S. Cl. ............... 435/6; 702/19; 702/20; 435/91.2
`(58) Field of Classification Search ............... None
`See application file for complete search history.
`
`(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,200,336 A *    4/1993  Kong et al. ................. 435/199
`5,391,480 A *    2/1995  Davis et al. .....
`5,663,048 A *    9/1997  Winkfein et al. ............... 435/6
`5,695,937 A     12/1997  Kinzler et al.
`5,981,190 A     11/1999  Israel
`
`(74) Attorney, Agent, or Firm: Banner & Witcoff, Ltd.
`
`(57) ABSTRACT
`Alterations in the genetic content of a cell underlie many
`human diseases, including cancers. A method called Digital
`Karyotyping provides quantitative analysis of DNA copy
`number at high resolution. This approach involves the isola-
`tion and enumeration of short sequence tags from specific
`genomic loci. Analysis of human cancer cells using this
`method identified gross chromosomal changes as well as
`amplifications and deletions, including regions not previ-
`ously known to be altered. Foreign DNA sequences not
`present in the normal human genome could also be readily
`identified. Digital Karyotyping provides a broadly applicable
`means for systematic detection of DNA copy number changes
`on a genomic scale.
`
`27 Claims, 5 Drawing Sheets
`
`PGDX EX. 1002
`
`Page 1 of 21
`
`FIGURE 1 (Sheet 1 of 5)
`
`Step 1. Isolate genomic DNA; cleave with mapping enzyme (SacI)
`Step 2. Ligate to biotinylated linkers
`Step 3. Cleave with fragmenting enzyme (NlaIII); isolate with streptavidin magnetic beads
`Step 4. Ligate to linkers containing tagging enzyme site (MmeI)
`Step 5. Release genomic tags using tagging enzyme (MmeI)
`Step 6. Ligate to form ditags, PCR amplify, concatenate, and sequence
`Step 7. Map tags to chromosome, evaluate tag density
`
`[Plot: tag density versus chromosome position]
`
`FIGURE 2 (Sheet 2 of 5)
`
`[Plots with panels labeled Karyotype and Digital Karyotype.
`Y-axis: copies per haploid genome. X-axis: position along chromosome (Mb).]
`
`FIGURE 3 (Sheet 3 of 5)
`
`[Bitmap viewer panels and plots. Y-axis: copies per haploid genome.]
`
`FIGURE 4 (Sheet 4 of 5)
`
`[Bar chart. Y-axis: observed tags. Categories: EBV genome, other viral
`sequences, bacterial sequences.]
`
`FIGURE 5 (Sheet 5 of 5)
`
`[Plots. Y-axis: copies per haploid genome.]
`
`DIGITAL KARYOTYPING
`
`
`This application claims the benefit of provisional applica-
`tion Ser. No. 60/426,406 filed Nov. 15, 2002, the contents of
`which are expressly incorporated herein.
`The work underlying this invention was supported in part
`by the U.S. government. Thus the U.S. government retains
`certain rights in the invention according to the provisions of
`grant nos. CA 43460, CA 57345, CA 62924 of the National
`Institutes of Health.
`
`A portion of the disclosure of this patent document con-
`tains material which is subject to copyright protection. The
`copyright owner has no objection to the facsimile reproduc-
`tion by anyone of the patent document or the patent disclo-
`sure, as it appears in the Patent and Trademark Office patent
`file or records, but otherwise reserves all copyright rights
`whatsoever.
`
`FIELD OF THE INVENTION
`
`The invention relates to the field of genetics. In particular,
`it relates to the determination of karyotypes of genomes of
`individuals.
`
`BACKGROUND OF THE INVENTION
`
`Somatic and hereditary variations in gene copy number can
`lead to profound abnormalities at the cellular and organismal
`levels. In human cancer, chromosomal changes, including
`deletion of tumor suppressor genes and amplification of
`oncogenes, are hallmarks of neoplasia (1). Single copy
`changes in specific chromosomes or smaller regions can
`result in a number of developmental disorders, including
`Down, Prader Willi, Angelman, and cri du chat syndromes
`(2). Current methods for analysis of cellular genetic content
`include comparative genomic hybridization (CGH) (3), rep-
`resentational difference analysis (4), spectral karyotyping/M-
`FISH (5, 6), microarrays (7-10), and traditional cytogenetics.
`Such techniques have aided in the identification of genetic
`aberrations in human malignancies and other diseases (11-
`14). However, methods employing metaphase chromosomes
`have a limited mapping resolution (~20 Mb) (15) and therefore
`cannot be used to detect smaller alterations. Recent
`implementation of comparative genomic hybridization to
`microarrays containing genomic or transcript DNA
`sequences provides improved resolution, but is currently limited
`by the number of sequences that can be assessed (16) or
`by the difficulty of detecting certain alterations (9). There is a
`continuing need in the art for methods of analyzing and com-
`paring genomes.
`
`BRIEF SUMMARY OF THE INVENTION
`
`In a first embodiment a method is provided for karyotyping
`a genome of a test eukaryotic cell. A population of sequence
`tags is generated from defined portions of the genome of the
`test eukaryotic cell. The portions are defined by one or two
`restriction endonuclease recognition sites. The sequence tags
`in the population are enumerated to determine the number of
`individual sequence tags present in the population. The num-
`ber of a plurality of sequence tags in the population is com-
`pared to the number of the plurality of sequence tags deter-
`mined for a genome of a reference cell. The plurality of
`sequence tags are within a window of sequence tags which are
`calculated to be contiguous in the genome of the species of the
`eukaryotic cell. A difference in the number of the plurality of
`sequence tags within the window present in the population
`from the number determined for a reference eukaryotic cell
`indicates a karyotypic difference between the test eukaryotic
`cell and the reference eukaryotic cell.
`According to a second embodiment of the invention, a dimer
`is provided. The dimer comprises two distinct
`sequence tags from defined portions of the genome of a
`eukaryotic cell. The portions are defined by one or two restric-
`tion endonuclease recognition sites. Each of said sequence
`tags consists of a fixed number of nucleotides of one of said
`defined portions of the genome. The fixed number of nucleotides
`extend from one of said restriction endonuclease recognition
`sites.
`According to a third embodiment of the invention, a concatamer
`of dimers is provided. The dimers comprise two
`distinct sequence tags from defined portions of the genome of
`a eukaryotic cell. The portions are defined by one or two
`restriction endonuclease recognition sites. Each of said
`sequence tags consists of a fixed number of nucleotides of one
`of said defined portions of the genome. The fixed number of
`nucleotides extend from one of the restriction endonuclease
`recognition sites.
`According to a fourth embodiment of the invention a
`method of karyotyping a genome of a test eukaryotic cell is
`provided. A population of sequence tags is generated from
`defined portions of the genome of the test eukaryotic cell. The
`portions are defined by one or two restriction endonuclease
`recognition sites. The sequence tags in the population are
`enumerated to determine the number of individual sequence
`tags present in the population. The number of a plurality of
`sequence tags in the population is compared to the number of
`said plurality of sequence tags calculated to be present in the
`genome of the species of the eukaryotic cell. The plurality of
`sequence tags are within a window of sequence tags which are
`calculated to be contiguous in the genome of the species of the
`eukaryotic cell. A difference in the number of the plurality of
`sequence tags within the window present in the population
`from the number calculated to be present in the genome of the
`eukaryotic cell indicates a karyotypic abnormality.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1. Schematic of Digital Karyotyping approach. Col-
`ored boxes represent genomic tags. Small ovals represent
`linkers. Large blue ovals represent streptavidin-coated mag-
`netic beads. This figure is described in more detail below.
`FIG. 2. Low resolution tag density maps reveal many sub-
`chromosomal changes. The top graph corresponds to the
`Digital Karyotype, while the lower graph represents CGH
`analysis. An ideogram of each normal chromosome is present
`under each set of graphs. For all graphs, values on the Y-axis
`indicate genome copies per haploid genome, and values on
`the X-axis represent position along chromosome (Mb for
`Digital Karyotype, and chromosome bands for CGH). Digital
`Karyotype values represent exponentially smoothed ratios of
`DiFi tag densities, using a sliding window of 1000 virtual tags
`normalized to the NLB genome. Chromosomal areas lacking
`Digital Karyotype values correspond to unsequenced portions
`of the genome, including heterochromatic regions. Note
`that using a window of 1000 virtual tags does not permit
`accurate identification of alterations less than ~4 Mb, such as
`amplifications and homozygous deletions, and smaller win-
`dows need to be employed to accurately identify these lesions
`(see FIG. 3 for example).
`FIGS. 3A and 3B. High resolution tag density maps iden-
`tify amplifications and deletions. (FIG. 3A) Amplification on
`chromosome 7. The top panel represents the bitmap viewer with the
`region containing the alteration encircled. The bitmap viewer
`is comprised of ~39,000 pixels representing tag density values
`at the chromosomal position of each virtual tag on chromosome
`7, determined from sliding windows of 50 virtual
`tags. Yellow pixels indicate tag densities corresponding to
`copy numbers <110 while black pixels correspond to copy
`number ≥110. Middle panel represents an enlarged view of
`the region of alteration. The lower panel indicates a graphical
`representation of the amplified region with values on the
`Y-axis indicating genome copies per haploid genome and
`values on the X-axis representing position along the chromo-
`some in Mb. (FIG. 3B) Homozygous deletion on chromo-
`some 5. Top, middle and lower panels are similar to those for
`(FIG. 3A) except that the bitmap viewer for chromosome 5
`contains ~43,000 pixels, tag density values were calculated in
`sliding windows of 150 virtual tags, and yellow pixels indi-
`cate copy numbers >0.1 while black pixels indicate copy
`numbers ≤0.1. Bottom panel represents detailed analysis of
`the region containing the homozygous deletion in DiFi and
`Co52. For each sample, white dots indicate markers that were
`retained, while black dots indicate markers that were
`homozygously deleted. PCR primers for each marker are
`listed in Table 4.
`
`FIG. 4. Identification of EBV DNA in NLB cells. NLB,
`genomic tags derived from NLB cells after removal of tags
`matching human genome sequences or tags matching DiFi
`cells. DiFi, genomic tags derived from DiFi cells after
`removal of tags matching human genome sequences or tags
`matching NLB cells. The number of observed tags matching
`EBV, other viral, or bacterial sequences is indicated on the
`vertical axis.
`
`FIG. 5. Low resolution tag density maps of the DiFi tumor
`genome. For each chromosome, the top graph corresponds to
`the Digital Karyotype while the lower graph represents CGH
`analysis. An ideogram of each chromosome is depicted under
`each set of graphs. For all graphs, values on the Y-axis indi-
`cate genome copies per haploid genome, and values on the
`X-axis represent position along chromosome (Mb for Digital
`Karyotypes, and chromosome bands for CGH). Digital
`Karyotype values represent exponentially smoothed ratios of
`DiFi tag densities, using a sliding window of 1000 virtual tags
`normalized to the NLB genome. Chromosomal areas lacking
`Digital Karyotype values correspond to unsequenced por-
`tions of the genome, including heterochromatic regions.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`It is a discovery of the present inventors that the genome of
`an organism can be sampled in groups of small pieces to
`determine karyotypic properties of an organism using a sys-
`tematic and quantitative method. Changes in copy number of
`portions of the genome can be determined on a genomic scale.
`Such changes include gain or loss of whole chromosomes or
`chromosome arms, amplifications and deletions of regions of
`the genome, as well as insertions of foreign DNA. Rearrange-
`ments, such as translocations and inversions, would typically
`not be detected by the method.
`Our data demonstrate that the method, called Digital
`Karyotyping, can accurately identify regions whose copy
`number is abnormal, even in complex genomes such as that of
`the human. Whole chromosome changes, gains or losses of
`chromosomal arms, and interstitial amplifications or dele-
`tions can be detected. Moreover, the method permits the
`identification of specific amplifications and deletions that had
`not been previously described by comparative genomic
`hybridization (CGH) or other methods in any human cancer.
`These analyses suggest that a potentially large number of
`
`undiscovered copy number alterations exist in cancer
`genomes and that many of these could be detected through
`Digital Karyotyping.
`Like all genome-wide analyses, Digital Karyotyping has
`limitations. First, the ability to measure tag densities over
`entire chromosomes depends on the accuracy and complete-
`ness of the genome sequence. Fortunately, over 94% of the
`human genome is available in draft form, and 95% of the
`sequence is expected to be in a finished state by 2003. Second,
`a small number of areas of the genome are expected to have a
`lower density of mapping enzyme restriction sites and be
`incompletely evaluated by our approach. We estimate that
`less than 5% of the genome would be incompletely analyzed
`using the parameters employed in the current study. More-
`over, this problem could be overcome through the use of
`different mapping and fragmenting enzymes. Finally, Digital
`Karyotyping cannot generally detect very small regions, on
`the order of several thousand base pairs or less, that are
`amplified or deleted.
`Nevertheless, it is clear from our analyses that Digital
`Karyotyping provides a heretofore unavailable picture of the
`DNA landscape of a cell. The approach should be immedi-
`ately applicable to the analysis of human cancers, wherein
`identification of homozygous deletions and amplifications
`has historically revealed genes important in tumor initiation
`and progression. In addition, one can envisage a variety of
`other applications for this technique. First, the approach
`could be used to identify previously undiscovered alterations
`in hereditary disorders. A potentially large number of such
`diseases are thought to be due to deletions or duplications too
`small to be detected by conventional approaches. These may
`be detectable with Digital Karyotyping even in the absence of
`any linkage or other positional information. Second, use of
`mapping enzymes that are sensitive to DNA methylation (e.g.
`NotI) could be employed to catalog genome-wide methyla-
`tion changes in cancer or diseases thought to be affected by
`genomic imprinting. Third, the approach could be as easily
`applied to the genomes of other organisms to search for
`genetic alterations responsible for specific phenotypes, or to
`identify evolutionary differences between related species.
`Moreover, as the genome sequences of increasing numbers of
`microorganisms and viruses become available, the approach
`can be used to identify the presence of pathogenic DNA in
`infectious or neoplastic states.
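
The identification of foreign DNA mentioned above can be illustrated with a minimal sketch. It assumes that observed tags, the human virtual tag set, and per-organism tag sets are already available as plain strings; the function name `classify_unmatched_tags` and the toy sequences are hypothetical and are not taken from the disclosure.

```python
def classify_unmatched_tags(observed_tags, human_virtual_tags, pathogen_tag_sets):
    """Count observed tags that are absent from the human virtual tag set
    but present in one of the supplied pathogen tag sets."""
    hits = {name: 0 for name in pathogen_tag_sets}
    for tag in observed_tags:
        if tag in human_virtual_tags:
            continue  # tag is explained by the host genome
        for name, tags in pathogen_tag_sets.items():
            if tag in tags:
                hits[name] += 1
    return hits


# Toy data (hypothetical sequences, for illustration only).
human = {"AAAACCCCGGGGTTTTA", "ACGTACGTACGTACGTA"}
pathogens = {"EBV": {"TTTTGGGGCCCCAAAAT"}, "other_virus": {"GGGGGGGGAAAAAAAAC"}}
observed = [
    "AAAACCCCGGGGTTTTA",  # human tag, ignored
    "TTTTGGGGCCCCAAAAT",  # matches the EBV set
    "TTTTGGGGCCCCAAAAT",  # matches the EBV set again
    "CCCCCCCCCCCCCCCCC",  # matches nothing
]
hits = classify_unmatched_tags(observed, human, pathogens)
```

Tags that match neither the host nor any supplied pathogen set are simply left uncounted in this sketch.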
`Populations of sequence tags are generated from defined
`portions of the genome. The portions are defined by one or
`two restriction endonuclease recognition sites. Preferably the
`recognition sites are located in a fixed position within the
`defined portions of the genome. In one embodiment three
`different restriction endonucleases are used to generate
`sequence tags. In this embodiment, the restriction endonu-
`cleases used to generate the tags can be termed mapping
`(first), fragmenting (second), and tagging restriction endonu-
`clease. The defined portions extend from the fragmenting
`(second) restriction endonuclease site to the closest mapping
`(first) restriction endonuclease site. The sequence tags
`derived from these defined portions are generated by cleavage
`with a tagging enzyme. The closest nucleotides adjacent to
`the fragmenting (second) restriction endonuclease comprise
`the sequence tags. The number of nucleotides is typically a
`fixed number (defined here to include a range of numbers)
`which is a function of the properties of the tagging (third)
`restriction endonuclease. For example, using MmeI the fixed
`number is 20, 21 or 22. Other Type IIS restriction endonu-
`cleases cleave at different distances from their recognition
`sequences. Other Type IIS restriction endonucleases which
`can be used include BbvI, BbvII, BinI, FokI, HgaI, HphI,
`MboII, MnlI, SfaNI, TaqII, Tth111II, BsmFI, and FokI. See
`Szybalski, W., Gene, 40:169, 1985. Other similar enzymes
`will be known to those of skill in the art (see, Current Proto-
`cols in Molecular Biology, supra). Restriction endonucleases
`with desirable properties can be artificially evolved, i.e., sub-
`jected to selection and screening, to obtain an enzyme which
`is useful as a tagging enzyme. Desirable enzymes cleave at
`least 18-21 nucleotides distant from their recognition sites.
`Artificial restriction endonucleases can also be used. Such
`endonucleases are made by protein engineering. For
`example, the endonuclease FokI has been engineered by
`insertions so that it cleaves one nucleotide further away from
`its recognition site on both strands of the DNA substrates. See
`Li and Chandrasegaran, Proc. Nat. Acad. Sciences USA
`90:2764-8, 1993. Such techniques can be applied to generate
`restriction endonucleases with desirable recognition
`sequences and desirable distances from recognition site to
`cleavage site.
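
The three-enzyme definition of a sequence tag can be sketched in silico as follows. This is an illustrative simplification, not the procedure used to build the virtual tag set: it uses the standard recognition sequences of SacI (GAGCTC) and NlaIII (CATG), assumes a tag of 17 bases beyond the fragmenting site, reads each tag from the fragmenting site toward the mapping site, and skips tags that would run off the end of the input. The names `virtual_tags`, `find_sites`, and `TAG_LENGTH` are hypothetical.

```python
import re

MAPPING_SITE = "GAGCTC"    # SacI recognition sequence
FRAGMENTING_SITE = "CATG"  # NlaIII recognition sequence
TAG_LENGTH = 17            # assumed number of bases kept beyond the fragmenting site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, site):
    """Return the start of every (possibly overlapping) occurrence of site."""
    return [m.start() for m in re.finditer("(?=" + site + ")", seq)]


def virtual_tags(seq):
    """Return (fragmenting_site_position, tag) for the fragmenting site
    nearest to each mapping site, on the left and on the right."""
    frag_sites = find_sites(seq, FRAGMENTING_SITE)
    tags = []
    for map_pos in find_sites(seq, MAPPING_SITE):
        left = [p for p in frag_sites if p + len(FRAGMENTING_SITE) <= map_pos]
        right = [p for p in frag_sites if p >= map_pos + len(MAPPING_SITE)]
        if left:
            p = left[-1]  # closest fragmenting site to the left
            start = p + len(FRAGMENTING_SITE)
            tag = seq[start:start + TAG_LENGTH]  # read rightward, toward the mapping site
            if len(tag) == TAG_LENGTH:
                tags.append((p, tag))
        if right:
            p = right[0]  # closest fragmenting site to the right
            if p - TAG_LENGTH >= 0:
                # read leftward, toward the mapping site, on the opposite strand
                tags.append((p, reverse_complement(seq[p - TAG_LENGTH:p])))
    return tags


# Toy sequence: CATG, 17 bases, spacer, SacI site, spacer, 17 bases, CATG.
example = ("CATG" + "AACCGGTTAACCGGTTA" + "TT" + "GAGCTC"
           + "TT" + "AAAAACCCCCAAAAACC" + "CATG")
example_tags = virtual_tags(example)
```

In this sketch each mapping site contributes at most two tags, one from each flanking fragment, which mirrors the statement that the defined portions extend from the fragmenting site to the closest mapping site.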
`In an alternative embodiment a single restriction endonu-
`clease can define a defined portion of the genome. A fixed
`number of nucleotides on one or both sides of the restriction
`endonuclease recognition site then forms the sequence tags.
`For example, the restriction endonuclease BcgI can be used to
`provide a 36 bp fragment. The 12 bp recognition site (having
`6 degenerate positions) lies in the middle of a fragment; 12 bp
`flank the site on either side. Other similar enzymes which can
`be used in this embodiment include BplI and BsaXI. Preferably
`the enzyme used releases a fragment having a sum of at
`least 18 or 20 nucleotides flanking its recognition sequence.
`Enumeration of sequence tags generated is performed by
`determining the identity of the sequence tags and recording
`the number of occurrences of each such tag or of genomically
`clustered tags. Preferably the determination of identity of the
`tags is done by automated nucleotide sequence determination
`and the recording is done by computer. Other methods for
`identifying and recording tags can be used, as is convenient
`and efficient to the practitioner. According to one embodi-
`ment of the invention sequence tags are ligated together to
`form a concatenate and the concatenates are cloned and sequenced.
`In a preferred embodiment the sequence tags are
`dimerized prior to formation of the concatenate. The
`sequence tags can be amplified as single tags or as dimers
`prior to concatenation.
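
Enumeration by computer can be sketched as below. The sketch assumes, by analogy with SAGE, that dimers in a concatenate read are separated by the fragmenting-enzyme site (CATG), that each tag contributes 17 bases, and that the second tag of a dimer is read on the opposite strand; none of these parsing details is specified in the text above, and the names used are hypothetical.

```python
from collections import Counter

ANCHOR = "CATG"  # fragmenting-enzyme site assumed to separate dimers in a read
TAG_LENGTH = 17  # assumed number of tag bases read from each end of a dimer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ditags_from_concatemer(read):
    """Split a read at anchor sites; keep pieces long enough to hold two tags."""
    return [piece for piece in read.split(ANCHOR) if len(piece) >= 2 * TAG_LENGTH]


def tags_from_ditag(ditag):
    """Return both tags of a dimer: left end as read, right end reverse-complemented."""
    return ditag[:TAG_LENGTH], reverse_complement(ditag[-TAG_LENGTH:])


def enumerate_tags(reads):
    """Record the number of occurrences of each tag across all reads."""
    counts = Counter()
    for read in reads:
        for ditag in ditags_from_concatemer(read):
            counts.update(tags_from_ditag(ditag))
    return counts


# Toy read holding two dimers built from the same pair of tags.
tag_a = "AACCGGTTAACCGGTTA"
tag_b = "GGTTTTTGGGGGTTTTT"
read = (ANCHOR + tag_a + reverse_complement(tag_b)
        + ANCHOR + tag_b + reverse_complement(tag_a) + ANCHOR)
counts = enumerate_tags([read])
```

A production parser would also need to handle sequencing errors, variable tag lengths, and anchor sequences that arise by chance at a tag junction; the sketch ignores all three.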
`A feature of the data analysis which enables the efficient
`practice of the method is the use of windows. These are
`groups of sequence tags which are genomically clustered.
`Virtual tags can be extracted from the genomic data for the
`species being tested. The virtual tags are associated with
`locations in the genome. Groups of adjacent virtual tags
`which are clustered in the genome are used to form a window
`for analysis of actual experimental tags. The term adjacent or
`contiguous as used herein to describe tags does not imply that
`the nucleotides of one tag are contiguous with the nucleotides
`of another tag, but rather that the tags are clustered in the same
`areas of the genome. Because of the way that sequence tags
`are generated, they only sample the genome; they do not
`saturate the genome. Thus, for example, a window can com-
`prise sequence tags that map within about 40 kb, about 200
`kb, about 600 kb, or about 4 Mb. Typically such windows
`comprise from 10 to 1000 sequence tags. Use of windows
`such as these permits the genome to be sampled rather than
`comprehensively analyzed. Thus, far less than 100% of the
`sequence tags must be counted to obtain useful information.
`In fact, less than 50%, less than 33%, less than 25%, less than
`20%, even less than 15% of the sequence tags calculated to be
`present in the genome of the eukaryotic cell need be enumer-
`ated to obtain useful data. The karyotypic analysis can be used
`inter alia to compare a cancer cell to a normal cell, thereby
`identifying regions of genomic change involved in cancer.
`The karyotypic analysis can be used to identify genes
`involved in hereditary disorders. The karyotypic analysis can
`be used to identify genetic material in a eukaryotic cell
`derived from an infectious agent.
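
The use of windows can be sketched as a sliding sum over virtual tags in genomic order. The sketch assumes that observed counts for the test and reference libraries have already been assigned to each virtual tag, and it normalizes each window by total library size; the function name `window_ratios` and the toy counts are hypothetical, and no smoothing is applied.

```python
from itertools import accumulate


def window_ratios(test_counts, ref_counts, window):
    """Ratio of normalized tag counts for each sliding window of adjacent virtual tags.

    test_counts[i] and ref_counts[i] are the observed counts for the i-th
    virtual tag in genomic order. One ratio is returned per window start;
    the entry is None when the reference window contains no tags.
    """
    if len(test_counts) != len(ref_counts):
        raise ValueError("count lists must cover the same virtual tags")
    test_total = sum(test_counts)
    ref_total = sum(ref_counts)
    if test_total == 0 or ref_total == 0:
        raise ValueError("both libraries must contain at least one tag")
    # Prefix sums make each window sum a constant-time lookup.
    test_prefix = [0] + list(accumulate(test_counts))
    ref_prefix = [0] + list(accumulate(ref_counts))
    ratios = []
    for start in range(len(test_counts) - window + 1):
        t = test_prefix[start + window] - test_prefix[start]
        r = ref_prefix[start + window] - ref_prefix[start]
        if r == 0:
            ratios.append(None)
        else:
            ratios.append((t / test_total) / (r / ref_total))
    return ratios


# Toy data: eight virtual tags, with the middle four over-represented in the test library.
reference = [1, 1, 1, 1, 1, 1, 1, 1]
test = [1, 1, 5, 5, 5, 5, 1, 1]
ratios = window_ratios(test, reference, window=4)
```

Smaller windows localize an alteration more precisely but rest on fewer tags, which is the trade-off the window sizes quoted above reflect.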
`Changes in amount of particular regions of the genome can
`identify aneuploidy if (a) sequence tags of one or more auto-
`somes are determined to be present in the test eukaryotic cell
`relative to the reference eukaryotic cell at a ratio of 3 or
`greater or less than 1.5; or (b) sequence tags of one or more
`sex chromosomes in a male are determined to be present in
`the test eukaryotic cell relative to the reference eukaryotic cell
`at a ratio of 1.5 or greater or less than 0.7; or (c) sequence tags
`of X chromosomes in a female are determined to be present in
`the test eukaryotic cell relative to the reference eukaryotic cell
`at a ratio of 3 or greater or less than 1.5, or relative to a
`reference female eukaryotic cell at a ratio of 1.5 or greater or
`less than 0.7. Similarly, such changes can be measured with
`reference to nucleotide sequence data for the genome of a
`particular species.
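
The ratio criteria (a) through (c) can be written as a small lookup, taking the numeric bounds literally from the text. The category names and the function `indicates_aneuploidy` are hypothetical. How a given ratio is scaled before comparison is not settled by this passage, so the sketch only applies the stated bounds to whatever ratio the caller supplies.

```python
# (lower bound, upper bound) per category, copied from criteria (a)-(c) above.
# A ratio below the lower bound, or at or above the upper bound, is flagged.
THRESHOLDS = {
    "autosome": (1.5, 3.0),                      # criterion (a)
    "male_sex_chromosome": (0.7, 1.5),           # criterion (b)
    "female_x_vs_reference": (1.5, 3.0),         # criterion (c), first alternative
    "female_x_vs_female_reference": (0.7, 1.5),  # criterion (c), second alternative
}


def indicates_aneuploidy(ratio, category):
    """Return True when the ratio falls outside the stated bounds for the category."""
    low, high = THRESHOLDS[category]
    return ratio >= high or ratio < low


flag_gain = indicates_aneuploidy(3.0, "autosome")
flag_normal = indicates_aneuploidy(2.0, "autosome")
flag_loss = indicates_aneuploidy(0.69, "male_sex_chromosome")
```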
`Preferably the method of the present invention employs the
`formation of dimers of sequence tags. Such dimers permit the
`elimination of certain types of bias, for example that which
`might be introduced during amplification of tags. Typically
`dimers which do not comprise two distinct tags are excluded
`from analysis. Two sequence tags which form a dimer are
`desirably joined end-to-end at the ends distal to the second
`restriction endonuclease (fragmenting) site. Such distal ends
`are typically formed by the action of the tagging enzyme, i.e.,
`the third restriction endonuclease. Preferably the distal ends
`are sticky ends. All or part of the oligonucleotide linkers can
`remain as part of the dimers and can remain as part of the
`concatenates of dimers as well. However, the linkers are
`preferably cleaved prior to the concatenation.
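
The exclusion of dimers that do not comprise two distinct tags can be sketched as a simple filter. The additional step of counting a repeated dimer only once is an assumption borrowed from common SAGE practice for limiting amplification bias and is not stated in the text above; the function name `filter_ditags` is hypothetical.

```python
def filter_ditags(ditags):
    """Keep dimers made of two distinct tags, counting each tag pair once.

    ditags is an iterable of (tag1, tag2) pairs. A pair is treated as the
    same dimer regardless of the order of its two tags.
    """
    seen = set()
    kept = []
    for tag1, tag2 in ditags:
        if tag1 == tag2:
            continue  # not two distinct tags: excluded from analysis
        key = frozenset((tag1, tag2))
        if key in seen:
            continue  # repeated dimer, assumed here to be an amplification duplicate
        seen.add(key)
        kept.append((tag1, tag2))
    return kept


# Toy input using single letters in place of full tag sequences.
raw = [("A", "B"), ("B", "A"), ("A", "A"), ("A", "C"), ("A", "B")]
kept = filter_ditags(raw)
```

Because two tags are unlikely to be joined into the same dimer more than once by chance, a repeated dimer is more plausibly an artifact of amplification than an independent observation, which is the rationale for the optional collapsing step.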
`
`EXAMPLES
`
`Example 1
`
`Principles of Digital Karyotyping
`
`These concepts are practically incorporated into Digital
`Karyotyping of human DNA as described in FIG. 1. Genomic
`DNA is cleaved with a restriction endonuclease (mapping
`enzyme) that is predicted to cleave genomic DNA into several
`hundred thousand pieces, each on average <10 kb in size
`(Step 1). A variety of different endonucleases can be used for
`this purpose, depending on the resolution desired. In the
`current study, we have used SacI, with a 6-bp recognition
`sequence predicted to preferentially cleave near or within
`transcribed genes. Biotinylated linkers are ligated to the DNA
`molecules (Step 2) and then digested with a second endonu-
`clease (fragmenting enzyme) that recognizes 4-bp sequences
`(Step 3). As there are on average 16 fragmenting enzyme sites
`between every two mapping enzyme sites (4^6/4^4), the majority
`of DNA molecules in the template are expected to be
`cleaved by both enzymes and thereby be available for subse-
`quent steps. DNA fragments containing biotinylated linkers
`are separated from the remaining fragments using streptavi-
`din-coated magnetic beads (Step 3). New linkers, containing
`a 5-bp site recognized by MmeI, a type IIS restriction endonuclease
`(18), are ligated to the captured DNA (Step 4). The
`captured fragments are cleaved by MmeI, releasing 21 bp tags
`(Step 5). Each tag is thus derived from the sequence adjacent
`to the fragmenting enzyme site that is closest to the nearest
`mapping enzyme site. Isolated tags are self-ligated to form
`ditags, PCR amplified en masse, concatenated, cloned, and
`sequenced (Step 6). As described for SAGE (17), formation
`of ditags provides a robust method to eliminate potential PCR
`induced bias during the procedure. Current automated
`sequencing technologies identify up to 30 tags per concata-
`mer clone, allowing for analysis of ~100,000 tags per day
`using a single 384 capillary sequencing apparatus. Finally,
`tags are computationally extracted from sequence data,
`matched to precise chromosomal locations, and tag densities
`are evaluated over moving windows to detect abnormalities in
`DNA sequence content (Step 7).
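
The arithmetic behind the enzyme choice and the sequencing throughput quoted above can be checked with a short calculation. It assumes random sequence with equal base frequencies, which real genomes only approximate; the function and variable names are hypothetical.

```python
import math


def expected_site_spacing(site_length, alphabet_size=4):
    """Expected number of bases between occurrences of a fixed site in
    random sequence with equally frequent bases."""
    return alphabet_size ** site_length


mapping_spacing = expected_site_spacing(6)      # 6-bp mapping enzyme site
fragmenting_spacing = expected_site_spacing(4)  # 4-bp fragmenting enzyme site

# Average number of fragmenting sites between two mapping sites (4^6 / 4^4).
sites_between = mapping_spacing // fragmenting_spacing

# Clones needed per day to reach ~100,000 tags at up to 30 tags per clone.
clones_per_day = math.ceil(100_000 / 30)
```

Under these assumptions the mapping enzyme cuts on average every 4,096 bp, consistent with fragments averaging less than 10 kb, and 16 fragmenting sites fall between neighboring mapping sites.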
`The sensitivity and specificity of Digital Karyotyping in
`detecting genome-wide changes was expected to depend on
`several factors. First, the combination of mapping and fragmenting
`enzymes determines the minimum size of the alterations
`that can be identified. For example, use of SacI and
`NlaIII as mapping and fragmenting enzymes, respectively,
`was predicted to result in a total of 730,862 virtual tags
`(defined as all possible tags that could theoretically be
`obtained from the human genome). These virtual tags were
`spaced at an average of 3,864 bp, with 95% separated by 4 bp
`to 46 kb. Practically, this resolution is limited by the number
`of tags actually sampled in a given experiment and the type of
`alteration present (Table 1). Monte Carlo simulations con-
`firmed the intuitive concept that fewer tags are needed to
`detect high copy number amplifications than homozygous
`deletions or low copy number changes in similar sized
`regions (Table 1). Such simulations were used to predict the
`size of alterations that could be reliably detected given a fixed
`number of experimentally sampled tags. For example, analy-
`sis of 100,000 tags would be expected to reliably detect a
`10-fold amplification ≥100 kb, homozygous deletions ≥600
`kb, or a single gain or loss of regions ≥4 Mb in size in a
`diploid genome (Table 1).
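
The kind of simulation described above can be illustrated with a much-reduced sketch. It is not the simulation used to produce the reported values: it assumes Poisson-distributed tag counts per window, a hypothetical detection rule (a window is flagged when its count reaches three times the diploid expectation), and considers gains only. The names `flag_rate` and `poisson_sample` are hypothetical.

```python
import math
import random

TOTAL_VIRTUAL_TAGS = 730_862  # SacI/NlaIII virtual tag count quoted in the text


def poisson_sample(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def flag_rate(copy_number, window, sampled_tags, cutoff_fold=3.0, trials=2000, seed=0):
    """Fraction of simulated windows whose tag count reaches the gain cutoff.

    window is the number of adjacent virtual tags; the expected count for a
    diploid window is sampled_tags * window / TOTAL_VIRTUAL_TAGS.
    """
    rng = random.Random(seed)
    expected_diploid = sampled_tags * window / TOTAL_VIRTUAL_TAGS
    cutoff = math.ceil(cutoff_fold * expected_diploid)
    mean = expected_diploid * copy_number / 2
    flagged = sum(poisson_sample(rng, mean) >= cutoff for _ in range(trials))
    return flagged / trials


# A 10-copy amplification spanning 30 virtual tags, sampled with 100,000 tags.
sensitivity = flag_rate(copy_number=10, window=30, sampled_tags=100_000)
# The same window at normal copy number gives the per-window false-positive rate.
false_positive_rate = flag_rate(copy_number=2, window=30, sampled_tags=100_000, seed=1)
```

A positive predictive value would combine these two rates with the number of windows tested across the genome, which is why a rule that looks specific for a single window can still yield many false calls genome-wide.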
`
`TABLE 1
`
`8
`Example 2
`
`Analysis of Whole Chromosomes
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`We characterized 210,245 genomic tags from lymphoblas-
`toid cells of a normal individual (NLB) and 171,795 genomic
`tags from the colorectal cancer cell line (DiFi) using the
`mapping and fragmenting enzymes described above. After
`filtering to remove tags that were within repeated sequences
`or were not present in the human genome (see Materials and
`Methods), we recovered a total of 111,245 and 107,515 fil-
`tered tags from the NLB and DiFi libraries, respectively. Tags
`were ordered along each chromosome, and average chromo-
`somal tag densities, defined as the number of detected tags
`divided by the number of virtual tags present in a given
`chromosome, were evaluated (Table 2). Analysis of the NLB
`data showed that the average tag densities for each autosomal
`chromosome was similar, ~0.16+/—0.04. The small variations
`in tag densities were likely due to incomplete filtering of tags
`matching repeated sequences that were not currently repre-
`sented in the genome databases. The X andY chromosomes
`had average densities about half this level, 0.073 and 0.068,
`respectively, consistent with the normal male karyotype of
`these cells. Analysis of the DiFi data revealed a much wider
`variation in tag densities, ranging from 0.089 to 0.27 for
`autosomal chromosomes. In agreement with the origin of
`these tumor cells from a female patient (20), the tag density of
`theY chromosome was 0.00. Estimates of chromosome num-
`
`ber using observed tag densities normalized to densities from
`lymphoblastoid cells suggested a highly aneuploid genetic
`content, with E15 copies of chromosome 1, 4, 5, 8, 17, 21
`and 22, and :3 copies of chromosome 7, 13 and 20 per
`diploid genome. These observations were consistent with
`CGH analyses (see below) and the previously reported karyo-
`type of DiFi cells (20).
`
`Theoretical detection of copy number alterations using Digital Karyotyping*
`
`Size of alteration+         Amplification         Homozygous deletion   Heterozygous loss     Subchromosomal gain
`                            Copy number = 10      Copy number = 0       Copy number = 1       Copy number = 3
`# bp       # virtual tags   100,000    1,000,000  100,000    1,000,000  100,000    1,000,000  100,000    1,000,000
`
`100,000    30               100%       100%       0.06%      100%       0.008%     0.02%      0.006%     0.08%
`200,000    50               100%       100%       1%         100%       0.01%      3%         0.01%      0.7%
`600,000    150              100%       100%       96%        100%       0.07%      100%       0.05%      100%
`2,000,000  500              100%       100%       100%       100%       11%        100%       3%         100%
`4,000,000  1000             100%       100%       100%       100%       99%        100%       97%        100%
`
`*Copy number alteration refers to the gain or loss of chromosomal regions in the context of the normal diploid
`genome, where the normal copy number is 2. The limiting feature of these analyses was not sensitivity for
`detecting the alteration, as this was high in every case shown (>99% for amplifications and homozygous deletions
`and >92% for heterozygous losses or subchromosomal gains). What was of more concern was the positive predictive
`value (PPV), that is, the probability that a detected mutation represents a real mutation. PPVs were calculated from
`100 simulated genomes, using 100,000 or 1,000,000 filtered tags, and shown in the table as percents.
`+Size of alteration refers to the approximate size of the genomic alteration assuming an average of 3864 bp
`between virtual tags.
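
The two quantities defined in the table footnotes can be written out as a short Python sketch. It is illustrative only: the function names are hypothetical, the simulation that produced the PPVs in the table is not reproduced here, and the counts passed to `positive_predictive_value` in the usage lines are made-up inputs rather than data from the patent. The only figure taken from the source is the average spacing of 3864 bp between virtual tags.

```python
AVERAGE_TAG_SPACING_BP = 3864  # average distance between virtual tags (table footnote)


def approx_virtual_tags(alteration_bp: int,
                        spacing_bp: int = AVERAGE_TAG_SPACING_BP) -> float:
    """Approximate number of virtual tags spanned by an alteration of this size."""
    if spacing_bp <= 0:
        raise ValueError("spacing_bp must be positive")
    return alteration_bp / spacing_bp


def positive_predictive_value(true_calls: int, false_calls: int) -> float:
    """Fraction of detected alterations that are real: TP / (TP + FP)."""
    total = true_calls + false_calls
    if total == 0:
        raise ValueError("no detected alterations to evaluate")
    return true_calls / total


# Sizes listed in the table, converted to approximate virtual tag counts.
for size_bp in (100_000, 200_000, 600_000, 2_000_000, 4_000_000):
    print(f"{size_bp:>9,} bp -> ~{approx_virtual_tags(size_bp):.0f} virtual tags")

# Hypothetical counts, for illustration only.
print(f"PPV = {positive_predictive_value(true_calls=97, false_calls=3):.0%}")
```

The computed tag counts (about 26, 52, 155, 518 and 1035) are close to the rounded values of 30, 50, 150, 500 and 1000 given in the table, consistent with the footnote's description of the sizes as approximate.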
`
`PGDX EX. 1002
`
`Page 10 of 21
`
`contrast, the DiFi tag density map (normalized to the NLB
`data) revealed widespread changes, including apparent losses
`in large regions of 5q, 8p and 10q, and gains of 2p, 7q, 9p, 12q,
`13q, and 19q (FIG. 2 and FIG. 5). These changes included
`regions of known tumor suppressor genes (21)
