United States Patent [19]
McClelland et al.

[11] Patent Number: 5,437,975
[45] Date of Patent: Aug. 1, 1995

US005437975A

[54] CONSENSUS SEQUENCE PRIMED POLYMERASE CHAIN REACTION
METHOD FOR FINGERPRINTING GENOMES
`
[75] Inventors: Michael McClelland, Del Mar; John T. Welsh,
Leucadia, both of Calif.

[73] Assignee: California Institute of Biological Research,
La Jolla, Calif.

[21] Appl. No.: 661,591

[22] Filed: Feb. 25, 1991

[51] Int. Cl.⁶: C12Q 1/68; C12P 19/34; C07H 21/04
[52] U.S. Cl.: 435/6; 435/91.2; 536/24.33; 935/78
[58] Field of Search: 435/91, 6, 91.2; 536/27, 24.33; 935/78
`
[56] References Cited
PUBLICATIONS

Nelson et al., PNAS, 86:6686-6690 (1989).
Cinco et al., FEMS Microbiol. Immunol., 47:511-514 (1989).
Fox, Ann. Rev. Genet., 21:67-91 (1987).
Giroux et al., J. Bacteriol., 170:5601-5606 (1988).
Jeffreys et al., Nature, 316:76-79 (1985).
Jinks-Robertson et al., in “E. coli and S. typhimurium”,
Neidhardt ed., American Society for Microbiology Press,
Washington, D.C., pp. 1358-1385 (1987).
Julier et al., Proc. Natl. Acad. Sci. USA, 87:4585-4589 (1990).
McBride et al., Genomics, 5:561-573 (1989).
Ohyama et al., Nature, 322:572-574 (1986).
Rogers et al., Israel J. Med. Sci., 20:768-772 (1984).
Vold, Microbiol. Rev., 49:71-80 (1985).
Welsh et al., Nucl. Acids Res., 18:7213-7218 (1990).
`
Primary Examiner: Margaret Parr
Assistant Examiner: Kenneth R. Horlick
Attorney, Agent, or Firm: Pennie & Edmonds
`
[57] ABSTRACT
`
`A rapid method for generating a set of discrete DNA
`amplification products characteristic of a genome as a
`“fingerprint” for typing the genome comprises the steps
`of: forming a polymerase chain reaction (PCR) admix-
`ture by combining, in a PCR buffer, genomic DNA and
`at least one structural RNA consensus primer, and sub-
`jecting the PCR admixture to a plurality of PCR ther-
`mocycles to produce a plurality of DNA segments,
thereby forming a set of discrete DNA amplification
products. The method is known as the consensus sequence
primed polymerase chain reaction (CP-PCR)
method and is suitable for the identification of bacterial
species and strains, including Staphylococcus and
Streptococcus species, mammals and plants. The
`method of the present invention can identify species
`rapidly, using only a small amount of biological mate-
`rial, and does not require knowledge of the nucleotide
`sequence or other molecular biology of the nucleic
`acids of the organisms to be identified. Only one primer
`sequence is required for amplification and/or identifica-
`tion. The method can also be used to generate detect-
`able polymorphisms for use in genetic mapping of ani-
`mals and humans.
`
`4 Claims, 3 Drawing Sheets
`
`
`Ariosa Exhibit 1023, pg. 1
IPR2013-00276
`
`
`
`
U.S. Patent    Aug. 1, 1995    Sheet 1 of 3    5,437,975

[FIG. 1a, FIG. 1b and FIG. 1c: gel images]
`
`
`
`
`
`
U.S. Patent    Aug. 1, 1995    Sheet 2 of 3    5,437,975

[FIG. 2: gel image, marker lane M and lanes 1 to 40; legible
size markers: 1636, 506/517, 396, 344, 298, 220, 201, 154, 134, 75]
`
`
`
`
U.S. Patent    Aug. 1, 1995    Sheet 3 of 3    5,437,975

[FIG. 3: gel image; legible size markers: 506/517, 396, 344,
298, 220, 201, 154, 134]
`
`
`
`
`
CONSENSUS SEQUENCE PRIMED
POLYMERASE CHAIN REACTION METHOD FOR
FINGERPRINTING GENOMES
`
`FIELD OF THE INVENTION
`
`This invention is directed toward a method of identi-
`fying segments of nucleic acid characteristic of a partic-
`ular genome by generating a set of discrete DNA ampli-
`fication products characteristic of the genome. This set
`of discrete DNA products can generate a fingerprint
`that can be used to identify the genome.
`
`BACKGROUND OF THE INVENTION
`
`For many purposes, it is important to be able to iden-
`tify the species to which an organism belongs rapidly
`and accurately. Such rapid identification is necessary
`for pathogens such as viruses, bacteria, protozoa, and
`multicellular parasites, and assists in diagnosis and treat-
`ment of human and animal disease, as well as studies in
`epidemiology and ecology. In particular, because of the
`rapid growth of bacteria and the necessity for immedi-
`ate and accurate treatment of diseases caused by them,
`it is especially important to have a fast method of identi-
`fication.
Traditionally, identification and classification of bacterial
species has been performed by study of morphology,
determination of nutritional requirements or fermentation
patterns, determination of antibiotic resistance,
comparison of isoenzyme patterns, or determination
of sensitivity to bacteriophage strains. These methods
are time-consuming, typically requiring at least 48
to 72 hours, often much more. Other more recent methods
include the determination of RNA sequences (Woese,
in “Evolution in Procaryotes” (Schleifer and Stackebrandt,
Eds., Academic Press, London, 1986)), the use
of strain-specific fluorescent oligonucleotides (DeLong
et al., Science 243, 1360-1363 (1989); Amann et al., J.
Bact. 172, 762-770 (1990)), and the polymerase chain
reaction (PCR) technique (U.S. Pat. Nos. 4,683,195 and
4,683,202 to Mullis et al.; Mullis & Faloona, Methods
Enzymol. 154, 335-350 (1987)).
`In addition, DNA markers genetically linked to a
`selected trait can be used for diagnostic procedures.
`The DNA markers commonly used are restriction frag-
`ment length polymorphisms (RFLPs). Polymorphisms
`useful in genetic mapping are those polymorphisms that
`segregate in populations. Traditionally, RFLPs have
`been detected by hybridization methodology (e.g.
`Southern blot), but such techniques are time-consuming
`and inefficient. Alternative methods include assays for
`polymorphisms using PCR.
`The PCR method allows amplification of a selected
`region of DNA by providing two DNA primers, each
`of which is complementary to a portion of one strand
`within the selected region of DNA. These primers are
`used to hybridize to the separated strands within the
`region of DNA sought to be amplified, forming DNA
`molecules that are partially single-stranded and par-
`tially double-stranded. The double-stranded regions are
`then extended by the action of DNA polymerase, form-
`ing completely double-stranded molecules. These dou-
`ble-stranded molecules are then denatured and the de-
`natured single strands are rehybridized to the primers.
`Repetition of this process through a number of cycles
`results in the generation of DNA strands that corre-
`spond in sequence to the region between the originally
`used primers. Specific PCR primer pairs can be used to
`identify genes characteristic of a particular species or
`even strain. PCR also obviates the need for cloning in
`order to compare the sequences of genes from related
`organisms, allowing the very rapid construction of phy-
`logenies based on DNA sequence. For epidemiological
`purposes, specific primers to informative pathogenic
`features can be used in conjunction with PCR to iden-
`tify pathogenic organisms.
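As a rough illustration of why repeated thermocycling is so effective, the sketch below estimates how many copies of a target region exist after a given number of cycles. It is not part of the disclosure: the function name `copies_after_cycles` and the per-cycle `efficiency` parameter are assumptions made for illustration, with an efficiency of 1.0 standing for ideal doubling in every cycle.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Estimate target copies after PCR.

    Assumes (for illustration only) that every cycle multiplies the
    copy number by (1 + efficiency), where efficiency lies in [0, 1].
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare ideal doubling with a less efficient reaction.
    for n in (10, 20, 30):
        ideal = copies_after_cycles(1, n)
        partial = copies_after_cycles(1, n, efficiency=0.8)
        print(f"{n} cycles: ideal {ideal:.3g}, 80% efficiency {partial:.3g}")
```

Under ideal doubling a single template molecule yields about a thousand copies after 10 cycles and about a billion after 30, which is why a small amount of starting material suffices.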
`Although PCR is a very powerful method for ampli-
`fying DNA, conventional PCR procedures require the
`use of at least two separate primers complementary to
`specific regions of the genome to be amplified. This
`requirement means that primers cannot be prepared
`unless the target DNA sequence information is avail-
`able, and the primers must be “custom built” for each
`location within the genome of each species or strain
`whose DNA is to be amplified.
`Although the newer methods have advantages over
`previous methods for genome identification, there is
`still a need for a rapid, simple method that can be ap-
`plied to any species for which DNA can be prepared
`and that does not require reagents that are specific for
`each species or knowledge of the DNA sequence of the
`isolate being identified. It is also desirable that such a
`method be capable of identifying a species from a rela-
`tively small quantity of biological material. Addition-
`ally, it is highly desirable that such a method is also
`capable of generating polymorphisms useful in genetic
`mapping, especially of eukaryotes.
`In addition to identification of related plant, animal
`and bacteria species, DNA segments or “markers” may
`be used to construct human genetic maps for genome
`analysis. Goals for the present human genome project
`include the production of a genetic map and an ordered
`array of clones along the genome. Using a genetic map,
`inherited phenotypes such as those that cause genetic
`diseases, can be localized on the map and ultimately
`cloned. The neurofibromatosis gene is a recent example
of this strategy (Xu et al., Cell 62:599-608 (1990)). The
`genetic map is a useful framework upon which to as-
`semble partially completed arrays of clones. In the short
`term, it is likely that arrays of human genomic clones
`such as cosmids or yeast artificial chromosomes (YACs,
Burke et al., Science 236:806-812 (1987)) will form disconnected
contigs that can be oriented relative to each
`other with probes that are on the genetic map or the in
situ map (Lichter et al., Science 247:64-69 (1990)), or
`both. The usefulness of the contig map will depend on
`its relation to interesting genes, the locations of which
`may only be known genetically. Similarly, the restric-
`tion maps of the human genome generated by pulsed
`field electrophoresis (PFE) of large DNA fragments,
`are unlikely to be completed without the aid of closely
`spaced markers to orient partially completed maps.
`Thus, a restriction map and an array of clones covering
`an entire mammalian genome, for example the mouse
`genome, is desirable.
`Recently, RFLPs that have Variable Number Tan-
`dem Repeats (VNTRs) have become a method of
`choice for human mapping because such VNTRs tend
`to have multiple alleles and are genetically informative
`because polymorphisms are more likely to be segregat-
`ing within a family. The production of fingerprints by
`Southern blotting with VNTRs (Jeffreys et al., Nature
`316:76—79 (1985)) has proven useful in forensics. There
`are two classes of VNTRs; one having repeat units of 9
`to 40 base pairs, and the other consisting of minisatellite
`DNA with repeats of two or three base pairs. The
longer VNTRs have tended to be in the proterminal
`regions of autosomes. VNTR consensus sequences may
`be used to display a fingerprint. VNTR fingerprints
`have been used to assign polymorphisms in the mouse
`(Julier et al., Proc. Natl. Acad. Sci. USA, 87:4585—4589
`(1990)), but these polymorphisms must be cloned to be
`of use in application to restriction mapping or contig
`assembly. VNTR probes are useful in the mouse be-
`cause a large number of crosses are likely to be informa-
`tive at a particular position.
`The mouse offers the opportunity to map in interspe-
`cific crosses which have a high level of polymorphism
`relative to most other inbred lines. A dense genetic map
`of DNA markers would facilitate cloning genes that
`have been mapped genetically in the mouse. Cloning
`such genes would be aided by the identification of very
`closely linked DNA polymorphisms. About 3000
`mapped DNA polymorphisms are needed to provide a
`good probability of one polymorphism being within 500
`kb of the gene. To place so many DNA markers on the
`map it is desirable to have a fast and cost-effective ge-
`netic mapping strategy.
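The figure of about 3000 markers can be checked with a back-of-envelope calculation. The sketch below assumes that markers fall uniformly at random along the genome (a Poisson approximation) and that the genome is about 3 × 10^9 bases, the animal genome size quoted elsewhere in this disclosure. Both assumptions, and the function names, are illustrative only.

```python
import math


def mean_spacing_bp(n_markers: int, genome_bp: float) -> float:
    """Average distance between neighbouring markers."""
    return genome_bp / n_markers


def prob_marker_within(n_markers: int, genome_bp: float,
                       window_bp: float) -> float:
    """Probability that at least one marker lies within window_bp on
    either side of a gene, assuming uniformly random marker positions
    (Poisson approximation; an assumption of this sketch)."""
    expected_in_window = n_markers * (2.0 * window_bp) / genome_bp
    return 1.0 - math.exp(-expected_in_window)


if __name__ == "__main__":
    genome = 3e9   # assumed genome size in bases
    window = 5e5   # 500 kb
    for n in (1000, 3000, 10000):
        p = prob_marker_within(n, genome, window)
        print(f"{n} markers: spacing {mean_spacing_bp(n, genome):.0f} bp, "
              f"P(marker within 500 kb) = {p:.2f}")
```

With 3000 markers the mean spacing is one marker per megabase, and under these assumptions the chance of a marker lying within 500 kb of any given gene is about 0.63.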
SUMMARY OF THE INVENTION

Accordingly, the method of the present invention,
referred to herein as consensus sequence primed polymerase
chain reaction or “CP-PCR” fingerprinting,
provides a distinctive variation of the PCR technique
by employing “consensus” sequence polynucleotide
primers as defined herein. We have unexpectedly found
that the use of at least one consensus primer, preferably
a structural RNA consensus primer, in a standard PCR
amplification procedure reproducibly generates specific
discrete products that can be resolved into a manageable
number of individual bands providing a species
“fingerprint”. The CP-PCR method is suitable for the
rapid identification and classification of organisms
throughout the plant, prokaryotic or eukaryotic kingdoms
and for the generation of polymorphisms suitable
for genetic mapping of eukaryotes. Only a small sample
of biological material is needed, and knowledge of the
target DNA sequence to be identified is not required. In
addition, reagents specific for a given species are not
required.
In general, CP-PCR is a method for generating a set
of discrete DNA products (“amplification products”)
characteristic of a genome by priming target nucleic
acid obtained from a genome with at least one single-stranded
primer to form primed nucleic acid. The
primed nucleic acid is then amplified by performing at
least one cycle, and preferably at least 10 cycles, of
polymerase chain reaction (PCR) amplification to
generate a set of discrete DNA amplification products
characteristic of the genome.
The genome to which the CP-PCR method is applied
can be a viral genome; a bacterial genome, including
Staphylococcus and Streptococcus; a plant genome,
including rice, maize, or soybean; or an animal genome,
including a human genome. It can also be a genome of
a cultured cell line. The cultured cell line can be a chimeric
cell line with at least one human chromosome in
a non-human background, i.e., a hybrid cell line.
The CP-PCR method can be used to identify an organism
as a species of a genus of bacteria, for example,
Staphylococcus, from a number of different species.
Similarly, the method can be used to determine the
strain to which an isolate of the genus Streptococcus
belongs, by comparing the DNA amplification products
produced by CP-PCR for the isolate to the patterns
produced from known strains with the same primer.
The CP-PCR method can also be used to verify the
assignment of a bacterial isolate to a species by comparing
the CP-PCR fingerprint from the isolate with the
CP-PCR fingerprints produced by known bacterial
species with the same primer. For this application, the
primer is chosen as described herein to maximize interspecific
difference of the discrete DNA amplification
products.
The target nucleic acid of the genome can be DNA,
RNA or polynucleotide molecules. If the CP-PCR
method is used to characterize RNA, the method also
preferably includes the step of extending the primed
RNA with an enzyme having reverse transcriptase activity
to produce a hybrid DNA-RNA molecule, and
priming the DNA of the hybrid with an arbitrary single-stranded
primer. In this application, the enzyme with
reverse transcriptase activity can be avian myeloblastosis
virus reverse transcriptase or Moloney leukemia
virus reverse transcriptase.
The discrete DNA amplification products produced
by the CP-PCR method can be manipulated in a number
of ways. For example, they can be separated in a medium
capable of separating DNA fragments by size,
such as a polyacrylamide or agarose gel, in order to
produce a fingerprint of the amplification products as
separated bands. Additionally, at least one separated
band can be isolated from the fingerprint and reamplified
by conventional PCR. The isolated separated band
can also be cleaved with a restriction endonuclease. The
reamplified fragments can then be isolated and cloned in
a bacterial host. The isolated band or reamplified fragments
can be sequenced. These methods are particularly
useful in the detection and isolation of DNA sequences
that represent polymorphisms differing from individual
to individual of a species.
The ability of the CP-PCR method to generate polymorphisms
makes it useful, as well, in the mapping and
characterization of eukaryotic genomes, including plant
genomes, animal genomes, and the human genome.
These polymorphisms are particularly useful in the
generation of linkage maps and can be correlated with
RFLPs and other markers.
Consensus primers, particularly structural RNA consensus
primers, are also contemplated, as are kits containing
the primers in combination with control genomic
DNA for typing isolated genomes.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings forming a portion of this disclosure:
FIG. 1 shows the CP-PCR patterns produced by
using isolates representing five different species of
Staphylococcus, and illustrates the differences apparent
between species, as described in Example 2. PCR was
performed using the primers T5A in group a, T3A in
group b, or T5A plus T3A in group c, at 50° C. Each
numbered lane consists of three adjacent lanes having
80, 16 or 3.2 ng of template. Lane 1: S. haemolyticus CC
1212. Lane 2: S. hominis 27844. Lane 3: S. warneri
CPB10E2. Lane 4: S. cohnii IL 143. Lane 5: S. aureus
ISP-8.

FIG. 2 shows the CP-PCR patterns produced by
using forty strains of bacteria from three different genera,
and illustrates the differences detectable between
the strains and the general similarity of the patterns
from the same species, as described in Example 2. PCR
`
`
`
was performed using the primers T5A plus T3A at 50°
C. with 100 ng of template. The templates in lanes 1 to
17 contain Streptococcus DNAs. Lanes 18 and 19 contain
Enterococcus DNAs. Lanes 20 to 40 contain Staphylococcus
DNAs. See Table 1 for the strains used in
each lane.

FIG. 3 shows the CP-PCR patterns produced by
using genomes from species across the three kingdoms
and illustrates the existence of polymorphisms, as described
in Example 2. The reaction was performed
using 50 ng of template under the standard PCR conditions.
The low temperature annealing step was 50° C.
Lanes 1 to 9 used the primers T5A and T3A. Lanes 10
to 19 used T5B and T3A. See Table 1 for the strains
used in each lane.

DETAILED DESCRIPTION OF THE
INVENTION

In order that the invention herein described may be
more fully understood, the following detailed description
is set forth.

This invention relates to a method for generating a set
of discrete DNA amplification products characteristic
of a genome. This set of discrete DNA amplification
products can be resolved by techniques such as gel
electrophoresis, producing a distinctive pattern, known
as a “fingerprint”, that can be used to identify the genome.
This method uses a distinctive and novel variation
of the polymerase chain reaction (PCR) technique
that employs one or more consensus primers based on a
consensus sequence described herein and is therefore
designated the “consensus sequence primed polymerase
chain reaction” (“CP-PCR”) method.

I. THE GENERAL METHOD

In general, the method of the invention involves the
following steps:
(1) rendering target nucleic acids of the genome accessible
to priming;
(2) priming the target nucleic acids of the genome
with a preselected single-stranded consensus sequence
primer to form primed nucleic acids;
(3) performing a number of cycles of PCR on the
primed nucleic acids to generate a set of discrete
amplification products; and
(4) if the discrete DNA amplification products are to
be used for the identification of a genome, comparing
the amplification products with those produced
from nucleic acids obtained from genomes of
known species.
Alternatively, the amplification products produced
by the invention can be used to assemble genetic maps
for genome analysis.
Each of these steps is discussed in detail below.

A. Selection of Genome

The method of the present invention is particularly
well suited to the generation of discrete DNA amplification
products from nucleic acids obtained from genomes
of all sizes from 5 × 10^4 nucleotide bases (viruses)
to 3 × 10^9 bases and greater (animals and plants).
“Nucleic acids” as that term is used herein means that
class of molecules including single-stranded and double-stranded
deoxyribonucleic acid (DNA), ribonucleic
acid (RNA) and polynucleotides.
The CP-PCR method can be applied to such economically
important plants as rice, maize, and soybean. It
can also be applied to the human genome and to the
genome of a cultured cell line. The cultured cell line can
be chimeric with at least one human chromosome in an
otherwise non-human background. The non-human
background can be rodent, such as mouse or Chinese
hamster.
As described in Example 2, infra, the DNA amplification
products can be used to determine that an unidentified
sample of an organism, such as a bacterium,
belongs to the genus Staphylococcus and can be used
further to determine to which species and/or strain of
that genus the organism belongs.
`
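The comparison of an unknown fingerprint with fingerprints of known species can be sketched as band matching. The example below is only an illustration under stated assumptions: each fingerprint is reduced to a list of band sizes in base pairs, two bands are treated as the same if their sizes agree within a relative tolerance, and the Dice coefficient is used as the similarity score. None of these choices is specified in the disclosure, and the species names and band sizes are made up.

```python
from typing import Dict, List, Tuple


def shared_bands(a: List[float], b: List[float], tol: float = 0.03) -> int:
    """Count bands in `a` that pair with a distinct band in `b` whose
    size agrees within the relative tolerance `tol` (greedy pairing)."""
    remaining = sorted(b)
    shared = 0
    for size in sorted(a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tol * max(size, other):
                shared += 1
                del remaining[i]  # each reference band is used once
                break
    return shared


def dice_similarity(a: List[float], b: List[float], tol: float = 0.03) -> float:
    """Dice coefficient: 2 * shared / (bands in a + bands in b)."""
    if not a and not b:
        return 1.0
    return 2.0 * shared_bands(a, b, tol) / (len(a) + len(b))


def best_match(unknown: List[float], references: Dict[str, List[float]],
               tol: float = 0.03) -> Tuple[str, float]:
    """Return the reference name with the highest similarity and its score."""
    scores = {name: dice_similarity(unknown, bands, tol)
              for name, bands in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


if __name__ == "__main__":
    # Hypothetical reference fingerprints (band sizes in bp).
    references = {
        "species_A": [150, 300, 520, 800],
        "species_B": [150, 410, 650],
    }
    unknown = [152, 298, 525, 790]
    print(best_match(unknown, references))
```

In practice the tolerance would depend on the resolution of the gel, and band intensity could also be taken into account; both are left out of this sketch.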
`B. Rendering the Nucleic Acids of the Genome
`Accessible to Priming
`
`“Genomic DNA” is used in an art recognized manner
`to refer to a population of DNA that comprises the
`complete genetic component of a species. Thus geno-
`mic DNA comprises the complete set of genes present
`in a preselected species. The complete set of genes in a
`species is also referred to as a genome. Depending on
`the species, genomic DNA can vary in complexity, and
`in number of nucleic acid molecules. In higher organ-
`isms, genomic DNA is organized into discrete nucleic
`acid molecules (chromosomes).
`For species low in the evolutionary scale, such as
`bacteria, viruses, yeast, fungi and the like, a genome is
`significantly less complex than for a species high in the
evolutionary scale. For example, whereas E. coli is estimated
to contain approximately 2.4 × 10^9 grams per
mole of haploid genome, man contains about 7.4 × 10^12
grams per mole of haploid genome.
`Genomic DNA is typically prepared by bulk isolation
`of the total population of high molecular weight nucleic
`acid molecules present in a biological material derived
`from a single member of a species. Genomic DNA can
`be prepared from a tissue sample, from a whole organ-
`ism or from a sample of cells derived from the organism.
Exemplary biological materials for preparing mammalian
genomic DNA include a sample of blood, muscle
or skin cells, tissue biopsy or cells cultured from
tissue. Methods for isolating high molecular weight
DNA are well known. See, for example, Maniatis et al.,
in Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, New York (1982); and U.S.
Pat. No. 4,800,159 to Mullis et al.
`Rendering the nucleic acids of the genome accessible
`to priming requires that the nucleic acids be available
for base-pairing by primers and that DNA polymerases
`and other enzymes that act on the primer-template com-
`plex can do so without interference. The nucleic acids
`must be substantially free of protein that would interfere
`with priming or the PCR process, especially active
`nuclease, as well as being substantially free of nonpro-
`tein inhibitors of polymerase action such as heavy met-
`als.
`A number of methods well-known in the art are suit-
`able for the preparation of nucleic acids in a condition
`accessible to priming. Typically, such methods involve
treatment of cells or other nucleic acid-containing
`structures, such as virus particles, with a protease such
`as proteinase K or pronase and a strong detergent such
`as sodium dodecyl sulfate (“SDS”) or sodium lauryl
`sarcosinate (“Sarkosyl”) to lyse the cells. This is fol-
`lowed by extraction with phenol and chloroform to
`yield an aqueous phase containing the nucleic acid. This
`nucleic acid is then precipitated with ethanol and redis-
`solved as needed. (See Example 1, infra).
Alternatively, as where the genome is in bacteria, a
small portion (~0.5 mm²) of a single bacterial colony
can be removed with a 200-µL automatic pipette tip and
suspended in 5 µL of TE (0.01M Tris-HCl, pH 8.0, 1
mM EDTA) in a plastic microfuge tube and boiled for
5 minutes. After the sample is boiled, the debris is pelleted
by centrifugation. The CP-PCR method can then
be performed directly on the nucleic acids present in the
supernatant sample after appropriate dilution.
`In some applications, it is possible to introduce sam-
`ples such as blood or bacteria directly into the PCR
`protocol as described below without any preliminary
`step because the first cycle at 94° C. bursts the cells and
`inactivates any enzymes present.
`
`C. Priming the Target Nucleic Acids
`
1. The Consensus Primer Sequence
`a. General Considerations
`
`The sample of target nucleic acids is primed with a
`single-stranded primer. Individual single-stranded prim-
`ers, pairs of single-stranded primers or a mixture of
`single-stranded primers can be used.
A primer for use in this invention is a consensus
`sequence polynucleotide primer, or consensus primer.
`A consensus primer is a polynucleotide having a nucleo-
`tide sequence that comprises a region at its 3’ terminus
`that is homologous to a consensus sequence derived
`from a family of related genes within a genome, or
`derived from related genes found in the genomes of
different species. The related genes from which a consensus
primer is derived are a class of genes that occur
as a cluster within the genome.
`Clusters of related genes are known to occur for a
`variety of gene families, any of which are suitable as a
`source of related genes for deriving a consensus primer
`for use in the present invention. Gene clusters are re-
`gions of a genome in which related genes are organized
`within a single nucleic acid molecule of the genome, i.e.,
`are genetically linked. Gene clusters comprise two do-
`mains: (1) the nucleotide sequences that define each of
`the related genes that are members of the cluster, and
`(2) the nucleotide sequences that define the spacer re-
`gion between each member of the related genes of the
`cluster. Whereas the related genes (members) of the
`cluster are conserved when compared at the level of
`sub-species, species, family, order or other division of
`evolutionary relatedness, the spacer region of a gene
`cluster is more variable in nucleotide sequence than the
`nucleotide sequence defining a member of a cluster.
`Variability in the spacer regions of a gene cluster
`provides the polymorphisms that produce a fingerprint
`by the present methods which is characteristic of the
`organism being analyzed. Variability in spacer regions
`can be manifest by differences in actual sequence, by
`differences in spacer length between members of the
`cluster, and even by differences in overall organization
`of the members of a cluster.
`Organization of related genes in a cluster can vary
`both in the linear order of the members of the cluster on
`the nucleic acid molecule defining the cluster and in the
`orientation of each member of the cluster relative to one
`another.
`
`Typical and preferred gene clusters are the structural
`RNA families, namely the family of genes that encode
`transfer RNA (tRNA) molecules, and the family of
`genes that encode ribosomal RNA (rRNA) molecules
known as 28S, 16S and 5S rRNAs. Other gene clusters
`are the linked genetic elements of an operon.
`The tRNA gene cluster is particularly preferred and
`is exemplary of the general methods described herein.
Although the sequence of the primer can vary
widely, so long as it comprises a consensus sequence at
its 3’ terminus, some guidelines to primer selection are
found in Innis and Gelfand, “Optimization of PCRs,” in
PCR Protocols: A Guide to Methods and Applications (M.
A. Innis, D. H. Gelfand, J. J. Sninsky and T. J. White,
eds., Academic Press, New York, 1990), pp. 3-12, incorporated
herein by this reference. Briefly, the primer
typically has 50 to 60% G+C composition and is free of
runs of three or more consecutive C’s or G’s at the
3’-end or of palindromic sequences, although having a
(G+C)-rich region near the 3’-end may be desirable.
`These guidelines, however, are general and intended to
`be nonlimiting. Additionally, in many applications it is
`desirable to avoid primers with a T at the 3’ end because
`such primers can prime relatively efficiently at mis-
`matches, creating a degree of mismatching greater than
`desired, and affect the background amplification.
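The guidelines above lend themselves to a simple automated check. The following sketch is illustrative only and makes several assumptions that are not in the text: a "run of three or more consecutive C's or G's" is read as any stretch of three or more G/C bases within the last five bases, a palindrome is any six-base window equal to its own reverse complement, and the primer sequences shown are invented for the example. As the text notes, the guidelines are nonlimiting, so the function reports warnings rather than rejecting a primer.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_palindrome(seq: str, window: int = 6) -> bool:
    """True if any `window`-base stretch equals its reverse complement."""
    seq = seq.upper()
    return any(
        seq[i:i + window] == reverse_complement(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    )


def check_primer(primer: str, tail: int = 5, window: int = 6) -> list:
    """Return a list of guideline warnings; an empty list means none."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    warnings = []
    gc = gc_fraction(seq)
    if not 0.50 <= gc <= 0.60:
        warnings.append(f"G+C content {gc:.0%} is outside 50 to 60%")
    if re.search(r"[GC]{3,}", seq[-tail:]):
        warnings.append("run of three or more G/C bases at the 3' end")
    if has_palindrome(seq, window):
        warnings.append("contains a palindromic stretch")
    if seq.endswith("T"):
        warnings.append("3'-terminal base is T")
    return warnings


if __name__ == "__main__":
    # Both sequences are hypothetical examples, not primers from this patent.
    for candidate in ("ACCTGAGGTCAACGTGACGA", "ATATATGAATTCATATGGGT"):
        print(candidate, check_primer(candidate))
```

The first invented sequence passes all four checks, while the second trips every one of them.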
`The CP-PCR method is based on the rationale that
`
`for any preselected gene cluster, which comprises at
`least two related and genetically linked genes, there is a
`spacer region between the linked genes which is vari-
`able and contains nucleotide sequence differences when
compared to the same region from a different sub-species,
species, genus, family or other evolutionary division
of organisms having members of the gene cluster.
The consensus primer is selected to amplify one or more
specific primer extension products that contain the
`nucleotide sequence of one or more of the spacer re-
`gions between two genes of a cluster.
`The consensus primer amplifies DNA segments con-
`taining spacer regions because the primer is selected to
`provide a 3’ terminus for primer extension that “points”
the direction of primer extension across the spacer region.
The consensus sequence of the primer is selected
to have enough homology with a consensus
region within the individual members of a gene
cluster that the primer can be expected to anneal to
many consensus sequences contained within a variety of
the members of a gene cluster. Some of these will be
`within a few hundred basepairs of each other and on
`opposite strands thereby satisfying the requirements for
`PCR amplification. Thus, the sequences between these
`consensus sequence positions will be PCR amplifiable.
`The extent to which sequences amplify will depend on
the efficiency of priming at each pair of primer annealing
sites. Because the sequence of the primer is selected
`to contain some degree of homology with a consensus
`sequence with respect to the target nucleic acid se-
`quence of the genome, a substantial degree of hybridiza-
`tion between the DNA strands of the primer and the
`target nucleic acids of the genome is expected to occur.
`“Substantial degree of hybridization” is defined herein
`to mean in the context of a primer extension reaction
`thermocycle in which primer annealing occurs, that the
`hybridizing conditions favor annealing of homologous
`nucleotide sequences under “high stringency condi-
`tions”. In some embodiments, where less evolutionary
`relatedness is desired, the hybridizing conditions can be
`carried out under low stringency or intermediate strin-
`gency conditions so that up to 10% of the nucleotide
`bases of a primer sequence are paired with inappropri-
`ate (non-complementary) bases in the target nucleic
`acid, e.g. a guanine base in the primer is paired with an
`adenine base in the target nucleic acid.
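The rationale above, in which a single primer anneals at several sites and any two nearby sites on opposite strands define an amplifiable segment, can be sketched as a small in silico search. This is an illustration under stated assumptions, not the method of the disclosure: annealing is modelled as ungapped matching with a limited number of substitutions, `max_mismatch_fraction` stands in for stringency (0.10 would correspond to the low-stringency figure of up to 10% mispaired bases), and `max_product_bp` is an arbitrary cut-off. All names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_sites(template: str, pattern: str, max_mismatches: int) -> list:
    """Start positions where `pattern` aligns to `template` with at most
    `max_mismatches` substitutions (no gaps)."""
    k = len(pattern)
    hits = []
    for i in range(len(template) - k + 1):
        mismatches = sum(1 for a, b in zip(template[i:i + k], pattern) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def predict_products(template: str, primer: str,
                     max_mismatch_fraction: float = 0.0,
                     max_product_bp: int = 2000) -> list:
    """List (start, end, length) of segments bounded by one forward and
    one reverse annealing site of the same primer."""
    template = template.upper()
    primer = primer.upper()
    k = len(primer)
    allowed = int(max_mismatch_fraction * k)
    # Forward sites: primer reads like the top strand and extends rightward.
    forward = find_sites(template, primer, allowed)
    # Reverse sites: primer pairs with the top strand and extends leftward.
    reverse = find_sites(template, reverse_complement(primer), allowed)
    products = []
    for f in forward:
        for r in reverse:
            length = r + k - f
            if r >= f + k and length <= max_product_bp:
                products.append((f, r + k, length))
    return sorted(products, key=lambda p: p[2])


if __name__ == "__main__":
    primer = "ACCTGAGG"  # hypothetical primer
    template = ("TT" + primer + "GATTACAGATTACA"
                + reverse_complement(primer) + "AA")
    print(predict_products(template, primer))
```

A fuller model would weight mismatches near the 3' terminus more heavily, since those matter most for primer extension; the sketch treats all positions alike.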
`As used herein, the phrase “internal mismatching” in
`its various grammatical forms refers to non-complemen-
`tary nucleotide bases in the primer, relative to a tem-
`plate to which it is hybridized, that occur between the
5’-terminal most and 3’-terminal most bases of the
primer that are complementary to the template. Thus,
5’-terminal and/or 3’-terminal non-complementary
bases are not “internally mismatched” bases. A “substantial
degree” of “internal mismatching” is such that at
`least 6.5% of the nucleotide bases of the primer se-
`quence are paired with inappropriate bases in the target
`nucleic acid.
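The definition of internal mismatching can be made concrete with a short calculation. In the sketch below, `site` is assumed to be the sequence that a perfectly matched primer would have at the annealing position, written in the same 5' to 3' orientation as the primer, so that a mismatch is simply a position where the two strings differ. That representation, the function names and the example sequences are assumptions made for illustration.

```python
def internal_mismatch_fraction(primer: str, site: str) -> float:
    """Fraction of primer bases that are internally mismatched.

    Only mismatches lying between the 5'-most and 3'-most matching
    positions are counted; unmatched bases at either end are ignored.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    if not primer:
        raise ValueError("primer must not be empty")
    matches = [p == s for p, s in zip(primer.upper(), site.upper())]
    if True not in matches:
        return 0.0  # nothing pairs, so no mismatch is internal
    first = matches.index(True)
    last = len(matches) - 1 - matches[::-1].index(True)
    internal = sum(1 for m in matches[first:last + 1] if not m)
    return internal / len(primer)


def is_substantial(primer: str, site: str, threshold: float = 0.065) -> bool:
    """True if at least 6.5% of the primer bases are internally mismatched."""
    return internal_mismatch_fraction(primer, site) >= threshold


if __name__ == "__main__":
    primer = "ACGTACGTACGTACGTACGT"        # hypothetical 20-base primer
    one_internal = "TCGTACGTACCTACGTACGT"  # 5'-terminal plus one internal mismatch
    two_internal = "ACGTAGGTACCTACGTACGT"  # two internal mismatches
    print(internal_mismatch_fraction(primer, one_internal),
          is_substantial(primer, one_internal))
    print(internal_mismatch_fraction(primer, two_internal),
          is_substantial(primer, two_internal))
```

For a 20-base primer, one internal mismatch (5%) falls below the 6.5% threshold and two (10%) exceed it; the mismatch at the 5' terminus in the first example is not counted.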
`In the CP-PCR method of the invention the genome
`may be primed with a single consensus primer, a combi-
`nation of two or more primers or a mixture of heteroge-
`neous primers, each individual primer in the mixture
`having a different, but related sequence. When a mix-
`ture of primers is used, some, but not all, of the primers
`can match more efficiently. An example of use of a
`mixture of primers is provided in Example 2, infra.
`Preferably, the consensus primer is about 10 to about
`50 nucleotide bases long, and more preferably, about 17
`to about 40 bases long. In principle, the shorter the
`oligonucleotide, the more perfect a match must be in
`order to permit priming. The primer can be of any
`sequence so long as it comprises a consensus sequence
`as defined herein. The primer can have sequence red