`
`Highly parallel genomic assays
`
`Jian-Bing Fan*, Mark S. Chee‡ and Kevin L. Gunderson*
`Abstract | Recent developments in highly parallel genome-wide assays are transforming the
`study of human health and disease. High-resolution whole-genome association studies of
`complex diseases are finally being undertaken after much hypothesizing about their merit
`for finding disease loci. The availability of inexpensive high-density SNP-genotyping arrays
`has made this feasible. Cancer biology will also be transformed by high-resolution genomic
`and epigenomic analysis. In the future, most cancers might be staged by high-resolution
`molecular profiling rather than by gross cytological analysis. Here, we describe the key
`developments that enable highly parallel genomic assays.
`
`The elucidation of the role of human biology in health
`and disease requires a thorough understanding of the
`relationship between genomic information and the
`corresponding phenotype. The ability to collect this
`genomic information — including sequence, genotypic
`variation, expression levels and epigenetic status —
`rapidly and inexpensively has long been a bottleneck
`to realizing this goal. Cancer biology is one example, in
`which the use of genomic information to stage tumours
`should aid initial diagnosis and subsequent treatment1–5.
`The development and application of novel, highly paral-
`lel genomic assay systems have put us at the cusp of a
`genetic information explosion that will allow medical
`clinics to use individualized genetic information to make
`diagnostic, prognostic and therapeutic decisions.
`The first generation of microarray platforms for
`highly parallel genomic analysis was developed over
`15 years ago6,7. These platforms facilitated the development, more
`than 10 years ago, of intrinsically parallel assays (more
`commonly known as gene-expression-profiling assays)
`to measure mRNA abundance8,9. However, intrinsically
`parallel whole-genome approaches to genotyping, epi-
`genetic profiling and sequencing have only recently been
`developed10,11, and are enabling scientific studies that
`were previously not feasible, such as whole-genome
`linkage disequilibrium (LD) association studies of
`case–control populations12,13.
`In this Review, we focus on the development of
`methods and platforms that have enabled highly parallel
`genomic assays for genotyping, copy-number measurements,
`sequencing and detecting loss of heterozygosity
`(LOH), allele-specific expression and methylation.
`We conclude with a discussion of parallel assays for
`epigenomics and some of the attendant challenges. We
`do not discuss array-based gene-expression profiling
`because it is a relatively mature technology with many
`excellent reviews (for example, for a recent review of
`gene-expression and tiling arrays see REF. 14).
`
`Highly parallel genotyping assays
`The ability to perform these highly parallel genomic
`assays depends on two fundamental characteristics of
`the assay: a highly parallel array-based read-out and an
`intrinsically scalable, multiplexing sample preparation.
`Gene-expression profiling was the first genomic assay to
`be parallelized, using high-density DNA arrays for read-
`out and a single-tube sample preparation9,15,16. Designing
`and developing a multiplexed sample preparation for
`genotyping has been more challenging, mainly because
`the need to assay a locus at single-base resolution in the
`context of the entire human genome introduces specific-
`ity problems and because of the need to detect analytes at
`low concentrations. A powerful way of obtaining genomic
`specificity is the detection of physically coincident events
`on genomic targets. PCR is a good example — ampli-
`fication of a specific locus requires coincident anneal-
`ing and extension of two primers at the desired locus.
`Other examples include ligase chain reaction (LCR) and
`padlock-probe amplification, both of which use an enzy-
`matic approach17–22. Despite the potential of the other
`assay formats, PCR has been the workhorse of genomic
`sample preparation because of its ability to efficiently
`amplify a specific locus within the context of the entire
`genome. Unfortunately, the ability of PCR to multiplex in
`array-based applications has been limited23.
`
`Multiplex PCR. Multiplex PCR reactions have been
`plagued by primer dimer (PD) formation and unequal
`amplification rates that depend on amplicon length,
`sequence and priming efficiency24,25. The PD effect is
`exacerbated by simultaneous amplification of many
`loci because the number of potential primer–primer
`
`Cancer staging
`Classification of cancer types
`into groups that reflect their
`localization, metastasis,
`prognosis, recommended
`treatment regimen and
`predicted clinical outcome.
`
`Linkage disequilibrium
`The property of two
`polymorphic loci in a
`population such that the
`polymorphic states at the two
`loci are not independent of one
`another, and as a result the
`state of the polymorphism at
`one locus has a higher
`probability of being associated
`with a particular state at the
`second locus. This association
`is usually measured with a
`metric called r² that ranges
`between zero (no linkage) and
`one (complete linkage).
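As a small worked illustration of the r² metric defined above, the sketch below computes it from haplotype and allele frequencies using the standard definition r² = D² / [pA(1 - pA) pB(1 - pB)], where D = pAB - pA pB. The function name and the example frequencies are illustrative assumptions, not values taken from this Review.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies, chosen only to show the two extremes.
independent = ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5)  # alleles pair at random
complete = ld_r_squared(p_ab=0.5, p_a=0.5, p_b=0.5)      # alleles always co-occur
print(independent, complete)
```

With these inputs the first call gives zero and the second gives one, matching the two ends of the range described in the definition.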
`
`*Illumina Inc., 9885 Towne
`Centre Drive, San Diego,
`California 92121, USA.
`‡Prognosys Biosciences Inc.,
`505 Coast Boulevard, La
`Jolla, California 92037, USA.
`Correspondence to K.L.G.
`e-mail:
`kgunderson@illumina.com
`doi:10.1038/nrg1901
`
`632 | AUGUST 2006 | VOLUME 7
`
` www.nature.com/reviews/genetics
`
`© 2006 Nature Publishing Group
`
`
`Ariosa Exhibit 1025, pg. 1
`IPR2013-00276
`
`
`
`interactions increases roughly with the square of
`the number of loci26. Several approaches have been
`introduced to improve PCR multiplexing beyond the
`traditional limit of several dozen loci per reaction.
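The quadratic growth in potential primer-primer interactions can be made concrete with a short calculation. The sketch below assumes two primers per locus and counts every unordered primer pair, including each primer paired with itself. These counting conventions are assumptions for illustration; the count is an upper bound on possible pairings and says nothing about how many pairs actually form dimers.

```python
def primer_pair_count(n_loci: int, primers_per_locus: int = 2) -> int:
    """Count unordered primer pairs (self-pairs included) in one multiplex reaction."""
    n_primers = n_loci * primers_per_locus
    # n choose 2 distinct pairs, plus n self-pairs = n * (n + 1) / 2
    return n_primers * (n_primers + 1) // 2


for loci in (10, 100, 1000):
    print(f"{loci:>5} loci -> {primer_pair_count(loci):>9} potential primer pairs")
```

A tenfold increase in the number of loci raises the number of potential pairings roughly a hundredfold, which is why designs that suppress or avoid primer interactions become essential at high multiplexing levels.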
`PD artefacts during multiplex PCR have been
`reduced in part by using a two-stage PCR with bipartite
`primers that contain a genome-specific 3′ sequence that
`is concatenated to a 5′ universal or common adaptor
`sequence (also known as a ‘tag’ or ‘tail’ sequence). Two
`methods that use this approach are the multiplex geno-
`typing system (MGS)27 and homo-tag non-dimer system
`(HANDS)28 (FIG. 1). Both approaches use a two-stage PCR
`reaction that consists of several cycles of genomic prim-
`ing followed by cycles of universal priming. In HANDS,
`the switch from genome priming to universal priming is
`accomplished by increasing the annealing temperature
`and designing the genome-specific portion of the primers
`to have a lower Tm than the universal primers; in MGS
`the same switch is accomplished by inoculating the
`products from the initial genomic-priming PCR into a
`subsequent universal PCR reaction. Variations of this com-
`mon (universal)-primer PCR multiplex approach have
`been successfully used in large-scale array-based SNP
`discovery and genotyping assays29–31.
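The temperature switch used in HANDS depends on the genome-specific portion of each primer having a lower Tm than the universal primer. The sketch below illustrates that design check with the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C), which is only a rough estimate intended for short oligonucleotides and is not the method used in the cited studies. Both sequences are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) from the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical sequences, for illustration only.
genome_specific = "ATGCTAGCTAGGTCA"          # 3' locus-specific portion
universal_tag = "GCGTACTAGCGTACCACGTGTCG"    # 5' universal (tag) portion

tm_specific = wallace_tm(genome_specific)
tm_universal = wallace_tm(universal_tag)

# The switch works only if an annealing temperature exists that the universal
# primer tolerates but the genome-specific portion does not.
switch_possible = tm_specific < tm_universal
print(tm_specific, tm_universal, switch_possible)
```

In practice a nearest-neighbour thermodynamic model and empirical testing would replace this estimate; the point here is only the ordering of the two melting temperatures.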
`An alternative approach to reducing PD interactions
`in multiplex PCR is to physically separate the primers on
`a solid-phase substrate32,33. In one example, PCR primer
`pairs were physically immobilized on beads that were
`pooled and used in a solid-phase PCR34. Although much
`less efficient than liquid-phase PCR, this solid-phase
`PCR generated accurate and locus-specific genotype
`results. Despite the initial success of this approach, its
`large-scale implementation remains problematic. The
`efficiency of solid-phase PCR has been improved by
`immobilizing only a single PCR primer on the bead
`and using both primers and the target in solution. The
`primer in solution that is identical to the one on the bead
`is used at low concentration (~50 nM), whereas the other
`primer is used at high concentration (1 µM)35.
`Bioinformatics approaches have recently been used
`to select sets of ‘non-interacting’ primers in a multiplex
`PCR reaction36, which allows successful multiplexing
`of over 1,000 loci in a single reaction tube using stan-
`dard PCR primers with minimal PDs. The scalability
`of this approach to larger numbers of loci remains to
`be demonstrated.
`
`Genotyping using universal arrays. Two successful
`highly multiplexed PCR-based genotyping assays that
`use universal PCR are the molecular inversion probe
`(MIP) assay37 and the GoldenGate assay38. These assays
`were used extensively in phase I of the International
`HapMap Project39 (FIG. 2a,b). Locus specificity is con-
`ferred by a two-step recognition that involves annealing
`of both upstream and downstream oligonucleotides to
`the SNP site. In effect, probe hybridization provides
`specificity for the correct locus in the genome whereas
`enzymatic mismatch discrimination confers additional
`genomic specificity and is also selective for a particular
`allele. GoldenGate uses allele-specific primer extension
`whereas MIP uses single nucleotide addition (fill-in)
`
`R E V I E W S
`
`[Figure 1 schematic. Panel labels: genomic priming (cycles 1 and 2); cycle 3: raise annealing temperature, prime and amplify using tag.]
`Figure 1 | Multiplex PCR. Simultaneous amplification of
`many loci requires careful design of primers, which are kept
`at a relatively low concentration in the PCR reaction. The
`use of chimeric primers that contain both a universal
`priming sequence (blue) and locus-specific sequence
`(green) greatly reduces multiplex PCR artefacts. In the first
`few rounds of PCR, the low-concentration (~20 nM) locus-
`specific primers are used to amplify the appropriate loci.
`After a few rounds of amplification, the PCR reaction is
`switched to the use of high-concentration (~1 µM)
`universal primer either through inoculation into a universal
`PCR reaction or by switching annealing temperature to
`differentiate between locus-specific priming and universal
`priming. Modified with permission from REF. 28 © (1997)
`Oxford University Press.
`
`to score SNPs (FIG. 2). MIP achieves further specificity
`by using a circularizable probe that contains both the
`upstream and downstream query sequence. This enables
`cooperative annealing and the use of much lower oli-
`gonucleotide concentrations, adding to specificity and
`multiplexibility. In the GoldenGate assay, the upstream
`and downstream probes are separate, but they are
`hybridized to genomic DNA (gDNA) that has been
`immobilized on a solid support. This enables stringent
`washing to remove excess and incorrectly hybridized
`probes. Both assays are read out through hybridization
`of the multiplex PCR amplicons to a universal array of
`address sequences. A tag that is complementary to the
`addresses is designed into the query oligonucleotides
`in a locus-specific manner, which allows a one-to-one
`mapping between an address sequence on the array and
`the locus being scored. Both approaches successfully
`multiplex to high levels38,40.
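The one-to-one mapping between address sequences and loci can be sketched as a simple lookup table. The example below assumes that each address on the array is the reverse complement of the tag carried by the locus-specific query oligonucleotide; the locus names and tag sequences are hypothetical, and real tag sets are additionally designed to avoid cross-hybridization, which this sketch does not check.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_address_map(locus_tags: dict) -> dict:
    """Map each array address to exactly one locus, rejecting duplicate tags."""
    address_to_locus = {}
    for locus, tag in locus_tags.items():
        address = reverse_complement(tag)
        if address in address_to_locus:
            raise ValueError(f"tag for {locus} collides with {address_to_locus[address]}")
        address_to_locus[address] = locus
    return address_to_locus


# Hypothetical loci and tags, for illustration only.
locus_tags = {
    "locus_1": "ACGTTGCAAGCT",
    "locus_2": "TTGACCGTAGGA",
}
address_map = build_address_map(locus_tags)

# A signal observed at a given array address is attributed to its locus.
print(address_map[reverse_complement("TTGACCGTAGGA")])
```

Because the array content is independent of the loci being assayed, the same universal array can be reused for any SNP panel simply by reassigning tags.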
`
`Ligase chain reaction
`A cyclic amplification method
`for amplifying a target
`sequence that is similar in
`approach to PCR except that
`repeated rounds of thermally
`controlled denaturation,
`annealing and ligation of a pair
`of adjacent oligonucleotides
`are carried out.
`
`Padlock-probe amplification
`A ligation-mediated
`bimolecular assay for a target
`sequence in which the two
`query oligonucleotides (5′ and
`3′ sequences) are derived from
`the two ends of a contiguous
`oligonucleotide. Ligation of the
`two ends creates a circular
`structure that is intertwined
`with the target sequence.
`
`Primer dimer
`A parasitic product that is
`formed during PCR reactions
`and is caused by multiple
`primers interacting and
`extending upon themselves.
`Appropriate design of primer
`sequences can reduce this
`effect.
`
`Universal PCR
`A multiplex PCR reaction using
`a single universal primer or a pair of universal
`primers to amplify
`a broad range of target
`sequences that all contain
`common invariant 5′ and 3′
`tail sequences.
`
`NATURE REVIEWS | GENETICS
`
`
`
`
`[Figure 2 schematic. Recoverable labels: a | anneal, gap-fill polymerization, gap-fill ligation, exonuclease selection, probe release, amplification. b | gDNA, ASPE and ligation, universal PCR sequences 1 and 2, amplifiable template, PCR with common primers (Cy3 universal primer 1, Cy5 universal primer 2). c | RE, adaptor ligation, single-primer PCR, hybridization. d | gDNA, WGA, fragmentation, denaturation, hybridization, primer extension, bead types A and B on a bead chip.]
`
`Figure 2 | Highly parallel genotyping assays. Four approaches to creating highly parallel genotyping assays are
`shown, all of which rely on minimizing the interaction of query primers. a | Molecular-inversion probe (MIP) genotyping
`uses circularizable probes with 5′ and 3′ ends that anneal upstream and downstream of the SNP site leaving a 1 bp gap
`(genomic DNA is shown in blue). Polymerase extension with dNTPs and a non-strand-displacing polymerase is used to
`fill in the gap. Ligation seals the nick, and exonuclease I (which has 3′ exonuclease activity) is used to remove excess
`unannealed and unligated circular probes. Finally, the circularized probe is released through restriction digestion at a
`consensus sequence, and the resultant product is PCR-amplified using common primers to ‘built-in’ sites on the circular
`probe. The orientation of the primers ensures that only circularized probes will be amplified. The resultant product is
`hybridized and read out on an array of universal-capture probes (BOX 1). b | GoldenGate genotyping uses extension
`ligation between annealed locus-specific oligos (LSOs) and allele-specific oligos (ASOs). An allele-specific primer-
`extension (ASPE) step is used to preferentially extend the correctly matched ASO (at the 3′ end) up to the 5′ end of the
`LSO primer. Ligation then closes the nick. A subsequent PCR amplification step is used to amplify the appropriate
`product using common primers to ‘built-in’ universal PCR sites in the ASO and LSO sequences. As in MIP, the resultant
`products are hybridized and read out on an array of universal-capture probes (complementary to IllumiCodes).
`c | Reduced-complexity PCR representation using restriction enzyme (RE) digestion of genomic DNA (gDNA), common
`primer adaptor ligation and single-primer PCR. The single-primer PCR reaction effectively selects for restriction
`digestion products of 200–2,000 nucleotides. The reduced-complexity representation is read out on an array of locus-
`specific probes. The decrease in complexity improves the signal-to-noise ratio by increasing the partial concentration
`of any given locus and decreasing cross-hybridization. d | Whole-genome genotyping on bead arrays. gDNA is whole-
`genome amplified (WGA), fragmented, denatured and hybridized to an array of locus-specific capture probes (shown is
`an allele-specific primer extension assay using two bead types, A and B, per locus). SNPs are scored directly on the array
`surface by primer extension. The separation of the capture step from the SNP-scoring step allows efficient target
`capture and facilitates good discrimination between alleles. After extension, the array is stained and read out using
`standard immunohistochemical detection methods. b, biotin. Panel a is modified with permission from Nature
`Biotechnology REF. 37 © (2003) Macmillan Publishers Ltd.
`
`[Figure 2 labels, continued: [T/A] and [T/C] SNP alleles, IllumiCode address for universal array read-out, universal PCR sequence 3′, universal primer P3, ligase, pol, denature, staining, b (biotin).]
`
`In spite of the advantages of highly multiplexed SNP
`genotyping assays, data from over 2.4 million SNPs from
`phase II of the International HapMap Project were col-
`lected using a relatively low-multiplex long-range PCR
`that consisted of over 300,000 individual PCR reactions
`per sample41,42. About 8–9 amplicons were multiplexed
`per reaction, but the extended amplicon size covered,
`on average, about 8 SNPs for a total of ~64–72 SNPs
`per reaction. Amplified products were pooled in sets of
`~6,250 amplicons and hybridized to a collective set of
`49 SNP tiling arrays to generate the genotypes. About
`92% of the genome was amplified and hybridized to the
`array using this approach. Although an impressive effort,
`this approach does not seem cost effective for large-scale
`genotyping projects.
`
`Genotyping with genomic representations. Whole-genome
`representations are created by universal adaptor PCR of
`adaptor-ligated, restriction-enzyme-digested gDNA,
`a process that results in a reproducible fraction of the
`genome being amplified. The approach was originally
`described by Kinzler and Vogelstein in 1989 for the selec-
`tion of nucleic-acid sequences that are bound to regula-
`tory proteins43 (FIG. 2c). The fraction of the genome that is
`amplified depends on the restriction enzymes that are used
`and the PCR conditions. PCR-based, reduced-complexity
`genomic representations have been used for genomic pro-
`filing of tumour samples44–46 and cataloguing of copy-
`number variation in normal individuals47. Lucito et al.
`created both low-complexity and high-complexity rep-
`resentations using PCR of adaptor-ligated gDNA that
`had been created by digestion with restriction enzymes
`of 4–6-base recognition motifs. PCR amplification using
`universal primers provides an inherent size selection
`because representation inserts larger than ~1 kb are
`poorly amplified.
`The reduced-complexity representation approach
`has also been successfully used in genome-wide SNP
`genotyping assays. Kennedy et al. and Matsuzaki et al.
`developed whole-genome sampling analysis (WGSA)
`in which gDNA is digested with XbaI, ligated to an
`adaptor, amplified by PCR and hybridized to DNA
`genotyping arrays48,49 (FIG. 2c). The effective complex-
`ity (the amount of sequence from the human genome)
`of this representation was ~60 Mb, and provided suf-
`ficient signal to noise when hybridized to 25-mer
`oligonucleotide probe arrays to call genotype accu-
`rately. With an additional restriction enzyme and
`enhanced PCR conditions, WGSA has been scaled up
`to genotype several hundred thousand SNPs in repre-
`sentations of ~300 Mb complexity on a single array50.
`This complexity can potentially be further increased by
`using more sets of restriction enzymes and/or increas-
`ing the average duration of PCR. A drawback of this
`approach is that the ability to select SNPs is limited
`because only a portion of the genome is represented.
`Moreover, that portion is to a large extent randomly
`selected. Nonetheless, high-density arrays of random
`SNPs have been used successfully to identify loci that
`harbour disease-predisposing genetic variants through
`genome-wide association scans51,52.
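The way restriction digestion plus PCR size selection yields a reduced-complexity representation can be sketched in silico. The example below assumes the XbaI recognition site TCTAGA (cut after the first T), treats only fragments with a cut site at both ends as adaptor-ligatable, and uses the 200–2,000 nucleotide window quoted in the Figure 2 legend as the amplifiable size range. The toy sequence and function names are hypothetical, and real amplification efficiency varies continuously with length rather than following a hard cut-off.

```python
def digest(sequence: str, site: str = "TCTAGA", cut_offset: int = 1) -> list:
    """Cut a linear sequence at every occurrence of `site`, cut_offset bases into the site."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def amplifiable_fragments(fragments: list, min_len: int = 200, max_len: int = 2000) -> list:
    """Keep internal fragments (a cut at both ends) that fall inside the size window."""
    internal = fragments[1:-1]  # end fragments of a linear molecule have only one cut end
    return [f for f in internal if min_len <= len(f) <= max_len]


# Toy 'genome' with three XbaI sites; the flanking sequence is arbitrary filler.
site = "TCTAGA"
toy_genome = "A" * 50 + site + "C" * 300 + site + "G" * 3000 + site + "T" * 40

fragments = digest(toy_genome)
kept = amplifiable_fragments(fragments)
fraction = sum(len(f) for f in kept) / len(toy_genome)
print([len(f) for f in fragments], [len(f) for f in kept], round(fraction, 3))
```

Changing the enzyme (and therefore the site) or the size window changes which portion of the genome is represented, which is the basis for the point that SNP choice is constrained by the representation.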
`
`Whole-genome genotyping. A more global approach to
`genotyping can be accomplished if gDNA can be directly
`hybridized to an array of locus-specific capture probes
`and scored on the array using enzymatic allelic discrimi-
`nation, such as primer extension or ligation. This direct
`approach should allow access to most SNPs in the genome
`and eliminate the multiplexing bottleneck in sample
`preparation, making assay scalability solely dependent
`on array-feature density. Obtaining single-base resolution
`in the context of the sequence complexity and low molar
`concentration that is inherent in gDNA requires an assay
`design with high specificity and sensitivity.
`In whole-genome genotyping (WGG)53,54, specificity
`and sensitivity were achieved in a direct hybridization
`assay by using a combination of design elements (FIG. 2d).
`Sensitivity was greatly enhanced by first amplifying the
`gDNA in a whole-genome amplification (WGA)55,56
`reaction to effectively increase the molar concentration
`of genomic loci. Specificity was achieved using the com-
`bination of a stringent 50-mer hybridization capture step
`followed by an ‘on-array’ polymerase primer-extension
`step. This combination of elements conferred both locus
`and allelic specificity on the assay. Finally, an array-
`based signal-amplification protocol further increased
`assay sensitivity. Two advantages of WGG are minimal
`constraints on SNP selection, which allows selection of
`maximally informative SNPs such as HapMap tag SNPs,
`and effectively unlimited multiplexing from a single
`sample preparation. The WGG assay has been used to
`develop several high-density SNP-genotyping arrays
`(BOX 1), including two different tag-SNP arrays that allow
`genotyping of over 317,000 and 550,000 tag SNPs on a
`single slide57.
`
`High-resolution SNP-CGH. Array technology can
`also be used to characterize copy-number aberrations.
`Comparative genomic hybridization (CGH) to DNA
`arrays is a powerful approach for detecting chromosomal
`aberrations. In CGH, differentially labelled gDNAs (one
`reference and one subject sample) are co-hybridized to
`an array of probes. Initially, CGH used metaphase chro-
`mosome spreads as the chromosomal yardstick, which
`limited the resolution to 10–20 Mb58. More recently array
`CGH was developed using DNA arrays of cDNA or BAC
`clones, which brought the resolution down to 100 kb59–61.
`High-density oligonucleotide arrays, some with as many
`as 385,000 probes per slide, have also been used for
`CGH62–64. Oligonucleotide probes (~25–85 nucleotides)
`are much shorter than BAC probes (~100 kb)
`and therefore their hybridization can be variable and
`sequence-dependent65. However, oligonucleotide arrays
`are easier to manufacture and the short length of oligo-
`nucleotides provides better spatial resolution and locus
`discrimination. Furthermore, any drawbacks with sen-
`sitivity and precision of oligonucleotide probes can be
`minimized by averaging the signal from several probes.
`Oligonucleotide-based SNP-genotyping arrays
`have recently been co-opted for CGH applications to
`measure both physical copy-number aberrations and
`genetic aberrations such as LOH (FIG. 3); we refer to this
`application as SNP-CGH66–70. The ability of SNP-CGH
`
`Whole-genome
`representation
`A representation with a
`sequence complexity that is
`similar to that of the entire
`genome from which it was
`derived.
`
`Reduced-complexity
`genomic representation
`A representation with a
`sequence complexity that is a
`fraction of the original sample
`nucleic-acid complexity. In its
`simplest version, PCR of
`adaptor-ligated or restriction-
`enzyme-digested genomic
`DNA intrinsically generates a
`reduced-complexity
`representation.
`
`DNA-array feature
`An individual resolvable
`element of a DNA array that
`contains a defined sequence.
`This element can be created in
`several ways such as spotting,
`in situ synthesis or deposition
`of beads that harbour
`immobilized DNA sequences.
`
`Tag SNP and tagging SNP
`A tag SNP is defined as a SNP
`that proxies for a set of SNPs in
`linkage disequilibrium with
`itself (that is, they are in the
`same linkage disequilibrium
`bin). A haplotype tagging SNP,
`by contrast, is based on the
`haplotype block concept, in
`which a set of tagging SNPs are
`used to uniquely define the
`variation of all SNPs that reside
`in the haplotype block.
`
`
`arrays, unlike conventional CGH, to detect copy-neutral
`genetic anomalies such as uniparental disomy (UPD) and
`mitotic recombination is important in understanding
`the aetiology of cancer (for example, see REFS 71,72). The
`second advantage of SNP-CGH arrays is their ability to
`collect allelic information on deletions, duplications and
`amplifications. One recent paper that used SNP-CGH
`describes how most observed amplifications in lung
`cancer arise as a result of monoallelic amplification73. It
`will be informative to see whether particular haplotypes
`are associated with increased incidence of monoallelic
`amplifications, LOH or deletions.
`
`A third advantage of SNP-CGH arrays is their ease of
`manufacture and their intrinsic ability to scale to higher
`feature densities with improvements in array manu-
`facture. This increase in feature density is important
`for oligonucleotide arrays because, as described above,
`oligonucleotide probes generally have intrinsically
`higher noise, which necessitates averaging across 5–10
`probes. The current SNP-CGH array densities of more
`than 500,000 SNPs per slide allow an effective resolu-
`tion of less than 50 kb. In the future, arrays with higher
`density will further improve this resolution. In summary,
`the ability of SNP-CGH arrays to make high-resolution
`
`Box 1 | Primer-on-DNA array technologies
`
`The two basic types of array that are used in genomic analysis are ordered arrays and random arrays. Ordered arrays are
`created by spotting or synthesizing known feature elements in a defined pattern on a planar surface. There are several
`methods for creating such ordered arrays including deposition of oligonucleotides with pins or an ink jet printer (see figure,
`part a, left). Alternatively, in situ oligonucleotide synthesis can be used to generate arrays of defined features by local
`delivery of oligonucleotide synthesis reagents or by local deprotection chemistry using photolithography or
`electrochemical-based deprotection (see figure, part a, right).
`Random arrays are created by self-assembly of bead-based feature elements. In this approach, oligonucleotides are
`individually immobilized on beads, pooled and assembled onto a patterned planar substrate (see figure, part b).
`The identities of the assembled beads are subsequently determined by a hybridization-based stepwise decoding scheme156
`that uses sets of combinatorially labelled complements to the bead sequences.
`Both random and ordered arrays can carry either universal or locus-specific probes. A universal-capture probe (see
`figure, part c) binds to its complementary (address) sequence that is present in the products of the genomic assay.
`This address sequence creates a one-to-one correspondence between a locus and a particular feature on the array.
`A locus-specific probe (see figure, part d) is used in direct hybridization assays such as gene-expression assays —
`cDNA — or whole-genome genotyping — genomic DNA (gDNA).
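The stepwise, combinatorial decoding of random arrays mentioned above lends itself to a short counting sketch. If each hybridization stage can place a bead in one of k distinguishable states, then s stages distinguish up to k^s bead types. The number of states per stage and the code assignment below are assumptions for illustration and are not details of the cited decoding scheme.

```python
from itertools import product


def decoding_stages(n_bead_types: int, n_states: int) -> int:
    """Smallest number of stages s such that n_states ** s >= n_bead_types."""
    if n_bead_types < 1 or n_states < 2:
        raise ValueError("need at least one bead type and two states per stage")
    stages, capacity = 0, 1
    while capacity < n_bead_types:
        capacity *= n_states
        stages += 1
    return stages


def assign_codes(bead_types: list, n_states: int) -> dict:
    """Give every bead type a unique sequence of per-stage states."""
    stages = decoding_stages(len(bead_types), n_states)
    codes = product(range(n_states), repeat=stages)
    return dict(zip(bead_types, codes))


# Hypothetical example: 1,500 bead types and 3 states per stage.
print(decoding_stages(1500, 3))
print(assign_codes(["bead_1", "bead_2", "bead_3", "bead_4", "bead_5"], 2))
```

The logarithmic growth in the number of stages is what makes decoding practical even when the number of bead types is large.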
`[Box 1 figure. Recoverable labels: a | ordered arrays (probes up to 70 nt). b | random array. c | universal-capture probe (15–25 nt) bound to a chimeric assay product. d | locus-specific probe (26–70 nt) bound to gDNA or cDNA.]
`
`Uniparental disomy
`This rare genetic condition can
`arise constitutionally through
`non-disjunction during meiosis
`that ultimately leads to a
`duplication of a segment or of
`the entire maternal or paternal
`chromosome in the affected
`individual. A form of apparent
`uniparental disomy can arise in
`the course of normal cell
`division (mitosis) through
`mitotic recombination (a rare
`crossover event during mitosis).
`
`
`measurements of both copy number and LOH will prob-
`ably lead to the replacement of conventional array CGH
`as a standard for measuring genome-wide chromosomal
`aberrations.
`One increasingly important application of genomic
`profiling is to formalin-fixed, paraffin-embedded
`(FFPE) samples. Large archives of annotated FFPE sam-
`ples exist across the world. The DNA from FFPE samples
`shows varying levels of degradation that depend on the
`method of fixation and extraction, and the sample age.
`Analysis of such degraded samples has been challeng-
`ing, especially when amplification is required. Recently,
`several groups have used CGH or SNP-CGH arrays to
`analyse FFPE samples74,75. Extension of this approach
`to platforms that use WGA would be beneficial and
`probably require analysis of paired samples in which
`both the reference and patient sample are amplified in
`the same manner76.
`
`Allele-specific expression (ASE). SNP genotyping can
`also be used to genotype cDNA samples and measure
`allelic transcript abundance or allele-specific expression.
`Several groups have used quantitative SNP genotyping
`to measure allele-specific expression in cDNA77–81. The
`allelic ratio of heterozygous SNPs within a transcript
`provides a measure of the relative expression levels of
`
`the paternal and maternal alleles. An allelic ratio of
`0.5 (1:1) indicates equal expression of both alleles. By
`contrast, if the gene is imprinted and only one allele is
`expressed, the allelic ratio will be 0 or 1.
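The allelic ratio described above can be written as a one-line calculation. The sketch below assumes the ratio is the signal from one allele divided by the summed signal from both alleles, and that signals are background-corrected and proportional to transcript abundance; the function names and signal values are hypothetical.

```python
def allelic_ratio(signal_a: float, signal_b: float) -> float:
    """Fraction of the total allelic signal contributed by allele A."""
    total = signal_a + signal_b
    if total <= 0:
        raise ValueError("no signal from either allele")
    return signal_a / total


def fold_difference(signal_a: float, signal_b: float) -> float:
    """Abundance of the more highly expressed allele relative to the other."""
    low, high = sorted((signal_a, signal_b))
    return float("inf") if low == 0 else high / low


print(allelic_ratio(100.0, 100.0))   # both alleles equally expressed
print(allelic_ratio(0.0, 250.0))     # only allele B expressed
print(fold_difference(150.0, 100.0)) # allele A 1.5-fold more abundant
```

Because both alleles are measured in the same sample and at the same feature set, sample-to-sample differences in total signal cancel in the ratio, which is the sense in which the measurement is self-normalized.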
`In principle, genotyping of cDNA provides a more
`precise measure of ASE than the use of classical gene-
`expression profiling as ‘self-normalized’ allelic ratios
`rather than absolute intensities are measured. Array-
`based genotyping technologies are just starting to be
`applied to quantitative allelic-ratio measurements
`to assess allelic expression82–84. On average, ~20% of
`expressed genes show > 1.5-fold differences in allelic
`abundance for any given individual, and > 50% of the
`genes show ASE across a population85. The origin of
`this heterogeneity in transcript abundance is ripe for
`investigation. Many factors contribute to differences in
`allelic abundance, including polymorphisms that affect
`transcription, RNA processing, nuclear export and mes-
`sage stability86, and the epigenetic state of an allele. As
`such, ASE can be used as a discovery tool for identify-
`ing regulatory SNPs or haplotypes, and it can be used
`indirectly to assess the epigenetic state of an allele or pro-
`moter region. In this regard, the ability to scan the whole
`genome for ASE would be particularly useful, and given
`the technological advances in array-based genotyping
`such arrays might soon be available.
`
`[Figure 3 plots. Panels: duplication in HL-60 (chromosome 18) and deletion in HL-60 (chromosome 14). Vertical axes: log2 intensity ratio (–2.00 to 2.00) and allele frequency (0.00 to 1.00, with genotype clusters A/A, A/B and B/B).]
`
`Figure 3 | High-resolution genomic profiling on SNP-CGH arrays. SNP-CGH (comparative genomic hybridization)
`arrays collect both intensity and allelic information, allowing two different genomic profiles (plot of SNP parameter as a
`function of genomic location) to be generated. The log2 intensity (LI) ratio measures the intensity of a sample relative to a
`reference intensity and the allele frequency (AF) measures the allele frequency at a particular SNP along the genome. Loss
`of heterozygosity can easily be observed using the AF plot by noting an absence of heterozygotes (A/B). Genomic profiles
`of LI and AF are shown for genomic DNA from the HL-60 leukaemia cell line using a high-density SNP-CGH array (109,000
`loci). The heterozygous deletion on chromosome 14 and the duplication of chromosome 18 can both be seen in the LI and
`AF profiles. The deletion appears as loss of heterozygotes in the AF plot, and the duplication splits the heterozygous
`cluster in the AF plot (the two clusters correspond to 2:1 and 1:2 allele ratios). The dotted line in the LI plot represents
`a 300 kb moving average.
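The two profiles in this figure can be illustrated with a minimal calculation. The sketch below assumes idealized, noise-free signals that scale linearly with copy number, defines the allele frequency as the B-allele share of the total signal, and implements the moving average as a simple mean over SNPs within a 300 kb window centred on each SNP. All function names and numbers are hypothetical, and real arrays require normalization that is not shown here.

```python
import math


def log2_intensity_ratio(sample_intensity: float, reference_intensity: float) -> float:
    """Log2 ratio of sample intensity to reference intensity at one SNP."""
    return math.log2(sample_intensity / reference_intensity)


def allele_frequency(signal_a: float, signal_b: float) -> float:
    """Share of signal from allele B (0 for A/A, 0.5 for A/B, 1 for B/B)."""
    return signal_b / (signal_a + signal_b)


def moving_average(positions: list, values: list, window_bp: int = 300_000) -> list:
    """Mean of values at SNPs lying within window_bp / 2 of each SNP position."""
    half = window_bp / 2
    smoothed = []
    for p in positions:
        in_window = [v for q, v in zip(positions, values) if abs(q - p) <= half]
        smoothed.append(sum(in_window) / len(in_window))
    return smoothed


# Idealized copy-number states relative to a two-copy reference.
print(log2_intensity_ratio(1, 2))   # heterozygous deletion (one copy)
print(log2_intensity_ratio(3, 2))   # duplication (three copies)
print(allele_frequency(1, 2), allele_frequency(2, 1))  # 1:2 and 2:1 allele ratios
print(moving_average([0, 100_000, 1_000_000], [1.0, 3.0, 10.0]))
```

Under these assumptions a duplicated heterozygous SNP falls at about one third or two thirds on the allele-frequency axis, which corresponds to the split heterozygous cluster described in the legend.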
`
`
`Polony
`Contraction of ‘polymerase
`colony’ that is created by
`growing DNA colonies from
`single DNA ‘seed’ molecules
`through the use of a PCR
`reaction on DNA molecules
`that are diffusely imbedded in
`a polymer matrix that contains
`DNA polymerase, primers and
`appropriate reagents.
`
`BEAMing
`A process of cloning on beads
`in which a library of clones is
`grown on beads through the
`use of compartmentalized
`emulsion PCR. DNA and beads
`are diluted such that, on
`average, only a single bead and
`a single target molecule co-
`occupy a single compartment.
`PCR amplification grows a
`clonal population of molecules
`on the bead starting from the
`single target sequence.
`
`Massively parallel signature
`sequencing
`This enables digital transcript
`counting in a cDNA sample. It
`is accomplished by cloning a
`17–20 base signature
`sequence tag onto micro-
`beads that are subsequently
`fixed in a single layer array in a
`flow cell. The sequence on the
`bead is then read out using a
`ligation-based cycle-
`sequencing assay.
`
`Hig