@ © 1994 Nature Publishing Group http://www.nature.com/naturebiotechnology
REVIEW

Oligonucleotide Arrays:
New Concepts and Possibilities
Alexander B. Chetverin and Fred Russell Kramer¹

Institute of Protein Research, Russian Academy of Sciences, 142292 Pushchino, Moscow Region, Russia (e-mail: chetverin@vax.ipr.serpukhov.su).
¹Department of Molecular Genetics, Public Health Research Institute, 455 First Ave, New York, NY 10016 (e-mail: kramer@phri.nyu.edu).
`
`Advances in solid-phase oligonucleotide synthesis and hybridization techniques have led to an incipient
`technology based on the use of oligonucleotide arrays. The inclusion of a large number of oligonucleotide
`probes within a single array greatly reduces the cost of their synthesis and allows thousands of hybridiza-
`tions to be carried out simultaneously. The range of potential applications of oligonucleotide arrays was
`expanded by the realization that nucleic acids can be sequenced by hybridizing them to all possible
`oligonucleotides of a given length. Additional possibilities are offered by novel types of oligonucleotide
`arrays that are capable of parallel sorting, isolating, and manipulating thousands, and even millions, of
`nucleic acid species. Fields, such as site-directed mutagenesis, protein engineering, and recombinant DNA
`technology, would benefit from using these arrays. Further, these approaches could enable the analysis of
`entire genomes by preparing ordered fragment libraries, and by sequencing complex pools of nucleic
`acids, in a novel approach that provides long-range sequence information by generating nested nucleic
`acids and then surveying the oligonucleotides contained in the nested strands. This would allow large
`diploid genomes to be sequenced directly in a completely automated procedure that does not require
`fragment cloning or chromosome mapping.
`
This paper outlines the prospects of an emerging technology based on the use of oligonucleotide arrays. The main components of this technology are solid-phase oligonucleotide synthesis and nucleic acid hybridization.
Hybridization is a hydrogen-bonding interaction between two nucleic acid strands that obey the Watson-Crick complementarity rules. All other base pairs are mismatches that destabilize hybrids. Since a single mismatch decreases the melting temperature (Tm) of a hybrid by up to 10°C¹, conditions can be found in which only perfect hybrids survive. Hybridization comprises contacting the strands, one of which is usually immobilized on a solid support while the other usually bears a radioactive or fluorescent label, and then separating the resulting hybrids from the unreacted labeled strands by washing the support. Hybrids are recognized by detecting the label bound to the surface of the support.
Oligonucleotide hybridization is widely used to determine the presence in a nucleic acid of a sequence that is complementary to the oligonucleotide probe. In many cases, this provides a simple, fast, and inexpensive alternative to conventional sequencing methods. Hybridization does not require nucleic acid cloning and purification, carrying out base-specific reactions, or tedious electrophoretic separations. Hybridization of oligonucleotide probes has been successfully used for various purposes, such as the analysis of genetic polymorphisms, diagnosis of genetic diseases⁵, cancer diagnostics, detection of viral and microbial pathogens, screening of clones⁹, genome mapping¹⁰, and the ordering of fragment libraries. Hybridization is often used in combination with ligation of hybridized probes by a DNA ligase¹² or their extension by a DNA polymerase, which increases the sensitivity and signal-to-noise ratio, mainly by overcoming the mismatches at hybrid termini, which are the most difficult to discriminate against.
Informationally, the difference between conventional sequencing methods and oligonucleotide hybridization is analogous to the difference between reading a text by letters and reading it by words. The latter is faster, but requires knowledge of all the words. That is why the nucleic acids that are currently analyzed by hybridization are those whose sequence is known and whose presence in a sample is expected. The analysis of unknown sequences or unknown sequence variants requires that they be hybridized to all possible oligonucleotides, whose number N is an exponential function of their length n: N = 4ⁿ (for example, N = 65,536 when n = 8, and N = 1,048,576 when n = 10). Such large-scale hybridizations would not be feasible if each oligonucleotide had to be synthesized and hybridized individually. However, this approach is now a real possibility because of the invention of oligonucleotide arrays.
An oligonucleotide array comprises a number of individual oligonucleotide species tethered to the surface of a solid support in a regular pattern, each one in a different area, so that the location of each oligonucleotide is known. Oligonucleotide arrays can be prepared by synthesizing all the oligonucleotides, in parallel, directly on the support, employing the methods of solid-phase chemical synthesis in combination with site-directing masks. Such masks direct a particular nucleotide monomer (A, T, G or C) to react with a predetermined exposed area on the surface of the support. Four masks with non-overlapping windows and four coupling reactions are required to increase the length of the tethered oligonucleotides by one. In each subsequent round of synthesis a different set of four masks is used, and this determines the unique sequence of the oligonucleotides synthesized in each particular area. The total number of coupling reactions needed to synthesize all possible n-mers is 4 × n. Thus, all possible octamers can be synthesized on an array in only 32 reactions, whereas as many as 524,288 reactions (8 × 4⁸)
are needed to synthesize them individually. Chemistries have been developed¹⁷ so that the growing end of the oligonucleotides can be either the 5′ or the 3′ end. An efficient photolithographic technique has been invented¹⁸ for manufacturing miniature arrays containing as many as 10⁵ individual oligonucleotide areas per cm², and there is no fundamental problem in increasing the density up to the 10¹⁰ areas per cm² that is now achievable in semiconductor fabrication.
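The counting argument above (N = 4ⁿ probes, 4 × n coupling reactions on a masked array, n × 4ⁿ couplings when each probe is made separately) can be checked with a few lines of Python. This is only an illustrative sketch; the function names are invented here and do not come from the paper.

```python
def probe_count(n: int) -> int:
    """Number of distinct n-mers over the four bases A, T, G, C (N = 4^n)."""
    return 4 ** n


def couplings_on_array(n: int) -> int:
    """Masked parallel synthesis: four coupling reactions per added position."""
    return 4 * n


def couplings_individually(n: int) -> int:
    """Separate synthesis: n couplings for each of the 4^n oligonucleotides."""
    return n * 4 ** n


if __name__ == "__main__":
    for n in (8, 10):
        print(n, probe_count(n), couplings_on_array(n), couplings_individually(n))
```

For octamers the three functions return 65,536 probes, 32 array couplings, and 524,288 individual couplings, matching the figures quoted in the text.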
`
Thus, a miniature array can contain a large number of oligonucleotide probes, and all of them can be simultaneously hybridized to a nucleic acid sample in one experiment, thereby greatly reducing the time required for analysis and eliminating the need for the costly synthesis of individual oligonucleotides.
`
`BIO/TECHNOLOGY VOL. 12 NOVEMBER 1994 1093
`
`Ariosa Exhibit 1028, pg. 1
IPR2013-00276
`
Of course, different probes contained in the same array would produce hybrids with different stabilities because of their different base composition. This problem can be overcome in a number of ways, including carrying out hybridization in the presence of tetraalkylammonium salts that largely eliminate the difference in the stability of G:C and A:T base pairs, washing the array at steadily increasing temperature and collecting the signal from each area at a predetermined time when the temperature reaches the Tm value of that particular hybrid, and adjusting the surface concentration of the immobilized probes so that the rates of hybrid dissociation are independent of base composition.
An array can contain a chosen collection of oligonucleotides, e.g., probes specific for all known clinically important pathogens or specific for all known sequence markers of genetic diseases¹⁵,²⁴. Such an array can satisfy the needs of a diagnostic laboratory. Alternatively, an array can contain all possible oligonucleotides²⁵ of a given length n. Hybridization of a nucleic acid with such a "comprehensive" array results in a list of all its constituent n-mers, which can be used for unambiguous gene identification (e.g., in forensic studies), for determination of unknown gene variants and mutations (including the sequencing of related genomes once the sequence of one of them is known), for finding overlapping clones, and for checking sequences determined by conventional methods. Finally, surveying the n-mers by hybridization to a comprehensive array can provide sufficient information to determine the sequence of a totally unknown nucleic acid, as discussed below.

FIGURE 1. Principle of sequencing by hybridization. (A) Reconstruction of a nucleic acid sequence from the list of its constituent n-mers identified by its hybridization to a comprehensive set of oligonucleotide probes. (B) If repeated (n-1)-mers are present in a sequence, it cannot be reconstructed unambiguously. Assembly of the n-mers results in multiple subfragments that can be permuted and/or repeated in different ways. In these examples, n = 3 in order to simplify the illustration. Crosshatches and boxes indicate repeated (n-1)-mers.
[Panel A: constituent n-mers ACG CGG GGT GTT TTA TAT ATC TCC, aligned at their (n-1)-long subsequences to give the sequence ACGGTTATCC. Panel B: constituent n-mers GCA CAT ATG TGC GCC CCA CAG AGT; some variants of sequence reconstruction: GCATGCCAGT, GCCATGCAGT, GCATGCCATGCCAGT, GCCATGCATGCAGT, CATGCATGCCAGT, CATGCCATGCAGT.]

FIGURE 2. An example of sorting nucleic acid strands on a preparative array. (A) Structure of the oligonucleotides immobilized in a well of the preparative array. (B) Principle of sorting the strands from a restriction digest of human DNA by virtue of the n-mer (n = 8) located immediately upstream from the endonuclease recognition sequence.
[Panel A: a well of a preparative array. Panel B step labels legible in the source: Digesting DNA with PstI; Extending the 5′ termini; Melting and hybridizing to the preparative array.]
`
Sequencing by Hybridization
Surveying the n-mers in a nucleic acid is analogous to listing the words contained in a text. This would not make much sense unless we know how the words are connected. Fortunately, unlike common words, the n-mers in a nucleic acid strand overlap one another so that each non-terminal n-mer includes the last n-1 nucleotides of the preceding n-mer and the first n-1 nucleotides of the next n-mer. This allows the surveyed n-mers to be assembled by overlapping them at their (n-1)-long subsequences [(n-1)-mers] and, thus, to reconstruct the sequence of the analyzed nucleic acid (Fig. 1A). This strategy for sequence determination has independently been proposed by groups in Great Britain¹⁵,²⁶, Yugoslavia²⁷, and Russia²⁸ and is called "sequencing by hybridization" (SBH). SBH can surpass conventional sequencing procedures in a number of parameters, including speed, cost, quality of the results, and ease of automation. Test sequencing of ≈100 nucleotide-long DNA strands by hybridization with octamers has demonstrated that the method is feasible and is tolerant of occasional hybridization errors. It has also been shown that costs can be reduced further by employing combinatorial arrays in which the individual areas contain groups of selected n-mers. This saves array space by an order of magnitude, with only a slight loss in resolution.
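The assembly step described above, in which n-mers are chained through their (n-1)-long overlaps, can be sketched in Python as follows. This is a minimal illustration that assumes an error-free hybridization readout; the function names `spectrum` and `reconstruct` are invented for this sketch, and the test sequence is the one implied by the n-mer list of Figure 1A (n = 3).

```python
from collections import defaultdict


def spectrum(seq, n):
    """Set of n-mers that an error-free comprehensive array would report."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def reconstruct(nmers):
    """Chain n-mers by their (n-1)-long overlaps; return None if not unique."""
    nmers = set(nmers)
    by_prefix = defaultdict(list)
    suffixes = set()
    for word in nmers:
        by_prefix[word[:-1]].append(word)
        suffixes.add(word[1:])
    # A repeated (n-1)-mer appears as a prefix shared by two different n-mers.
    if any(len(group) > 1 for group in by_prefix.values()):
        return None
    # The first n-mer is the only one whose prefix is nobody's suffix.
    starts = [word for word in nmers if word[:-1] not in suffixes]
    if len(starts) != 1:
        return None
    current = starts[0]
    seq, used = current, {current}
    while current[1:] in by_prefix:
        current = by_prefix[current[1:]][0]
        if current in used:  # a cycle means the chain is not a simple path
            return None
        used.add(current)
        seq += current[-1]
    return seq if len(used) == len(nmers) else None


if __name__ == "__main__":
    print(reconstruct(spectrum("ACGGTTATCC", 3)))
    print(reconstruct(spectrum("GCATGCCAGT", 3)))
```

The first call rebuilds the strand because every dinucleotide overlap is unique. The second returns `None`: the n-mer list of Figure 1B contains two n-mers starting with GC and two starting with CA, so the chain branches.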
However, there is an inherent flaw in the SBH method that undermines its advantages. SBH relies exclusively on short-range information provided by the sequences of the surveyed n-mers, and success in assembling the n-mers is absolutely dependent on whether or not their (n-1)-long overlaps are unique. Put another way, success is dependent on whether or not there are repeated (n-1)-mers in the nucleic acid being analyzed. Strand reconstruction terminates at non-unique (n-1)-mers and the resulting subfragments can be permuted and/or repeated in many different ways without conflicting with the hybridization data (Fig. 1B). When n = 8, only 94%, 32% and 0.9% of random sequences of 50, 200 and 400 nucleotides in length, respectively, can be reconstructed unambiguously. (The remaining sequences contain repeated heptamers.) The situation is even worse with natural nucleic acids, since they usually contain more repeats than do random sequences. Utilizing longer probes would reduce the ambiguities, but would result in an exponential increase in cost. For example, if n is increased from 8 to 12, then the length of random strands that can be sequenced with 95% success increases from 47 to 666 bases (14-fold), whereas the number of probes required increases 256-fold. Computer simulations demonstrate that the resolvable strand length can be increased by a factor of 4 if additional information
`
is known, such as the sequences at the strand termini, the approximate strand length, or the copy number of each n-mer in the strand. However, as the intensity of the hybridization signal is influenced by a number of factors, such as the base composition, sequence context, and strand secondary structure, measurements of the n-mer copy number are fraught with difficulties²⁵. Furthermore, if it were necessary to estimate the strand length by gel electrophoresis, the advantage of SBH over conventional sequencing methods would be greatly diminished.
[Figure 3 panel labels: First-type fragments of a DNA (sequenced); Linked intersite segment signatures (occurring together in one fragment): Cc-gC, Ca-tA, Ac-tG, Ta-gA; Second-type fragments (sorted on a preparative array); Linked signatures (found together in two wells): gC-Ca, tA-Ac, tG-Ta; Ordering the fragments.]
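The success rates quoted for random sequences can be approximated with a birthday-problem estimate: a strand of length L contains L - n + 2 (n-1)-mers, drawn from 4ⁿ⁻¹ possibilities, and reconstruction is taken to be unambiguous when none of them repeats. Treating the (n-1)-mers as independent draws is an assumption of this sketch; the text does not say how its figures were computed, and the function names are illustrative.

```python
import math


def p_no_repeat(length, n):
    """Approximate probability that a random strand has no repeated (n-1)-mer."""
    m = length - n + 2   # number of (n-1)-mers in the strand
    k = 4 ** (n - 1)     # number of possible (n-1)-mers
    return math.exp(-m * (m - 1) / (2 * k))


def max_length(n, success=0.95):
    """Longest strand whose estimated success rate is still >= `success`."""
    length = n
    while p_no_repeat(length + 1, n) >= success:
        length += 1
    return length


if __name__ == "__main__":
    for length in (50, 200, 400):
        print(length, p_no_repeat(length, 8))
    print(max_length(8), max_length(12))
```

Under this approximation the estimates for n = 8 come out close to the 94%, 32% and 0.9% quoted above, and the 95% limits are 47 bases for n = 8 and 666 bases for n = 12.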
Several methods have been proposed to increase the readable strand length without increasing the array size: additional strand hybridizations with longer oligonucleotides that extend over putative subfragment junctions; the hybridization of additional oligonucleotides that stack with the hybrids formed at ambiguously positioned n-mers, in order to increase the effective hybrid length; and the analysis of multiple clones of densely overlapping random subfragments that have different sets of repeated (n-1)-mers²⁷. However, all these modifications are cumbersome and time-consuming, and deprive SBH of its inherent beauty: the ability to provide an instant result and to be easily automated. They also do not overcome its main weakness, the inability to obtain long-range sequence information. As will be seen below, this problem can be solved by utilizing novel oligonucleotide arrays and a novel strategy that combines the power of SBH with the advantages of classical sequencing methods.
`
`Sorting Nucleic Acids
In the applications discussed above, oligonucleotide arrays are exclusively used as an analytical tool. Recently, we have proposed the use of oligonucleotide arrays for preparative purposes. A preparative array is larger than an analytical array, and its individual oligonucleotide areas are physically separated from one another in the same manner as are the wells in a microtiter plate. The most obvious application for these preparative arrays is the sorting of nucleic acids by the identity of their constituent oligonucleotides.
One well of such an array is shown in Figure 2A. In this example, the oligonucleotides are tethered to the surface of the well by their 5′ ends, and are "binary", in the sense that they consist of two sequence segments. The 3′-terminal segment (of length n) is variable (i.e., its sequence is different in each of the 4ⁿ wells in the array), whereas the 5′-terminal segment is constant (i.e., its sequence is the same in every well in the array). The constant segments are longer than the variable segments, and are pre-hybridized to complementary masking oligonucleotides. As shown in Figure 2B, such an array is capable of sorting nucleic acids by their 3′-termini.
For example, when human DNA (≈3 × 10⁹ base pairs) is digested with restriction endonuclease PstI the result is ≈1 million double-stranded fragments of ≈3,000 base pairs mean length. These fragments are modified by ligating an oligonucleotide adapter to their 5′ ends in order to restore the restriction recognition sequence and to generate an additional 5′-terminal extension. The fragments are then denatured to release single strands (whose number is twice that of the fragments), and the mixture is hybridized to the entire array. Because of the large size of the preparative array, a temperature gradient can be applied across its surface, so that the temperature in each well is close to the hybrid Tm value. After washing away unbound strands, the array is incubated with a DNA ligase in order to join the 3′ ends of the hybridized strands to the masking oligonucleotides. This restores the restriction recognition sequence at the 3′ end, and generates an additional 3′ extension.
The array is then washed at a higher temperature to remove all non-ligated strands. At this step, strands hybridized to the immobilized oligonucleotides at any other site than the 3′ terminus are not ligated (Fig. 2B, inset) and are therefore washed away. Thus, each strand species from the PstI digest occupies a single well in the array, whose immobilized oligonucleotides contain a variable segment that is complementary to the n-mer located in that strand immediately upstream from the PstI recognition sequence. Since every possible n-mer occurs among the variable segments in a comprehensive array, no strand species is lost. Consequently, this sorting procedure results in a complete library of human genome fragments. If n = 8, the strands will be distributed among 65,536 wells, with a mean of 30 strand species in each well (the expected extremes are 10 and 60 species). With 1 mm × 1 mm wells, the size of the array would be approximately 1 square foot.

FIGURE 3. Principle of ordering sequenced restriction fragments by sorting the strands from an alternate restriction digest on a preparative array, amplifying the strands to produce both direct and complementary copies, and then surveying the restriction site-tagged n-mers in each well of the array. In the diagram, the first-type restriction sites are shown as black rectangles, and the second-type restriction sites as cross-hatched rectangles. The signature of an intersite segment consists of a combination of two n-mers, one being tagged to the first-type restriction site (upper-case letters) and the other being tagged to the second-type restriction site (lower-case letters). In this example n = 1 for simplicity. Linkages between the sequenced fragments are identified by noting which pairs of intersite segment signatures are found together in more than one well.
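The sorting rule for the PstI digest can be illustrated in Python: each strand is assigned to the well addressed by the n-mer lying immediately upstream of its restored 3′-terminal recognition site. In this sketch strands are written 5′ to 3′, wells are indexed by the strand's own n-mer rather than by the complementary immobilized sequence, and `CTGCAG` is PstI's recognition sequence (a known property of the enzyme, not stated in the text). The function names are hypothetical.

```python
from collections import defaultdict

PSTI_SITE = "CTGCAG"  # PstI recognition sequence (not given in the text)


def well_address(strand, n=8, site=PSTI_SITE):
    """n-mer immediately upstream of the 3'-terminal site, or None if absent."""
    if not strand.endswith(site) or len(strand) < len(site) + n:
        return None
    return strand[-(len(site) + n):-len(site)]


def sort_strands(strands, n=8):
    """Group strands by well address; strands without a 3'-terminal site are dropped."""
    wells = defaultdict(list)
    for strand in strands:
        address = well_address(strand, n)
        if address is not None:
            wells[address].append(strand)
    return dict(wells)


# About 1 million fragments give 2 million single strands spread over 4^8 wells.
mean_species_per_well = 2 * 1_000_000 / 4 ** 8

if __name__ == "__main__":
    print(well_address("AAAACCCCGGGG" + PSTI_SITE))
    print(round(mean_species_per_well, 1))
```

The last expression reproduces the mean of roughly 30 strand species per well quoted above.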
One may wonder whether, given the complexity of a human genome digest, conditions can be found that prevent the hybridization of strands in wrong wells. The answer is yes. Experiments with whole-genome hybridization demonstrate the ability of allele-specific oligonucleotides to discriminate against single-base mismatches. Furthermore, the specificity of hybridization can be increased by orders of magnitude by employing a reversible hybridization procedure. In a preparative array, the hybridized strands can be released and rehybridized without intermixing the contents of different wells. The unbound strands can then be washed away, and the entire process of release and rehybridization can be repeated. In each cycle the only strands that are available for hybridization are those that were hybridized in the previous cycle. Thus, the ratio of perfect hybrids to mismatched hybrids will increase exponentially as the number of cycles increases. The reversible hybridization procedure can be carried out both before and after ligation of the sorted strands to the masking oligonucleotides.
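The exponential improvement from repeated release and rehybridization can be made concrete with a small calculation. The retention fractions used below are purely hypothetical placeholders chosen for illustration; the text gives no numerical values, and the function name is invented for this sketch.

```python
def hybrid_ratio(initial_ratio, keep_perfect, keep_mismatch, cycles):
    """Ratio of perfect to mismatched hybrids after repeated rehybridization.

    keep_perfect and keep_mismatch are the (assumed) fractions of perfect and
    mismatched hybrids that re-form in each cycle.
    """
    return initial_ratio * (keep_perfect / keep_mismatch) ** cycles


if __name__ == "__main__":
    # Hypothetical values: 90% of perfect and 9% of mismatched hybrids survive.
    for cycles in range(4):
        print(cycles, hybrid_ratio(1.0, 0.9, 0.09, cycles))
```

With these assumed values each cycle improves the ratio tenfold, so three cycles give about a thousandfold enrichment.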
FIGURE 4. Two strategies for generating nested strands on preparative arrays. The first strategy (A) employs a limited random fragmentation of the parental strands, and subsequent sorting of the fragments by their 3′-terminal n-mers (n = 8). The second strategy (B) involves sorting the parental strands by their internal n-mers, and subsequent synthesis of complementary shortened copies, utilizing the immobilized oligonucleotides as primers.
[Panel step labels legible in the source: (A) Random strand fragmentation; Sorting the fragments by their 3′-terminal n-mers; Amplification of the immobilized copies by asymmetric PCR. (B) Sorting the strands by their internal n-mers; Synthesis of immobilized templates; Transcription of the immobilized templates.]

Of course, the final amount of each strand will be very low. For further use, the sorted strands should be amplified in situ, utilizing a polymerase chain reaction (PCR)³⁷, which can be initiated by as few as 100 molecules of a template. First, a complementary copy of each strand is synthesized by a DNA polymerase, utilizing the immobilized oligonucleotide as a primer (Fig. 2B). The array is then washed vigorously under strong denaturing conditions to remove all non-covalently bound material, including the original sorted strands. PCR is then carried out simultaneously in each well, utilizing the immobilized strand copies as templates and two universal primers, one being identical to the 5′-terminal extension (introduced into each strand prior to sorting), and the other being identical to the constant segment of the immobilized oligonucleotides. The sorting of nucleic acids on preparative oligonucleotide arrays can be used in a number of applications, two of which are discussed below.
Isolation of individual strands. The pools of strands can be sorted further to isolate individual strands. Since the added terminal extensions restore the restriction sites at the strand termini, the double-stranded fragments generated by PCR can be re-digested with the same restriction endonuclease to remove these extensions, and the sorting procedure can be repeated with each pool of strands. Of course, the direct copies of strands that were originally sorted into a well will all have identical 3′-terminal n-mers. However, the complementary copies (generated by PCR) will have different 3′-termini and will therefore be sorted into different wells. Since the maximum number of complementary strand species in a pool is only 60, the oligonucleotides in the second-round sorting arrays may have shorter variable segments (e.g., n = 4, corresponding to only 256 wells) and yet ensure that no more than one strand species will occur in most wells.
Preparative arrays can also be used for the isolation of all cellular mRNAs, after their conversion into cDNAs. As the mean number of mRNA species in a human cell is between 10,000 and 30,000 (ref. 39), most of their cDNAs can be isolated by a single round of sorting when n = 8. Although there is a high disproportion in the amount of different mRNA species in a cell, the amounts of individual cDNAs will be equalized upon sorting as a result of PCR amplification.
Fragment ordering. The ability to prepare complete fragment libraries with the aid of comprehensive arrays makes them an ideal tool for genome analysis, and in particular, for ordering fragments that have already been sequenced. Sequencing of a long DNA, by whatever method, includes digesting it, usually with a restriction endonuclease, into fragments of no more than a few thousand base pairs in length, and then determining the sequence of each fragment. The fragment sequences are then put into the correct order by determining, for each fragment, its immediate neighbors. This can be accomplished by cleaving the same DNA at different sites with another restriction endonuclease, in order to produce fragments whose sequence overlaps the sequences of neighboring fragments from the first digest, and then ascertaining which segments of different fragments from the first digest are contained in the same fragment from the second digest (Fig. 3). Since these segments are bounded by two types of restriction sites, we refer to them as "intersite segments". The problem of determining neighboring fragments can thus be reduced to identifying the intersite segments that are linked to each other by being present in the same fragment from a second digest.
Strands from the second digest are sorted by their 3′ termini on a preparative array, as described above, with the cleaved restriction sites being restored by terminal extensions. After amplification of the strands by PCR to produce both direct and complementary copies, the intersite segments are identified by determining their 3′-terminal n-mers. This is achieved by hybridizing the strands in each well to analytical arrays that contain two types of binary oligonucleotides whose constant segments (of length m) are complementary to either the first- or the second-type of restriction site, and whose variable n-mers are located at their 3′ ends. The strands are hybridized to the analytical arrays by their (m+n)-long sequences, consisting of one of the restriction sites plus the n-mer adjacent to that restriction site (a tagged n-mer). There are two such n-mers in each double-stranded intersite segment, and their combination constitutes its unique "signature." The signatures of the intersite segments are known in advance, since the fragments from the first digest have already been sequenced.
The intersite segments that are linked to each other in the fragments from the second digest are identified by listing, for each well, all those pairwise combinations of tagged n-mers that match known intersite segment signatures, and noting "linked" signatures, i.e., those found together in at least two wells. These are the wells where the complementary strands of the corresponding fragment from the second digest have been sorted. Statistical estimates show that more than 90% of the PstI human genome fragments can be ordered if n = 8. To order the remaining fragments, the procedure is repeated with other restriction endonucleases.
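The bookkeeping behind fragment ordering (list, for each well, the known signatures whose two tagged n-mers are both present, then keep the signature pairs that co-occur in at least two wells) can be sketched as follows. The data model is an assumption made for illustration: a signature is a pair of tagged n-mer labels, each well is a set of observed tagged n-mers, and all labels and well names below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def linked_signatures(wells, signatures, min_wells=2):
    """Pairs of known signatures found together in at least `min_wells` wells."""
    counts = Counter()
    for tagged_nmers in wells.values():
        # Signatures whose two tagged n-mers were both detected in this well.
        present = sorted(sig for sig in signatures if set(sig) <= tagged_nmers)
        counts.update(combinations(present, 2))
    return {pair for pair, count in counts.items() if count >= min_wells}


if __name__ == "__main__":
    # Hypothetical signatures of three intersite segments from the first digest.
    s1, s2, s3 = ("A1", "b1"), ("A2", "b2"), ("A3", "b3")
    # Hypothetical survey results for three wells of the preparative array.
    wells = {
        "well_1": {"A1", "b1", "A2", "b2"},
        "well_2": {"A1", "b1", "A2", "b2", "A3"},
        "well_3": {"A2", "b2", "A3", "b3"},
    }
    print(linked_signatures(wells, [s1, s2, s3]))
```

In this toy data set only the first two signatures are reported as linked, because they occur together in two wells, whereas the second and third share a single well.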
`
Preparation of Nested Strands
Another possibility provided by preparative oligonucleotide arrays is the generation of a nested set of shortened strands that are truncated from one end. These nested strands can be used to obtain long-range sequence information that overcomes the inherent ambiguity in SBH. There are two basic nesting strategies (Fig. 4). In both cases, the parental nucleic acid is modified to contain a universal 5′-terminal extension, and the generated nested strands consist of parental strands whose 3′ ends are truncated. The 3′ ends of the nested strands are thus variable and their 5′ ends are constant.
One nesting strategy includes a limited random fragmentation of a parental nucleic acid by nuclease digestion or chemical treatment (Fig. 4A). The resulting fragments are then sorted by their 3′ termini, essentially as described above for the sorting of full-length strands. The biochemical steps employed by this strategy (ligation of the 3′ terminus of a fragment strand to a masking oligonucleotide, and copying the ligated strand by extending the immobilized oligonucleotide) have recently been realized. To obtain direct copies of the nested strands, they are amplified in situ by asymmetric PCR in which a primer that is identical to the 5′-terminal extension is present in excess. The only fragments that are amplified are those that have not lost their 5′-terminal extensions during the fragmentation step.
The other strategy requires that full-length strands be hybridized to a preparative array whose oligonucleotides consist of only variable sequences (Fig. 4B). In this case, the strands hybridize to the immobilized oligonucleotides by any complementary n-mer along their length, whether it is terminal or not. Therefore, each strand species hybridizes to many wells in the array, each containing an immobilized oligonucleotide that is complementary to a different n-mer in the sequence. The strands are then copied by a DNA polymerase, beginning from the location where they have hybridized and continuing up to their 5′-terminal extension, utilizing the immobilized oligonucleotides as primers. The resulting immobilized templates are used to produce in situ multiple DNA or RNA copies (in the latter case, a 5′-terminal extension is utilized that encodes an RNA polymerase promoter).
Whatever strategy is used, the fragments amplified in each well correspond to a truncated parental strand that contains the region between the 5′ end and the n-mer that is complementary to the variable sequence of the immobilized oligonucleotide. If some n-mer occurs in the strand at several locations, all corresponding fragments will be generated in the well. Nesting nucleic acids on preparative arrays resembles the procedures employed in classical sequencing methods. The difference is that nested strands are sorted here according to the identity of their 3′-terminal n-mers, rather than being separated by gel electrophoresis according to their lengths.

FIGURE 5. Principle of sequencing by nested strand hybridization. (A) Reconstruction of a nucleic acid sequence that contains repeated (n-1)-mers. The data are obtained by generating all possible nested strands of the nucleic acid on a preparative array, and then surveying the n-mers in each well of the array (n = 3). Uniquely occurring n-mers are ordered according to the number of the n-mers found in their respective wells. The range of a repeated n-mer is delimited, from the 5′ end by the last n-mer whose well does not contain that repeated n-mer, and from the 3′ end by the first n-mer that is not contained in the well of the repeated n-mer. (B) Separation of the n-mers that belong to different strands in a mixture of nucleic acids by identifying maximal sets of n-mers that are connected with each other. Two n-mers are connected if one of them is contained in the other's well.
[Panel A: sequence ATGCTAATCTAACACTAT; nested strands; 3′-terminal n-mers of the nested strands and the n-mers identified in the wells; position numbers; range of CTA; assembling the n-mers. Panel B: mixture of sequences; connections between n-mers; sets of connected n-mers.]
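What each well of the preparative array would contain after nesting can be simulated directly: a well addressed by a given n-mer holds every truncated strand that ends in that n-mer, and surveying it yields the n-mers of those strands. The Python sketch below uses the 18-nucleotide example strand that can be read from the nested strands of Figure 5A (n = 3). It is a simulation under stated assumptions, not the authors' procedure: the copy number of each n-mer is taken from the known sequence, which would not be available in a real experiment, and the function names are invented.

```python
from collections import Counter


def nmers(seq, n):
    """All n-mers of seq, in order of occurrence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def well_contents(seq, n):
    """Map each n-mer to the set of n-mers surveyed in its well.

    A well collects every nested strand (5'-terminal prefix of seq) that ends
    in the well's n-mer, so a repeated n-mer pools several prefixes.
    """
    wells = {}
    for end in range(n, len(seq) + 1):
        nested = seq[:end]
        wells.setdefault(nested[-n:], set()).update(nmers(nested, n))
    return wells


def order_unique_nmers(seq, n):
    """Order uniquely occurring n-mers by how many n-mers their wells contain."""
    copies = Counter(nmers(seq, n))  # known here only because seq is simulated
    wells = well_contents(seq, n)
    unique = [word for word in wells if copies[word] == 1]
    return sorted(unique, key=lambda word: len(wells[word]))


if __name__ == "__main__":
    strand = "ATGCTAATCTAACACTAT"
    print(order_unique_nmers(strand, 3))
    print(len(well_contents(strand, 3)["TAT"]))
```

Because every uniquely occurring n-mer adds at least itself to the set of n-mers seen so far, the well sizes increase strictly along the strand, which is why sorting by well size recovers the positional order of the unique n-mers.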
`
`Sequencing by Nested Strand Hybridization
`Unlike standard SBH, where the data are collected in one
`step by hybridizing a nucleic acid strand to all possible n-mers,
`sequencing by nested strand hybridization (SNSH
