
Cell, Vol. 21, 627-638, October 1980. Copyright © 1980 by MIT

Human Fetal Gγ- and Aγ-Globin Genes: Complete Nucleotide Sequences Suggest That DNA Can Be Exchanged between These Duplicated Genes

Jerry L. Slightom, Ann E. Blechl and Oliver Smithies
Laboratory of Genetics
University of Wisconsin
Madison, Wisconsin 53706

Summary

We present the nucleotide sequences of the Gγ- and Aγ-globin genes from one chromosome (A) and of most of the Aγ gene from the other chromosome (B) of the same individual. All three genes have a small, highly conserved intervening sequence (IVS1) of 122 bp located between codons 30 and 31 and a large intervening sequence (IVS2) of variable length (866-904 bp) between codons 104 and 105. A stretch of simple sequence DNA occurs in IVS2 which appears to be a hot spot for recombination. On the 5' side of this simple sequence, the allelic Aγ genes differ considerably in IVS2 whereas the nonallelic Gγ and Aγ genes from chromosome A differ only slightly. Yet on the 3' side of the simple sequence, the allelic genes differ only slightly whereas the nonallelic genes differ considerably. We hypothesize that the 5' two thirds of the Aγ gene on chromosome A has been "converted" by an intergenic exchange to become more like the Gγ gene on its own chromosome A than it is like the allelic Aγ gene on the other chromosome B. Our sequence data suggest that intergenic conversions occur in the germ line. The DNA sequence differences between two chromosomes from a single individual strongly suggest that DNA sequence polymorphisms for localized deletions, additions and base substitutions are very common in human populations.

Introduction

The loci specifying the amino acid sequences of the globin chains of mammalian hemoglobins usually occur in duplicated nonallelic pairs which frequently differ in their relative expressions at different stages of development (Bunn, Forget and Ranney, 1977). For example, most humans have two α genes (Orkin, 1978); two adult β-type genes, δ and β (Lawn et al., 1978); and two fetal β-type genes, Gγ and Aγ (Fritsch, Lawn and Maniatis, 1979; Bernards et al., 1979; Little et al., 1979a; Ramirez et al., 1979; Tuan et al., 1979). In mice a similar situation exists; some strains have two adult β genes which code for different although closely related globins, βmajor and βminor (Tiemeier et al., 1978), while in other strains either the two adult β genes code for identical products, βs, or only one of the genes is expressed (Weaver et al., 1979). The degree of similarity between the products of these duplicated globin genes varies, but in many cases the products of a pair of duplicated genes within a given species are more like each other than either is like the comparable pair of globins in other species. For example, adult human δ- and β-globins, which differ in 10 amino acid residues out of 146, are more different than are the fetal human globins Gγ and Aγ, which differ in only 1 residue, but the adult human δ- and β-globins are still more similar to each other than either is to the adult mouse β-globins (the minimum human/mouse difference is 26 residues). The conventional explanation of these findings is that during evolution the globin genes have increased and decreased their number by unequal crossovers and/or duplication events. A high degree of similarity between the members of a duplicated pair within a species is usually taken to indicate a short evolutionary time since the last duplication occurred. A lack of similarity between the comparable members of duplicated pairs in two different species is taken to mean that the last duplications occurred after the species had diverged. (See Little et al., 1979b, for a discussion of the problems of this argument in relation to the fetal globin genes.)

Duplications probably first arise by rare breakage and reunion events at nonhomologous points on two chromosomes, resulting in an unequal exchange between them (Muller, 1936; Smithies, 1964). Duplications may span part of a gene with no intergenic DNA (such as the haptoglobin Hp2 allele; Smithies, Connell and Dixon, 1962) or may involve several genes and intergenic DNA (such as the Bar locus in Drosophila; Bridges, 1936). The DNA between duplicated genes which have recently arisen by a single nonhomologous breakage and reunion event must also be duplicated, either completely on one side or partly on both sides of the duplicated genes.

These considerations make it difficult to account, by events involving duplication followed by unequal but homologous crossing over, for the occurrence of closely related genes adjacent to DNA which does not itself appear to be duplicated. Such cases are known. For example, the two adult mouse β-globin genes appear to be the result of a relatively recent duplication, since their structural genes readily hybridize with each other to form a DNA heteroduplex (Tiemeier et al., 1978); however, the DNA flanking these two genes does not show sufficient homology to form a heteroduplex. The problem is that of understanding how duplicated genes can continue to share many species-specific features while their flanking DNA shows little or no evidence of once having had a common origin.

In this paper we present the complete nucleotide sequence of the two human fetal globin genes, Gγ and Aγ, from one chromosome (A), and of most of the Aγ gene from the other chromosome (B) of the same individual. These sequences appear to provide an
`
`SKI Exhibit 2032 - Page 1 of 12
`
`

`

`
example at the molecular level of a mechanism permitting the co-evolution of related linked genes without the need to involve the DNA between these genes. Thus the data indicate that the Gγ- and Aγ-globin genes on a given chromosome can exchange DNA sequences by a recombinational event like a gene conversion, and that a stretch of simple sequence DNA in IVS2 is a hotspot for initiating these exchanges. The sequence data suggest that conversions take place in the germ line and can occur between chromosomes as well as within a single chromosome.

In an accompanying paper (Efstratiadis et al., 1980) we present a detailed analysis of the nucleotide sequences of the human β-type globin genes [embryonic ε (Baralle, Shoulders and Proudfoot, 1980), fetal Gγ and Aγ (this paper), adult δ (Spritz et al., 1980) and adult β (Lawn et al., 1980)], and we compare these sequences with published globin gene sequences from other species.

Results and Discussion

Chromosomal Maps and Clones Studied

We have isolated clones covering the fetal globin region of both chromosomes of a diploid female donor and have found that the chromosomes differ in a number of places. Restriction enzyme sites used in defining the two chromosomes, arbitrarily labeled A and B, are presented in Figure 1 together with the code names and extents of the clones. Asterisks emphasize restriction sites which differ between the two chromosomes. The coding regions for the Gγ and Aγ genes are shown by heavy raised bars. About 14 kb to the 5' side of the Gγ gene of chromosome B we have identified another globin gene by hybridization. We presume that it is an ε-globin gene on the basis of restriction maps and sequence data presented by Proudfoot and Baralle (1979) and Fritsch, Lawn and Maniatis (1980), and have labeled it accordingly.

Clone 165.24 is the key clone from chromosome A. The restriction map of 165.24 (and the sequence data presented below) establish that it contains 14.3 kb of DNA which include the coding sequences for the expected two fetal globins, with Gγ on the 5' side of Aγ. The restriction map of 165.24 agrees within experimental error with the fetal portion of more extensive maps of the human fetal and adult globin region published by Bernards et al. (1979), Fritsch et al. (1979), Little et al. (1979a), Ramirez et al. (1979) and Tuan et al. (1979), using DNA from different donors. These previous studies showed one copy each of the Gγ and Aγ genes per chromosome. Since clone 165.24 contains one copy of each fetal globin gene in the same DNA environment found in the earlier studies, as judged by the restriction maps, we conclude that 165.24 contains DNA corresponding to the entire fetal globin region from one of the two relevant #11 chromosomes (Deisseroth et al., 1978) of our diploid donor.

Clone 164.6 is the key clone for chromosome B. It was an independent isolate from the same unamplified collection of in vitro packaged phages as 165.24. We identify it as being from chromosome B because its restriction map (Figure 1) and several critical sequenced regions (see below) differ from those of 165.24.

Clone 51.1 was isolated and initially characterized in earlier studies from this laboratory (Blattner et al., 1978; Smithies et al., 1978, 1979) using DNA from the same donor as for 165.24. DNA sequence data presented below show that the sequence of the Aγ gene in 51.1 is substantially different from that of the Aγ gene in 165.24, but is identical to the sequenced regions of the Aγ gene in clone 164.6. Consequently we identify the Aγ gene in clone 51.1 as being from chromosome B.

Organization of the Human Fetal Globin Genes

The restriction sites and strategy used in the DNA sequencing are shown in Figure 2. The same basic strategy was applicable to all three γ-globin genes. Figure 3 presents and compares the complete DNA sequences of the Gγ- and Aγ-globin genes from chromosome A and most of the DNA sequence of the Aγ-globin gene from chromosome B of our donor.

The data obtained from this sequence analysis support our earlier finding (Smithies et al., 1978) that the coding region of the human fetal Aγ-globin gene is divided into three segments by two noncoding intervening sequences, and extend this finding to the human fetal Gγ-globin gene. The smaller intervening sequence (IVS1) interrupts the coding region between codons 30 and 31, and the larger intervening sequence (IVS2) interrupts the coding region between codons 104 and 105. The arrows at the 5' and 3' boundaries of both IVS1 and IVS2 in Figure 3 point out a possible splicing frame for the removal of these intervening sequences which would conform to the GT/AG rule observed at the boundaries of the intervening sequences of many other genes (Breathnach et al., 1978). The human fetal globin gene intervening sequences occur in exactly the same positions as they occur in adult globin genes from mouse (Konkel, Tilghman and Leder, 1978; Konkel, Maizel and Leder, 1979), rabbit (van den Berg et al., 1978; van Ooyen et al., 1979) and human (Lawn et al., 1980; Spritz et al., 1980). Clearly the difference in expression between adult and fetal globins cannot be the result either of the fetal genes lacking these intervening sequences or of their presence in different places in the coding region.

IVS1 is 122 bases in length in all three γ-globin genes. The length of IVS2 is different in each of the three γ-globin genes sequenced: 886 bases in the Gγ gene of chromosome A, 866 bases in the Aγ gene
`
`
`

`

`
`
Human Fetal γ-Globin Gene Sequences
`
`
`
[Figure 1: restriction maps of chromosome A (clones 165.12, 165.24, 166.1 and 242.7) and chromosome B (clones 51.1, 164.1, 164.2 and 164.6); scale 0-24 kbp]

Figure 1. Maps Outlining Restriction Enzyme Sites Used to Define Chromosomes A and B of the DNA Donor

The coding regions of the Gγ- and Aγ-globin genes and the general location of a gene presumed to code for ε-globin are shown by heavy raised bars. The direction of transcription of the fetal globin genes is shown. Asterisks emphasize restriction sites which differ in the two chromosomes. The brackets under the two chromosome maps show the extents of the clones considered in this paper; their respective code numbers are listed alongside. The scale is in kilobase pairs. The restriction enzyme sites are (B) Bam HI; (Bg) Bgl II; (E) Eco RI; (Hf) Hinf I; (Hh) Hha I; (Hp) Hph I; (Th) Tha I.
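
As a rough illustration of how restriction maps such as those in Figure 1 can distinguish two chromosomes, the following Python sketch locates recognition sites in two toy alleles and reports the enzymes whose site positions differ (the sites that would carry an asterisk). The recognition sequences are the standard ones for four of the enzymes named in the legend. The two alleles and all function names are hypothetical and are not taken from the sequences reported here.

```python
import re

# Standard recognition sequences for four enzymes named in the Figure 1 legend.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "HhaI": "GCGC",
}


def site_positions(seq, site):
    """Return the 1-based start position of every match, including overlapping ones."""
    return [m.start() + 1 for m in re.finditer(f"(?={site})", seq.upper())]


def restriction_map(seq):
    """Map each enzyme name to the list of positions where its site occurs."""
    return {enzyme: site_positions(seq, site)
            for enzyme, site in RECOGNITION_SITES.items()}


def differing_sites(map_a, map_b):
    """Return the enzymes whose site positions differ between two maps."""
    return {enzyme: (map_a[enzyme], map_b[enzyme])
            for enzyme in map_a if map_a[enzyme] != map_b[enzyme]}


# Hypothetical toy alleles: allele_b has lost its Eco RI site by one base change.
allele_a = "TTGAATTCAAGGATCCTTGCGCAA"
allele_b = "TTGAGTTCAAGGATCCTTGCGCAA"

print(differing_sites(restriction_map(allele_a), restriction_map(allele_b)))
```

Comparing positions directly only works here because the toy alleles have the same length; real maps are compared by fragment sizes, which also shift when insertions or deletions are present.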
`
of chromosome A and 876 bases in the Aγ gene of chromosome B. Preliminary data (not presented here) indicate that the IVS2 in the Gγ gene from chromosome B is the largest of the four, having about 904 bases. A difference in the size of IVS2 in the nonallelic Gγ- and Aγ-globin genes is not surprising, but we did not expect that the lengths of the allelic globin genes would differ.

Comparison and Analysis of Sequence Data

The general similarities and differences in the three γ-globin genes are illustrated in Figure 4 by a bar diagram showing the distribution of the differences. Two striking features are revealed by this comparison. First, substantial portions of the three genes have virtually identical sequences: the 5' flanking and 5' untranslated regions, the complete coding sequences, all of IVS1, and three regions at the ends and middle of IVS2. Second, in the 3' third of the genes there are more differences between the nonallelic Gγ and Aγ genes on the same chromosome (hatched areas) than between the allelic Aγ genes (unhatched areas); however, in the 5' two thirds of the genes the distribution of the differences is reversed.

5' Flanking and 5' Untranslated Region

Only one difference occurs in the DNA sequences of the 5' flanking untranscribed region and 5' transcribed but untranslated region of the three genes; this is at position 25, where the Aγ gene from chromosome A has an adenine residue whereas the other two genes have a guanine. Chang et al. (1978) have published a sequence for (presumably mixed G and A) γ-globin mRNA; they only report a guanine at this position. At position 19 they report both guanine and cytosine, where we find only guanine. It is not known whether these differences are due to genetic polymorphisms or to problems in sequencing. In the remainder of the 5' untranslated region of mRNA, the two sets of sequence data are in complete agreement.

Included in the 5' flanking region are some sequences with features common to many globin genes and to other eucaryotic genes. These sequences are considered in detail in the accompanying comparison paper (Efstratiadis et al., 1980). They include a hexanucleotide sequence (AATAAA) starting 31 bases before the first nucleotide of the mRNA (overlined sequence in Figure 3) which is similar in sequence and position to that first recognized by Goldberg
`
`
`

`

`
(1979) in the Drosophila histone genes. [The hexanucleotide sequence (AATAAA) from the 5' side of the γ-globin genes is also the same as the poly(A) addition signal found about 20 bases before the 3' end of many eucaryotic mRNAs (Proudfoot and Brownlee, 1976), including our γ-globin genes. This identity is probably coincidental, but we cannot exclude the possibility that these sequences on the 5' side of the γ-globin genes are part of the 3' end of mRNAs transcribed from DNA on the 5' side of the γ-globin genes.] A second region of possible functional importance in the 5' untranslated region of many eucaryotic genes is the "capping box," first pointed out by Ziff and Evans (1978). The capping box consists of 12 nucleotides, 10 of which are on the 5' side of the first nucleotide of the mRNA. The capping boxes in the three γ-globin genes all have the same sequence (GCAGTTCCACAC), which is underlined in Figure 3.

[Figure 2: restriction map of the sequenced region with sequencing arrows; scale from -100 to 1500 nucleotides]

Figure 2. Detailed Map of the Restriction Enzyme Sites, Shown by Vertical Arrows, Used in Sequencing the Gγ and Aγ Genes of Clone 165.24 and the Aγ Gene of Clone 51.1

The Aγ above the Pst I site and the Gγ above the Sac I site indicate that each site occurs only in the specified gene. The directions of sequencing shown by the solid horizontal arrows apply to all three genes; the dotted arrows apply only to the genes in 165.24. The positive and negative numbers on the scale refer to nucleotide positions; position 1 is the first adenine of the fetal globin mRNAs. The coding regions (common to all three genes) are shown by heavy bars.

Gγ- and Aγ-Globin Coding Sequences

It is most remarkable that only one nucleotide difference occurs in the 438 bases comprising the coding region of the three sequenced γ-globin genes. This difference is in codon 136, where the Gγ-globin codon is GGA (coding for glycine) and the Aγ-globin codon is GCA (coding for alanine). The fact that the DNA sequences in the coding regions of the Gγ and Aγ genes are otherwise identical clearly indicates either that strong selection pressures are being exerted at the nucleotide level, and/or that some type of molecular mechanism exists whereby these duplicated genes have avoided evolutionary divergence.

The nucleotide sequence for Gγ mRNA has previously been determined by Forget et al. (1979) from cDNA clones. A comparison of our Gγ-globin coding sequence with their recently revised sequence (B. Forget, personal communication) reveals no differences.

IVS1 Is Very Conserved

Comparison of the 122 bp IVS1 from the three γ-globin genes shows their DNA sequences to be almost as highly conserved as the sequences of the coding regions. The two nonallelic Gγ and Aγ genes from chromosome A have identical IVS1 DNA sequences, and there is only one difference between the IVS1 of the allelic Aγ genes: this difference is at position 210, about in the middle of the IVS. The strict conservation of these nucleotide sequences again suggests that the sequences, like the coding region, must either be under strong selection pressure or else are maintained in identical form by some other mechanism. A result similar to ours has been reported by Konkel et al. (1979) for the IVS1 sequences from the mouse βmajor and βminor globin genes; three nucleotide differences were found in the 116 nucleotides of the IVS1 sequences of their mouse β-globin genes.

The 3' Untranslated Region Contains a Considerable Number of Differences

The nucleotide sequences of the 3' untranslated regions of both Gγ mRNA (Forget et al., 1979) and Aγ mRNA (Poon, Kan and Boyer, 1978) have already been determined. Both are 90 bp in length, but their sequences differ at six positions. Our Aγ sequence agrees completely with that of Poon et al. (1978). We also find six differences between the two nonallelic genes at the same positions as in these previous reports. Four of the differences are in a block just after the terminator codon (positions 1508-1511 in Figure 3); the other two occur at positions 1522 and 1583. Whether these differences are related to the varied synthesis of the Gγ- versus Aγ-globins during development (Bunn et al., 1977) cannot be determined from the data currently available. Our Gγ sequence differs in two positions from the Gγ mRNA sequence reported by Forget et al. (1979). The first difference is at position 1510 (where a G was found in the mRNA and we find an A), and the second is at position 1583 (where a T was found in the mRNA and we find an A). These two positions also differ between the nonallelic genes.

Proudfoot and Brownlee (1976) have suggested, as mentioned above, that the hexanucleotide (AATAAA) forms a signal for poly(A) addition about 20 bp 5' to the first nucleotide of poly(A) in many mRNAs. We find, as did Forget et al. (1979) and Poon et al. (1978), that exactly this sequence occurs in both Gγ and Aγ
`
`
`

`

`
[Figure 3: aligned nucleotide sequences, positions -56 to 1592, with the encoded amino acids printed beneath the coding regions]

Figure 3. Nucleotide Sequences of the Gγ and Aγ Genes from Clone 165.24 and of the Aγ Gene from Clone 51.1

The numbering system is taken from the Gγ gene of 165.24, the largest of the three sequenced genes, with position 1 corresponding to the first adenine of globin mRNA to which the cap, 7mGppp, is added (Chang et al., 1978). The fully listed sequence is that of the Aγ gene of 165.24; nucleotides which are different in the Gγ gene of 165.24 or in the Aγ gene of 51.1 are shown respectively above or below the sequence of the Aγ gene of 165.24. Asterisks indicate gaps. Underlined and overlined nucleotides denote regions of possible biological importance (see text). The amino acids printed below the dashed counting line refer to coding nucleotides above the line. The initiator codon is printed as Met, and the terminator codon as Ter. Arrows indicate splicing sites which conform to the GT/AG rule (Breathnach et al., 1978).
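
The GT/AG rule mentioned in the legend can be checked mechanically. The Python sketch below tests whether an intervening sequence begins with GT and ends with AG, and then removes such sequences from a gene to give the continuous coding sequence. The toy gene, its coordinates and the function names are illustrative assumptions; they are not the γ-globin sequences of Figure 3.

```python
def obeys_gt_ag(intron):
    """True if the intervening sequence starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(gene, introns):
    """Remove introns given as (start, end) pairs, 1-based and inclusive.

    Raises ValueError if an intron does not conform to the GT/AG rule.
    """
    pieces = []
    cursor = 0  # 0-based index of the first base not yet copied
    for start, end in sorted(introns):
        intron = gene[start - 1:end]
        if not obeys_gt_ag(intron):
            raise ValueError(f"intron {start}-{end} does not obey the GT/AG rule")
        pieces.append(gene[cursor:start - 1])
        cursor = end
    pieces.append(gene[cursor:])
    return "".join(pieces)


# Hypothetical toy gene: three short exons separated by two intervening sequences.
exons = ["ATGGGT", "CATTTC", "ACATGA"]
ivs = ["GTAAGTCCTTTCAG", "GTGAGTAAACAG"]
gene = exons[0] + ivs[0] + exons[1] + ivs[1] + exons[2]
introns = [(7, 20), (27, 38)]  # 1-based coordinates of the two toy introns

print(splice(gene, introns))
```

The check is deliberately minimal: it looks only at the two terminal dinucleotides, so it says nothing about which of several possible GT...AG frames a cell would actually use.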
`
`
`

`

`
genes of chromosome A at the doubly underlined positions 1565-1571 in Figure 3, which is 21 bp from where mRNA has poly(A) attached. We do not have the corresponding data for chromosome B.

[Figure 4: bar diagram of differences against sequence position (0-1500); hatched bars, Gγ 165.24 v. Aγ 165.24; unhatched bars, Aγ 165.24 v. Aγ 51.1; heavy bars, coding region]

Figure 4. Bar Diagram Illustrating the Distribution of Differences between the Nonallelic Gγ and Aγ Genes and between the Allelic Aγ Genes

The hatched areas show the differences between the Gγ gene of chromosome A (from clone 165.24) and the Aγ gene of the same chromosome (also from clone 165.24). The unhatched areas show the differences between the Aγ gene of chromosome A (from clone 165.24) and the allelic Aγ gene of chromosome B (from clone 51.1). Base substitutions are counted as one difference; gaps, regardless of their length, are arbitrarily counted as three differences so as not to give gaps overemphasis while still indicating that they usually occur less frequently than substitutions. The horizontal scale shows nucleotide sequence positions. Each bar shows the differences found in approximately 100 nucleotides, but the bars have been adjusted in width to coincide with rational boundaries. The coding regions are indicated with the relevant amino acid numbers in parentheses.

IVS2 Contains Conserved, Nonconserved and Simple Sequence DNA

We have completely sequenced the IVS2 from three of the four γ-globin genes of our donor (Figure 3) and have partially sequenced the fourth (see Figure 7). As pointed out earlier, the lengths of these IVS2 sequences vary between 866 and 904 bp.

A comparison of the sequences of IVS2 from these three γ-globin genes shows a great deal of homology, which was expected from the restriction enzyme site analysis presented in Figure 2. Using the same method for counting differences as was used for Figure 4, the nonallelic Gγ- and Aγ-globin IVS2 sequences are 98.3% homologous and the allelic Aγ genes are 97.3% homologous. The homology found here between the nonallelic Gγ and Aγ IVS2 sequences is very much greater than the 59% found by Konkel et al. (1979) for the nonallelic mouse βmajor and βminor genes.

Figure 4 shows that the IVS2 sequences of these three γ genes are virtually identical close to the 5' and 3' splice points of IVS2. There is only one base pair difference in the 110 bp of IVS2 adjoining the 5' splice point, and no difference in the 80 bp of IVS2 adjoining the 3' splice point (Figure 3). Conservation of sequences around splice points may be important for the proper removal of intervening sequences, a possibility already pointed out by Konkel et al. (1979), who made similar observations for the mouse βmajor and βminor genes.

Figure 4 also shows that the three fetal globin genes have an invariant region of 285 bp close to the center of IVS2 (positions 795-1079, Figure 3). We shall consider later a possible explanation of this invariant region in the middle of IVS2. The data of Konkel et al. (1979) on the mouse adult β-globin genes do not show a similar invariant region.

We expected that the allelic forms of the Aγ-globin gene from the two chromosomes of our donor would be very similar at the DNA sequence level, even in the intervening sequences, and that the nonallelic globin genes on the same chromosome might be less similar. Although we found this expectation to be correct in some parts of the genes, it is not correct in other parts, as clearly demonstrated in Figure 4. On the 5' side of the invariant IVS2 sequence the allelic Aγ genes from the two chromosomes unexpectedly show many differences in the IVS2 sequence, while in the same region there are no differences between the nonallelic Gγ and Aγ genes from chromosome A. On the 3' side of the invariant region the expected tendency is found; there are more differences between the IVS2 sequences in the nonallelic Gγ and Aγ genes than between the allelic Aγ genes.

A clue to solving this puzzle is provided by our finding a region of "simple sequence" DNA (positions 1062-1107 in Figure 3) at the 3' boundary of the invariant region of IVS2. The DNA sequences of the three genes at and near this region of simple sequence DNA are reproduced in Figure 5. As shown in the figure, the dinucleotide TG is the most common element of the simple sequence (large letters in Figure 5). It is repeated 19 times in stretches of 11, 2 and 6 dinucleotides in the Gγ gene of chromosome A, 13 times in the Aγ gene of chromosome A in a single stretch, and 17 times in the Aγ gene of chromosome B in stretches of 9 and 8 dinucleotides. The dinucleotide CG is also repeated in two of the genes (underlined letters in Figure 5). The simple sequences on chromosome A can formally be related to each other by a deletion resulting from an unequal exchange between the two larger stretches of TG dinucleotides with the loss of 20 bases.

Figure 5 reemphasizes our finding that on the 5' side of the simple sequence region the nonallelic Gγ and Aγ genes are virtually identical (only 1 base substitution out of 1135 bases) whereas the allelic Aγ genes differ considerably (13 substitutions and 2 four-base gaps). Yet on the 3' side of the simple sequence region the relationships are reversed, with the allelic
`
`
Positions: 1060 1070 1080 1090 1100 1110

Gγ, chromosome A: CAGCTCTGTGTGTGTGTGTGTGTGTGTGCGCGCGTGTGTTTGTGTGTGTGTGAGAGCG
Aγ, chromosome A: CAGCTCTGTGTGTGTGTGTGTGTG********************TGTGTGTGTCAGCG
Aγ, chromosome B: CAGCTCTGTGTGTGTGTGTGTGTGCGCGCGCGCGTGTG**TGTGTGTGTGTGTCAGCG

Fractional sequence differences flanking the simple sequence region:
Gγ vs. Aγ, chromosome A (nonallelic): 1/1135 on the 5' side; 12/333 on the 3' side
Aγ, chromosome A vs. Aγ, chromosome B (allelic): 13 + 2 gaps/1135 on the 5' side; 2/333 on the 3' side
`
Figure 5. Detailed Comparison of the Nucleotide Sequences of the Three Sequenced Genes at a Region of Simple Sequence DNA (Positions 1062-1107) near the Middle of IVS2, and a Summary of the Differences between the Three Genes on Either Side of This Simple Sequence Region

Repeated TG dinucleotides in the simple sequence region are shown in large letters; repeated CG dinucleotides are underlined; asterisks represent gaps. The brackets and fractions indicate the fractional sequence differences on either side of the simple sequence between the nonallelic Gγ and Aγ genes on chromosome A, and between the allelic Aγ genes on the two chromosomes. The boxed fractions emphasize that on the 5' side the nonallelic genes are more similar than the allelic genes, while on the 3' side the allelic genes are more similar.
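The counts summarized in Figure 5 can be checked with a short script. The following is a minimal sketch, not the authors' method. The function names `tg_stretches` and `percent_difference` are hypothetical. The toy sequence is built only from the stretch lengths reported for the Gγ gene of chromosome A (11, 2 and 6), with placeholder spacers rather than the real intervening bases. Counting each gap as a single difference is an assumption made for illustration.

```python
import re


def tg_stretches(seq: str) -> list[int]:
    """Return the length, in dinucleotides, of each maximal run of TG repeats."""
    return [len(m.group(0)) // 2 for m in re.finditer(r"(?:TG)+", seq.upper())]


def percent_difference(differences: int, length: int) -> float:
    """Express a count of differences as a percentage of the bases compared."""
    return 100.0 * differences / length


# Toy sequence: TG stretch lengths follow the text (11, 2 and 6 dinucleotides);
# the spacers "CGCGCG" and "AA" are placeholders, not the published sequence.
toy = "TG" * 11 + "CGCGCG" + "TG" * 2 + "AA" + "TG" * 6
stretches = tg_stretches(toy)

# Differences flanking the simple sequence region, as (differences, bases compared).
# Assumption: each of the 2 gaps on the 5' side is counted as one difference.
flanks = {
    "5' side": {"nonallelic": (1, 1135), "allelic": (13 + 2, 1135)},
    "3' side": {"nonallelic": (12, 333), "allelic": (2, 333)},
}

# For each side, find which pair of genes shows the smaller percent difference.
more_similar = {
    side: min(pairs, key=lambda name: percent_difference(*pairs[name]))
    for side, pairs in flanks.items()
}

for side, pairs in flanks.items():
    for name, (diffs, length) in pairs.items():
        print(f"{side}, {name}: {percent_difference(diffs, length):.2f}% different")
    print(f"{side}: the {more_similar[side]} pair is more similar")
```

Under these assumptions the script reproduces the reversal emphasized by the boxed fractions: the nonallelic pair is the more similar on the 5' side and the allelic pair on the 3' side.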
`
genes being very similar (2 base substitutions out of 333) and the nonallelic genes being more different (12 base substitutions).

Because the DNA in the simple sequence region of the Gγ and Aγ genes of chromosome A differs in

occurred by strand transfer without isomerization and branch migration. We do, however, exclude participation of chromosome B in the particular exchange which led to the Aγ gene of chromosome A, because
of the Gγ gene on chromosome B
