ELSEVIER

Gene 205 (1997) 73-94

GENE
An International Journal on Genes and Genomes

Locus control regions of mammalian β-globin gene clusters: combining phylogenetic analyses and experimental results to gain functional insights

Review

Ross Hardison a,b,*, Jerry L. Slightom c, Deborah L. Gumucio d, Morris Goodman e, Nikola Stojanovic f, Webb Miller b,f

a Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
b Center for Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
c Molecular Biology Unit 7242, Pharmacia and Upjohn, Inc., Kalamazoo, MI 49007, USA
d Department of Anatomy and Cell Biology, University of Michigan Medical School, Ann Arbor, MI 48109-0616, USA
e Department of Anatomy and Cell Biology, Wayne State School of Medicine, Detroit, MI 48201, USA
f Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA

Accepted 22 July 1997

Abstract

Locus control regions (LCRs) are cis-acting DNA segments needed for activation of an entire locus or gene cluster. They are operationally defined as DNA sequences needed to achieve a high level of gene expression regardless of the position of integration in transgenic mice or stably transfected cells. This review brings together the large amount of DNA sequence data from the β-globin LCR with the vast amount of functional data obtained through the use of biochemical, cellular and transgenic experimental systems. Alignment of orthologous LCR sequences from five mammalian species locates numerous conserved regions, including previously identified cis-acting elements within the cores of nuclease hypersensitive sites (HSs) as well as conserved regions located between the HS cores. The distribution of these conserved sequences, combined with the effects of LCR fragments utilized in expression studies, shows that important sites are more widely distributed in the LCR than previously anticipated, especially in and around HS2 and HS3. We propose that the HS cores plus HS flanking DNAs comprise a 'unit' to which proteins bind and form an optimally functional structure. Multiple HS units (at least three: HS2, HS3 and HS4 cores plus flanking DNAs) together establish a chromatin structure that allows the proper developmental regulation of genes within the cluster. © 1997 Elsevier Science B.V.

Keywords: Hemoglobin; Sequence conservation; Enhancement; Chromatin; Domain opening; DNA-binding proteins

1. Expression patterns of mammalian hemoglobin gene clusters

The genes that encode the polypeptides of the α2β2 tetramer of hemoglobin are located in two separate clusters in birds and mammals. In humans, the β-like globin genes (including pseudogenes denoted by the prefix ψ) are clustered in the array 5'-ε-Gγ-Aγ-ψη-δ-β-3' that covers about 75 kb on chromosome 11p15.4, and the α-like globin genes are in a 40-kb cluster, 5'-ζ2-ψζ1-ψα2-ψα1-α2-α1-θ-3', very close to the telomere of the short arm of chromosome 16. Expression of the α- and β-like globin genes is limited to erythroid cells and is balanced so that equal amounts of the two polypeptides are available to assemble the hemoglobin heterotetramer. Expression of genes within the clusters is developmentally controlled, so that different forms of hemoglobin are produced in embryonic, fetal and adult life (reviewed in Stamatoyannopoulos and Nienhuis, 1994).

*Corresponding author. Present address: Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 206 Althouse Laboratory, University Park, PA 16802, USA. Tel.: +1 814 8630113; Fax: +1 814 8637024; e-mail: rch8@psu.edu

Abbreviations: LCR, locus control region; HS, hypersensitive site; HIC, highest information content; DPF, differential phylogenetic footprint; CACBPs, proteins that bind to the CACC motif; MAR, matrix attachment region; bHLH, basic helix-loop-helix; MEL, murine erythroleukemia.

0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved.
PII S0378-1119(97)00474-5

SKI Exhibit 2015
Page 1 of 22

This process of hemoglobin switching is an excellent model system for increasing our understanding of the molecular mechanisms of differential gene expression during development. These developmental switches also offer new approaches to therapy for inherited anemias. For example, continued expression of the normally fetal HbF (α2γ2) in adults will reduce the severity of symptoms of patients producing an abnormal β-globin in sickle cell disease and possibly also in patients lacking sufficient β-globin (β-thalassemia). An understanding of the molecular basis of globin gene switching will facilitate development of new therapeutic strategies (pharmacological and/or DNA transfer) that continue γ-globin gene expression in adults.

In addition to biochemical and genetic approaches to studying regulation of globin genes, phylogenetic approaches are also highly informative. The detailed study of globin gene clusters in many mammalian species has provided a rich resource of information from which to glean further insight into not only the evolution of the gene clusters but also their regulation. The β-globin gene clusters have been extensively studied in human, the prosimian galago, the lagomorph rabbit, the artiodactyls goat and cow, and the rodent mouse. Maps of these gene clusters are shown in Fig. 1, and aspects of their evolution and regulation have been reviewed (Collins and Weissman, 1984; Goodman et al., 1987; Hardison and Miller, 1993). The ε-globin gene is at the 5' end of all the mammalian globin gene clusters and is expressed only in embryonic red cells in all cases. In most eutherian mammals, expression of the γ-globin gene is also limited to embryonic red cells, but in anthropoid primates, its expression continues and predominates in fetal red cells. The appearance of this new pattern of fetal expression of the γ-globin genes coincides roughly with the duplication of the genes in primate evolution, which leads to the hypothesis that the duplication allowed the changes that caused the fetal recruitment (Hayasaka et al., 1993). The β-globin gene is expressed after birth in all mammals, but in galago, mouse and rabbit, its expression initiates and predominates in the fetal liver (arguing that fetal expression of the β-globin gene is the ancestral state). The recruitment of γ-globin genes for fetal expression in anthropoid primates is accompanied by a corresponding delay in expression of the β-globin gene.

[Fig. 1 graphic: maps, on a 0-140 kb scale, of the inferred ancestral eutherian β-globin gene cluster and of the human, galago, rabbit, goat and mouse clusters; the branch leading to human is annotated 'duplication of γ and fetal recruitment'.]

Fig. 1. Evolution of β-globin gene clusters in eutherian mammals. The inferred ancestral gene cluster and the branching pathways to contemporary gene clusters are shown. The time of expression during development is indicated beneath the box representing each gene; E, embryonic; F, fetal; A, adult. The boxes for orthologous genes have the same shading.

Comparisons of DNA sequences among mammalian β-globin gene clusters can reveal candidates for sequences involved in shared regulatory functions; these will be detected as conserved sequence blocks, or phylogenetic footprints, found in all mammals (Gumucio et al., 1992; Hardison et al., 1993). Notable similarities are found in alignments of the proximal 5' flanking regions of the orthologous β-like globin genes, consistent with their roles as promoters and other regulators of expression. In addition, striking and extensive sequence matches are found at the far 5' end of the gene clusters, in the region that we now recognize as the locus control region (LCR), which is the dominant, distal control sequence for these gene clusters. Sequence comparisons can be used also to identify candidates for regulatory elements that lead to differences in expression patterns. In this case, one searches for sequences conserved in the set of mammals that show a particular phenotype but
which differ in the species with a different pattern of expression. For instance, such differential phylogenetic footprints (Gumucio et al., 1994) led to the discovery of a sequence implicated in fetal-specific expression of the γ-globin genes in higher primates (Jane et al., 1992) and a sequence that binds several proteins implicated in fetal silencing of the γ-globin gene (Gumucio et al., 1994). In this review, we summarize the results of sequence comparisons for both types of regulatory element in the LCR.

2. General features of mammalian β-globin LCRs

2.1. DNase hypersensitive sites 5' to the β-globin gene cluster

The β-globin LCR was initially discovered as a set of DNase hypersensitive sites located 5' to the ε-globin gene (Tuan et al., 1985; Forrester et al., 1986, 1987). At least 5 DNase HSs, called HS1-HS5 (Fig. 2), have been characterized within the region that provided the original gain-of-function effects described below (Grosveld et al., 1987), and we will refer to this region with all five HSs as the 'full LCR.' The presence of DNase HSs is indicative of an altered chromatin structure associated with important cis-regulatory regions (Gross and Garrard, 1988). Some of these sites, especially HS3, appear preferentially in erythroid nuclei (Dhar et al., 1990), but in contrast to the DNase hypersensitive sites at promoters, all are developmentally stable, i.e., present in embryonic, fetal and adult red cells (Forrester et al., 1986). Thus, the LCR marks an open chromatin domain for the β-like globin gene cluster in erythroid cells from all developmental stages, and functional assays implicate the LCR in generating this open domain, as described in the next section.

[Fig. 2 graphic: schematic comparing the human β-globin gene cluster, the Hispanic (γδβ) thalassemia deletion, and globin gene constructs in transgenic mice with and without the LCR, annotated for DNase HSs, expression in erythroid cells, chromatin state (open or closed) and developmental regulation.]

Fig. 2. Summary of the major effects of the β-globin locus control region.

2.2. Position-independent expression and enhancement

As illustrated in Fig. 2, the β-globin LCR will confer high-level, position-independent expression on globin gene constructs in transgenic mice (reviewed in Townes and Behringer, 1990; Grosveld et al., 1993). In the absence of the LCR, the human β- or γ-globin gene is expressed in only about half of the lines of transgenic mice carrying the integrated gene, and expression levels are low relative to those of the endogenous mouse globin genes. The lack of expression in many lines of transgenic mice is presumed to result from negative position effects generated by adjacent sequences at the site of integration, which prevent expression of the transgene in erythroid cells. However, when a large DNA fragment containing the full LCR is linked to the β-globin gene, all resulting transgenic mouse lines express the gene, and at a level comparable to that of the endogenous globin genes (Grosveld et al., 1987). Hence, the negative position effects are no longer observed, indicating that either a strong domain-opening activity (that overrides the negative effects of adjacent sequences), or an insulator that blocks the effects of adjacent sequences, or both, are present in the LCR. The high level of expression of the transgene indicates the presence of enhancers in the LCR as well. Both enhancers and LCRs increase
the probability that a locus will be in a transcriptionally competent state without affecting the transcription rate in a cell actively expressing that locus (Walters et al., 1995, 1996; Wijgerde et al., 1996). This further argues that one of the major functions of the LCR is to open a chromatin domain around the locus in erythroid cells. In fact, deletion of most of the LCR but not the β-globin genes, e.g., as occurs in Hispanic (γδβ)-thalassemia, leaves the gene cluster in a chromatin conformation that is inaccessible to DNase I, and the globin genes are not expressed (Forrester et al., 1990). Thus, this loss-of-function analysis also shows that the LCR is necessary for the establishment and maintenance of an open chromatin domain within which the globin genes are expressed (Fig. 2).

Minimal DNA sequences that confer position-independent expression of a linked β-globin gene in transgenic mice have been determined in regions around the sites of strong DNase cleavage (reviewed in Grosveld et al., 1993). These regions are referred to as the 'hypersensitive site cores' for HS1, HS2, HS3 and HS4.

2.3. Copy-number dependent expression

Transgene constructs that confer full protection from position effects should not be affected by any adjacent sequences. Thus, when the construct is integrated in multiple copies, as is frequently the case in transgenic mouse lines and in stably transfected cultured cells, each copy should be expressed independently of other copies, resulting in a level of expression that increases linearly with the number of copies. This 'copy-number-dependent' expression has been observed in some cases with particular fragments of the β-globin LCR (Talbot et al., 1989), as well as with the chicken β/ε-globin enhancer (Reitman and Felsenfeld, 1990). Other experiments with fragments of the β-globin LCR do not show a clear dependence on copy number (Ryan et al., 1989), and occasional studies show inverse relationships between copy number and level of expression (Morley et al., 1992; TomHon et al., 1997). Although the minimal sequences that will achieve full dependence on copy number are not yet known, this property appears to require sequences from both the LCR and the gene proximal region (Lloyd et al., 1992; Fraser et al., 1993; Li and Stamatoyannopoulos, 1994b). For the γ-globin gene, copy-number dependence requires both sequences 3' to the γ-globin gene and one or more elements in the HS cores (Stamatoyannopoulos et al., 1997).

2.4. Replication of the locus

In addition to the strong effects of the β-globin LCR on chromatin opening and enhancement of expression, the LCR also has a dominant effect on the regulation of replication in the locus. The Hispanic (γδβ)-thalassemia deletion, which removes HS2 through HS5 (Fig. 2), not only leaves the locus in a closed chromatin conformation but also delays the time of replication from early to late in S phase in erythroid cells (Forrester et al., 1990). Replication of the β-globin gene locus normally initiates just 5' to the β-globin gene (Kitsberg et al., 1993), which is 50 kb 3' to the LCR. Surprisingly, chromosomes with the Hispanic thalassemia deletion no longer use the normal replication origin, even though it is intact, but instead use an origin located 3' to the β-globin locus (Aladjem et al., 1995).

2.5. Developmental regulation

The effects of the LCR, if any, on developmental regulation are more complicated to analyze. Several lines of evidence show that sequences proximal to the genes are sufficient to specify expression at a given developmental stage. In the absence of an LCR, human β-like globin genes are expressed at the 'correct' developmental stage in transgenic mice, i.e., mimicking the expression pattern of the orthologous endogenous mouse genes (summarized in Trudel and Costantini, 1987). In fact, developmental switching can occur between human γ- and β-globin genes in transgenic mice in the absence of an LCR (Starck et al., 1994), demonstrating that the LCR is not essential for switching. Point mutations in the promoter of the human γ-globin genes are associated with prolonged expression in the adult stage, i.e., hereditary persistence of fetal hemoglobin (reviewed in Stamatoyannopoulos et al., 1994). Detailed studies of the human ε- and γ-globin genes in constructs also containing LCR fragments have revealed sequences extending up to about 0.8 kb away from the gene that have both positive and negative effects on developmental control (Stamatoyannopoulos et al., 1993; Trepicchio et al., 1993; Li and Stamatoyannopoulos, 1994b; Trepicchio et al., 1994). Recent studies in transgenic mice show that the human γ-globin gene is expressed fetally, whereas the orthologous galago γ-globin gene is expressed embryonically, in the context of an otherwise identical transgene construct (TomHon et al., 1997). This recapitulation of developmental specificity shows that the dominant determinants of developmental timing are encoded by nucleotide differences within the 4.0-kb fragment containing the γ-globin gene.

Although developmental switches in expression can occur in the absence of the LCR, it is still possible that, when present, the LCR participates directly in developmental regulation (e.g., Stamatoyannopoulos, 1991; Wijgerde et al., 1996). Addition of the LCR to a single human β- or γ-globin gene will alter developmental control (Fig. 2), leading to precocious expression of the β-globin gene in embryonic red cells and expression of the γ-globin gene in fetal and adult stages (Enver et al., 1989; Behringer et al., 1990). Inclusion of both γ- and
β-globin genes will improve the developmental switching, leading to a model of competition between promoters for the LCR (Enver et al., 1990). The order of multiple globin genes in LCR-containing constructs also influences their regulation (Hanscombe et al., 1991; Peterson and Stamatoyannopoulos, 1993). Although these data can be explained by a competition model, the apparent loss of developmental control seen in the presence of an LCR could result from the increased sensitivity of the assays, and the effects of additional genes in the construct can be explained by gene order effects (such as transcriptional interference from the upstream gene) as opposed to proximity to the LCR (Martin et al., 1996).

The effects that led to models of competition in developmental regulation are seen primarily for the regulation of the human β-globin gene. The ε-globin (Raich et al., 1990; Shih et al., 1990), α-globin and ζ-globin (Pondel et al., 1992; Liebhaber et al., 1996) genes are autonomously regulated during development in the presence of LCR-like elements, and constructs containing larger LCR fragments with the γ-globin gene also show autonomous regulation (Dillon and Grosveld, 1991).

2.6. Models for LCR action

Many studies are consistent with the hypothesis that several DNase HSs in the LCR work together in a holocomplex to generate the several effects enumerated above. One explicit model stating that each HS has a predominant effect on only one specific gene in the cluster (Engel, 1993) can be excluded since deletions of single HSs in the context of entire gene clusters either have little effect or affect expression of all the genes in the locus (reviewed below). Indeed, removal of any single HS makes the entire human β-globin gene cluster more sensitive to position effects in transgenic mice (Milot et al., 1996), arguing that this defining property of the LCR requires all of the HSs. This result contrasts with the implications of reports on the ability of individual HSs to provide position-independent, copy-number dependent expression (e.g., Fraser et al., 1993), and the molecular basis for this apparent discrepancy is not clear. Functional interactions between the HSs have been demonstrated, but require DNA sequences inside and outside the core HSs (reviewed below). Thus, although several individual HSs do exhibit substantial function alone, it is most likely that they normally interact in a holocomplex (Ellis et al., 1996) that encompasses a substantial amount of DNA.

The ability of the LCR to open a chromosomal domain suggests that it recruits chromatin-remodeling activities such as SWI/SNF (Cote et al., 1994; Peterson and Tamkun, 1995) and/or histone acetyl transferases (Brownell et al., 1996) to this locus, but only in erythroid cells. This could occur indirectly, with recognition of specific sequences in the LCR by trans-activator proteins such as members of the AP1 family of proteins and recruitment of chromatin remodeling and/or histone modifying activities by specific interaction between these enzymes and the trans-activator. For instance, the co-activator proteins CBP and P300 are histone acetyl transferases and also interact with AP1 (Ogryzko et al., 1996). In addition, some DNA sequences in the LCR could recruit chromatin remodeling and modifying activities directly.

Several other issues remain unresolved. For instance, the LCR could influence all or several of the genes in the locus at once (Bresnick and Felsenfeld, 1994; Martin et al., 1996) or it could serve to activate expression of one gene at a time (Wijgerde et al., 1995). If the LCR does influence predominantly one gene at a time, it could do so by interaction directly with the target gene with looping out of DNA between this distal regulator and the proximal regulatory elements (Grosveld et al., 1993) or the positive effect of the LCR could 'track' along the DNA to the target gene (Tuan et al., 1992). Neither the molecular targets of the direct interactions (in the former model) nor the molecular basis of the tracking effects (in the latter model) are known. For instance, 'tracking' could involve movement of transcription factors along the DNA, or it could result from spreading of the active chromatin domain down the locus.

3. Sequence analysis of mammalian β-globin LCRs

DNA sequences of much of the β-globin LCR are now available from several mammalian species, including human (Li et al., 1985; Yu et al., 1994), galago (Slightom et al., 1997), rabbit (Hardison et al., 1993; Slightom et al., 1997), goat (Li et al., 1991) and mouse (Moon and Ley, 1990; Hug et al., 1992; Jimenez et al., 1992). The remainder of this review will discuss insights into the regions required for LCR function based on patterns of conservation revealed by a simultaneous alignment of these DNA sequences (Slightom et al., 1997). Key features of the LCRs from the different mammals are mapped in Fig. 3.

3.1. Conservation of number and order of HSs

All of the known mammalian β-globin LCRs have segments homologous to HS1, HS2 and HS3 (Fig. 3). HS4 is likely present in all these species as well, although the currently available goat sequence does not include the region corresponding to HS4. Homologs to human HS5 are found in galago (Slightom et al., 1997) and mouse (A. Reik, M. Bender and M. Groudine, pers. commun.), suggesting a wide distribution of HS5 as well. If HS5 is present in rabbit, it does not occur in the same place as in human or galago. Thus the presence
of four (HS1-4) and possibly all five major HSs is conserved in these eutherian mammals. This conservation extends even further back in evolutionary time, with at least LCR HS2, HS3, HS4, and possibly HS1 being found in Australian marsupials and monotremes (R. Baird, J. Kuliwaba, R. Hope, M. Goodman et al., personal communication).

[Fig. 3 graphic: maps of the human, galago, rabbit, goat and mouse LCRs on a scale of 0-22 000 bp, marking HS5, HS4, HS3, HS2 and HS1; one region is labelled 'not sequenced in goat', L1Md repeats are marked in mouse, and the mouse map notes a distance of 6.5 kb in the a allele (BALB/c) and 5.0 kb in the b allele (C57BL/6J).]

Fig. 3. Mammalian β-globin LCRs. Maps of the β-globin LCRs of human, galago, rabbit, goat and mouse show the positions of HS cores in humans, their homologs in other species, positions and identities of repeats, and the new regions sequenced in rabbit (double-arrowed lines under the rabbit β-globin LCR map). The HS cores are shown as boxes with distinctive fills, long interspersed repeats (L1s) are open arrowed boxes, and short interspersed repeats are triangles (in the latter two cases, the icon points in the direction that the repeat is oriented). Short repeats are Alu repeats in humans, both type I and type II Alu repeats in galago, C repeats in rabbits, Nia and D repeats in goats, and B1 repeats in mouse. An insertion between positions 14 419 and 14 599 of galago does not match any known short or long repeats, and it may represent a newly discovered repeat. An insertion of 81 bp that begins at position 3614 of galago is a novel short insertion sequence.

3.2. Conserved sequences within the LCR

We used the program yama2 (Chao et al., 1994) to compute a simultaneous alignment of the available mammalian β-globin LCR sequences. We then used three different approaches to search for conserved sequences using a variety of criteria (Slightom et al., 1997; Stojanovic et al., 1997). The first method computes the information content of each column (Schneider et al., 1986); the positions of the 10 and 30 blocks with the highest information content (HIC) are displayed in Fig. 4. The information content reflects both the amount of variability in a column in a multiple alignment as well as the base composition for the sequences being aligned, and provides a finely graded function for measuring conservation. The second method simply finds runs of exact matches; Fig. 4 plots positions with seven or more consecutive invariant columns such that sequences from some minimal number of species align (four in one case, three in the other). A third method (Stojanovic et al., 1997) was devised to better reflect matches found at protein binding sites. Specifically, Fig. 4 identifies all runs of six or more columns possessing a plausible consensus sequence, i.e., each row in that region can have at most one mismatch with the (a priori unspecified) consensus. This requirement mimics the documented ability of some proteins to bind equally well to similar but not identical sequences. For instance, GATA1 binds to AGATAA or to TGATAG, which each differ in only one position from AGATAG.
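
The one-mismatch criterion of the third method can be illustrated with a short Python sketch. This is not the algorithm of Stojanovic et al. (1997); the function names (`mismatches`, `one_mismatch_consensus`, `consensus_windows`), the fixed window width, and the choice to treat a gap character as a mismatch are all assumptions made for the example. The sketch relies on one observation: a consensus that is within one mismatch of every row must be within one mismatch of the first row, so it is enough to try the first row and its single-base variants.

```python
from typing import List, Optional

DNA = "ACGT"


def mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def one_mismatch_consensus(rows: List[str]) -> Optional[str]:
    """Return a consensus within one mismatch of every row, or None.

    Candidates are the first row and every single-base variant of it;
    no other string can be within one mismatch of the first row.
    """
    first = rows[0]
    candidates = [first]
    for i, base in enumerate(first):
        for alt in DNA:
            if alt != base:
                candidates.append(first[:i] + alt + first[i + 1:])
    for cand in candidates:
        if "-" in cand:  # assumption: a consensus may not contain a gap
            continue
        if all(mismatches(cand, row) <= 1 for row in rows):
            return cand
    return None


def consensus_windows(alignment: List[str], width: int = 6) -> List[int]:
    """Start columns of every window of `width` columns that has a consensus."""
    length = len(alignment[0])
    hits = []
    for start in range(length - width + 1):
        window = [row[start:start + width] for row in alignment]
        if one_mismatch_consensus(window) is not None:
            hits.append(start)
    return hits


# The two GATA1 binding sites quoted in the text share a one-mismatch consensus.
gata_sites = ["AGATAA", "TGATAG"]
print(one_mismatch_consensus(gata_sites))

# Toy three-row alignment (hypothetical sequences, not LCR data).
toy_alignment = ["CCAGATAACC", "CCTGATAGCC", "CCAGATAGCC"]
print(consensus_windows(toy_alignment, width=6))
```

For the two GATA1 sites the search reports some string that both sites match with at most one mismatch. AGATAG is one such string, but the function may return a different, equally valid one, because it stops at the first candidate that satisfies the criterion.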
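
As a rough illustration of the first method in Section 3.2, the information content of an alignment column can be written as the relative entropy, in bits, of the column's base frequencies against the overall base composition of the aligned sequences. The sketch below assumes that formulation; the names `background_frequencies`, `column_information` and `information_profile` are hypothetical, gap characters are simply ignored, and any small-sample correction used in the published analysis is not reproduced.

```python
import math
from collections import Counter
from typing import Dict, List

DNA = "ACGT"


def background_frequencies(alignment: List[str]) -> Dict[str, float]:
    """Overall base composition of the aligned sequences, ignoring gaps."""
    counts = Counter(base for row in alignment for base in row if base in DNA)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("alignment contains no A, C, G or T characters")
    return {base: counts[base] / total for base in DNA}


def column_information(column: str, background: Dict[str, float]) -> float:
    """Relative entropy (bits) of one column against the background."""
    bases = [b for b in column if b in DNA]
    if not bases:
        return 0.0  # a column of gaps carries no information in this sketch
    n = len(bases)
    info = 0.0
    for base, count in Counter(bases).items():
        freq = count / n
        info += freq * math.log2(freq / background[base])
    return info


def information_profile(alignment: List[str]) -> List[float]:
    """Information content of every column of an alignment."""
    background = background_frequencies(alignment)
    width = len(alignment[0])
    return [
        column_information("".join(row[i] for row in alignment), background)
        for i in range(width)
    ]


# Toy alignment (hypothetical sequences, not LCR data).
toy_alignment = ["ACGTAC", "ACGAAC", "ACGCAC", "ACGGAC"]
for position, bits in enumerate(information_profile(toy_alignment)):
    print(position, round(bits, 3))
```

With a uniform background an invariant column scores 2 bits and a column in which all four bases are equally frequent scores 0, so the score falls as variability rises; a skewed base composition shifts the scores, which is the dependence on base composition noted in the text. Blocks of highest information content could then be ranked by summing the profile over sliding windows.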
`SKI Exhibit 2015
`Page 6 of 22
`
`

`

`
`
`R.Hardison et al./ Gene 205 ( 1997) 73-94
`
`
`
`79
`
`[Fig. 4 plot. Rows, top to bottom: HS cores (HS4, HS3, HS2, HS1); HSs; 10 HIC; 30 HIC; l=7, n=4; l=7, n=3;
`1 mismatch; DPF; AP1/NFE2; CACBP; GATA; TATTT/ATTTA. Horizontal axis: position, 4000 to 16000.]
`
`Fig. 4. Positions of selected features revealed by the multiple sequence alignment. The positions of the HS cores
`are shown on the top line. Reported positions of DNase HSs are on the second line. The next five lines show the
`positions of conserved sequences, as detected by three different methods. Differential phylogenetic footprints
`(DPFs) are on line 8. Conserved matches to consensus binding sites for the indicated proteins are shown on
`lines 9-11. The last line shows positions of matches between the unspecified consensus (one mismatch, line 7)
`and the motifs TATTT or ATTTA. GenBank entry HUMHBB begins at 2688 in the current human sequence file.
`
`These various methods for finding conserved segments produce generally congruent results, with substantial
`overlap in the blocks detected by each of the methods. This indicates that the combination of the various
`methods for finding conserved sequences is quite robust. As expected, all three methods find strongly conserved
`blocks within the HS cores, as well as juxtaposed to them (in particular, a phylogenetic footprint located just
`3' to the HS4 core and an AP1 binding site immediately 5' to the HS3 core). In addition, some, but not all, of
`the regions between the cores are conserved, with some phylogenetic footprints as strongly conserved as those
`in the HS cores. Notable conserved regions are as much as 1000 bp 5' to and 3' to the HS3 core and also 5' to
`the HS2 core. Interestingly, a conserved sequence is located between HS2 and HS1 as well.
`
`The pattern of many conserved blocks in certain
`
`clusters that some single-copy regions do not align (Hardison et al., 1994; Hardison and Miller, 1993). One
`notable example is the intergenic region between the δ- and β-globin genes in mouse vs. human comparisons
`(Hardison et al., 1997). This shows that in the time since the ancestors to rodents and primates separated,
`some sequences in this locus (presumably those not necessary for function) have diverged extensively. Thus,
`the phylogenetic footprints in the LCR are indeed candidates for functional sequences.
`
`3.3. Correspondence between DNase HSs and conserved blocks
`
`The positions of DNase HSs are also plotted in Fig. 4. Several HSs are reported around each of the cores.
`Although some of this heterogeneity simply reflects multiple reports of the same HS, some of it results from
`a wide distribution of cleavage. For instance, the regions around HS3 and HS2 have DNase cleavage sites outside
`the minimal cores (Philipsen et al., 1990; Talbot et al., 1990). DNase HSs have been mapped at approximate
`positions 6200 (Stamatoyannopoulos et al., 1995) and 6500 (Tuan et al., 1985), which are about 1000 bp 5' to
`the HS3 core. This is the same region that displays multiple conserved sequence blocks, showing a good
`congruence between HS mapping and conserved sequences not only in the cores but far outside them as well.
`
`3.4. Repeated, conserved sequence motifs in the LCR and proteins that bind to them
`
`The distribution of conserved binding sites for some prominent proteins involved in globin gene regulation
`was determined by searching for matches between the consensus sequence for the protein binding sites and the
`'unspecified consensus' computed allowing one mismatch (see Section 3.2 above). As shown in Fig. 4, three
`conserved segments matching the consensus AP1 binding site (
