GENE
`
`
`
AN INTERNATIONAL JOURNAL ON
GENES AND GENOMES
`
`ELSEVIER
`
`
`
`Gene 205 (1997) 73-94
`
`Review
`
Locus control regions of mammalian β-globin gene clusters: combining phylogenetic analyses and experimental results to gain functional insights
`
`
`
`
`
`
Ross Hardison a,b,*, Jerry L. Slightom c, Deborah L. Gumucio d, Morris Goodman e, Nikola Stojanovic f, Webb Miller b,f
`
a Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
b Center for Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
c Molecular Biology Unit 7242, Pharmacia and Upjohn, Inc., Kalamazoo, MI 49007, USA
d Department of Anatomy and Cell Biology, University of Michigan Medical School, Ann Arbor, MI 48109-0616, USA
e Department of Anatomy and Cell Biology, Wayne State School of Medicine, Detroit, MI 48201, USA
f Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
`
`
`
`Accepted 22 July 1997
`
`Abstract
`
Locus control regions (LCRs) are cis-acting DNA segments needed for activation of an entire locus or gene cluster. They are operationally defined as DNA sequences needed to achieve a high level of gene expression regardless of the position of integration in transgenic mice or stably transfected cells. This review brings together the large amount of DNA sequence data from the β-globin LCR with the vast amount of functional data obtained through the use of biochemical, cellular and transgenic experimental systems. Alignment of orthologous LCR sequences from five mammalian species locates numerous conserved regions, including previously identified cis-acting elements within the cores of nuclease hypersensitive sites (HSs) as well as conserved regions located between the HS cores. The distribution of these conserved sequences, combined with the effects of LCR fragments utilized in expression studies, shows that important sites are more widely distributed in the LCR than previously anticipated, especially in and around HS2 and HS3. We propose that the HS cores plus HS flanking DNAs comprise a 'unit' to which proteins bind and form an optimally functional structure. Multiple HS units (at least three: HS2, HS3 and HS4 cores plus flanking DNAs) together establish a chromatin structure that allows the proper developmental regulation of genes within the cluster. © 1997 Elsevier Science B.V.
`
`
`
`
`
`
`
`
`
`
`
`Keywords: Hemoglobin; Sequence conservation; Enhancement; Chromatin; Domain opening; DNA-binding proteins
`
`
`
`
`
`
`
1. Expression patterns of mammalian hemoglobin gene clusters

The genes that encode the polypeptides of the α2β2 tetramer of hemoglobin are located in two separate clusters in birds and mammals. In humans, the β-like globin genes (including pseudogenes denoted by the prefix ψ) are clustered in the array 5'-ε-Gγ-Aγ-ψη-δ-β-3' that covers about 75 kb on chromosome 11p15.4, and the α-like globin genes are in a 40-kb cluster, 5'-ζ2-ψζ1-ψα2-ψα1-α2-α1-θ-3', very close to the telomere of the short arm of chromosome 16. Expression of the α- and β-like globin genes is limited to erythroid cells and is balanced so that equal amounts of the two polypeptides are available to assemble the hemoglobin heterotetramer. Expression of genes within the clusters is developmentally controlled, so that different forms of hemoglobin are produced in embryonic, fetal and adult life (reviewed in Stamatoyannopoulos and Nienhuis, 1994).

* Corresponding author. Present address: Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 206 Althouse Laboratory, University Park, PA 16802, USA. Tel.: +1 814 8630113; Fax: +1 814 8637024; e-mail: rch8@psu.edu

Abbreviations: LCR, locus control region; HS, hypersensitive site; HIC, highest information content; DPF, differential phylogenetic footprint; CACBPs, proteins that bind to the CACC motif; MAR, matrix attachment region; bHLH, basic helix-loop-helix; MEL, murine erythroleukemia.
`
`0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved.
`
`
PII S0378-1119(97)00474-5
`
`SKI Exhibit 2042
`Page 1 of 23
`
`

`

`
This process of hemoglobin switching is an excellent model system for increasing our understanding of the molecular mechanisms of differential gene expression during development. These developmental switches also offer new approaches to therapy for inherited anemias. For example, continued expression of the normally fetal HbF (α2γ2) in adults will reduce the severity of symptoms of patients producing an abnormal β-globin in sickle cell disease and possibly also in patients lacking sufficient β-globin (β-thalassemia). An understanding of the molecular basis of globin gene switching will facilitate development of new therapeutic strategies (pharmacological and/or DNA transfer) that continue γ-globin gene expression in adults.

In addition to biochemical and genetic approaches to studying regulation of globin genes, phylogenetic approaches are also highly informative. The detailed study of globin gene clusters in many mammalian species has provided a rich resource of information from which to glean further insight into not only the evolution of the gene clusters but also their regulation. The β-globin gene clusters have been extensively studied in human, the prosimian galago, the lagomorph rabbit, the artiodactyls goat and cow, and the rodent mouse. Maps of these gene clusters are shown in Fig. 1, and aspects of their evolution and regulation have been reviewed (Collins and Weissman, 1984; Goodman et al., 1987; Hardison and Miller, 1993). The ε-globin gene is at the 5' end of all the mammalian globin gene clusters and is expressed only in embryonic red cells in all cases. In most eutherian mammals, expression of the γ-globin gene is also limited to embryonic red cells, but in anthropoid primates, its expression continues and predominates in fetal red cells. The appearance of this new pattern of fetal expression of the γ-globin genes coincides roughly with the duplication of the genes in primate evolution, which leads to the hypothesis that the duplication allowed the changes that caused the fetal recruitment (Hayasaka et al., 1993). The β-globin gene is expressed after birth in all mammals, but in galago, mouse and rabbit, its expression initiates and predominates in the fetal liver (arguing that fetal expression of the β-globin gene is the ancestral state). The recruitment of γ-globin genes for fetal expression in anthropoid primates is accompanied by a corresponding delay in expression of the β-globin gene.

Comparisons of DNA sequences among mammalian β-globin gene clusters can reveal candidates for sequences involved in shared regulatory functions; these will be detected as conserved sequence blocks, or phylogenetic footprints, found in all mammals (Gumucio et al., 1992; Hardison et al., 1993). Notable similarities are found in alignments of the proximal 5' flanking regions of the orthologous β-like globin genes, consistent with their roles as promoters and other regulators of expression. In addition, striking and extensive sequence matches are found at the far 5' end of the gene clusters, in the region that we now recognize as the locus control region (LCR), which is the dominant, distal control sequence for these gene clusters. Sequence comparisons can be used also to identify candidates for regulatory elements that lead to differences in expression patterns. In this case, one searches for sequences conserved in the set of mammals that show a particular phenotype but
`
[Figure 1 graphic: maps of the human, galago, rabbit, goat and mouse β-globin gene clusters and the inferred ancestral eutherian cluster, drawn on a 0-140 kb scale, with one branch annotated 'duplication of γ and fetal recruitment'.]

Fig. 1. Evolution of β-globin gene clusters in eutherian mammals. The inferred ancestral gene cluster and the branching pathways to contemporary gene clusters are shown. The time of expression during development is indicated beneath the box representing each gene; E, embryonic; F, fetal; A, adult. The boxes for orthologous genes have the same shading.
`
`
`
`
`
`
which differ in the species with a different pattern of expression. For instance, such differential phylogenetic footprints (Gumucio et al., 1994) led to the discovery of a sequence implicated in fetal-specific expression of the γ-globin genes in higher primates (Jane et al., 1992) and a sequence that binds several proteins implicated in fetal silencing of the γ-globin gene (Gumucio et al., 1994). In this review, we summarize the results of sequence comparisons for both types of regulatory element in the LCR.

2. General features of mammalian β-globin LCRs

2.1. DNase hypersensitive sites 5' to the β-globin gene cluster

The β-globin LCR was initially discovered as a set of DNase hypersensitive sites located 5' to the ε-globin gene (Tuan et al., 1985; Forrester et al., 1986, 1987). At least 5 DNase HSs, called HS1-HS5 (Fig. 2), have been characterized within the region that provided the original gain-of-function effects described below (Grosveld et al., 1987), and we will refer to this region with all five HSs as the 'full LCR.' The presence of DNase HSs is indicative of an altered chromatin structure associated with important cis-regulatory regions (Gross and Garrard, 1988). Some of these sites, especially HS3, appear preferentially in erythroid nuclei (Dhar et al., 1990), but in contrast to the DNase hypersensitive sites at promoters, all are developmentally stable, i.e., present in embryonic, fetal and adult red cells (Forrester et al., 1986). Thus, the LCR marks an open chromatin domain for the β-like globin gene cluster in erythroid cells from all developmental stages, and functional assays implicate the LCR in generating this open domain, as described in the next section.

2.2. Position-independent expression and enhancement

As illustrated in Fig. 2, the β-globin LCR will confer high-level, position-independent expression on globin gene constructs in transgenic mice (reviewed in Townes and Behringer, 1990; Grosveld et al., 1993). In the absence of the LCR, the human β- or γ-globin gene is expressed in only about half of the lines of transgenic mice carrying the integrated gene, and expression levels are low relative to those of the endogenous mouse globin genes. The lack of expression in many lines of transgenic mice is presumed to result from negative position effects generated by adjacent sequences at the site of integration, which prevent expression of the transgene in erythroid cells. However, when a large DNA fragment containing the full LCR is linked to the β-globin gene, all resulting transgenic mouse lines express the gene, and at a level comparable to that of the endogenous globin genes (Grosveld et al., 1987). Hence, the negative position effects are no longer observed, indicating that either a strong domain-opening activity (that overrides the negative effects of adjacent sequences), or an insulator that blocks the effects of adjacent sequences, or both, are present in the LCR. The high level of expression of the transgene indicates the presence of enhancers in the LCR as well. Both enhancers and LCRs increase
`
[Figure 2 graphic: maps of the human β-globin gene cluster, the Hispanic (γδβ)-thalassemia deletion, and transgene constructs in mice on a 0-80 kb scale, with DNase HSs marked and a table of yes/no outcomes that includes chromatin state (open or closed).]

Fig. 2. Summary of the major effects of the β-globin locus control region.
`
`
the probability that a locus will be in a transcriptionally competent state without affecting the transcription rate in a cell actively expressing that locus (Walters et al., 1995, 1996; Wijgerde et al., 1996). This further argues that one of the major functions of the LCR is to open a chromatin domain around the locus in erythroid cells. In fact, deletion of most of the LCR but not the β-globin genes, e.g., as occurs in Hispanic (γδβ)-thalassemia, leaves the gene cluster in a chromatin conformation that is inaccessible to DNase I, and the globin genes are not expressed (Forrester et al., 1990). Thus, this loss-of-function analysis also shows that the LCR is necessary for the establishment and maintenance of an open chromatin domain within which the globin genes are expressed (Fig. 2).

Minimal DNA sequences that confer position-independent expression of a linked β-globin gene in transgenic mice have been determined in regions around the sites of strong DNase cleavage (reviewed in Grosveld et al., 1993). These regions are referred to as the 'hypersensitive site cores' for HS1, HS2, HS3 and HS4.

2.3. Copy-number dependent expression

Transgene constructs that confer full protection from position effects should not be affected by any adjacent sequences. Thus, when the construct is integrated in multiple copies, as is frequently the case in transgenic mice lines and in stably transfected cultured cells, each copy should be expressed independently of other copies, resulting in a level of expression that increases linearly with the number of copies. This 'copy-number-dependent' expression has been observed in some cases with particular fragments of the β-globin LCR (Talbot et al., 1989), as well as with the chicken β/ε-globin enhancer (Reitman and Felsenfeld, 1990). Other experiments with fragments of the β-globin LCR do not show a clear dependence on copy-number (Ryan et al., 1989), and occasional studies show inverse relationships between copy number and level of expression (Morley et al., 1992; TomHon et al., 1997). Although the minimal sequences that will achieve full dependence on copy number are not yet known, this property appears to require sequences from both the LCR and the gene proximal region (Lloyd et al., 1992; Fraser et al., 1993; Li and Stamatoyannopoulos, 1994b). For the γ-globin gene, copy-number dependence requires both sequences 3' to the γ-globin gene and one or more elements in the HS cores (Stamatoyannopoulos et al., 1997).

2.4. Replication of the locus

In addition to the strong effects of the β-globin LCR on chromatin opening and enhancement of expression, the LCR also has a dominant effect on the regulation of replication in the locus. The Hispanic (γδβ)-thalassemia deletion, which removes HS2 through HS5 (Fig. 2), not only leaves the locus in a closed chromatin conformation but also delays the time of replication from early to late in S phase in erythroid cells (Forrester et al., 1990). Replication of the β-globin gene locus normally initiates just 5' to the β-globin gene (Kitsberg et al., 1993), which is 50 kb 3' to the LCR. Surprisingly, chromosomes with the Hispanic thalassemia deletion no longer use the normal replication origin, even though it is intact, but instead use an origin located 3' to the β-globin locus (Aladjem et al., 1995).

2.5. Developmental regulation

The effects of the LCR, if any, on developmental regulation are more complicated to analyze. Several lines of evidence show that sequences proximal to the genes are sufficient to specify expression at a given developmental stage. In the absence of an LCR, human β-like globin genes are expressed at the 'correct' developmental stage in transgenic mice, i.e., mimicking the expression pattern of the orthologous endogenous mouse genes (summarized in Trudel and Costantini, 1987). In fact, developmental switching can occur between human γ- and β-globin genes in transgenic mice in the absence of an LCR (Starck et al., 1994), demonstrating that the LCR is not essential for switching. Point mutations in the promoter of the human γ-globin genes are associated with prolonged expression in the adult stage, i.e., hereditary persistence of fetal hemoglobin (reviewed in Stamatoyannopoulos et al., 1994). Detailed studies of the human ε- and γ-globin genes in constructs also containing LCR fragments have revealed sequences extending up to about 0.8 kb away from the gene that have both positive and negative effects on developmental control (Stamatoyannopoulos et al., 1993; Trepicchio et al., 1993; Li and Stamatoyannopoulos, 1994b; Trepicchio et al., 1994). Recent studies in transgenic mice show that the human γ-globin gene is expressed fetally, whereas the orthologous galago γ-globin gene is expressed embryonically, in the context of an otherwise identical transgene construct (TomHon et al., 1997). This recapitulation of developmental specificity shows that the dominant determinants of developmental timing are encoded by nucleotide differences within the 4.0-kb fragment containing the γ-globin gene.

Although developmental switches in expression can occur in the absence of the LCR, it is still possible that, when present, the LCR participates directly in developmental regulation (e.g., Stamatoyannopoulos, 1991; Wijgerde et al., 1996). Addition of the LCR to a single human β- or γ-globin gene will alter developmental control (Fig. 2), leading to precocious expression of the β-globin gene in embryonic red cells and expression of the γ-globin gene in fetal and adult stages (Enver et al., 1989; Behringer et al., 1990). Inclusion of both γ- and
`
`
`
`
β-globin genes will improve the developmental switching, leading to a model of competition between promoters for the LCR (Enver et al., 1990). The order of multiple globin genes in LCR-containing constructs also influences their regulation (Hanscombe et al., 1991; Peterson and Stamatoyannopoulos, 1993). Although these data can be explained by a competition model, the apparent loss of developmental control seen in the presence of an LCR could result from the increased sensitivity of the assays, and the effects of additional genes in the construct can be explained by gene order effects (such as transcriptional interference from the upstream gene) as opposed to proximity to the LCR (Martin et al., 1996).

The effects that led to models of competition in developmental regulation are seen primarily for the regulation of the human β-globin gene. The ε-globin (Raich et al., 1990; Shih et al., 1990), α-globin and ζ-globin (Pondel et al., 1992; Liebhaber et al., 1996) genes are autonomously regulated during development in the presence of LCR-like elements, and constructs containing larger LCR fragments with the γ-globin gene also show autonomous regulation (Dillon and Grosveld, 1991).

2.6. Models for LCR action

Many studies are consistent with the hypothesis that several DNase HSs in the LCR work together in a holocomplex to generate the several effects enumerated above. One explicit model stating that each HS has a predominant effect on only one specific gene in the cluster (Engel, 1993) can be excluded since deletions of single HSs in the context of entire gene clusters either have little effect or affect expression of all the genes in the locus (reviewed below). Indeed, removal of any single HS makes the entire human β-globin gene cluster more sensitive to position effects in transgenic mice (Milot et al., 1996), arguing that this defining property of the LCR requires all of the HSs. This result contrasts with the implications of reports on the ability of individual HSs to provide position-independent, copy-number dependent expression (e.g., Fraser et al., 1993), and the molecular basis for this apparent discrepancy is not clear. Functional interactions between the HSs have been demonstrated, but require DNA sequences inside and outside the core HSs (reviewed below). Thus, although several individual HSs do exhibit substantial function alone, it is most likely that they normally interact in a holocomplex (Ellis et al., 1996) that encompasses a substantial amount of DNA.

The ability of the LCR to open a chromosomal domain suggests that it recruits chromatin-remodeling activities such as SWI/SNF (Cote et al., 1994; Peterson and Tamkun, 1995) and/or histone acetyl transferases (Brownell et al., 1996) to this locus, but only in erythroid cells. This could occur indirectly, with recognition of specific sequences in the LCR by trans-activator proteins such as members of the AP1 family of proteins and recruitment of chromatin remodeling and/or histone modifying activities by specific interaction between these enzymes and the trans-activator. For instance, the co-activator proteins CBP and P300 are histone acetyl transferases and also interact with AP1 (Ogrysko et al., 1996). In addition, some DNA sequences in the LCR could recruit chromatin remodeling and modifying activities directly.

Several other issues remain unresolved. For instance, the LCR could influence all or several of the genes in the locus at once (Bresnick and Felsenfeld, 1994; Martin et al., 1996) or it could serve to activate expression of one gene at a time (Wijgerde et al., 1995). If the LCR does influence predominantly one gene at a time, it could do so by interaction directly with the target gene with looping out of DNA between this distal regulator and the proximal regulatory elements (Grosveld et al., 1993) or the positive effect of the LCR could 'track' along the DNA to the target gene (Tuan et al., 1992). Neither the molecular targets of the direct interactions (in the former model) nor the molecular basis of the tracking effects (in the latter model) are known. For instance, 'tracking' could involve movement of transcription factors along the DNA, or it could result from spreading of the active chromatin domain down the locus.

3. Sequence analysis of mammalian β-globin LCRs

DNA sequences of much of the β-globin LCR are now available from several mammalian species, including human (Li et al., 1985; Yu et al., 1994), galago (Slightom et al., 1997), rabbit (Hardison et al., 1993; Slightom et al., 1997), goat (Li et al., 1991) and mouse (Moon and Ley, 1990; Hug et al., 1992; Jimenez et al., 1992). The remainder of this review will discuss insights into the regions required for LCR function based on patterns of conservation revealed by a simultaneous alignment of these DNA sequences (Slightom et al., 1997). Key features of the LCRs from the different mammals are mapped in Fig. 3.

3.1. Conservation of number and order of HSs

All of the known mammalian β-globin LCRs have segments homologous to HS1, HS2 and HS3 (Fig. 3). HS4 is likely present in all these species as well, although the currently available goat sequence does not include the region corresponding to HS4. Homologs to human HS5 are found in galago (Slightom et al., 1997) and mouse (A. Reik, M. Bender and M. Groudine, pers. commun.), suggesting a wide distribution of HS5 as well. If HS5 is present in rabbit, it does not occur in the same place as in human or galago. Thus the presence
`
`
`
`
[Figure 3 graphic: maps of the β-globin LCRs of human, galago, rabbit, goat and mouse on base-pair scales (0-22000, with one map on a 0-16000 scale), marking HS1 to HS5 and L1Md repeats. Annotations include a region not sequenced in goat and a distance of 6.5 kb in the a allele (BALB/c) versus 5.0 kb in the b allele (C57BL/6J) of mouse.]

Fig. 3. Mammalian β-globin LCRs. Maps of the β-globin LCRs of human, galago, rabbit, goat and mouse show the positions of HS cores in humans, their homologs in other species, positions and identities of repeats, and the new regions sequenced in rabbit (double-arrowed lines under the rabbit β-globin LCR map). The HS cores are shown as boxes with distinctive fills, long interspersed repeats (L1s) are open arrowed boxes, and short interspersed repeats are triangles (in the latter two cases, the icon points in the direction that the repeat is oriented). Short repeats are Alu repeats in humans, both type I and type II Alu repeats in galago, C repeats in rabbits, Nla and D repeats in goats, and B1 repeats in mouse. An insertion between positions 14419 and 14599 of galago does not match any known short or long repeats, and it may represent a newly discovered repeat. An insertion of 81 bp that begins at position 3614 of galago is a novel short insertion sequence.
`
`
`
of four (HS1-4) and possibly all five major HSs is conserved in these eutherian mammals. This conservation extends even further back in evolutionary time, with at least LCR HS2, HS3, HS4, and possibly HS1 being found in Australian marsupials and monotremes (R. Baird, J. Kuliwaba, R. Hope, M. Goodman et al., personal communication).

3.2. Conserved sequences within the LCR

We used the program yama2 (Chao et al., 1994) to compute a simultaneous alignment of the available mammalian β-globin LCR sequences. We then used three different approaches to search for conserved sequences at a variety of criteria (Slightom et al., 1997; Stojanovic et al., 1997). The first method computes the information content of each column (Schneider et al., 1986); the positions of the 10 and 30 blocks with the highest information content (HIC) are displayed in Fig. 4. The information content reflects both the amount of variability in a column in a multiple alignment as well as the base composition for the sequences being aligned, and provides a finely graded function for measuring conservation. The second method simply finds runs of exact matches; Fig. 4 plots positions with seven or more consecutive invariant columns such that sequences from some minimal number of species align (four in one case, three in the other). A third method (Stojanovic et al., 1997) was devised to better reflect matches found at protein binding sites. Specifically, Fig. 4 identifies all runs of six or more columns possessing a plausible consensus sequence, i.e., each row in that region can have at most one mismatch with the (a priori unspecified) consensus. This requirement mimics the documented ability of some proteins to bind equally well to similar but not identical sequences. For instance, GATA1 binds to AGATAA or to TGATAG, which each differ in only one position from AGATAG.
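The three conservation criteria described in Section 3.2 can be illustrated with a short Python sketch. It is not the code used for the published analysis: the function names, the uniform background base composition, the treatment of gaps ('-'), and the brute-force consensus search are all assumptions made for illustration. In particular, the information content is written here as a relative entropy against a background composition, without any small-sample correction, and the one-mismatch test is applied to a single gap-free window rather than scanned along a whole alignment.

```python
import math
from collections import Counter
from itertools import product


def column_information(column, background=None):
    """Information content (bits) of one alignment column, computed as
    sum(p * log2(p / q)) over observed bases; gaps are ignored.
    The uniform background q = 0.25 is an assumption."""
    bases = [b for b in column if b in "ACGT"]
    if not bases:
        return 0.0
    background = background or {b: 0.25 for b in "ACGT"}
    n = len(bases)
    return sum((c / n) * math.log2((c / n) / background[b])
               for b, c in Counter(bases).items())


def invariant_runs(rows, min_len=7, min_species=4):
    """Half-open (start, end) column ranges of at least min_len consecutive
    columns in which at least min_species rows are aligned (not a gap)
    and all aligned rows carry the same base."""
    width = len(rows[0])
    runs, start = [], None
    for i in range(width + 1):
        ok = False
        if i < width:
            col = [r[i] for r in rows if r[i] != "-"]
            ok = len(col) >= min_species and len(set(col)) == 1
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def one_mismatch_consensus(rows):
    """Return a consensus string that differs from every row in at most one
    position, or None if no such string exists. Only bases observed in each
    column need to be tried, because an unobserved base mismatches all rows."""
    choices = [sorted(set(col)) for col in zip(*rows)]
    for cand in product(*choices):
        if all(sum(a != b for a, b in zip(cand, r)) <= 1 for r in rows):
            return "".join(cand)
    return None


# The GATA1 example from the text: both sites are one mismatch from AGATAG.
print(one_mismatch_consensus(["AGATAA", "TGATAG"]))  # AGATAG
```

The brute-force search is practical only for short windows such as the six-column runs mentioned above; the method of Stojanovic et al. (1997) is presumably more efficient, and nothing here should be read as a description of its internals.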
`
`
These various methods for finding conserved segments produce generally congruent results, with substantial overlap in the blocks detected by each of the methods. This indicates that the combination of the various methods for finding conserved sequences is quite robust. As expected, all three methods find strongly conserved blocks within the HS cores, as well as juxtaposed to them (in particular, a phylogenetic footprint located just 3' to the HS4 core and an AP1 binding site immediately 5' to the HS3 core). In addition, some, but not all, of the regions between the cores are conserved, with some phylogenetic footprints as strongly conserved as those in the HS cores. Notable conserved regions are as much as 1000 bp 5' to and 3' to the HS3 core and also 5' to

[Figure 4 graphic: feature tracks plotted against position (4000-16000) in the alignment, labeled HS cores, HSs, 10 HIC, 30 HIC, l=7 n=4, l=7 n=3, 1 mismatch, DPF, AP1/NFE2, GATA, CACBP and TATTT/ATTTA.]

Fig. 4. Positions of selected features revealed by the multiple sequence alignment. The positions of the HS cores are shown on the top line. Reported positions of DNase HSs are on the second line. The next five lines show the positions of conserved sequences, as detected by three different methods. Differential phylogenetic footprints (DPFs) are on line 8. Conserved matches to consensus binding sites for the indicated proteins are shown on lines 9-11. The last line shows positions of matches between the one mismatch unspecified consensus (line 7) and the motifs TATTT or ATTTA. GenBank entry HUMHBB begins at 2688 in the current human sequence file.

clusters that some single-copy regions do not align (Hardison et al., 1994; Hardison and Miller, 1993). One notable example is the intergenic region between the δ- and β-globin genes in mouse vs. human comparisons (Hardison et al., 1997). This shows that in the time since the ancestors to rodents and primates separated, some sequences in this locus (presumably those not necessary for function) have diverged extensively. Thus, the phylogenetic footprints in the LCR are indeed candidates for functional sequences.

3.3. Correspondence between DNase HSs and conserved blocks

The positions of DNase HSs are also plotted in Fig. 4. Several HSs are reported around each of the cores. Although some of this heterogeneity simply reflects multiple reports of the same HS, some of it results from a wide distribution of cleavage. For instance, the regions around HS3 and HS2 have DNase cleavage sites outside the minimal cores (Philipsen et al., 1990; Talbot et al., 1990). DNase HSs have been mapped at approximate positions 6200 (Stamatoyannopoulos et al., 1995) and 6500 (Tuan et al., 1985), which are about 1000 bp 5' to the HS3 core. This is the same region that displays multiple conserved sequence blocks, showing a good congruence between HS mapping and conserved sequences not only in the cores but far outside them as well.

3.4. Repeated, conserved sequence motifs in the LCR and proteins that bind to them

The distribution of conserved binding sites for some prominent proteins involved in globin gene regulation was determined by searching for matches between the consensus sequence for the protein binding sites and the
`
`
`
`'unspecified consensus' computed allowing one mis­
`
`
`
`the HS2 core. Interestingly, a conserved sequence is
`
`
`match (see Section 3.2. above). As shown in Fig. 4, three
`
`located between HS2 and HSI as well.
`
`
`
`
`conserved segments matching the consensus APl bind­
`
`
`ing site (TGASTCA) are found close to or within the
`
`
`The pattern of many conserved blocks in c
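The kind of consensus search described in Section 3.4 can be illustrated with a minimal Python sketch. It is not the procedure used in this review. It assumes only that a binding-site consensus such as TGASTCA is written in the standard IUPAC nucleotide codes (S = C or G) and that the sequence being scanned is a plain string in which N marks an unspecified position. The names `IUPAC`, `compatible` and `find_sites` are hypothetical, and only the forward strand is scanned.

```python
# Standard IUPAC nucleotide ambiguity codes (symbol -> bases it stands for).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """Return True if two IUPAC symbols share at least one base.

    Symbols outside the table (for example alignment gaps) match nothing.
    """
    return bool(set(IUPAC.get(a, "")) & set(IUPAC.get(b, "")))


def find_sites(seq, site):
    """Return 0-based start positions where `site` is compatible with `seq`.

    Assumption for this sketch: forward strand only, no mismatches allowed
    beyond what the ambiguity codes themselves permit.
    """
    seq, site = seq.upper(), site.upper()
    width = len(site)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if all(compatible(s, c) for s, c in zip(site, window)):
            hits.append(start)
    return hits


# Made-up test sequence containing TGACTCA and TGAGTCA.
print(find_sites("GGTGACTCATTTGAGTCAAA", "TGASTCA"))  # [2, 11]
```

Because TGASTCA reads the same on both strands, a forward-strand scan is enough for this particular site. A non-palindromic consensus would also need a scan with its reverse complement.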
