`Vol. 81, pp. 659-663, February 1984
`Biochemistry
`
`Promoter-regulatory region of the major immediate early gene of
`human cytomegalovirus
(RNA polymerase II/transcriptional control elements)
`DARRELL R. THOMSEN, RICHARD M. STENBERG, WILLIAM F. GOINS, AND MARK F. STINSKI
`Department of Microbiology, School of Medicine, University of Iowa, Iowa City, IA 52242
`
`Communicated by Bernard Roizman, September 19, 1983
`
ABSTRACT The DNA templates containing immediate
early (IE) genes of human cytomegalovirus (CMV) were transcribed
in vitro by using a HeLa cell extract. When IE regions 1,
2, and 3 were used, transcription was detected qualitatively
`only from IE region 1. Transcription was detected with DNA
`representing IE region 2 when the IE region 1 promoter was
`not present. DNA sequence analysis of the upstream regula-
`tory region of IE region 1 detected two distinct repeats of 19
`and 18 nucleotides, both being repeated four times. A putative
`cruciform structure could form through the surrounding se-
`quences with each 18-nucleotide repeat being located in the
`unpaired region. The potential secondary structure and the
`repeat sequences in the regulatory region of IE region 1 are
`presumably related to the high level of transcription of this IE
`gene.
`
Human cytomegalovirus (CMV), a member of the herpesvirus
classification group, has a large double-stranded DNA
genome of 240 kilobases (kb). The viral genome consists of a
long and a short unique region flanked by different repeat
sequences that are inverted relative to each other. Four
`genome arrangements, resulting from the possible combina-
`tion of inversions of the two sections of the genome, are
`present in DNA preparations in approximately equal
`amounts (1-7).
At immediate early (IE) times after infection (i.e., in the
absence of de novo viral protein synthesis), 88% or more of
the viral RNA originates from a region in the long unique
component of the viral genome (6, 8, 9) between 0.660 and
0.751 map units for the Towne strain (8, 10). One or more of
the IE viral genes presumably codes for a viral regulatory
protein that stimulates transcription from other regions of
the viral genome.
`Based on the high steady-state levels of viral mRNA and
`the abundance of its translation product in the infected cell,
`the IE gene between 0.739 and 0.751 map units is highly ex-
`pressed and has been designated IE gene 1 or the major IE
gene (11, 12). Adjacent IE genes from 0.732 to 0.739 (region
`2) and from 0.709 to 0.728 (region 3) map units are expressed
`at relatively low levels and, consequently, are considered
`minor IE genes (12). Transcription under IE conditions is
`also detectable from another adjacent region of approximate-
`ly 0.660-0.685 map units (6, 8), but we have failed to trans-
`late in vitro hybrid-selected RNA encoded by this region;
`consequently, the expression of this region requires further
`investigation.
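
The map-unit coordinates quoted above can be turned into approximate physical lengths. The following is a minimal sketch, assuming that map units are fractions of the whole 240-kb genome stated in the text; the names `GENOME_KB`, `IE_REGIONS`, and `span_kb` are illustrative and do not come from the original article.

```python
# Approximate span of each IE region in kb, assuming 1.0 map unit
# corresponds to the full 240-kb genome quoted in the text.
GENOME_KB = 240

# Map-unit limits as given in the text for the Towne strain.
IE_REGIONS = {
    "IE region 1": (0.739, 0.751),
    "IE region 2": (0.732, 0.739),
    "IE region 3": (0.709, 0.728),
}


def span_kb(start_mu, end_mu, genome_kb=GENOME_KB):
    """Convert a map-unit interval into an approximate length in kb."""
    return (end_mu - start_mu) * genome_kb


sizes = {name: round(span_kb(*limits), 2) for name, limits in IE_REGIONS.items()}

for name, kb in sizes.items():
    print(f"{name}: about {kb} kb")
```

These values are only as precise as the map-unit limits, which the text describes as the limits of the probes used to detect the viral RNAs.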
`Because CMV IE gene expression is dominated in vivo by
`the expression of a single gene, we were interested in deter-
`mining the properties of the promoter-regulatory region and
`whether or not region 1 was highly transcribed in vitro rela-
`tive to regions 2 and 3. The DNA sequence upstream of IE
region 1 of CMV may constitute the earliest point at which
expression of the viral genome is regulated at the level of
transcription.

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement"
in accordance with 18 U.S.C. §1734 solely to indicate this fact.
`
`MATERIALS AND METHODS
`Genetic Map and Recombinant Plasmids. Physical maps of
`the entire CMV genome were developed by LaFemina and
`Hayward (5). The cloning, purification, and characterization
`of recombinant plasmids containing insertions of CMV DNA
have been described (13). Recombinant plasmids pCB42 and
pSmaF were gifts from R. LaFemina and P. Weil, respectively.
A physical map of the Xba I fragment E and the recombinant
plasmids representing this region have been described
`by Stinski et al. (12). Restriction endonucleases were ob-
`tained from Bethesda Research Laboratories or New En-
`gland BioLabs. The conditions were as described by the sup-
`plier. After digestion, the DNA was extracted twice with
`phenol/chloroform, 1:1 (vol/vol), and twice with chloro-
`form, precipitated twice with ethanol, and resuspended in 10
`mM Tris HCl, pH 7.9/1 mM EDTA.
`Preparation of HeLa Cell Extracts. HeLa cells were ob-
`tained from W. C. Summers. Spinner cultures were grown to
a density of 4-5 × 10⁵ cells per ml. In vitro transcription
`extracts were prepared by the method of Manley et al. (14).
`In Vitro Transcription and RNA Fractionation. In vitro
transcription was as described by Manley et al. (14). Recombinant
plasmids cut with various restriction enzymes to generate
linear templates were at a concentration of 100 µg per
ml. Some reactions contained α-amanitin (1 µg/ml; Sigma)
to inhibit RNA polymerase II activity. The ³²P-labeled RNA
`was subjected to electrophoresis in 1.5% agarose gels con-
`taining 10 mM methylmercury (II) hydroxide as described by
`Bailey and Davidson (15). Molecular weight standards were
`23S (3.3 kb) and 16S (1.7 kb) Escherichia coli rRNA (16), 28S
(5.3 kb) and 18S (2.0 kb) human cell rRNA (17), and approximately
0.160 kb tRNA. To visualize the RNA, the slab gels
were stained in a solution containing 0.5 M ammonium acetate,
0.005 M 2-mercaptoethanol, and 1 µg of ethidium bromide
per ml. The gels were dried and exposed to Kodak X-Omat
AR film. RNA sizes were interpolated from a standard
`curve.
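
Interpolating RNA sizes from a standard curve can be sketched as follows. This is a minimal illustration, assuming that the logarithm of size varies linearly with migration distance between neighboring markers. The marker sizes are those listed in the text; the migration distances are hypothetical placeholders, since the article reports none, and `interpolate_size` is an illustrative name.

```python
import math

# (migration distance in cm, size in kb). Sizes are the rRNA and tRNA
# standards named in the text; the distances are HYPOTHETICAL values
# chosen only to make the example runnable.
MARKERS = [(2.0, 5.3), (3.1, 3.3), (4.4, 2.0), (4.9, 1.7), (9.5, 0.16)]


def interpolate_size(distance, markers=MARKERS):
    """Estimate RNA size (kb) by log-linear interpolation between markers."""
    pts = sorted(markers)
    if not pts[0][0] <= distance <= pts[-1][0]:
        raise ValueError("distance lies outside the range of the markers")
    for (d1, s1), (d2, s2) in zip(pts, pts[1:]):
        if d1 <= distance <= d2:
            frac = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size


print(round(interpolate_size(3.75), 2))
```

A band that migrates halfway between two markers is assigned the geometric mean of their sizes, which is the usual consequence of plotting size on a logarithmic axis.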
DNA Sequence Analysis. Recombinant plasmid pXEP22,
containing the 5' end of the major IE RNA (18) and its promoter-regulatory
region (12), was digested with the appropriate
restriction endonucleases, and the fragments were fractionated
by electrophoresis in agarose or acrylamide gels and eluted
electrophoretically. The methods used for labeling DNA in vitro and for
`sequence determination by the chemical modification and
`degradation procedure of Maxam and Gilbert (19) have been
`described (18).
`Estimation of Secondary Structure. The free energies for
`the base-paired regions in the putative cruciform structures
`were calculated by the method of Tinoco et al. (20).
`
`Abbreviations: IE, immediate early; CMV, cytomegalovirus; kb, ki-
`lobase(s).
`
`Regeneron Ex. 1042
`Page 1 of 5
`
`
`
Biochemistry: Thomsen et al.
Proc. Natl. Acad. Sci. USA 81 (1984)
`
[Figure 1 diagram: map of the Xba I fragment E region with map-unit marks at 0.709, 0.728, 0.732, 0.739, and 0.751; DNA coding regions 3, 2, and 1; direction of transcription; RNA size classes (kb), listed in the figure as 1.95; 2.25, 1.95, 1.75, 1.40, 1.10; and 1.95.]
`
`FIG. 1. Summary of the IE RNAs coded within the Xba I fragment E DNA region. The map units of coding regions 1, 2, and 3 depict the
`limits of the probes used to detect viral RNAs. The direction of transcription is indicated for coding region 1. The direction of transcription in
`region 2 requires further investigation to determine which direction predominates at IE and early times after infection. The thickness of the bar
`represents the relative abundance of the IE RNAs originating from the various coding regions. The size classes of the viral RNAs in vivo are
`indicated in kb. The data for the above is taken from Stinski et al. (12).
`
`RESULTS
`
`In Vitro Transcription Using DNA Templates for IE Genes.
`At least three promoters between 0.709 and 0.751 map units
`influence IE transcription after infection with CMV (12).
One IE viral gene (IE region 1) is highly expressed, whereas
the others (IE regions 2 and 3) are expressed at relatively low
levels based on steady-state levels of mRNA in the cytoplasm
(12). These viral genes are also referred to as the major
and minor IE genes. Fig. 1 summarizes the map location
of these viral genes, direction of transcription for IE region
`1, and the RNA size classes originating from the various re-
`gions as described (12). We previously had designated the
`transcription in IE region 2 from left to right based on 3'
`cDNA hybridizations. However, recent evidence obtained
`by one of us (unpublished data) does not support this inter-
`pretation. The direction of IE transcription in this region re-
`quires further investigation.

[Figure 2: autoradiogram panels A, B, and C with lane numbers, RNA size markers in kb, and restriction maps of regions 1 and 2 marked in map units (0.732, 0.739, 0.751).]
FIG. 2. Autoradiogram of in vitro transcripts with DNA templates for the IE genes. RNA was synthesized in standard reactions with various
DNA templates, extracted, denatured, and fractionated by electrophoresis in denaturing 1.5% agarose gels containing methylmercury(II)
hydroxide as described. (A) DNA templates (100 µg/ml) for both major and minor IE genes. Lanes: 1, no added DNA; 2, BamHI-cleaved
pXbaIE; 3, Sal I-cleaved pXbaIE; 4, Pst I-cleaved pXbaIE; 5, HincII-cleaved pXEP22; 6, Pst I-cleaved pXEP22; 7, Pst I-cleaved pXEP22 with
α-amanitin (1 µg/ml); 8, Sac I-cleaved pXEP22. (B) DNA templates (100 µg/ml) for IE region 2. Lanes: 1, BamHI-cleaved pCB42; 2, BamHI-cleaved
pCB42 with α-amanitin (1 µg/ml); 3, Pst I-cleaved pCB42; 4, BamHI/Pst I-cleaved pCB42. (C) DNA templates for IE region 1 and the
major late adenovirus promoter. Lanes: 1, Pst I-cleaved pXEP22 (100 µg/ml); 2, Sma I-cleaved pSmaF (100 µg/ml); 3, Pst I-cleaved pXEP22
(50 µg/ml) plus Sma I-cleaved pSmaF (50 µg/ml); 4, no added DNA. The sizes of the RNAs are shown in kb. Restriction enzyme sites Sal I
(Sal), Pst I (P), BamHI (B), HincII (H), and Sac I (S) relative to region 1 and region 2 DNA coding regions as well as the direction of
transcription on the prototype arrangement of the viral genome are designated.

In vitro transcription of these IE DNA templates was analyzed
to obtain a general map location of the promoters and
to test if these viral promoters are recognized by RNA polymerase
II. Three types of recombinant plasmids were used.
`Recombinant plasmid pXbaIE contained IE regions 1
`(0.739-0.751 map units), 2 (0.732-0.739 map units), and 3
`(0.709-0.728 map units) (Fig. 1). Recombinant plasmid
`pXEP22 contained the promoter for IE region 1, and pCB42
`contained a promoter in region 2. However, it is not known
`whether this is an IE or early promoter.
`When all three promoters were present on a single plasmid
`(pXbaIE), in vitro transcription was detected only from IE
`promoter region 1. Fig. 2A demonstrates that transcripts of
`2.4, 4.5, and 0.90 kb were truncated at the BamHI (lane 2),
`Sal I (lane 3), and Pst I (lane 4) sites, respectively. When the
`recombinant plasmid containing only IE promoter region 1
`(pXEP22) was used, 0.72- and 0.90-kb transcripts (Fig. 2A,
`lanes 5 and 6) were truncated at the HincII and Pst I sites,
`respectively. However, digestion of this recombinant plas-
`mid with Sac I eliminated detectable in vitro transcription
`(Fig. 2A, lane 8). In vitro transcription was also inhibited by
treatment with α-amanitin at 1 µg/ml (Fig. 2A, lane 7). The
`band at the top represents the typical end-labeled or read-
`through product obtained with the HeLa cell lysate. Lanes 2,
`3, and 4 required longer exposures to see the end-labeled
`bands. These data indicated that the major IE promoter was
`located left of the HincII site and right of the Sac I site and
`that transcription was by host cell RNA polymerase II.
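
The run-off reasoning used above can be sketched in a few lines. In a run-off assay the polymerase initiates at the promoter and stops at the end of the linear template, so the transcript length approximates the distance from the initiation site to the restriction cut. The coordinates below are hypothetical positions on an arbitrary scale, chosen only so that the 0.72-kb (HincII) and 0.90-kb (Pst I) transcripts reported in the text can be used; `infer_start` and `consistent` are illustrative names.

```python
def infer_start(cut_position_kb, runoff_length_kb, direction=1):
    """Infer the initiation site from one run-off transcript.

    direction is +1 if transcription proceeds toward higher coordinates
    and -1 if it proceeds toward lower coordinates.
    """
    return cut_position_kb - direction * runoff_length_kb


def consistent(starts, tolerance_kb=0.05):
    """True if all inferred initiation sites agree within the tolerance."""
    return max(starts) - min(starts) <= tolerance_kb


# HYPOTHETICAL cut positions (kb) paired with the transcript sizes
# reported in the text for pXEP22.
runoffs = [
    (2.82, 0.72),  # HincII-cleaved template
    (3.00, 0.90),  # Pst I-cleaved template
]

starts = [infer_start(cut, length) for cut, length in runoffs]
print(starts, consistent(starts))
```

When two different cuts point to the same initiation site, the transcripts are most simply explained by a single promoter, which is the logic behind the truncation experiments.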
`To test for in vitro transcription from promoters in region
`2, recombinant plasmid pCB42 was used for in vitro tran-
`scription. This plasmid contains a viral DNA insert extend-
`ing approximately 5.2 kb left of IE DNA coding region 1 and
`represents the BamHI fragment B within the Xba I fragment
`E (12) or the BamHI fragment T for the BamHI physical map
`of the viral genome (unpublished data). In vitro transcription
`with IE region 2 was possible when IE promoter region 1
`was not present. Fig. 2B demonstrates that 1.9-kb (lane 1)
`and 1.2-kb transcripts (lanes 3 and 4) were truncated by the
`BamHI and Pst I sites, respectively. In vitro transcription
was inhibited by α-amanitin at 1 µg/ml (lane 2). Because of
`the complexity of RNAs in this region, it is presently not
`known whether the promoter located at approximately 0.732
`map units is an IE or early promoter.
The above data suggested that the region 1 IE promoter competed
for RNA polymerase II and any other host cell proteins
necessary for in vitro transcription to the point that
activity of promoters in IE regions 2 and 3 was not detectable.
To further evaluate this qualitative difference in DNA templates,
an equal mixture of each (50 µg/ml) recombinant
`plasmid containing IE promoter region 1 of CMV (pXEP22)
`and the late adenovirus promoter (pSmaF) was tested for in
`vitro transcription. Fig. 2C is an autoradiogram of the frac-
`tionated 32P-labeled RNAs with the DNA template for IE
`region 1 of CMV (lane 1), the DNA template for the major
`late adenovirus promoter (lane 2), and a mixture of both tem-
`plates (lane 3) for in vitro transcription. Even though the late
`adenovirus promoter was slightly greater in molar equiva-
`lents of DNA because of the smaller size of the DNA frag-
`ment, the major IE promoter of CMV permitted transcrip-
`tion approximately 2-fold greater than the late adenovirus
promoter based on incorporation of [³²P]GTP into newly synthesized
RNA. This value was calculated by isolating
the 0.90-kb RNA made from the CMV DNA template and
the 0.53-kb RNA made from the adenovirus DNA template,
determining the amount of [³²P]GTP associated with
each RNA, and then dividing by the size of the
RNA molecule. Therefore, with equal amounts of the two
promoters, the levels of synthesis of the transcripts were
significantly different.
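
The size normalization described above can be written out as a short calculation. This is a sketch only: the radioactivity values are hypothetical, chosen to reproduce an approximately 2-fold difference, because the article does not report the raw counts. The transcript sizes (0.90 kb and 0.53 kb) are those given in the text, and `label_per_kb` is an illustrative name.

```python
def label_per_kb(cpm, size_kb):
    """Incorporated label divided by transcript size (cpm per kb)."""
    return cpm / size_kb


# HYPOTHETICAL counts per minute for the excised bands.
cmv_cpm = 1800.0    # 0.90-kb run-off from the CMV template
adeno_cpm = 530.0   # 0.53-kb run-off from the adenovirus template

cmv = label_per_kb(cmv_cpm, 0.90)
adeno = label_per_kb(adeno_cpm, 0.53)
ratio = cmv / adeno

print(f"CMV/adenovirus transcription ratio: {ratio:.2f}")
```

Dividing by transcript size corrects for the fact that a longer RNA carries more labeled nucleotide per molecule, so the ratio reflects the relative number of transcripts rather than the total label.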

(5') GGCGACCGCC CAGCGACCCC CGCCCGTTGA CGTCAATAGT GACGTATGTT CCCATAGTAA CGCCAATAGG  -420
     GACTTTCCAT TGACGTCAAT GGGTGGAGTA TTTACGGTAA ACTGCCCACT TGGCAGTACA  -360
     TCAAGTGTAT CATATGCCAA GTCCGCCCCC TATTGACGTC AATGACGGTA AATGGCCCGC  -300
     CTAGCATTAT GCCCAGTACA TGACCTTACG GGAGTTTCCT ACTTGGCAGT ACATCTACGT  -240
     ATTAGTCATC GCTATTACCA TGGTGATGCG GTTTTGGCAG TACACCAATG GGCGTGGATA  -180
     GCGGTTTGAC TCACGGGGAT TTCCAAGTCT CCACCCCATT GACGTCAATG GGAGTTTGTT  -120
     TTGGCACCAA AATCAACGGG ACTTTCCAAA ATGTCGTAAT AACCCCGCCC CGTTGACGCA  -60
     AATGGGCGGT AGGCGTGTAC GGTGGGAGGT C[TATA box]CA GAGCTCGTTT AGTGAACCGT (3')
[In this transcription the number at the right of each line gives the position of the last nucleotide on that line; in the first line, -480 falls at the end of the first block of 10. The boxed TATA sequence and the exact position of +1 (cap site) in the last line are not legible in the source text. The HincII site is marked over the line ending at -60 and the Sac I site over the last line.]
[Lower panel: diagram of the sequencing strategy at 0.751 map units, showing labeled termini and the direction of sequence determination relative to the cap site.]
FIG. 3. (Upper) Nucleotide sequence for the promoter-regulatory
region of the major IE gene. The IE region 1 promoter-regulatory
regions were sequenced in both directions by the
chemical method as described. The numbers above the sequences
represent plus or minus nucleotides from the cap site. The transcription
initiation site in vivo was determined by Stenberg et al. (18).
The TATA and CAAT boxes are enclosed. Relevant restriction enzyme
sites are underlined and designated. (Lower) The sequence
assay strategy for the prototype arrangement of the Towne strain. *,
Termini labeled at either the 5' or 3' end; arrow, direction of sequence
determination.

DNA Sequence of the Major Promoter-Regulatory Region.
Fig. 3 shows the nucleotide sequence upstream of the initiation
site of IE region 1. The sequences were confirmed by
analysis of both complementary DNA strands. The initiation
`site is designated +1 and represents the in vivo cap site as
`determined by Stenberg et al. (18). The sequences reveal
`typical Hogness-Goldberg boxes and "CAAT" boxes (21,
`22) at the predicted distance and in the expected orientations
`for eukaryotic promoter regions. Relevant restriction en-
`zyme sites are underlined and designated. In the IE promot-
`er-regulatory region, a Sac I site is located slightly down-
`stream of the "TATA" box (Fig. 3). This explains why in
`vitro transcription of IE region 1 is eliminated by digestion of
the DNA template with Sac I. A HincII site is located upstream
of the CAAT box and, consequently, in vitro transcription
with this DNA template was possible, but the
amount of transcription was reduced by approximately half relative
to DNA templates containing the upstream regulatory
sequences (see Fig. 2). The locations of the 19- and the 18-
nucleotide repeat sequences are illustrated in Figs. 3 and 4.
The 19-nucleotide repeat that overlaps an 18-nucleotide
repeat between -397 and -415 was not designated.
Both the 19- and the 18-nucleotide sequences are repeated
four times with 83-95% fidelity. The 19-nucleotide repeat
sequence characteristically has a CAAT box-like sequence.
One of these is located approximately 60 nucleotides from
the cap site. A central sequence was highly conserved within
the 18-nucleotide repeat and is underlined (Fig. 4). A 16-nucleotide
repeat with the consensus 5' C-T-T-G-G-C-A-G-T-A-C-A-T-C-A-A 3'
is also repeated four times with 63-100%
fidelity but is not designated.
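
The fidelity of a repeat copy relative to its consensus can be estimated as a simple percent identity. The sketch below assumes ungapped, equal-length comparison and uses two 19-nucleotide copies as they appear in the extracted text of Fig. 4. The article does not state how its fidelity figures were computed, so the numbers produced here need not match the published range; `percent_identity` is an illustrative name.

```python
def percent_identity(copy, consensus):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(copy) != len(consensus):
        raise ValueError("sequences must have the same length")
    matches = sum(1 for a, b in zip(copy, consensus) if a == b)
    return 100.0 * matches / len(consensus)


# 19-nucleotide consensus and two copies, read from the Fig. 4 text.
CONSENSUS_19 = "CCCCATTGACGTCAATGGG"
COPIES_19 = {
    "-146 to -128": "CCCCATTGACGTCAATGGG",
    "-72 to -54": "CCCCGTTGACGCAAATGGG",
}

for position, seq in COPIES_19.items():
    print(position, round(percent_identity(seq, CONSENSUS_19), 1))
```

Copies that differ in length from the consensus, such as the one between -334 and -314, would need an alignment step that this sketch deliberately omits.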
`
`
Consensus for a 19-base-pair repeat:   CCCCATTGACGTCAATGGG
(an alternative residue, G, is printed beneath the consensus)
  -72   CCCCGTTGACGCAAATGGG    -54
 -146   CCCCATTGACGTCAATGGG   -128
 -334   CCCCTATTGACGTCAATGAC  -314
 -468   GCCCGTTGACGTCAATAGT   -450

Consensus for an 18-base-pair repeat:  5' CCTAACGGGACTTTCCAA 3'
(an alternative residue, A, is printed beneath the consensus)
 -108   ATCAACGGGACTTTCCAA     -91
 -171   ACTCACGGGGATTTCCAA    -154
 -276   CCTTACGGGAGTTTCCTA    -259
 -427   CCAATAGGGACTTTCCAT    -410

FIG. 4. Directly repeated sequences in the promoter-regulatory
region of the major IE gene. Residues shown as large capitals conform
to a consensus, whereas those shown as small capitals deviate
from a consensus. A central region that is highly conserved in the
18-nucleotide repeat is underlined.
`
`Putative Cruciform Structures with Direct Repeats Located
`5' to the TATA Box. In each case, the 18-nucleotide repeat
`sequences are located in the unpaired region on a putative
`cruciform structure that could form through the surrounding
`sequences. The stability of each structure was estimated by
`the method of Tinoco et al. (20). Two different-type struc-
`tures could form between nucleotides -54 and -198 with
`stabilities ranging from -11.5 to -28.6 kcal per strand. The
`structure with a stability of -28.6 kcal per strand could form
`a cruciform structure through the 19-nucleotide direct repeat
`sequence (Fig. 5). Structures also could form between -252
`and -289 (-6.5 kcal) and -397 and -440 (-16.8 kcal). For
each putative cruciform structure, the 18-nucleotide repeat
sequences are usually positioned at the top of the loop. The
`formation of these structures is hypothetical and would re-
`quire the torsional tension of the DNA to be high.
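
A cruciform can form only where a sequence is followed, after a short spacer, by its own reverse complement. The sketch below searches for such inverted repeats. It is not the free-energy calculation of Tinoco et al. (20) and assigns no stability values; the stem length, the loop limits, and the example sequence are illustrative assumptions, as are the names `reverse_complement` and `find_inverted_repeats`.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem=6, min_loop=3, max_loop=30):
    """List (left_start, right_start, stem_sequence) for perfect stems.

    A hit means seq[left_start:left_start + stem] can pair with the
    stretch that begins at right_start, leaving a loop between them.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > n:
                break
            if seq[j:j + stem] == target:
                hits.append((i, j, left))
    return hits


# Synthetic example: a 6-bp stem (GATTCC / GGAATC) around a 4-nt loop.
example = "TTGATTCCAAAAGGAATCTT"
print(find_inverted_repeats(example))
```

Real stems usually tolerate mismatches and bulges, so a search for perfect stems like this one gives only a lower bound on the regions with dyad symmetry.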
`DISCUSSION
`When the complete promoter-regulatory regions were pres-
`ent, such as IE region 1 plus IE regions 2 and 3 or IE region 1
`plus the major late adenovirus promoter, in vitro transcrip-
`tion was qualitatively higher from IE region 1. Therefore, we
`propose that the upstream sequences of IE region 1 compete
`more efficiently for RNA polymerase II or other host cell
`proteins necessary for in vitro transcription. This type of cis-
`acting regulatory sequence may explain why the IE region 1
`gene of CMV is highly expressed relative to other IE re-
`gions. It is possible that a component in the HeLa cell ex-
`tract may interact directly or indirectly with the sequences
`upstream of IE region 1 and favor transcription of this re-
`gion. However, the sequence upstream of the CAAT box for
`IE region 1 of CMV is not required for in vitro transcription.
`Transcription was detected when the DNA template was cut
`at the HincII site at approximately 65 nucleotides upstream,
`but the relative amount of transcription was reduced.

[Figure 5 drawing: stem-loop diagram of a putative cruciform structure; labeled positions -171, -154, -145, -108, and -54; free energy -28.6 kcal.]
FIG. 5. Putative cruciform structure with direct repeats located
5' to the TATA box of the major immediate early gene. Only the
noncoding strand is represented. The numbers at the base of the
stem designate distance in nucleotides from the initiation site. The
direct repeat sequence in the loop is in boldface type. The free energy
per strand (kcal/mol) is designated to the right. Although upstream
sequences of the IE genes are shown from left to right for conventional
purposes, the gene is transcribed from right to left on the prototype
orientation of the viral genome.

In vivo or in vitro transcription by RNA polymerase II
may be influenced by the sequence upstream of the 5' end of
IE region 1. One would expect the free energy of the normal
heteroduplex DNA in region 1 to be more stable than the
combined free energy of the four putative cruciform structures
based on thermodynamic principles. However, in the
`infected cell or even in the HeLa cell extract, the IE region 1
`CMV DNA template may combine with proteins that affect
`the upstream DNA structure. Chromatin composition up-
`stream from the simian virus 40 early genes has been found
`to be hypersensitive to DNase I, suggesting a structural
`change in the DNA compared to other regions of the chro-
`matin (23, 24). Likewise, other cellular or viral genes have
`been found to have the potential for forming secondary
`structures, with many of them starting 100 to 150 nucleotides
`from the initiation site. For example, integrated Friend
`spleen focus-forming virus could have one relatively stable
`(-69 kcal/mol per strand) cruciform structure 140 nucleo-
`tides from the initiation site to the base of the stem (25).
Therefore, a correlation might exist among the number of
cruciform structures, the potential stability of those structures,
the position of the sequences, and the promoter
`strength.
`The functions, if any, of the 19-, 18-, and 16-nucleotide
`repeat sequences are unknown. It is interesting that the 19-
`nucleotide repeat sequence is highly conserved in the Col-
`burn strain of simian CMV, but the 18- and 16-nucleotide
`repeat sequences are only marginally conserved (K. T.
Jeang and G. S. Hayward, personal communication). Conservation
of these sequences in different CMVs suggests that
they have an important role in IE gene expression.
The immediate early (α) genes of herpes simplex virus also
`have cis-acting regulatory sequences that have been charac-
`terized by the presence of repeated sequences and GC-rich
`inverted repeats (26, 27). These upstream sequences impart
`upon the IE genes of herpes simplex virus or other genes
`such as thymidine kinase or ovalbumin a capacity to be posi-
`tively regulated (26-30). Therefore, the upstream sequences
`of herpesviruses represent regulatory elements that influ-
`ence the expression of important regulatory genes. In the
`case of CMV, only one gene is expressed in high abundance
`in the absence or presence of de novo protein synthesis at 1
`hr after infection. This major IE gene is located between
0.739 and 0.751 map units (Towne strain) and codes for a 1.95-kb
mRNA that translates to a 72,000-dalton protein (12). Within
100 to 468 nucleotides upstream of the major IE gene are
palindromic sequences and repeat sequences. Whether or
not these repeat sequences are associated with cruciform
structures in the DNA molecule is hypothetical. Nevertheless,
we propose that these sequences and their surrounding
dyad symmetry play a role in the relative amount of gene
expression.

Our sincere thanks go to Pamela Witte for expert assistance. We
thank Mark Urbanowski for advice in sequence assay and C. Martin
Stoltzfus for a critical review of this manuscript. This investigation
was supported by Public Health Service Grant A113526 from the
National Institute of Allergy and Infectious Diseases and by Grant 1-
697 from the National Foundation March of Dimes. M.F.S. is the
recipient of Public Health Service Career Development Award
A1100373 from the National Institute of Allergy and Infectious Diseases.
R.M.S. is the recipient of a fellowship from the Ladies Auxiliary
of the Veterans of Foreign Wars.

1. Kilpatrick, B. A. & Huang, E. S. (1977) J. Virol. 24, 261-276.
2. Geelen, J. L. M. C., Walig, C., Wertheim, P. & Van der Noordaa, J. (1978) J. Virol. 26, 813-816.
3. DeMarchi, J. M., Blankenship, M. L., Brown, G. D. & Kaplan, A. S. (1978) Virology 89, 643-646.
4. Weststrate, M. W., Geelen, J. L. M. C. & Van der Noordaa, J. (1980) J. Gen. Virol. 49, 1-22.
5. LaFemina, R. L. & Hayward, G. S. (1980) in Animal Virus Genetics, eds. Fields, B. N. & Jaenish, R. (Academic, New York), pp. 39-55.
6. DeMarchi, J. M. (1981) Virology 114, 23-28.
7. Spector, D. H., Hock, L. & Tamashiro, J. C. (1982) J. Virol. 42, 558-582.
8. Wathen, M. W. & Stinski, M. F. (1982) J. Virol. 41, 462-477.
9. McDonough, S. H. & Spector, D. H. (1983) Virology 125, 31-46.
10. Wathen, M. W., Thomsen, D. R. & Stinski, M. F. (1981) J. Virol. 38, 446-459.
11. Stinski, M. F., Thomsen, D. R. & Rodriguez, J. E. (1982) J. Gen. Virol. 60, 261-270.
12. Stinski, M. F., Thomsen, D. R., Stenberg, R. M. & Goldstein, L. C. (1983) J. Virol. 46, 1-14.
13. Thomsen, D. R. & Stinski, M. F. (1981) Gene 16, 207-216.
14. Manley, J. L., Fire, A., Cano, A., Sharp, P. A. & Gefter, M. L. (1980) Proc. Natl. Acad. Sci. USA 77, 3855-3859.
15. Bailey, J. M. & Davidson, N. (1976) Anal. Biochem. 70, 75-85.
16. Bishop, D. H. L., Claybrook, J. R. & Spiegelman, S. (1967) J. Mol. Biol. 26, 373-387.
17. Anderson, K. P., Costa, R. H., Holland, L. E. & Wagner, E. K. (1980) J. Virol. 34, 9-27.
18. Stenberg, R. M., Thomsen, D. R. & Stinski, M. F. (1983) J. Virol. 49, 190-199.
19. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
20. Tinoco, I., Borer, P., Dengler, B., Levine, M. D., Uhlenbeck, O. C., Crothers, D. M. & Gralla, J. (1973) Nature (London) 246, 40-41.
21. Chambon, P. & Breathnach, R. (1981) Annu. Rev. Biochem. 50, 349-383.
22. Liebhaber, S. A., Goossens, M. J. & Wai Kan, Y. (1980) Proc. Natl. Acad. Sci. USA 77, 7054-7058.
23. Saragosti, S., Cereghini, S. & Yaniv, M. (1982) J. Mol. Biol. 160, 133-146.
24. Shakhov, A., Nedospasov, S. A. & Georgiev, G. P. (1982) Nucleic Acids Res. 10, 3951-3965.
25. Clark, S. P. & Mak, T. W. (1982) Nucleic Acids Res. 10, 3315-3330.
26. Mackem, S. & Roizman, B. (1982) Proc. Natl. Acad. Sci. USA 79, 4917-4921.
27. Mackem, S. & Roizman, B. (1982) J. Virol. 44, 939-949.
28. Mackem, S. & Roizman, B. (1982) J. Virol. 43, 1015-1023.
29. Post, L. E., Mackem, S. & Roizman, B. (1981) Cell 24, 555-565.
30. Post, L. E., Norrild, B., Simpson, T. & Roizman, B. (1982) Mol. Cell Biol. 2, 233-240.
`
`