`
Diversity and Analysis of Bacterial Terpene Synthases

Yuuki Yamada*, David E. Cane†, Haruo Ikeda*,1

*Laboratory of Microbial Engineering, Kitasato Institute for Life Sciences, Kitasato University,
Kanagawa, Japan
†Department of Chemistry, Box H, Brown University, Providence, Rhode Island, USA
1Corresponding author: e-mail address: ikeda@ls.kitasato-u.ac.jp
`
Contents

1. Terpenoid Metabolites from Bacterial Cultures
2. Bacterial Terpene Synthases
2.1 Monoterpene synthases
2.2 Sesquiterpene synthases
2.3 Diterpene synthases
3. Methods for the Study of Bacterial Terpene Synthases
3.1 Bioinformatic analysis of bacterial terpene synthases
3.2 Expression of genes encoding bacterial terpene synthases in heterologous hosts
References
`
`Abstract
`
Terpenoid compounds are generally considered to be plant or fungal metabolites,
although a small number of odorous terpenoid metabolites of bacterial origin have
been known for many years. Recently, extensive bacterial genome sequencing and
bioinformatic analysis of deduced bacterial proteins using a profile hidden Markov
model have revealed more than a hundred distinct predicted terpene synthase genes.
Although some of these synthase genes might be silent in the parent microorganisms
under normal laboratory culture conditions, the controlled overexpression of these
genes in a versatile heterologous host has made it possible to identify the biochemical
function of cryptic genes and isolate new terpenoid metabolites.
`
`Methods in Enzymology, Volume 515
`ISSN 0076-6879
`http://dx.doi.org/10.1016/B978-0-12-394290-6.00007-0
`
© 2012 Elsevier Inc.
All rights reserved.
`
`LCY Biotechnology Holding, Inc.
`Ex. 1048
`Page 1 of 40
`
`
`
`
1. TERPENOID METABOLITES FROM BACTERIAL CULTURES

Tens of thousands of terpenoid metabolites, including monoterpenes,
sesquiterpenes, and diterpenes, are present in both terrestrial and marine higher
plants, in liverworts, and in fungi, with only a relatively minor fraction having
been isolated from prokaryotes. These compounds act as antibiotics, hormones,
flavor or odor constituents, and pigments. Some of them also possess other
physiologically or commercially important properties, for example, as vitamins (A,
D, E, K, and coenzyme Q) and as an antitumor agent (paclitaxel) (Glasby, 1982;
Sacchettini & Poulter, 1997). The study of bacterial terpenoid metabolites
has its roots in the detection of odorous and volatile metabolites (Berthelot
& André, 1891). The pioneering study of bacterial terpenoid metabolites as
volatile compounds was carried out by Gerber, who speculated that the
typical odors of actinomycetes might be due to the production of terpenes
(Gerber, 1967, 1969, 1971, 1973; Gerber & Lechevalier, 1965). In the years
that followed, a variety of terpene hydrocarbons and alcohols were
isolated from bacteria (Jáchymová, Votruba, Víden, & Rezanka, 2002;
Pollack & Berger, 1996; Schöller, Gürtler, Pedersen, Molin, & Wilkins,
2002; Wilkins & Schöller, 2009). In this chapter, we review these
occurrences and describe in silico and bench-level procedures for their analysis.
`
`2. BACTERIAL TERPENE SYNTHASES
`
Several cyclic monoterpene, sesquiterpene, and diterpene hydrocarbons
and alcohols are formed by variations of the common terpene cyclization
mechanism that is initiated by enzyme-catalyzed ionization to form allylic
cations from the universal acyclic precursors, geranyl diphosphate (GPP), farnesyl
diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), that are themselves
synthesized by mechanistically related condensations of the 5-carbon
building blocks dimethylallyl diphosphate (DMAPP) and isopentenyl diphosphate
(IPP). Intramolecular electrophilic attack of the intermediate allylic
cations on the central or distal double bonds of the substrate, followed by
well-precedented cationic transformations, including hydride shifts and
carbon–carbon backbone rearrangements (Wagner–Meerwein reactions),
and ultimate quenching of the positive charge either by deprotonation or by
capture of a nucleophilic water molecule, can account for the formation of
the enormous variety of several hundred cyclic terpene compounds. The
initiation of formation of cyclic sesquiterpenes from FPP is illustrated in Fig. 7.1.
`
Figure 7.1 Cyclization of farnesyl diphosphate by sesquiterpene synthases. The scheme
shows the 1,6-, 1,7-, 1,10-, and 1,11-closures of farnesyl diphosphate and the products
caryolan-1-ol, pentalenene, epi-isozizaene, β-sesquiphellandrene, germacrene D,
epicubenol, geosmin, T-muurolol, and δ-cadinene.
`
`2.1. Monoterpene synthases
The homomonoterpene synthase, 2-methylisoborneol synthase, was independently
isolated from Streptomyces lasaliensis NRRL 3382 (Komatsu, Tsuda,
Omura, Oikawa, & Ikeda, 2008) and Streptomyces coelicolor A3(2) (Wang &
Cane, 2008). The 1446-bp tpc gene of S. lasaliensis NRRL 3382 and the
1323-bp sco7700 gene of S. coelicolor A3(2) encode 481-aa and 440-aa proteins,
respectively, with less than 20% identity to pentalenene synthase of Streptomyces
exfoliatus UC5319, the first bacterial terpene synthase for which the full
sequence was determined and thus commonly used as a standard of
comparison for newly characterized bacterial terpene synthases. The two
2-methylisoborneol synthases each incorporate the same variants of the
two characteristic terpene synthase divalent metal-binding motifs, the acidic
amino acid-rich -DDCYCED- and the more conventional downstream
NSE triad -NDLYSYTKE-. All genes encoding 2-methylisoborneol synthase
are separated by 16–45 bp from the proximal downstream gene, which was
originally annotated as a generic C-methyltransferase. It therefore appears that a
two-gene biosynthetic operon is involved in the formation of
2-methylisoborneol. The corresponding downstream gene products of S. lasaliensis
NRRL 3382 and S. coelicolor A3(2) were confirmed to catalyze methylation of
GPP at C-2 using S-adenosyl-L-methionine, to give exclusively
(E)-2-methylgeranyl diphosphate (2-MeGPP). The 2-methylisoborneol synthase
then catalyzes cyclization of the methylated acyclic substrate, 2-MeGPP, to
generate 2-methylisoborneol. The Gram-negative gamma-proteobacterium
Pseudomonas fluorescens Pf0-1 harbors an orthologous gene product,
Pfl01_1841, which has significant sequence similarity to actinomycete
2-methylisoborneol synthases. Recombinant Pfl01_1841 has been shown
to catalyze generation of the homomonoterpene hydrocarbon
2-methylenebornane from 2-MeGPP without formation of any detectable
2-methylisoborneol. Surprisingly, a gene corresponding to the requisite
GPP 2-methyltransferase could not be found in the P. fluorescens Pf0-1 genome,
and 2-methylenebornane was not detected in culture extracts (Chou, Ikeda, &
Cane, 2011). By contrast, production of 2-methylenebornane has been
observed in Micromonospora olivasterospora NRRL 8178, and a truncated
recombinant protein from M. olivasterospora expressed in E. coli catalyzes formation of
2-methylenebornane from 2-MeGPP (Ikeda, unpublished data).
`
`2.2. Sesquiterpene synthases
The first studies of a bacterial terpene synthase came from investigations
of the biosynthesis of the sesquiterpenoid antibiotic pentalenolactone.
Pentalenene synthase, first isolated from S. exfoliatus UC5319, was shown
to catalyze the cyclization of FPP to pentalenene, the parent sesquiterpene
hydrocarbon of the pentalenolactone family of antibiotics (Cane et al.,
1994). Extensive experiments with stereospecifically labeled FPP have
established the detailed cyclization mechanism (Cane et al., 1990).
Purification of the enzyme eventually allowed cloning of the S. exfoliatus gene
encoding pentalenene synthase and its expression in E. coli. The crystal structure of
pentalenene synthase revealed an all α-helical fold very similar to that
of tobacco epi-aristolochene synthase, with which pentalenene synthase
shares insignificant primary amino acid sequence similarity (Lesburg, Zhai,
Cane, & Christianson, 1997). Indeed, this same fold has been found in
many terpene synthases of diverse biological origin and is believed to
be universal for this class of cyclases. Sequencing of the Streptomyces
avermitilis genome (Ikeda et al., 2003) revealed a 13.6-kb cluster harboring
13 unidirectionally transcribed protein-coding sequences, one of
which, sav2998, encodes a 336-aa protein (defined as PtlA) with 76%
identity to the previously characterized pentalenene synthase of
S. exfoliatus UC5319. Consistent with these observations, recombinant
PtlA purified from E. coli catalyzed the Mg2+-dependent cyclization of
FPP to pentalenene (Tetzlaff et al., 2006).
`
The gene encoding geosmin synthase in S. coelicolor A3(2) was independently
characterized by two groups (Cane & Watt, 2003; Gust, Challis,
Fowler, Kieser, & Chater, 2003). The 2181-bp sco6073 encodes an
unusually large 726-aa protein in which both the N-terminal (366 aa)
and C-terminal (339 aa) halves showed about 30% identity to
S. exfoliatus UC5319 pentalenene synthase. Similarly, the 2178-bp
sav2163 of S. avermitilis encodes a 725-aa protein with 77% identity to
the S. coelicolor A3(2) geosmin synthase (SCO6073) and 27% identity in
both its N-terminal (363 aa) and C-terminal (329 aa) halves to
pentalenene synthase (Cane, He, Kobayashi, Omura, & Ikeda, 2006).
The N-terminal domains of both synthases possess the characteristic
conserved metal-binding motifs, the acidic amino acid-rich motif
-DDHFLE- and the NSE triad -NDLFSYQRE-, while the C-terminal
halves contain an unusual acidic amino acid-rich motif, -DDYYP-, as
well as a canonical -NDVFSYQKE- sequence in both synthases. The
N-terminal domain has been shown to catalyze cyclization of FPP to
generate germacradienol and germacrene D, while the C-terminal
domain is responsible for the proton-initiated sequential retro-Prins
fragmentation of germacradienol with loss of the 2-propanol side-chain
as acetone to give an octalin derivative, which is then converted to
geosmin by proton-initiated hydration and hydride migration (Jiang &
Cane, 2008; Nawrath et al., 2008). The sequence of geosmin synthase is
in fact highly conserved across a wide variety of bacterial genera, with
more than 75 predicted proteins from actinomycetes, myxobacteria, and
cyanobacteria exhibiting > 55% sequence identity to the S. coelicolor A3(2)
SCO6073.
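
The gene and protein lengths quoted in Sections 2.1 and 2.2 can be cross-checked with simple codon arithmetic. The sketch below is only an illustration: it assumes that each reported gene length is a complete open reading frame that includes the stop codon (an assumption, not something the text states), and the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp: int) -> int:
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("an ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # one codon is the stop codon and encodes no residue


# (gene length in bp, reported protein length in aa), as quoted in the text
REPORTED = {
    "tpc (S. lasaliensis NRRL 3382)": (1446, 481),
    "sco7700 (S. coelicolor A3(2))": (1323, 440),
    "sco6073 (S. coelicolor A3(2))": (2181, 726),
    "sav2163 (S. avermitilis)": (2178, 725),
}

for gene, (bp, aa) in REPORTED.items():
    status = "consistent" if protein_length_from_orf(bp) == aa else "inconsistent"
    print(f"{gene}: {bp} bp -> {protein_length_from_orf(bp)} aa ({status})")
```

Under that assumption, all four quoted gene/protein pairs agree with one another.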
Two genes, S. coelicolor A3(2) sco5222 and S. avermitilis sav3032, encode
orthologous 361-aa and 363-aa proteins with 24% and 25% identity,
respectively, to S. exfoliatus pentalenene synthase. Both deduced proteins,
SCO5222 and SAV_3032, possess two common conserved metal-binding
motifs, acidic amino acid-rich -DDRHD- and the downstream NSE triad
-NDLCSLPKE-. The two corresponding recombinant proteins each
catalyze cyclization of FPP to generate a tricyclic sesquiterpene hydrocarbon
identified as (+)-epi-isozizaene (Lin, Hopson, & Cane, 2006; Takamatsu
et al., 2011). The sco5222 and sav3032 genes are each translationally
coupled with the downstream genes, sco5223 and sav3031, encoding the
cytochrome P450s CYP170A1 and CYP170A2, respectively. The two-gene
operon for albaflavenone biosynthesis is highly conserved and
widely distributed, with orthologous pairs of translationally coupled
sesquiterpene synthase/cytochrome P450 genes with 56–100% identity
evident in 10 species of streptomycetes; production of epi-isozizaene
and/or albaflavenone has been shown in several of these streptomycetes
(Moody et al., 2012; Takamatsu et al., 2011).
Epicubenol synthase was originally isolated by biochemical assay-guided
methods from Streptomyces sp. LL-B7, which was known to produce
cadin-4-en-1-ol (Gerber, 1971). The partially purified sesquiterpene synthase
catalyzes the Mg2+-dependent cyclization of FPP to (+)-epicubenol
(Cane, Tandon, & Prabhakaran, 1993).
`
`2.3. Diterpene synthases
Diterpene synthases, which have been cloned predominantly from higher
plants and fungi, have been reviewed (MacMillan & Beale, 1999). They
can be classified into at least four groups based on the combination of two
fundamental types of cyclization reaction: (i) type-A, (ii) type-B, (iii)
type-A-type-B, and (iv) type-B-type-A. The cyclization reaction catalyzed by a
type-A synthase is initiated by ionization of the diphosphate of GGPP to give
an allylic cation, followed by cyclization and either deprotonation to generate an
olefinic diterpene hydrocarbon or capture of a water molecule to generate a
diterpene alcohol. The type-B reaction is initiated by protonation at the
C14,15-double bond of GGPP distal to the allylic diphosphate.
Bacterial genes encoding diterpene synthases were first recognized in
connection with the observation that, although most actinomycetes lack
the mevalonate (MVA) pathway and use the methylerythritol phosphate
pathway to provide DMAPP and IPP, the key precursors for terpenoid
metabolism, genes encoding well-characterized MVA pathway enzymes are
found in certain strains of Streptomyces.
One such strain, Kitasatospora griseola MF730-N6, was found to produce
the diterpenoid metabolite, terpentecin (Dairi et al., 2001). The gene cluster
for the MVA pathway lies downstream of a set of five genes required
for terpentecin biosynthesis. Two of these genes, orf11 and orf12, encode a
499-aa protein with 29% identity to ent-kaurene synthase of Phaeosphaeria sp.
and a 311-aa protein with 25% identity to S. exfoliatus pentalenene
synthase, respectively; ORF11 and ORF12 therefore correspond to type-B and
type-A synthases. The deduced type-A ORF12 protein displays the universally
conserved pair of divalent metal-binding motifs, an acidic amino acid-rich
-DDRWD- and the downstream triad -NDYYSWGRE-. Incubation of
ORF11, ORF12, and GGPP in the presence of magnesium ions produced
the diterpene hydrocarbon, terpentetriene (Hamano et al., 2002).
A second bacterial diterpene synthase is involved in the biosynthesis of
viguiepinol in Streptomyces sp. KO-3988. The five genes implicated in its
biosynthesis lie upstream of the gene cluster for the MVA pathway (Kawasaki
et al., 2004), and transformants of Streptomyces lividans carrying the five genes
produced the diterpenoid metabolite, viguiepinol (Kawasaki et al., 2006).
Two of these genes, orf2 and orf3, encode a 511-aa protein with 32% identity
to ent-copalyl diphosphate synthase of the fungus Gibberella fujikuroi and a
295-aa protein with no significant homology to any known terpene synthase,
respectively. Recombinant ORF2 catalyzes the Mg2+-dependent cyclization
of GGPP to generate copalyl diphosphate, corresponding to a type-B
synthase, while the type-A synthase ORF3 converts copalyl diphosphate
to pimara-9(11),15-diene. Two conserved metal-binding motifs, an acidic
amino acid-rich -DDHVE- and the downstream triad -NDLATFERE-,
are present in the deduced ORF3 protein.
The cyclooctatin biosynthetic gene cluster of Streptomyces
melanosporofaciens MI614-43F2 consists of four genes, cotB1, cotB2, cotB3, and cotB4,
encoding GGPP synthase, a type-A diterpene synthase, and two cytochrome
P450s, respectively, but no type-B terpene synthase. The cotB2 gene
encodes a 307-aa protein that shows no significant sequence similarity
to any of the known terpene synthases and displays only one of the conserved
metal-binding motifs, the triad sequence -NDFYSYDRE-, found in the
C-terminal region, with no identifiable upstream acidic amino acid-rich
motif. Recombinant CotB2 catalyzes the Mg2+-dependent cyclization of
GGPP to the diterpene alcohol, cyclooctat-9-en-7-ol (Kim et al., 2011).
`
`3. METHODS FOR THE STUDY OF BACTERIAL
`TERPENE SYNTHASES
`
`3.1. Bioinformatic analysis of bacterial terpene synthases
`3.1.1 Properties of terpene synthases and database search procedures
Monoterpene, sesquiterpene, and diterpene synthases from plants and fungi
share strongly conserved amino acid sequences. By contrast,
bacterial terpene synthases not only exhibit no significant overall amino acid
sequence similarity to those from plants and fungi but also usually exhibit
relatively low sequence similarity to each other. Despite the very
substantial differences in overall primary amino acid sequence, terpene
synthases from plants, fungi, and bacteria typically display two highly
conserved metal-binding motifs: an acidic amino acid-rich motif, -DDxx[DE]-
or -DDxxx[DE]-, usually located either 80–120 or 320–360 amino
acids downstream of the N-termini of bacterial terpene synthases or the
larger eukaryotic terpene synthases, respectively, and a triad of residues,
-[ND]Dxx[ST]xx[KR][DE]-, located 140 ± 5 amino acids downstream
of the acidic amino acid-rich motif (Felicetti & Cane, 2004). The first
conserved metal-binding motif in a typical bacterial sesquiterpene synthase is
the acidic amino acid-rich domain with a high proportion of aromatic
amino acids, -WFF[VL][FW]DD[LR][FH]D- (pentalenene and
epi-isozizaene synthases) or -WVF[FY]FDDHFLE- (germacradienol/geosmin
synthase). Although the upstream conserved motifs in diterpene synthases
(-LIVNDDRWD-) and in the monoterpene synthases (-AVDDxxx[DE]-)
also display an acidic amino acid-rich domain, the content of aromatic
amino acids is lower than that in sesquiterpene synthases. The downstream
motif, -ND[IL]xSxx[KR]E-, is conserved in all three classes of bacterial
terpene synthases (Komatsu et al., 2008). Crystallographic analyses of
monoterpene and sesquiterpene synthases from bacteria, fungi, and plants
have established that these two metal-binding motifs lie at opposite sides of
the rim of the deep active-site cavity and are responsible for cooperative
binding of three divalent cations and the diphosphate moiety of the
substrate (GPP or FPP), precisely positioning the acyclic allylic diphosphate
substrate and activating it for the ionization that triggers the cyclization
cascade (Christianson, 2006, 2008).
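
The two consensus motifs described above translate directly into regular expressions. The following is a minimal sketch rather than the procedure used in this chapter: the function names are hypothetical, the demonstration sequence is synthetic (poly-alanine filler around the -DDRHD- and -NDLCSLPKE- motifs quoted for epi-isozizaene synthase), and the 140 ± 5 residue spacing is measured, by assumption, from the first residue of one motif to the first residue of the other.

```python
import re

# -DDxx[DE]- or -DDxxx[DE]-; the lookahead allows overlapping matches to be reported
ACIDIC = re.compile(r"(?=(DD.{2,3}[DE]))")
# -[ND]Dxx[ST]xx[KR][DE]-
TRIAD = re.compile(r"(?=([ND]D..[ST]..[KR][DE]))")


def find_motifs(seq, pattern):
    """Return (0-based start, matched text) for every occurrence of pattern."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


def paired_motifs(seq, spacing=140, tolerance=5):
    """Pair each acidic motif with triads lying spacing +/- tolerance downstream."""
    pairs = []
    for a_pos, a_text in find_motifs(seq, ACIDIC):
        for t_pos, t_text in find_motifs(seq, TRIAD):
            if abs((t_pos - a_pos) - spacing) <= tolerance:
                pairs.append((a_pos, a_text, t_pos, t_text))
    return pairs


# Synthetic test sequence, not a real protein.
demo = "M" + "A" * 99 + "DDRHD" + "A" * 135 + "NDLCSLPKE" + "A" * 50
print(paired_motifs(demo))
```

A scan of this kind recognizes only the canonical patterns; an atypical motif such as the -DDYYP- of the C-terminal half of geosmin synthase does not match the acidic pattern, which is one reason a profile-based search is more sensitive than fixed patterns.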
`In general, new proteins with novel functions can be recognized by
`application of the BLAST algorithm for local sequence alignment to deduced
`protein sequences in the public databases. In this manner, such BLAST
`searches of bacterial databases using known bacterial terpene synthases as
`the query have in fact revealed a significant number of presumptive synthases
`of unknown function. On the other hand, this search strategy may frequently
`miss cases of low overall sequence similarity. In spite of low overall primary
`sequence similarity, significant conserved metal-binding motifs such as the
`acidic amino acid-rich domain and the triad domain have been found in
`essentially all known bacterial, fungal, and plant terpene synthases. As an
`alternative to the widely used method of local sequence alignment, therefore,
`we have adopted a search method based on hidden Markov models (HMMs)
`and Pfam search (Finn et al., 2010) for the primary recognition of desired
`terpene synthases. The profile HMM can distinguish members of the rele-
`vant protein functional families from nonmembers with a high degree of
`accuracy (Sonnhammer, Eddy, Birney, Bateman, & Durbin, 1998).
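
Hits from a profile HMM search are usually ranked and filtered by E-value. The sketch below assumes, purely for illustration, that the search was run with HMMER3 `hmmsearch --tblout`, whose whitespace-delimited table gives the target name in the first column and the full-sequence E-value and score in the fifth and sixth columns; the chapter does not specify the output format actually used. The protein names, scores, model version suffix, and the 1e-5 cutoff are all made up for the example.

```python
def significant_hits(tblout_lines, max_evalue=1e-5):
    """Return (target, E-value, score) for hits at or below max_evalue, best first."""
    hits = []
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer
        fields = line.split(maxsplit=18)  # the last field is a free-text description
        target, evalue, score = fields[0], float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, evalue, score))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical tblout content; none of these values come from the chapter.
SAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias  E-value  score  bias  exp reg clu ov env dom rep inc description
protA  -  Terpene_synth_C  PF03936.11  2.6e-251  830.1  0.1  3.1e-130  420.0  0.0  2.0  2  0  0  2  2  2  2  hypothetical two-domain hit
protB  -  Terpene_synth_C  PF03936.11  4.7e-63  210.4  0.0  5.0e-63  210.3  0.0  1.0  1  0  0  1  1  1  1  hypothetical single-domain hit
protC  -  Terpene_synth_C  PF03936.11  0.015  12.3  0.0  0.02  11.9  0.0  1.0  1  0  0  1  1  1  1  hypothetical weak hit
""".splitlines()

for target, evalue, score in significant_hits(SAMPLE):
    print(target, evalue, score)
```

The cutoff is a free parameter; a stricter or looser threshold changes how many candidate sequences are carried forward to phylogenetic analysis.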
`
The first successful trial using a profile HMM was demonstrated in the
bioinformatics-based discovery of 2-methylisoborneol synthase from
bacterial databases (Komatsu et al., 2008). Presumptive bacterial terpene synthases
were first harvested from the 2008 NCBI databases of bacterial proteins by
searching on the basis of the profile HMM using as a model PF03936
(terpene synthase family, metal-binding domain). From 1,922,990 predicted
proteins, 41 proteins were initially selected based on strong alignment
matches and very low E-values. These proteins apparently segregated into
three major groups on the basis of phylogenetic analysis. Group I, containing
12 proteins, was provisionally assigned as monoterpene synthases,
subsequently shown to be 2-methylisoborneol synthases. Group II, by far
the largest group consisting of 27 proteins, included the known
sesquiterpene synthases, pentalenene, germacradienol/geosmin, and epi-isozizaene
synthases, as well as several presumptive sesquiterpene synthases of unknown
biochemical function. Group III contained two diterpene synthases,
including terpentetriene synthases. Since the initial HMM analysis, among the
presumptive sesquiterpene synthases, SAV_76 of S. avermitilis has been
characterized as a new functional protein, avermitilol synthase (Chou
et al., 2010), and SGR_2079 of Streptomyces griseus IFO 13350 has been
shown to be (+)-caryolan-1-ol synthase (Nakano, Horinouchi, & Ohnishi,
2011). Searching for an orthologous 2-methylisoborneol synthase in the
genome sequence of the 2-methylisoborneol-producing cyanobacterium
Pseudanabaena limnetica str. Castaic Lake was initially unsuccessful.
Since the original PF03936 profile had been based on an alignment of
terpene synthase sequences that included those from both plants and fungi, a
new set of HMM parameters was generated using as the training set the 41
bacterial terpene synthase sequences that had been identified by the first trial
HMM experiment (Komatsu et al., 2008), resulting in the effective
recognition of the cyanobacterial 2-methylisoborneol synthase using the new
profile HMM (Giglio, Chou, Ikeda, Cane, & Monis, 2011).
`
`3.1.2 Genome mining of bacterial terpene synthases
`To date, more than 2700 bacterial genome sequences have been completed
`(http://www.genomesonline.org/). The newest bacterial protein sequence
`databases have been processed by HMM Pfam search using the above-
`described second-generation model specific for bacterial terpene synthases.
`A phylogenetic tree of the harvested terpene synthase sequences illustrating
`the grouping of over 140 candidate bacterial sequences is shown in Fig. 7.2.
The majority of the presumptive terpene synthases are sesquiterpene
synthases, including two major clades corresponding to germacradienol/
geosmin synthases (A zone in Fig. 7.2) and epi-isozizaene synthases (B zone
in Fig. 7.2), and a third monoterpene synthase clade consisting of
2-methylisoborneol/2-methylenebornane synthases (C zone in Fig. 7.2).
Because so few diterpene synthases were predicted, the corresponding
diterpene synthase clade could not be delineated precisely,
although this clade is found at the middle-left of the tree.
Table 7.1 (type-A synthase) and Table 7.2 (type-B synthase) summarize
presumptive terpene synthases from the newest bacterial genome databases.

Figure 7.2 Phylogenetic analysis of terpene synthases from bacterial databases.
Phylogenetic analysis of aligned sequences was done by the bootstrap method (bootstrap
number: 1000; seed number: 111) of CLUSTAL W (Thompson, Higgins, & Gibson,
1994) version 2.012 (ftp://ftp.ebi.ac.uk/pub/software/unix/clustalw/). The bootstrap
tree was drawn by njplot (http://pbil.univ-lyon1.fr/software/njplot.html). The "A," "B,"
and "C" zones indicate germacradienol/geosmin synthases, epi-isozizaene synthases,
and 2-methylisoborneol/2-methylenebornane synthases, respectively. Asterisks indicate
diterpene synthases.
`Continued
`
` 251
` 21
` 251
` 73
` 251
` 7
` 66
`
` 228
`
` 63
`
` 234
`
` 9
` 231
`
`2.6010
`3.4010
`9.7010
`3.5010
`3.2010
`1.4010
`4.1010
`9.9010
`4.7010
`3.5010
`2.4010
`9.8010
`AccessionnumberE-valueb
`
`YP_001509819
`
`YP_483410
`
`YP_483306
`
`YP_711586
`
`YP_716636
`
`YP_003114277
`
`YP_003115314
`
`YP_003116895
`
`YP_003765432
`
`YP_003763541
`
`YP_003101527
`
`YP_003098781
`
`Franean1_5559
`
`Francci3_4335
`
`Francci3_4231
`
`FRAAL1336
`
`FRAAL6507
`
`Caci_3530
`
`Caci_4612
`
`synthasec
`2-MIB/2-MB
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`EAN1pec
`Frankiasp.
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Frankiasp.CcI3Actinobacteria
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`ACN14a
`Frankiaalni
`
`Caci_6200
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`AMED_3240
`
`synthasec
`2-MIB/2-MB
`
`AMED_1325
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`mediterraneiU32
`Amycolatopsis
`
`Amir_3801
`
`Amir_0977
`
`proteinID
`Locustagor
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`43827
`mirumDSM
`Actinosynnema
`
`Typeofsynthasea
`
`Order
`
`Classorphylum
`
`Microorganism
`
Table 7.1 Predicted type-A terpene synthases from genome-sequenced bacteria
`
`44928
`acidiphilaDSM
`Catenulispora
`
` 64
` 56
` 63
` 13
` 64
` 68
` 32
` 48
`
` 216
`
` 251
` 24
` 57
` 58
` 69
`
` 245
`
`7.2010
`3.1010
`4.3010
`4.6010
`2.3010
`1.4010
`7.8010
`1.4010
`3.1010
`2.6010
`2.5010
`1.3010
`6.9010
`2.0010
`1.4010
`AccessionnumberE-value
`
`YP_003680543
`
`–
`
`–
`
`ZP_04604803
`
`ZP_04609212
`
`ZP_04606882
`
`–
`
`–
`
`–
`
`YP_003382082
`
`YP_004907928
`
`YP_004901829
`
`YP_004903082
`
`YP_004908735
`
`YP_004906345
`
`Ndas_2620
`
`MCBG_05692
`
`MCBG_03612
`
`MCAG_01060
`
`MCAG_05469
`
`MCAG_03139
`
`KUTG_08053
`
`KUTG_08607
`
`synthasec
`2-MIB/2-MB
`
`Actinomycetales
`
`Actinobacteria
`
`synthasec
`2-MIB/2-MB
`
`synthasec
`2-MIB/2-MB
`
`Actinomycetales
`
`Actinobacteria
`
`Actinomycetales
`
`Actinobacteria
`
`43111
`dassonvilleiDSM
`dassonvilleisubsp.
`Nocardiopsis
`
`M42
`Micromonosporasp.
`
`39149
`carbonaceaATCC
`Micromonospora
`
`KUTG_04531
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Kutzneriasp.744Actinobacteria
`
`Kfla_4247
`
`KSE_62070
`
`KSE_00200t
`
`KSE_12950
`
`KSE_70210
`
`KSE_46080
`
`proteinID
`Locustagor
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`DSM17836
`Kribbellaflavida
`
`synthasec
`2-MIB/2-MB
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`KM-6054
`Kitasatosporasetae
`
`Typeofsynthase
`
`Order
`
`Classorphylum
`
`Microorganism
`
Table 7.1 Predicted type-A terpene synthases from genome-sequenced bacteria (cont'd)
`
Table 7.1 Predicted type-A terpene synthases from genome-sequenced bacteria (cont'd)

Microorganism                           Class or phylum  Order            Type of synthase                 Locus tag or protein ID  Accession number  E-value
Rubrobacter xylanophilus DSM 9941       Actinobacteria   Rubrobacterales                                   Rxyl_0493                YP_643279         1.90 × 10^-80
Saccharopolyspora erythraea NRRL 2338   Actinobacteria   Actinomycetales  Germacradienol/geosmin synthase  SACE_3187                YP_001105388      1.10 × 10^-244
                                                                          Germacradienol/geosmin synthase  SACE_4907                YP_001107098      1.80 × 10^-240
                                                                          Germacradienol/geosmin synthase  SACE_3977                YP_001106173      1.80 × 10^-202
                                                                          2-MIB synthase^c,d1              SACE_3722                YP_001105919      1.20 × 10^-52
Salinispora arenicola CNS-205           Actinobacteria   Actinomycetales                                   Sare_1287                YP_001536181      4.10 × 10^-16
Stackebrandtia nassauensis DSM 44728    Actinobacteria   Actinomycetales  Germacradienol/geosmin synthase  Snas_1127                YP_003509930      4.70 × 10^-243
                                                                          2-MIB/2-MB synthase              Snas_1991                YP_003510780      2.20 × 10^-65
Streptomyces albus J1074                Actinobacteria   Actinomycetales  Germacradienol/geosmin synthase  SSHG_04630               ZP_06593727       2.10 × 10^-245
                                                                          epi-Isozizaene synthase          SSHG_04343               ZP_06593440       9.70 × 10^-78
`
` 36
` 250
` 66
` 71
`
` 75
`
` 234
`
` 75
`
` 77
`
` 81
`
` 255
`
`4.3010
`1.0010
`7.5010
`1.9010
`5.5010
`9.2010
`4.7010
`2.9010
`3.6010
`4.3010
`AccessionnumberE-value
`
`YP_004919385
`
`YP_004912463
`
`ADI04201
`
`ADI12797
`
`ADI12075
`
`ADI05189
`
`NP_821250
`
`NP_824174
`
`NP_824208
`
`NP_823339
`
`SCAT_p0091
`
`SCAT_2953
`
`SBI_01080
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`8005
`cattleyaNRRL
`Streptomyces
`
`PentalenenesynthaseSBI_09679
`
`SBI_08957
`
`SBI_02068
`
`SAV_76
`
`SAV_2998
`
`SAV_3032
`
`synthasec
`2-MIB/2-MB
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`synthased5
`Avermitilol
`
`synthased4
`Pentalenene
`
`synthased3
`epi-Isozizaene
`
`geosminsynthased2
`
`SAV_2163
`
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`BCW-1
`bingchenggensis
`Streptomyces
`
`MA-4680
`avermitilis
`Streptomyces
`
`proteinID
`Locustagor
`
`Typeofsynthase
`
`Order
`
`Classorphylum
`
`Microorganism
`
Table 7.1 Predicted type-A terpene synthases from genome-sequenced bacteria (cont'd)
`
`Continued
`
` 2
` 8
` 10
` 20
` 29
` 41
` 42
` 48
` 55
` 62
` 69
`
` 70
`
` 75
` 77
` 78
` 82
` 237
`
`1.5010
`6.8010
`2.3010
`1.3010
`7.3010
`4.4010
`9.1010
`4.9010
`1.4010
`2.7010
`1.7010
`6.1010
`1.4010
`6.7010
`7.0010
`4.0010
`1.6010
`
`ZP_06775139
`
`SCLAV_5671
`
`ZP_06775069
`
`SCLAV_5601
`
`ZP_05007982
`
`SCLAV_p1169
`
`ZP_05002948
`
`SCLAV_p1429
`
`ZP_06775944
`
`SCLAV_p0765
`
`ZP_05004575
`
`SCLAV_p0491
`
`ZP_06776351
`
`SCLAV_p1173
`
`ZP_06775752
`
`SCLAV_p0571
`
`ZP_06775755
`
`SCLAV_p0574
`
`ZP_06776164
`
`SCLAV_p0985
`
`ZP_06776363
`
`SCLAV_p1185
`
`Linaloolsynthased8
`
`ZP_05006242
`
`SCLAV_p0068
`
`ZP_05004823
`
`SCLAV_p0328
`
`ZP_06776581
`
`SCLAV_p1407
`
`ZP_05003209
`
`SCLAV_p0982
`
`ZP_05005402
`
`SCLAV_p0635
`
`ZP_06913794
`
`SCLAV_0159
`
`synthased7
`(þ)-T-Muurolol
`synthased7
`(-)-d-Cadinene
`
`synthased6
`1,8-Cineole
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`ATCC27064
`clavuligerus
`Streptomyces
`
` 82
`
` 251
`
` 84
`
` 249
` 62
` 70
` 71
` 77
` 253
` 67
`
` 83
`
` 249
`
`8.8010
`1.8010
`3.0010
`2.8010
`1.1010
`4.2010
`8.8010
`3.0010
`4.6010
`1.6010
`8.2010
`4.8010
`AccessionnumberE-value
`
`ZP_07310844
`
`ZP_07309957
`
`ZP_06576746
`
`ZP_06575915
`
`YP_004922572
`
`YP_004926578
`
`YP_004925455
`
`YP_004926313
`
`YP_004926931
`
`NP_733742
`
`NP_629369
`
`NP_630182
`
`SSRG_02017
`
`synthase
`epi-Isozizaene
`
`SSRG_01130
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`SSFG_02456
`
`synthase
`epi-Isozizaene
`
`SSFG_01626
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`Sfla_1617
`
`Sfla_5667
`
`Sfla_4535
`
`Sfla_5399
`
`Sfla_6028
`
`synthasec
`2-MIB/2-MB
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`2-MIBsynthasec,d11SCO7700
`
`SCO5222
`
`synthased10
`epi-Isozizaene
`
`geosminsynthased9
`
`SCO6073
`
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`Tu¨4000
`griseoflavus
`Streptomyces
`
`ATCC14672
`ghanaensis
`Streptomyces
`
`ATCC33331
`flavogriseus
`Streptomyces
`
`coelicolorA3(2)
`Streptomyces
`
`proteinID
`Locustagor
`
`Typeofsynthase
`
`Order
`
`Classorphylum
`
`Microorganism
`
Table 7.1 Predicted type-A terpene synthases from genome-sequenced bacteria (cont'd)
`
`Continued
`
` 57
` 69
` 83
`
` 249
` 69
` 76
`
` 257
`
` 264
`
`1.2010
`3.3010
`7.8010
`4.9010
`2.8010
`6.2010
`2.5010
`2.6010
`
` 72
` 74
` 80
` 258
`
`4.6010
`1.3010
`9.2010
`3.0010
`
`ZP_07293917
`
`SSOG_01998
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`YP_001823591
`
`SGR_2079
`
`YP_001822781
`
`SGR_1269
`
`YP_001827577
`
`SGR_6065
`
`YP_001828351
`
`SGR_6839
`
`synthased12
`(þ)-Caryolan-1-ol
`2-MIBsynthasec
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`IFO13350
`Streptomycesgriseus
`
`ATCC53653
`hygroscopicus
`Streptomyces
`
`ZP_06526253
`
`SSPG_00143
`
`ZP_06533634
`
`SSPG_07524
`
`ZP_06528571
`
`SSPG_02461
`
`ZP_05522837
`
`SSPG_01553
`
`ZP_07293082
`
`SSOG_01163
`
`ZP_07300792
`
`SSOG_08875
`
`ZP_07296488
`
`SSOG_04571
`
`synthasec
`2-MIB/2-MB
`
`synthase
`epi-Isozizaene
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`lividansTK24
`Streptomyces
`
`synthasec
`2-MIB/2-MB
`
`geosminsynthase
`Germacradienol/
`
` 59
` 71
`
` 236
`
` 81
`
` 255
` 54
` 54
` 68
`
` 249
`
` 71
` 82
` 50
` 85
` 252
`
`4.9010
`1.7010
`6.1010
`2.8010
`8.2010
`4.7010
`3.4010
`2.3010
`1.4010
`1.5010
`2.6010
`1.7010
`2.3010
`1.7010
`AccessionnumberE-value
`
`YP_003487693
`
`ZP_06587258
`
`ZP_06582730
`
`ZP_06911744
`
`ZP_06913376
`
`ZP_06913794
`
`CCA53839
`
`CCA60397
`
`CCA53556
`
`ZP_06919672
`
`ZP_06920565
`
`YP_003492893
`
`YP_003493696
`
`YP_003486275
`
`SVEN_0552
`
`SVEN_7111
`
`synthasec
`2-MIB/2-MB
`
`SVEN_0269
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`10712
`venezuelaeATCC
`Streptomyces
`
`SSEG_08483
`
`SSEG_08185
`
`SCAB73741
`
`SCAB82161
`
`SCAB5041
`
`synthase
`epi-Isozizaene
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`ATCC29083
`Streptomycessviceus
`
`synthasec
`2-MIB/2-MB
`
`SCAB20121
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`87.22
`Streptomycesscabiei
`
`SSGG_05086
`
`SSGG_00557
`
`SSDG_02809
`
`SSDG_01120
`
`SSDG_03228
`
`proteinID
`Locustagor
`
`synthase
`Caryolan-1-ol
`
`Actinomycetales
`
`Actinobacteria
`
`geosminsynthase
`ActinomycetalesGermacradienol/
`
`Actinobacteria
`
`15998
`roseosporusNRR