0099-2240/03/$08.00+0 DOI: 10.1128/AEM.69.10.5983–5991.2003
`Copyright © 2003, American Society for Microbiology. All Rights Reserved.
`
`Vol. 69, No. 10
`
`Identification and Characterization of the CYP52 Family of Candida
`tropicalis ATCC 20336, Important for the Conversion of Fatty Acids
and Alkanes to α,ω-Dicarboxylic Acids
`David L. Craft,* Krishna M. Madduri,† Mark Eshoo,‡ and C. Ron Wilson
`Biotechnology Group, Cognis Corporation, Cincinnati, Ohio 45232
`
`Received 20 March 2003/Accepted 25 July 2003
`
Candida tropicalis ATCC 20336 excretes α,ω-dicarboxylic acids as a by-product when cultured on n-alkanes
or fatty acids as the carbon source. Previously, a β-oxidation-blocked derivative of ATCC 20336 was
constructed which showed a dramatic increase in the production of dicarboxylic acids. This paper describes the
next steps in strain improvement, which were directed toward the isolation and characterization of genes
encoding the ω-hydroxylase enzymes catalyzing the first step in the ω-oxidation pathway. Cytochrome P450
monooxygenase (CYP) and the accompanying NADPH cytochrome P450 reductase (NCP) constitute the
hydroxylase complex responsible for the first and rate-limiting step of ω-oxidation of n-alkanes and fatty acids.
Ten members of the alkane-inducible P450 gene family (CYP52) of C. tropicalis ATCC 20336 as well as the
accompanying NCP were cloned and sequenced. The 10 CYP genes represent four unique genes with their
putative alleles and two unique genes for which no allelic variant was identified. Of the 10 genes, CYP52A13 and
CYP52A14 showed the highest levels of mRNA induction, as determined by quantitative competitive reverse
transcription-PCR during fermentation with pure oleic fatty acid (27-fold increase), pure octadecane (32-fold
increase), and a mixed fatty acid feed, Emersol 267 (54-fold increase). The allelic pair CYP52A17 and CYP52A18
was also induced under all three conditions but to a lesser extent. Moderate induction of CYP52A12 was
observed. These results identify the CYP52 and NCP genes as being involved in α,ω-dicarboxylic acid
production by C. tropicalis and provide the foundation for biocatalyst improvement.
`
α,ω-Dicarboxylic acids (α,ω-diacids) are versatile chemical
`intermediates useful as raw materials for the preparation of
`perfumes, polymers, adhesives, and macrolide antibiotics. Cur-
`rently, only three diacids, nonanedioic (azelaic), decanedioic
`(sebacic), and dodecanedioic, are available at a quantity and
`cost acceptable for commercial applications. Long-chain diac-
`ids with more than 12 carbons offer potential advantages over
`shorter-chain diacids, but their limited commercial availability
`and high price have prevented widespread growth in many of
`these applications.
`Several species of yeasts belonging to the genus Candida
excrete α,ω-diacids as a by-product when cultured on n-alkanes
`or fatty acids as the carbon source (22). One such yeast, Can-
`dida tropicalis ATCC 20336, is the subject of this paper.
In Candida spp., n-alkanes and fatty acids are metabolized
by enzymes present in the ω-oxidation and β-oxidation pathways.
For alkanes, the first reaction occurs in the ω-oxidation
pathway with the formation of the corresponding alcohol. This
reaction is catalyzed by a cytochrome P450 hydroxylase complex,
which consists of a cytochrome P450 monooxygenase (the
CYP protein) and the accompanying NADPH cytochrome
P450 reductase (NCP). A terminal carboxy function is ultimately
formed from two additional oxidation steps, catalyzed
by a fatty alcohol oxidase and a fatty aldehyde dehydrogenase.
Fatty acid formed via the ω-oxidation pathway or introduced as
a substrate can be oxidized through the same ω-oxidation pathway
to the corresponding aliphatic α,ω-dicarboxylic acid. Finally,
the diacids as well as the fatty acid substrates can be
activated to the corresponding acyl-coenzyme A ester and then
metabolized through β-oxidation to yield energy, CO2, and
water. In Candida spp., the β-oxidation pathway is localized in
the peroxisomes. A necessary step to produce high yields of
aliphatic diacids from C. tropicalis ATCC 20336 was to eliminate
β-oxidation so as not to consume the desired product.
When β-oxidation was blocked by disrupting the POX4 and
POX5 genes, encoding acyl-coenzyme A oxidase, the enzyme
catalyzing the first step in the pathway, the result was a dramatic
increase in the production of diacids. The resultant
β-oxidation-blocked strain was designated C. tropicalis ATCC
20962 (13).

* Corresponding author. Mailing address: Biotechnology Group,
Bldg. 53, 4900 Este Ave., Cincinnati, OH 45232. Phone: (513) 482-2368.
Fax: (513) 482-2862. E-mail: craftdl@one.net.
† Present address: Dow AgroSciences LLC, Indianapolis, IN 46268.
‡ Present address: Buck Institute, Novato, CA 94945.

Once the β-oxidation-blocked strain was available, the next
step toward increasing diacid production was directed toward
improving the flow through the ω-oxidation pathway by
overexpressing the enzymes constituting the cytochrome P450
hydroxylase complex. Cytochromes P450 (P450s) are terminal
monooxygenases of a multicomponent enzyme system. They
constitute a superfamily of proteins which exist widely in nature,
having been isolated from a variety of organisms (9). An
alkane-inducible family of P450s, CYP52, has been described
in several different Candida species (11, 20). These P450s and
their corresponding reductase, NADPH cytochrome P450 reductase
(NCP), constitute the hydroxylase complex responsible
for the first and rate-limiting step of ω-oxidation of n-alkanes
and fatty acids (3, 4). In order to alleviate this rate-limiting step
in the conversion of n-alkanes and fatty acids to their corresponding
aliphatic diacids, it is necessary to increase the
`CYP52 protein responsible for the conversion. One method to
`effect the increase in CYP52 protein would be overexpression
`of a specific CYP52 gene.
Prior to overexpression, it is imperative to identify the
CYP52 enzymes responsible for the ω-oxidation of the target
n-alkanes and fatty acids. Quantitative competitive reverse
transcription-PCR (QC RT-PCR) was used to identify CYP52
genes induced by selected fatty acid feedstreams. The identity
of the CYP52 gene(s) involved in ω-oxidation and the level of
mRNA transcription for each CYP52 were determined with this
highly specific and sensitive procedure.
`This paper describes the isolation and characterization of 10
`members of the CYP52 gene family from Candida tropicalis
`ATCC 20336. A sensitive and specific QC RT-PCR was used to
`characterize the specific CYP52 mRNA transcription levels in
`C. tropicalis ATCC 20962 in response to oleic acid, octadecane,
`and a commercially available fatty acid feedstock. Transcrip-
`tional induction patterns identified specific CYP52 genes in-
`volved in the conversion of long-chain fatty acids and n-alkanes
(>12 carbons) into their corresponding dicarboxylic acid.
`The isolation, characterization, and identification of the spe-
`cific CYP52s from C. tropicalis ATCC 20336 will allow the
`development of commercially useful strains for the production
`of aliphatic dicarboxylic acids from n-alkane and fatty acid
`feedstocks.
`
`MATERIALS AND METHODS
`
`Strains. Candida tropicalis ATCC 20336, a natural isolate from an oil field, was
`the wild-type strain used to generate genomic DNA and genomic libraries and
`was the parent of C. tropicalis ATCC 20962 (ura3A/ura3B pox4A::ura3A/pox4B::
`ura3A pox5::ura3A/pox5::URA3A). C. tropicalis ATCC 20962 (initially designated
strain H5343), an acyl-coenzyme A oxidase-deficient strain blocked for β-oxidation,
was used for QC RT-PCR analysis of CYP and NCP gene expression.
`Substrate composition. Pure oleic acid had a composition of 99.45% oleic acid
`(C18:1), 0.14% C16:0 (palmitic acid), 0.20% C18:2 (linoleic acid), and 0.06% C20:1
`(eicosenoic acid). Pure octadecane was 99.73% octadecane with the remainder
`of the substrate consisting of unidentifiable isomers. Emersol 267 (E267) fatty
`acid feedstream had the following fatty acid composition: 2.4% C14:0, 0.7% C14:1,
`4.6% C16:0, 5.7% C16:1, 5.7% C17:1, 1.0% C18:0, 69.9% C18:1, 8.8% C18:2, 0.3%
`C18:3, and 0.9% C20:1.
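As a quick consistency check on the Emersol 267 composition listed above, the percentages can be tabulated and summed. The sketch below is only illustrative: the dictionary is transcribed from the text, and the helper names (`double_bonds`, `chain_length`, `summarize`) are hypothetical.

```python
# Emersol 267 fatty acid composition (percent), transcribed from the text.
EMERSOL_267 = {
    "C14:0": 2.4, "C14:1": 0.7, "C16:0": 4.6, "C16:1": 5.7, "C17:1": 5.7,
    "C18:0": 1.0, "C18:1": 69.9, "C18:2": 8.8, "C18:3": 0.3, "C20:1": 0.9,
}


def double_bonds(label):
    """Number of double bonds encoded in a label such as 'C18:1'."""
    return int(label.split(":")[1])


def chain_length(label):
    """Carbon chain length encoded in a label such as 'C18:1'."""
    return int(label.split(":")[0][1:])


def summarize(composition):
    """Return the total, the saturated share, and the C18 share (percent)."""
    total = sum(composition.values())
    saturated = sum(p for k, p in composition.items() if double_bonds(k) == 0)
    c18 = sum(p for k, p in composition.items() if chain_length(k) == 18)
    return {
        "total": round(total, 2),
        "saturated": round(saturated, 2),
        "c18": round(c18, 2),
    }


SUMMARY = summarize(EMERSOL_267)
print(SUMMARY)
```

The listed components account for the whole feed, and C18 species make up most of it, which is consistent with the description of Emersol 267 as a mixed, oleic-rich feedstream.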
`Genomic DNA preparation. Genomic DNA from C. tropicalis was prepared
`according to the standard protocol as defined in Current Protocols in Molecular
Biology (2), with the following modifications. For spheroplasting, 50 µl of a
10-µg/ml zymolase solution was added to the sorbitol mixture, and the cell
`suspension was incubated at 37°C for 1.5 h on a rotary shaker (200 rpm). In
`addition, the DNA concentration was determined by the ratio of the absorbance
`at 260 nm to that at 280 nm (A260/280).
`PCR strategy to isolate the CYP52 and NCP genes. CYP52 proteins are highly
`similar and contain several regions of amino acid conservation. Highly conserved
`regions within the CYP52 proteins include the I-helix and heme-binding (HR2)
`regions (6). These regions were used as a basis for the design of degenerate CYP
`PCR primers. In a similar manner, NCP proteins contain several regions of
`amino acid conservation, including flavin mononucleotide binding regions 1
`(FMN1) and 2 (FMN2), the flavin adenine dinucleotide (FAD) region, and the
`NAD phosphate region (NADPH) (23). With C. tropicalis ATCC 20336 genomic
`DNA as a template, PCR amplification was performed with degenerate primers
`for either CYP52 or NCP. PCRs were performed in a Perkin Elmer Thermocy-
`cler or a Perkin Elmer 2400 with the AmpliTaq Gold enzyme (Perkin Elmer
`Cetus) kit according to the manufacturer’s specifications. PCRs were analyzed by
`gel electrophoresis, and products of the predicted sizes were isolated and se-
`quenced. Analysis of the DNA sequence of these PCR products identified novel
`CYP52 and NCP sequences, which were used as probes to screen a C. tropicalis
`ATCC 20336 genomic library.
`Construction of Candida tropicalis ATCC 20336 genomic libraries. Over the
`course of this study, three genomic libraries of C. tropicalis ATCC 20336 were
constructed. The first library was prepared for Cognis by Clontech Laboratories.
The second and third libraries were constructed at Cognis Corporation with
`the ZAP Express vector (Stratagene). C. tropicalis ATCC 20336 genomic DNA
`was partially digested with Sau3A1, and fragments in the range of 6 to 12 kb were
`purified from an agarose gel after electrophoresis of the digested DNA. These
`DNA fragments were then ligated to BamHI-digested ZAP Express vector arms
`and processed according to the manufacturer’s recommendation.
`Screening genomic libraries. Clontech genomic library filters were generated
`according to Clontech recommendations. Colony/Plaque Screen Hybridization
`Transfer Membrane disks (DuPont NEN Research) were used for lifting bacte-
`rial colonies representing the Clontech library. Additional treatment of mem-
`branes was as described in the protocol provided by NEN Research Products.
`Membranes were dried overnight before hybridizing to oligonucleotide probes
prepared with a nonradioactive enhanced chemiluminescence (ECL) 3′ oligolabeling
and detection system (Amersham Life Sciences). DNA labeling, prehybridization,
hybridization, and detection were performed according to the manufacturer’s
protocols. The hybridization signal was detected with Hyperfilm ECL
`(Amersham). Membranes were aligned to plates containing bacterial colonies
`from which colony lifts were performed, and colonies corresponding to positive
`signals on X-ray were then isolated and propagated in Luria-Bertani broth.
`Plasmid DNAs were isolated from these cultures and analyzed by restriction
`enzyme digestions and by DNA sequencing.
The ZAP Express genomic library was screened by plating a phage suspension
containing ≈2.5 × 10⁴ amplified lambda phage according to the manufacturer’s
recommendation. Magna Lift nylon membranes (Micron Separations,
`Inc.) were used to prepare library filters.
`DNA fragments used as probes were purified from agarose gels with a QIAEX
`II gel extraction kit (Qiagen Inc.) according to the manufacturer’s protocol and
`labeled with an ECL direct nucleic acid labeling kit (Amersham). The mem-
`branes were prehybridized and hybridized with the conditions described for the
ECL protocol. Labeled DNA (5 to 10 ng per ml of hybridization solution) was
added to the prehybridized membranes, and the hybridization was allowed to
proceed overnight. The following day, membranes were washed twice at 42°C for
20 min in a buffer containing either 0.1× SSC (1× SSC is 0.15 M NaCl plus 0.015
M sodium citrate) (high stringency) or 0.5× SSC (low stringency) plus 0.4%
sodium dodecyl sulfate and 360 g of urea per liter. The two washes were
performed with shaking (60 rpm) and in a final buffer volume equivalent to 2 ml/cm²
of membrane. This was followed by two 5-min washes in 2× SSC at room
temperature in a final volume equivalent to 2 ml/cm² of membrane.
`Hybridization signals were generated with the ECL nucleic acid detection
`reagent and detected with Hyperfilm ECL (Amersham). Positive plaques were
`screened secondarily, by similar methods, in order to isolate individual plaques.
`To convert the ZAP Express plaques to plasmid form, Escherichia coli strains
XL1Blue-MRF′ and XLOR were used. The conversion was performed according
`to Stratagene’s protocols for single-plaque excision.
`Plasmid DNA isolation. Plasmid DNA was isolated from E. coli cultures with
`the Qiagen plasmid isolation kit (Qiagen Inc.) according to the manufacturer’s
`protocols.
`DNA sequencing and analysis. DNA sequencing was performed at Sequetech
`Corporation (Mountain View, Calif.). DNA sequences were analyzed with the
`MacVector 7.1 and GeneWorks software packages (Oxford Molecular Group).
`Phylogenetic analysis was conducted with MacVector 7.1.1. The CYP52 gene
`sequences were submitted to David Nelson at the University of Tennessee at
Memphis and assigned official cytochrome P450 designations based on established
criteria set forth by the P450 nomenclature committee
(http://drnelson.utmem.edu/CytochromeP450.html).
`Gene induction studies. Fermentations were performed according to an es-
`tablished protocol (1). Briefly, a semisynthetic growth medium containing, per
liter, 75 g of glucose (anhydrous), 6.7 g of yeast nitrogen base (Difco Laboratories),
3 g of yeast extract, 3 g of ammonium sulfate, 2 g of monopotassium
phosphate, and 0.5 g of sodium chloride and with a final pH of ≈5.2 was used for
`each fermentation. The fermentor was inoculated with 5 to 10% of an overnight
`culture of C. tropicalis ATCC 20962 and either pure oleic acid, pure octadecane,
`or Emersol 267 and a glucose cosubstrate feed were added in a fed-batch mode
`beginning near the end of exponential growth. It has been suggested that catab-
`olite repression can affect the expression of genes involved in alkane utilization
`from yeasts (8); therefore, the glucose feed was controlled so as not to allow
`glucose accumulation in the fermentation media. Caustic was added to maintain
`the pH in the desired range. Samples for gene induction studies were collected
`just prior to starting the fatty acid feed (time zero) and over the first 4 h of
`bioconversion.
`Cellular RNA was isolated with the Qiagen RNeasy mini kit (Qiagen Inc.). A
`2-ml sample of C. tropicalis culture was collected from the fermentor in a
`standard 2-ml screw-cap tube. Cell samples were immediately frozen in a dry
ice-alcohol bath. To isolate total RNA from the samples, the tubes were allowed
to thaw on ice, and the cells were pelleted by centrifugation at 11,000 × g in a
microcentrifuge for 5 min at 4°C, and the supernatant was discarded while
keeping the pellet cold. The microcentrifuge tubes were filled half full with
ice-cold zirconia-silica 0.5-mm-diameter beads (Biospec Products), and each tube
was filled to the top with ice-cold RLT lysis buffer. Cell rupture was achieved by
placing the samples in a mini bead beater (Biospec Products) and immediately
homogenizing at full speed for 2.5 min. The samples were allowed to cool on ice
for 1 min, and the homogenization-cooling process was repeated twice. The
homogenized samples were microcentrifuged at 11,000 × g for 10 min, and 700
µl of the supernatant was removed and transferred to a new Eppendorf tube.
Ethanol (700 µl of 70% ethanol) was added to each sample, followed by mixing
by inversion.
Each sample was transferred to a Qiagen RNeasy spin column and processed
according to the manufacturer’s protocols. RNase-free water (100 µl) was added
to the column, followed by centrifugation at 8,000 × g for 15 s to elute the RNA.
An additional 75 µl of RNase-free water was added to the column, followed by
centrifugation at 8,000 × g for 2 min. In order to remove contaminating DNA,
20 µl of 10× DNase I buffer (0.5 M Tris [pH 7.5], 50 mM CaCl2, 100 mM
MgCl2), 10 µl of RNase-free DNase I (Ambion), and 40 units of RNasin (Promega
Corporation) were added to the RNA sample. The mixture was then
incubated at 37°C for 30 min. Samples were placed on ice, and 250 µl of lysis
buffer RLT (Qiagen) and 250 µl of 100% ethanol were added. The samples were
transferred to Qiagen RNeasy spin columns and processed according to Qiagen
protocols, beginning with the RPE wash buffer step. One hundred microliters of
RNase-free water was added, followed by centrifugation at 8,000 × g for 15 s.
Residual RNA was collected by adding an additional 50 µl of RNase-free water
to the spin column, followed by centrifugation at 11,000 × g for 2 min. Ten
microliters of the RNA preparation was removed and quantified by the A260/280
method. RNA was stored at −70°C.
`QC RT-PCR protocol. Unique primers directed to variable regions within
`target members of the CYP52 gene family were constructed to be specific enough
`to anneal to the variable region of the target CYP52 gene and its allele without
`annealing to other nontarget members of the CYP52 family. After conducting
`PCR with the specific primers for that target CYP52 gene, the reaction product
`was checked to ensure it represented the unique target gene product or its
`presumed allelic variant. If not, the reaction conditions were altered in terms of
`stringency to focus the reaction to the desired CYP52 target.
`The competitor DNA template was designed and synthesized as follows. The
`competitor RNA was synthesized in vitro from a competitor DNA template that
`has the T7 polymerase promoter and carries a small deletion of about 10 to 25
`nucleotides relative to the native target RNA sequence. In each case, the forward
primer contains the T7 promoter consensus sequence
GGATCCTAATACGACTCACTATAGGGAGG fused to the respective target gene primer. The
reverse primer contains the sequence of the original primer followed by 20 bases
`of upstream target sequence creating a deletion of about 10% of the total
`product length (ca. 20 bp) between the primer sequence and the upstream target
`sequence. The forward primer was used with the corresponding reverse primer to
`synthesize the competitor DNA template. The primer pairs were combined in a
`standard Taq Gold polymerase PCR according to the manufacturer’s recom-
`mended conditions (Perkin-Elmer). The PCR mix contained a 250 nM final
`concentration of each primer and 10 ng of C. tropicalis chromosomal DNA for
`the template. The reaction mixture was placed in a thermocycler for 25 to 35
`cycles with the highest annealing temperature possible during the PCRs to
`ensure a homogeneous PCR product. The PCR products were either gel purified
`or filter purified to remove unincorporated nucleotides and primers. The com-
`petitor template DNA was then quantified with the A260/280 method.
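The primer layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' design procedure: the 20-nucleotide primer length, the random 200-nucleotide stand-in for a native amplicon, and the function name `design_competitor` are all assumptions; only the T7 promoter sequence, the 20-base upstream anchor, and the roughly 20-bp deletion come from the text.

```python
import random

# T7 promoter sequence as quoted in the text.
T7_PROMOTER = "GGATCCTAATACGACTCACTATAGGGAGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_competitor(target, primer_len=20, anchor_len=20, deletion=20):
    """Build competitor primers for a native amplicon (sense strand, 5'->3').

    The composite reverse primer is the original reverse primer followed by
    the reverse complement of an anchor lying `deletion` bases upstream of
    the reverse-primer site, so the amplified template lacks those bases.
    """
    anchor_end = len(target) - primer_len - deletion
    anchor_start = anchor_end - anchor_len
    if anchor_start < primer_len:
        raise ValueError("target too short for the requested primer layout")
    fwd_site = target[:primer_len]
    rev_site = target[-primer_len:]          # sense-strand site of the reverse primer
    anchor = target[anchor_start:anchor_end]
    forward_primer = T7_PROMOTER + fwd_site
    reverse_primer = revcomp(rev_site) + revcomp(anchor)
    # Expected sense strand of the competitor DNA template.
    competitor = T7_PROMOTER + target[:anchor_end] + rev_site
    return forward_primer, reverse_primer, competitor


# Hypothetical 200-nt amplicon; a real design would use the CYP52 target region.
_rng = random.Random(0)
EXAMPLE_TARGET = "".join(_rng.choice("ACGT") for _ in range(200))
FORWARD, REVERSE, COMPETITOR = design_competitor(EXAMPLE_TARGET)
print(len(EXAMPLE_TARGET), len(COMPETITOR) - len(T7_PROMOTER))
```

With a 200-nt amplicon and a 20-base deletion, the competitor insert is 10% shorter than the native product, which matches the proportion quoted in the text and is what allows the two products to be resolved after amplification.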
`Competitor template DNA was transcribed in vitro to make the competitor
`RNA with the Megascript T7 kit (Ambion). Competitor DNA (250 ng) template
`and the in vitro transcription reagents were mixed according to the directions
`provided by the manufacturer. The reaction mixture was incubated for 4 h at
`37°C. The resulting RNA preparations were then checked by gel electrophoresis.
`The DNA template was then removed with DNase I as suggested by the man-
`ufacturer. The RNA competitor was then quantified by the A260/280 method.
Serial dilutions of the RNA (1 ng/µl to 1 fg/µl) were made for use in the QC
`RT-PCRs.
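The dilution range can be enumerated explicitly. The sketch below assumes tenfold steps, which the text does not state, and uses hypothetical helper names; concentrations are held as integers in fg per microliter to avoid rounding.

```python
def tenfold_series(start_fg_per_ul, end_fg_per_ul):
    """List concentrations from start down to end in tenfold steps (assumed)."""
    series = []
    conc = start_fg_per_ul
    while conc >= end_fg_per_ul:
        series.append(conc)
        conc //= 10
    return series


def format_conc(fg_per_ul):
    """Render an integer fg/ul value with a convenient mass unit."""
    if fg_per_ul >= 1_000_000:
        return f"{fg_per_ul // 1_000_000} ng/ul"
    if fg_per_ul >= 1_000:
        return f"{fg_per_ul // 1_000} pg/ul"
    return f"{fg_per_ul} fg/ul"


# 1 ng/ul down to 1 fg/ul, expressed in fg/ul.
FULL_SERIES = tenfold_series(1_000_000, 1)
# Subset spanning 100 pg to 100 fg per microliter.
WORKING_SERIES = [c for c in FULL_SERIES if 100 <= c <= 100_000]
print([format_conc(c) for c in FULL_SERIES])
```

Under the tenfold assumption the full range gives seven stock dilutions, and the 100 pg to 100 fg window used for quantitation corresponds to four of them.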
QC RT-PCRs. QC RT-PCRs were performed with rTth polymerase (Perkin-Elmer)
according to the manufacturer’s recommended conditions. The reverse
transcription reaction was performed in a 10-µl volume with final concentrations
of 200 µM each deoxynucleoside triphosphate, 1.25 units of rTth polymerase, 1.0
mM MnCl2, 1× buffer, 100 ng of total RNA isolated from a fermentor-grown
culture of C. tropicalis, and 1.25 µM appropriate reverse primer. To quantitate
CYP52 expression in C. tropicalis, an appropriate target gene reverse primer was
used. Several reaction mixes were prepared for each RNA sample characterized.
To quantitate CYP52 expression, a series of previously described QC RT-PCR
mixes were aliquoted to different reaction tubes. To each tube, 1 µl of a serial
dilution containing from 100 pg to 100 fg of target gene competitor RNA per µl
was added, bringing the final reaction mixtures up to the final volume of 10 µl.
The QC RT-PCR mixtures were mixed and incubated at 70°C for 15 min
according to the manufacturer’s recommended times for reverse transcription.
After incubation, the sample temperature was reduced to 4°C to stop the reaction,
and 40 µl of the PCR mix was added to the reaction to bring the total
volume up to 50 µl. The PCR mix consists of an aqueous solution containing
0.3125 µM target gene forward primer, 3.125 mM MgCl2, and 1× chelating
buffer. The reaction mixtures were placed in a Perkin-Elmer GeneAmp PCR
System 2400 thermocycler, and the following PCR cycle was performed: 94°C for
1 min, followed by 94°C for 10 s, followed by 58 to 66°C for 40 s for a total of 17
to 22 cycles. The PCR was completed with a final incubation at 58 to 66°C for 2
min, followed by 4°C. In some reactions where no detectable PCR products were
produced, the samples were returned to the thermocycler for additional cycles,
and this process was repeated until enough PCR products were produced to
quantify by high-pressure liquid chromatography (HPLC). This QC RT-PCR
procedure was applied to all the target genes with the primers indicated.
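The two-stage setup (a 10-µl reverse transcription reaction topped up with 40 µl of PCR mix) implies final concentrations that follow from simple dilution, C_final = C_stock × V_added / V_final. The sketch below works them out, assuming the primer and dNTP concentrations are micromolar (the unit prefix is lost in some copies of the text); the function and variable names are illustrative.

```python
RT_VOLUME_UL = 10.0      # reverse transcription reaction volume
PCR_MIX_UL = 40.0        # PCR mix added after reverse transcription
FINAL_VOLUME_UL = RT_VOLUME_UL + PCR_MIX_UL


def final_concentration(stock_conc, added_volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Concentration after diluting `added_volume_ul` of stock to the final volume."""
    return stock_conc * added_volume_ul / final_volume_ul


FINAL = {
    # Components carried over from the 10-ul RT reaction.
    "reverse primer (uM)": final_concentration(1.25, RT_VOLUME_UL),
    "dNTP, each (uM)": final_concentration(200.0, RT_VOLUME_UL),
    "MnCl2 (mM)": final_concentration(1.0, RT_VOLUME_UL),
    # Components supplied by the 40-ul PCR mix.
    "forward primer (uM)": final_concentration(0.3125, PCR_MIX_UL),
    "MgCl2 (mM)": final_concentration(3.125, PCR_MIX_UL),
}

for name, value in FINAL.items():
    print(f"{name}: {value:g}")
```

Under these assumptions both primers end up at the same final concentration in the 50-µl PCR, equal to the 250 nM used in the PCR that generated the competitor DNA template, which suggests the stock concentrations were chosen to balance the primer pair.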
Upon completion of the QC RT-PCRs, the samples were analyzed and quantitated
by HPLC. From 5 to 15 µl of the QC RT-PCR mix was injected into a
`Waters Bio-Compatible 625 HPLC with an attached Waters 484 tunable detec-
`tor. The detector was set to measure a wavelength of 254 nm. The HPLC
`contained a DNASep column (Sarasep, Inc.) which was placed within the oven
`at 52°C. The column was installed according to the manufacturer’s recommen-
`dation of having 30 cm of heated polyether ether ketone tubing installed between
`the injector and the column. The system was configured with a Sarasep brand
guard column positioned before the injector. In addition, there was a 0.22-µm
filter disk just before the column, within the oven.
`Two buffers were used to create an elution gradient to resolve and quantitate
`the PCR products from the QC RT-PCRs. Buffer A consists of 0.1 M triethyl-
`ammonium acetate and 5% acetonitrile (vol/vol). Buffer B consists of 0.1 M
`triethylammonium acetate and 25% acetonitrile (vol/vol). The QC RT-PCR
`samples were injected into the HPLC, and a linear gradient of 75% buffer
`A–25% buffer B to 45% buffer A–55% buffer B was run over 6 min at a flow rate
of 0.85 ml min⁻¹. The amount of each QC RT-PCR product was plotted and
quantitated with an attached Waters Corporation 745 data module. The log
ratios of the amount of QC RT-PCR mRNA product to competitor QC RT-PCR
product, as measured by peak areas, were plotted, and the amount of competitor
RNA required to equal the amount of mRNA product was determined. In the
`case of each of the target genes, the competitor RNA contained fewer base pairs
`than the native target mRNA, and therefore the competitor PCR product eluted
`before the native PCR product.
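The equivalence-point calculation described above can be sketched as a least-squares fit of log10(target area / competitor area) against log10(competitor RNA added), solved for the point where the log ratio is zero. This is a generic illustration under stated assumptions, not the authors' data module: the peak areas are invented, no correction is made for the slightly shorter competitor product, and the function names are hypothetical.

```python
import math


def equivalence_point(competitor_amounts, target_areas, competitor_areas):
    """Competitor amount at which target and competitor products are equal.

    Fits log10(target/competitor peak area) versus log10(competitor amount)
    by ordinary least squares and returns the amount where the fit crosses 0.
    """
    xs = [math.log10(c) for c in competitor_amounts]
    ys = [math.log10(t / c) for t, c in zip(target_areas, competitor_areas)]
    n = len(xs)
    if n < 2:
        raise ValueError("at least two dilutions are needed")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (-intercept / slope)


def fold_induction(amount_at_t, amount_at_zero):
    """Ratio of target mRNA at a time point to that at time zero."""
    return amount_at_t / amount_at_zero


# Invented, idealized example: competitor spiked at 100, 10, 1, and 0.1 pg.
COMPETITOR_PG = [100.0, 10.0, 1.0, 0.1]
TARGET_AREAS = [1000.0, 1000.0, 1000.0, 1000.0]
COMPETITOR_AREAS = [10000.0, 1000.0, 100.0, 10.0]
EQUIVALENCE_PG = equivalence_point(COMPETITOR_PG, TARGET_AREAS, COMPETITOR_AREAS)
print(EQUIVALENCE_PG)
```

Repeating the calculation for each time point and dividing by the time-zero value (`fold_induction`) gives the kind of fold-increase figure reported for the induced CYP52 genes.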
`In addition to the labor-intensive HPLC method for QC RT-PCR quantifica-
`tion, agarose gel electrophoresis of the QC RT-PCRs was employed as an
`alternative method for quantification (15). Similar levels of gene induction were
`obtained when the HPLC and 4% agarose gel analysis methods were compared
`with identical QC RT-PCRs.
`Nucleotide sequence accession numbers. GenBank accession numbers have
`been obtained for CYP52A12 (AY230498), CYP52A13 (AY230499), CYP52A14
`(AY230500), CYP52A15 (AY230501), CYP52A16 (AY230502), CYP52A17
`(AY230504), CYP52A18 (AY230505), CYP52A19 (AY230506), CYP52A20
`(AY230507), and CYP52D2 (AY230503). Sequences used for phylogenetic anal-
`ysis included Saccharomyces cerevisiae CYP51 (M18109) and C. tropicalis CYP51
`(M23673), CYP52C2 (D12718), CYP52C1 (Z13014), CYP52A3 (D12475),
`CYP52A6 (Z13010), CYP52A4 (X51932), CYP52A1 (M15945), CYP52A5
`(D12714), CYP52A2 (M63258), CYP52A7 (Z13011), CYP52A9 (D26160),
`CYP52A8 (Z13012), CYP52A10 (D12719), CYP52A11 (D26159), CYP52D1
`(D12716), CYP52B1 (Z13013), CYP52E1 (X76225), and CYP52E2 (X87640).
`
`RESULTS
`Cloning and characterization of C. tropicalis 20336 cyto-
`chrome P450 monooxygenase (CYP) and cytochrome P450
`NADPH oxidoreductase (NCP) genes. To clone the CYP52 and
`NCP genes, several different strategies were employed. Avail-
`able CYP52 amino acid sequences were aligned, and regions of
`similarity were observed. These regions corresponded to de-
`scribed conserved regions seen in other cytochrome P450 fam-
`ilies (5, 7). Proteins from eight eukaryotic cytochrome P450
`families share a segmented region of sequence similarity. One
`region corresponded to the HR2 domain, containing the in-
`variant cysteine residue near the carboxyl terminus which is
`required for heme binding, while the other region corre-
`sponded to the central region of the I helix, thought to be
`involved in substrate recognition (6).
`Degenerate oligonucleotide primers corresponding to these
`highly conserved regions of the CYP52 gene family present in
`Candida maltosa and Candida tropicalis ATCC 750 were de-
`signed and used to amplify DNA fragments of CYP genes from
`C. tropicalis ATCC 20336 genomic DNA. These discrete PCR
`fragments were then used as probes to isolate full-length CYP
`genes from the C. tropicalis ATCC 20336 genomic libraries. In
`a few instances oligonucleotide primers corresponding to
`highly conserved regions were directly used as probes to isolate
`full-length CYP genes from genomic libraries. In the case of
`NCP, a heterologous probe based upon the known DNA se-
`quence for the NCP gene from C. tropicalis 750 was used to
`isolate the C. tropicalis ATCC 20336 NCP genes.
`Cloning of the NCP gene from C. tropicalis ATCC 20336. The
`first genomic library was screened to isolate a full-length NCP
`gene, and three putative NCP clones were obtained. The three
`clones were determined to be truncated for an NCP open
`reading frame; however, they were shown to overlap, and a
`complete NCPA sequence was identified. The NCPA is 4,206
`nucleotides and includes a regulatory region and a protein
`coding region which is 2,037 bp in length. NCPA encodes a
`putative protein of 679 amino acids that shows extensive ho-
`mology to NCP proteins from C. tropicalis 750 and C. maltosa.
`To clone the second NCP allele, a genomic library was
`screened with DNA fragments from the NCPA truncated
`clones. Five clones were obtained, and these were sequenced
`with the three internal primers used to sequence NCPA. Se-
`quence analysis suggested that four of these clones contained
`inserts which were identical to the NCPA allele isolated earlier.
`All four contained a full-length NCPA gene. The fifth clone
`was very similar to the NCPA allele, especially in the open
reading frame region, where the identity was very high. However,
there were significant differences in the 5′ and 3′ untranslated
regions. This suggested that the fifth clone was the allele
`to NCPA. A 4.14-kb region of this plasmid was sequenced, and
`the analysis of this sequence confirmed the presence of the
`NCPB allele. NCPB encodes a putative protein of 679 amino
`acids.
`Cloning of CYP52A13, CYP52A15, CYP52A16, CYP52A17,
`and CYP52A18. Clones carrying five CYP52 genes were isolated
`from a genomic library with an oligonucleotide probe whose
`sequence was based upon the amino acid sequence for the
`highly conserved heme binding region (HR2) present through-
`out the CYP52 family. Based upon DNA sequence analysis,
`three of the CYP genes appeared unique, while the remaining
`two were designated alleles. These five genes were designated
`CYP52A13, CYP52A15, CYP52A16, CYP52A17, and CYP52A18.
`Cloning of CYP52A12 and CYP52A19. CYP52A12 and
`CYP52A19 were isolated from a genomic library with PCR
`fragments as probes. The PCR probe for CYP52A12 was gen-
`erated with oligonucleotide primers designed to amplify a re-
`gion from the helix I region to the HR2 region. Primers were
designed with all CYP52 gene sequences available
from the National Center for Biotechnology Information. Degenerate
forward primers were designed based upon an amino
`acid sequence (RDTTAG) from the helix I region. Degenerate
`reverse primers were designed based upon an amino acid se-
`quence (GQQFAL) from the HR2 region. A PCR product of
`approximately 450 bp was obtained. This product was ligated
`to the pTAG vector (R&D Systems). Plasmids from several
`transformants were isolated, and their inserts were character-
`ized. One plasmid contained the PCR clone intact. The DNA
`sequence of the PCR fragment shared homology with the
`DNA sequences for the CYP52A1 gene of C. maltosa and the
`CYP52A3 gene of C. tropicalis 750. When the genomic library
`was screened with the 450-bp PCR product as a probe, a clone
`that contained the full-length CYP52A12 gene was isolated.
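To illustrate how a conserved peptide such as RDTTAG or GQQFAL translates into a degenerate oligonucleotide, the sketch below back-translates a peptide with the standard genetic code and IUPAC ambiguity symbols. It is an illustration only: the actual primer sequences are not given in the text, the helper names are hypothetical, and the standard code is an assumption (C. tropicalis decodes CUG as serine rather than leucine, which a real design for this organism would need to take into account).

```python
from itertools import product
from math import prod

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC ambiguity symbols keyed by the alphabetically sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def codons_for(amino_acid):
    """All standard-code codons encoding a one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def degenerate_codon(amino_acid):
    """Collapse the codons of one amino acid into a three-symbol IUPAC codon."""
    codons = codons_for(amino_acid)
    return "".join(
        IUPAC["".join(sorted({codon[i] for codon in codons}))] for i in range(3)
    )


def degenerate_dna(peptide):
    """Sense-strand degenerate sequence covering every codon of the peptide."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def degeneracy(peptide):
    """Number of distinct coding sequences for the peptide.

    The collapsed IUPAC sequence can describe a larger pool than this count
    when an amino acid has codons in two blocks (for example R or L).
    """
    return prod(len(codons_for(aa)) for aa in peptide)


for motif in ("RDTTAG", "GQQFAL"):
    print(motif, degenerate_dna(motif), degeneracy(motif))
```

A reverse primer would be synthesized as the reverse complement of the HR2-derived sequence; the degeneracy count gives a rough sense of why short, highly conserved motifs with few six-codon residues are preferred for this kind of primer.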
`A similar approach was taken to clone CYP52A19. The de-
`sign of the forward primer was based upon a sequence con-
`served near the N terminus of the CYP52A3, CYP52A2,
`CYP52A17, and CYP52A18 genes from C. tropicalis. The re-
`verse primer was designed based on the highly conserved
heme-binding region. Amplification of C. tropicalis ATCC
20336 genomic DNA with these two primers gave a mixed PCR
`product. When this PCR product was used to screen a genomic
`library, one clone was identified that contained a full-length
CYP52 gene along with 5′- and 3′-flanking sequences. This
`gene was designated CYP52A19.
`Cloning of CYP52D2. Screening the genomic library with an
`HR2 degenerate probe yielded a clone that contained a trun-
`cated CYP52 gene. A 1.3- to 1.5-kb EcoRI-SstI fragment con-
`taining part of the truncated CYP52 gene was isolated and used
`as a probe to screen the genomic library for a full-length
`CYP52 gene. One clone containing a full-length CYP gene with
extensive 5′- and 3′-flanking sequences was isolated and sequenced.
This gene was designated CYP52D2.
`Cloning of CYP52A14 and CYP52A20. A mixed probe con-
`taining CYP52A12, CYP52A13, CYP52A15, CYP52D2,
`CYP52A17, and CYP52A19 sequences was used to screen the
`genomic library, and putative positive clones were identified.
`Seven clones were sequenced with degenerate primers de-
`signed from highly conserved regions of the four CYP52 sub-
`families, namely CYP52A, B, C, and D. The complete DNA
`sequence, including regulatory and protein coding regions, of
`the 10 clones was determined. Two unique CYP52 genes were
`identified and were designated CYP52A14 and CYP52A20.
`Phylogenetic analysis of Candida CYP52 proteins. An anal-
`ysis of the 10 CYP52 deduced amino acid sequences isolated
`from C. tropicalis ATCC 20336 was conducted with the neigh-
`bor-joining method in MacVector 7.1.1 (16). In addition, the
`deduced amino acid sequences for other currently available
`CYP52 family sequences from C. maltosa, C. tropicalis ATCC
`750, and C. apicola were analyzed for their phylogenetic rela-
`tionship to the CYP52 sequences of C. tropicalis ATCC 20336.
`The CYP51 deduced amino acid sequences from Saccharomy-
`ces cerevisiae and C. tropicalis were used as a reference for the
`analysis. The resulting phylogenetic tree is presented in Fig. 1.
`The first observation from the analysis indicates that of the
`10 CYP52 sequences isolated from C. tropicalis ATCC 20336,
`four appear to be unique with their corresponding allelic vari-
`ant. Allelic variants of cytochromes P450 generally have less
`than 3% dissimilarity at the amino acid level; however, differ-
`ences in protein specificity and analy