`
`doi: 10.1093/femsyr/fow064
`Advance Access Publication Date: 3 August 2016
`Minireview
`
`Evolutionary genomics of yeast pathogens in the
`Saccharomycotina
Toni Gabaldón1,2,3,∗, Miguel A. Naranjo-Ortíz1,2 and
Marina Marcet-Houben1,2
`
`1Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute
`of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain, 2Universitat Pompeu Fabra (UPF), 08003
Barcelona, Spain and 3Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010
`Barcelona, Spain
`∗Corresponding author: Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and
`Technology, Dr. Aiguader 88, 08003 Barcelona, Spain. Tel: +34-933160281; E-mail: tgabaldon@crg.es
`One sentence summary: The growing availability of genomic information from the increasingly relevant Candida pathogens is helping us to unravel
`their recent evolution.
`Editor: Carol Munro
`
`ABSTRACT
`
`Saccharomycotina comprises a diverse group of yeasts that includes numerous species of industrial or clinical relevance.
`Opportunistic pathogens within this clade are often assigned to the genus Candida but belong to phylogenetically distant
`lineages that also comprise non-pathogenic species. This indicates that the ability to infect humans has evolved
`independently several times among Saccharomycotina. Although the mechanisms of infection of the main groups of
`Candida pathogens are starting to be unveiled, we still lack sufficient understanding of the evolutionary paths that led to a
`virulent phenotype in each of the pathogenic lineages. Deciphering what genomic changes underlie the evolutionary
`emergence of a virulence trait will not only aid the discovery of novel virulence mechanisms but it will also provide
`valuable information to understand how new pathogens emerge, and what clades may pose a future danger. Here we
`review recent comparative genomics efforts that have revealed possible evolutionary paths to pathogenesis in different
`lineages, focusing on the main three agents of candidiasis worldwide: Candida albicans, C. parapsilosis and C. glabrata. We
`will discuss what genomic traits may facilitate the emergence of virulence, and focus on two different genome evolution
`mechanisms able to generate drastic phenotypic changes and which have been associated to the emergence of virulence:
`gene family expansion and interspecies hybridization.
`
`Keywords: Candida; Saccharomycotina; pathogens; genomics; evolution
`
`INTRODUCTION
`
Saccharomycotina is a subphylum of the Ascomycota. It contains
a single class (Saccharomycetes), and is composed mostly
of yeasts that have naked asci, do not form fruiting bodies
and can reproduce asexually by budding (Kurtzman, Fell and
Boekhout 2011). Many budding yeast species are normal components
of the human microbiota or inhabit our immediate environment,
and some of them are traditionally used to ferment food
or beverages (e.g. the baker’s yeast Saccharomyces cerevisiae). In
`addition, some yeasts are considered opportunistic pathogens
`because they can cause disease under certain circumstances
`such as weakening of the immune system. Such opportunis-
`tic pathogenic yeasts constitute an important medical problem
`of increasing incidence. Indeed, yeast pathogens have become
`
`Received: 18 May 2016; Accepted: 2 August 2016
© FEMS 2016. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium,
provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
`
`
`LCY Biotechnology Holding, Inc.
`Ex. 1037
`Page 1 of 10
`
`
`
`FEMS Yeast Research, 2016, Vol. 16, No. 6
`
Table 1. List of 31 species reported as agents of candidiasis. They are roughly classified into three groups according to their clinical incidence.
Most incidence data come from a global study involving 141 hospitals from 41 countries over 10 years (1997–2007) (Pfaller et al. 2010). Species
marked with an asterisk were not listed in that study because they had not been discriminated as a different species, or because they have
been reported from few cases elsewhere. Names in brackets indicate the currently accepted names for each Candida species.

Incidence: Common agents of candidiasis (5%–70%)
Species: Candida albicans, C. glabrata, C. parapsilosis, C. tropicalis

Incidence: Rare but may be locally common (0.1%–3%)
Species: C. krusei (Pichia kudriavzevii), C. lusitaniae (Clavispora lusitaniae), C. orthopsilosis∗, C. metapsilosis∗,
C. guilliermondii (Meyerozyma guilliermondii), C. kefyr (Kluyveromyces marxianus), C. inconspicua, C. famata
(synonym: Debaryomyces hansenii), C. rugosa, C. dubliniensis, C. norvegensis (Pichia norvegensis),
C. nivariensis∗, C. bracarensis∗

Incidence: Rarely reported (<0.1% including single-case reports)
Species: C. pelliculosa (Wickerhamomyces anomalus), C. subhashii∗, C. lipolytica (Yarrowia lipolytica), C. sake,
C. apicola, C. zeylanoides, C. valida (Pichia membranifaciens), C. intermedia, C. pulcherrima (Metschnikowia
pulcherrima), C. haemulonii (C. haemulonis), C. stellatoidea, C. utilis (Cyberlindnera jadinii), C. humicola
(Asterotremella humicola), C. lambica (Pichia fermentans), C. ciferrii (Trichomonascus ciferrii), C. colliculosa
(Torulaspora delbrueckii), C. holmii (Kazachstania exigua), C. marina (Cryptococcus marinus), C. sphaerica
(Kluyveromyces lactis)

`
a major source of life-threatening nosocomial (i.e. hospital-acquired)
infections since the 1980s, which is in part explained
by recent medical progress. For instance, modern medical care
has increased the survival of people who are susceptible to
opportunistic yeast pathogens, such as premature neonates,
elderly people, immunocompromised patients, as well as patients
undergoing immunosuppressive chemotherapy. On the
other hand, reliance on catheters, broad-spectrum antibiotics
and surgery favors the spread and colonization
of pathogenic yeasts (Turner and Butler 2014). Candidiasis
is a medical term that refers to any superficial or invasive fungal
infection caused by any type of Candida yeast. Despite recent
advances, the mortality rates associated with invasive candidiasis
remain high at 30%–40%, and the treatment of such
infections is complicated by the appearance of resistance to
antifungals and the emergence of novel pathogenic species
(Papon et al. 2013). The Candida genus itself is a very complex
and heterogeneous taxon comprising over 160 anamorphic
(i.e. asexual reproductive stage) species, whose teleomorphs
(i.e. sexual reproductive forms) belong to at least 16 different
genera including Pichia, Debaryomyces or Saccharomyces (Guarro,
Gené and Stchigel 1999). Thus, the Candida genus is polyphyletic
and comprises species belonging to evolutionarily distant clades.
Given recent developments in fungal taxonomy, including
increased reliance on genomic data and the ‘one fungus, one
name’ rule (Hawksworth 2012), many Candida species are being
renamed and the whole genus will likely be completely
revamped in the coming years (Brandt and Lockhart 2012). Over
`30 Candida (or formerly named Candida) species have been iden-
`tified as etiological agents in candidiasis (Table 1). Although in-
`cidence may vary from region to region, four species generally
`account for over 95% of the cases: Candida albicans, C. glabrata,
`C. parapsilosis and C. tropicalis, generally in this order (Pfaller and
`Diekema 2007; Diekema et al. 2012). After these four main play-
`ers, other Candida species can be found as causative agents of
`candidiasis but with much lower incidence, followed by a grow-
`ing list of species that are rarely found in infections, including
`single-case reports (Table 1).
`As mentioned above, despite their common genus name
`and their shared ability to infect humans, Candida pathogenic
`species can belong to phylogenetically distinct clades, which
`also contain non-pathogenic relatives. For instance, C. glabrata
is a common pathogen that is closely related to the non-pathogenic
species Saccharomyces cerevisiae, and C. parapsilosis belongs
to a lineage clearly distinct from that of C. albicans and is more closely
related to the non-pathogenic Lodderomyces elongisporus. Similarly,
although C. tropicalis is not distantly related to C. albicans,
it is sister to non-pathogenic species such as C. sojae. This indicates
that the emergence of pathogenesis towards humans has
occurred several times independently. Importantly, it is as yet
`poorly understood whether these different pathogenic clades
`may exploit common infection strategies or use lineage-specific
`mechanisms, or a combination of both. In addition, within each
`pathogenic clade, there is a large variability of virulence phe-
`notypes across different strains (e.g. hyper and hypovirulent
`strains), and even some pathogenic strains have been described
`for generally non-pathogenic species such as S. cerevisiae (Anoop
`et al. 2015). For many of the pathogenic species, however, it is
`unclear whether the human body constitutes their major eco-
`logical niche or whether they are adapted to a variety of envi-
`ronments. In fact, it is remarkable how little we know about the
`ecology of most yeast species. Ultimately, differences in viru-
`lence across species and strains are related to changes at the
`genomic level, and thus comparative genomics constitutes a
`promising research avenue.
`Recent progress in genomics has facilitated the completion
`of the genomic sequences from numerous Candida pathogenic
`species and their non-virulent relatives, as well as of multiple
`isolates from the same species. Here we review recent advances
`in the understanding of how virulence traits in the different
`clades may have emerged. First, we provide an introduction to
`the computational and experimental approaches that can be
`used to trace what evolutionary mechanisms may have played
`a role in the evolution of a virulence trait. Then, we survey re-
`cent efforts focused on the three most important clades, accord-
`ing to their incidence: C. albicans, C. glabrata and C. parapsilo-
`sis. Finally, we discuss future avenues, including the application
`of high-throughput sequencing, meta-genomics and population
`genomics, to understand the recent evolution and epidemiology
`of fungal pathogens.
`
`COMPARATIVE GENOMICS AND
`PHYLOGENOMICS TO DISCOVER AND
`UNRAVEL THE ORIGINS OF VIRULENCE
`TRAITS
`
`The comparison of genomes under an evolutionary perspective
`(i.e. phylogenomics) constitutes a very powerful tool to under-
`stand the function, origin and evolution of biological processes
of interest (Eisen 1998; Gabaldón 2008; Gabaldón and Marcet-Houben
2014). Given the widespread focus on pathogenesis,
`the genomes of virulent organisms have traditionally been the
`main target of whole genome sequencing projects. Comparison
`with non-pathogenic relatives constitutes a common strategy
`to identify and prioritize candidate virulence factors. In this re-
`spect the comparison of the pathogenic bacterium Listeria mono-
`cytogenes, with its non-pathogenic relative L. innocua, constitutes
`one of the first examples of the use of a comparative genomics
`approach to search for genes involved in virulence (Glaser et al.
`2001). In that study the researchers identified several species-
`specific genes, which were enriched in secreted or cell-wall pro-
`teins, including several known virulence factors that were found
`to be specific to L. monocytogenes. The basic rationale of this ap-
`proach is simple. Related species are expected to share the ge-
`netic factors that determine their shared phenotypes, whereas
`they are expected to differ genetically in those genomic regions
that determine their unique phenotypic characteristics. Differences
in the presence and absence of genes are one obvious
trait to focus on, and this approach has been shown to be extremely
useful in pathogenic bacteria, where genomic islands,
`often transmitted across species through horizontal gene trans-
`fer, can determine the existence of pathogenicity to a given host,
`or of resistance to a certain spectrum of antibiotics (Gal-Mor and
`Finlay 2006). Two main problems underlie this approach. First of
`all, the two compared species are likely to differ in many more
`phenotypes than the one that constitutes the focus of our atten-
`tion (i.e. virulence). In addition, most of the differing phenotypes
`or adaptations are likely to be unknown to us. Therefore, genetic
`differences will comprise those involved in all these many phe-
`notypic differences. This problem scales up with the genetic and
`phenotypic divergence of the species compared. A second prob-
`lem of the gene content comparison approach is that gene gain
`and loss is only one way in which changes in phenotype can be
`driven. Gene duplications, genomic re-arrangements and dis-
`crete point mutations in coding or non-coding regions may all
`determine key changes in phenotype, and would pass unnoticed
`in simple gene-content comparisons.
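The gene-content comparison described above can be illustrated with a minimal Python sketch. It assumes that each genome has already been reduced to a set of gene-family identifiers; the function name, species labels and family names are hypothetical and do not come from any of the cited studies. The sketch reports the families present in every pathogen and absent from every non-pathogenic relative.

```python
from typing import Dict, Set


def candidate_families(gene_content: Dict[str, Set[str]],
                       pathogens: Set[str]) -> Set[str]:
    """Families present in every pathogen and absent from every non-pathogen."""
    patho = [fams for sp, fams in gene_content.items() if sp in pathogens]
    non_patho = [fams for sp, fams in gene_content.items() if sp not in pathogens]
    if not patho:
        return set()
    shared_by_pathogens = set.intersection(*patho)
    seen_in_non_pathogens = set().union(*non_patho)
    return shared_by_pathogens - seen_in_non_pathogens


# Toy data: species labels and family names are invented for illustration.
gene_content = {
    "pathogen_A": {"fam1", "fam2", "fam3"},
    "pathogen_B": {"fam1", "fam3", "fam4"},
    "relative_C": {"fam1", "fam4"},
}
print(sorted(candidate_families(gene_content, {"pathogen_A", "pathogen_B"})))
```

As the text notes, any family returned by such a filter is only a candidate: it may reflect a phenotypic difference unrelated to virulence, and the filter is blind to duplications, re-arrangements and point mutations.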
`Taking into account these considerations, the phylogenomic
`approaches used to understand the evolution of pathogenesis
`have evolved in two main directions (see Fig. 1). On the one hand,
`the resolution of the comparison has increased by considering
`less divergent and a higher number of genomes. This includes
`genomes from more closely related species as well as several
`strains per species. On the other hand, the comparisons have
`developed into more sophisticated ones, including the study of
`large genomic re-arrangements as well as the detection of past
`evolutionary events through the use of a phylogenetic approach
`or the analyses of genomic variation. These methods include not
`only the detection of genes duplicated in a given lineage but also
`those under positive selection. This more detailed approach has
`been driven by parallel developments in genomics and compu-
`tational methods.
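As a rough illustration of the kind of signal that tests for positive selection build on, the following Python sketch classifies codon differences between two aligned coding sequences as synonymous or non-synonymous. It is a deliberately simplified assumption-laden example: it uses raw counts rather than a normalized dN/dS estimate, it assumes the standard genetic code (which would need adjusting for CTG-clade yeasts), and the function name and toy sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_substitutions(seq_a: str, seq_b: str):
    """Count differing codons as (synonymous, non-synonymous).

    Both sequences are assumed to be in frame and aligned without gaps;
    codons containing unknown characters are skipped.
    """
    syn = nonsyn = 0
    usable = min(len(seq_a), len(seq_b)) // 3 * 3
    for i in range(0, usable, 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        if a == b or a not in CODE or b not in CODE:
            continue
        if CODE[a] == CODE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy sequences: CTT -> CTC keeps leucine, GAA -> GAT changes Glu to Asp.
print(count_substitutions("ATGCTTGAA", "ATGCTCGAT"))
```

Dedicated codon-model software additionally corrects for the number of synonymous and non-synonymous sites and for multiple substitutions, which this sketch does not attempt.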
`Recent sequencing of several Candida pathogens and their
`close non-pathogenic relatives has enabled the use of compara-
`tive genomics approaches to discover virulence factors and trace
`their evolutionary emergence. Some such examples will be high-
`lighted in the following sections. Before applying a comparative
`genomics approach, however, it is necessary to obtain an evolu-
`tionary scenario in which the relationships between pathogenic
`and non-pathogenic species are clearly delineated. In the fol-
`lowing section we provide an overview of the evolutionary rela-
`tionships among Candida species.
`
`CANDIDA PATHOGENS ARE
`PHYLOGENETICALLY DIVERSE
`
`To provide a glimpse of the diversity of currently sequenced Can-
`dida species, we used a phylogenomics approach, based on the
`analysis of 516 shared orthologous genes, to reconstruct the evo-
`lutionary relationships of 71 Saccharomycotina species (Fig. 2).
`We highlighted species currently named Candida and those
`whose anamorph name or synonym name are Candida, and
`marked in red those species that are well-established pathogens
`or those that are recognized as emerging fungal pathogens.
`As observed in the tree, Candida species are spread through-
`out the Saccharomycotina phylogeny, being present in most of
`the clades. In addition, Candida species are generally intermin-
`gled with species assigned to other genera, highlighting the
`polyphyly of the genus. Pathogenic species are mainly found
`in two different clades: the CTG group and the Nakaseomyces
`group (marked with blue dots on the tree). The only exception
`is Candida krusei, which branches close to the wine yeast Bret-
`tanomyces bruxellensis. Two other Candida species, C. boidinii and
`C. parapolymorpha (syn. Ogataea parapolymorpha), are found in the
`same clade as C. krusei, though both of them are considered non-
`pathogenic.
`The CTG clade comprises several species that have the par-
`ticularity of using an alternative genetic code in which the CUG
`codon, typically used for leucine, has been reassigned to serine
`and is ambiguously translated in the cytoplasm (Santos et al.
`2011). Interestingly, a reassignment of the same codon to ala-
`nine has apparently occurred in the lineage leading to Pachysolen
tannophilus (Mühlhausen and Kollmar 2014), a distantly related
`yeast isolated from sulfite liquor. The CTG clade contains most
`of the pathogenic Candida species, including two of the three ma-
`jor candidiasis agents: C. albicans and C. parapsilosis. The group
`comprises many species and 19 have completely sequenced
genomes. Of the sequenced species, nearly half (9) are
human pathogens. Rather than being monophyletic, pathogenic
species appear interspersed among non-pathogenic ones, with
at least six independent pathogenic lineages surrounded by
non-pathogenic relatives. These include (i) the clade containing
C. albicans and C. dubliniensis; (ii) a separate clade containing C.
`tropicalis, an emerging fungal pathogen which is naturally re-
`sistant to fluconazole and which is more closely related to the
`rarely pathogenic C. sojae than it is to C. albicans (Fig. 2); (iii) the C.
`parapsilosis species clade, comprising C. parapsilosis and the close
`relatives C. orthopsilosis and C. metapsilosis; (iv) a lineage com-
`prising Meyerozyma guilliermondii (C. guilliermondii), and its close
`non-pathogenic relative M. caribbica (Candida fermentati); (v) a lin-
`eage containing Clavispora lusitaniae (Candida lusitaniae), grouped
`with the plant-associated yeast Metschnikowia fructicola; and (vi)
`a separate branch with the multidrug resistant C. auris.
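The practical consequence of the CUG reassignment mentioned above can be shown with a small Python sketch that translates the same coding sequence under the standard code and under a code in which CTG (CUG in RNA) specifies serine. The table names, the function name and the toy sequence are assumptions made for illustration, and the sketch ignores the ambiguous decoding of CUG described in the text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# CTG-clade variant: identical to the standard code except CTG -> serine.
CTG_CLADE = dict(STANDARD, CTG="S")


def translate(cds: str, table: dict) -> str:
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return "".join(table[cds[i:i + 3]] for i in range(0, usable, 3))


cds = "ATGCTGTTGTAA"  # toy sequence: Met, CTG, Leu (TTG), stop
print(translate(cds, STANDARD))   # CTG read as leucine
print(translate(cds, CTG_CLADE))  # CTG read as serine
```

Using the wrong table when annotating a CTG-clade genome would silently replace every CUG-encoded serine with leucine in the predicted proteome, which is why gene prediction and orthology pipelines need to be told which code applies.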
`The second main group of pathogenic Candida species is the
`Nakaseomyces, which is more closely related to Saccharomyces
cerevisiae (Gabaldón et al. 2013). Indeed, the Nakaseomyces clade is one
`of the several so-called post-WGD (post whole genome dupli-
`cation) lineages, a group of species that diverged from an an-
`cestor that duplicated its genome entirely (Wolfe and Shields
`1997; Dujon 2010). Importantly, it has now become apparent
`that this WGD was triggered by an interspecies hybridization
event (Marcet-Houben and Gabaldón 2015). The post-WGD clade
is very diverse and contains many species of industrial interest,
but is otherwise scarce in pathogenic species. Within
the post-WGD clade, only the Nakaseomyces lineage seems
prone to evolving the ability to infect humans. It comprises three
`
`
`
`
Figure 1. Schematic overview of comparative genomics methods to find virulence factors. In the center of the image a phylogenetic tree represents the evolutionary relationships between an idealized group of pathogenic and
non-pathogenic species, where the ones drawn in red are pathogenic. Red stars indicate two points in the tree where virulence emerged. For species E, a list of strains is indicated on the right showing the presence of sequenced
strains with different degrees of virulence. The boxes surrounding the tree indicate different kinds of analyses that can be performed to detect virulence traits. (A) Search for genomic re-arrangements. (B) Presence and absence
of genes, shown as a table where the white squares represent missing genes and the gray and red squares represent present genes. In addition, red squares indicate putative virulence factors. (C) Detection of differences between
strains. Each arrow represents a gene. Red horizontal lines represent SNPs. (D) Detection of positive selection. Red horizontal lines represent non-synonymous SNPs and black horizontal lines represent synonymous SNPs. (E) Events
that can be detected with the use of phylogenetic trees. Arrows represent genes. Gene duplication and loss is represented along with changes in gene order. (F) Gene expansions. Phylogenetic tree representing two independent
expansion events in the pathogenic species. Expansion points are marked with a star.
`
`
Figure 2. Phylogenetic relationships among Candida. A total of 516 genes detected as single-copy genes in Saccharomyces species (Marcet-Houben and Gabaldón 2015)
were used to perform a homology search against a proteome database formed by the 71 Saccharomycotina species included in the tree. A phylogenetic tree was
reconstructed for each group of homologs using the phylome reconstruction pipeline (Huerta-Cepas et al. 2014). Phylogenetic trees were examined with ETE (Huerta-Cepas,
Dopazo and Gabaldón 2010); species-specific duplications were deleted and one sequence belonging to the duplicated clade was randomly chosen to represent
the clade. Trees that, after filtering, contained one-to-one orthologs in at least three-fourths of the species were retained and their alignments were concatenated. The
final alignment contained 190 110 amino acid positions. The phylogenetic tree was reconstructed using RAxML with the PROTGAMMALG model (Stamatakis, Ludwig
and Meier 2005). Bootstrap supports were computed using the RAxML rapid bootstrap approach; supports below 100 are marked on the tree. For Candida species, red-colored
leaves indicate pathogenic species while green-colored leaves indicate non-pathogenic species.
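The occupancy filter and concatenation step mentioned in the legend above can be sketched in Python as follows. This is only an illustration under stated assumptions: each alignment is represented as a dictionary from species to an aligned sequence of equal length, species missing from a retained alignment are padded with gaps (the legend does not say how missing data were handled), and the function name, species labels and sequences are hypothetical. Homology search, tree building and duplication pruning are outside its scope.

```python
from typing import Dict, List

Alignment = Dict[str, str]  # species -> aligned sequence (equal lengths per alignment)


def concatenate(alignments: List[Alignment], species: List[str],
                min_occupancy: float = 0.75) -> Dict[str, str]:
    """Keep alignments covering at least min_occupancy of the species, then concatenate.

    Species absent from a retained alignment are padded with gap characters.
    """
    kept = [aln for aln in alignments
            if len(set(aln) & set(species)) / len(species) >= min_occupancy]
    parts = {sp: [] for sp in species}
    for aln in kept:
        width = len(next(iter(aln.values())))
        for sp in species:
            parts[sp].append(aln.get(sp, "-" * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy data: four species, three single-copy families with different occupancy.
species = ["sp1", "sp2", "sp3", "sp4"]
alignments = [
    {"sp1": "MK", "sp2": "MR", "sp3": "MK", "sp4": "MK"},  # 4/4 species, kept
    {"sp1": "AAA", "sp2": "AAG", "sp3": "ATA"},            # 3/4 species, kept
    {"sp1": "W", "sp2": "W"},                              # 2/4 species, dropped
]
supermatrix = concatenate(alignments, species)
for sp in species:
    print(sp, supermatrix[sp])
```

The resulting supermatrix is the kind of input that a maximum-likelihood program would then take for tree reconstruction.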
`
`
pathogenic species, of which C. glabrata is the best known and
ranks as the second most common cause of candidiasis
worldwide. Candida bracarensis and C. nivariensis are two related
emergent pathogens of growing incidence (Gabaldón and Carreté
2016). The three pathogens within the Nakaseomyces are not
monophyletic, indicating that they may represent two, or even
three, parallel events of emergence of pathogenesis (Gabaldón
et al. 2013).
`Altogether, from a phylogenetic perspective, Candida species
`do not have a single evolutionary origin, and the genus is des-
`tined to be completely redefined if we are to adhere to taxo-
`nomic principles. From a clinical perspective, we should look at
`Candida pathogens as a diverse set of infectious agents, despite
`their shared generic name and the common denomination of
`candidiasis for their infections. Indeed, the phylogenetic diver-
`gence between the four most prevalent Candida species (C. al-
`bicans, C. glabrata, C. parapsilosis and C. tropicalis) is reflected in
`their phenotypic diversity (Brunke and Hube 2013; Papon et al.
`2013).
`Out of these four species, C. albicans is the one showing the
`highest morphological plasticity, being able to grow in the form
`of single cells (yeast form), pseudohyphae and true hyphae de-
`pending on the environment (Sudbery 2011; Thompson, Carlisle
`and Kadosh 2011; Modrzewska and Kurnatowski 2013; Priest and
`Lorenz 2015). In addition, C. albicans can switch to hyphal and
`pseudohyphal forms once phagocytosed and use filamentous
growth to burst out and kill macrophages. Of the other three
species, only C. tropicalis is known to be able to switch to a true
hyphal form, a capability that is probably much more restricted
than in C. albicans (Priest and Lorenz 2015). The alternative
genetic code of the CTG clade allows C. albicans to increase
the diversity of its surface proteins, thus making it more
difficult for the immune system to detect (Nather et al. 2008;
Miranda et al. 2013). Little is known about the role of this
alternative codon usage in the pathogenesis of C. tropicalis and
C. parapsilosis, and it is entirely absent from the non-CTG clade
species C. glabrata. Similarly, the response of the different
Candida species to the immune system is highly variable. Candida
glabrata shows adaptation to the conditions inside the phagosome
and is able to live and even reproduce in this harsh environment
for long periods of time (Fukuda et al. 2013; Seider
et al. 2014). The other three species usually avoid phagocytosis
in the first place, and switch to their hyphal or pseudohyphal
stages in order to break free from phagocytic cells (Jiménez-López
and Lorenz 2013). Candida glabrata is also the only one of the
four that is haploid, although the parasexual cycle in C. albicans
can produce aneuploidies that can result in partial haploidy
(Glöckner and Cornely 2015). Candida glabrata is also naturally
resistant to azole compounds, and is the yeast that most commonly
acquires resistance to echinocandins (Pfaller and
Diekema 2007). In contrast to other Candida species, C. glabrata
is more prevalent in adults, where aging, use of antibiotics and
the severity of other medical conditions are important risk factors
(Pfaller and Diekema 2007; Guinea et al. 2014). Of the four,
C. parapsilosis has the lowest lethality. It is also the
only one that is not considered a human commensal, and thus
infections with C. parapsilosis are usually acquired from the environment
and can be consistently avoided with proper preventive measures
(Warnock 2007; Holland et al. 2014). Candida parapsilosis infections are
mostly associated with the use of catheters, prosthetics and other similarly
invasive medical devices. Lastly, C. tropicalis infections are
`usually restricted to patients with neutropenia and other blood
`malignancies and usually respond to azole antifungals (Pfaller
`and Diekema 2007; Guinea et al. 2014). It is, however, more
`
`invasive than C. albicans and infections with this pathogen show
`higher mortality rates. Finally, beyond these generalities, the
`epidemiology of these fungal pathogens shows great variabil-
`ity among human populations and geographical regions (Pfaller
`and Diekema 2007; Warnock 2007; Guinea et al. 2014).
`In summary, the ability of Candida species to infect humans
has emerged several times independently from
divergent non-pathogenic ancestors, and the extent to which
`they share virulence strategies is still an open question. From an
`evolutionary perspective, the parallel emergence of a common
`trait (i.e. the ability to infect humans) from different genomic
`backgrounds poses many intriguing questions: How many par-
`allel adaptations emerged independently in more than one lin-
`eage? What set of genomic traits favored the emergence of vir-
`ulence in each case? How many different paths may lead to the
`ability to infect humans? What evolutionary mechanisms have
`promoted the emergence of virulence? In the following sections,
`we provide an overview of recent studies that have addressed
these and other questions in the main clades of pathogenic Candida.
`
`GENOME VARIATION IN THE CTG CLADE AND
`EXPANSION OF CELL-WALL FAMILIES IN
`PATHOGENS
`
`One of the first studies targeting genomic variation in Candida
`pathogens was conducted by Geraldine Butler and collabora-
`tors in 2009 (Butler et al. 2009). That study expanded the set of
genome sequences of Candida pathogens (only two by then: Candida
albicans and C. glabrata) by adding five new species, C. tropicalis,
`C. parapsilosis, Lodderomyces elongisporus, C. guilliermondii and C.
`lusitaniae, and a new strain of C. albicans. One of the first find-
`ings of the genome comparison was the discovery of large vari-
`ations in genome size within the CTG clade (up to 50% varia-
`tion in size), despite a similar gene repertoire size, with gene
`numbers ranging from 5733 to 6318. Importantly, this study pro-
`vided the first hint for an enriched cell-wall repertoire in Candida
`pathogens, as compared to non-pathogenic relatives. By exam-
`ining the phylogenetic distribution of 9209 gene families present
`across the analyzed species, they identified 21 which were en-
`riched in the more common pathogens. These included families
`encoding GPI-anchored cell-wall adhesins, secreted lipases, as
`well as oligopeptide transporters, and transcription factors.
Among the families associated with the cell wall that were
`found to be expanded in pathogenic species, there were some,
`such as Hyr/Iff and Als adhesins, that were found to be dupli-
`cated in tandem in some pathogenic species, with the paralo-
`gous gene clusters sometimes including five or six genes of the
same family. Tandem duplications are an essential source of genetic
novelty that has been found to commonly underlie novel
`phenotypic traits, and are generally indicative of recent direc-
`tional selection (Conant and Wolfe 2008). In addition, these fam-
`ilies were found to have high levels of sequence variation, in-
`cluding variations in number of intragenic tandem repeats. All
`these findings suggested that some virulence traits, such as an
`increased or more versatile adherence, may have emerged rela-
`tively fast by simple genome re-arrangements such as gene du-
`plications followed by sequence divergence. An alternative ap-
`proach identified 64 families likely subjected to positive selec-
`tion in the lineages leading to highly pathogenic Candida species.
`Again, cell-wall proteins emerged in these analyses, together
`with families involved in hyphal and pseudohyphal filamentous
`growth, as well as in biofilm formation. Altogether, this study
`highlighted the importance of cell-wall components, probably
`mediating host–pathogen interactions, in shaping the ability to
`infect humans. Despite these important advances, the lack of
`close relatives for most of the pathogenic species included pro-
`vided a limited resolution of the specificities of each pathogenic
`lineage.
`The Butler et al. (2009) study also provided the first in-
`traspecies genome comparison within a Candida pathogen, as
it provided the genome sequence of a second strain belonging
to a different multilocus sequence type-based clade of C.
`albicans. The two strains were found to be highly colinear but
`they differed greatly in the extent of homozygous regions in
`their diploid genomes, indicating different events of loss of het-
`erozygosity by break-induced recombination or recent passage
`through a parasexual cycle. More recent studies have increased
`our knowledge on the genetic variability at the species level in
`C. albicans, by analyzing the sequences of 21 independent clini-
`cal isolates (Hirakawa et al. 2015) and 43 serial isolates from 11
`distinct patients (Ford et al. 2015). These analyses indicated that
`most genetic variation among C. albicans isolates corresponded
to distinct patterns of loss of heterozygosity as well as total or
partial chromosomal aneuploidies. These genomic changes underlay
important phenotypic variation, such as up to 3-fold differences
in growth rate, the ability to filament or resistance to
`antifungals. Importantly, these studies also showed that cell-
`wall proteins, including those mediating adhesion, were en-
`riched among those gene families with accelerated evolutionary
`rates suggestive of positive selection. In the particular case of se-
`rial clinical isolates from patients undergoing azole treatments,
`loss of heterozygosity was found to be an important mechanism
`to mediate fast acquisition of reduced sensitivity to antifungals.
`In addition, point mutations affecting efflux pumps or drug tar-
`gets appeared recurrently in isolates from several patients. How-
`ever, it remains unclear whether this was the result of direc-
`tional selection of standing variation in the patient’s Candida
`population, or the result of the acquisition of novel mutations
`(Ford et al. 2015).
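One simple way to picture how homozygous tracts show up in a diploid genome is to scan fixed-size windows for an absence of heterozygous sites. The Python sketch below does this under strong assumptions: heterozygous positions are taken as given (0-based coordinates from some prior variant-calling step), the window size and threshold are arbitrary, and the function name is hypothetical. It is not the method used in the studies cited above.

```python
from typing import List, Tuple


def loh_windows(het_positions: List[int], chrom_length: int,
                window: int = 1000, max_het: int = 0) -> List[Tuple[int, int]]:
    """Return (start, end) windows holding at most max_het heterozygous sites.

    Positions are 0-based and must be smaller than chrom_length; windows are
    half-open intervals and the last one may be shorter than the others.
    """
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in het_positions:
        counts[pos // window] += 1
    return [(i * window, min((i + 1) * window, chrom_length))
            for i, n in enumerate(counts) if n <= max_het]


# Toy chromosome of 3500 bp with four heterozygous sites.
print(loh_windows([10, 250, 2100, 2900], chrom_length=3500))
```

Windows flagged this way are only candidates, since low coverage or locally low diversity can also produce stretches without heterozygous calls.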
`
`THE CANDIDA PARAPSILOSIS COMPLEX AND A
`POTENTIAL ROLE OF HYBRIDIZATION
`
Candida parapsilosis was initially recognized as a single species
and later reclassified into three distinct species: C. parapsilosis
sensu stricto, C. orthopsilosis and C. metapsilosis
(Tavanti et al. 2005). Notably, the three species differ in
`their prevalence and their degree of virulence, C. parapsilosis be-
`ing highly pathogenic and prevalent followed by C. orthopsilosis,
`and the least pathogenic and rarely isolated species C. metap-
`



