`https://doi.org/10.3168/jds.2018-15971
`© 2019, The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®.
`This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
`Variance of gametic diversity and its application in selection programs
`D. J. A. Santos,1,2* J. B. Cole,3 T. J. Lawlor Jr.,4 P. M. VanRaden,3 H. Tonhati,2 and L. Ma1*
`1Department of Animal and Avian Sciences, University of Maryland, College Park 20742
2Departamento de Zootecnia, Universidade Estadual Paulista, Jaboticabal, 14884-900, Brazil
`3Henry A. Wallace Beltsville Agricultural Research Center, Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA,
`Beltsville, MD 20705-2350
`4Holstein Association USA, Brattleboro, VT 05302-0808
`
`ABSTRACT
`
The variance of gametic diversity ($\sigma^2_{gamete}$) can be used to find individuals that are more likely to produce progeny with extreme breeding values. The aim of this study was to obtain this variance for individuals from routine genomic evaluations, and to apply gametic variance in a selection criterion in conjunction with breeding values to improve genetic progress. An analytical approach was developed to estimate $\sigma^2_{gamete}$ by the sum of binomial variances of all individual quantitative trait loci across the genome. Simulation was used to verify the predictability of this variance in a range of scenarios. The accuracy of prediction ranged from 0.49 to 0.85, depending on the scenario and model used. Compared with sequence data, SNP data are sufficient for estimating $\sigma^2_{gamete}$. Results also suggested that markers with low minor allele frequency and the covariance between markers should be included in the estimation. To incorporate $\sigma^2_{gamete}$ into selective breeding programs, we proposed a new index, relative predicted transmitting ability, which better utilizes the genetic potential of individuals than traditional predicted transmitting ability. Simulation with a small genome showed an additional genetic gain of up to 16% in 10 generations, depending on the number of quantitative trait loci and selection intensity. Finally, we applied $\sigma^2_{gamete}$ to the US genomic evaluations for Holstein and Jersey cattle. As expected, the DGAT1 gene had a strong effect on the estimation of $\sigma^2_{gamete}$ for several production traits. However, inbreeding had a small impact on gametic variability, with greater effect for more polygenic traits. In conclusion, gametic variance, a potentially important parameter for selection programs, can be easily computed and is useful for improving genetic progress and controlling genetic diversity.
Key words: Mendelian sampling, gamete, heterozygosity, selective breeding, dairy cattle

Received November 10, 2018.
Accepted February 27, 2019.
*Corresponding authors: daniel_jordan2008@hotmail.com and lima@umd.edu
`
`INTRODUCTION
`
`
`
`Since the introduction of marker-assisted selection
`and genomic selection, technological improvements
`have resulted in widespread incorporation of molecular
`information into genetic evaluations (Nejati-Javaremi
`et al., 1997; Meuwissen et al., 2001; Schaeffer, 2006).
`Increased prediction accuracy, along with reduced gen-
`eration intervals, has made genomic selection an impor-
`tant tool for achieving fast progress in dairy selection
`programs (García-Ruiz et al., 2016). Despite concerns
`about inbreeding in selection and mating designs, most
`selection programs only consider breeding values when
`making selection decisions. Even with genomic selec-
`tion models, genomic breeding value or PTA and evalu-
`ation of future progeny are mostly based on expected
`breeding values without consideration of the variability
`of those values due to Mendelian sampling.
`In addition to breeding value or PTA, other selection
`strategies have been proposed to increase the rate of
`genetic progress. One idea was to select animals that
`will provide greater genetic gains in the future rather
`than choosing the best animals in the current popu-
`lation. Goiffon et al. (2017) showed improved genetic
`gains when selecting for the best gametes from a subset
`of individuals in a population. Segelke et al. (2014) dis-
`cussed the potential use of the variation within groups
`of offspring, which allows the assignment of probabilities
`to obtain progeny with a breeding value over a given
`threshold, as well as the number of matings required.
`In a follow-up study, Bonk et al. (2016) showed how
`exact within-family genetic variation can be calculated
`using data from phased genotypes. Recently, Müller et
`al. (2018) proposed a new selection criterion based on
`the expected maximum haploid breeding value. Col-
`lectively, these studies suggest that the incorporation of
`variation of future gametic values into mating decisions
`
`Exhibit 1042
`Select Sires, et al. v. ABS Global
`
`
`
`
`SANTOS ET AL.
`
`
`
can improve genetic progress on top of the selection on breeding values.

However, a few questions need to be answered before the application of gametic variance to breeding programs, such as how to assess the variation of future gametic values of an individual, how large the gametic variance is, how to use this information for selection, and how to estimate the variance of gametic diversity and use it in existing genomic evaluations. In this study, we aimed to address these questions from a statistical point of view, demonstrating the equivalence between gametic variance and Mendelian sampling variance in the classical BLUP (pedigree) model. We also sought to explore how this variance can be used as a selection criterion in conjunction with breeding values, with the goal of maximizing future genetic gains. We propose an approach for estimating this variance from routine genomic evaluations, verifying the adequacy of the estimates for individuals with and without progeny, and estimating the variance of breeding values of future progeny for a given mating. Finally, we evaluate the application of gametic variance to improve the selection of dairy traits in the US Holstein and Jersey populations.

MATERIALS AND METHODS

Estimation of the Variance of Gametic Diversity

We refer to the variance of gametic diversity as $\sigma^2_{gamete}$, which is equivalent to half of the Mendelian sampling variance (Appendix A1). $\sigma^2_{gamete}$ measures the deviation of progeny breeding values from parent average and can be calculated using the probabilities of transmission of alleles at all QTL from an individual to its gametes. Gametic variance represents the variability of all possible gametic values generated by the permutation and recombination of each parental chromosome. In fact, only the heterozygous loci of an individual contribute to $\sigma^2_{gamete}$, so we only consider heterozygous loci in the following text.

Let's first consider one locus. For a biallelic locus j of an individual i with allele substitution effect $\alpha_j$, $\sigma^2_{gamete}$ can be calculated from a binomial variance of $\sigma^2_{[j]} = np(1 - p)\alpha_j^2$, where the probability of transmission of a reference allele to a gamete p = 0.5 and the number of alleles transmitted to a gamete n = 1. When 2 loci, j and k, are considered for an individual i, the resulting variance can be obtained as

$$\sigma^2_{[j+k]} = \sigma^2_{[j]} + \sigma^2_{[k]} + 2\sigma_{jk}$$

and

$$\sigma_{jk} = (p_{jk} - p_j p_k)\,\alpha_j \alpha_k, \qquad [1]$$

where $p_j = p_k = 0.5$, and $p_{jk}$ is the probability that the 2 reference alleles of the 2 loci are transmitted together; $p_{jk}$ can be obtained from the linkage phase and recombination rate between the 2 loci. For example, $p_{jk} = 0.25$ and $\sigma_{jk} = 0$ when the loci are in linkage equilibrium; $p_{jk} = 0.5$ and $\sigma_{jk} = 0.25\,\alpha_j \alpha_k$ when the 2 reference alleles are on the same chromosome and the loci are in complete linkage.

Extending this calculation from 2 loci to all QTL on the genome, the $\sigma^2_{gamete}$ of individual i can be obtained as the sum across all N heterozygous QTL:

$$\sigma^2_{gamete} = \sum_{j=1}^{N} \sigma^2_{[j]} + 2 \sum_{j=1}^{N} \sum_{k=j+1}^{N} \sigma_{jk}.$$

This can be represented in matrix format as follows:

$$\sigma^2_{gamete} = [\alpha_1 \; \ldots \; \alpha_N]\, M\, [\alpha_1 \; \ldots \; \alpha_N]', \qquad [2]$$

where $\alpha_j$ (j = 1, ..., N) are the allele substitution effects, and M is the (co)variance matrix of the Mendelian transmission probabilities for the N heterozygous loci:

$$M = \begin{bmatrix} 0.25 & \cdots & al_{1,N}\left(0.25 - \dfrac{cM_{1,N}}{200}\right) \\ \vdots & \ddots & \vdots \\ al_{1,N}\left(0.25 - \dfrac{cM_{1,N}}{200}\right) & \cdots & 0.25 \end{bmatrix},$$

where $al_{jk}$ is a phase indicator for loci j and k, with value 1 when both loci have the reference allele on the same chromosome and −1 otherwise; $cM_{jk}$ is the genetic distance between the 2 loci (in centimorgans). Any 2 loci with genetic distance >50 cM on the same chromosome, or on different chromosomes, are assumed to be independent and thus have zero values for the corresponding elements of M. When all the loci are independent,

$$M = \begin{bmatrix} 0.25 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & 0.25 \end{bmatrix}.$$

Instead of using genetic distances, M can be set up when direct recombination rates are available.
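The matrix form in Equation [2] can be illustrated with a short numerical sketch. The function names, the ±1 phase coding per locus, and the toy effects and map positions below are assumptions made for illustration only; they are not taken from the study's software.

```python
import numpy as np


def transmission_matrix(phase, position_cm, chromosome):
    """Build M for the heterozygous loci of one individual.

    phase       : +1 / -1 per locus, the haplotype carrying the reference allele
    position_cm : map position of each locus in centimorgans
    chromosome  : chromosome label of each locus
    """
    phase = np.asarray(phase, dtype=float)
    position_cm = np.asarray(position_cm, dtype=float)
    chromosome = np.asarray(chromosome)

    distance = np.abs(position_cm[:, None] - position_cm[None, :])
    same_chromosome = chromosome[:, None] == chromosome[None, :]
    # Loci more than 50 cM apart, or on different chromosomes, are independent.
    linked = same_chromosome & (distance <= 50.0)

    # al_jk is +1 when the reference alleles share a haplotype, -1 otherwise.
    al = np.outer(phase, phase)
    m = np.where(linked, al * (0.25 - distance / 200.0), 0.0)
    np.fill_diagonal(m, 0.25)
    return m


def gametic_variance(alpha, m):
    """Equation [2]: alpha' M alpha."""
    alpha = np.asarray(alpha, dtype=float)
    return float(alpha @ m @ alpha)


# Toy example: two heterozygous loci, 10 cM apart, effects 1.0 and 0.5.
alpha = [1.0, 0.5]
coupling = transmission_matrix([1, 1], [0.0, 10.0], [1, 1])
repulsion = transmission_matrix([1, -1], [0.0, 10.0], [1, 1])
unlinked = transmission_matrix([1, 1], [0.0, 10.0], [1, 2])

print(gametic_variance(alpha, coupling))
print(gametic_variance(alpha, repulsion))
print(gametic_variance(alpha, unlinked))
```

With these toy inputs the off-diagonal element is 0.25 − 10/200 = 0.20, so the variance is about 0.5125 in coupling phase, 0.1125 in repulsion phase, and 0.3125 when the loci are on different chromosomes. The same two effects can therefore give quite different gametic variances depending on phase.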
`
`Journal of Dairy Science Vol. 102 No. 6, 2019
`
`
`
`
`THEORY AND APPLICATION OF GAMETIC VARIANCE
`
`
`
To estimate gametic variance in real data where genomic evaluation is available, we proposed to use the estimated SNP effects to replace true QTL effects in Equation [2]. This approximation of QTL with SNP marker effects is similar to that described by Bonk et al. (2016). Note that using estimated SNP effects in [2] may bias the estimation due to the covariance between estimated effects of SNP in linkage disequilibrium (LD) and potential biases from shrunken estimators of SNP effects, which warrants further investigation.

Application of Gametic Variance in Selection Programs

A new selection strategy using $\sigma^2_{gamete}$ can be proposed, focusing on the future genetic progress (Bijma et al., 2018). When a small proportion of animals are selected for breeding, $\sigma^2_{gamete}$ can help identify those that are most likely to produce progeny with extreme breeding values. Assuming selection intensity is maintained across generations, the average genetic value of the animals selected in the future will be related to the variance of gametes of the selected animals in the current generation. The average breeding value transmitted to future progeny can be calculated by summing the expected value and i times the standard deviation of gametic diversity ($i\,\sigma_{gamete}$). The selection intensity (i) represents the number of standard deviations between the population average and the average of selected individuals. The same intensity can be applied when using PTA as the expected value and $\sigma_{gamete}$ as standard deviation, to obtain the mean breeding value transmitted to the selected individuals in the next generation. Similar approaches have been proposed by Lehermeier et al. (2017) via a usefulness criterion (UC) with genomic EBV (GEBV) and the standard deviation of a given mating.

Here, we propose a new selection criterion relative to the intensity of selection applied in the next generation ($i_f$) for an individual i (with unknown mates),

$$RPTA_i = PTA_i + \sigma_{gamete\_i} \times i_f, \qquad [3]$$

where $RPTA_i$ (relative PTA) is the average of the genetic values relative to the group of progeny that will be selected in the future (see Appendix A2). In addition, we introduce a new concept of coefficient of relative variation (CRV) as a measure of the variability of the additive genetic values (u) transmitted from an individual to its progeny (Appendix A3). The CRV of an individual i is defined as follows (where E indicates expected value):

$$CRV_i = \frac{\sqrt{\sigma^2_{gamete\_i}}}{0.5\,E(u_i)}. \qquad [4]$$

Simulation

To verify the estimation of $\sigma^2_{gamete}$ by genomic models
`and the use of this new parameter to aid selection, we
`simulated different scenarios with various QTL, geno-
`type, and phenotype data using the QMSim version
`1.10 software (Sargolzaei and Schenkel, 2009). In brief,
`we simulated a historical population, a 10-generation
`recent population, and a 10-generation future popula-
`tion (Table 1).
`To mimic real populations, a historical population
`was simulated with the same proportion of males and
`females that were mated randomly. This population
`was generated in 3 phases: the first phase consisted
`of 500 generations with a constant population size of
`1,000 individuals; the second phase had 500 generations
`with a constant reduction of population size from 1,000
`to 200 to generate LD and establish drift-mutation bal-
`ance; and the third phase included 10 generations of
`expansion, where the population size increased from
`200 to 3,000. From the last generation of this historical
`population, 200 males and 800 females were randomly
`selected as founders to generate the study population,
`which consisted of 10 generations with 5 progeny per
`dam and a ratio of 50% males in the offspring. We
`simulated a selection for breeding values estimated by
`the classical BLUP (Henderson, 1975). The replacement
`ratio was set at 20% for dams and 60% for sires (Brito
`et al., 2011), and mating was random among selected
`individuals. The replacement ratio is the proportion of
`animals to be culled and replaced in each generation.
`From the study population (last 10 generations of the
`simulation), genotype and QTL data were obtained for
`the 9th generation (treated as a reference population)
`and the 10th generation (the validation population).
`The marker effects were first estimated in the reference
generation. The $\sigma^2_{gamete}$ values for all individuals were
`estimated for both the reference and validation popula-
`tions using the marker effects estimated in the reference
`generation. For comparison, true gametic variance was
`also calculated using the QTL effects and their geno-
`type data from the simulation.
`To reduce computational load, a small genome, with
4 autosomal chromosomes of 50 cM each, was simulated. The mutation rate was fixed at $2.5 \times 10^{-5}$ in
`the historical population. The number of crossovers was
`sampled from a Poisson distribution. A total of 200,000
`markers and different sets of QTL were simulated to be
randomly distributed along the genome. After the genome was simulated, a panel with 10% of the polymorphic markers was sampled every 0.5 cM and another panel with 20% of the markers was sampled every 0.5 cM. The first panel was chosen to mimic a high-density SNP panel and the second for sequence data. A detailed description of the parameters is reported in Table 1.

Six traits were simulated, combining heritabilities of 0.1, 0.3, and 0.5 with 20 QTL (i.e., 0.1 QTL per cM) or 200 QTL (i.e., 1 QTL per cM). We used 2 QTL densities similar to those used by Meuwissen et al. (2001). The QTL effects were generated based on a gamma distribution with parameter β = 0.4 (Hayes and Goddard, 2001). The phenotypic variance was assumed to be 1 for all traits. Four replicates were used for each trait. In addition, 10 future generations were simulated where the individuals were selected either by the true breeding value (T_PTA) or by true RPTA (T_RPTA) to verify and compare the genetic gains obtained using these criteria. To assess the effect of these indices on selection in the future generations, the replacement ratio was maintained at 100% and the number of offspring per dam was 5 (corresponding to a selection intensity of 0.996 for females and 1.76 for males) or 10 (corresponding to selection intensities of 1.4 for females and 2.06 for males). As the predicted $\sigma^2_{gamete}$ is a latent variance, its realized value depends on the number of progeny of an individual. Any inference using this variance should be regarded as a bet (probability of an event considering the number of attempts). Therefore, the selection intensity applied to RPTA ($i_f$) may need to be adjusted accordingly, and 3 values of $i_f$ (0.5, 0.8, and 1) were tested in this study.

Table 1. Summary of simulation parameters

Parameter                                   Value
Genome parameter
  Genome size                               200 cM
  Number of chromosomes                     4
  Number of QTL                             20 and 200
  Number of markers                         10,000 (high-density panel) and 20,000+ QTL (sequence data)
  Mutation rate, QTL                        2.5 × 10^-5
  Mutation rate, marker                     2.5 × 10^-3
  Marker positions in genome                Evenly spaced
  QTL position in genome                    Random (uniform distribution)
  QTL allele effect                         Gamma distribution (β = 0.4)
Trait parameters
  Number of traits                          6
  Heritability                              0.10, 0.30, 0.50
  Phenotypic variance                       1
  Sex-limited trait                         No
Population structure parameters
  Historical generation
    Phase 1
      Number of generations                 500
      Number of animals                     Constant (500 males and 500 females)
      Mating                                Random
    Phase 2
      Number of generations                 500
      Initial number of animals             1,000
      Final number of animals               200 (100 males and 100 females)
      Mating                                Random
    Phase 3
      Number of generations                 10
      Initial number                        200 (100 males and 100 females)
      Final number                          3,000 (1,500 males and 1,500 females)
      Mating                                Random
  Recent generation
    Number of generations                   10
    Reference population                    9th
    Validation population                   9th and 10th
    Number of offspring per dam             5
    Founders                                1,000 (200 males 800 females)
    Mating                                  Random
    Selection                               BLUP
    Cutting                                 BLUP
    Replacement                             20% females and 60% males
    Overlapping generation                  Yes
    Generation 9–10 (predictability)        Correlation (σ²gamete, σ²gamete_estimated)
  Future generation
    Number of generations                   10
    Criterion of selection1                 T_PTA = TRUE/2 or T_RPTA (TRUE/2) + σ²gamete
    Number of offspring per dam             5 or 10
    Replacement                             100% females and 100% males
    Better criterion                        Genetic gain per generation
1T_PTA = true PTA; T_RPTA = true relative PTA; σ²gamete = variance of gametic diversity.

Genomic Analysis

Because $\sigma^2_{gamete}$ depends on the marker effects in genomic models, we used a model that assumed homogeneity of variance of marker effects, GBLUP (SNP-BLUP), and another model that allowed heterogeneity of marker effects with differential shrinkage through the improved Bayesian LASSO (BLASSO; Legarra et al., 2011). The analyses were performed using the GS3 v.3 software (Legarra et al., 2015). The model included the population mean, marker effects, and residual. Only markers with minor allele frequency (MAF) >0.05 were considered. For estimation of additive and residual variances, the simulated true values were used as initial values to reduce computational complexity, followed by 20,000 iterations with a burn-in of the first 2,000.

Application of Gametic Variance to Real Data

The data used were part of the 2017 US genomic evaluations from the Council on Dairy Cattle Breeding (CDCB, Bowie, MD), consisting of 1,364,278 Holstein and 164,278 Jersey cattle from the national dairy cattle database. Five dairy traits based on up to 5 lactations were analyzed: milk (MY), fat (FY) and protein (PY) yields, and fat (F%) and protein (P%) percentages. The genotype data were generated from different SNP arrays with the number of SNP ranging from 7K to 50K. All individuals were imputed to a common panel of 60,671 SNP and their linkage phase was determined by FindHap version 3 (VanRaden et al., 2011). The $\sigma^2_{gamete}$ was calculated using Equation [2] with estimated SNP effects ($\hat{\alpha}$). The marker effects were derived from the PTA obtained from the genomic evaluation. Sex-specific recombination rates between SNP markers in Holstein and Jersey cattle were directly used in this study (Ma et al., 2015; Shen et al., 2018). Thus, a modification to the off-diagonal elements of the M matrix in Equation [2] was applied to incorporate recombination rate,

$$M_{jk} = al_{jk}\left(0.25 - \frac{rate_{jk}}{2}\right),$$

when the recombination rate is <0.5; and $M_{jk} = 0$ when the rate ≥0.5.

RESULTS AND DISCUSSION

Estimation of Gametic Variance with Genomic Models

The variance of progeny breeding values has been investigated in previous studies (Cole and VanRaden, 2011; Segelke et al., 2014; Bonk et al., 2016). Here, we sought to use simulation to evaluate the predictability of gametic variance as a parameter for selection. To evaluate the predictability, a comparison with classical simulation studies with genomic prediction was adopted. The variance of gametic diversity ($\sigma^2_{gamete}$) was calculated considering both dependence and independence between loci, using all QTL and QTL with MAF ≥5%, and utilizing high-density SNP and sequence data with marker effects obtained from genomic models. The Pearson correlation between the true and estimated $\sigma^2_{gamete}$ ranged from medium to high (Table 2), similar to those studies on breeding values (Meuwissen et al., 2001; Daetwyler et al., 2010; Clark et al., 2011). In general, the correlation increased when the heritability ($h^2$) of traits increased, whereas the same relation was not apparent when the number of QTL was large. In contrast, for GEBV prediction, increased accuracy has been reported with increased $h^2$ and for scenarios with a small number of QTL, particularly when these were estimated by differential shrinkage models (Daetwyler et al., 2010; Clark et al., 2011).

We observed higher correlations between the true and predicted $\sigma^2_{gamete}$ using BLASSO compared with GBLUP in all scenarios (Table 2). These results were partly due to the small genome and large QTL effects simulated. Although GBLUP can have a similar or slightly better performance for prediction of GEBV than differential shrinkage models for scenarios with a large number of QTL (Daetwyler et al., 2010), the accuracy of the estimated marker effects, mainly for QTL regions, is greater from differential shrinkage models (Meuwissen et al., 2001; Shepherd et al., 2010; Legarra et al., 2011). For estimating $\sigma^2_{gamete}$, the marker effect has a greater impact than for GEBV prediction because $\sigma^2_{gamete}$ uses the squared marker effects as well as the dependency of the chromosome segments. Therefore, this observation can also be attributed to the greater accuracy of the marker effects estimated by BLASSO and to the high dependency of the chromosome segments simulated.
The effect on prediction was inferred by a linear regression between true and estimated $\sigma^2_{gamete}$. For the intercept of regression (a), GBLUP had a lower scale effect (close to zero) than BLASSO but the difference was not large (Table 3). A low scale effect is important for $\sigma^2_{gamete}$ prediction because it affects the precision of the limit values of the confidence interval for future progeny PTA. The scale effect may be affected by the prediction models and by factors inherent to the trait. However, GBLUP had a larger prediction bias, worse values of mean squared error, and regression coefficients (b) further from 1 (Table 3). For genomic prediction, lower bias has also been reported for differential shrinkage models (Meuwissen et al., 2001). Our result can be attributed to the accuracy of the estimated marker effects and to the small number of independent chromosome segments simulated.
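The summary statistics discussed here (Pearson correlation, intercept a, slope b, and mean squared error) can be sketched as follows. This is a minimal illustration and makes two assumptions that the text does not state explicitly: the regression is taken as true values on estimated values, and the mean squared error is the mean squared difference between estimated and true values. The function name is hypothetical.

```python
import numpy as np


def prediction_summary(true_var, est_var):
    """Correlation, regression of true on estimated values, and MSE."""
    true_var = np.asarray(true_var, dtype=float)
    est_var = np.asarray(est_var, dtype=float)

    # np.polyfit returns the slope first, then the intercept.
    b, a = np.polyfit(est_var, true_var, deg=1)
    r = np.corrcoef(true_var, est_var)[0, 1]
    mse = np.mean((est_var - true_var) ** 2)
    return {"r": float(r), "a": float(a), "b": float(b), "mse": float(mse)}
```

Under this convention, a slope below 1 indicates that the estimates are spread more widely than the true values, and a slope above 1 indicates the opposite.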
For a trait with $h^2$ = 0.10 and 20 QTL (Table 2), the correlations between $\sigma^2_{gamete}$ obtained with all QTL and
`
Table 2. Pearson correlations between variance of gametic diversity for all QTL (σ²g), for QTL with minor allele frequency (MAF) ≥0.05 (σ²gm), and disregarding the covariances for all QTL (σ²d) and QTL with MAF ≥0.05 (σ²dm), and their estimations using a high-density marker panel and sequence data by genomic BLUP (bp) and Bayesian LASSO (ls), considering (σ²gbp and σ²gls) and disregarding (σ²dbp and σ²dls) the dependency between the markers1

                              High-density SNP               Sequence data                  QTL data
h2   QTL    Gametic    σ²gbp  σ²gls  σ²dbp  σ²dls    σ²gbp  σ²gls  σ²dbp  σ²dls    σ²g    σ²gm   σ²d    σ²dm
     (no.)  variance
0.1  20     σ²g        0.49   0.56   0.17   0.39     0.46   0.57   0.20   0.40     -      0.75   0.96   0.69
            σ²gm       0.53   0.74   0.21   0.54     0.48   0.75   0.25   0.55     0.75   -      0.66   0.93
            σ²d        0.45   0.53   0.15   0.43     0.43   0.53   0.19   0.43     0.96   0.66   -      0.71
            σ²dm       0.50   0.74   0.18   0.61     0.45   0.73   0.24   0.61     0.69   0.93   0.71   -
     200    σ²g        0.50   0.60   0.29   0.37     0.46   0.61   0.29   0.40     -      0.96   0.50   0.48
            σ²gm       0.48   0.61   0.29   0.39     0.45   0.63   0.30   0.41     0.96   -      0.46   0.49
            σ²d        0.29   0.28   0.51   0.30     0.28   0.27   0.48   0.31     0.50   0.46   -      0.97
            σ²dm       0.27   0.29   0.52   0.32     0.26   0.29   0.49   0.33     0.48   0.49   0.97   -
0.3  20     σ²g        0.64   0.83   0.28   0.66     0.59   0.83   0.07   0.65     -      0.94   0.95   0.90
            σ²gm       0.65   0.87   0.28   0.68     0.59   0.87   0.07   0.68     0.94   -      0.90   0.95
            σ²d        0.60   0.81   0.30   0.69     0.54   0.81   0.07   0.68     0.95   0.90   -      0.95
            σ²dm       0.60   0.85   0.30   0.71     0.55   0.85   0.07   0.70     0.90   0.95   0.95   -
     200    σ²g        0.63   0.77   0.25   0.49     0.59   0.77   0.29   0.48     -      0.95   0.55   0.52
            σ²gm       0.62   0.78   0.25   0.51     0.57   0.78   0.29   0.49     0.95   -      0.53   0.53
            σ²d        0.42   0.48   0.52   0.63     0.40   0.49   0.54   0.62     0.55   0.53   -      0.99
            σ²dm       0.41   0.48   0.52   0.63     0.39   0.48   0.54   0.63     0.52   0.53   0.99   -
0.5  20     σ²g        0.54   0.67   0.28   0.50     0.48   0.66   0.18   0.49     -      0.86   0.94   0.81
            σ²gm       0.51   0.67   0.26   0.47     0.44   0.65   0.16   0.46     0.86   -      0.79   0.93
            σ²d        0.52   0.64   0.30   0.53     0.46   0.63   0.19   0.51     0.94   0.79   -      0.85
            σ²dm       0.49   0.64   0.28   0.51     0.43   0.63   0.18   0.49     0.81   0.93   0.85   -
     200    σ²g        0.79   0.85   0.37   0.51     0.76   0.84   0.29   0.51     -      0.95   0.65   0.62
            σ²gm       0.77   0.86   0.37   0.55     0.74   0.86   0.30   0.55     0.95   -      0.64   0.65
            σ²d        0.53   0.61   0.49   0.83     0.52   0.61   0.38   0.83     0.65   0.64   -      0.98
            σ²dm       0.51   0.61   0.49   0.85     0.50   0.61   0.37   0.85     0.62   0.65   0.98   -
1Values in bold represent the best estimates.
`
`
Table 3. Mean squared prediction error (MSE), intercept (a), and coefficient (b) of the linear regression between the variance of gametic diversity for QTL and its estimates using a high-density SNP panel and sequence data by genomic models1

                           High-density SNP               Sequence data
h2   QTL (no.)  Model2     MSE       a          b         MSE       a           b
0.1  20         GBLUP      0.0014    −0.0010    0.27      0.0022    −0.00033    0.20
                BLASSO     0.00008   0.0027     1.20      0.00008   0.00185     1.26
     200        GBLUP      0.0010    0.0058     0.23      0.0016    0.00637     0.18
                BLASSO     0.0001    0.0074     1.01      0.0001    0.00681     1.03
0.3  20         GBLUP      0.0017    −0.00697   0.43      0.0028    −0.00625    0.35
                BLASSO     0.0002    0.00282    1.46      0.0002    0.00247     1.41
     200        GBLUP      0.0021    0.00979    0.40      0.0035    0.01123     0.33
                BLASSO     0.0004    0.00945    1.14      0.0004    0.00950     1.13
0.5  20         GBLUP      0.0019    −0.00294   0.26      0.0030    −0.002039   0.19
                BLASSO     0.0001    0.00188    1.41      0.0001    0.001866    1.37
     200        GBLUP      0.0022    0.00560    0.62      0.0033    0.006547    0.56
                BLASSO     0.0008    0.00851    1.10      0.0007    0.008799    1.09
1Values in bold represent the least-biased estimates.
2GBLUP = genomic BLUP; BLASSO = Bayesian LASSO.
`
with QTL of MAF ≥5% were of moderate to high magnitude, lower than that of other traits (high magnitude), resulting in lower correlations with the $\sigma^2_{gamete}$ estimated by genomic models. Although this result may be due to allele frequency fluctuations in the historical population, it also implies that QTL with low MAF are important for obtaining accurate estimates of $\sigma^2_{gamete}$. This variance does not depend directly on population allele frequencies but on the individual's heterozygote status. Although MAF filtering (≥5%) can be used to improve the prediction of GEBV (Uemoto et al., 2015), markers with low MAF may have greater linkage disequilibrium with QTL with low MAF, providing better predictions of gametic variance.
To facilitate rapid calculation of $\sigma^2_{gamete}$, we tested some scenarios without considering the covariance (dependence) between markers. However, the correlation between true and estimated $\sigma^2_{gamete}$ was always lower compared with the full model, with the difference ranging from moderate to high when the estimates were obtained from QTL, and from low to high when obtained from the marker effects (Table 2). The high correlation observed for one of the scenarios ($h^2$ = 0.30 and QTL = 20), however, can be attributed to the random distribution of QTL in the genome. Therefore, covariance between markers should always be included when calculating $\sigma^2_{gamete}$, which should thus be preferred over the traditional Mendelian sampling variance (Appendix A1). This result is consistent with Bonk et al. (2016), who recommended the use of haplotype and direct recombination data (Cole and VanRaden, 2011).
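The effect of dropping the covariance terms can be seen in a small numerical sketch that compares the full quadratic form with the diagonal-only sum. It uses the recombination-rate form of the off-diagonal elements, $M_{jk} = al_{jk}(0.25 - rate_{jk}/2)$. The function name, the ±1 phase coding per locus, and the toy inputs are assumptions for illustration.

```python
import numpy as np


def full_and_diagonal_variance(alpha, phase, recomb_rate):
    """Return (variance with covariances, variance ignoring covariances).

    alpha       : effects at the N heterozygous loci
    phase       : +1 / -1 per locus, the haplotype carrying the reference allele
    recomb_rate : N x N matrix of pairwise recombination rates
    """
    alpha = np.asarray(alpha, dtype=float)
    phase = np.asarray(phase, dtype=float)
    # Rates of 0.5 or more are treated as independent loci (M_jk = 0).
    rate = np.minimum(np.asarray(recomb_rate, dtype=float), 0.5)

    m = np.outer(phase, phase) * (0.25 - rate / 2.0)
    np.fill_diagonal(m, 0.25)

    full = float(alpha @ m @ alpha)
    diagonal_only = float(0.25 * np.sum(alpha ** 2))
    return full, diagonal_only


# Two completely linked loci with equal effects.
no_recombination = [[0.0, 0.0], [0.0, 0.0]]
print(full_and_diagonal_variance([1.0, 1.0], [1, 1], no_recombination))   # coupling
print(full_and_diagonal_variance([1.0, 1.0], [1, -1], no_recombination))  # repulsion
```

In this extreme case the diagonal-only value is 0.5 for both individuals, whereas the full calculation gives 1.0 in coupling phase and 0.0 in repulsion phase. Ignoring covariance therefore cannot separate individuals that differ only in linkage phase.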
No difference in correlation between true and estimated $\sigma^2_{gamete}$ from BLASSO was observed between the high-density SNP and sequence data scenarios (Table 2), indicating that SNP panels with moderate densities are sufficient for estimating $\sigma^2_{gamete}$. However, a decrease in correlation was observed for estimates obtained with GBLUP when the sequence data panel was used, regardless of the number of simulated QTL. For GEBV prediction, Clark et al. (2011) observed a small difference in performance using differential shrinkage with sequence data compared with medium-density SNP panels. Pérez-Enciso et al. (2017) also reported a modest increase in accuracy using a differential shrinkage model on sequence data. Therefore, sequence data are unlikely to outperform SNP panels for predicting GEBV when the number of loci is large and the prior given to each SNP is uniform. Although no improvement in accuracy for $\sigma^2_{gamete}$ was observed with an increased number of markers, the difference in performance between the 2 types of methods was in line with the literature on GEBV studies. This fact, together with the increase in overestimation due to an increased number of markers (Table 3), confirms the preference for shrinkage models for estimating $\sigma^2_{gamete}$ in our simulation of a small genome and relatively large QTL effects.
The correlation between true and predicted CRV was lower than that of $\sigma^2_{gamete}$ (Supplemental Table S1; https://doi.org/10.3168/jds.2018-15971). No single model was best in all scenarios, but GBLUP showed better prediction performance for many scenarios, whereas BLASSO had better results when ignoring the covariance between markers in scenarios with moderate heritability and a small number of QTL. Generally, the prediction with high-density markers showed a higher accuracy than that with sequence data. The CRV is a relative parameter that indicates how variable the GEBV of an individual is when transmitted to its gametes. The magnitude of the correlation showed that this parameter can be predicted, although the decreased accuracy with an increased number of markers indicated some difficulty for prediction in these cases.

These results may also be explained by a partition of CRV (Appendix A3). Similar results were observed for $\sigma^2_{gamete}$ and CRV in the 10th generation using the marker effects estimated from the 9th generation. This means that predictions for these parameters can follow the same design used in genomic selection programs to calculate GEBV, and $\sigma^2_{gamete}$ can be estimated using the training data from previous generations (Habier et al., 2007).
`
`Application of Gametic Variance
`in Selection Programs
`
`The percentage of additional genetic gain (ΔG) per
`generation in selection by using RPTA compared with
`PTA (ΔGRPTA-PTA/ΔGPTA), as well as the accumulated
`gain for a period of 10 generations, was used to assess
`the suitability of the new selection index (Figure 1 and
Supplemental Figure S1; https://doi.org/10.3168/jds.2018-15971). The accumulated genetic gains obtained
`with RPTA were higher than those obtained with PTA
`when the number of QTL increased. No significant in-
`crease was observed for a small number of QTL (20).
`However, in scenarios with more QTL, the genetic gain
`was close to expected (Appendix A2), with ΔG ranging
`from 5 to 16% in 10 generations, indicating an advan-
`tage of RPTA for traits with large numbers of QTL.
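The index being compared here follows Equation [3], and the reported percentage is the ratio ΔGRPTA-PTA/ΔGPTA. A minimal sketch of both calculations is given below. The function names and the two-candidate toy values are hypothetical and serve only to show how a candidate with a lower PTA but a larger gametic variance can rank first under RPTA.

```python
import numpy as np


def rpta(pta, gametic_variance, i_f):
    """Equation [3]: RPTA = PTA + sigma_gamete * i_f."""
    pta = np.asarray(pta, dtype=float)
    sd = np.sqrt(np.asarray(gametic_variance, dtype=float))
    return pta + sd * i_f


def extra_gain_percent(gain_rpta, gain_pta):
    """Additional gain of RPTA selection relative to PTA selection, in percent."""
    return 100.0 * (gain_rpta - gain_pta) / gain_pta


# Toy candidates: the second has a lower PTA but a larger gametic variance.
pta = [1.00, 0.95]
variance = [0.04, 0.25]
index = rpta(pta, variance, i_f=0.5)
ranking = np.argsort(-index)  # candidate indices, best first
print(index, ranking)
```

With $i_f$ = 0.5 the toy RPTA values are about 1.10 and 1.20, so the second candidate ranks first even though its PTA is lower. With a smaller $i_f$ the two rankings converge, which is one reason several values of $i_f$ are worth testing.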
`These results were in agreement with those reported
`by Daetwyler et al. (2015) using a genomic optimal
`haploi