class: center, middle, inverse, title-slide

.title[
# Correlated traits: Index selection
]
.author[
### Jinliang Yang
]
.date[
### April 26, 2024
]

---

# The correlation between two traits

`\begin{align*} & r_P = r_Ah_Xh_Y + r_Ee_Xe_Y \\ \end{align*}`

- __`\(r_P\)`__: the phenotypic correlation between two traits X and Y
- __`\(r_A\)`__: the genetic correlation between the breeding values of X and Y
- __`\(r_E\)`__: the environmental correlation between X and Y, including non-additive genetic effects
- __`\(h^2\)`__: heritability (so `\(h_X\)` and `\(h_Y\)` are the square roots of the heritabilities of X and Y)
- __`\(e^2\)`__: `\(1-h^2\)`

----

This relationship shows that the genetic and environmental correlations combine to produce the phenotypic correlation.

- If both traits have __low heritabilities__:
  - then the phenotypic correlation is determined mainly by the environmental correlation.
- If they have __high heritabilities__:
  - the genetic correlation is more important.

---

# Correlated response to selection

`\begin{align*} & CR_Y = ih_Xh_Yr_A\sigma_{P_Y} \\ \end{align*}`

- Here `\(i\)` is the selection intensity for trait X, or `\(i_X\)`, and `\(\sigma_{P_Y}\)` is the phenotypic standard deviation of trait Y.

-----

In the formula:

- `\(h_Xh_Yr_A\)` is referred to as the __coheritability__, as it takes the place of the heritability in the direct response equation.
- If `\(h_Xh_Yr_A\)` is larger than the heritability of trait Y, `\(h_Y^2\)`, then selecting on the correlated trait X gives a larger response in Y than direct selection on Y, assuming equal selection intensities.

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen in the ratio of the two responses:

`\begin{align*} \frac{CR_Y}{R_Y} & = \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\ & = \frac{i_Xh_Xr_A}{i_Yh_Y}\\ \end{align*}`

--

### Assuming selection intensity is the same

- When __`\(h_Xr_A > h_Y\)`__, the correlated response from selection on a secondary trait (X) is greater than the response to direct selection on Y.
--

- Note that this same property determines the usefulness of molecular markers for __marker-assisted selection__.

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen in the ratio of the two responses:

`\begin{align*} \frac{CR_Y}{R_Y} & = \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\ & = \frac{i_Xh_Xr_A}{i_Yh_Y}\\ \end{align*}`

### Practical considerations

- Trait Y may be very __expensive__ and __difficult__ to measure, while trait X is cheap and easy to measure.
  - e.g. high-throughput phenotyping technologies

--

- Or the desired trait is measurable __in one sex__ only, but the secondary trait is measurable in both.
  - e.g. milk yield and body weight in dairy cattle

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen in the ratio of the two responses:

`\begin{align*} \frac{CR_Y}{R_Y} & = \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\ & = \frac{i_Xh_Xr_A}{i_Yh_Y}\\ \end{align*}`

### Genotype-by-environment interaction

- Performance in different environments can be regarded as __two separate, but correlated traits__.
- Improvement in one environment by selection in another environment can be predicted by knowing the heritability in each environment and the genetic correlation between them.
  - e.g. breeding for future environments in the face of climate change

---

# Selection on multiple traits

Total economic value is a composite of many traits. How does one __maximize response for many traits__ simultaneously?

--

#### Options for multiple trait selection

- __Tandem selection__:
  - Select for one trait at a time until that trait is improved to a desired level.
  - After that, selection proceeds to another trait.

--

- __Independent culling levels__:
  - Only individuals that meet the minimum standard for every trait are selected.

--

- __Index selection__:
  - Select for multiple traits simultaneously by constructing an index value.
  - The index value is then treated as a single economic trait.
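---

# Selection on multiple traits: a small sketch

The difference between independent culling levels and index selection can be illustrated with a few lines of R. This is only a sketch: the six candidates, their phenotypes, the two thresholds, and the index weights `w` are all hypothetical values chosen for illustration, not estimates from any data set.

```r
# Hypothetical phenotypes for six candidates and two traits (made-up numbers)
pheno <- data.frame(
  id     = paste0("line", 1:6),
  trait1 = c(12, 9, 15, 10, 8, 11),
  trait2 = c(5, 9, 2, 7, 10, 6),
  stringsAsFactors = FALSE
)

# Independent culling levels: keep candidates that meet BOTH minimum standards
min_trait1 <- 10   # hypothetical threshold for trait 1
min_trait2 <- 5    # hypothetical threshold for trait 2
culled <- pheno$id[pheno$trait1 >= min_trait1 & pheno$trait2 >= min_trait2]

# Index selection: combine the traits into one value, then keep the top three
w <- c(trait1 = 1, trait2 = 1.5)   # hypothetical index weights
pheno$index <- pheno$trait1 * w[["trait1"]] + pheno$trait2 * w[["trait2"]]
top3 <- pheno$id[order(pheno$index, decreasing = TRUE)][1:3]

culled
top3
```

With these made-up numbers the two rules choose different candidates: culling keeps `line1`, `line4`, and `line6`, while the index lets a strong value for one trait compensate for a weaker value in the other and picks `line5`, `line2`, and `line4`.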
---

# Genetic merit

- Any individual has a genetic value for what we'll call __genetic merit__, or simply, __merit__.

--

- Merit is the summation of all traits contributing to an individual's worth, fitness, economic value, etc.

--

#### The __true value of merit__ is represented as:

`\begin{align*} T & = a_1G_1 + a_2G_2 + ... + a_mG_m \\ & = \sum_{i=1}^ma_iG_i \\ \end{align*}`

- `\(G_i\)` is the genetic value for trait `\(i\)`
- `\(a_i\)` is the economic weight placed on trait `\(i\)`
  - The economic weights are set by the breeder according to production needs and value.

---

# Index trait

To accurately predict genetic merit `\(T\)`, we want to combine the values of multiple traits into one value, denoted `\(I\)`.

--

`\begin{align*} I & = b_1P_1 + b_2P_2 + ... + b_mP_m \\ & = \sum_{i=1}^mb_iP_i \\ \end{align*}`

- `\(P_i\)` is the phenotypic value for trait `\(i\)` that goes into the index
- `\(b_i\)` is the weighting factor on trait `\(i\)`

--

### Correlation between T and I

The goal is to __find the values of the `\(b_i\)`s__ that __maximize the correlation__ between `\(T\)` and `\(I\)`, or `\(r_{TI}\)`.

`\begin{align*} r_{TI} = \frac{Cov(T, I)}{\sigma_T\sigma_I} \end{align*}`

---

# Correlation between T and I

`\begin{align*} r_{TI} = \frac{Cov(T, I)}{\sigma_T\sigma_I} \end{align*}`

Obtaining the maximum value of the correlation involves taking the derivative with respect to the `\(b_i\)`s and setting it to zero.
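
A sketch of that step in matrix notation (not a full proof). It assumes that environmental deviations are uncorrelated with genetic values, so that `\(Cov(T, I) = \mathbf{b}^T\mathbf{G}\mathbf{a}\)`, and writes `\(\mathbf{a}\)` and `\(\mathbf{b}\)` for the vectors of the `\(a_i\)` and `\(b_i\)`, and `\(\mathbf{P}\)` and `\(\mathbf{G}\)` for the phenotypic and genetic variance-covariance matrices:

`\begin{align*} r_{TI} & = \frac{\mathbf{b}^T\mathbf{G}\mathbf{a}}{\sqrt{\mathbf{a}^T\mathbf{G}\mathbf{a}}\sqrt{\mathbf{b}^T\mathbf{P}\mathbf{b}}} \\ \frac{\partial \ln r_{TI}}{\partial \mathbf{b}} & = \frac{\mathbf{G}\mathbf{a}}{\mathbf{b}^T\mathbf{G}\mathbf{a}} - \frac{\mathbf{P}\mathbf{b}}{\mathbf{b}^T\mathbf{P}\mathbf{b}} = \mathbf{0} \\ \mathbf{P}\mathbf{b} & = \frac{\mathbf{b}^T\mathbf{P}\mathbf{b}}{\mathbf{b}^T\mathbf{G}\mathbf{a}}\mathbf{G}\mathbf{a} \\ \end{align*}`

Because `\(r_{TI}\)` does not change when `\(\mathbf{b}\)` is multiplied by a positive constant, the scalar on the right can be set to 1, which leaves `\(\mathbf{P}\mathbf{b} = \mathbf{G}\mathbf{a}\)`. Since `\(\mathbf{G}\)` is symmetric, `\(\mathbf{G} = \mathbf{G}^T\)`.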
--

### Optimum index

The vector of weights that maximizes `\(r_{TI}\)` is called the __Smith-Hazel index__, or __optimum index__:

- It is the most widely used set of weights for a linear selection index
- See page 411 [here](ch37_index-selection.pdf) for the proof

`\begin{align*} & \mathbf{b} = \mathbf{P^{-1}}\mathbf{G^Ta} \\ \end{align*}`

- `\(\mathbf{P}\)` is the phenotypic variance-covariance matrix
- `\(\mathbf{G}\)` is the genetic variance-covariance matrix
- `\(\mathbf{a}\)` is the vector of __known__ economic weights
- `\(\mathbf{b}\)` is the vector of __unknown__ weights, i.e., the weights to be applied to the phenotypic values of the different traits composing the index, `\(I\)`.

---

# Optimum index

The vector of weights that maximizes `\(r_{TI}\)` is called the __Smith-Hazel index__, or __optimum index__:

`\begin{align*} & \mathbf{b} = \mathbf{P^{-1}}\mathbf{G^Ta} \\ \end{align*}`

--

- The vector `\(\mathbf{b}\)` can be estimated if we know the __phenotypic and genetic variances__ and the __phenotypic and genetic covariances__.

--

- This assumes that we have good information on the economic weights, which can be hard to determine.

--

- The resulting vector of weights then gives a selection index that maximizes genetic gain in `\(T\)`, the true genetic merit.

--

- The drawback of this index is that the genetic variances and covariances are often estimated with large amounts of error, which may reduce the correlation between `\(I\)` and `\(T\)`.

---

# A simulated example

- A corn breeder, aiming to increase grain yield in a cost-effective manner for their small breeding program, selects on key yield-component traits instead of relying solely on combine-based yield measurements.
- These traits include kernel count (KC), three ear-related traits (EL, EW, ED), and three cob-related traits (CL, CW, CD).

<div align="center">
<img src="yct.png" height=230>
</div>

--

How can we build a selection index to maximize the genetic gain?
`\begin{align*} I & = \sum_{i=1}^{m=7}b_iP_i \\ \end{align*}`

---

# A simulated example

```r
# Genetic (G) and phenotypic (P) variance-covariance matrices, and economic weights (a)
G <- read.csv("https://jyanglab.com/slides/2024-agro931/Gmatrix.csv")
P <- read.csv("https://jyanglab.com/slides/2024-agro931/Pmatrix.csv")
a <- read.csv("https://jyanglab.com/slides/2024-agro931/Wmatrix.csv")
# solve() returns the inverse of a matrix
solve(as.matrix(P[, -1]))
```

```
##           [,1]        [,2]       [,3]       [,4]        [,5]        [,6]        [,7]
## KC  0.79474139 -0.08412097 -0.6252394 -0.5008019 -0.10351099  -0.5123586  -2.5482715
## EL -0.08412097  0.31442910 -0.1616803  0.2426977  0.06199121  -0.3469347   0.9845905
## EW -0.62523938 -0.16168025  6.9321926  1.5576210 -0.78367948   1.5797301  -0.6336474
## ED -0.50080191  0.24269775  1.5576210  3.1885762 -0.34724248   1.9990310  -1.3838360
## CL -0.10351099  0.06199121 -0.7836795 -0.3472425  0.40582077  -0.7215205   1.7977689
## CW -0.51235860 -0.34693473  1.5797301  1.9990310 -0.72152053  21.4511330 -24.2007667
## CD -2.54827155  0.98459046 -0.6336474 -1.3838360  1.79776894 -24.2007667  85.0088254
```

```r
# t() returns the transpose of a matrix
t(as.matrix(G[,-1]))
```

```
##      [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]
## KC 1.2566  0.3294  0.1588  0.2430  0.7350  0.1276  0.0926
## EL 0.3294  1.5602  0.1734 -0.3129 -0.2331  0.1168  0.0330
## EW 0.1588  0.1734  0.1325 -0.0316  0.3201 -0.0086 -0.0124
## ED 0.2430 -0.3129 -0.0316  0.2432  0.3019 -0.0209  0.0074
## CL 0.7350 -0.2331  0.3201  0.3019  0.9608 -0.0692 -0.0582
## CW 0.1276  0.1168 -0.0086 -0.0209 -0.0692  0.0174  0.0085
## CD 0.0926  0.0330 -0.0124  0.0074 -0.0582  0.0085  0.0103
```

---

# A simulated example

The vector of weights for the __Smith-Hazel index__, or __optimum index__, is:

`\begin{align*} & \mathbf{b} = \mathbf{P^{-1}}\mathbf{G^Ta} \\ \end{align*}`

```r
# b = P^{-1} G^T a
b <- solve(as.matrix(P[, -1])) %*% t(as.matrix(G[, -1])) %*% as.matrix(a[,1])
b
```

```
##          [,1]
## KC  1.0280734
## EL  0.4042746
## EW  2.3218498
## ED  0.9935738
## CL -0.1079861
## CW  0.2339995
## CD -0.6489541
```
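
---

# A simulated example: using the weights

Once `\(\mathbf{b}\)` is available, it can be used to score candidates and to check how well the index tracks merit. The sketch below does this for a self-contained two-trait case so that it runs without downloading the course files. The matrices `P_toy` and `G_toy`, the weights `a_toy`, and the three individuals in `pheno_toy` are hypothetical values made up for illustration; they are not the corn data above.

```r
# Hypothetical two-trait example; every number here is made up for illustration
traits <- c("X", "Y")
P_toy <- matrix(c(4.0, 1.0,
                  1.0, 2.0), nrow = 2, byrow = TRUE,
                dimnames = list(traits, traits))   # phenotypic (co)variances
G_toy <- matrix(c(2.0, 0.5,
                  0.5, 0.8), nrow = 2, byrow = TRUE,
                dimnames = list(traits, traits))   # genetic (co)variances
a_toy <- c(X = 1, Y = 2)                           # economic weights

# Smith-Hazel weights b = P^{-1} G^T a;
# solve(P, v) computes P^{-1} v without forming the inverse explicitly
b_toy <- solve(P_toy, t(G_toy) %*% a_toy)

# Correlation between index and merit: r_TI = Cov(T, I) / (sigma_T * sigma_I)
cov_TI  <- drop(t(b_toy) %*% G_toy %*% a_toy)
sigma_I <- sqrt(drop(t(b_toy) %*% P_toy %*% b_toy))
sigma_T <- sqrt(drop(t(a_toy) %*% G_toy %*% a_toy))
r_TI    <- cov_TI / (sigma_T * sigma_I)

# Index values I = sum(b_i * P_i) for three hypothetical individuals
pheno_toy <- matrix(c(10.2, 5.1,
                       9.4, 6.3,
                      11.0, 4.2), ncol = 2, byrow = TRUE,
                    dimnames = list(c("ind1", "ind2", "ind3"), traits))
index_values <- drop(pheno_toy %*% b_toy)
ranking <- names(sort(index_values, decreasing = TRUE))

b_toy
r_TI
ranking
```

With these made-up inputs, `r_TI` is about 0.68 and the candidates rank `ind2`, `ind1`, `ind3`. The same steps would apply to the seven-trait `P`, `G`, and `a` above, given a matrix of phenotypes whose columns are in the same trait order.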