Correlated traits: Index selection

class: center, middle, inverse, title-slide

# Correlated traits: Index selection
### Jinliang Yang
### Dec. 9th, 2019

---

# The correlation between two traits

`\begin{align*}
& r_P = r_Ah_Xh_Y + r_Ee_Xe_Y \\
\end{align*}`

- __ `\(r_P\)`__: the phenotypic correlation between two traits X and Y
- __ `\(r_A\)`__: the genetic correlation due to breeding values between X and Y
- __ `\(r_E\)`__: the environmental correlation between X and Y, including non-additive genetic effects
- __ `\(h^2\)`__: heritability
- __ `\(e^2\)`__: `\(1-h^2\)`

---
  
This proof generally shows that the genetic and environmental correlation come together to create the phenotypic correlation.

- If both traits have __low heritabilites__: 
  - then phenotypic correlation is determined mainly be the environmental correlations.
- If they have __high heritabilities__:
  - genetic correlation is more important.

---

# The correlation between two traits

`\begin{align*}
& r_P = r_Ah_Xh_Y + r_Ee_Xe_Y \\
\end{align*}`

---

### Estimates of genetic correlation

- Rests on the resemblance between relatives in a similar manner to the estimation of `\(h^2\)`
- But rather than perform an ANOVA, perform __an analysis of covariance__. 
  - The interpretation of covariance is exactly the same.

### Some precautions with genetic correlations:

- They have large sampling error.
  - Increasing the precision of genetic correlations can be attained through using the same techniques for increasing the precision of heritability estimates.

- Subject to change with allele frequency differences.
  
---

# Correlated response to selection

If we select on trait X, how will trait Y change?

This comes down to the relationship between the breeding values (BVs) for trait Y and breeding values for trait X.

- It can be expressed as __"what is the expected BV of Y given a BV of X"__
- Or `\(E(A_Y | A_X)\)`

.pull-left[
<img src="Ch19-c2_files/figure-html/unnamed-chunk-1-1.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right[

`\begin{align*}
& b_{A_{YX}} = \frac{Cov_A}{\sigma^2_{A_X}} \\
& Cov_A = r_A\sigma_{A_X}\sigma_{A_Y} \\
\end{align*}`

Therefore,
`\begin{align*}
 b_{A_{YX}} & = \frac{Cov_A}{\sigma^2_{A_X}} \\
&  = \frac{r_A\sigma_{A_Y}}{\sigma_{A_X}} \\
\end{align*}`

]

---
# Correlated response to selection

`\begin{align*}
 b_{A_{YX}}  = r_A\frac{\sigma_{A_Y}}{\sigma_{A_X}} \\
\end{align*}`

Recall that `\(R_X = ih_X\sigma_{A_X}\)`, correlated response of trait Y is:

`\begin{align*}
CR_Y &  =  b_{A_{YX}}R_X  \\
& = r_A \frac{\sigma_{A_Y}}{\sigma_{A_X}} ih_X\sigma_{A_X}\\
& = ih_Xr_A\sigma_{A_Y} \\
\end{align*}`

Because `\(\sigma_{A_Y} = h_Y\sigma_{P_Y}\)`, therefore,

`\begin{align*}
& CR_Y = ih_Xh_Yr_A\sigma_{P_Y} \\
\end{align*}`

In the formula:
- `\(h_Xh_Yr_A\)` is referred to as the __coheritability__, as it takes the place of the heritability in the direct response equation.
- If `\(h_Xh_Yr_A\)` is larger than `\(h^2\)` of trait Y, then selection on a correlated trait should be used.

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen:

`\begin{align*}
\frac{CR_Y}{R_Y}  & =  \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\
& = \frac{i_Xh_Xr_A}{i_Yh_Y}\\
\end{align*}`

### Assuming seletion intensity is the same

- When __ `\(h_Xr_A > h_Y\)`__, a correlated response from selection on a secondary trait (X) is greater than response to direct selection on Y .

- Note that this exact same property applies to usefulness of molecular markers for __marker-assisted selection__

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen:

`\begin{align*}
\frac{CR_Y}{R_Y}  & =  \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\
& = \frac{i_Xh_Xr_A}{i_Yh_Y}\\
\end{align*}`

### Practical considerations

- If trait Y is very __expensive__ and __difficult__ to measure, but trait X is very cheap and easy to measure.
  - e.g. high-throughput phenotyping techonologies

- Or the desired traits is measurable __in one sex__ only, but the secondary traits is measurable in both.
  - e.g. milk yield and body weight in dairy cow

---

# Indirect selection

The trade-off between selection on a correlated trait and direct selection on a trait can also be seen:

`\begin{align*}
\frac{CR_Y}{R_Y}  & =  \frac{i_Xh_Xr_A\sigma_{A_Y}}{i_Yh_Y\sigma_{A_Y}} \\
& = \frac{i_Xh_Xr_A}{i_Yh_Y}\\
\end{align*}`

### Genotype-by-environment interaction

- Performance in different environments can be regarded as __two separate, but correlated traits__.
 - Improvement in one environment by selection in another environment can be predicted by knowing the heritability in each environment and the genetic correlation between them.

---

# Selection on multiple traits

Total economic value is a composite of many traits. How does one __maximize response for many traits__ simultaneously?

#### Options for multiple trait selection

- __Tandom selection__: 
  - Selection for one trait at a time until that trait is improved to a desired level.
  - After that, selection proceeds for another trait.

- __Independent culling levels__: 
  - Only individuals that meet the minimum standard for each trait are selected.

- __Index selection__: 
  - Select for multiple traits simultaneously by constructing an index value.
  - Index value is then treated as a single economic trait.

---

# Genetic merit

- Any individual has a genetic value for what we'll call as __genetic merit__, or simply, __merit__.

- Merit is the summation of all traits contributing to an individual's worth, fitness, economic value, etc.

#### The __true value of merit__ is represented as:

`\begin{align*}
T & = a_1G_1 + a_2G_2 + ... + a_mGm \\
& = \sum_{i=1}^ma_iG_i \\
\end{align*}`

- `\(G_i\)` is the genetic value for trait `\(i\)`
- `\(a_i\)` is the economic weight placed on trait `\(i\)`
  - The economic weights are set by the breeder according to production needs and value.

---
# Index trait

To accurately predict genentic merit `\(T\)`, we want to combine the values of multiple traits into one value, denoted `\(I\)`.

`\begin{align*}
I & = b_1P_1 + b_2P_2 + ... + b_mPm \\
& = \sum_{i=1}^mb_iP_i \\
\end{align*}`

- `\(P_i\)` is the phenotypic value for trait `\(i\)` that goes into the index
- `\(b_i\)` is the weighting factor on trait `\(i\)`

### Correlation between T and I

The goal is to __find the values of the `\(b_i\)`s__ that could __maximize the correlations__ between `\(T\)` and `\(I\)`, or `\(r_{TI}\)`.

`\begin{align*}
r_{TI} = \frac{Cov(T, I)}{\sigma_T\sigma_I}
\end{align*}`

---

# Correlation between T and I

`\begin{align*}
r_{TI} = \frac{Cov(T, I)}{\sigma_T\sigma_I}
\end{align*}`

Obtaining the maximum value of the correlation involves taking the derivative and setting to zero.

--
### Optimum index

The vector of weights, which is called the __Smith-Hazel index__, or __optimum index__:
  - It is the most widely used set of weights for a linear selection index
  - See [here](ch37_index-selection.pdf) Page 411 for the proof

`\begin{align*}
& \mathbf{b} = \mathbf{P^{-1}}\mathbf{G^Ta}  \\
\end{align*}`

- `\(\mathbf{P}\)` is the phenotypic variance-covariance matrix
- `\(\mathbf{G}\)` is the genetic variance-covariance matrix
- `\(\mathbf{a}\)` is the vector of __known__ economic weights
- `\(\mathbf{b}\)` is the vector of __unknow__ weights, or the weights to be applied to the phenotypic values of the different traits composing the index, `\(I\)`.

---

# Optimum index

The vector of weights, which is called the __Smith-Hazel index__, or __optimum index__:

`\begin{align*}
& \mathbf{b} = \mathbf{P^{-1}}\mathbf{G^Ta}  \\
\end{align*}`

- The vector of `\(\mathbf{b}\)` can be estimated if we know the __phenotypic and genetic variances__ and the __phenotypic and genetic covariances__

- This assumes that we have good information on the economic weights, which can be hard to determine.

- Then the resulting vector of weights will give a selection index that maximized genetic gain in `\(T\)`, the true genetic merit.

- The drawback of this index is that the genetic variance and covariances are often estimated with large amounts error, which may reduce the correlation between I and T.

???
# A problem

This litter size of mice could be increased: 
- 1) by selection of females for their little size
- 2) by selection of both parents for body weight

Which would be the better of these two simple procedures, given the following parameters?

- `\(h^2\)` of litter size: 0.22
- `\(h^2\)` of body weight: 0.35
- Genetic correlation `\(r_A\)`: 0.43
- Proportion of selection:
  - Females: 25% ( `\(i_f\)` =1.271)
  - Males: 10% ( `\(i_m\)` =1.755)