Load the dataset `liver.toxicity`, available in the `mixOmics` package:

```
library("mixOmics")
data("liver.toxicity")
help(liver.toxicity)                  # dataset description
x <- liver.toxicity$gene              # predictors: gene expression levels
y <- liver.toxicity$clinic$ALB.g.dL.  # response: albumin level (g/dL)
dim(x)
x[1:6, 1:6]
str(y)
```

Build a RF with default parameters which computes the **permutation variable importance** index. Note its OOB error.

Plot VI scores (sorted in decreasing order), then plot only the scores associated with the 100 most important variables.

*[The `sort()` function can be used to sort VI scores; `index.return=TRUE` must be specified to also return the indices of the permutation applied during the sort.]*

Find a subset of variables containing the most important variables, based on the previous graph (you can apply an elbow rule, as in PCA for example). Keep the indices of the selected variables. We denote by \(p_{\mathrm{sel}}\) the number of selected variables.
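The steps up to this point can be sketched as follows. This is only an illustration under stated assumptions: it assumes the `randomForest` package as the RF implementation (the text does not name one), it runs on a small simulated dataset (`x_sim`, `y_sim`) rather than on `liver.toxicity`, and the cutoff `p_sel` is a hypothetical value that would normally be read off the plot.

```r
library(randomForest)  # assumption: randomForest is the RF implementation used

set.seed(1)
# Small simulated stand-in for the predictors and the response (not the real data)
n <- 60; p <- 50
x_sim <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y_sim <- 2 * x_sim[, 1] - 1.5 * x_sim[, 2] + rnorm(n, sd = 0.5)

# Default parameters, plus importance = TRUE to get the permutation importance
rf <- randomForest(x_sim, y_sim, importance = TRUE)
oob_mse <- rf$mse[rf$ntree]  # OOB mean squared error after the last tree

# type = 1 selects the permutation-based importance (%IncMSE in regression)
vi <- importance(rf, type = 1)[, 1]

# Sorted scores in $x, positions of the sorted scores in the original vector in $ix
vi_sorted <- sort(vi, decreasing = TRUE, index.return = TRUE)

plot(vi_sorted$x, type = "h", xlab = "Rank", ylab = "Permutation VI")
plot(head(vi_sorted$x, 10), type = "h", xlab = "Rank", ylab = "Permutation VI")

p_sel <- 2                        # hypothetical cutoff chosen from the plot
sel_idx <- vi_sorted$ix[1:p_sel]  # indices of the selected variables
```

On the real data, `x_sim` and `y_sim` would be replaced by `x` and `y`, and the second plot would show the first 100 scores instead of the first 10.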

Build a RF only using the previously selected variables. Comment on the associated OOB error.

Estimate the prediction error of a RF only using the \(p_{\mathrm{sel}}\) most important variables with a 4-fold cross-validation procedure.

*[Variables can vary from one fold to the other, only \(p_{\mathrm{sel}}\) is fixed]*
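One possible way to organise this cross-validation is sketched below. It again assumes the `randomForest` package and a small simulated dataset; `cv_rf_selected` is a hypothetical helper name, not a function from any package. The variable ranking is recomputed inside each fold from the training part only, so the selected variables may differ between folds while `p_sel` stays fixed.

```r
library(randomForest)  # assumption: randomForest is the RF implementation used

# Hypothetical helper: K-fold CV with variable selection redone in every fold
cv_rf_selected <- function(x, y, p_sel, K = 4) {
  n <- nrow(x)
  fold <- sample(rep(seq_len(K), length.out = n))  # random fold labels
  sq_err <- numeric(n)
  for (k in seq_len(K)) {
    test <- fold == k
    # Rank the variables using the training observations only
    rf_full <- randomForest(x[!test, , drop = FALSE], y[!test], importance = TRUE)
    vi <- importance(rf_full, type = 1)[, 1]
    sel <- order(vi, decreasing = TRUE)[seq_len(p_sel)]
    # Refit on the p_sel selected variables and predict the held-out fold
    rf_sel <- randomForest(x[!test, sel, drop = FALSE], y[!test])
    pred <- predict(rf_sel, x[test, sel, drop = FALSE])
    sq_err[test] <- (y[test] - pred)^2
  }
  mean(sq_err)  # cross-validated mean squared prediction error
}

set.seed(1)
# Small simulated stand-in for the data (not the real dataset)
n <- 60; p <- 30
x_sim <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y_sim <- 2 * x_sim[, 1] - 1.5 * x_sim[, 2] + rnorm(n, sd = 0.5)

cv_mse <- cv_rf_selected(x_sim, y_sim, p_sel = 5)
```

Selecting the variables once on the full dataset and only then cross-validating would let the held-out observations influence the selection, which tends to give an optimistic error estimate.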

Load the dataset, available in the `mlbench` package:

Build a RF with default parameters which computes the permutation variable importance index.

Plot VI scores (sorted in decreasing order). Interpret the result.