Choosing a Classifier

To illustrate the problem of choosing a classification model, consider some simulated data:

    n = 500
    set.seed(1)
    X = rnorm(n)
    # class-specific mean curves, as functions of X
    ma = 10-(X+1.5)^2*2
    mb = -10+(X-1.5)^2*2
    M = cbind(ma,mb)
    set.seed(1)
    # class label, drawn uniformly in {1,2}
    Z = sample(1:2,size=n,replace=TRUE)
    # Y follows the curve of its class, plus Gaussian noise (standard deviation 5)
    Y = ma*(Z==1)+mb*(Z==2)+rnorm(n)*5
    df = data.frame(Z=as.factor(Z),X,Y)

A first strategy is to split the dataset into two parts: a training dataset and a testing dataset.
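The split described above can be sketched as follows. This is only an illustration under stated assumptions: the 80/20 proportion, the seed, and the names idx_train, df_train, df_test, fit and err are hypothetical and do not come from the original text. The logistic regression is a placeholder classifier, used only to show how the test set would be used to score a model.

```r
# Rebuild the simulated data from the text so the sketch runs on its own
n = 500
set.seed(1)
X = rnorm(n)
ma = 10-(X+1.5)^2*2
mb = -10+(X-1.5)^2*2
set.seed(1)
Z = sample(1:2,size=n,replace=TRUE)
Y = ma*(Z==1)+mb*(Z==2)+rnorm(n)*5
df = data.frame(Z=as.factor(Z),X,Y)

# Assumption: 80% of the rows for training, the remaining 20% for testing
set.seed(123)
idx_train = sample(1:n,size=round(0.8*n),replace=FALSE)
df_train = df[idx_train,]
df_test = df[-idx_train,]

# Placeholder classifier (an assumption): logistic regression on X and Y
fit = glm(Z~X+Y,data=df_train,family=binomial)

# predict() returns the probability of the second factor level, "2"
p = predict(fit,newdata=df_test,type="response")
pred = ifelse(p>0.5,"2","1")

# Misclassification rate on the held-out test set
err = mean(pred!=as.character(df_test$Z))
print(err)
```

Any candidate classifier can be fitted on df_train in the same way and compared through its error rate on df_test, which was not used for fitting.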
