All Leave-One-Out Models

Instead of creating one binary classifier from a training set containing N samples, ALOOM⁽²⁾⁽³⁾⁽⁴⁾ creates N binary classifiers in exactly the same way, i.e. using the same hyper-parameters, but each on N-1 samples, with a different training sample left out each time. The model trained on all N samples is referred to here as the original model.

For a single test sample, ALOOM produces N predicted probabilities, so one may create an ALOOM individual prediction interval⁽⁴⁾ (min of the ALOOM probabilities, max of the ALOOM probabilities) for that test sample.
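The idea can be sketched in base R on toy data. This is only an illustration under stated assumptions: the simulated data, the logistic regression fitted with glm(), and the helper name fit_model are not part of the aloom package, which does this work internally for its supported methods.

```r
# Toy data (assumption): 30 training samples, two predictors, binary outcome.
set.seed(1)
n       <- 30
train   <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
train$y <- rbinom(n, 1, plogis(1.5 * train$x1 - train$x2))
test    <- data.frame(x1 = 0.2, x2 = -0.1)   # a single test sample

# Hypothetical helper: the same model specification is used for every fit.
fit_model <- function(d) glm(y ~ x1 + x2, data = d, family = binomial)

# Original model: trained on all N samples.
original.prob <- predict(fit_model(train), newdata = test, type = "response")

# N leave-one-out models: each is trained on N - 1 samples.
aloom.probs <- vapply(
  seq_len(n),
  function(i) predict(fit_model(train[-i, ]), newdata = test, type = "response"),
  numeric(1)
)

# ALOOM individual prediction interval for the test sample.
interval <- c(min = min(aloom.probs), max = max(aloom.probs))
print(original.prob)
print(interval)
```

Each leave-one-out fit is independent of the others, which is why the work parallelises well across cores.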

ALOOM provides a solution for assessing the reliability⁽⁵⁾ of a single binary prediction. As shown below, the widths of ALOOM individual prediction intervals vary between test samples. Therefore, the width of the ALOOM individual prediction interval may be used as a measure of the reliability of the original model's single predicted probability.

ALOOM also provides a solution for assessing the decidability⁽⁵⁾ of a single binary prediction. If the All Leave-One-Out Models do not all agree on the predicted category for the test sample, then we suggest returning NotAvailable.
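A minimal sketch of this decision rule for one test sample is given below. The function name aloom_decision, the 0.5 threshold as a default argument, and the generic category labels are assumptions made for illustration. Note that this sketch also returns NA when a probability is exactly equal to the threshold.

```r
# Hypothetical helper: return the agreed category, or NA (NotAvailable)
# when the leave-one-out models fall on different sides of the threshold.
aloom_decision <- function(probs, threshold = 0.5,
                           categories = c("negative", "positive")) {
  if (all(probs > threshold)) {
    categories[2]
  } else if (all(probs < threshold)) {
    categories[1]
  } else {
    NA_character_
  }
}

aloom_decision(c(0.81, 0.77, 0.90))   # all models above 0.5
aloom_decision(c(0.12, 0.30, 0.08))   # all models below 0.5
aloom_decision(c(0.46, 0.52, 0.49))   # models disagree
```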

ALOOM is a non-parametric approach in which the binary model and the data alone determine which predictions are NotAvailable.

In our experience, ALOOM predictions which are available, i.e. not NotAvailable, have on average higher accuracy than the original model's predictions over all test samples. Otherwise, there would be no point in using the ALOOM approach.

An initial simulation study⁽³⁾ has shown that ALOOM may be very useful in active learning. That is, if one has a binary model and wants to extend the training set with new samples, then we suggest adding samples that ALOOM currently predicts as NotAvailable.
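As a rough illustration of this selection step, the sketch below picks the candidates whose ALOOM probabilities straddle 0.5. The probability matrix and the candidate names are made up; in practice the matrix would come from ALOOM predictions for the unlabelled candidate samples.

```r
# Toy ALOOM probability matrix (assumption): rows are candidate samples,
# columns are leave-one-out models.
candidate.probs <- rbind(
  cand1 = c(0.91, 0.88, 0.93),
  cand2 = c(0.48, 0.55, 0.51),
  cand3 = c(0.10, 0.07, 0.12),
  cand4 = c(0.35, 0.62, 0.44)
)

# A candidate is NotAvailable when its interval straddles 0.5.
is.na.pred <- apply(candidate.probs, 1,
                    function(p) min(p) < 0.5 && max(p) > 0.5)

# Candidates to label and add to the training set next.
to.label <- rownames(candidate.probs)[is.na.pred]
print(to.label)   # "cand2" "cand4"
```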

ALOOM is not meaningful for model-building algorithms whose results depend on the value of a random seed. It is not suitable for deep learning models. However, for random forests it works fine when the number of trees is large enough.

ALOOM is a simple idea, but its application is computationally very intensive and thus not really suitable for personal computers.

Example of using aloom and some interesting results

We use the publicly available mutagenicity dataset from Kazius et al. (2005)⁽⁶⁾. It contains 4335 compounds, 2400 categorised as “mutagen” and the remaining 1935 as “nonmutagen”. The dataset is available from the QSARdata⁽⁷⁾ R package, and each compound comes with 1579 descriptors. Half of the dataset is used for training and the other half for testing.

1. Create train and test datasets


library(QSARdata)
library(caret)

# Load the mutagenicity data: descriptor matrix and outcome labels
data(Mutagen)
x           <- as.matrix(Mutagen_Dragon)
rownames(x) <- rownames(Mutagen_Dragon)
colnames(x) <- colnames(Mutagen_Dragon)
y           <- Mutagen_Outcome

# Split into two folds: fold 1 is the test set, the rest is the training set
set.seed(1)
lvFolds <- createFolds(y, k=2)

train.x <- x[-lvFolds[[1]],]
train.y <- y[-lvFolds[[1]]]
test.x  <- x[lvFolds[[1]],]
test.y  <- y[lvFolds[[1]]]

The training dataset consists of 2168 samples, and the test dataset of 2167. They are available here as CSV files: train_x.csv, train_y.csv, test_x.csv, test_y.csv.

2. Create aloom object with randomForest⁽⁹⁾

NOTE 1: On a machine with 48 AMD Opteron 6168 CPUs and 15 GB of RAM, this takes 23 hours when all 48 CPUs are used.
NOTE 2: On a machine with 24 Intel Xeon 6240 @ 2.60GHz CPUs and 4 GB of RAM, this takes 8 hours when all 24 CPUs are used.


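library(aloom)
library(parallel)
library(randomForest)

ntree     <- 1000            # number of trees in each random forest
num.cores <- detectCores()   # use all available cores

# Fit the original model and all leave-one-out models, then predict test.x
fit <- aloom(train.x, train.y, test.x, method="rf", list(ntree=ntree), mc.cores=num.cores)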

2. Create aloom object with glmnet⁽¹⁰⁾

NOTE 1: On a machine with 48 AMD Opteron 6168 CPUs and 15 GB of RAM, this takes 4 hours when all 48 CPUs are used.
NOTE 2: On a machine with 24 Intel Xeon 6240 @ 2.60GHz CPUs and 4 GB of RAM, this takes 2 hours when all 24 CPUs are used.

Before calling aloom(), we run cv.glmnet() to find the optimal lambda.


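library(aloom)
library(parallel)
library(glmnet)

# Cross-validate on the training set to choose lambda (AUC as the criterion)
cv.fit          <- cv.glmnet(train.x, train.y, family="binomial", type.measure="auc")
selected.lambda <- cv.fit$lambda.1se   # lambda chosen by the one-standard-error rule
lambda          <- cv.fit$lambda       # full lambda sequence used in the fit
model.params    <- list(lambda=lambda, alpha=1, selected.lambda=selected.lambda)

num.cores <- detectCores()

# Fit the original model and all leave-one-out models, then predict test.x
fit <- aloom(train.x, train.y, test.x, method="glmnet", model.params, mc.cores=num.cores)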

3. Examine aloom object

All Leave-One-Out Models, as well as the original model, are created during the execution of aloom(). Their predictions for the test samples are returned as the result.
An aloom object is a list containing:

predicted.y - predicted categories of test samples produced by the original model
predicted.prob.y - predicted probabilities of test samples produced by the original model
aloom.probs - ALOOM probabilities of test samples. It is a matrix with rownames equal to names of test samples and colnames equal to names of training samples.
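The shape of these components can be illustrated with a small mock object. All values and names below are made up, and the class of each component (for example, whether predicted.y is a factor) is an assumption. The reading that each column corresponds to the model trained without that training sample is inferred from the column names, not confirmed by the package documentation.

```r
# Mock object with the three documented components (values are made up).
fit <- list(
  predicted.y      = factor(c("mutagen", "nonmutagen"),
                            levels = c("mutagen", "nonmutagen")),
  predicted.prob.y = c(0.83, 0.21),
  aloom.probs      = matrix(
    c(0.80, 0.19, 0.85, 0.22, 0.84, 0.20), nrow = 2,
    dimnames = list(c("test1", "test2"), c("train1", "train2", "train3"))
  )
)

# One row per test sample, one column per training sample.
stopifnot(
  nrow(fit$aloom.probs) == length(fit$predicted.y),
  nrow(fit$aloom.probs) == length(fit$predicted.prob.y)
)
print(dim(fit$aloom.probs))   # 2 test samples by 3 training samples
print(fit$aloom.probs["test1", ])
```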


predicted.y      <- fit$predicted.y
predicted.prob.y <- fit$predicted.prob.y
aloom.probs      <- fit$aloom.probs

Calculate the original model's misclassification error, the proportion of test samples that ALOOM predicts as NotAvailable, and the ALOOM misclassification error over the remaining samples.


# Misclassification error of the original model over all test samples
original.misclassification <- mean(predicted.y != test.y)

# A prediction is NotAvailable when the ALOOM probabilities straddle 0.5
find.na       <- function(x) min(x) < 0.5 && max(x) > 0.5
predicted.na  <- apply(aloom.probs, 1, find.na)
aloom.proportion.na <- mean(predicted.na)

# Misclassification error over the available predictions only
aloom.misclassification <- mean(predicted.y[!predicted.na] != test.y[!predicted.na])

	Original misclassification	% NotAvailable	ALOOM misclassification
glmnet	0.196	7.24	0.17
randomForest	0.182	15.32	0.152
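To put the percentages in context, they can be converted into approximate sample counts. This back-calculation assumes the percentages are taken over the 2167 test samples stated earlier; the resulting counts are rounded estimates, not figures reported by the authors.

```r
n.test <- 2167                                     # test-set size stated above
pct.na <- c(glmnet = 7.24, randomForest = 15.32)   # % NotAvailable from the table

n.na        <- round(pct.na / 100 * n.test)   # approximate NotAvailable counts
n.available <- n.test - n.na                  # samples that still get a prediction
print(rbind(n.na, n.available))
# glmnet: about 157 NotAvailable, 2010 available
# randomForest: about 332 NotAvailable, 1835 available
```

The ALOOM misclassification column is therefore computed over fewer samples than the original misclassification column.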

4. Calculate ALOOM individual prediction intervals, examine their width and show the sample with the maximum width


min.aloom <- apply(aloom.probs,1,min)
max.aloom <- apply(aloom.probs,1,max)

Calculate the width of every ALOOM individual prediction interval and examine its distribution.


width <- max.aloom - min.aloom
summary(width)

Summary statistics of the ALOOM individual prediction interval widths:

	Min	Q1	Median	Mean	Q3	Max
glmnet	0	0.019	0.034	0.053	0.062	0.766
randomForest	0.009	0.085	0.104	0.109	0.119	0.571


i.max             <- which.max(width)   # index of the widest interval
id.with.max.width <- rownames(test.x)[i.max]
max.width.range   <- c(min.aloom[i.max], max.aloom[i.max])

	Compound ID	Min ALOOM probability	Max ALOOM probability
glmnet	'3224'	0.135	0.901
randomForest	'782'	0.261	0.832
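Beyond the single widest interval, the widths can be used to rank all test samples by reliability. The sketch below uses a made-up probability matrix, and the 0.1 cut-off for flagging is an arbitrary illustrative choice, not a recommendation from the authors.

```r
# Toy ALOOM probability matrix (assumption): rows are test samples,
# columns are leave-one-out models.
aloom.probs <- rbind(
  s1 = c(0.90, 0.92, 0.91),
  s2 = c(0.20, 0.70, 0.45),
  s3 = c(0.60, 0.66, 0.58),
  s4 = c(0.05, 0.12, 0.10)
)

width <- apply(aloom.probs, 1, function(p) max(p) - min(p))

# Test samples ordered from widest interval (least reliable) to narrowest.
ranked <- sort(width, decreasing = TRUE)
print(ranked)

# Hypothetical cut-off: flag intervals wider than 0.1 for closer inspection.
flagged <- names(width)[width > 0.1]
print(flagged)   # "s2"
```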

