ROC curves with confidence bounds
Obtain a ROC curve without bounds
We use the ionosphere data set from the UCI Machine Learning Repository; a copy ships with MATLAB's Statistics and Machine Learning Toolbox.
load ionosphere;
For reproducibility, set the random number generator seed for data partitioning.
rng(1);
Cross-validate a pseudo-linear discriminant model.
cvplda = ClassificationDiscriminant.fit(X,Y,'DiscrimType','pseudoLinear','crossval','on');
Boost decision stumps by AdaBoostM1 and cross-validate.
cvada = fitensemble(X,Y,'AdaBoostM1',200,'Tree','crossval','on');
Compute scores for out-of-fold data.
[~,SfitLDA] = kfoldPredict(cvplda);
[~,SfitAda] = kfoldPredict(cvada);
Obtain the false positive rate (FPR) and true positive rate (TPR), taking 'good' radar returns (label 'g') as the positive class. The second column of each score matrix holds the scores for 'g'.
[fprLDA,tprLDA] = perfcurve(Y,SfitLDA(:,2),'g');
[fprAda,tprAda] = perfcurve(Y,SfitAda(:,2),'g');
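To make the two rates concrete, here is a minimal sketch on a made-up six-observation example. The labels, scores, and threshold are hypothetical and unrelated to the ionosphere data. It computes a single (FPR, TPR) point by hand; a ROC curve is traced by sweeping the threshold over the range of scores.

```matlab
% Hypothetical toy data: 1 marks the positive class, 0 the negative class.
labels = [1 1 0 1 0 0]';
scores = [0.9 0.8 0.7 0.6 0.4 0.2]';
thresh = 0.65;                      % classify as positive when score >= thresh

pred = scores >= thresh;            % predicted positives
tpr  = sum(pred & labels==1) / sum(labels==1);   % true positives / all positives
fpr  = sum(pred & labels==0) / sum(labels==0);   % false positives / all negatives
fprintf('FPR = %.4f, TPR = %.4f\n', fpr, tpr);
```

With this threshold, two of the three positives and one of the three negatives score above it, so the point is (1/3, 2/3).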
Plot the ROC curves for the two classifiers.
plot(fprLDA,tprLDA,'b--');
hold on;
plot(fprAda,tprAda,'r-.');
line([0 1],[0 1],'color','c');
hold off;
xlabel('False positive rate');
ylabel('True positive rate');
legend('Pseudo LDA','Boosted stumps','Fair coin toss','Location','SE');

Vertical averaging by bootstrap
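Use 1000 bootstrap replicas to compute confidence bounds for Y at fixed X values, where X and Y are the 1st and 2nd output arguments of perfcurve. By default, perfcurve returns FPR as X and TPR as Y, so fixing the X values amounts to vertical averaging: the bounds are on TPR at each fixed FPR. With 'nboot' set, the returned Y has three columns: the central value, the lower bound, and the upper bound.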
[fprLDA,tprLDAboot] = perfcurve(Y,SfitLDA(:,2),'g',...
    'xvals',0:0.01:0.4,'nboot',1000);
[fprAda,tprAda] = perfcurve(Y,SfitAda(:,2),'g',...
    'xvals',0:0.01:0.4);
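The idea behind vertical averaging can be sketched by hand. The snippet below is an illustration under stated assumptions, not a reproduction of what perfcurve does internally: it uses hypothetical synthetic labels and scores, a coarse FPR grid, only 200 replicas, and plain percentile intervals. perfcurve may use a different interval type and a different convention for ties and interpolation.

```matlab
% Hypothetical toy data (not the ionosphere set).
rng(1);
n      = 200;
labels = rand(n,1) < 0.5;               % true marks the positive class
scores = randn(n,1) + 1.5*labels;       % positives score higher on average
xvals  = (0:0.1:0.4)';                  % fixed FPR grid
nboot  = 200;                           % fewer replicas than the 1000 used above

tprBoot = zeros(numel(xvals), nboot);
for b = 1:nboot
    idx  = randi(n, n, 1);              % resample observations with replacement
    lb   = double(labels(idx));
    sc   = scores(idx);
    t    = unique(sc)';                 % candidate thresholds (row vector)
    pred = double(bsxfun(@ge, sc, t));  % pred(i,j) = 1 when sc(i) >= t(j)
    tprT = (lb' * pred) / sum(lb);          % TPR at each threshold
    fprT = ((1-lb)' * pred) / sum(1-lb);    % FPR at each threshold
    for k = 1:numel(xvals)
        % Highest TPR reachable without exceeding the target FPR
        tprBoot(k,b) = max([0, tprT(fprT <= xvals(k))]);
    end
end

% Percentile bounds across replicas at each fixed FPR
tprLo = prctile(tprBoot,  2.5, 2);
tprHi = prctile(tprBoot, 97.5, 2);
disp([xvals tprLo tprHi]);
```

Each row of the displayed matrix gives a fixed FPR followed by the lower and upper bound on TPR at that FPR, which is the quantity the vertical error bars represent.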
Plot the ROC curve obtained by pseudo LDA with error bars. Plot the ROC curve obtained by AdaBoost without error bars, to keep the plot clean.
figure;
% errorbar expects non-negative bar lengths below and above each point
errorbar(fprLDA,tprLDAboot(:,1),tprLDAboot(:,1)-tprLDAboot(:,2),...
    tprLDAboot(:,3)-tprLDAboot(:,1),'*');
hold on;
plot(fprAda,tprAda,'r*');
hold off;
legend('PLDA','AdaBoost','Location','SE');
xlabel('False positive rate');
ylabel('True positive rate');
ylim([-0.1 1.1]);
xlim([-0.05 0.45]);
grid on;

Threshold averaging using binomial intervals
Copy the FPR and TPR values obtained earlier into new arrays that also hold binomial intervals. Each bino array has 3 columns: the 1st holds the central values (the FPR and TPR computed above), the 2nd the lower bounds, and the 3rd the upper bounds.
tprLDAbino = zeros(size(tprLDAboot));
tprLDAbino(:,1) = tprLDAboot(:,1);
fprLDAbino = zeros(size(tprLDAboot));
fprLDAbino(:,1) = fprLDA;
N1 = sum(strcmp(Y,'g')); % Number of observations in the 'good' class
N0 = sum(strcmp(Y,'b')); % Number of observations in the 'bad' class
Compute lower and upper bounds for Clopper-Pearson confidence intervals.
[~,tprLDAbino(:,2:3)] = binofit(tprLDAbino(:,1)*N1,N1);
[~,fprLDAbino(:,2:3)] = binofit(fprLDAbino(:,1)*N0,N0);
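The binomial intervals treat each rate as a success proportion: TPR times N1 is a count of correctly classified 'good' observations out of N1 trials, and likewise for FPR and N0. As a rough check of what a Clopper-Pearson interval is, the sketch below computes one from beta quantiles and compares it with binofit. The counts (45 out of 50) and the 95% level are hypothetical, chosen only for illustration.

```matlab
% Hypothetical counts: 45 successes out of 50 trials, 95% confidence.
x = 45;  n = 50;  alpha = 0.05;

% Clopper-Pearson bounds from beta quantiles (valid for 0 < x < n)
lo = betainv(alpha/2,     x,     n - x + 1);
hi = betainv(1 - alpha/2, x + 1, n - x);

[phat, pci] = binofit(x, n, alpha);   % toolbox result for comparison
fprintf('by hand: [%.4f, %.4f]\n', lo, hi);
fprintf('binofit: [%.4f, %.4f]\n', pci(1), pci(2));
```

The two intervals should agree up to numerical precision. The boundary cases x = 0 and x = n need separate handling in the by-hand version, since the corresponding bound is then exactly 0 or 1.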
Plot the ROC curve for pseudo LDA with horizontal error bars only. Although we compute the vertical errors as well, we do not show them because the plot would look cluttered. To plot horizontal error bars, we use the herrorbar utility by Jos van der Geest, available from MATLAB File Exchange: http://www.mathworks.com/matlabcentral/fileexchange/3963-herrorbar
figure;
% Pass non-negative left and right bar lengths
hLDA = herrorbar(fprLDAbino(:,1),tprLDAbino(:,1),...
    fprLDAbino(:,1)-fprLDAbino(:,2),fprLDAbino(:,3)-fprLDAbino(:,1),'*');
hold on;
hAda = plot(fprAda,tprAda,'r*');
hold off;
legend([hLDA(1) hAda],{'PLDA' 'AdaBoost'},'Location','SE');
xlabel('False positive rate');
ylabel('True positive rate');
ylim([-0.1 1.1]);
xlim([-0.05 0.45]);
grid on;
