DIAGNOSTIC ROBUST APPROACH OF OUTLIER DETECTION IN REGRESSION
Keywords:
Outliers, High Leverage Points, Influential Observations, Regression Diagnostics, Masking, Swamping, Robust Regression, Generalized Studentized Residuals, Generalized Potentials, Generalized DFFITS.
Abstract
The identification of outliers in data has received a great deal of attention
for many years. Outlier detection is more cumbersome in regression, where
outliers may occur in the response variable, in the explanatory variables,
or in both. A variety of diagnostic methods are now used to identify
different types of outliers in regression. These methods, however, are successful
only if the data set contains a single outlier. In the presence of multiple outliers,
diagnostic methods often fail to detect the outliers because of the well-known
problems of masking and swamping. Robust methods, on the other hand,
can identify the outliers correctly, but they tend to declare too many observations
to be outliers, which is also undesirable. In this paper we discuss an approach that
is a compromise between the two. We call it the diagnostic-robust
approach: suspect outliers are first identified by robust methods,
and diagnostic methods are then applied to confirm the suspicion. We consider
several well-known data sets to investigate the performance of the diagnostic-robust
approach in detecting outliers in regression.
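
As a rough illustration of the two-stage idea, the following Python sketch flags suspect cases with a robust fit and then tries to confirm each one with a deletion-type diagnostic. It rests on several assumptions that are not taken from the paper. The robust stage is a crude least median of squares fit over all elemental subsets. The cutoffs 2.5 and 3 are arbitrary. The confirmation statistic is a simple residual Studentized against the fit to the non-suspect cases, not necessarily the generalized statistics named in the keywords. The function names `lms_fit` and `diagnostic_robust` are hypothetical, and the data are simulated.

```python
import itertools
import numpy as np


def lms_fit(X, y):
    """Crude least-median-of-squares fit (assumed robust stage).

    Tries every elemental subset of p rows, so it is only practical
    for small data sets.
    """
    n, p = X.shape
    best_beta, best_crit = None, np.inf
    for idx in itertools.combinations(range(n), p):
        rows = list(idx)
        try:
            beta = np.linalg.solve(X[rows], y[rows])
        except np.linalg.LinAlgError:
            continue  # singular elemental subset, skip it
        crit = np.median((y - X @ beta) ** 2)
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta, best_crit


def diagnostic_robust(X, y, robust_cut=2.5, confirm_cut=3.0):
    """Stage 1: robust fit flags suspects. Stage 2: diagnostics confirm them."""
    n, p = X.shape

    # Stage 1: suspects are cases with a large robustly scaled residual.
    beta_rob, crit = lms_fit(X, y)
    scale = 1.4826 * np.sqrt(crit)
    suspects = np.flatnonzero(np.abs(y - X @ beta_rob) / scale > robust_cut)

    # Stage 2: least squares on the remaining cases only.
    keep = np.setdiff1d(np.arange(n), suspects)
    XR, yR = X[keep], y[keep]
    beta_R, *_ = np.linalg.lstsq(XR, yR, rcond=None)
    resid_R = yR - XR @ beta_R
    s_R = np.sqrt(resid_R @ resid_R / (len(keep) - p))
    G = np.linalg.inv(XR.T @ XR)

    # A suspect is confirmed only if it is also extreme relative to
    # the fit obtained without any of the suspects.
    confirmed = []
    for i in suspects:
        w = X[i] @ G @ X[i]
        t = (y[i] - X[i] @ beta_R) / (s_R * np.sqrt(1.0 + w))
        if abs(t) > confirm_cut:
            confirmed.append(int(i))
    return suspects.tolist(), confirmed


# Simulated example: a straight line with three planted outliers
# clustered at the high end of x.
n = 30
x = np.linspace(0.0, 10.0, n)
noise = 0.5 * np.sin(7.0 * np.arange(n))  # deterministic pseudo-noise
y = 1.0 + 2.0 * x + noise
planted = [27, 28, 29]
y[planted] += 25.0
X = np.column_stack([np.ones(n), x])

suspects, confirmed = diagnostic_robust(X, y)
print("suspects: ", suspects)
print("confirmed:", confirmed)
```

In this sketch the robust stage may flag more cases than were planted. The second stage is what returns unconfirmed suspects to the clean set, which is the compromise described above.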