Analysis Of Brownlee's Stack Loss Data


4.0 Analysis and Findings
Three numerical examples are presented to evaluate the performance of the methods discussed previously: Brownlee’s Stack Loss data set, the Hawkins-Bradu-Kass (1984) data set and the Miller Lumber Company data set.
4.1 Brownlee’s Stack Loss Data
The stack loss data set, known as Brownlee’s Stack Loss Plant Data, contains operational data from a plant for the oxidation of ammonia to nitric acid; there are 21 observations on 4 variables. The dependent variable (Y) is Stack Loss and the independent variables are Air Flow (X1), Water Temperature (X2) and Acid Concentration (X3). Using the Diagnostic Robust Generalized Potential (DRGP) method, we found 4 high leverage points in these data: cases 1, 2, 3 and 21 (see Table 4.1).

Index   DRGP (0.728)      Index   DRGP (0.728)
1       1.371             12      0.285
2       1.436             13      0.204
3       0.760             14      0.289
4       0.264             15      0.505
5       0.124             16      0.294
6       0.172             17      0.711
7       0.318             18      0.215
8       0.318             19      0.236
9       0.177             20      0.093
10      0.289             21      0.895
11      0.190

Table 4.1: DRGP values for the Stack Loss data set
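
The flagging step behind Table 4.1 can be sketched in a few lines of Python. This is only an illustration, not the DRGP computation itself: the values are copied from the table, the 0.728 in the table header is assumed to be the cutoff, and the function name `flag_high_leverage` is hypothetical.

```python
# DRGP values copied from Table 4.1 (case index -> DRGP value).
drgp = {
    1: 1.371, 2: 1.436, 3: 0.760, 4: 0.264, 5: 0.124, 6: 0.172,
    7: 0.318, 8: 0.318, 9: 0.177, 10: 0.289, 11: 0.190, 12: 0.285,
    13: 0.204, 14: 0.289, 15: 0.505, 16: 0.294, 17: 0.711, 18: 0.215,
    19: 0.236, 20: 0.093, 21: 0.895,
}

# Assumption: the 0.728 shown in the table header is the DRGP cutoff.
CUTOFF = 0.728


def flag_high_leverage(values, cutoff):
    """Return the sorted case indices whose DRGP value exceeds the cutoff."""
    return sorted(case for case, value in values.items() if value > cutoff)


high_leverage = flag_high_leverage(drgp, CUTOFF)
clean_size = len(drgp) - len(high_leverage)

print(high_leverage)  # cases flagged as high leverage points
print(clean_size)     # number of observations left in the clean data
```

Under that assumption the flagged cases are 1, 2, 3 and 21, and removing them leaves 17 observations, which matches the clean data size of n=17 used in the comparison.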
Using All Possible Subsets Regression, there are 2^p - 1 candidate models, where p is the number of predictors in the full model. With p = 3, there are 2^3 - 1 = 7 possible subsets for these data. The one-variable models are X1, X2 and X3; the two-variable models are X1X2, X1X3 and X2X3; and the three-variable model is X1X2X3. The model selection criteria for each subset, computed with the classical method using OLS on both the original and clean data and with the robust MAD method using LTS on the original data, are shown below:
Variable   Original data (n=21)                Clean data (n=17)
           R2      adj R2   Cp       AIC       R2      adj R2   Cp      AIC
X1         0.846   0.838    13.336   2.912     0.743   0.726    8.038   2.23
X2         0.767   0.754    28.929   3.326     0.603   0.57...
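
As a side note on the model count, the seven candidate subsets can be enumerated with a minimal Python sketch. It only lists the subsets and does not fit any regression; the helper name `all_subsets` is hypothetical.

```python
from itertools import combinations

# Predictor names as used in the text.
predictors = ["X1", "X2", "X3"]


def all_subsets(names):
    """Return every non-empty subset of names, ordered by subset size."""
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
    ]


subsets = all_subsets(predictors)
labels = ["".join(combo) for combo in subsets]

# The count should equal 2^p - 1 for p predictors.
expected_count = 2 ** len(predictors) - 1

print(len(subsets), expected_count)
print(labels)
```

For p = 3 this yields the same seven models named in the text, from X1 through X1X2X3.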

... middle of paper ...

...or using OLS method and original data set with robust MAD using LTS method. The results showed that the classical selection criterion values are badly affected by the presence of high leverage points, as seen in the changes in those values between the original and clean data sets in all three examples. The robust variable selection criterion values are not much affected by the high leverage points: the values of R2 and adjusted R2 remain high, indicating a good fit to the data in all three examples.
We conclude that the robust method is more reliable than the classical method for variable selection. Further research could extend this study with other robust methods, e.g. the MM estimate or Least Median of Squares (LMS), which might give better and clearer results.
