## Monday, February 12, 2007

### Nothing to prove *

I don’t really go in much for proofs in my Introduction to Econometrics (INEMET) course.

That doesn’t mean that proofs aren’t important in econometrics. Without a proof, how could we be sure that the least squares estimators are unbiased, given the classical assumptions, or that omitting a relevant explanatory variable from a model will not only cause bias in the estimation of the coefficients of the other variables but will also affect their standard errors and t-values?

Certainly anyone who wishes to pursue the study of econometrics beyond an introductory course will need to become familiar with proofs of these and other important propositions found in the textbooks. But beginners can be overwhelmed by all the technical stuff (as I can still remember from my own initial exposure to the subject back in the late 1960s!). It is more important for students who are just beginning their study of econometrics to get a good intuitive feel for the subject, its scope and methodology, than to grapple with formal proofs. So in this respect I go along completely with Christopher Dougherty, who says in the preface to his book Introduction to Econometrics (Third Edition, p. vi): “For nearly everyone, there is a limit to the rate at which formal mathematical analysis can be digested. If this limit is exceeded, the student spends much mental energy grappling with the technicalities rather than the substance, impeding the development of a unified understanding of the subject.”

That doesn’t mean that students have to just accept a whole set of results without any attempt being made to justify them. In a number of cases a convincing intuitive argument can be provided for the propositions in question, without the need to resort to a proof. Or alternatively a simple quantitative example can be used to support the argument.

Take the case of the formula for the standard error of the X coefficient in the simple linear regression model. As Dougherty shows mathematically (!) on page 83 of his book, the theoretical variance of the X coefficient is the variance of the disturbance term divided by the sum of squares of the deviations of the X variable from its sample mean. We can calculate the latter, but we have to estimate the former because we don’t observe the actual disturbances. The square root of this estimate of the variance is then the standard error that we use for the t-test and for computing confidence intervals for the parameter.
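Written out in symbols, the result described above looks roughly as follows. The notation here is my own shorthand and not necessarily Dougherty's: $\hat{\beta}$ is the least squares slope estimate, $\sigma_u^2$ the variance of the disturbance term, $e_i$ the regression residuals and $n$ the number of observations.

$$
\operatorname{Var}(\hat{\beta}) = \frac{\sigma_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
s_u^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2},
\qquad
\text{s.e.}(\hat{\beta}) = \sqrt{\frac{s_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
$$

The first expression is the theoretical variance, the second is the usual unbiased estimate of the disturbance variance built from the residuals, and the third is the standard error obtained by substituting the estimate into the first.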

In my lectures I have tried to convince students of the result by using a simple spreadsheet (Excel) demonstration which amounts to a simple Monte Carlo experiment. I begin by setting up an assumed model with known intercept and slope parameters – say Y = 2 + 0.8X + u.

Next I create a sample of fixed X values in the spreadsheet. I usually centre the values on a mean of 100 and have maybe 12 values either side of that (so X runs from 88 to 112). Then I use the random number generator to create a large number of sets (say 500) of 25 values of u, initially making u ~ N(0,1). From this I can create 500 sets of Y values to go with the Xs. Now I can run regressions based on these 500 data sets and collect the 500 estimates of the slope coefficient. (You might prefer to set this up as a batch job in EViews or some other specialist econometric software package.) After that, plot a histogram of the beta hat values, and calculate the mean and standard deviation of the 500 values. (An interesting discussion point is whether you should use the true value, 0.8, or the mean of the 500 beta hat values in this calculation.) Compare these values with those predicted by the theory for the sampling distribution of beta hat.
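For readers who would rather script the experiment than build it in a spreadsheet, here is a minimal sketch in Python using only the standard library. It is not the Excel workbook itself: the function name `simulate_slopes`, the seed and the variable names are my own choices for illustration. It assumes the model Y = 2 + 0.8X + u with X fixed at 88 to 112, and it reports the standard deviation of the estimates both around their own mean and around the true slope, which touches on the discussion point mentioned above.

```python
import math
import random
import statistics


def simulate_slopes(x, sigma=1.0, intercept=2.0, slope=0.8, reps=500, seed=12345):
    """Return (list of OLS slope estimates, Sxx) for Y = intercept + slope*X + u."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    x_bar = sum(x) / len(x)
    dev = [xi - x_bar for xi in x]      # deviations of X from its sample mean
    sxx = sum(d * d for d in dev)       # sum of squared deviations
    estimates = []
    for _ in range(reps):
        # One artificial sample: same X values, fresh disturbances u ~ N(0, sigma^2)
        y = [intercept + slope * xi + rng.gauss(0.0, sigma) for xi in x]
        # OLS slope; because the deviations sum to zero, Y need not be demeaned
        estimates.append(sum(d * yi for d, yi in zip(dev, y)) / sxx)
    return estimates, sxx


x = list(range(88, 113))                # 25 fixed X values centred on 100
estimates, sxx = simulate_slopes(x)

mean_b = statistics.mean(estimates)
sd_about_mean = statistics.stdev(estimates)
sd_about_true = math.sqrt(sum((b - 0.8) ** 2 for b in estimates) / len(estimates))
theoretical_sd = 1.0 / math.sqrt(sxx)   # sigma / sqrt(Sxx) with sigma = 1

print(f"Sxx                      = {sxx}")
print(f"mean of beta hat         = {mean_b:.4f}")
print(f"s.d. about the mean      = {sd_about_mean:.4f}")
print(f"s.d. about the true 0.8  = {sd_about_true:.4f}")
print(f"theoretical s.d.         = {theoretical_sd:.4f}")
```

With these X values the sum of squared deviations is 1300, so theory predicts a standard deviation of about 0.0277 for beta hat. The simulated figures should come out close to that, though they will differ a little from run to run if the seed is changed.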

Now you can repeat the process, varying first the variance of u (maybe make it smaller than 1 – say 0.5). You should see immediately that the variance of the beta hat estimates falls proportionately.

Then you can illustrate the effect of more spread out values of the Xs. Multiply each of them by 10 and recalculate all of the Y values (go back to the original standard normal distribution for u). Rerun the regressions and compare the distribution of the beta hat values. The standard deviation should be one tenth of what it was initially.
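The two variations just described can be sketched in the same way. Again this is only an illustration under my own assumptions: the helper `slope_sd`, the seed and the number of replications are hypothetical choices, not part of the original spreadsheet. The same seed is reused in each run so that the three experiments draw on the same underlying random numbers, which makes the comparison cleaner than it would be with independent draws.

```python
import math
import random
import statistics


def slope_sd(x, sigma, reps=500, seed=2007):
    """Standard deviation of `reps` OLS slope estimates for Y = 2 + 0.8X + u."""
    rng = random.Random(seed)  # same seed each call: same underlying random draws
    x_bar = sum(x) / len(x)
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    estimates = []
    for _ in range(reps):
        y = [2.0 + 0.8 * xi + rng.gauss(0.0, sigma) for xi in x]
        estimates.append(sum(d * yi for d, yi in zip(dev, y)) / sxx)
    return statistics.stdev(estimates)


x = list(range(88, 113))                               # original X values, 88 to 112

sd_base = slope_sd(x, sigma=1.0)                       # baseline: u ~ N(0, 1)
sd_small_u = slope_sd(x, sigma=math.sqrt(0.5))         # variance of u cut to 0.5
sd_wide_x = slope_sd([10 * xi for xi in x], sigma=1.0) # X values multiplied by 10

variance_ratio = (sd_small_u / sd_base) ** 2           # theory: 0.5
sd_ratio = sd_wide_x / sd_base                         # theory: 0.1

print(f"variance ratio, Var(u) = 0.5 vs 1 : {variance_ratio:.3f}")
print(f"s.d. ratio, X times 10 vs original: {sd_ratio:.3f}")
```

Multiplying every X by 10 multiplies each deviation from the mean by 10 and so the sum of squared deviations by 100, which is why the standard deviation of beta hat should fall to one tenth. With independent random draws in each run the ratios would only be approximately 0.5 and 0.1.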

If all this takes too much time to do interactively, you could prepare everything in advance.

Another advantage of this exercise is that it introduces students to the idea of Monte Carlo studies and computer simulation at an early stage.

References
[1] Dougherty, C. (2006) Introduction to Econometrics (Third Edition), Oxford University Press.
[2] Judge, G. (1999) Simple Monte Carlo studies on a spreadsheet, CHEER, Volume 13, 2.

* The phrase "Nothing to prove" is one that I always associate with the Sunderland striker David Connolly. He began his career at Watford, averaging a goal in every two games, before he left for a spell at the Dutch side Feyenoord. A bit of a flop in Holland, he returned to England to play for Wimbledon, declaring that he had "nothing to prove". This caused some amusement among Watford supporters, who think of him as rather arrogant and used to label him W4BS (Watford's 4th Best Striker).