### Notation, notation, notation

If you are new to econometrics you may well be confused, or even irritated, by the variations in notation used by different textbook authors. Most (but not all) authors use the Greek letter beta to represent the unknown parameter that is the coefficient of an independent variable in a regression equation, with a subscript to indicate the variable with which it is associated. But different authors accommodate the constant intercept in different ways. Some give a subscript zero to this first beta, continuing with subscripts 1 to k for the betas linked to the X variables (which have matching subscripts 1 to k). Kennedy, Stock and Watson, and Wooldridge all adopt this form of notation. An alternative approach is to give a subscript 1 to the constant intercept, with the other betas then following on with subscripts 2 to k. In this approach the variable X1 is just a column of constant values (each equal to 1). This is the convention followed by Dougherty, Greene, Gujarati (Basic Econometrics), Hill, Griffiths and Judge, and Pindyck & Rubinfeld. Yet another variant is for the constant intercept to be labelled alpha, with the beta coefficients numbered from 1 to k (Maddala).
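Written out side by side, the three intercept conventions look roughly like this. This is only a sketch: the observation subscript i and the use of u for the disturbance are my assumptions for illustration, and individual authors differ on those details too.

```latex
% Assumed for illustration: observation subscript i, disturbance written as u_i.
\begin{align*}
\text{Intercept as } \beta_0: \quad & Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i \\
\text{Intercept as } \beta_1: \quad & Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i,
  \qquad X_{1i} = 1 \text{ for all } i \\
\text{Intercept as } \alpha: \quad & Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i
\end{align*}
```

Note that in the second form k counts the intercept as well, so an equation with k betas has only k - 1 genuine explanatory variables, whereas in the first and third forms there are k explanatory variables plus the intercept.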

These differences are relatively minor, but when we move on to the choice of symbols for the least squares estimates that go with these parameters there is a further lack of consistency. Kennedy, Maddala, Pindyck & Rubinfeld, Stock and Watson, Wooldridge and Gujarati (Basic Econometrics) put a "hat" over the Greek letter to show that we have an estimate (or estimator) of the parameter rather than its unknown value. But Dougherty, Greene, Hill, Griffiths and Judge, and Gujarati in his other book (Essential Econometrics) instead use the equivalent Roman letter b for each of the betas.
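To make the contrast concrete, here is the same fitted equation in the two styles. As a sketch I have used the subscript-1 intercept convention for both, and the hat on Y for the fitted value and the subscript i are my own assumptions rather than any one author's layout.

```latex
% Assumed for illustration: subscript-1 intercept, \hat{Y}_i for the fitted value.
\begin{align*}
\text{Population equation:} \quad & Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i \\
\text{Hat notation:} \quad & \hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \dots + \hat{\beta}_k X_{ki} \\
\text{Roman notation:} \quad & \hat{Y}_i = b_1 + b_2 X_{2i} + \dots + b_k X_{ki}
\end{align*}
```

The two fitted equations say exactly the same thing: each b stands for the same estimate as the corresponding beta with a hat.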

And then we have the disturbances and their estimated equivalents, the residuals. Of the ten textbooks that I examined, six use the letter u to denote the (unobservable) disturbance in the regression equation, while three of the others use the Greek letter epsilon. One book (Hill, Griffiths and Judge) uses an ordinary e for the disturbance (or error) term. That can be confusing, because two of the authors who use u for the disturbance use e for the associated residual, as does one of the authors (Greene) who has epsilon for the disturbance. Hill, Griffiths and Judge put a hat over the e to denote the residual, while Gujarati (Basic Econometrics), Maddala, Stock and Watson, and Wooldridge put a hat on the u. Pindyck & Rubinfeld put a hat on the epsilon that they use for the disturbance when they want to indicate the residual that goes with it. You can find a table showing the different symbols used by the various authors on my Introduction to Econometrics website at Portsmouth.
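The pairings described in this paragraph can be summarised as follows. This is just my restatement of the paragraph, not the fuller table on the website, and the subscript i and the hat on Y for the fitted value are assumptions for illustration.

```latex
% Disturbance/residual pairings as described in the paragraph above.
\[
\begin{array}{lll}
\text{Disturbance} & \text{Residual} & \text{Used by} \\
\hline
u_i & e_i & \text{two of the authors who use } u \\
u_i & \hat{u}_i & \text{Gujarati (Basic Econometrics), Maddala, Stock and Watson, Wooldridge} \\
\varepsilon_i & e_i & \text{Greene} \\
\varepsilon_i & \hat{\varepsilon}_i & \text{Pindyck \& Rubinfeld} \\
e_i & \hat{e}_i & \text{Hill, Griffiths and Judge}
\end{array}
\]

% Whatever the symbol, the residual is the observed value minus the fitted value.
\[
\hat{u}_i = Y_i - \hat{Y}_i
\]
```

So the plain letter e is a residual in some books and a disturbance in another, which is exactly where the danger of confusion lies.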

What is to be done about this? Why can't all these authors agree on a common system of notation? Taking the second question first, I guess that each would argue that there are advantages in working with the particular convention that they adopt. There is certainly a logic to each of the choices made by the different authors, but it does make it difficult for a student who consults more than one textbook as he tries to get to grips with the subject. You might advise him to stick to just one textbook until he is confident enough about the meaning of the various symbols to recognise a slightly different label being used elsewhere. But I have never wanted to recommend just a single textbook for the courses that I teach. Different types of exposition suit different students. Some want a formal presentation and can handle the proofs and derivations that go with it. Others need a more intuitive approach with lots of examples and illustrations. And I find that some authors give a better exposition of one topic (perhaps autocorrelation) but a weaker one of another (multicollinearity perhaps). So students can benefit from reading more than one account. In any case, at some point they will have to get to grips with the different notational systems used in the journal articles that they must read, so it might be better to face up to this sooner rather than later.

So my point is...? Let's face up to the fact that there are different notational conventions, look at each of them, compare them explicitly, and thereby enable students to become flexible enough to switch from one to another as the situation requires. It may also help students gain a better understanding of the underlying concepts if they have to think more carefully about what they are reading or writing.
