Egwald Statistics: Multiple Regression

by Elmer G. Wiens

The following study demonstrates a practical application of the statistical technique of regression analysis.

I am interested in analyzing how regional voting patterns affect, and are influenced by, the Canada-wide percentage of seats that the winning party takes in an election. In particular, I want to examine the relationship between the winning party's percentage of seats across Canada and the percentage of seats it won in each of four regions: Eastern Canada, Ontario and Quebec, the Prairie Provinces, and British Columbia. To this end, I collected data and calculated percentages for the 19 Canadian elections between 1949 and 2006.

I use the following notation:

Pty = Winning party
Lib = Liberal Party
PC = Progressive Conservative Party
Con = Conservative Party (the winning party in 2006)
M = Winning party has a majority of federal seats (Yes or No)
C = Percent of seats of Canada taken by the winning party
E = Percent of seats of Eastern Canada taken by the winning party
OQ = Percent of seats of Ontario and Quebec taken by the winning party
P = Percent of seats of the Prairie Provinces taken by the winning party
BC = Percent of seats of B.C. taken by the winning party
M = 1 if the winning party forms a majority government, otherwise M = 0

I want to predict Canadian elections using the linear equation:

C = β0 + β1 * E + β2 * OQ + β3 * P + β4 * BC + β5 * M

where the symbols β0, β1, β2, β3, β4, and β5 represent parameters whose values I will determine from previous elections.

If I use the data from the 19 Canadian elections from 1949 to 2006 to estimate this equation's parameters, I get:

C = 9.271 + 0.126 * E + 0.382 * OQ + 0.069 * P + 0.176 * BC + 8.028 * M
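To see how the estimated equation is used, here is a minimal Python sketch that plugs one election's regional percentages into it. The function name `predict_c` and the dictionary `COEFFS` are my own illustrative names, not part of the regression package; the coefficients are the estimates quoted above, and the inputs are the 2006 row of Table 1.

```python
# Parameter estimates quoted in the text for
# C = b0 + b1*E + b2*OQ + b3*P + b4*BC + b5*M
COEFFS = {"intercept": 9.271, "E": 0.126, "OQ": 0.382,
          "P": 0.069, "BC": 0.176, "M": 8.028}


def predict_c(e, oq, p, bc, m):
    """Return the fitted Canada-wide seat percentage for one election.

    e, oq, p, bc are regional seat percentages; m is 1 for a majority
    government and 0 otherwise. (Hypothetical helper for illustration.)
    """
    return (COEFFS["intercept"]
            + COEFFS["E"] * e
            + COEFFS["OQ"] * oq
            + COEFFS["P"] * p
            + COEFFS["BC"] * bc
            + COEFFS["M"] * m)


# 2006 row of Table 1: E = 28, OQ = 27.6, P = 85.7, BC = 47, M = 0
fitted_2006 = predict_c(e=28, oq=27.6, p=85.7, bc=47, m=0)
print(round(fitted_2006, 1))  # 37.5, against an actual C of 40.7
```

The gap between the fitted value and the actual value for 2006 is that election's residual.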

The following pages explain how these parameter estimates for the linear equation were obtained.

Table 1
The Winning Party's Percentage of Seats by Year and by Region
```
____________________________________________________________
Year  Pty    M      C      E      OQ     P       BC    M
____________________________________________________________
2006  Con   No      40.7   28     27.6   85.7    47    0
2004  Lib   No      43.8   69     53     11      22    0
2000  Lib   Yes     57.5   60     77     17      15    1
1997  Lib   Yes     52     34     72     17      18    1
1993  Lib   Yes     60     97     67     39      19    1
1988  PC    Yes     57     34     63     67      38    1
1984  PC    Yes     75     78     74     80      68    1
1980  Lib   Yes     52     63     74      4       0    1
1979  PC    No      48     56     58     78      68    0
1974  Lib   Yes     54     41     71     11      35    1
1972  Lib   No      41     31     56      7      17    0
1968  Lib   Yes     59     22     74     25      70    1
1965  Lib   No      50     45     69      2      32    0
1963  Lib   No      49     61     62      6      32    0
1962  PC    No      44     55     49     88      27    0
1958  PC    Yes     79     76     73     95      82    1
1957  PC    No      42     64     52     29      32    0
1953  Lib   Yes     64     34     78     35      38    1
1949  Lib   Yes     73     74     78     58      61    1
Average            54.6   53.7   64.7   39.7    38.1
Ave   Lib          54.3   52.4   69.1    19     29.8
Ave   PC           54.6   56.2    57    74.8    52.1
____________________________________________________________
```

Notice that the number of observations = 19, one for each election. Since we are interested in the extent to which the results in the four regions determine the winning party's Canada-wide share of seats, the number of independent variables = 5: the four regional variables E, OQ, P, and BC, plus the majority indicator M. C, the percentage of seats taken across Canada by the winning party, is the dependent variable.
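For readers who want to try the fit outside my package, the following is a rough sketch of ordinary least squares on the Table 1 data, using only the Python standard library. It is not the package's own code: the names `ROWS`, `solve`, and `fit_ols` are illustrative, and I assume the usual approach of solving the normal equations (X'X)b = X'y with an intercept column. Because the table entries are rounded, the printed coefficients may differ slightly from the estimates quoted earlier.

```python
# Table 1 rows as (E, OQ, P, BC, M, C), from 2006 back to 1949.
ROWS = [
    (28, 27.6, 85.7, 47, 0, 40.7), (69, 53, 11, 22, 0, 43.8),
    (60, 77, 17, 15, 1, 57.5), (34, 72, 17, 18, 1, 52),
    (97, 67, 39, 19, 1, 60), (34, 63, 67, 38, 1, 57),
    (78, 74, 80, 68, 1, 75), (63, 74, 4, 0, 1, 52),
    (56, 58, 78, 68, 0, 48), (41, 71, 11, 35, 1, 54),
    (31, 56, 7, 17, 0, 41), (22, 74, 25, 70, 1, 59),
    (45, 69, 2, 32, 0, 50), (61, 62, 6, 32, 0, 49),
    (55, 49, 88, 27, 0, 44), (76, 73, 95, 82, 1, 79),
    (64, 52, 29, 32, 0, 42), (34, 78, 35, 38, 1, 64),
    (74, 78, 58, 61, 1, 73),
]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(rows):
    """Return [b0, b1, ..., b5] from the normal equations (X'X) b = X'y."""
    X = [[1.0, *row[:-1]] for row in rows]   # leading 1.0 is the intercept
    y = [row[-1] for row in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


betas = fit_ols(ROWS)
# Residual = actual C minus fitted C for each election.
residuals = [row[-1] - sum(b * x for b, x in zip(betas, [1.0, *row[:-1]]))
             for row in ROWS]
for name, b in zip(["intercept", "E", "OQ", "P", "BC", "M"], betas):
    print(f"{name:>9}: {b:.3f}")
```

With an intercept in the model, the residuals should sum to zero up to rounding error, which is a quick sanity check on the fit.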

When you are using my regression package for a study of your own, you must fill in the tables in the same way that I have done below.

To keep things manageable, in my package the number of observations must be less than 20 and the number of independent variables less than 6. Also, the number of observations must be greater than or equal to the number of variables. I'll explain what the intercept term means on the next page.
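The size limits just described can be written as a small check. This is only an illustration of the stated rules, not the package's actual validation code; the function name `check_regression_setup` is hypothetical, and I assume "the number of variables" in the last rule means the number of independent variables.

```python
def check_regression_setup(n_obs, n_vars):
    """Return a list of problems with the requested study size.

    An empty list means the size satisfies the limits stated in the text.
    (Hypothetical helper; assumes n_vars counts independent variables only.)
    """
    problems = []
    if n_obs >= 20:
        problems.append("number of observations must be less than 20")
    if n_vars >= 6:
        problems.append("number of independent variables must be less than 6")
    if n_obs < n_vars:
        problems.append("need at least as many observations as variables")
    return problems


# The election study: 19 observations, 5 independent variables.
print(check_regression_setup(19, 5))  # []
```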

Regression title
Number of observations
Number of independent variables
Intercept term (Yes or No)
So you give your study a name and enter it into the table, along with the number of observations and the number of independent variables. Then you click 'submit parameters'.