Monday, October 13, 2014

Examples of the 4 main capabilities of the 'margins' command in Stata in use with linear models

This post gives examples of the four main capabilities of the margins command in Stata, as highlighted by J. Scott Long & Jeremy Freese in the 3rd edition of "Regression Models for Categorical Dependent Variables using Stata".

I used data from the 2012 edition of the European Social Survey to illustrate these capabilities. To simplify things, I used only four variables for 1,714 individuals from Great Britain. The main outcome variable is 'ls', life satisfaction scored from 0 = Extremely dissatisfied to 10 = Extremely satisfied. Gender is coded 1 = male, 2 = female. Income is grouped by decile, where 1 = lowest income decile and 10 = highest income decile. There is also an ID variable.


(1) Predictions for each observation
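Margins can compute the predicted value of the outcome for each person in the data, taking into account all the covariates included in the regression, and report the average of those predictions. In this linear model the prediction is a life satisfaction score; in a model for a binary outcome it would be a probability. The 'predict' command can also produce the per-person predictions.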

reg ls i.gender income
margins

The average predicted life satisfaction is 7.309 after controlling for gender and income. The same thing can be computed using the predict command:

reg ls i.gender income
predict predicted_ls
sum predicted_ls

The predict command produces the same average predicted life satisfaction score of 7.309, but margins also provides standard errors and confidence intervals.
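
As a quick cross-check, and only as a sketch: in an OLS model that includes a constant, the mean of the fitted values over the estimation sample equals the sample mean of the outcome. So the 7.309 above should also match the raw mean of ls among the observations used in the regression. The snippet assumes the same variables as above and nothing else.

```stata
* Fit the model and get the average prediction with its standard error
reg ls i.gender income
margins

* Raw mean of the outcome over the same estimation sample;
* with a constant in the model this equals the average fitted value
summarize ls if e(sample)
```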

(2) Predictions at specified values
Margins can compute the predicted outcome at specific values of covariates. The code below instructs Stata to compute predicted life satisfaction at each of the ten income deciles, from 1 to 10 in steps of 1, while controlling for gender. The 'vsquish' option is purely cosmetic (it tightens the vertical spacing of the output) and can be ignored.

reg ls i.gender income
margins, at(income=(1(1)10)) vsquish

Note: I could have instead used "margins, at(income=(1 2 3 4 5 6 7 8 9 10))".

Predicted life satisfaction scores range from 6.82 for the bottom income decile to 7.87 for the top decile after controlling for gender.
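
Because the model is linear, these predictions rise by the same amount for every decile; the figures above imply roughly (7.87 - 6.82)/9 ≈ 0.12 per decile. A minimal sketch of how the first prediction could be reproduced by hand from the stored coefficients follows. The variable 'female' is a hypothetical helper created only for this illustration, and the sketch assumes gender takes only the values 1 and 2.

```stata
reg ls i.gender income

* Share of women in the estimation sample (female is a hypothetical helper)
generate byte female = (gender == 2) if !missing(gender)
summarize female if e(sample), meanonly

* Average prediction at income decile 1, averaging over the observed gender mix
display _b[_cons] + _b[2.gender]*r(mean) + _b[income]*1

* Each additional decile shifts the prediction by the income coefficient
display _b[income]
```

The first display should agree with the income = 1 row of the margins output; replacing the final 1 with any other decile gives the corresponding row.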

(3) Marginal effects
Margins can compute how changes in a covariate are associated with changes in the outcome, holding other covariates constant. In other words, it can compute marginal effects.

reg ls i.gender income
margins, dydx(gender)

The marginal effect of gender is -0.61: women's predicted life satisfaction is 0.61 points lower than men's after controlling for income, although the difference is not statistically significant. Because this is a linear model estimated by OLS, the same result could have been read directly from the gender coefficient in the main regression. The real value of this command comes when estimating marginal effects in a non-linear model such as a probit, whose coefficients are not directly interpretable as changes in the outcome.
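
To illustrate the non-linear case, here is a hedged sketch using a probit. The binary variable 'satisfied' and its cut-off of 7 are assumptions made up for this example, not something from the analysis above, and the sketch assumes 'ls' has no special missing-value codes left in it.

```stata
* Hypothetical binary outcome: 1 if life satisfaction is 7 or above
generate byte satisfied = (ls >= 7) if !missing(ls)

* Probit coefficients are on the latent index scale, not the probability scale
probit satisfied i.gender income

* Average marginal effects on the predicted probability that satisfied == 1
margins, dydx(gender income)
```

Here the dydx() results no longer match the raw coefficients, which is the situation where margins earns its keep.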

(4) Graphs of predictions
Lastly, margins can easily be combined with marginsplot to graph its predictions, with 95% confidence intervals included by default.

reg ls i.gender income
margins, at(income=(1(1)10))
marginsplot

It's also very easy to graph interaction terms using this method. The code below examines whether the relationship between income and life satisfaction differs by gender.

reg ls i.gender##c.income
margins gender, at(income=(1(1)10))
marginsplot
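
The graph shows the two fitted lines; a possible numerical companion, offered as a sketch rather than part of the original analysis, is to ask margins for the income slope within each gender and then test the interaction coefficient directly.

```stata
reg ls i.gender##c.income

* Slope of income on life satisfaction, estimated separately for men and women
margins gender, dydx(income)

* Test whether the two slopes differ (the interaction coefficient)
test 2.gender#c.income
```

If the test does not reject, the two lines in the plot are statistically indistinguishable in slope.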

This is part of a series of posts designed to highlight the usefulness of the margins command & various graphing capabilities in Stata. It draws on the 3rd edition of "Regression Models for Categorical Dependent Variables using Stata" by Long & Freese as a primary reference.
