ECONOMETRICS ASSIGNMENT 2
For this assignment, use the dataset eaef_as2_2014.dta, which has 1199 observations. The data contain information on the wages and characteristics of workers in
the United States in 2002. It includes a variable union, which indicates whether the person’s wages are set by collective bargaining.
a) Estimate how much more or less private sector workers in the US earn on average when their wages are set by collective bargaining, holding their characteristics
constant. Interpret the findings. [75%]
b) Extend part a) by adding an interaction between union and female. Include all three variables (union, female and union*female) in the model and interpret the results.
[25%]
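As a sketch of how the interaction in part b) can be read, the two models might be written as below. The log of the wage as the dependent variable and the control vector x_i are assumptions made for illustration only; the assignment leaves the choice of specification to you.

```latex
% Part a): union differential with controls (log wage and x_i are assumed here)
\ln(w_i) = \beta_0 + \beta_1\,\mathit{union}_i + x_i'\gamma + u_i

% Part b): add female and the union*female interaction
\ln(w_i) = \beta_0 + \beta_1\,\mathit{union}_i + \beta_2\,\mathit{female}_i
         + \beta_3\,(\mathit{union}_i \times \mathit{female}_i) + x_i'\gamma + u_i

% Union differential by gender implied by part b)
\text{men: } \beta_1 \qquad \text{women: } \beta_1 + \beta_3

% Gender gap by union status implied by part b)
\text{non-union: } \beta_2 \qquad \text{union: } \beta_2 + \beta_3
```

Under this assumed log specification, a dummy coefficient \beta corresponds to an exact percentage difference of 100(e^{\beta} - 1), which is close to 100\beta only when \beta is small.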
Practical notes:
It may be sensible to use a do-file to avoid retyping the regression commands multiple times. If, however, you prefer to work from the command line, note that
pressing the Page Up/Page Down keys recalls previous commands to the command line; this may be the quickest way to modify your estimation.
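A do-file for the two parts might look like the minimal sketch below. Only union and female are variable names given in this handout; wage, educ and exper are hypothetical placeholders, and the log transformation is an assumption rather than a required choice. Run describe first and substitute the names actually present in the dataset.

```stata
* Sketch of a do-file for parts a) and b).
* ASSUMPTION: wage, educ and exper are hypothetical variable names.
clear all
use eaef_as2_2014.dta, clear

* Inspect the variables before choosing a specification
describe
summarize

* Assumed transformation: log of the (hypothetical) wage variable
generate lnwage = ln(wage)

* Part a): union differential, holding placeholder characteristics constant
regress lnwage union female educ exper

* Part b): add the union*female interaction
generate union_female = union*female
regress lnwage union female union_female educ exper
```

Saved under a name of your choice, for example as2.do (a hypothetical file name), the file can be rerun after each change with do as2.do.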
Points to keep in mind:
-Could you improve the model by transforming variables or allowing for non-linearities?
-Are there outliers that distort your model?
-Can you rule out endogeneity of your explanatory variables?
-To tabulate two variables, use tabulate. Example: tabulate female urban
-To look up how a command works, use help, such as help regress.
-Are the key assumptions of OLS holding (note that some of them can’t be directly tested)?
-Study the dataset and variables and think about what you can and can’t do with them.
-Do your results make sense to you? (No need for literature review or outside references!)
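Some of the points above can be checked with standard Stata post-estimation commands, as in the sketch below. As before, wage, educ and exper are hypothetical variable names and the log wage specification is an assumption. The cutoff of 3 for studentized residuals is a common rule of thumb, not a requirement of the assignment.

```stata
* Sketch of some checks for the points listed above.
* ASSUMPTION: wage, educ and exper are hypothetical variable names.
use eaef_as2_2014.dta, clear
generate lnwage = ln(wage)

* Cell sizes for the union*female interaction
tabulate union female

* Baseline regression with conventional standard errors
regress lnwage union female educ exper

* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest

* Ramsey RESET test for neglected non-linearities
estat ovtest

* Residual-versus-fitted plot to look for patterns and outliers
rvfplot, yline(0)

* Flag observations with large studentized residuals (rule-of-thumb cutoff)
predict rstud, rstudent
list if abs(rstud) > 3 & !missing(rstud)

* Re-estimate with heteroskedasticity-robust standard errors
regress lnwage union female educ exper, vce(robust)
```

These commands address functional form, outliers and heteroskedasticity only. Endogeneity of the explanatory variables cannot be ruled out by such tests and has to be argued from what the data do and do not contain.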
The submitted answer should consist of a maximum of 3 printed pages (longer answers are penalised),
using font size 10 or 12. To align Stata output nicely, use the Courier 10 Pitch or Courier New
font at font size 10 in Word. For the estimations in both part a) and part b), include only the 4 sections as
shown on the next page. Nothing more, please. The next page is a simplified sample answer for part a).
The answer to part b) should be in a similar format, but unnecessary repetition should be left out.
Grading is based on the overall sensibility and informativeness of the preferred models, and their
correct interpretation and testing. There is no one right answer for this assignment.

