Instructions

  • This assignment uses data on lung function (Lung), adolescents with parents that have HIV (Parhiv), and the depression study (Depress). Read about these data sets in PMA6 Chapter 1. As part of HW0, you downloaded all the data sets and codebooks into your MATH456/data folder.
  • Right click this link to download the Quarto template for this homework.
    • This template is required for this homework as it provides a demonstration on how to use markdown headers # and ## to nicely format your homework to make it easy for me to find your problems. All further homework must be in a format like this (a template is not always provided.)
  • Render to PDF and submit to Google Drive: Topic 01: Regression Modeling/Draft folder by the due date.
  • This is a peer reviewed assignment. After the PR period, you should update your assignment with corrections by your peers, and submit a final version to Canvas.

Part I: Statistical Modeling

  1. From the depression data set, fit a linear regression model on depression as given by cesd, using income and age as independent variables. (PMA6 problem 8.5, modified)
    1. Analyze the residuals and decide whether or not it is reasonable to assume that they follow a normal distribution.
    2. Interpret each coefficient (including the intercept) in full context of the problem. You must include the point estimate (\(\hat{\beta}\)), a 95% confidence interval for that estimate, and a p-value in your conclusion.


  1. Does gender (sex) modify the relationship between age and depression? Don’t forget to recode sex into a categorical variable before you begin. Answer this question in two ways: a) using a stratified model b) using an interaction model.


  1. Which of the two models in question 2 assumes that the affect of income on depression is constant (does not change) between males and females?


  1. For adult males it has been demonstrated that age and height are useful in predicting FEV1. Using the lung function data set, determine whether the regression plane can be improved by also including weight. Use two measures of model fit to justify your answer to this question (PMA6 problem 9.4, Ref PMA6 section 9.4). Note: all variables for adult males start with the letter F (FAGE, FFEV1, FHEIGHT, etc.)


  1. For adult males, does weight confound the relationship between age or height and FEV1? You must show relevant model output and explain how you came to your conclusion.


  1. Using the depression data set, fit a model of income using age, sex, educational level and religion as predictors. (PMA6 problem 10.3 modified)
    1. Use a general F test to determine whether religion has an effect on income.
    2. State the reference categories for both religion and educational level.
    3. Interpret the coefficient for each level of educational level in full context of the problem, being sure to include the comparison group. You must include the point estimate (\(\hat{\beta}\)), a 95% confidence interval for that estimate, and a p-value in your conclusion.



Part II: Variable Selection

PMA6 problems 9.9, 9.11, 9.12, 9.13, 9.14