This homework uses the modified Parental HIV data set found in Canvas under this assignment. Use this template qmd file.

- Identify the amount of missing data in the entire data set.
- Identify the amount of missing data in the 2 Parental bonding and 10 Brief Symptom Inventory scales.
- Explore and describe bivariate missing patterns between the parental overprotection subscale and a different scale variable.
- Single impute `parent_overprotection` using the `hotdeck(dataset, variable = "var")` function in VIM. See `vignette("donorImp")` for more information.
- Multiply impute `parent_overprotection` using a non-mice based imputation method of your choice that has a random component to it.
  - Calculate the point estimate \(Q\) and the variance \(U\) from each imputation.
  - Pool estimates.

- Comparison of Estimates. Create a summary table and plot containing the point estimate and 95% CI for the parental overprotection variable under a) complete case, b) single imputation done in #4, and c) multiple imputation done in #5. Summarize your findings.
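As a reference for the pooling step, here is a minimal sketch of Rubin's rules in base R. The vectors `Q` and `U` hold made-up numbers used only for illustration, and `pool_rubin()` is a hypothetical helper name, not a function from any package. It assumes `U` contains the squared standard error of each estimate.

```r
# Made-up per-imputation results (illustration only, not from the homework data):
# Q = point estimates, U = their variances (squared standard errors).
Q <- c(10.2, 9.8, 10.5, 10.1, 9.9)
U <- c(0.25, 0.30, 0.28, 0.26, 0.27)

# Hypothetical helper implementing Rubin's rules for a scalar estimate.
pool_rubin <- function(Q, U, conf_level = 0.95) {
  m     <- length(Q)
  Q_bar <- mean(Q)                    # pooled point estimate
  U_bar <- mean(U)                    # within-imputation variance
  B     <- var(Q)                     # between-imputation variance
  T_var <- U_bar + (1 + 1 / m) * B    # total variance
  # Rubin's (1987) degrees of freedom for the t reference distribution
  df    <- (m - 1) * (1 + U_bar / ((1 + 1 / m) * B))^2
  half  <- qt(1 - (1 - conf_level) / 2, df) * sqrt(T_var)
  c(estimate = Q_bar, se = sqrt(T_var), df = df,
    lower = Q_bar - half, upper = Q_bar + half)
}

pooled <- pool_rubin(Q, U)
print(round(pooled, 3))
```

The same pooled estimate and interval can then go into the summary table next to the complete case and single imputation results.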

- Build a better imputation model for `parental_overprotection`. Do this by imputing the `pb01-pb25` variables, then recreate the `parental_overprotection` scale post-imputation. “Talk me” through your process.
  - Explore missing data patterns in other (non-scale) variables before you build your model. Not all variables should be considered in the imputation models but be sure to include `gender` and `hookey`. Use tables and plots, but ensure all output is discussed and don’t create output that won’t be discussed.
  - Multiply impute this data set between \(m=5\) and \(m=10\) times using MICE. Make sure the imputation models used for each variable are showing in your final output. Adjust any that may not make sense for their variable type.
  - Update the summary table and plot from Part I and describe how your new model performed relative to the others.
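One way to show and adjust the per-variable imputation methods is sketched below. It uses `nhanes2`, a small example data set that ships with mice, purely as a stand-in for the homework data. The switch to `"norm"` for `chl` is an arbitrary illustration of how to override a default, not a recommendation.

```r
library(mice)

# Stand-in data: nhanes2 ships with mice (not the homework data set).
# A dry run (maxit = 0) returns the default settings without imputing.
init <- mice(nhanes2, maxit = 0, printFlag = FALSE)
meth <- init$method
print(meth)  # default method per variable; "" means nothing to impute

# Illustrative override: Bayesian linear regression for the numeric
# variable chl in place of the default predictive mean matching.
meth["chl"] <- "norm"

# Multiply impute with the adjusted methods.
imp <- mice(nhanes2, m = 5, method = meth, seed = 2024, printFlag = FALSE)
print(imp$method)  # methods actually used, for display in the final output

# Inspect the first completed data set.
head(complete(imp, 1))
```

Printing `imp$method` in the rendered document is one simple way to keep the imputation models visible in the final output.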

- After controlling for other measures, what is the effect of gender on the *odds* a student will skip school? Adjust the model for fit or stability as needed. Report your results in a nice table and/or plot.
  - Fit this model on the complete cases (no imputation).
  - Fit this model on the multiply imputed data sets from the prior problem, and report the pooled estimates and intervals.
  - Interpret the effect of gender on playing hookey. Did it change from the complete case model?
  - Create a plot to compare the results for all coefficients in the model.
  - What are the biggest differences you notice? Would the inference/interpretation of the effect of any covariate on the odds of a student skipping school change depending on what model you use?
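The complete case versus pooled comparison could look roughly like the sketch below. The data frame `toy` is simulated, and its variables (`skipped`, `gender`, `age`) are hypothetical stand-ins, not the homework variables. The model formula is likewise only an illustration.

```r
library(mice)

# Simulated stand-in data (hypothetical variables, not the homework data).
set.seed(1)
n <- 300
toy <- data.frame(
  gender = factor(sample(c("F", "M"), n, replace = TRUE)),
  age    = rnorm(n, mean = 15, sd = 2)
)
toy$skipped <- rbinom(n, 1, plogis(-1 + 0.6 * (toy$gender == "M") +
                                     0.1 * (toy$age - 15)))
toy$age[sample(n, 60)] <- NA  # set 20% of age to missing, completely at random

# Complete case model: glm() drops rows with missing values by default.
cc_fit <- glm(skipped ~ gender + age, family = binomial, data = toy)

# Fit the same model on each imputed data set, then pool with Rubin's rules.
imp    <- mice(toy, m = 5, seed = 1, printFlag = FALSE)
mi_fit <- with(imp, glm(skipped ~ gender + age, family = binomial))
mi_tab <- summary(pool(mi_fit), conf.int = TRUE, exponentiate = TRUE)
print(mi_tab)  # pooled odds ratios with 95% intervals

# Side-by-side odds ratios for the comparison table or plot.
compare <- data.frame(
  term             = names(coef(cc_fit)),
  or_complete_case = exp(coef(cc_fit)),
  or_pooled        = mi_tab$estimate,
  row.names        = NULL
)
print(compare)
```

With `exponentiate = TRUE`, the pooled estimates are reported as odds ratios, which matches the odds interpretation asked for above.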