Abstract
The case-cohort study design was originally proposed by Prentice (1986). Under this design, a random sub-cohort of individuals is selected from the study cohort. Full covariate data
are collected only from the cases in the cohort and the members of the sub-cohort, rather than from the entire original cohort,
saving time and money when measures such as biomarkers or genotypes are required. Thus, certain
covariates will be missing for a large number of individuals in the study cohort. This
design has been widely used in clinical and epidemiological studies to assess the effect of covariates on failure times. The Cox proportional hazards model (Cox 1972) is a popular and
classical choice for such data because of the straightforward interpretation of its regression coefficients and the
availability of efficient inference procedures implemented in most statistical software packages.
Few other methods allow for time-varying regression coefficients. An underlying assumption
of the Cox model is the so-called proportional hazards assumption, that is, the hazard ratio
remains constant over time or covariates have log-linear effects on the risk of the event of interest.
However, in many real datasets, covariates may exhibit much more complicated effects
than log-linear effects; thus, the proportional hazards assumption may be violated, and the
Cox model may not be an appropriate choice. In addition, most methods do not use the data from
non-cases outside the sub-cohort, which results in inefficient inference. Addressing
these issues, we have proposed an estimation procedure for the semiparametric additive
hazards model for case-cohort data, allowing the covariates of interest to be missing for both cases
and non-cases. We have considered an additive model in which the effects of some covariates
are time-varying while the effects of other covariates are constant. Further, we have
assumed that the missing covariates have constant effects on the failure time. We have proposed
an augmented inverse probability weighted (AIPW) estimation procedure. It uses auxiliary
information that is correlated with missing covariates. We have established the asymptotic
properties of the proposed AIPW estimator. Our simulation study shows that AIPW
estimation is more efficient than the widely used inverse probability weighted (IPW) and complete-case estimation methods. This gain is particularly apparent when the
sub-cohort is very small. The method is applied to analyze data from an HIV vaccine efficacy
trial.
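
The sampling scheme and the inverse probability weights mentioned above can be illustrated with a minimal sketch. It assumes the classical case-cohort design, in which every case is fully observed and each individual enters the sub-cohort independently with a known probability (Bernoulli sampling). The function name `case_cohort_sample` and its parameters are hypothetical and chosen only for illustration. The sketch does not implement the proposed AIPW procedure, which additionally allows covariates to be missing for cases and uses auxiliary information.

```python
import random


def case_cohort_sample(is_case, subcohort_fraction, seed=0):
    """Select individuals for full covariate measurement and assign IPW weights.

    is_case: list of bools, True if the individual experienced the event.
    subcohort_fraction: assumed known probability of entering the sub-cohort.
    Returns (selected, weights), two lists aligned with is_case.
    """
    rng = random.Random(seed)
    selected, weights = [], []
    for case in is_case:
        # Sub-cohort membership is drawn without regard to case status.
        in_subcohort = rng.random() < subcohort_fraction
        if case:
            # Assumption: all cases are measured, so selection probability is 1.
            selected.append(True)
            weights.append(1.0)
        elif in_subcohort:
            # A sampled non-case stands in for 1 / p non-cases in the cohort.
            selected.append(True)
            weights.append(1.0 / subcohort_fraction)
        else:
            # An unsampled non-case has no full covariate data and weight 0.
            selected.append(False)
            weights.append(0.0)
    return selected, weights


# Hypothetical cohort: 1000 individuals, the first 50 of whom are cases.
cohort = [i < 50 for i in range(1000)]
selected, weights = case_cohort_sample(cohort, subcohort_fraction=0.1)
print(sum(selected), "individuals selected for full covariate measurement")
```

A complete-case analysis would use the selected individuals without the weights, whereas an IPW analysis weights each selected individual by the inverse of its selection probability, as the weights here do.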