Abstract

Large cohort studies under simple random sampling can be prohibitively expensive for epidemiological studies that have a limited budget and seek to relate a failure time to exposure variables that are costly to obtain. In this case, two-phase studies are desirable. Failure-time-dependent sampling (FDS) is a commonly used cost-effective sampling strategy in such studies. To enhance study efficiency under FDS, it is necessary to incorporate auxiliary information about the expensive variables into both the sampling design and the statistical analysis.

Chapter 2 discusses semiparametric inference for a two-phase failure-time-auxiliary-dependent sampling (FADS) design that allows the probability of obtaining the expensive exposures to depend on both the failure time and cheaply available auxiliary variables. To account for the sampling bias, we develop a semiparametric maximum pseudo-likelihood approach for inference and a nonparametric bootstrap procedure for variance estimation. The proposed estimator of the regression coefficients is shown to be consistent and asymptotically normal. Simulation studies indicate that the proposed method works well in practical settings and is more efficient than competing sampling schemes or methods. Analyses of two real data sets are provided for illustration.

In survival analysis, it is commonly assumed that all subjects in a study will eventually experience the event of interest. However, this assumption may not hold in various scenarios. For example, when studying the time until a patient progresses or relapses from a disease, those who are cured will never experience the event. These subjects are often labeled "long-term survivors" or "cured", and their survival time is treated as infinite. When survival data include a fraction of long-term survivors, the censored observations encompass both uncured individuals, for whom the event was not observed, and cured individuals, who will never experience the event. Consequently, the cure status of a censored subject is unknown, and the survival data comprise a mixture of cured and uncured individuals that cannot be distinguished beforehand. Cure models are survival models designed to address this characteristic.

Chapter 3 considers the generalized case-cohort design for studies with a cure fraction. Under this design, the expensive covariates are measured only for a subset of the study cohort, called the subcohort, and for all or a subset of the remaining subjects outside the subcohort who have experienced the event, called cases. We propose a two-step estimation procedure under semiparametric transformation mixture cure models. We first develop a sieve maximum weighted likelihood method based only on the complete data and devise an EM algorithm for its implementation. We then update the resulting estimator via a working model between the outcome and the cheap covariates or auxiliary variables, using the full data. We show that the proposed estimator is consistent and asymptotically normal, regardless of whether the working model is correctly specified. We also propose a weighted bootstrap procedure for variance estimation. Extensive simulation studies demonstrate the superior finite-sample performance of the proposed method. An application to the National Wilms' Tumor Study is provided for illustration.

A few directions for future research are discussed in Chapter 4.
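
The mixture of cured and uncured subjects described in the abstract can be illustrated with a small simulation. The sketch below is not the method of the thesis, which uses semiparametric transformation mixture cure models. It assumes, purely for illustration, an exponential failure time for uncured subjects and a single fixed censoring time. The function names and parameters (`population_survival`, `simulate_cure_data`, `cure_prob`, `rate`, `censor_time`) are hypothetical.

```python
import math
import random


def population_survival(t, cure_prob, rate):
    """Population survival under a simple mixture cure model.

    S_pop(t) = cure_prob + (1 - cure_prob) * S_u(t), where S_u is the
    survival function of uncured subjects. An exponential S_u is an
    assumption made only for this illustration.
    """
    return cure_prob + (1.0 - cure_prob) * math.exp(-rate * t)


def simulate_cure_data(n, cure_prob, rate, censor_time, seed=0):
    """Simulate (observed_time, event, cured) triples.

    Cured subjects have an infinite failure time, so they are always
    censored. Uncured subjects are censored only if their failure time
    exceeds the (assumed fixed) censoring time.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        cured = rng.random() < cure_prob
        failure_time = math.inf if cured else rng.expovariate(rate)
        observed_time = min(failure_time, censor_time)
        event = failure_time <= censor_time
        data.append((observed_time, event, cured))
    return data


if __name__ == "__main__":
    data = simulate_cure_data(n=1000, cure_prob=0.3, rate=1.0, censor_time=2.0)
    n_censored = sum(1 for _, event, _ in data if not event)
    n_cured = sum(1 for _, _, cured in data if cured)
    # In real data the `cured` flag is latent: only time and event are seen,
    # so the censored group mixes cured and not-yet-failed uncured subjects.
    print("censored:", n_censored, "of which cured:", n_cured)
```

Under these assumptions the population survival curve levels off at `cure_prob` rather than decaying to zero, and the censored group always contains every cured subject plus any uncured subject whose failure time exceeds the censoring time.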
