The increasing performance of feature extraction and regression modeling in various domains raises the hope that machine and deep learning can assist clinicians in numerous healthcare applications. However, the complex and multimodal nature of the problems and the scarcity of high-quality labeled data in this domain introduce several challenges and limitations. These challenges, along with a lack of interpretability, undermine the generalizability and usability of many state-of-the-art machine learning models. This dissertation focuses on using multimodal sources of data for regression modeling in healthcare applications. The argument is that domain knowledge describes the nature of each modality's relationship with the target function, and that this relationship can characterize the appropriate level of representation and an efficient integration method. We define a framework with two heterogeneous modalities: one modality provides more local features, while the other contains higher-level global information. We demonstrate the framework's applicability to multiple healthcare regression tasks. Within this framework, we propose a pipeline for incorporating domain knowledge into the design of an effective integration model. In particular, we suggest two approaches for increasing performance in the absence of large-scale data: adjusting the abstraction level of each modality's representation based on domain knowledge, and a tree-structured convolutional neural network for integrating information from the heterogeneous modalities. The framework is discussed in more detail for two cases, "radiation therapy treatment planning" and "Alzheimer's disease progression prediction." The former predicts a two-dimensional target using the leveraged representation, while the latter evaluates the whole pipeline for predicting a scalar target variable. In the latter case, performance is compared with previous submissions on the same dataset and outperforms the best-reported results.
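
The tree-structured integration idea can be illustrated with a minimal sketch: one branch applies a convolution to extract local features from a raw signal, the other passes higher-level global descriptors through unchanged, and a fusion node combines both representations. This is a toy pure-Python illustration under assumed inputs, not the dissertation's actual model; all function names, the kernel, and the weights are hypothetical.

```python
# Illustrative sketch of tree-structured multimodal fusion (hypothetical
# names and values; the real model uses learned convolutional filters).

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution: local feature extraction."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def local_branch(signal):
    # A fixed difference kernel stands in for a learned filter that
    # extracts local features from the fine-grained modality.
    return conv1d(signal, [-1.0, 0.0, 1.0])

def global_branch(descriptors):
    # The global modality (e.g., high-level summary variables) is already
    # abstract, so it needs no further feature extraction.
    return list(descriptors)

def fuse(local_feats, global_feats, weights):
    # Fusion node: a weighted sum over the concatenated representations,
    # standing in for the integration layers at the root of the tree.
    feats = local_feats + global_feats
    return sum(w * f for w, f in zip(weights, feats))

# Example: fine-grained signal plus one global descriptor -> scalar target.
prediction = fuse(local_branch([1.0, 2.0, 4.0, 8.0]),
                  global_branch([0.5]),
                  weights=[1.0, 1.0, 2.0])
```

The point of the structure is that each modality is processed at the level of abstraction its nature calls for before the branches meet, rather than forcing both through the same feature extractor.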