Travel Time Forecasting on a Freeway Corridor: a Dynamic Information Fusion Model based on the Machine Learning Approaches
Abstract
Metropolitan areas suffer from frequent road traffic congestion not only during peak hours but also during off-peak periods. The increasing availability of vehicle probe data has made real-time travel time prediction (TTP) a reality. Freeway traffic is difficult to interpret because it is affected by many traffic features, a number of which are themselves unpredictable. Despite these difficulties, a deeper understanding of how travel times change, together with accurate TTP, will greatly help infrastructure design, traffic management and operations, and transportation-related decision-making. Various statistical and machine learning methods have been employed in travel time forecasting. However, such machine learning methods often face the problem of overfitting in practice. Tree-based ensembles have been applied in various prediction fields, and such approaches usually produce high prediction accuracy by aggregating and averaging individual decision trees. This aggregation not only helps obtain better prediction results but also offers an excellent bias-variance tradeoff, which can help avoid overfitting. To improve the accuracy and interpretability of the model, a random forest (RF) method is developed in this research and used to analyze and model travel time on freeways. However, when the TTP horizon increases beyond 15 min, the performance of the RF method decreases significantly. Recently, Long Short-Term Memory (LSTM) neural networks, another powerful prediction method, have been widely applied to short-term traffic prediction. In this research, an attention mechanism (AM) is incorporated into the LSTM neural network to capture the inner relationships within the traffic data.
The proposed LSTM with attention mechanism (LSTM_AM) method shows superior capability for prediction horizons longer than 15 minutes (i.e., from 30 min to 60 min), overcoming the performance issue through long temporal dependency and memory blocks. To validate the accuracy and reliability of the proposed models, they are tested on a freeway corridor in Charlotte, North Carolina, using probe vehicle-based traffic data. The input features are introduced in detail, and the data preprocessing is also presented. To measure the effectiveness of the proposed TTP algorithms, the mean absolute percentage errors (MAPEs) are computed for different observation segments at prediction horizons ranging from 15 to 60 minutes. The relative importance values of the features show that variables such as the travel time 15 minutes earlier and the time of day contribute most to the predicted results. The results also indicate that the proposed TTP models perform better at the 15-minute horizon than at the other horizons. In addition, the RF model has the best prediction performance in the 15-minute prediction horizon, with an average MAPE of 6.34%, and the LSTM_AM model has the best performance in all other prediction horizons (30 min, 45 min, and 60 min). In practice, each model can be applied in its preferred prediction horizons. A comparison with other prediction methods confirms that the proposed RF and LSTM_AM methods achieve better prediction performance in both accuracy and efficiency, suggesting that they can form part of successful solutions to critical real-world transportation challenges.
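For readers unfamiliar with the error measure used above, the following is a minimal sketch of how a MAPE could be computed for one segment and one prediction horizon. It assumes the standard definition (the mean of |observed - predicted| / observed, expressed in percent); the function name `mape` and the sample travel times are hypothetical and are not taken from the study's data or code.

```python
def mape(observed, predicted):
    """Return the mean absolute percentage error, in percent.

    Assumes the standard definition; the exact variant used in the
    study is not stated in the abstract.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one observation is required")

    total = 0.0
    for obs, pred in zip(observed, predicted):
        # Travel times should be strictly positive; guard against division by zero.
        if obs <= 0:
            raise ValueError("observed travel times must be positive")
        total += abs(obs - pred) / obs
    return 100.0 * total / len(observed)


# Hypothetical travel times (seconds) for a single segment and horizon.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
segment_mape = mape(observed, predicted)
print(f"MAPE: {segment_mape:.2f}%")
```

Under this definition, each error is weighted relative to the observed travel time, so values remain comparable across segments of different lengths.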