Abstract
Human activities are an important driving factor of electricity consumption, and they can be captured by calendar information. This research uses the hour of the day, the day of the week, the month of the year, and the 24 solar terms as calendar variables to classify the data. When class variables are used in a regression model, the complexity of the model is directly proportional to the number of levels of the class variables. Interactions with temperature terms further increase the number of coefficients to be estimated. When there are few training observations, this added complexity may cause the model to overfit. Grouping the levels of a calendar variable is one solution to this issue. Many researchers have previously grouped calendar variables to improve load forecasts. This research proposes three heuristic algorithms that provide a structured way of finding the optimal grouping pattern instead of selecting it empirically. These heuristic algorithms are faster than enumerating all possible groupings of the calendar variables. Additionally, this research studies how the number of optimal groups changes as the model grows. Grouping calendar variables improves forecasting accuracy over the ungrouped model on the validation data and, in some cases, on the test data. The experiments were conducted on the total system load of ISO New England and on the publicly available GEFCom2012 data.
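To give a rough sense of scale for the two complexity arguments above, the following Python sketch counts interaction coefficients and the number of candidate groupings. It is an illustration under stated assumptions, not the method used in this research: it assumes one coefficient per level per temperature term, assumes a hypothetical three temperature terms, and assumes that "all possible groupings" means every partition of the levels into non-empty groups (counted by the Bell number). The function names are hypothetical.

```python
def interaction_coefficients(levels: int, temperature_terms: int) -> int:
    """Coefficients added by interacting a class variable with temperature.

    Assumption: one coefficient per level per temperature term.
    """
    return levels * temperature_terms


def bell_number(n: int) -> int:
    """Number of ways to partition n labelled levels into non-empty groups.

    Computed with the Bell triangle; assumes every partition is a valid grouping.
    """
    row = [1]
    for _ in range(n - 1):
        new_row = [row[-1]]          # each row starts with the previous row's last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


# Levels of the calendar variables named in the abstract.
calendar_levels = {"hour": 24, "day_of_week": 7, "month": 12, "solar_term": 24}
TEMPERATURE_TERMS = 3  # hypothetical, e.g. a cubic polynomial in temperature

for name, levels in calendar_levels.items():
    coefs = interaction_coefficients(levels, TEMPERATURE_TERMS)
    print(f"{name}: {coefs} interaction coefficients, "
          f"{bell_number(levels)} candidate groupings")
```

Under these assumptions, merging levels reduces the coefficient count linearly, while the number of candidate groupings grows far faster than the number of levels, which is why exhaustive enumeration becomes impractical for variables such as the hour of the day.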