Abstract
Software bugs cause a variety of problems, including system crashes, degraded performance, incorrect output, and security vulnerabilities. Tools that help developers produce bug fixes quickly can improve programmer productivity. In this thesis, we explore the design of a patch recommender system for the Linux kernel. Previous efforts in automated bug resolution use a search approach that explores an automatically generated patch space for functionally correct patches. In contrast, we focus on bug reports and patch commit descriptions written by humans in a natural language such as English. Our goal is to relate a new bug description to the most closely related patch that resolved a similar bug in the past. Unlike existing approaches, we do not attempt to generate patch code; instead, we point developers to candidate patches that help them identify the part of the code base they should focus on. Our approach is thus complementary to existing research. We explore the use of Natural Language Processing (NLP) to mine bug and patch descriptions. Our dataset consists of previously resolved bugs and the corresponding patches from the Linux kernel project. We pose bug-patch matching as a semantic similarity NLP problem. After generating a custom word embedding for the bug-patch dataset, we train a Siamese LSTM network that outputs the Manhattan distance between the bug and the patch. The model is implemented with the Keras-TensorFlow framework. We then evaluate our approach with bugs from the test set and determine the top-K matches for each bug from all existing patches. At the 50th percentile of the test bugs, the correct patch occurs within the top 11.5 patch recommendations output by the model.
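The ranking and evaluation step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the short vectors below are toy stand-ins for the encodings a trained Siamese LSTM would produce, and the names `manhattan`, `rank_patches`, and `rank_of_correct` are hypothetical helpers introduced only for illustration. The sketch ranks all candidate patches by Manhattan distance to a bug, records the position of the known correct patch, and takes the median of those positions over the test bugs.

```python
from statistics import median

def manhattan(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def rank_patches(bug_vec, patch_vecs):
    """Return patch ids ordered from closest to farthest from the bug."""
    return sorted(patch_vecs, key=lambda pid: manhattan(bug_vec, patch_vecs[pid]))

def rank_of_correct(bug_vec, patch_vecs, correct_id):
    """1-based position of the known correct patch in the ranked list."""
    return rank_patches(bug_vec, patch_vecs).index(correct_id) + 1

# Toy encodings (assumed; a real system would use the LSTM outputs).
patches = {"p1": [0.0, 0.0], "p2": [1.0, 1.0], "p3": [3.0, 0.0]}

# Each test bug is paired with the id of the patch that actually fixed it.
test_bugs = [
    ([0.1, 0.0], "p1"),
    ([2.0, 0.2], "p2"),
    ([0.9, 1.0], "p3"),
    ([3.0, 0.1], "p3"),
]

ranks = [rank_of_correct(vec, patches, pid) for vec, pid in test_bugs]
median_rank = median(ranks)  # 50th-percentile rank of the correct patch
print(ranks, median_rank)
```

Note that `statistics.median` averages the two middle values when the number of test bugs is even, which is one way a fractional median rank can arise.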