Abstract

Cyber attacks cost companies and organizations billions of dollars each year. To alleviate this issue, security professionals and companies continuously attempt to discover new vulnerabilities and to share their cyber threat information about ongoing attacks with other defenders and the public. However, the amount of data they share is immense, making the problem of finding useful information tantamount to looking for a needle in a haystack.

In this dissertation, I present a new framework that uses machine learning, text mining, and natural language processing techniques to automatically extract cyber threat intelligence, in particular indicators of compromise (IoCs), from public data sharing platforms such as social media, discussion forums, and text-sharing websites. I also present two systems, one to predict malicious IP addresses and one to detect phishing URLs. These systems can be integrated with the presented framework and consume its output to improve their results. Moreover, I introduce a new class of vulnerabilities that arises from conflicting requirements in modern operating systems. To show its feasibility, I reveal one such vulnerability in Microsoft Windows operating systems and, based on it, propose a new stealthy lateral movement technique that cannot be detected by existing state-of-the-art detection systems.

To extract useful threat information from public data sharing platforms, I first present a reputation model to identify credible cyber threat intelligence sources. Only streams of data published by such sources are tracked. Although the identified sources publish threat information in general, they may also post about other topics such as personal matters. Hence, I devise another model to filter out non-threat information from the observed data streams. Next, I introduce an IoC extraction tool to extract and combine IoCs from the filtered streams. The output of this framework is given to the two predictive models to validate the IP addresses and URLs associated with the resulting IoCs. The confirmed IoCs can in turn be used to train these systems. In this dissertation, I focus on Twitter and Pastebin as exemplars of social media and text-sharing platforms, respectively. However, the presented work can be adapted to other similar platforms without significant effort.
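
To make the IoC extraction step more concrete, the following is a minimal Python sketch of how IP addresses, URLs, and file hashes might be pulled from a filtered post. It is an illustration only and not the tool described in this dissertation: the regular expressions, the "refanging" substitutions, and the names `refang` and `extract_iocs` are all assumptions introduced here.

```python
import ipaddress
import re

# Hypothetical "defanging" conventions often seen in shared threat reports;
# this list is an assumption for illustration, not taken from the dissertation.
_REFANG_RULES = [("hxxp", "http"), ("[.]", "."), ("(.)", "."), ("[:]", ":")]

# Deliberately simple, illustrative patterns.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_RE = re.compile(r"\bhttps?://[^\s<>\"']+", re.IGNORECASE)
# Hex strings of length 64, 40, or 32 (SHA-256, SHA-1, MD5 sized).
HASH_RE = re.compile(r"\b(?:[a-fA-F0-9]{64}|[a-fA-F0-9]{40}|[a-fA-F0-9]{32})\b")


def refang(text):
    """Undo common obfuscations such as 'hxxp' and '[.]'."""
    for defanged, plain in _REFANG_RULES:
        text = text.replace(defanged, plain)
    return text


def extract_iocs(text):
    """Return candidate IoCs found in one post, grouped by type."""
    clean = refang(text)

    ips = set()
    for candidate in IPV4_RE.findall(clean):
        try:
            # ipaddress rejects out-of-range octets such as 999.1.1.1.
            ips.add(str(ipaddress.IPv4Address(candidate)))
        except ValueError:
            continue

    # Trim punctuation that commonly trails a URL inside a sentence.
    urls = {u.rstrip(".,;)") for u in URL_RE.findall(clean)}
    hashes = {h.lower() for h in HASH_RE.findall(clean)}

    return {"ipv4": sorted(ips), "urls": sorted(urls), "hashes": sorted(hashes)}


if __name__ == "__main__":
    post = ("C2 at 203.0.113[.]5 and hxxp://evil.example[.]com/payload, "
            "md5 d41d8cd98f00b204e9800998ecf8427e")
    print(extract_iocs(post))
```

In a pipeline like the one outlined above, candidates produced this way would still need validation, for example by the IP and URL predictive models, since pattern matching alone cannot tell a malicious address from a benign one.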
