Abstract

As science advances, the academic community has published millions of research papers. Researchers devote considerable time and effort to searching for relevant manuscripts, whether to write a paper or simply to keep up with current research. In this dissertation, we consider the problem of citation recommendation on graphs. Our analysis shows that the degrees of cited papers in the subgraph induced by the citations of a paper, called the projection graph, follow a power-law distribution. Existing popular methods are good only at finding the long-tail papers, that is, the ones that are highly connected to others. In other words, the majority of cited papers are only loosely connected in the projection graph and are missed by existing methods. To address this problem, we propose a family of random-walk-based algorithms that combine author, venue, and keyword information to interpret the citation behavior behind those loosely connected papers. We further explore neural node embeddings of the graph for citation recommendation; the proposed task-specific sampling strategy turns out to be much more robust than classic methods as the hidden ratio changes. Finally, with the aim of improving metadata quality, we present an algorithm for extracting keyphrases from scientific articles that addresses overgeneration errors and outperforms state-of-the-art approaches.
