"Paper, Meet Code": A Deep Learning Approach to Linking Scholarly Articles With GitHub Repositories
Prahyat Puangjaktha, Morakot Choetkiertikul, Suppawong Tuarob
Abstract
Computer scientists often publish the source code accompanying their publications, most prominently in code repositories across various domains. Although a scholarly article and its official code repository frequently coexist, explicit references linking the two are often missing. Determining whether an article and a repository belong to the same research project has traditionally required manual inspection, which is time-consuming. This paper proposes a deep learning-based algorithm that automatically matches scholarly articles with their corresponding official code repositories. Our findings indicate that the most common linking information is the paper title and BibTeX entries, typically found in the repository's README. We use SPECTER to embed paper and repository metadata as vectors, providing a robust automated solution for linking these valuable academic resources.
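To make the matching idea concrete, here is a minimal sketch of ranking candidate repositories by the similarity between a paper's metadata and each repository's README. It is an illustration under stated assumptions, not the authors' implementation: the `embed` function is a simple bag-of-words stand-in for SPECTER embeddings, and the function names and the `example/...` repository names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase token counts.

    Assumption: the paper uses SPECTER vectors here; bag-of-words is
    swapped in only so the sketch runs without downloading a model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_repositories(paper_text: str, readmes: dict) -> list:
    """Return (repository, score) pairs, most similar first."""
    paper_vec = embed(paper_text)
    scores = [(name, cosine(paper_vec, embed(text))) for name, text in readmes.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: a paper title and two candidate README texts.
paper = (
    "Paper, Meet Code: A Deep Learning Approach to Linking "
    "Scholarly Articles With GitHub Repositories"
)
readmes = {
    "example/paper-meet-code": (
        "Official code for Paper, Meet Code: linking scholarly articles "
        "with GitHub repositories using deep learning."
    ),
    "example/image-classifier": (
        "A convolutional network for classifying images of cats and dogs."
    ),
}

ranking = rank_repositories(paper, readmes)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Replacing `embed` with a learned document encoder would leave the ranking logic unchanged, since only the vector representation differs.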
Cite this work
@article{ paper_meet_code,
title={ "Paper, Meet Code": A Deep Learning Approach to Linking Scholarly Articles With GitHub Repositories },
author={ Prahyat Puangjaktha and Morakot Choetkiertikul and Suppawong Tuarob },
journal={ IEEE Access },
year={ 2024 },
doi={ 10.1109/ACCESS.2024.3399767 },
url={ https://prayat-pu.github.io/mike-lab/publications/paper-meet-code-a-deep-learning-approach-to-linking-scholarly-articles-with-github-repositories/ }
}