Discovering Vulnerabilities and Patches for Open Source Security

Gunkel, Tamara; Hupperich, Thomas


Abstract
Open source software is used in numerous systems, and security vulnerabilities in such software often affect many targets at once. Hence, it is crucial to find security vulnerabilities as soon as possible. A convenient method to check software for vulnerabilities is to run a static code analysis tool before deployment. However, verifying the reliability of such tools requires real-world data that includes labeled vulnerable and non-vulnerable code.
This paper introduces an approach to automatically create and enhance a labeled data set of open source projects. The ground truth of vulnerabilities is extracted from up-to-date CVEs. We identify repositories related to known vulnerabilities, select vulnerable versions, and take patch commits into account. In this context, we utilize Gradient Boosting based on regression trees as a meta classifier for associating patch commits with CWE categories. Given the high precision of this matching, we give insights into the impact of certain vulnerabilities and a general overview of open source code security. Our findings may be used for future studies, such as the impact of certain code design criteria, e.g., clean code, on the prevalence of vulnerabilities.

Keywords
Web Security, Data Set Generation, Commit Classification



Publication type
Research article in edited proceedings (conference)

Peer reviewed
Yes

Publication status
Published

Year
2022

Conference
International Conference on Software Technologies (ICSOFT)

Conference location
Lisbon

Book title
Proceedings of the 17th International Conference on Software Technologies

Editor
SciTePress

Publisher
SciTePress

Place of publication
Lisbon, Portugal

Language
English

ISSN
2184-2833