The code used for this experiment is in the multipleCPEexperiment.py file.
A sample of the data is in the MultipleCPEresults.json file.
Experiment 1: Use the full list of CPEs from the NVD-Join-OSV database that correspond to a specific, versionless PURL.
Test method: Given a purl, look up its list of CPEs in purl2cpe_mapping.csv, then collect from NVD the CVEs related to these CPEs. The hypothesis is that this yields the same set of vulnerabilities we would get from searching the OSV entries for the purl.
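The test method can be sketched roughly as follows. This is a minimal illustration and not the code in multipleCPEexperiment.py: the `purl` and `cpe` column names for purl2cpe_mapping.csv are assumptions, and `fetch_cves` is a hypothetical callable standing in for the NVD lookup.

```python
import csv
from typing import Callable, Dict, Iterable, List, Set


def cpes_for_purl(mapping_lines: Iterable[str], purl: str) -> List[str]:
    """Return every CPE mapped to the given purl.

    Assumption: the mapping CSV has a header row with 'purl' and 'cpe' columns.
    """
    reader = csv.DictReader(mapping_lines)
    return [row["cpe"] for row in reader if row["purl"] == purl]


def compare_cve_sets(
    cpes: List[str],
    fetch_cves: Callable[[str], Set[str]],  # hypothetical NVD lookup, one CPE -> CVE IDs
    osv_cves: Set[str],
) -> Dict[str, Set[str]]:
    """Compare the CVEs reached through the CPEs with the CVEs OSV reports for the purl."""
    per_cpe = [fetch_cves(cpe) for cpe in cpes]
    union = set().union(*per_cpe)
    # set.intersection needs at least one set, so guard the empty case.
    intersection = set.intersection(*per_cpe) if per_cpe else set()
    return {
        "union": union,
        "intersection": intersection,
        # CVEs NVD returns through some CPE that OSV does not list for the purl.
        "union_false_positives": union - osv_cves,
        # CVEs OSV lists for the purl that are not shared by all CPEs.
        "intersection_false_negatives": osv_cves - intersection,
    }
```

With a real mapping file, `cpes_for_purl(open("purl2cpe_mapping.csv", newline=""), purl)` would supply the CPE list. The hypothesis holds only if both the false-positive and false-negative sets come back empty.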
Result: Not a viable solution.
- We do not get the same set of vulnerabilities.
- The CVE list generated from NVD is huge: we get far too many false positives.
- The intersection of the per-CPE CVE sets is empty: false negatives.
Explanation:
- The CPEs refer to different products. These products may have many different CVEs, so the union of these CVEs is large and contains many false positives.
- The full list of CPEs corresponds to different versions of the package (purl), each of which may have a different set of CVEs. Thus the union of CVEs is large (it spans different versions).
- The intersection of CVEs is empty because the CPEs refer to different products and versions of the package (purl) that have different CVEs. Some of these CVEs are relevant to the purl, but the per-CPE groups do not necessarily overlap.
Experiment idea: similar to experiment 1, but overcoming the problems that arise from using an unversioned purl.
- Test method: the analysis was done manually.
- Package info:
- PURL: pkg:pypi/cryptography
- Selected a recent vulnerable version: 39.0.0
- Vulnerable to CVE-2023-23931, CVE-2023-0286
- cryptography CPE: `cpe:2.3:a:cryptography_project:cryptography:*:*:*:*:*:python:*:*`
- openssl CPE: `cpe:2.3:a:openssl:openssl:*:*:*:*:*:*:*:*`
Result: Not a viable solution.
Explanation:
- CVE-2023-0286 is related to the openssl product, and CVE-2023-23931 is related to the cryptography package.
- Querying NVD for the CVEs of openssl results in a huge number of vulnerabilities, most of them not relevant to the cryptography package.
- There is no way to tell from NVD that CVE-2023-0286 is related to the cryptography package.
- There is no overlap: querying NVD for the cryptography CPE does not return CVE-2023-0286, so the intersection of the CVE sets is empty.
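The manual check can be approximated in code. The sketch below assumes the NVD CVE API 2.0 endpoint and its `virtualMatchString` parameter for wildcarded CPE strings, and the `vulnerabilities[].cve.id` layout of its JSON response. The function names are hypothetical. Paging beyond the first page, rate limiting, and API keys are left out.

```python
from typing import Any, Dict, Set
from urllib.parse import urlencode

# Assumed endpoint of the NVD CVE API 2.0.
NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def nvd_query_url(cpe: str, start_index: int = 0) -> str:
    """Build a query URL for CVEs matching a (possibly wildcarded) CPE string."""
    params = {"virtualMatchString": cpe, "startIndex": start_index}
    return f"{NVD_CVE_API}?{urlencode(params)}"


def cve_ids(nvd_response: Dict[str, Any]) -> Set[str]:
    """Extract the CVE IDs from one page of an NVD CVE API response."""
    return {item["cve"]["id"] for item in nvd_response.get("vulnerabilities", [])}
```

Fetching the JSON for the cryptography CPE and for the openssl CPE and then evaluating `cve_ids(cryptography_page) & cve_ids(openssl_page)` reproduces the overlap check described above.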
This is an interesting case: OSV has GHSA-x4qr-2fvf-3mr5, which is listed as a vulnerability of the cryptography package because the package includes openssl. Here GitHub (through OSV) plays the role of NVD: it publishes the vulnerable product, not only the vulnerable package. The ramification is that a package vulnerability may actually be a vulnerability of one of its dependencies, which makes it more complicated to rely on the OSV data.
Note: openssl is not a dependency of the Python code of cryptography; it is part of the built distribution (the wheel).
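For comparison, the OSV side of this case can be inspected with a versioned purl. The sketch below assumes the public OSV query endpoint (`POST https://api.osv.dev/v1/query`), which accepts a purl that carries the version, and assumes the `vulns[].id` and `vulns[].aliases` fields of its response. The helper names are hypothetical.

```python
import json
from typing import Any, Dict, Set
from urllib.request import Request, urlopen

# Assumed endpoint of the public OSV query API.
OSV_QUERY_API = "https://api.osv.dev/v1/query"


def osv_query_payload(purl: str, version: str) -> Dict[str, Any]:
    """Build an OSV query for one package version, identified by a versioned purl."""
    return {"package": {"purl": f"{purl}@{version}"}}


def query_osv(purl: str, version: str) -> Dict[str, Any]:
    """POST the query to OSV and return the decoded JSON (requires network access)."""
    body = json.dumps(osv_query_payload(purl, version)).encode("utf-8")
    request = Request(
        OSV_QUERY_API, data=body, headers={"Content-Type": "application/json"}
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def cves_by_advisory(osv_response: Dict[str, Any]) -> Dict[str, Set[str]]:
    """Map each OSV advisory ID to the CVE IDs listed among its aliases."""
    return {
        vuln["id"]: {a for a in vuln.get("aliases", []) if a.startswith("CVE-")}
        for vuln in osv_response.get("vulns", [])
    }
```

Calling `cves_by_advisory(query_osv("pkg:pypi/cryptography", "39.0.0"))` lists the advisories OSV attaches to that version, which can then be compared against the CVE sets obtained from NVD through the CPEs.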
- The simplistic approach of taking all the CPEs that relate to a purl, where the relation is generated by the OSV-NVD join, does not work.
- This approach has limitations:
- CPEs are about products, not packages. These products may have many vulnerabilities.
- NVD and the OSV sources (in our example, GHSA) have different, and possibly inconsistent, approaches for associating CVEs with products and packages. Thus there seem to be inherent limitations to this approach.
- Even if we found a way to create an exact fit between a subset of CPEs and a purl, this fit might not hold for future evaluations, because more CVEs may be discovered for the products.