Downloading and analyzing NVD CVE feed
In a previous post, “New National Vulnerability Database visualizations and feeds”, I mentioned the JSON NVD feed.
Let’s see what data it contains and how to download and analyse it. First of all, we need to download all the files with CVEs from the NVD database and save them to some directory.
Unfortunately, there is no way to download all the content at once; only per-year archives are available. So we need to get the URLs first. A URL looks like this: https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2017.json.zip. Then we will download them all.
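import os
import re
import requests

# Make sure the target directory exists before writing archives into it
os.makedirs("nvd", exist_ok=True)

# The data feeds page lists the per-year archives; collect their file names
r = requests.get('https://nvd.nist.gov/vuln/data-feeds#JSON_FEED')
for filename in re.findall(r"nvdcve-1\.1-[0-9]*\.json\.zip", r.text):
    print(filename)
    url = "https://nvd.nist.gov/feeds/json/cve/1.1/" + filename
    print(url)
    # Stream each archive to disk chunk by chunk
    r_file = requests.get(url, stream=True)
    with open("nvd/" + filename, 'wb') as f:
        for chunk in r_file:
            f.write(chunk)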
upd. 11.07.2022 Updated the code for feed version 1.1
upd. 19.02.2022 Fixed the url
Output:
nvdcve-1.1-2023.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2023.json.zip
nvdcve-1.1-2022.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2022.json.zip
nvdcve-1.1-2021.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2021.json.zip
...
nvdcve-1.1-2002.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2002.json.zip
OK, now that we have the files in the nvd/ directory, we can easily parse and analyse them.
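from os import listdir
from os.path import isfile, join
import zipfile
import json

# Collect the downloaded archives in year order
files = [f for f in listdir("nvd/") if isfile(join("nvd/", f))]
files.sort()
for file in files:
    # Each archive contains a single JSON file with one year of CVEs
    with zipfile.ZipFile(join("nvd/", file), 'r') as archive:
        with archive.open(archive.namelist()[0]) as jsonfile:
            cve_dict = json.loads(jsonfile.read())
    # cve_dict now holds the feed for this archive; process it here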
All the necessary content will be in cve_dict, and if we run print(cve_dict.keys()), we get:
[u'CVE_data_timestamp', u'CVE_data_version', u'CVE_Items', u'CVE_data_format', u'CVE_data_numberOfCVEs', u'CVE_data_type']
The CVE data is in the cve_dict['CVE_Items'] list; the other parameters are for information only:
print("CVE_data_timestamp: " + str(cve_dict['CVE_data_timestamp']))
print("CVE_data_version: " + str(cve_dict['CVE_data_version']))
print("CVE_data_format: " + str(cve_dict['CVE_data_format']))
print("CVE_data_numberOfCVEs: " + str(cve_dict['CVE_data_numberOfCVEs']))
print("CVE_data_type: " + str(cve_dict['CVE_data_type']))
Output for nvdcve-1.0-2017.json.zip:
CVE_data_timestamp: 2017-09-30T07:02Z
CVE_data_version: 4.0
CVE_data_format: MITRE
CVE_data_numberOfCVEs: 7583
CVE_data_type: CVE
OK. Now let’s see what a CVE item looks like with print(json.dumps(cve_dict['CVE_Items'][0], sort_keys=True, indent=4, separators=(',', ': '))):
{
    "configurations": {
        "CVE_data_version": "4.0",
        "nodes": [
            {
                "cpe": [
                    {
                        "cpe23Uri": "cpe:2.3:a:microsoft:word:2016:*:*:*:*:*:*:*",
                        "cpeMatchString": "cpe:/a:microsoft:word:2016",
                        "vulnerable": true
                    }
                ],
                "operator": "OR"
            }
        ]
    },
    "cve": {
        "CVE_data_meta": {
            "ID": "CVE-2017-0019"
        },
        "affects": {
            "vendor": {
                "vendor_data": [
                    {
                        "product": {
                            "product_data": [
                                {
                                    "product_name": "word",
                                    "version": {
                                        "version_data": [
                                            {
                                                "version_value": "2016"
                                            }
                                        ]
                                    }
                                }
                            ]
                        },
                        "vendor_name": "microsoft"
                    }
                ]
            }
        },
        "data_format": "MITRE",
        "data_type": "CVE",
        "data_version": "4.0",
        "description": {
            "description_data": [
                {
                    "lang": "en",
                    "value": "Microsoft Word 2016 allows remote attackers to execute arbitrary code or cause a denial of service (memory corruption) via a crafted document, aka \"Microsoft Office Memory Corruption Vulnerability.\" This vulnerability is different from those described in CVE-2017-0006, CVE-2017-0020, CVE-2017-0030, CVE-2017-0031, CVE-2017-0052, and CVE-2017-0053."
                }
            ]
        },
        "problemtype": {
            "problemtype_data": [
                {
                    "description": [
                        {
                            "lang": "en",
                            "value": "CWE-119"
                        }
                    ]
                }
            ]
        },
        "references": {
            "reference_data": [
                {
                    "url": "http://www.securityfocus.com/bid/96042"
                },
                {
                    "url": "http://www.securitytracker.com/id/1038010"
                },
                {
                    "url": "https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/CVE-2017-0019"
                }
            ]
        }
    },
    "impact": {
        "baseMetricV2": {
            "cvssV2": {
                "accessComplexity": "MEDIUM",
                "accessVector": "NETWORK",
                "authentication": "NONE",
                "availabilityImpact": "COMPLETE",
                "baseScore": 9.3,
                "confidentialityImpact": "COMPLETE",
                "integrityImpact": "COMPLETE",
                "vectorString": "(AV:N/AC:M/Au:N/C:C/I:C/A:C)"
            },
            "exploitabilityScore": 8.6,
            "impactScore": 10.0,
            "obtainAllPrivilege": false,
            "obtainOtherPrivilege": false,
            "obtainUserPrivilege": false,
            "severity": "HIGH",
            "userInteractionRequired": true
        },
        "baseMetricV3": {
            "cvssV3": {
                "attackComplexity": "LOW",
                "attackVector": "LOCAL",
                "availabilityImpact": "HIGH",
                "baseScore": 7.8,
                "baseSeverity": "HIGH",
                "confidentialityImpact": "HIGH",
                "integrityImpact": "HIGH",
                "privilegesRequired": "NONE",
                "scope": "UNCHANGED",
                "userInteraction": "REQUIRED",
                "vectorString": "AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"
            },
            "exploitabilityScore": 1.8,
            "impactScore": 5.9
        }
    },
    "lastModifiedDate": "2017-07-12T01:29Z",
    "publishedDate": "2017-03-17T00:59Z"
}
Well, I am interested in the formalized data about vulnerable software products (CPEs) and the criticality description (CVSS).
Talking about vulnerable software products, you can see that this information exists both in “configurations” and in “cve->affects”. Probably “configurations” is closer to detection criteria, while “cve->affects” is a list of vulnerable software.
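To compare the two places side by side, here is a minimal sketch that pulls the CPE URIs out of “configurations” and the vendor/product/version triples out of “cve->affects”. The function names and the trimmed sample_item are made up for illustration. The "cpe" key matches the item dump in this post; the "cpe_match" and "children" keys are an assumption about the newer 1.1 layout, so check them against your own files.

```python
def configuration_cpes(item):
    """Collect cpe23Uri strings flagged as vulnerable in item['configurations']."""
    found = []

    def walk(node):
        # "cpe" is the key used in the dump in this post; "cpe_match" and
        # "children" are assumed names for the 1.1 feed layout.
        for match in node.get("cpe", []) + node.get("cpe_match", []):
            if match.get("vulnerable"):
                found.append(match["cpe23Uri"])
        for child in node.get("children", []):
            walk(child)

    for node in item.get("configurations", {}).get("nodes", []):
        walk(node)
    return found


def affected_products(item):
    """Collect (vendor, product, version) tuples from item['cve']['affects']."""
    rows = []
    vendors = item.get("cve", {}).get("affects", {}).get("vendor", {})
    for vendor in vendors.get("vendor_data", []):
        for product in vendor.get("product", {}).get("product_data", []):
            for version in product.get("version", {}).get("version_data", []):
                rows.append((vendor["vendor_name"],
                             product["product_name"],
                             version["version_value"]))
    return rows


# Trimmed copy of the CVE-2017-0019 item, kept only for the demonstration
sample_item = {
    "configurations": {"nodes": [{
        "operator": "OR",
        "cpe": [{"cpe23Uri": "cpe:2.3:a:microsoft:word:2016:*:*:*:*:*:*:*",
                 "cpeMatchString": "cpe:/a:microsoft:word:2016",
                 "vulnerable": True}],
    }]},
    "cve": {"affects": {"vendor": {"vendor_data": [{
        "vendor_name": "microsoft",
        "product": {"product_data": [{
            "product_name": "word",
            "version": {"version_data": [{"version_value": "2016"}]},
        }]},
    }]}}},
}

print(configuration_cpes(sample_item))
print(affected_products(sample_item))
```

Both helpers return empty lists when a section is missing, so they can be run over a whole year file without extra checks.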
Anyway, this is a good example of why you can’t use this data for vulnerability detection in many cases. Let’s say cpe:/a:microsoft:word:2016 is vulnerable. A patched version of MS Word will have the same CPE ID, cpe:/a:microsoft:word:2016. It’s good to know which software may be affected by the CVE, but for detection it simply won’t be enough.
But sometimes it’s pretty good. Like this one for a Skype vulnerability:
[{u'operator': u'OR', u'cpe': [{u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.2:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.2', u'vulnerable': True}, {u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.35:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.35', u'vulnerable': True}, {u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.36:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.36', u'vulnerable': True}]}]
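The difference between the Word and Skype cases can be illustrated with a deliberately naive check. This is only a sketch: is_listed is a hypothetical helper doing an exact string comparison, not real CPE matching, and the Skype 7.40 CPE is an invented example of a version that is not in the list.

```python
def is_listed(installed_cpe, vulnerable_cpes):
    """Naive check: exact string match against the CPE URIs listed for a CVE."""
    return installed_cpe in vulnerable_cpes


word_cpes = {"cpe:/a:microsoft:word:2016"}
skype_cpes = {
    "cpe:/a:microsoft:skype:7.2",
    "cpe:/a:microsoft:skype:7.35",
    "cpe:/a:microsoft:skype:7.36",
}

# Patched and unpatched Word 2016 share one CPE, so both would "match".
word_result = is_listed("cpe:/a:microsoft:word:2016", word_cpes)

# Skype CPEs carry the version, so a version outside the list does not match.
skype_listed = is_listed("cpe:/a:microsoft:skype:7.36", skype_cpes)
skype_other = is_listed("cpe:/a:microsoft:skype:7.40", skype_cpes)  # hypothetical version

print(word_result, skype_listed, skype_other)
```

For Word the answer is the same whether or not the patch is installed, while for Skype the version in the CPE is enough to tell listed builds from others.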
Let’s see how many CVEs have information about products (filtering out CVEs with “** REJECT **” in the description):
year,with_cpe,without_cpe
2002,6540,127
2003,1496,3
2004,2632,10
2005,4613,1
2006,6983,0
2007,6442,0
2008,6988,0
2009,4858,0
2010,4928,2
2011,4382,1
2012,5134,1
2013,5616,1
2014,7649,7
2015,7026,59
2016,8027,4
2017,7315,191
As you can see, there are no CPEs for some very old vulnerabilities and for those that are still being processed.
But for the majority of vulnerabilities, CPE data is present in some form.
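The post does not show the code behind these numbers, so here is one possible way to count them. This sketch assumes a CVE "has CPE data" when its “configurations” section has at least one node; the original criterion may differ. The function name and the sample items are illustrative.

```python
def count_cpe_coverage(cve_items):
    """Return (with_cpe, without_cpe), skipping rejected CVEs."""
    with_cpe = 0
    without_cpe = 0
    for item in cve_items:
        descriptions = item["cve"]["description"]["description_data"]
        # Rejected entries carry "** REJECT **" in the description text
        if any("** REJECT **" in d["value"] for d in descriptions):
            continue
        # Assumption: a non-empty "nodes" list means product data is present
        nodes = item.get("configurations", {}).get("nodes", [])
        if nodes:
            with_cpe += 1
        else:
            without_cpe += 1
    return with_cpe, without_cpe


def make_item(description, nodes):
    """Build a minimal CVE item for the demonstration."""
    return {
        "cve": {"description": {"description_data": [{"lang": "en", "value": description}]}},
        "configurations": {"nodes": nodes},
    }


sample_items = [
    make_item("Some vulnerability in a product.", [{"operator": "OR", "cpe": []}]),
    make_item("A vulnerability still being analysed.", []),
    make_item("** REJECT ** DO NOT USE THIS CANDIDATE NUMBER.", []),
]

print(count_cpe_coverage(sample_items))
```

Calling it on cve_dict['CVE_Items'] inside the parsing loop would give one row per year file.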
We can also look at the situation with CVSS: how many CVEs have only CVSS v2, both CVSS v2 and v3, or no CVSS data at all, judging by “baseMetricV2” and “baseMetricV3” in item['impact']:
year,CVSS v2 only,CVSS v2 and v3,no CVSS
2002,6665,2,0
2003,1498,1,0
2004,2641,1,0
2005,4613,1,0
2006,6981,2,0
2007,6436,6,0
2008,6986,2,0
2009,4853,5,0
2010,4913,15,2
2011,4370,12,1
2012,5110,24,1
2013,5575,42,0
2014,7275,374,7
2015,5434,1592,59
2016,206,7821,4
2017,0,7315,191
As you can see, the switch to CVSS v3 is going well: for 2016 and 2017 nearly all scored CVEs have both v2 and v3 data.
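A count like this can be sketched by checking which of the two keys is present in item['impact']. The function name, the bucket labels and the sample items are illustrative; the "v3 only" bucket is not in the table and is added only so that no item is silently dropped.

```python
from collections import Counter


def classify_cvss(item):
    """Label a CVE item by which CVSS base metrics it carries."""
    impact = item.get("impact", {})
    has_v2 = "baseMetricV2" in impact
    has_v3 = "baseMetricV3" in impact
    if has_v2 and has_v3:
        return "v2 and v3"
    if has_v2:
        return "v2 only"
    if has_v3:
        return "v3 only"  # extra bucket, not present in the table
    return "no CVSS"


# Minimal items: only the keys the classifier looks at
sample_items = [
    {"impact": {"baseMetricV2": {}}},
    {"impact": {"baseMetricV2": {}, "baseMetricV3": {}}},
    {"impact": {"baseMetricV2": {}, "baseMetricV3": {}}},
    {"impact": {}},
]

counts = Counter(classify_cvss(item) for item in sample_items)
print(counts["v2 only"], counts["v2 and v3"], counts["no CVSS"])
```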
It may also be interesting to look at the site references in CVE items and to classify these sites. But I will probably do that next time.
Hi! My name is Alexander and I am a Vulnerability Management specialist. You can read more about me here. Currently, the best way to follow me is my Telegram channel @avleonovcom. I update it more often than this site. If you haven’t used Telegram yet, give it a try. It’s great. You can discuss my posts or ask questions at @avleonovchat.
And I invite all Russian speakers to one more Telegram channel, @avleonovrus; that is where I post first now.
Is it possible to convert the JSON to a CSV or XLSX file in a format like the one CVE MITRE uses?
-> Name, ID, Description,…,
-> Name,ID, Description,…
…
Hello, the content above is very good, but how can I use it for scanning source code? Say I have a zipped source code file on my local system and I want to check how vulnerable the source code is.
Let’s say I downloaded the source code of some open-source software, such as Notepad++, from GitHub. How can I scan the folders and see the result?
Please help. Thanks in advance.
Hi, when you said that “Probably the “configurations” is more like detection criteria and “cve->affects” are the lists of vulnerable software.”, did you deduce this information or is it an official statement by NIST or NVD guys?
How would you retrieve just the ID and description from cve_dict['CVE_Items']?
I mean, like ID, CVE-2019-xxxxx,
description, the description,
Or something showing key/value?
I know this is an old post, but hopefully you’re still looking.
Thanks!
print(cve_dict['CVE_Items'][0]['cve']['description']['description_data'][0]['value'])
print(cve_dict['CVE_Items'][0]['cve']['CVE_data_meta']['ID'])
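The same two lookups can be applied to every item rather than just the first, and written out as CSV, which also covers the earlier question about CSV export. This is only a sketch: write_cve_csv, the column names and the sample items are made up for illustration, and it takes the first description entry for each CVE.

```python
import csv
import io


def write_cve_csv(cve_items, out):
    """Write one 'ID,Description' row per CVE item to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["ID", "Description"])
    for item in cve_items:
        cve = item["cve"]
        cve_id = cve["CVE_data_meta"]["ID"]
        descriptions = cve["description"]["description_data"]
        # Take the first description entry, or an empty string if there is none
        text = descriptions[0]["value"] if descriptions else ""
        writer.writerow([cve_id, text])


# Two made-up items with the same shape as cve_dict['CVE_Items'] entries
sample_items = [
    {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0001"},
             "description": {"description_data": [{"lang": "en", "value": "First sample description"}]}}},
    {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0002"},
             "description": {"description_data": []}}},
]

buffer = io.StringIO()
write_cve_csv(sample_items, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open("cves.csv", "w", newline="") instead of the StringIO buffer; the csv module quotes descriptions that contain commas or quotes.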