Downloading and analyzing NVD CVE feed

In the previous post, “New National Vulnerability Database visualizations and feeds”, I mentioned the NVD JSON feed.

Let’s see what data it contains and how to download and analyse it. First of all, we need to download all the CVE files from the NVD database and save them to a directory.

Unfortunately, there is no way to download all the content at once; only per-year archives are available. So we need to get the archive URLs first. A URL looks like this: https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2017.json.zip. Then we will download them all.

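import os
import re

import requests

# The script saves the archives to nvd/, so make sure the directory exists
os.makedirs("nvd", exist_ok=True)

r = requests.get('https://nvd.nist.gov/vuln/data-feeds#JSON_FEED')
# Raw string, so that "\." reaches the regex engine unchanged
for filename in re.findall(r"nvdcve-1\.1-[0-9]*\.json\.zip", r.text):
    print(filename)
    url = "https://nvd.nist.gov/feeds/json/cve/1.1/" + filename
    print(url)
    r_file = requests.get(url, stream=True)
    with open("nvd/" + filename, 'wb') as f:
        for chunk in r_file:
            f.write(chunk)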

upd. 11.07.2022 Updated the code for feed version 1.1
upd. 19.02.2022 Fixed the url

Output:

nvdcve-1.1-2023.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2023.json.zip
nvdcve-1.1-2022.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2022.json.zip
nvdcve-1.1-2021.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2021.json.zip
...
nvdcve-1.1-2002.json.zip
https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2002.json.zip

Now that we have the files in the nvd/ directory, we can easily parse and analyse them.

from os import listdir
from os.path import isfile, join
import zipfile
import json

files = [f for f in listdir("nvd/") if isfile(join("nvd/", f))]
files.sort()
for file in files:
    # Each archive contains a single JSON file with all CVEs for that year
    with zipfile.ZipFile(join("nvd/", file), 'r') as archive:
        with archive.open(archive.namelist()[0]) as jsonfile:
            cve_dict = json.loads(jsonfile.read())
    # cve_dict now holds the parsed feed for this archive; analyse it here
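
If you want to process items one by one instead of working inside the loop, the same logic can be wrapped in a generator. This is only a sketch: the name iter_cve_items and the "*.json.zip" file pattern are my assumptions, not something the feed requires.

```python
import json
import zipfile
from pathlib import Path


def iter_cve_items(directory="nvd"):
    """Yield (archive_name, item) for every CVE item in every archive."""
    # iter_cve_items is a hypothetical helper name used for illustration.
    for path in sorted(Path(directory).glob("*.json.zip")):
        with zipfile.ZipFile(path) as archive:
            # Each yearly archive is expected to hold a single JSON file.
            with archive.open(archive.namelist()[0]) as jsonfile:
                cve_dict = json.load(jsonfile)
        for item in cve_dict["CVE_Items"]:
            yield path.name, item
```

Only one parsed archive is kept in memory at a time, and the caller gets the archive name together with each item, which is handy for per-year statistics.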

All the necessary content will be in cve_dict. If we run print(cve_dict.keys()), we get:

[u'CVE_data_timestamp', u'CVE_data_version', u'CVE_Items', u'CVE_data_format', u'CVE_data_numberOfCVEs', u'CVE_data_type']

The CVE data itself is in the cve_dict['CVE_Items'] list; the other fields are for information only:

print("CVE_data_timestamp: " + str(cve_dict['CVE_data_timestamp']))
print("CVE_data_version: " + str(cve_dict['CVE_data_version']))
print("CVE_data_format: " + str(cve_dict['CVE_data_format']))
print("CVE_data_numberOfCVEs: " + str(cve_dict['CVE_data_numberOfCVEs']))
print("CVE_data_type: " + str(cve_dict['CVE_data_type']))

Output for nvdcve-1.0-2017.json.zip:

CVE_data_timestamp: 2017-09-30T07:02Z
CVE_data_version: 4.0
CVE_data_format: MITRE
CVE_data_numberOfCVEs: 7583
CVE_data_type: CVE

Now let’s see what a CVE item looks like:

print(json.dumps(cve_dict['CVE_Items'][0], sort_keys=True, indent=4, separators=(',', ': ')))

{
    "configurations": {
        "CVE_data_version": "4.0",
        "nodes": [
            {
                "cpe": [
                    {
                        "cpe23Uri": "cpe:2.3:a:microsoft:word:2016:*:*:*:*:*:*:*",
                        "cpeMatchString": "cpe:/a:microsoft:word:2016",
                        "vulnerable": true
                    }
                ],
                "operator": "OR"
            }
        ]
    },
    "cve": {
        "CVE_data_meta": {
            "ID": "CVE-2017-0019"
        },
        "affects": {
            "vendor": {
                "vendor_data": [
                    {
                        "product": {
                            "product_data": [
                                {
                                    "product_name": "word",
                                    "version": {
                                        "version_data": [
                                            {
                                                "version_value": "2016"
                                            }
                                        ]
                                    }
                                }
                            ]
                        },
                        "vendor_name": "microsoft"
                    }
                ]
            }
        },
        "data_format": "MITRE",
        "data_type": "CVE",
        "data_version": "4.0",
        "description": {
            "description_data": [
                {
                    "lang": "en",
                    "value": "Microsoft Word 2016 allows remote attackers to execute arbitrary code or cause a denial of service (memory corruption) via a crafted document, aka \"Microsoft Office Memory Corruption Vulnerability.\" This vulnerability is different from those described in CVE-2017-0006, CVE-2017-0020, CVE-2017-0030, CVE-2017-0031, CVE-2017-0052, and CVE-2017-0053."
                }
            ]
        },
        "problemtype": {
            "problemtype_data": [
                {
                    "description": [
                        {
                            "lang": "en",
                            "value": "CWE-119"
                        }
                    ]
                }
            ]
        },
        "references": {
            "reference_data": [
                {
                    "url": "http://www.securityfocus.com/bid/96042"
                },
                {
                    "url": "http://www.securitytracker.com/id/1038010"
                },
                {
                    "url": "https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/CVE-2017-0019"
                }
            ]
        }
    },
    "impact": {
        "baseMetricV2": {
            "cvssV2": {
                "accessComplexity": "MEDIUM",
                "accessVector": "NETWORK",
                "authentication": "NONE",
                "availabilityImpact": "COMPLETE",
                "baseScore": 9.3,
                "confidentialityImpact": "COMPLETE",
                "integrityImpact": "COMPLETE",
                "vectorString": "(AV:N/AC:M/Au:N/C:C/I:C/A:C)"
            },
            "exploitabilityScore": 8.6,
            "impactScore": 10.0,
            "obtainAllPrivilege": false,
            "obtainOtherPrivilege": false,
            "obtainUserPrivilege": false,
            "severity": "HIGH",
            "userInteractionRequired": true
        },
        "baseMetricV3": {
            "cvssV3": {
                "attackComplexity": "LOW",
                "attackVector": "LOCAL",
                "availabilityImpact": "HIGH",
                "baseScore": 7.8,
                "baseSeverity": "HIGH",
                "confidentialityImpact": "HIGH",
                "integrityImpact": "HIGH",
                "privilegesRequired": "NONE",
                "scope": "UNCHANGED",
                "userInteraction": "REQUIRED",
                "vectorString": "AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"
            },
            "exploitabilityScore": 1.8,
            "impactScore": 5.9
        }
    },
    "lastModifiedDate": "2017-07-12T01:29Z",
    "publishedDate": "2017-03-17T00:59Z"
}

Well, I am interested in formalized data about vulnerable software products (CPEs) and criticality description (CVSS).
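
Pulling just these fields out of an item could look roughly like this. It is a minimal sketch based on the item printed above: summarize_item is a hypothetical helper name, the "cpe_match" key is my assumption about how newer feed versions name the CPE list, and nested child nodes are ignored.

```python
def summarize_item(item):
    """Return the ID, vulnerable CPE 2.3 URIs and CVSS base scores of an item."""
    cpes = []
    for node in item.get("configurations", {}).get("nodes", []):
        # "cpe" is the key in the item shown above; "cpe_match" is an
        # assumption about how newer feed versions name the same list.
        for match in node.get("cpe", []) + node.get("cpe_match", []):
            if match.get("vulnerable"):
                cpes.append(match["cpe23Uri"])
    impact = item.get("impact", {})
    return {
        "id": item["cve"]["CVE_data_meta"]["ID"],
        "cpes": cpes,
        # A missing metric gives None instead of raising KeyError.
        "cvss_v2": impact.get("baseMetricV2", {}).get("cvssV2", {}).get("baseScore"),
        "cvss_v3": impact.get("baseMetricV3", {}).get("cvssV3", {}).get("baseScore"),
    }


# A trimmed copy of the CVE-2017-0019 item shown above.
sample = {
    "configurations": {"nodes": [{
        "operator": "OR",
        "cpe": [{
            "cpe23Uri": "cpe:2.3:a:microsoft:word:2016:*:*:*:*:*:*:*",
            "cpeMatchString": "cpe:/a:microsoft:word:2016",
            "vulnerable": True,
        }],
    }]},
    "cve": {"CVE_data_meta": {"ID": "CVE-2017-0019"}},
    "impact": {
        "baseMetricV2": {"cvssV2": {"baseScore": 9.3}},
        "baseMetricV3": {"cvssV3": {"baseScore": 7.8}},
    },
}
print(summarize_item(sample))
```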

Talking about vulnerable software products, you can see that this information exists both in “configurations” and in “cve->affects”. Probably the “configurations” is more like detection criteria and “cve->affects” are the lists of vulnerable software.

Anyway, this is a good example of why you can’t use this data for vulnerability detection in many cases. Let’s say cpe:/a:microsoft:word:2016 is vulnerable. A patched version of MS Word will also have the same CPE ID, cpe:/a:microsoft:word:2016. It’s good to know what software may be affected by the CVE, but for detection it simply won’t be enough.

But sometimes it’s pretty good. Like this one for a Skype vulnerability:
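
To make the problem concrete, here is a toy illustration. The inventory, the host names and the "patched" flag are all made up for the example; the point is only that a comparison by CPE ID cannot see the patch level.

```python
vulnerable_cpe = "cpe:/a:microsoft:word:2016"

# Hypothetical inventory: two hosts with the same product, one of them patched.
inventory = {
    "host-a": {"cpe": "cpe:/a:microsoft:word:2016", "patched": False},
    "host-b": {"cpe": "cpe:/a:microsoft:word:2016", "patched": True},
}

# Matching by CPE ID alone never looks at the patch state.
flagged = [host for host, sw in inventory.items() if sw["cpe"] == vulnerable_cpe]
print(flagged)  # both hosts are flagged, although host-b is already patched
```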

[{u'operator': u'OR', u'cpe': [{u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.2:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.2', u'vulnerable': True}, {u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.35:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.35', u'vulnerable': True}, {u'cpe23Uri': u'cpe:2.3:a:microsoft:skype:7.36:*:*:*:*:*:*:*', u'cpeMatchString': u'cpe:/a:microsoft:skype:7.36', u'vulnerable': True}]}]

Let’s see how many CVEs have information about products (filtering out CVEs with “** REJECT **” in the description):

year,with_cpe,without_cpe
2002,6540,127
2003,1496,3
2004,2632,10
2005,4613,1
2006,6983,0
2007,6442,0
2008,6988,0
2009,4858,0
2010,4928,2
2011,4382,1
2012,5134,1
2013,5616,1
2014,7649,7
2015,7026,59
2016,8027,4
2017,7315,191

As you can see, there are no CPEs for some very old vulnerabilities and for those that are still being processed.

But for the majority of vulnerabilities, CPE data is present in some form.
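
The counts in the CPE table can be obtained with something like the sketch below, run over the CVE_Items of each yearly file. The function names are hypothetical, and treating "has a CPE" as "at least one configuration node lists a CPE" is my assumption about a reasonable criterion.

```python
def is_rejected(item):
    """True if any description of the item contains the "** REJECT **" marker."""
    descriptions = item["cve"]["description"]["description_data"]
    return any("** REJECT **" in d["value"] for d in descriptions)


def has_cpe(item):
    """True if at least one configuration node lists a CPE."""
    nodes = item.get("configurations", {}).get("nodes", [])
    # "cpe_match" is an assumed alternative key used by newer feed versions.
    return any(node.get("cpe") or node.get("cpe_match") for node in nodes)


def count_cpe(items):
    """Return (with_cpe, without_cpe) for the non-rejected items."""
    with_cpe = without_cpe = 0
    for item in items:
        if is_rejected(item):
            continue
        if has_cpe(item):
            with_cpe += 1
        else:
            without_cpe += 1
    return with_cpe, without_cpe
```

Calling count_cpe(cve_dict['CVE_Items']) for each archive gives one row of the table.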

We can also look at the situation with CVSS: how many CVEs have only CVSS v2, how many have both CVSS v2 and v3, and how many have no CVSS data at all. This is based on the presence of “baseMetricV2” and “baseMetricV3” in item['impact']:

year,only CVSS v2,CVSS v2 and v3,no CVSS
2002,6665,2,0
2003,1498,1,0
2004,2641,1,0
2005,4613,1,0
2006,6981,2,0
2007,6436,6,0
2008,6986,2,0
2009,4853,5,0
2010,4913,15,2
2011,4370,12,1
2012,5110,24,1
2013,5575,42,0
2014,7275,374,7
2015,5434,1592,59
2016,206,7821,4
2017,0,7315,191

As you can see, the switch to CVSS v3 is going well.

It may also be interesting to look at the site references in CVE items and to classify these sites. But I will probably do that next time.
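
The CVSS statistics can be sketched in a similar way. The row totals match the CPE table, so I assume the same “** REJECT **” filter applies. The function names are hypothetical, and the "v3 only" category is an extra that does not appear in the table.

```python
from collections import Counter


def is_rejected(item):
    """True if any description of the item contains the "** REJECT **" marker."""
    descriptions = item["cve"]["description"]["description_data"]
    return any("** REJECT **" in d["value"] for d in descriptions)


def cvss_category(item):
    """Classify an item by which CVSS metrics are present in item['impact']."""
    impact = item.get("impact", {})
    has_v2 = "baseMetricV2" in impact
    has_v3 = "baseMetricV3" in impact
    if has_v2 and has_v3:
        return "v2 and v3"
    if has_v2:
        return "v2 only"
    if has_v3:
        return "v3 only"  # not a column in the table; kept for completeness
    return "no CVSS"


def count_cvss(items):
    """Count the non-rejected items per CVSS category."""
    return Counter(cvss_category(i) for i in items if not is_rejected(i))
```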

Alexander Leonov

Hi! My name is Alexander and I am a Vulnerability Management specialist. You can read more about me here. Currently, the best way to follow me is my Telegram channel @avleonovcom. I update it more often than this site. If you haven’t used Telegram yet, give it a try. It’s great. You can discuss my posts or ask questions at @avleonovchat.

And I invite all Russian speakers to one more Telegram channel, @avleonovrus; that is where I now post first.

12 thoughts on “Downloading and analyzing NVD CVE feed”

Mike November 15, 2017 at 4:15 pm

Is there a possibility to convert json to a csv or xlsx file with a format like cve mitre?:
-> Name, ID, Description,…,
-> Name,ID, Description,…
…

rinku March 8, 2019 at 10:06 am

hello the content above is very good, but how can i use it for scanning source code, if i have zipped source code file in my local system and i want to check how vulnerable the source code is.
let’s say i downloaded source code file form github of any opensource software such as notepad++, then how is can scan the folders and show the result to me.

Plz help thanks in advance.

Ramansh June 21, 2019 at 12:12 pm

Hi, when you said that “Probably the “configurations” is more like detection criteria and “cve->affects” are the lists of vulnerable software.”, did you deduce this information or is it an official statement by NIST or NVD guys?

Michele January 10, 2020 at 12:49 am

How would you retrieve just the ID and description from cve_dict[‘CVE_Items’]?

1. Michele January 10, 2020 at 12:51 am
  
  I mean, like ID, CVE-2019-xxxxx,
  description, the description,
  
  Or something showing key/value?
  
  I know this is an old post, but hopefully you’re still looking.
  
  Thanks!
  
  1. abha January 4, 2022 at 10:23 pm
    
    print(cve_dict['CVE_Items'][0]['cve']['description']['description_data'][0]['value'])
    print(cve_dict['CVE_Items'][0]['cve']['CVE_data_meta']['ID'])
    