How to create an NVD Scraper within your Obsidian Vault

Before getting into this write-up, it is important to mention a few requirements for this setup.

Basic Python knowledge is expected, and Python must already be installed on your system. You must also have Obsidian installed.

  1. An API key is also required, and you will need it for all the scrapers. You can request one from the NVD Developers page.
  2. The scraper itself is a short Python script. The code is given below; replace the Obsidian vault path and the API key value with your own for it to work.
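
One way to keep the key out of the script file is to read both values from environment variables. This is a minimal sketch and not part of the original script; the variable names `NVD_API_KEY` and `OBSIDIAN_VAULT` and the helper `load_settings` are assumptions chosen for illustration.

```python
import os


def load_settings(env=None):
    """Return (vault_path, api_key) read from environment variables.

    NVD_API_KEY and OBSIDIAN_VAULT are hypothetical names picked for this
    sketch; use whatever names suit your setup.
    """
    env = os.environ if env is None else env
    api_key = env.get("NVD_API_KEY", "").strip()
    vault_path = env.get("OBSIDIAN_VAULT", "").strip()
    if not api_key:
        raise RuntimeError("Set NVD_API_KEY before running the scraper")
    if not vault_path:
        raise RuntimeError("Set OBSIDIAN_VAULT to your vault folder")
    return vault_path, api_key
```

With this in place, the two hard-coded values in the script could be assigned with `VAULT_PATH, API_KEY = load_settings()`.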

Once you have these in place, install the one third-party package the script needs using pip. The other modules it imports (os, json, datetime and so on) are part of the Python standard library and cannot be installed with pip.

pip install requests

#!/usr/bin/python3
import requests
import json
import datetime
import os

# Obsidian vault path
VAULT_PATH = r"C:\Users\Bob\Obsidian"
BASE_NVD_FOLDER = os.path.join(VAULT_PATH, "01_Vulnerability Data", "NVD")
CANVAS_FOLDER = os.path.join(VAULT_PATH, ".obsidian", "canvas")

# Your NVD API key (replace the placeholder; never publish a real key)
API_KEY = "YOUR_NVD_API_KEY"

def get_cvss_info(metrics):
    """Extract CVSS information"""
    cvss_v31 = metrics.get("cvssMetricV31", [{}])[0] if metrics.get("cvssMetricV31") else {}
    cvss_v30 = metrics.get("cvssMetricV30", [{}])[0] if metrics.get("cvssMetricV30") else {}
    cvss_v2 = metrics.get("cvssMetricV2", [{}])[0] if metrics.get("cvssMetricV2") else {}
    
    # Get v3 data (prefer v3.1 over v3.0)
    v3_data = cvss_v31.get("cvssData", {}) or cvss_v30.get("cvssData", {})
    v2_data = cvss_v2.get("cvssData", {})
    
    return {
        "v3_score": v3_data.get("baseScore", "N/A"),
        "v3_severity": v3_data.get("baseSeverity", "N/A"),
        "v3_vector": v3_data.get("vectorString", "N/A"),
        "v3_attack_vector": v3_data.get("attackVector", "N/A"),
        "v3_attack_complexity": v3_data.get("attackComplexity", "N/A"),
        "v3_privileges_required": v3_data.get("privilegesRequired", "N/A"),
        "v3_user_interaction": v3_data.get("userInteraction", "N/A"),
        "v3_scope": v3_data.get("scope", "N/A"),
        "v3_confidentiality": v3_data.get("confidentialityImpact", "N/A"),
        "v3_integrity": v3_data.get("integrityImpact", "N/A"),
        "v3_availability": v3_data.get("availabilityImpact", "N/A"),
        "v2_score": v2_data.get("baseScore", "N/A"),
        "v2_vector": v2_data.get("vectorString", "N/A"),
    }

def get_severity(score):
    """Determine severity based on CVSS score"""
    if score == "N/A":
        return "Medium"  # Default to Medium if no score
    score = float(score)
    if score >= 9.0:
        return "Critical"
    elif score >= 7.0:
        return "High"
    elif score >= 4.0:
        return "Medium"
    else:
        return "Low"

def create_folder():
    """Create folder structure"""
    os.makedirs(BASE_NVD_FOLDER, exist_ok=True)
    return BASE_NVD_FOLDER

def create_dashboard():
    """Create main dashboard"""
    dashboard_path = os.path.join(BASE_NVD_FOLDER, "dashboard.md")
    now = datetime.datetime.now()
    
    with open(dashboard_path, 'w', encoding='utf-8') as f:
        f.write(f"""---
type: dashboard
date: {now.strftime('%Y-%m-%d')}
tags: [nvd, dashboard]
---

# 🎯 Vulnerability Dashboard

## 📊 Severity Overview

```dataview
TABLE 
    length(rows) as Count
FROM "01_Vulnerability Data/NVD"
WHERE type = "vulnerability"
GROUP BY severity
SORT severity ASC
```

## 🔴 Critical Vulnerabilities

```dataview
TABLE 
    file.link as CVE,
    cvss_v3_score as CVSS,
    published as Published
FROM "01_Vulnerability Data/NVD"
WHERE type = "vulnerability" AND severity = "Critical"
SORT cvss_v3_score DESC
```

## 🟠 High Severity Vulnerabilities

```dataview
TABLE 
    file.link as CVE,
    cvss_v3_score as CVSS,
    published as Published
FROM "01_Vulnerability Data/NVD"
WHERE type = "vulnerability" AND severity = "High"
SORT cvss_v3_score DESC
```

## 📋 Action Items
- [ ] Review Critical Vulnerabilities
- [ ] Assess High Severity Items
- [ ] Update Security Advisories
- [ ] Schedule Patch Review

Last Updated: {now.strftime('%Y-%m-%d %H:%M:%S')}
""")

def create_canvas():
    """Create vulnerability canvas"""
    os.makedirs(CANVAS_FOLDER, exist_ok=True)
    canvas_path = os.path.join(CANVAS_FOLDER, "Vulnerability_Intel.canvas")
    
    canvas_data = {
        "nodes": [
            {
                "id": "dashboard",
                "x": 0,
                "y": 0,
                "width": 500,
                "height": 500,
                "type": "file",
                "file": "01_Vulnerability Data/NVD/dashboard.md"
            },
            {
                "id": "critical",
                "x": -600,
                "y": 0,
                "width": 400,
                "height": 400,
                "type": "text",
                "text": "# Critical Vulnerabilities\n\n```dataview\nTABLE cvss_v3_score as CVSS FROM \"01_Vulnerability Data/NVD\" WHERE severity = \"Critical\" SORT cvss_v3_score DESC\n```"
            },
            {
                "id": "high",
                "x": 600,
                "y": 0,
                "width": 400,
                "height": 400,
                "type": "text",
                "text": "# High Severity\n\n```dataview\nTABLE cvss_v3_score as CVSS FROM \"01_Vulnerability Data/NVD\" WHERE severity = \"High\" SORT cvss_v3_score DESC\n```"
            }
        ],
        "edges": [
            {
                "id": "e1",
                "fromNode": "dashboard",
                "fromSide": "left",
                "toNode": "critical",
                "toSide": "right"
            },
            {
                "id": "e2",
                "fromNode": "dashboard",
                "fromSide": "right",
                "toNode": "high",
                "toSide": "left"
            }
        ]
    }
    
    with open(canvas_path, 'w', encoding='utf-8') as f:
        json.dump(canvas_data, f, indent=2)

def create_date_range_folder():
    """Create a folder with the date range"""
    now = datetime.datetime.now()
    seven_days_ago = now - datetime.timedelta(days=7)
    folder_name = f"{seven_days_ago.strftime('%Y-%m-%d')}_to_{now.strftime('%Y-%m-%d')}"
    folder_path = os.path.join(BASE_NVD_FOLDER, folder_name)
    os.makedirs(folder_path, exist_ok=True)
    
    # Create a README for the folder
    readme_path = os.path.join(folder_path, "_README.md")
    with open(readme_path, 'w', encoding='utf-8') as f:
        f.write(f"""---
type: nvd-collection
date_range: {folder_name}
start_date: {seven_days_ago.strftime('%Y-%m-%d')}
end_date: {now.strftime('%Y-%m-%d')}
tags: [nvd, collection]
---

# NVD Vulnerabilities: {folder_name}

## 📊 Collection Statistics

```dataview
TABLE 
    length(rows) as "Total",
    length(filter(rows, (r) => r.severity = "Critical")) as "Critical",
    length(filter(rows, (r) => r.severity = "High")) as "High",
    length(filter(rows, (r) => r.severity = "Medium")) as "Medium",
    length(filter(rows, (r) => r.severity = "Low")) as "Low"
FROM "01_Vulnerability Data/NVD/{folder_name}"
WHERE type = "vulnerability"
```

## 🔴 Critical Vulnerabilities

```dataview
TABLE 
    cvss_v3_score as "CVSS",
    published as "Published"
FROM "01_Vulnerability Data/NVD/{folder_name}"
WHERE type = "vulnerability" AND severity = "Critical"
SORT cvss_v3_score DESC
```

## Collection Details
- **Start Date:** {seven_days_ago.strftime('%Y-%m-%d')}
- **End Date:** {now.strftime('%Y-%m-%d')}
- **Generated:** {now.strftime('%Y-%m-%d %H:%M:%S')}
""")
    
    return folder_path

def fetch_nvd():
    """Fetch vulnerabilities with debug logging"""
    try:
        print("🔄 Starting vulnerability collection...")
        folder_path = create_date_range_folder()
        print(f"📁 Created collection folder: {os.path.basename(folder_path)}")
        
        url = 'https://services.nvd.nist.gov/rest/json/cves/2.0'
        now = datetime.datetime.now()
        seven_days_ago = now - datetime.timedelta(days=7)
        
        headers = {
            'apiKey': API_KEY
        }
        
        params = {
            'pubStartDate': seven_days_ago.strftime("%Y-%m-%dT%H:%M:%S.000"),
            'pubEndDate': now.strftime("%Y-%m-%dT%H:%M:%S.000"),
            'resultsPerPage': 100
        }
        
        print("🔄 Fetching vulnerabilities from NVD...")
        # A timeout keeps the script from hanging if the API does not respond
        response = requests.get(url, headers=headers, params=params, timeout=30)
        
        if response.status_code != 200:
            print(f"❌ API Error: {response.status_code}")
            return
        
        data = response.json()
        vulns = data.get("vulnerabilities", [])
        print(f"📊 Retrieved {len(vulns)} vulnerabilities")
        
        # Create severity subfolders
        severity_folders = {
            "Critical": os.path.join(folder_path, "Critical"),
            "High": os.path.join(folder_path, "High"),
            "Medium": os.path.join(folder_path, "Medium"),
            "Low": os.path.join(folder_path, "Low")
        }
        
        for folder in severity_folders.values():
            os.makedirs(folder, exist_ok=True)
        
        for vuln in vulns:
            try:
                cve_data = vuln['cve']
                metrics = cve_data.get("metrics", {})
                # Reuse the helper above (prefers CVSS v3.1, falls back to v3.0)
                v3_score = get_cvss_info(metrics)["v3_score"]
                
                severity = get_severity(v3_score)
                
                content = f"""---
id: {cve_data["id"]}
type: vulnerability
severity: {severity}
cvss_v3_score: {v3_score}
published: {cve_data.get("published", "N/A")}
lastModified: {cve_data.get("lastModified", "N/A")}
status: needs-triage
tags: [nvd, vulnerability, severity/{severity.lower()}]
---

# {cve_data["id"]} - {severity} Severity

## Overview
**CVSS Score:** {v3_score}
**Severity:** {severity}

## Description
{cve_data["descriptions"][0]["value"]}

## References
{chr(10).join(['- ' + ref["url"] for ref in cve_data.get("references", [])])}

## Timeline
- **Published:** {cve_data.get("published", "N/A")}
- **Last Modified:** {cve_data.get("lastModified", "N/A")}
- **Added to Database:** {now.strftime('%Y-%m-%d %H:%M:%S')}

## Action Items
- [ ] Initial Assessment
- [ ] Impact Analysis
- [ ] Patch Availability Check
- [ ] Risk Assessment
"""
                
                # Save to severity subfolder
                file_path = os.path.join(severity_folders[severity], f"{cve_data['id']}.md")
                with open(file_path, 'w', encoding='utf-8') as f:
                    f.write(content)
                
                severity_icon = "🔴" if severity == "Critical" else "🟠" if severity == "High" else "🟡" if severity == "Medium" else "🟢"
                print(f"{severity_icon} Created note for {cve_data['id']} ({severity})")
                
            except Exception as e:
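                # Look the ID up on the current item; cve_data may be unset or stale here
                cve_id = vuln.get("cve", {}).get("id", "unknown")
                print(f"❌ Error processing {cve_id}: {e}")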
                continue
        
        print(f"✅ Completed vulnerability collection in {os.path.basename(folder_path)}")
        
    except Exception as e:
        print(f"❌ Error: {e}")

def main():
    """Main execution function"""
    try:
        print("🔄 Starting NVD vulnerability collection...")
        fetch_nvd()
        print("✅ Collection complete!")
    except Exception as e:
        print(f"❌ Fatal error: {e}")

if __name__ == "__main__":
    main()

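When you run the scraper, it requests the vulnerabilities published in the last 7 days and stores them as notes in a dated folder in your Obsidian vault. As written, it asks for a single page of up to 100 results, so a busy week may be only partly captured.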
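
If a week holds more CVEs than fit in one page, the request can be repeated with an increasing `startIndex` until the `totalResults` value reported by the CVE API 2.0 response is reached. The following is a sketch of that loop only, not the author's code: `fetch_all`, `fetch_page`, and `delay` are hypothetical names, and `fetch_page` stands in for whatever function performs the actual HTTP request.

```python
import time


def fetch_all(fetch_page, page_size=100, delay=0.0):
    """Collect every result by walking startIndex until totalResults is reached.

    fetch_page(start_index, page_size) is a hypothetical callable that returns
    one decoded JSON page, for example by calling requests.get(...).json()
    with 'startIndex' and 'resultsPerPage' added to the query parameters.
    """
    collected = []
    start_index = 0
    while True:
        page = fetch_page(start_index, page_size)
        items = page.get("vulnerabilities", [])
        collected.extend(items)
        total = page.get("totalResults", 0)
        start_index += len(items)
        # Stop when everything is collected, or when a page comes back empty
        # (which guards against looping forever on an unexpected response).
        if not items or start_index >= total:
            break
        time.sleep(delay)  # pause between requests to stay within rate limits
    return collected
```

NVD's guidance asks clients to pause between requests, so a `delay` of a few seconds is a reasonable starting point when calling the real API.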

3. When the script is run from the terminal, it should print each vulnerability it processes and import the corresponding notes into your Obsidian vault.

Screenshot by author
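
To confirm that the notes actually landed in the vault, the severity subfolders the script creates can be counted. This is a small sketch under the assumption that the folder layout matches the script above (Critical, High, Medium, and Low subfolders holding `CVE-*.md` files); `count_notes` is a hypothetical helper name.

```python
from pathlib import Path


def count_notes(collection_folder):
    """Count the CVE notes in each severity subfolder of one collection."""
    counts = {}
    for severity in ("Critical", "High", "Medium", "Low"):
        folder = Path(collection_folder) / severity
        if folder.is_dir():
            counts[severity] = len(list(folder.glob("CVE-*.md")))
        else:
            counts[severity] = 0
    return counts
```

Pass it the path of a dated collection folder inside `01_Vulnerability Data/NVD` and compare the totals with the number the script printed.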

Required Plugins

  • Templates (for standardized note creation)
  • HTML Reader (for embedding external content)
  • Dataview (for creating dynamic views and queries)
  • Calendar (for temporal analysis)
  • Maps (for geographic visualization)
  • Canvas (for creating interactive dashboards)
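
A quick way to see which community plugins are still missing is to compare a list of plugin IDs against the vault's configuration. This sketch assumes that Obsidian keeps the enabled community plugin IDs as a JSON list in `.obsidian/community-plugins.json`, which is how current versions appear to store them; `missing_plugins` is a hypothetical helper, and the IDs you pass in must be the ones shown on each plugin's page.

```python
import json
from pathlib import Path


def missing_plugins(vault_path, wanted_ids):
    """Return the plugin IDs from wanted_ids that are not enabled in the vault."""
    config = Path(vault_path) / ".obsidian" / "community-plugins.json"
    if not config.is_file():
        # No config file yet: treat every requested plugin as missing
        return list(wanted_ids)
    enabled = set(json.loads(config.read_text(encoding="utf-8")))
    return [pid for pid in wanted_ids if pid not in enabled]
```

For example, `missing_plugins(VAULT_PATH, ["dataview", "calendar"])` would list whichever of the two is not yet enabled. Core plugins such as Templates and Canvas are toggled in Obsidian's settings and are not covered by this check.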

Template Examples:

# Ransomware Groups

---
type: ransomware_group
name: 
active_since: 
status: 
locations: 
targets:
  - 
ttps:
  - 
tags:
---

## Overview

## Known Attacks

## TTPs

## Infrastructure

## Notes

## Sources 

# Vulnerability

---
type: vulnerability
cve: 
cvss: 
affected_systems:
discovery_date: 
status: 
tags:
---

## Description

## Technical Details

## Impact

## Mitigation

## References 
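
If you would rather generate notes from a script than fill the template by hand, the Vulnerability template above can be rendered from a dictionary. This is an illustrative sketch only: `render_vulnerability_note` is a hypothetical helper, and it writes values as plain text without any YAML escaping, so it only suits simple strings and numbers.

```python
def render_vulnerability_note(fields):
    """Build a note that follows the Vulnerability template layout.

    `fields` maps frontmatter keys to values; missing keys are left blank.
    """
    keys = ["cve", "cvss", "affected_systems", "discovery_date", "status", "tags"]
    lines = ["---", "type: vulnerability"]
    for key in keys:
        value = fields.get(key, "")
        # rstrip() drops the trailing space when the value is empty
        lines.append(f"{key}: {value}".rstrip())
    lines.append("---")
    lines.append("")
    headings = ["Description", "Technical Details", "Impact", "Mitigation", "References"]
    for heading in headings:
        lines.extend([f"## {heading}", ""])
    return "\n".join(lines)
```

Calling it with a placeholder such as `{"cve": "CVE-0000-0000", "cvss": 9.8}` returns the full note text, ready to be written to a `.md` file in the vault.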

# Dashboard Setup

Create a folder named 00_Dashboard in your vault, then create a main canvas in it that includes:

  • Latest threat intel
  • Geographic distribution of threats
  • Embedded attack maps
  • Recent vulnerability counts
  • Active ransomware groups
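
The canvas can be built by hand in Obsidian, or generated in the same way the scraper writes its own canvas file. The sketch below assumes the same node fields used in the script's `create_canvas` function; `write_dashboard_canvas` and the default file name `Main.canvas` are hypothetical, and the notes you pass in must already exist in the vault.

```python
import json
from pathlib import Path


def write_dashboard_canvas(vault_path, note_paths, name="Main.canvas"):
    """Write a canvas in 00_Dashboard with one file node per note, in a row."""
    folder = Path(vault_path) / "00_Dashboard"
    folder.mkdir(parents=True, exist_ok=True)
    nodes = [
        {
            "id": f"node{i}",
            "type": "file",
            "file": note,   # path relative to the vault root
            "x": i * 550,   # lay the cards out left to right
            "y": 0,
            "width": 500,
            "height": 500,
        }
        for i, note in enumerate(note_paths)
    ]
    canvas_path = folder / name
    canvas_data = {"nodes": nodes, "edges": []}
    canvas_path.write_text(json.dumps(canvas_data, indent=2), encoding="utf-8")
    return canvas_path
```

For instance, passing `["01_Vulnerability Data/NVD/dashboard.md"]` places the scraper's dashboard note on the new canvas; further notes for threat intel or ransomware groups can be appended to the list as you create them.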

Bonus: Popular vulnerability databases

  • https://www.exploit-db.com
  • https://github.com/advisories
  • https://www.cisa.gov/known-exploited-vulnerabilities-catalog-print

 
