Sigma Rules Automation with n8n
Project Overview
This project automates the process of generating, validating, and storing Sigma rules for new Common Vulnerabilities and Exposures (CVEs) using the n8n automation platform. Sigma rules are standardized detection rules used in cybersecurity to identify threats in logs, and this project streamlines their creation for a given set of CVEs. The workflow does the following:
Input:
Takes a list of CVEs in JSON format, each containing details like CVE ID, description, vendor, product, and severity.
Processing:
Generates Sigma rules for each CVE using an AI model (Together AI Llama 3.3 70B Free).
Validates the generated Sigma rules for syntax, completeness, and logical accuracy using another AI model (DeepSeek R1 Distill Llama 70B Free).
Filters the results to keep only valid Sigma rules.
Output:
Appends the valid Sigma rules to a Google Sheet named “Sigma Rules Log” with columns for CVE ID, Sigma Rule, and Validation Reason.
Stores each Sigma rule as a .yml file in a GitHub repository named sigma-rules-repo under the sigma-rules/ directory, with filenames based on CVE IDs (e.g., CVE-2023-5852.yml).
Purpose
The goal is to automate the creation of Sigma rules, which would otherwise be a manual and time-consuming task for cybersecurity professionals. By leveraging AI and automation, this project reduces human effort, ensures consistency in rule generation, and provides a centralized storage solution for easy access and collaboration.
Tools and Services Used
n8n: The automation platform that orchestrates the workflow.
Together AI: Provides the Llama 3.3 70B model for generating Sigma rules and the DeepSeek R1 Distill Llama 70B model for validating them.
Google Sheets: Stores the Sigma rules in a spreadsheet for easy viewing and analysis.
GitHub: Hosts the Sigma rules as .yml files in a repository for version control and sharing.
Walkthrough: Step-by-Step Guide
Below is a detailed walkthrough of the n8n workflow, explaining each node’s purpose, configuration, and role in the process. Follow along to understand how the workflow operates and how to replicate it.
1- Schedule Trigger Node

Purpose
The "Schedule Trigger" node serves as the entry point for the workflow, initiating the entire process on a predefined schedule. This automation ensures that the Sigma rule generation and validation process runs daily without manual intervention, making it ideal for keeping up with new CVEs.
Configuration
Node Name: Schedule Trigger
Trigger Rules:
Trigger Interval: Days
Days Between Triggers: 1 (the workflow runs once every day)
Trigger at Hour: Midnight (the workflow triggers at 00:00)
Trigger at Minute: 0 (exactly at the start of the hour)
Output: The node doesn’t produce any specific data output; it simply triggers the next node in the workflow at the specified time.
The schedule can be adjusted based on your needs (e.g., run every 2 days or at a different time) by modifying the "Days Between Triggers" or "Trigger at Hour" settings.
If you prefer manual execution for testing, you can temporarily disable the schedule and use the "Execute Workflow" button in n8n.
2- Fetch Recent CVE Commits Node

Purpose
The "Fetch Recent CVE Commits" node retrieves a list of recent commits from the CVEProject/cvelistV5 GitHub repository. This repository contains CVE data in JSON format, and by fetching recent commits, the workflow identifies which CVE files have been updated in the last 24 hours. This ensures that the workflow processes only the latest or modified CVEs, keeping the Sigma rules up to date.
Configuration
Method: GET
URL: https://api.github.com/repos/CVEProject/cvelistV5/commits
Authentication:
Type: Predefined Credential Type (define your credentials in n8n before you start building the workflow)
Credential Type: GitHub API
Credential Name: GitHub CVE Access (this credential uses a GitHub personal access token with the repo scope to access public repositories; no write access is needed here since this is a read-only operation).
Send Query Parameters:
Specify Query Parameters: Using Fields Below
Query Parameters:
Name: since
Value: {{ new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString() }}
This expression calculates the timestamp for 24 hours ago, ensuring the API returns commits made in the last day. For example, if the current time is 2025-05-01T03:01:13.146Z, this evaluates to 2025-04-30T03:01:13.146Z.
Name: per_page
Value: 100
This sets the number of commits returned per page to 100, the maximum allowed by the GitHub API, to minimize pagination.
Send Headers:
Specify Headers: Using Fields Below
Header Parameters:
Name: Accept
Value: application/vnd.github.v3+json
This header specifies that the API should return data in the GitHub API v3 JSON format.
Send Body: Disabled (no body is needed for a GET request).
Output: The node returns a JSON array of commit objects, each containing details like the commit SHA, author, and commit message. The list endpoint does not include the files changed, which is why later nodes derive file paths from the commit message.
Authentication: The GitHub API does not require authentication for public repositories, but authenticating raises the rate limit considerably (unauthenticated requests are limited to 60 per hour; authenticated requests allow 5,000 per hour). For the “GitHub CVE Access” credential, a personal access token with the public_repo scope (included in the repo scope) is sufficient.
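To see how the query parameters above come together, here is a minimal sketch that builds the same request URL outside n8n. The helper name buildCommitsUrl is hypothetical and not part of the workflow; inside n8n, the HTTP Request node assembles these parameters for you.

```javascript
// Hypothetical helper (not part of the workflow): builds the commits URL with
// the same `since` and `per_page` parameters that the HTTP Request node sends.
function buildCommitsUrl(now = new Date()) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  // Same calculation as the n8n expression: 24 hours before "now", in ISO 8601.
  const since = new Date(now.getTime() - DAY_MS).toISOString();

  const url = new URL('https://api.github.com/repos/CVEProject/cvelistV5/commits');
  url.searchParams.set('since', since);
  url.searchParams.set('per_page', '100');
  return url.toString();
}

// Example with a fixed timestamp so the result is reproducible.
console.log(buildCommitsUrl(new Date('2025-05-01T03:01:13.146Z')));
```

Passing the current time as a parameter keeps the calculation easy to check with a fixed date.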
3- Parse Commit Messages for CVEs Node

Purpose
This node processes the commit messages retrieved by the "Fetch Recent CVE Commits" node to identify new and updated CVEs. The commit messages in the CVEProject/cvelistV5 repository follow a structured format where the second line lists new CVEs and the third line lists updated CVEs. This node extracts these lines, categorizes them as "new" or "updated," and prepares the data for further processing.
Configuration
Mode: Run Once for All Items (the code processes all commit items at once, rather than iterating over each item individually).
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the list of commits from the previous node (items), where each item has a json.commit.message field containing the commit message.
Validation: Checks if the commit message is a string. If not, it returns an error object.
Message Parsing: Splits the commit message into lines using split('\n').
Line Extraction:
newCVEsLine: The second line (index 1) of the commit message, which lists new CVEs (if present).
updatedCVEsLine: The third line (index 2), which lists updated CVEs (if present).
Output Creation:
For each non-empty newCVEsLine, creates an item with status: 'new', the line text, and the commit SHA.
For each non-empty updatedCVEsLine, creates an item with status: 'updated', the line text, and the commit SHA.
Flattening: Uses flat() to combine the arrays of new and updated CVE items into a single list.
Output: A list of items, each representing a line of CVE data (new or updated).
This node assumes the commit messages follow the expected format (new CVEs on the second line, updated CVEs on the third). If the format changes, the code will need to be adjusted.
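The steps above can be sketched roughly as follows. This is a reconstruction from the explanation, not the original node code: the function name, the json.sha field used for the commit SHA, and the error message text are assumptions, and the sample commit message is only an illustration of the expected three-line format.

```javascript
// Sketch of the parsing logic described above (assumed names, see lead-in).
function parseCommitMessages(items) {
  return items
    .map((item) => {
      const message = item.json?.commit?.message;
      const commitSha = item.json?.sha;

      // Validation: the commit message must be a string.
      if (typeof message !== 'string') {
        return [{ json: { error: 'Commit message is not a string', commitSha } }];
      }

      // Second line (index 1) lists new CVEs, third line (index 2) updated CVEs.
      const lines = message.split('\n');
      const newCVEsLine = (lines[1] || '').trim();
      const updatedCVEsLine = (lines[2] || '').trim();

      const out = [];
      if (newCVEsLine) out.push({ json: { status: 'new', text: newCVEsLine, commitSha } });
      if (updatedCVEsLine) out.push({ json: { status: 'updated', text: updatedCVEsLine, commitSha } });
      return out;
    })
    .flat(); // combine the per-commit arrays into a single list
}

// Illustrative input only; real commit messages may differ.
const sampleCommits = [
  {
    json: {
      sha: 'abc123',
      commit: {
        message:
          '2 changes (1 new | 1 updated):\n\t- 1 new CVEs:  CVE-2023-5852\n\t- 1 updated CVEs: CVE-2023-5854',
      },
    },
  },
];
console.log(parseCommitMessages(sampleCommits));

// In an n8n Code node ("Run Once for All Items") the call would be:
//   return parseCommitMessages($input.all());
```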
4- Filter for Actual CVEs Node

Purpose
This node filters the output of the "Parse Commit Messages for CVEs" node to keep only the commit message lines that actually contain CVE updates. The previous node extracts lines that may indicate new or updated CVEs, but some lines might be empty or not contain actual CVE data (e.g., "0 new CVEs"). This node ensures that only lines with a non-zero number of new or updated CVEs are passed to the next step, avoiding unnecessary processing of irrelevant data.
Configuration
Conditions:
Condition 1:
Expression: {{ $json.status === 'new' ? parseInt($json.text.match(/(\d+) new CVEs/)[1]) : 0 }}
Operator: is greater than
Value: 0
Explanation: This condition checks if the item has a status of "new". If true, it uses a regex ((\d+) new CVEs) to extract the number of new CVEs from the text field (e.g., "5 new CVEs" extracts "5"). The parseInt function converts the extracted number to an integer. If the number is greater than 0, the item passes the filter. If the status isn’t "new", the expression evaluates to 0, failing this condition.
OR
Condition 2:
Expression: {{ $json.status === 'updated' ? parseInt($json.text.match(/(\d+) updated CVEs/)[1]) : 0 }}
Operator: is greater than
Value: 0
Explanation: This condition checks if the item has a status of "updated". If true, it uses a regex ((\d+) updated CVEs) to extract the number of updated CVEs from the text field (e.g., "3 updated CVEs" extracts "3"). The parseInt function converts the extracted number to an integer. If the number is greater than 0, the item passes the filter. If the status isn’t "updated", the expression evaluates to 0, failing this condition.
Output: A filtered list of items where either the "new CVEs" or "updated CVEs" count is greater than 0.
Regex Dependency: The filtering relies on the text field following a specific format (e.g., "5 new CVEs"). If the format changes (e.g., "5 CVEs added"), the regex patterns (\d+) new CVEs and (\d+) updated CVEs will need to be updated.
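One caveat with the expressions above: String.prototype.match() returns null when the text does not contain the expected phrase, so indexing the result with [1] throws instead of evaluating to 0. The sketch below shows a defensive variant of the same check. The function name cveCount is hypothetical, and this is an alternative to the Filter node conditions, not the workflow's actual configuration.

```javascript
// Hypothetical helper: returns the number of new or updated CVEs announced in
// a commit-message line, or 0 when the line does not match the expected format.
function cveCount(item) {
  const label =
    item.status === 'new' ? 'new' : item.status === 'updated' ? 'updated' : null;
  if (!label) return 0;

  const match = String(item.text ?? '').match(new RegExp(`(\\d+) ${label} CVEs`));
  // match is null when the phrase is absent; guard before reading the capture group.
  return match ? parseInt(match[1], 10) : 0;
}

// An item passes the filter when the count is greater than 0.
console.log(cveCount({ status: 'new', text: '- 5 new CVEs: CVE-2023-5852' }) > 0);
```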
5- Extract CVE IDs and File Paths Node

Purpose
This node processes the filtered commit message lines from the "Filter for Actual CVEs" node to extract individual CVE IDs and construct the file paths for their corresponding JSON files in the CVEProject/cvelistV5 repository. It identifies all CVE IDs mentioned in the text field (e.g., "CVE-2023-5852 CVE-2023-5854") and generates the file paths (e.g., cves/2023/5xxx/CVE-2023-5852.json) needed to fetch the CVE details in the next step.
Configuration
Mode: Run Once for All Items (the code processes all items at once, rather than iterating over each item individually).
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the filtered items from the previous node (items), where each item has a json.text field (e.g., "CVE-2023-5852 CVE-2023-5854"), a json.status field ("new" or "updated"), and a json.commitSha.
Validation:
Checks if the status field exists. If not, returns an error item with the message "Status is missing".
Uses a regex (/CVE-\d{4}-\d{4,7}/g) to extract all CVE IDs from the text field. The pattern matches strings like "CVE-2023-5852" (CVE, year, 4-7 digit ID). The || [] ensures an empty array if no matches are found.
If no CVE IDs are found, returns an error item with the message "No CVE ID found".
CVE Processing:
For each matched CVE ID, extracts the year (e.g., "2023" from "CVE-2023-5852") and the ID part (e.g., "5852").
Constructs a directory name by replacing the last three digits of the ID part with "xxx" (e.g., "5xxx" for "5852" and "58xxx" for "58520"), which matches the repository’s directory layout.
Builds the file path in the format cves/{year}/{dir}/{cve}.json (e.g., cves/2023/5xxx/CVE-2023-5852.json).
Output Creation:
Creates an item for each CVE ID with fields: cveId (the CVE ID), filePath (the constructed path), commitSha (from the input item), and sourceIndex (the index of the input item for tracking purposes).
Flattening: Uses flat() to combine the arrays of CVE items into a single list.
Output: A list of items, each representing a single CVE ID with its corresponding file path and metadata.
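A minimal sketch of the extraction described above follows. It is a reconstruction, not the original node code. It assumes the cvelistV5 layout in which the directory name is the numeric ID with its last three digits replaced by "xxx", and the function name extractCveFiles is hypothetical.

```javascript
// Sketch (assumed names): turns filtered commit-message lines into one item
// per CVE ID, each with the path of its JSON file in CVEProject/cvelistV5.
function extractCveFiles(items) {
  return items
    .map((item, sourceIndex) => {
      const { status, text, commitSha } = item.json;
      if (!status) return [{ json: { error: 'Status is missing', sourceIndex } }];

      // "|| []" yields an empty array when match() finds nothing.
      const cveIds = String(text || '').match(/CVE-\d{4}-\d{4,7}/g) || [];
      if (cveIds.length === 0) return [{ json: { error: 'No CVE ID found', sourceIndex } }];

      return cveIds.map((cveId) => {
        const [, year, idPart] = cveId.split('-');
        // Assumption: directory = ID without its last three digits, plus "xxx".
        const dir = `${idPart.slice(0, -3)}xxx`;
        return {
          json: { cveId, filePath: `cves/${year}/${dir}/${cveId}.json`, commitSha, sourceIndex },
        };
      });
    })
    .flat();
}

const extracted = extractCveFiles([
  { json: { status: 'new', text: '- 2 new CVEs: CVE-2023-5852, CVE-2023-5854', commitSha: 'abc123' } },
]);
console.log(extracted.map((entry) => entry.json.filePath));

// In an n8n Code node: return extractCveFiles($input.all());
```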
6- Fetch CVE Files Node

Purpose
This node retrieves the CVE JSON files from the CVEProject/cvelistV5 GitHub repository using the file paths generated by the "Extract CVE IDs and File Paths" node. Each file contains detailed information about a specific CVE (e.g., description, vendor, product, severity), which is necessary for generating Sigma rules in the next steps.
Configuration
Method: GET
URL: https://api.github.com/repos/CVEProject/cvelistV5/contents/{{ $json.filePath }}
Explanation: The URL dynamically incorporates the filePath field from the previous node (e.g., cves/2023/5xxx/CVE-2023-5852.json), forming a complete API endpoint like https://api.github.com/repos/CVEProject/cvelistV5/contents/cves/2023/5xxx/CVE-2023-5852.json. This endpoint returns metadata about the file, including a download URL for the file’s content.
Authentication:
Type: Predefined Credential Type
Credential Type: GitHub API
Credential Name: GitHub CVE Access (this credential uses the same GitHub personal access token as the "Fetch Recent CVE Commits" node, with the repo scope, which includes public_repo access for reading public repositories).
Send Query Parameters: Disabled (no query parameters are needed)
Send Headers:
Specify Headers: Using Fields Below
Header Parameters:
Name: Accept
Value: application/vnd.github.v3+json
Explanation: This header ensures the GitHub API returns data in the v3 JSON format.
Send Body: Disabled (no body is needed for a GET request).
Options: Response Format: JSON (the response is expected to be in JSON format).
Output: A list of items, each containing the GitHub API response for a CVE file. The response includes metadata about the file, such as its name, path, and a download_url to fetch the raw content, along with the Base64-encoded content field that the next node decodes.
7- Content Decoding Node

Purpose
This node decodes the Base64-encoded content of the CVE JSON files fetched from the CVEProject/cvelistV5 repository, parses the content as JSON, and extracts key information such as the CVE ID, vendors, and products. This node prepares the CVE data in a structured format for Sigma rule generation in the subsequent steps.
Configuration
Mode: Run Once for All Items (the code processes all items at once, rather than iterating over each item individually).
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes items from the previous node, expecting each item to have a json.content field (Base64-encoded CVE JSON data), a json.path (file path), a json.cveId, and a json.commitSha.
Validation: Checks if the content field exists. If not, returns an error item with the message "No content found".
Base64 Decoding:
Decodes the Base64-encoded content using Buffer.from(encodedContent, 'base64').toString('utf-8').
If decoding fails, returns an error item with the message "Failed to decode Base64" and the error details.
JSON Parsing:
Parses the decoded content into a JavaScript object using JSON.parse(decodedContent).
If parsing fails, returns an error item with the message "Failed to parse JSON", including the decoded content and error details.
Data Extraction:
Extracts the affected array from cveData.containers?.cna?.affected (if it exists, otherwise defaults to an empty array).
Maps the affected array into a list of vendor-product pairs, defaulting to "unknown" if vendor or product fields are missing.
Output Creation:
Returns a new item with fields: cveId (from the parsed data or fallback to the input cveId), filePath, commitSha, vendorsAndProducts (list of vendor-product pairs), and cveData (the full parsed CVE data).
Output: A list of items, each containing the decoded and parsed CVE data along with extracted metadata.
Data Structure Dependency: The extraction of vendorsAndProducts relies on the CVE JSON structure (containers.cna.affected). If the structure of the CVE JSON files in the CVEProject/cvelistV5 repository changes, the code will need to be updated.
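The decoding steps above might look roughly like this. It is a sketch reconstructed from the explanation, not the original node code: the function name decodeCveItem and the details field are assumptions, and reading the ID from cveMetadata.cveId assumes the CVE JSON 5 record layout.

```javascript
// Sketch (assumed names): decodes one GitHub "contents" response into CVE data.
function decodeCveItem(item) {
  const { content: encodedContent, path: filePath, cveId, commitSha } = item.json;
  if (!encodedContent) return { json: { error: 'No content found', filePath } };

  let decodedContent;
  try {
    // Note: Buffer.from ignores invalid Base64 characters rather than throwing,
    // so malformed input usually surfaces later as a JSON parse error.
    decodedContent = Buffer.from(encodedContent, 'base64').toString('utf-8');
  } catch (err) {
    return { json: { error: 'Failed to decode Base64', details: err.message } };
  }

  let cveData;
  try {
    cveData = JSON.parse(decodedContent);
  } catch (err) {
    return { json: { error: 'Failed to parse JSON', decodedContent, details: err.message } };
  }

  // Vendor/product pairs, defaulting to "unknown" when a field is missing.
  const affected = cveData.containers?.cna?.affected || [];
  const vendorsAndProducts = affected.map((entry) => ({
    vendor: entry.vendor || 'unknown',
    product: entry.product || 'unknown',
  }));

  return {
    json: {
      cveId: cveData.cveMetadata?.cveId || cveId, // assumed location of the ID
      filePath,
      commitSha,
      vendorsAndProducts,
      cveData,
    },
  };
}

// In an n8n Code node: return $input.all().map(decodeCveItem);
```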
8- Filter by Vendor Node

Purpose
This node filters the CVE data to keep only those CVEs associated with the vendors I am interested in: Cisco, Fortinet, Palo Alto, and Google. It ensures that the workflow focuses on CVEs relevant to these vendors, avoiding unnecessary processing of CVEs from other vendors.
Configuration
Conditions:
Condition 1:
Expression: {{ $json.vendorsAndProducts[0]?.vendor?.toLowerCase() || 'unknown' }}
Operator: is equal to
Value: cisco
Explanation: This condition checks the vendor field of the first entry in the vendorsAndProducts array (from the previous node). The ?. operator safely accesses nested fields, and toLowerCase() ensures case-insensitive comparison (e.g., "Cisco" or "CISCO" will match). If the field is missing, it defaults to 'unknown'. The item passes if the vendor matches "cisco".
OR
Condition 2:
Expression: {{ $json.vendorsAndProducts[0]?.vendor?.toLowerCase() || 'unknown' }}
Operator: is equal to
Value: fortinet
Explanation: Similar to the first condition, but checks if the vendor matches "fortinet".
OR
Condition 3:
Expression: {{ $json.vendorsAndProducts[0]?.vendor?.toLowerCase() || 'unknown' }}
Operator: is equal to
Value: paloalto
Explanation: Similar to the first condition, but checks if the vendor matches "paloalto".
OR
Condition 4:
Expression: {{ $json.vendorsAndProducts[0]?.vendor?.toLowerCase() || 'unknown' }}
Operator: is equal to
Value: google
Explanation: Similar to the first condition, but checks if the vendor matches "google".
Output: A filtered list of items where the first vendor in vendorsAndProducts matches one of the specified vendors (Cisco, Fortinet, Palo Alto, or Google).
Extensibility: If you need to add more vendors (e.g., "Microsoft"), you can add another OR condition with the new vendor name.
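If the list of vendors grows, the four OR conditions could also be expressed as a single allow-list check in a Code node. The sketch below shows that alternative. It is not how the workflow is built, and the names VENDORS_OF_INTEREST and matchesVendor are hypothetical; the vendor strings are the same values used in the conditions above.

```javascript
// Hypothetical alternative to the four OR conditions: one allow-list lookup.
const VENDORS_OF_INTEREST = ['cisco', 'fortinet', 'paloalto', 'google'];

function matchesVendor(cveItem, vendors = VENDORS_OF_INTEREST) {
  // Same expression as in the Filter node: first vendor, lowercased, or "unknown".
  const vendor = cveItem.vendorsAndProducts?.[0]?.vendor?.toLowerCase() || 'unknown';
  return vendors.includes(vendor);
}

console.log(matchesVendor({ vendorsAndProducts: [{ vendor: 'Cisco', product: 'IOS' }] }));

// In an n8n Code node: return $input.all().filter((item) => matchesVendor(item.json));
```

Adding a vendor then means appending one string to the array instead of adding another condition.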
9- Log CVE Items (Debug) Node

Purpose
This node logs the input CVE items to the console for debugging purposes and passes the items through unchanged. This node serves as a debugging checkpoint to inspect the filtered CVE data before proceeding to the next steps, such as generating Sigma rules.
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Logging:
console.log('Input items:', JSON.stringify($input.all(), null, 2)): Logs all input items in a formatted JSON string. $input.all() returns an array of all items passed to the node, and JSON.stringify(..., null, 2) formats the output with indentation for readability.
console.log('First item:', JSON.stringify($input.first(), null, 2)): Logs the first item in the input array in a formatted JSON string. $input.first() returns the first item, useful for quickly inspecting the structure of the data.
Output: return $input.all() passes all input items through unchanged, ensuring the workflow continues with the same data.
Output: The same list of items received as input, unchanged.
Debugging Purpose: The node is primarily for debugging, as indicated by the console.log statements. You can view the logs in the n8n execution logs (accessible via the n8n UI or logs on the server). Once the workflow is stable, you might consider removing this node or disabling the logs to reduce noise.
10- Preprocess CVE Data Node

Purpose
This node simplifies the CVE data by extracting and formatting key fields (cveId, description, vendor, product, and severity) into a streamlined structure. This node prepares the data in a concise format for the next steps, such as generating Sigma rules, by reducing the complexity of the raw CVE JSON data.
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: Retrieves all items from the previous node ($input.all()) and extracts the json field from each item using map(item => item.json).
Debug Logging:
Logs the input data with console.log('Input to Preprocess CVE Data:', JSON.stringify(items, null, 2)) for debugging, formatting the JSON with indentation for readability.
Data Processing:
For each item, extracts key fields with fallback values:
cveId: Taken from item.cveId, defaults to "unknown".
description: Extracted from item.cveData.containers.cna.descriptions[0].value using nested checks, defaults to "No description available".
vendor: Taken from item.vendorsAndProducts[0].vendor (the first vendor-product pair), defaults to "unknown".
product: Taken from item.vendorsAndProducts[0].product (the first vendor-product pair), defaults to "unknown".
severity: Extracted from item.cveData.containers.cna.metrics[0].cvssV3_1.baseSeverity, defaults to "MEDIUM".
Creates a new item with a simplified structure containing only these fields.
Debug Logging: Logs the processed output with console.log('Output from Preprocess CVE Data:', JSON.stringify(processedItems, null, 2)).
Output Creation: Returns the array of processed items in the required n8n format (each item wrapped in a { json: {...} } object).
Output: A list of items, each containing a simplified version of the CVE data with only the key fields.
Data Structure Dependency: The extraction of description and severity relies on the CVE JSON structure (containers.cna.descriptions and containers.cna.metrics.cvssV3_1). If the structure of the CVE JSON files in the CVEProject/cvelistV5 repository changes, the code will need to be updated to match the new structure.
Debugging: The console.log statements are useful for debugging, allowing you to inspect both the input and output data in the n8n execution logs. Once the workflow is stable, you might consider removing these logs to reduce noise.
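The field extraction with fallbacks can be sketched as below. This is a reconstruction from the list above rather than the original node code; the function name preprocessCve is hypothetical, and optional chaining stands in for the nested checks.

```javascript
// Sketch (assumed name): reduces one decoded CVE item to the five key fields.
function preprocessCve(item) {
  const cna = item.cveData?.containers?.cna;
  const firstPair = item.vendorsAndProducts?.[0];

  return {
    json: {
      cveId: item.cveId || 'unknown',
      description: cna?.descriptions?.[0]?.value || 'No description available',
      vendor: firstPair?.vendor || 'unknown',
      product: firstPair?.product || 'unknown',
      // Only the first metrics entry is read, and only its CVSS v3.1 block.
      severity: cna?.metrics?.[0]?.cvssV3_1?.baseSeverity || 'MEDIUM',
    },
  };
}

// In an n8n Code node:
//   return $input.all().map((item) => preprocessCve(item.json));
```

Note that the "MEDIUM" fallback also applies to records that only carry a different CVSS version, so a defaulted severity does not necessarily reflect the actual score.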
11- Loop Over CVEs Node

Purpose
This node iterates over the preprocessed CVE items one at a time, enabling individual processing of each CVE for Sigma rule generation. This node ensures that each CVE is handled independently, which is useful for making API calls (e.g., to an AI model for Sigma rule generation) while avoiding rate limits or ensuring isolated error handling.
Configuration
Batch Size: 1 (processes one CVE item at a time per iteration).
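Output: Outputs one item at a time to the nodes within the loop. For example, the first iteration receives only the first preprocessed CVE item (with its cveId, description, vendor, product, and severity fields).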
This continues until all items from the "Preprocess CVE Data" node are processed.
12- Prepare Sigma Rule Request Body Node

Purpose
This node constructs the request body for an API call to Together AI’s LLaMA-3.3-70B model to generate a Sigma rule in YAML format for a single CVE. This node formats the CVE data into a structured prompt, includes detailed instructions for the AI model, and prepares the request for the next node (HTTP Request node) to send to the AI model.
Configuration
Mode: Run Once for All Items (the code processes all items at once, but since it’s inside the loop with a batch size of 1, it effectively processes one CVE item per iteration).
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the input from the loop ($input.all()), which contains a single CVE item per iteration due to the batch size of 1. It extracts the json field (item.json) containing the preprocessed CVE data (cveId, description, vendor, product, severity).
Validation:
Checks for the presence of required fields (cveId, description, vendor, product, severity).
If any field is missing, logs an error with the CVE data and throws an exception (throw new Error("Missing required fields in CVE data")), causing the loop iteration to fail for that item.
Request Body Construction:
Creates a requestBody object for the Together AI API with the following structure:
model: Specifies the AI model (meta-llama/LLaMA-3.3-70B-Instruct-Turbo-Free).
messages: An array of two messages:
A system message with detailed instructions for the AI model on how to generate a Sigma rule in YAML format, including rules for fields like title, id, description, logsource, detection, and tags.
A user message containing the CVE data in a structured format, including a dynamically constructed referenceUrl based on the CVE ID (e.g., https://github.com/CVEProject/cvelistV5/blob/main/cves/2023/5xxx/CVE-2023-5852.json) and the current date ($now.toFormat('yyyy-MM-dd'), which evaluates to, for example, 2025-05-01 when the workflow runs on May 1, 2025).
max_tokens: Limits the AI response to 500 tokens to ensure the generated Sigma rule is concise.
Debug Logging: Logs the constructed requestBody for debugging with console.log("Request body for CVE", cveData.cveId, ":", JSON.stringify(requestBody, null, 2)).
Output Creation: Returns the requestBody wrapped in the required n8n format ({ json: requestBody }).
Output: A single item containing the request body for the Together AI API.
Prompt Instructions: The system message provides detailed instructions for the AI model, which helps keep the generated Sigma rules consistent. However, the quality of the generated rules depends on the AI model’s ability to follow these instructions. I use these models because they are the best options available to me.
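A rough sketch of how such a request body could be assembled is shown below. It is not the original node code: the function name buildSigmaRequestBody is hypothetical, the system prompt is a placeholder (the real instructions are not reproduced here), the shape of the user message is an assumption, and the reference URL assumes the cvelistV5 layout in which the last three digits of the ID are replaced by "xxx". The model ID is written as it appears in this document.

```javascript
// Sketch (assumed names and prompt text): builds the chat-completions body.
function buildSigmaRequestBody(cveData, today) {
  const required = ['cveId', 'description', 'vendor', 'product', 'severity'];
  if (required.some((field) => !cveData[field])) {
    console.error('Missing required fields in CVE data:', JSON.stringify(cveData));
    throw new Error('Missing required fields in CVE data');
  }

  // Assumption: directory = ID without its last three digits, plus "xxx".
  const [, year, idPart] = cveData.cveId.split('-');
  const referenceUrl =
    'https://github.com/CVEProject/cvelistV5/blob/main/' +
    `cves/${year}/${idPart.slice(0, -3)}xxx/${cveData.cveId}.json`;

  const requestBody = {
    model: 'meta-llama/LLaMA-3.3-70B-Instruct-Turbo-Free',
    messages: [
      // Placeholder: the workflow's real system prompt is much more detailed.
      { role: 'system', content: 'Generate a Sigma rule in YAML format for the given CVE.' },
      { role: 'user', content: JSON.stringify({ ...cveData, referenceUrl, date: today }, null, 2) },
    ],
    max_tokens: 500,
  };
  return { json: requestBody };
}

// In an n8n Code node the date would come from $now.toFormat('yyyy-MM-dd'):
//   const cveData = $input.all()[0].json;
//   return [buildSigmaRequestBody(cveData, $now.toFormat('yyyy-MM-dd'))];
```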
13- Generate Sigma Rules via Together AI Node

Purpose
This node sends an API request to Together AI’s LLaMA-3.3-70B model to generate a Sigma rule in YAML format for a single CVE. It uses the request body prepared by the "Prepare Sigma Rule Request Body" node to make a POST request to the Together AI API and retrieves the AI-generated Sigma rule for further processing (e.g., validation and storage).
Configuration
Method: POST
URL: https://api.together.xyz/v1/chat/completions
Authentication: None (authentication is handled via the Authorization header).
Send Query Parameters: Disabled (no query parameters are needed).
Send Headers:
Specify Headers: Using Fields Below
Header Parameters:
Name: Authorization
Value: Bearer a*********************************
Explanation: This header provides the API key for Together AI, authenticating the request. The value is a placeholder (partially masked for security); replace it with your actual Together AI API key.
Name: Content-Type
Value: application/json
Explanation: This header specifies that the request body is in JSON format, as required by the Together AI API.
Send Body:
Body Content Type: JSON
Specify Body: Using JSON
JSON: {{ $json }}
Explanation: This expression dynamically inserts the JSON request body prepared by the previous node ("Prepare Sigma Rule Request Body"). The body includes the model name, messages (system and user prompts), and max tokens, formatted as shown in the previous node's output.
Options: No properties (default options are used, such as expecting a JSON response).
Output: A single item containing the API response from Together AI, which includes the generated Sigma rule in YAML format within the response.
14- Delay after Generation Node

Purpose
This node introduces a 10-second delay after generating a Sigma rule via the Together AI API. This delay helps manage API rate limits by ensuring that requests to the Together AI API (or subsequent APIs in the loop) are spaced out, reducing the risk of hitting rate limit errors (e.g., HTTP 429 Too Many Requests).
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Delay Implementation: await new Promise(resolve => setTimeout(resolve, 10000)) creates a promise that resolves after 10,000 milliseconds (10 seconds), pausing this node for that duration. Because setTimeout is non-blocking, the wait does not tie up the Node.js event loop, so n8n can continue other work in the meantime.
Output: return $input.all() passes all input items through unchanged after the delay. Since the node is inside a loop with a batch size of 1, it processes one item at a time, meaning the delay applies to each iteration of the loop.
Output: The same item received as input, unchanged, after a 10-second delay.
Alternative Approach: Instead of using a JavaScript node, you could use n8n’s built-in "Wait" node to introduce a delay.
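For reference, the delay-and-pass-through pattern described above can be written as a small reusable function. The names delay and passThroughAfterDelay are hypothetical; the 10,000 ms default mirrors the value used in this node.

```javascript
// Resolves after `ms` milliseconds without blocking the event loop.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Waits, then returns the items unchanged (the node's pass-through behavior).
async function passThroughAfterDelay(items, ms = 10000) {
  await delay(ms);
  return items;
}

// In an n8n Code node the equivalent is:
//   await new Promise((resolve) => setTimeout(resolve, 10000));
//   return $input.all();
```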
15- Prepare Sigma Rule Validation Inputs Node

Purpose
This node extracts and formats the CVE description and the generated Sigma rule for validation. It prepares a structured output containing the CVE description and the Sigma rule in YAML format, which a subsequent node uses to validate the Sigma rule with a different AI model.
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the input from the "Delay after Generation" node ($input.all()), which contains a single item per iteration due to the loop’s batch size of 1. The item includes the API response from Together AI, with the generated Sigma rule in item.json.choices[0].message.content.
Debug Logging: Logs the structure of the input item using console.log("Item structure:", JSON.stringify(item, null, 2)) to help with debugging and confirm the data structure.
Sigma Rule Extraction: Extracts the generated Sigma rule from item.json.choices[0].message.content, which contains the YAML-formatted Sigma rule generated by Together AI via the first model.
Original Input Access:
Attempts to access the original CVE data via item.json.input, which should contain the preprocessed CVE data from earlier nodes (e.g., "Preprocess CVE Data").
Falls back to item.json if item.json.input is not defined, indicating a potential issue with how the original input is passed through the loop.
CVE Description Extraction: Extracts the CVE description from originalInput.cveData?.containers?.cna?.descriptions?.[0]?.value using optional chaining to safely navigate the nested structure. If the description is not found, defaults to "Description not found".
Output Creation: Returns a new item with a simplified structure containing two fields:
cve_description: The extracted CVE description (or the fallback value).
sigma_rule_yaml: The generated Sigma rule in YAML format.
Output: A single item containing the CVE description and the Sigma rule, formatted for validation.
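The extraction described above can be sketched as follows. It is a reconstruction from the explanation, not the original node code, and the function name prepareValidationInput is hypothetical.

```javascript
// Sketch (assumed name): pairs the generated Sigma rule with the CVE description.
function prepareValidationInput(item) {
  console.log('Item structure:', JSON.stringify(item, null, 2));

  // The generated rule is the first choice of the chat-completions response.
  const sigmaRule = item.json?.choices?.[0]?.message?.content ?? '';

  // Prefer the original CVE data if it was carried along, else fall back to item.json.
  const originalInput = item.json?.input || item.json || {};
  const cveDescription =
    originalInput.cveData?.containers?.cna?.descriptions?.[0]?.value || 'Description not found';

  return { json: { cve_description: cveDescription, sigma_rule_yaml: sigmaRule } };
}

// In an n8n Code node: return $input.all().map(prepareValidationInput);
```

If the API response item does not carry the original CVE data, the description falls back to "Description not found", so it is worth checking the logged item structure when that value appears.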
16- Validate Sigma Rule via DeepSeek Node

Purpose
This node sends an API request to Together AI’s DeepSeek R1 Distill Llama 70B Free model to validate the Sigma rule generated for a single CVE. It uses the CVE description and Sigma rule prepared by the "Prepare Sigma Rule Validation Inputs" node to construct a prompt, asking the AI model to verify the rule’s syntax, completeness, and logical accuracy, ensuring the rule is usable for threat detection.
Configuration
Method: POST
URL: https://api.together.xyz/v1/chat/completions
Authentication: None (authentication is handled via the Authorization header).
Send Query Parameters: Disabled (no query parameters are needed).
Send Headers:
Specify Headers: Using Fields Below
Header Parameters:
Name: Authorization
Value: Bearer a*********************************
Explanation: This header provides the API key for Together AI, authenticating the request. The value is a placeholder (partially masked for security); replace it with your actual Together AI API key. Note that this is the same API key used in the "Generate Sigma Rules via Together AI" node; using predefined credentials is recommended for better security.
Name: Content-Type
Value: application/json
Explanation: This header specifies that the request body is in JSON format, as required by the Together AI API.
Send Body:
Body Content Type: JSON
Specify Body: Using JSON
JSON: {{ $json }}
Explanation: This expression dynamically inserts the JSON request body prepared by the previous node ("Prepare Sigma Rule Validation Inputs"). The body typically includes the model name, a system prompt instructing the AI to validate the Sigma rule, and a user prompt containing the CVE description and Sigma rule in YAML format.
Options: No properties (default options are used, such as expecting a JSON response).
Output: A single item containing the API response from Together AI, which includes the validation result for the Sigma rule.
Using Predefined Credentials for the Together AI API Key
The "Generate Sigma Rules via Together AI" and "Validate Sigma Rule via DeepSeek" nodes both use the same Together AI API key, hardcoded in the Authorization header (Bearer a*********************************). Hardcoding API keys in multiple nodes is not recommended because:
Security Risk: If the workflow is shared or exposed, the API key could be compromised.
Maintenance Overhead: If the API key changes, you’ll need to update it in multiple places, increasing the risk of errors.
A better approach is to use n8n’s predefined credentials feature to securely store the API key and reuse it across nodes.
17- Parse Sigma Rule Validation Output Node

Purpose
This node processes the validation response from the "Validate Sigma Rule via DeepSeek" node to extract a JSON object containing the validation result. It removes extraneous formatting (such as <think> tags and code block markers) from the DeepSeek API response, parses the embedded JSON string, and formats the result for further processing in the workflow.
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the input from the "Validate Sigma Rule via DeepSeek" node ($input.all()), which contains a single item per iteration due to the loop’s batch size of 1. The item includes the API response from Together AI, with the validation feedback in item.json.choices[0].message.content.
Raw Content Extraction: Extracts the raw content from item.json.choices[0].message.content, which contains the DeepSeek model’s response, expected to include a JSON object wrapped in additional text (e.g., <think> tags or code block markers like ```json).
JSON Extraction:
Finds the start of the JSON object using rawContent.indexOf('{') to locate the first opening brace.
Finds the end of the JSON object using rawContent.lastIndexOf('}') + 1 to locate the last closing brace (plus 1 to include the brace itself).
Extracts the JSON string using rawContent.substring(jsonStart, jsonEnd), which removes any text before the opening brace and after the closing brace (e.g., <think> tags or code block markers).
JSON Parsing: Parses the extracted JSON string into a JavaScript object using JSON.parse(jsonString).
Output Creation: Returns a new item with the parsed validation result in the json field.
Output: A single item containing the parsed validation result as a JSON object.
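The extraction steps above can be sketched as a plain function. This is a minimal illustration rather than the node's exact code: the function name extractValidationResult, the explicit error for a missing JSON object, and the sample response are assumptions added here.

```javascript
// Sketch of the brace-based extraction described above.
// extractValidationResult is a hypothetical name, not taken from the workflow.
function extractValidationResult(rawContent) {
  // First opening brace and last closing brace (plus 1 to include the brace).
  const jsonStart = rawContent.indexOf('{');
  const jsonEnd = rawContent.lastIndexOf('}') + 1;

  // Assumption: fail loudly when the response carries no JSON object.
  if (jsonStart === -1 || jsonEnd <= jsonStart) {
    throw new Error('No JSON object found in validation response');
  }

  // Drops <think> text and code block markers around the object.
  const jsonString = rawContent.substring(jsonStart, jsonEnd);
  return JSON.parse(jsonString);
}

// Inside an n8n Code node ("Run Once for All Items") it might be used as:
// return $input.all().map(item => ({
//   json: extractValidationResult(item.json.choices[0].message.content),
// }));

// Made-up sample response for illustration only.
const fence = '`'.repeat(3);
const sample =
  '<think>Checking the rule.</think>\n' +
  fence + 'json\n{"isValid": true, "reason": "Looks consistent"}\n' + fence;
console.log(extractValidationResult(sample));
```

One caveat of this approach: if the model's reasoning text itself contains a curly brace before the real JSON object, indexOf('{') picks up that earlier brace and JSON.parse fails, so wrapping the call in error handling may be worthwhile.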
18- Delay after Validation Node

Purpose
This node introduces a 5-second delay after parsing the Sigma rule validation output. This delay helps manage API rate limits by ensuring that requests to the Together AI API (used in the "Generate Sigma Rules via Together AI" and "Validate Sigma Rule via DeepSeek" nodes) are spaced out across loop iterations, reducing the risk of hitting rate limit errors (e.g., HTTP 429 Too Many Requests).
Configuration
Node Type: Wait
Resume: After Time Interval
Wait Amount: 5.00
Wait Unit: Seconds
Output: The node passes through the input items unchanged after the 5-second delay.
This node connects back to the "Loop Over CVEs for Sigma Rule Generation" node to complete the current iteration.
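As a rough illustration of the pacing arithmetic, the sketch below estimates the minimum time the Wait nodes add to a run. The function name and the sample CVE count are hypothetical; only the 5-second value comes from this node's configuration, and API response time is not included.

```javascript
// Hypothetical helper: lower bound on the seconds the delay nodes add to a run.
// delaysPerIterationSeconds lists the Wait durations hit in each loop iteration.
function minimumDelayOverheadSeconds(cveCount, delaysPerIterationSeconds) {
  const perIteration = delaysPerIterationSeconds.reduce(
    (sum, seconds) => sum + seconds,
    0
  );
  return cveCount * perIteration;
}

// Example: 20 CVEs (an assumed batch size), counting only this 5-second delay.
console.log(minimumDelayOverheadSeconds(20, [5]));
```

If a second Wait node runs in the same iteration, add its duration to the array to see the combined overhead.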
19- Filter Valid Sigma Rules Node

Purpose
This node filters the output of the "Loop Over CVEs for Sigma Rule Generation" loop to keep only the Sigma rules that were marked as valid during validation. It uses the isValid field from the validation result to determine which rules to retain, ensuring that only high-quality, validated Sigma rules proceed to the next steps (e.g., storage in Google Sheets and GitHub).
Configuration
Mode: Run Once for All Items.
Language: JavaScript
Code:
Explanation of the Code:
Input Handling: The node takes the output of the "Loop Over CVEs for Sigma Rule Generation" loop, which is a list of items where each item represents the validation result for a single CVE. Each item has a json field containing the parsed validation result (from the "Parse Sigma Rule Validation Output" node), with fields like isValid, reason, and suggestions.
Filtering Logic: The items.filter() method keeps only the items where item.json.isValid is true, discarding any Sigma rules that failed validation (i.e., where isValid is false).
Output Creation: Returns a new list containing only the items that pass the filter (i.e., valid Sigma rules).
Output: A filtered list of items, containing only the validation results for Sigma rules that are valid.
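The filtering logic described above can be sketched as follows. This is an illustration under assumptions, not the node's exact code: the function name keepValidRules and the sample items are made up, and the strict comparison assumes isValid arrives as a boolean rather than a string.

```javascript
// Sketch of the filter step; keepValidRules is a hypothetical name.
function keepValidRules(items) {
  // Assumption: isValid is a real boolean; a string "true" would be dropped.
  return items.filter(item => item.json.isValid === true);
}

// Inside an n8n Code node ("Run Once for All Items") it might be used as:
// return keepValidRules($input.all());

// Made-up sample items for illustration only.
const sampleItems = [
  { json: { isValid: true, reason: 'Rule matches the CVE description' } },
  { json: { isValid: false, reason: 'Missing detection section' } },
];
console.log(keepValidRules(sampleItems).length);
```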
Alternative Approach: Instead of using a Code node, you could achieve the same result with n8n’s built-in "Filter" node by adding a condition that checks whether isValid is true.
20- Append Valid Sigma Rules to Sheet Node

Purpose
This node appends the validated Sigma rules, along with their associated metadata, to an existing Google Sheet. It extracts the CVE ID from the Sigma rule, the Sigma rule itself, and the validation reason, then adds each rule as a new row in the specified sheet. This node ensures that the generated and validated Sigma rules are persistently stored for further analysis or use.
Configuration
Credential to Connect With: Google Sheets account



Resource: Sheet Within Document
Operation: Append Row
Document:
Selection Method: By ID
Document ID: 1cGc1IOpV-jWg49QdiOXxbEbRWbsj0nc_SA1TQTcf7UQ
Explanation: This is the unique ID of the Google Sheet, found in the sheet’s URL (e.g., https://docs.google.com/spreadsheets/d/1cGc1IOpV-jWg49QdiOXxbEbRWbsj0nc_SA1TQTcf7UQ/edit). It identifies the target spreadsheet for appending the Sigma rules.
Sheet:
Selection Method: By Name
Sheet Name: Sheet1
Explanation: This specifies the name of the sheet within the Google Sheet document where the rows will be appended. Ensure that "Sheet1" exists in the specified document and has columns labeled "CVE ID", "Sigma Rule", and "Validation Reason" to match the data being sent.
Mapping Column Mode: Map Each Column Manually
Values to Send:
CVE ID:
Expression: {{ $json.sigma_rule.match(/^title: Detection of (CVE-\d{4}-\d{4,5})/)[1] }}
Explanation: Extracts the CVE ID (e.g., CVE-2023-5852) from the Sigma rule’s title using a regular expression. The Sigma rule is expected to be in item.json.sigma_rule, with a title line like title: Detection of CVE-2023-5852 - Remote Code Execution. The regex ^title: Detection of (CVE-\d{4}-\d{4,5}) captures the CVE ID in the first capturing group ([1]), matching the format CVE-YYYY-NNNN or CVE-YYYY-NNNNN.
Sigma Rule:
Expression: {{ $json.sigma_rule }}
Explanation: Provides the full Sigma rule in YAML format, as stored in item.json.sigma_rule. This field contains the rule generated by the "Generate Sigma Rules via Together AI" node.
Validation Reason:
Expression: {{ $json.reason }}
Explanation: Provides the validation reason from the DeepSeek model, stored in item.json.reason (from the "Parse Sigma Rule Validation Output" node), explaining why the Sigma rule was deemed valid.
Options: No properties (default options are used, such as not including a header row since the sheet is assumed to already have headers).
Output: The node appends one row per valid Sigma rule to the Google Sheet and returns the input items unchanged.

The node appends a row to "Sheet1" in the specified Google Sheet with the following values:
CVE ID: CVE-2023-5852 (extracted from the Sigma rule’s title)
Sigma Rule: The full YAML string (as shown above)
Validation Reason: The Sigma rule is syntactically correct and logically sound. The detection patterns align with the CVE description.
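The CVE ID expression throws an error when the title line does not match, because match() returns null and null has no [1]. It also stops at five digits, so a CVE ID with a longer sequence number would be truncated. A more defensive variant might look like the sketch below; the function name extractCveId, the multiline flag, and the \d{4,} quantifier are assumptions added here, not part of the original workflow.

```javascript
// Hypothetical helper: returns the CVE ID from a Sigma rule, or null if absent.
function extractCveId(sigmaRule) {
  // "m" lets ^ match at the start of any line, not only the first one.
  // \d{4,} accepts sequence numbers longer than five digits.
  const match = /^title: Detection of (CVE-\d{4}-\d{4,})/m.exec(sigmaRule);
  return match ? match[1] : null;
}

// Made-up sample rule text for illustration only.
const sampleRule =
  'title: Detection of CVE-2023-5852 - Remote Code Execution\n' +
  'status: experimental';
console.log(extractCveId(sampleRule));
console.log(extractCveId('status: experimental'));
```

Returning null lets a later step skip or flag rules whose title does not follow the expected pattern instead of failing the whole run.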
21- Create Sigma Rule Files in Repo Node

Purpose
This node creates a new file in a GitHub repository for each validated Sigma rule. It stores the Sigma rule as a YAML file in the specified repository, with the file path based on the CVE ID extracted from the Sigma rule’s title. This node ensures that the generated and validated Sigma rules are version-controlled in a GitHub repository, making them accessible for collaboration, auditing, and integration with SIEM platforms.
Configuration
Credential to Connect With: Create a new GitHub credential whose token has access to the repo.



Make sure that the token has write access to the repo, since the node creates files and commits.

Resource: File
Operation: Create
Repository Owner:
Selection Method: By Name
Owner Name: your_username
Explanation: Specifies the GitHub user or organization that owns the target repository. Ensure that the authenticated GitHub account has write access to repositories owned by that user or organization.
Repository Name:
Selection Method: By Name
Repository Name: sigma-rules-repo
Explanation: Specifies the target repository where the Sigma rule files will be created. Ensure that the repository sigma-rules-repo exists under the owner and that the authenticated account has write permissions.
File Path:
Expression: {{ "sigma-rules/" + $json.sigma_rule.match(/^title: Detection of (CVE-\d{4}-\d{4,5})/)[1] + ".yml" }}
Explanation: Dynamically constructs the file path for the Sigma rule file. For example, if the Sigma rule’s title is title: Detection of CVE-2023-5852 - Remote Code Execution, the regex ^title: Detection of (CVE-\d{4}-\d{4,5}) extracts CVE-2023-5852. The file path becomes sigma-rules/CVE-2023-5852.yml, placing the file in a sigma-rules directory within the repository.
Binary File: Not used (the file content is provided as text, not binary data).
File Content:
Expression: {{ $json.sigma_rule }}
Explanation: Provides the full Sigma rule in YAML format, stored in item.json.sigma_rule. This content is written to the file in the repository.
Commit Message:
Expression: Add Sigma rule for {{ $json.sigma_rule.match(/^title: Detection of (CVE-\d{4}-\d{4,5})/)[1] }}
Explanation: Constructs a commit message based on the CVE ID extracted from the Sigma rule’s title. For example, if the CVE ID is CVE-2023-5852, the commit message will be Add Sigma rule for CVE-2023-5852.
Output: The node creates a new file in the GitHub repository for each Sigma rule and returns the input items unchanged.



The node creates a file at sigma-rules/CVE-2023-5852.yml in the hla-7/sigma-rules-repo repository with the Sigma rule content and commits it with the message Add Sigma rule for CVE-2023-5852.
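The three GitHub fields (file path, file content, commit message) all derive from the same CVE ID, so they can be computed once. The sketch below shows that idea as a plain function; the name buildRepoFile, the returned field names, and the ID format check are assumptions for illustration and not part of the node's configuration.

```javascript
// Hypothetical helper: derives the GitHub node's values from one CVE ID.
function buildRepoFile(cveId, sigmaRule) {
  // Assumption: reject IDs that do not look like CVE-YYYY-NNNN (4+ digits),
  // so a malformed value never ends up in a file path.
  if (!/^CVE-\d{4}-\d{4,}$/.test(cveId)) {
    throw new Error(`Unexpected CVE ID format: ${cveId}`);
  }
  return {
    filePath: `sigma-rules/${cveId}.yml`,
    fileContent: sigmaRule,
    commitMessage: `Add Sigma rule for ${cveId}`,
  };
}

// Made-up rule text for illustration only.
const repoFile = buildRepoFile(
  'CVE-2023-5852',
  'title: Detection of CVE-2023-5852 - Remote Code Execution'
);
console.log(repoFile.filePath);
console.log(repoFile.commitMessage);
```

Computing the ID once avoids repeating the same regular expression in the File Path and Commit Message expressions.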

Key Achievements
Automation of Sigma Rule Generation: The workflow successfully automates the creation of Sigma rules for CVEs, reducing manual effort for cybersecurity professionals.
Robust Validation: By leveraging DeepSeek’s AI model, the workflow stores only the Sigma rules that the validator judges syntactically correct and logically sound, improving their reliability for threat detection.
Dual Storage: Valid Sigma rules are stored in both Google Sheets and a GitHub repository, providing flexibility, redundancy, and accessibility for further analysis or integration.
Rate Limit Management: The inclusion of delay nodes ("Delay after Generation" and "Delay after Validation") reduces the risk of hitting API rate limits, helping the workflow run smoothly even with a large number of CVEs.
Error Handling and Debugging: Throughout the workflow, notes on error handling, debugging, and data propagation (e.g., preserving the Sigma rule through the loop) help maintain robustness and transparency.