In this blog post, we’ll explore how Azure Data Explorer (ADX) can store and query logs from Azure Firewall and similar sources. The information is based on a recent implementation at a leading global manufacturing company that uses Microsoft Sentinel, Azure Log Analytics and ADX to store and process large volumes of Azure Firewall logs cost-effectively.
As your cloud footprint grows, monitoring traffic among Azure services and VMs through a cybersecurity lens becomes imperative. Companies must do this while keeping costs under control and supporting the growth needs of the business. Azure Firewall logs, like other logs, can quickly pile up and generate significant ingestion, storage, and query costs. They can feel like money pits, since most logs do not generate direct value. Azure Monitor Log Analytics and Microsoft Sentinel are Azure’s built-in log management and security monitoring solutions, and both are built on Azure Data Explorer; this makes Azure Data Explorer a great companion for customers looking for a long-term, large-scale, cost-effective log storage and analytics solution.
ADX persists data in hot tier blob storage, supports Kusto Query Language (KQL), and can connect to various sources including Event Hub, Event Grid (Blob storage), IoT Hub and Logstash. Sentinel and Log Analytics are built as SaaS-like solutions with automated data engineering: they provide out-of-the-box queries and parsers that make logs readily available for querying. When choosing ADX as a companion solution, you will need to invest a small amount of data engineering effort to parse the logs and get them ready for your network and security monitoring queries. You will also need to maintain these connectors in the future, since any change in the log formats will require reworking them.
Log Analytics Workspace (LAW) is a prerequisite for Microsoft Sentinel, Azure’s SIEM/SOAR solution, which may position LAW as the preferred log management solution for Azure Firewall logs. However, a second look at Azure Firewall logs reveals that some log categories are of security interest, while others are purely operational:
Depending on your requirements, “AZFWThreatIntel” and “AZFWIdpsSignature” can be defined as security-relevant logs to be sent to Sentinel, and all other log types as operational logs to be directed to ADX. Please note: Microsoft Sentinel is a cloud-native SIEM solution that generates security insights and alerts based on multiple types of logs, not only those qualified as security-relevant. Therefore, reducing the set of logs in Sentinel may reduce the insights it generates. You may choose different ways of splitting logs based on your needs.
For a description of Azure Firewall log categories, please consult:
The most effective cost reduction strategy is to send security-relevant logs to LAW/Sentinel and operations-relevant logs to ADX. Please note: operational logs represent the vast majority of the log volume, especially the “AzureFirewallNetworkRule” type. As such, this strategy offers significant cost reduction potential.
Proposed solution: Firewall logs are split between Log Analytics (security) and ADX (operational)
The proposed solution is to split logs using two different diagnostic settings at the source (e.g. Azure Firewall), and send operational logs to ADX via Event Hub. This architecture differs from the one documented in Azure Log Analytics Log Management using Azure Data Explorer – Microsoft Tech Community, as it:
Splits logs at the source
Considers each data pipeline independently
Uses ADX as a full-fledged replacement for Log Analytics for a subset of logs, not only as an archiving solution
Additionally, data can be mutually cross-queried between ADX and Log Analytics.
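As a rough illustration of cross-querying in both directions, the two queries below use the adx() pattern from the Log Analytics side and the ade.loganalytics.io proxy endpoint from the ADX side. Every value in angle brackets is a placeholder to replace with your own, and the AZFWThreatIntel table name assumes the security diagnostic setting writes to resource-specific tables in your workspace; treat this as a sketch to adapt, not as the exact setup used in the implementation described here.

```kusto
// Run from Log Analytics / Sentinel: read the operational table stored in ADX.
// <cluster>, <region> and <database> are placeholders.
adx('https://<cluster>.<region>.kusto.windows.net/<database>').networkFirewallLogs
| take 10

// Run from ADX: read a security table stored in the Log Analytics workspace.
// Assumes the workspace exposes a resource-specific AZFWThreatIntel table.
cluster('https://ade.loganalytics.io/subscriptions/<subscription-id>/resourcegroups/<resource-group>/providers/microsoft.operationalinsights/workspaces/<workspace-name>').database('<workspace-name>').AZFWThreatIntel
| take 10
```

In both cases the identity running the query needs read access on the remote side as well as the local one.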
Costs of Proposed ADX Solution
The client is set to ingest 1 TB of logs daily (roughly 35 million events per month) in the North Europe region, which generates four main costs:
Projected monthly cost (USD)
Event Hub Throughput Units
Event Hub ingress events processing
Create an Event Hub
Follow the steps in “Azure Quickstart – Create an event hub using the Azure portal – Azure Event Hubs | Microsoft Docs” to create an Event Hub. Be mindful of the SKU (Standard or Premium) and the required processing capacity, which is measured in throughput units (TUs) for Standard and in processing units for Premium. For a PoC, we recommend using the Standard SKU and enabling the auto-inflate feature (only available in the Standard SKU), with a minimum of 1 and a maximum of 40 TUs, to have full elasticity and discover the peak capacity needed.
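To get a first feel for the capacity before auto-inflate finds the real peak, you can do a back-of-the-envelope calculation. The sketch below assumes the 1 TB per day mentioned earlier is spread evenly over the day (real traffic is bursty, so peaks will be higher) and uses the Standard tier ingress limit of 1 MB per second per throughput unit. It ignores the separate 1,000 events per second limit per TU, and the variable names are purely illustrative.

```kusto
// Rough average-load estimate, not a sizing recommendation.
let dailyVolumeMB = 1024.0 * 1024.0;   // 1 TB per day expressed in MB (assumes binary units)
let secondsPerDay = 86400.0;
let ingressPerTuMBps = 1.0;            // Standard tier: 1 MB/s of ingress per throughput unit
print avgIngressMBps = dailyVolumeMB / secondsPerDay
| extend avgThroughputUnits = ceiling(avgIngressMBps / ingressPerTuMBps)
```

The average works out to roughly 12 MB per second, which is why a generous auto-inflate ceiling is useful while you are still discovering the peak load.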
Create an ADX Cluster, Tables, Data Connection and an Update Policy
Follow these steps to create an ADX cluster and a database: Quickstart: Create an Azure Data Explorer cluster and database | Microsoft Docs
Run the KQL commands from azure-demos/adx-update-policy-fw-all-formats.kql at main · gbeaud/azure-demos (github.com) in ADX: locate your ADX cluster > Databases > select the database you previously created > Query, paste the code into the query interface, and run each block (you can also use Kusto Explorer or the ADX Web Explorer).
Running the KQL commands creates two tables: rawFirewallLogs, the destination for raw logs from Event Hub, and networkFirewallLogs, the structured, consumer-ready table that an update policy populates every time new data lands in the raw table. The update policy calls the function ExtractMyLogs(), which parses the nested JSON of the raw logs into well-defined, typed, query-ready columns in networkFirewallLogs.
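To make the mechanism concrete, here is a heavily simplified sketch of the raw table, parsing function and update policy pattern. It is not the script from the repository linked above: the column names (records, TimeGenerated, Category, ResourceId, Properties) and the mapping name rawFirewallLogsMapping are assumptions chosen for illustration, and the real script extracts many more typed columns. It also assumes that events arrive from Event Hub in the usual diagnostic log envelope, a JSON object with a records array.

```kusto
// Raw landing table: one dynamic column holding the "records" array of each event.
.create table rawFirewallLogs (records: dynamic)

// JSON ingestion mapping for the Event Hub data connection (hypothetical name).
.create table rawFirewallLogs ingestion json mapping 'rawFirewallLogsMapping' '[{"column":"records","Properties":{"path":"$.records"}}]'

// Structured table; its schema must match the output of the function below.
.create table networkFirewallLogs (TimeGenerated: datetime, Category: string, ResourceId: string, Properties: dynamic)

// Parsing function: one output row per log record, with typed columns.
.create-or-alter function ExtractMyLogs() {
    rawFirewallLogs
    | mv-expand logRecord = records
    | project
        TimeGenerated = todatetime(logRecord["time"]),
        Category      = tostring(logRecord["category"]),
        ResourceId    = tostring(logRecord["resourceId"]),
        Properties    = logRecord["properties"]
}

// Update policy: run ExtractMyLogs() on every new batch landing in the raw table.
.alter table networkFirewallLogs policy update @'[{"IsEnabled": true, "Source": "rawFirewallLogs", "Query": "ExtractMyLogs()", "IsTransactional": false, "PropagateIngestionProperties": false}]'
```

Inside an update policy, the reference to the source table is automatically scoped to the newly ingested records, so the function does not need its own time filter.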
After running the above Kusto code, your ADX setup should look like this:
You now need to connect the database to your Event Hub: Ingest data from event hub into Azure Data Explorer | Microsoft Docs. Your connection settings should look like this:
Enable Diagnostic Settings on the Source
In your Azure Firewall (or other service), create a new diagnostic setting targeting the Event Hub you previously created, and send operations-relevant logs to it. You may then remove these log types from the other diagnostic setting targeting Log Analytics to avoid duplication (especially cost duplication). Your final setup should look like this, with one diagnostic setting sending security-relevant logs to LAW and another sending operations-relevant logs to Event Hub:
Optional step: set a retention policy to manage the log lifecycle
Operational logs can be retained in ADX for a defined period and then removed using a retention policy: Kusto retention policy controls how data is removed – Azure Data Explorer | Microsoft Docs. This is the counterpart of the Archive tier in LAW.
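As an example of what this can look like, the commands below set a retention period on both tables. The 365-day and 7-day values are arbitrary placeholders rather than recommendations; choose periods that match your own compliance and troubleshooting needs.

```kusto
// Keep parsed logs for one year (placeholder value), then let ADX remove them.
.alter-merge table networkFirewallLogs policy retention softdelete = 365d recoverability = disabled

// Keep raw logs only briefly, since they are already parsed into networkFirewallLogs.
.alter-merge table rawFirewallLogs policy retention softdelete = 7d recoverability = disabled

// Check the policy that is now in effect.
.show table networkFirewallLogs policy retention
```

Keeping a few days of raw data is a deliberate choice in this sketch: it lets you replay records through the parsing function if a log format changes.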
Test the End-to-End Solution
It may take up to 20 minutes for the first logs to traverse the entire pipeline; then, running the query “networkFirewallLogs” in ADX should present the results below, where each column has been properly parsed from the raw logs:
If the networkFirewallLogs table is empty, check whether the problem comes from the update policy by running “rawFirewallLogs” in ADX, which should show the raw logs:
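The checks described in this section can be run as the following commands, one block at a time. They rely only on the table and function names used in this post; the columns selected from the ingestion failures output are those ADX returns for that command.

```kusto
// 1. Is parsed data reaching the consumer-ready table?
networkFirewallLogs
| take 10

// 2. Is raw data arriving from Event Hub?
rawFirewallLogs
| count

// 3. Is the update policy attached and enabled?
.show table networkFirewallLogs policy update

// 4. Does the parsing function return rows when run by hand?
ExtractMyLogs()
| take 10

// 5. Were any ingestions rejected, including those triggered by the update policy?
.show ingestion failures
| where Table in ("rawFirewallLogs", "networkFirewallLogs")
| top 20 by FailedOn desc
| project FailedOn, Table, OriginatesFromUpdatePolicy, Details
```

If step 2 returns rows but step 1 does not, the update policy or the parsing function is the likely culprit. If both are empty, look upstream at the data connection and the Event Hub.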
If no logs at all are showing up in ADX after an hour, you may look into Event Hub metrics to see if events are flowing through it or not:
Azure Data Explorer, as a companion to Azure Monitor Log Analytics workspace and Microsoft Sentinel, provides a cost-effective and scalable operational log monitoring solution that complements your NOC/SOC/SIEM/SOAR processes and solutions. With ADX, some one-off data engineering steps are needed to parse raw logs into tables with strongly typed columns using an update policy.
This will allow you to run rich, type aware, Kusto queries on your firewall logs in ADX while reducing your overall Azure bill!
Special thanks to Devang Shah (ADX Principal Program Manager) and Fernando Merino (Cloud Solution Architect) for their support.