
Log Pipelines - logstash and testing

  • brencronin
  • Oct 26, 2023
  • 9 min read

Logstash configurations have three main components:


  1. Input

  2. Filter

  3. Output




Below is an example Logstash configuration file.


# /etc/logstash/conf.d/syslog-to-siem.conf

# ─────────────────────────────────────────
# INPUT — where Logstash receives log data
# ─────────────────────────────────────────
input {
  syslog {
    port => 514          # listen for rsyslog UDP/TCP on port 514
    type => "syslog"     # tag all events with a type for routing later
  }
}

# ─────────────────────────────────────────
# FILTER — parse, enrich, or drop events
# ─────────────────────────────────────────
filter {
  if [type] == "syslog" {
    # parse the syslog message into structured fields
    grok {
      match => {
        "message" => "%{SYSLOGTIMESTAMP:event_time} %{HOSTNAME:source_host} %{GREEDYDATA:log_message}"
      }
    }
    
    # convert the timestamp string into a real date field
    date {
      match => [ "event_time", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
    # drop noisy/irrelevant events before they reach Sentinel
    if [log_message] =~ /healthcheck/ {
      drop {}
    }
    # add a custom field for easier filtering in Sentinel
    mutate {
      add_field => { "environment" => "production" }
    }
  }
}

# ─────────────────────────────────────────
# OUTPUT — where Logstash sends the events
# ─────────────────────────────────────────
output {
  if [type] == "syslog" {
    microsoft-sentinel-logstash-output-plugin {
      client_app_Id          => "your-app-id-here"
      client_app_secret      => "your-secret-here"
      tenant_id              => "your-tenant-id-here"
      data_collection_endpoint => "https://your-dce.ingest.monitor.azure.com"
      dcr_immutable_id       => "DCR-your-dcr-id"
      dcr_stream_name        => "Custom-YourTable_CL"
    }
  }
  # optional: write to local file for debugging (remove in production)
  # file {
  #   path => "/tmp/logstash-debug.log"
  # }
}


Visual walkthrough examples


Below is an example of a firewall sending syslog on port 514 to a Logstash server.




Syslog uses port 514 by default, which means a single Logstash input can simultaneously receive logs from many different device types: firewalls, switches, Linux servers, Windows hosts, and more. This is convenient, but it creates a parsing challenge. Each vendor formats its log messages differently, so the filter block needs separate grok patterns for each log source.


The difficulty is that vendor log formats often share structural similarities. A Cisco firewall log and a Palo Alto firewall log may both open with a timestamp, a hostname, and a severity level before diverging in their middle and tail sections. When formats are too similar, grok patterns can match the wrong log type, silently misparsing fields or routing events to the wrong output. Getting the patterns specific enough to be mutually exclusive, without becoming so rigid that they break on minor format variations, is one of the trickier aspects of managing a multi-source Logstash pipeline.
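The collision problem can be sketched outside Logstash with plain regular expressions, since grok patterns are regular expressions underneath. The two sample lines and all three patterns below are invented for illustration and are not real Cisco or Palo Alto formats. The point is only that a pattern anchored on the shared header matches both sources, while patterns that also require each vendor's distinctive tail stay mutually exclusive.

```python
import re

# Hypothetical sample lines: both open with timestamp, hostname, severity.
VENDOR_A_LINE = "Oct 26 14:03:11 fw-a01 warning: conn denied src=10.0.0.5 dst=10.0.1.9"
VENDOR_B_LINE = "Oct 26 14:03:12 fw-b01 warning: 1,TRAFFIC,drop,10.0.0.7,10.0.1.9"

HEADER = r"(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) (?P<sev>\w+): "

# Too loose: only the shared header is pinned down, so both vendors match.
LOOSE = re.compile("^" + HEADER + r"(?P<rest>.*)$")

# Specific: each pattern also requires that vendor's tail layout.
VENDOR_A = re.compile(
    "^" + HEADER + r"conn (?P<action>\w+) src=(?P<src>\S+) dst=(?P<dst>\S+)$"
)
VENDOR_B = re.compile(
    "^" + HEADER + r"\d+,(?P<kind>\w+),(?P<action>\w+),(?P<src>[\d.]+),(?P<dst>[\d.]+)$"
)


def classify(line):
    """Return the names of every specific pattern that matches the line."""
    patterns = {"vendor_a": VENDOR_A, "vendor_b": VENDOR_B}
    return [name for name, rx in patterns.items() if rx.match(line)]


if __name__ == "__main__":
    for sample in (VENDOR_A_LINE, VENDOR_B_LINE):
        print(bool(LOOSE.match(sample)), classify(sample))
```

A quick check like this against a handful of captured lines per source is a cheap way to confirm that exactly one pattern claims each line before the equivalent grok patterns go into the filter block.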




One effective solution is to configure each device type to send syslog traffic to the centralized server on a different port. This allows you to define completely separate Logstash input blocks, each with its own dedicated filter and output, so logs from each source type are parsed independently with no risk of pattern collision.


For example:


  • Port 514 — Firewall syslog

  • Port 515 - ESXi syslog

  • Port 5140 — Cisco network devices syslog
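One way to keep the port assignments in a single place is to generate the Logstash input block from a small mapping. The sketch below is only an illustration: the type labels are made up, and it assumes the devices send syslog over UDP and that the udp input plugin with its port and type options is what you want.

```python
# Port-to-source mapping taken from the list above; the type labels are illustrative.
PORT_MAP = {514: "firewall", 515: "esxi", 5140: "cisco"}


def render_inputs(port_map):
    """Render one Logstash udp input per port, each tagged with its own type."""
    blocks = []
    for port, log_type in sorted(port_map.items()):
        blocks.append(
            "  udp {\n"
            f"    port => {port}\n"
            f'    type => "{log_type}"\n'
            "  }"
        )
    return "input {\n" + "\n".join(blocks) + "\n}\n"


if __name__ == "__main__":
    print(render_inputs(PORT_MAP))
```

Each type value can then drive its own if [type] == "..." branch in the filter and output blocks. Note that Logstash normally runs as an unprivileged user, so binding ports below 1024 such as 514 and 515 may need extra privileges or a redirect.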


There are two practical hurdles to account for when implementing this approach. The first is on the source side: administrators of each device type need to reconfigure the syslog destination port away from the default. This is supported on virtually all modern systems, but it requires coordination with the relevant team and can take time to roll out across a large estate.


The second hurdle is the network path. Firewalls between the source devices and the Logstash server may only permit port 514 by default. Any non-standard ports will need explicit firewall rules opened, which typically requires a change request through your network team. Neither issue is a blocker, but both should be identified and planned for early, because they are the most common reasons this approach stalls in practice.




If reconfiguring the source device's destination port is not an option, a practical workaround is port mangling using FirewallD directly on the Logstash server. FirewallD is a host-based firewall that intercepts traffic as it arrives at the server, and can redirect incoming packets to a different destination port before Logstash ever sees them, based on conditions such as the source IP address or subnet.


This means a device locked to sending on port 514 can be transparently redirected to port 5141, where a dedicated Logstash input is waiting for that specific device type. From the source device's perspective, nothing changes.


The trade-off is administrative overhead. The source-to-port mapping lives in FirewallD rules rather than in the end devices sending syslog themselves, so the two must be kept in sync. As new device types are onboarded or source IPs change, the FirewallD rules need to be updated alongside the pipeline configuration, something that is easy to overlook and can cause silent routing failures if not managed carefully.
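To reduce the risk of the mapping drifting, the redirect rules can be generated from one source-to-port table rather than typed by hand. This is a rough sketch, assuming the redirect is expressed as a firewalld rich rule with forward-port. The subnets and target ports are placeholders, and zone selection is left out.

```python
# Hypothetical mapping of source subnets to the dedicated Logstash port
# for that device type. Replace with your own addresses and ports.
REDIRECTS = {
    "192.0.2.0/24": 5141,
    "198.51.100.0/24": 5142,
}


def rich_rule(source, to_port, listen_port=514, protocol="udp"):
    """Build a firewalld rich rule redirecting one source subnet to another local port."""
    return (
        f'rule family="ipv4" source address="{source}" '
        f'forward-port port="{listen_port}" protocol="{protocol}" to-port="{to_port}"'
    )


def firewall_cmds(redirects):
    """Return the firewall-cmd invocations for every mapping, plus the final reload."""
    cmds = [
        f"firewall-cmd --permanent --add-rich-rule='{rich_rule(src, port)}'"
        for src, port in sorted(redirects.items())
    ]
    cmds.append("firewall-cmd --reload")
    return cmds


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    print("\n".join(firewall_cmds(REDIRECTS)))
```

Printing the commands rather than executing them keeps the script safe to run anywhere, and the same table can feed the Logstash input definitions so both sides change together.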




Another option is to place a load balancer in front of the Logstash server. In most production deployments this is already a natural fit, since load balancers are commonly introduced to provide high availability across two or more redundant Logstash instances, ensuring log ingestion continues if one node goes down.


Beyond high availability, load balancers such as NGINX have traffic routing capabilities that can solve the port separation problem without touching source device configurations or host firewall rules. Incoming syslog traffic can be inspected and redirected to different destination ports based on conditions like the source IP or subnet, effectively performing the same port mangling as the FirewallD approach but at the network layer and with the added benefit of being centrally managed outside of the Logstash servers themselves.

This makes the load balancer option attractive in environments where one is already present, because the port routing logic can be layered in with relatively low additional complexity. In environments without an existing load balancer, the overhead of introducing one purely for port routing is likely harder to justify unless high availability is also a requirement.




A more resource-intensive option is to deploy a dedicated Logstash instance, either a separate server or a container, for each log type. Each instance handles only one source type, giving it a completely isolated pipeline with no shared filter logic or risk of pattern collision. A load balancer can optionally sit in front to distribute traffic and provide high availability across instances of the same type.


The significant trade-off is operational overhead. Each instance needs to be deployed, monitored, patched, and maintained independently. In environments with many log source types, this can quickly multiply the infrastructure footprint. Containerizing Logstash with something like Docker or Kubernetes mitigates some of that burden by making instances easier to spin up and manage at scale, and is worth considering if your organization already operates a container platform. Without that, managing discrete server instances per log type is generally only justifiable when a particular source generates enough volume or has enough parsing complexity to warrant complete isolation.





A geographically distributed deployment model takes the multi-instance approach a step further. In this architecture, a dedicated Logstash cluster is deployed within each geographic region, processing logs from source devices local to that region and forwarding them to a destination, such as an Elasticsearch cluster, that also resides within the same region.


The primary motivation for this design is data residency. Many regulatory frameworks and organizational security policies require that log data not leave a defined geographic boundary. By keeping the entire ingestion pipeline, from source device through Logstash to the destination store, within a single region, this architecture satisfies those requirements without relying on cross-region data filtering or redaction after the fact.


A secondary benefit is latency. Log sources send to a nearby Logstash cluster rather than traversing a wide-area network to a centralized server, which reduces the risk of dropped events under high load and keeps ingestion performance consistent regardless of where the source device is located.







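In the example configuration at the top of this post, the if [type] == "syslog" guards in both the filter and output blocks are important on a production server. Logstash concatenates every file in conf.d into a single pipeline by default, so if you later add a second input or config file, those guards keep its events from flowing through the syslog filters and into the wrong output.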

The drop {} for healthchecks illustrates a common real-world need — verbose but low-value log lines (heartbeats, health endpoints, scheduled job pings) would otherwise inflate your Sentinel ingestion costs since Azure charges per GB ingested.

The date filter is easy to overlook but critical. Without it, @timestamp will be set to the time Logstash received the event, not when it occurred — which breaks time-based queries in Sentinel.
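The reason the date filter lists two patterns ("MMM  d" and "MMM dd") is that classic RFC 3164 syslog timestamps pad single-digit days with a space and carry no year. The sketch below mimics that parsing in Python to show the gap between when an event occurred and when it was received. The timestamps and the year are made-up values for illustration.

```python
from datetime import datetime


def parse_syslog_time(raw, year):
    """Parse an RFC 3164 style timestamp such as 'Oct  6 14:03:11' or 'Oct 26 14:03:11'.

    The year is passed in because classic syslog timestamps do not carry one.
    """
    # A space in a strptime format matches any run of whitespace, so the
    # space-padded single-digit day and the two-digit day both parse.
    return datetime.strptime(f"{year} {raw}", "%Y %b %d %H:%M:%S")


# Illustrative values: when the pipeline saw the event vs. when it happened.
received_at = datetime(2023, 10, 26, 14, 5, 0)
occurred_at = parse_syslog_time("Oct 26 14:03:11", 2023)
lag_seconds = (received_at - occurred_at).total_seconds()

if __name__ == "__main__":
    print(occurred_at.isoformat(), lag_seconds)
```

Without the date filter, a time-range query would see this event at the receive time rather than the occurrence time, and the error grows whenever the pipeline is backed up.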

The commented-out file output at the bottom mirrors the debugging technique in Stage 6 of the troubleshooting guide below. It is ready to uncomment if you need to check whether events are reaching the output stage.



Logstash Troubleshooting Guide


Since this is a production server, the approach is organized from least to most invasive — read-only verification first, then targeted config inspection, before touching anything live.

Stage 1 — Verify logs are reaching Logstash from rsyslog

Before blaming Logstash config, confirm the data is actually arriving at the OS level.

Check if the syslog port is open and listening:

bash

ss -tulnp | grep -E '514|5044'
# 514 = standard syslog (UDP/TCP), 5044 = Beats/rsyslog TCP input
# confirm Logstash or rsyslog is bound to the expected port

Watch live incoming traffic on the wire:

bash

sudo tcpdump -i any port 514 -n -c 50
# or if using TCP 5044
sudo tcpdump -i any port 5044 -n -c 50

If you see packets here, the source server is sending. If silent, the problem is upstream (rsyslog on the source, firewall, network routing).

Check rsyslog on the Logstash server (if rsyslog is the receiver, forwarding to Logstash):

bash

sudo systemctl status rsyslog
sudo tail -f /var/log/syslog        # Debian/Ubuntu
sudo tail -f /var/log/messages      # RHEL/CentOS

Verify rsyslog is actually forwarding to Logstash:

bash

sudo cat /etc/rsyslog.conf
sudo ls /etc/rsyslog.d/
# Look for lines like:
# *.* @@127.0.0.1:5044   (TCP forward to Logstash)
# *.* @127.0.0.1:5044    (UDP forward)

Stage 2 — Verify Logstash process health

bash

sudo systemctl status logstash
sudo journalctl -u logstash -n 100 --no-pager
# Look for: pipeline errors, plugin load failures, OOM, connection refused

Check Logstash's own log:

bash

sudo tail -100 /var/log/logstash/logstash-plain.log
sudo grep -i "error\|warn\|fatal\|exception" /var/log/logstash/logstash-plain.log | tail -50

Key things to look for in the log:

  • Pipeline started — confirms the pipeline actually loaded

  • connection refused or unable to connect — output to Sentinel is failing

  • mapping error or serialization error — events are being dropped at output

  • filter error — a filter plugin is crashing and potentially dropping events

Stage 3 — Use the Logstash Monitoring API (non-invasive, read-only)

This is the most powerful non-invasive tool. The API exposes live pipeline metrics without touching any config.

Check overall node stats:

bash

curl -s 'http://localhost:9600/?pretty'
# Returns: version, pipeline names, JVM info

Pipeline-level event counters — this is your key diagnostic:

bash

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

Look specifically at these counters per pipeline:

json

"events": {
  "in": 15400,       // events received by input
  "filtered": 15400, // events that passed through filters
  "out": 15380,      // events sent to output
  "duration_in_millis": ...,
  "queue_push_wait_in_millis": ...
}

If in is much larger than out, events are being dropped or are stuck somewhere in the pipeline. Keep in mind that intentional drop {} filters also reduce out. If in is 0, logs are not reaching Logstash at all.
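The comparison can be scripted so it is easy to repeat. This is a minimal sketch that assumes the JSON from the /_node/stats/pipelines endpoint has already been loaded into a dictionary (for example with json.load). The sample payload is trimmed to the counters quoted above, and summarize_pipelines is a hypothetical helper name.

```python
def summarize_pipelines(stats):
    """Return {pipeline_id: (in, filtered, out, in_minus_out)} from a stats payload."""
    summary = {}
    for name, pipeline in stats.get("pipelines", {}).items():
        events = pipeline.get("events", {})
        ev_in = events.get("in", 0)
        ev_filtered = events.get("filtered", 0)
        ev_out = events.get("out", 0)
        summary[name] = (ev_in, ev_filtered, ev_out, ev_in - ev_out)
    return summary


# Payload trimmed to the counters quoted above; "main" is the default pipeline id.
sample = {
    "pipelines": {
        "main": {"events": {"in": 15400, "filtered": 15400, "out": 15380}}
    }
}

if __name__ == "__main__":
    for name, (ev_in, ev_filtered, ev_out, gap) in summarize_pipelines(sample).items():
        print(f"{name}: in={ev_in} filtered={ev_filtered} out={ev_out} gap={gap}")
```

Taking two readings a minute apart and comparing the gap is more telling than a single snapshot, since a small gap that stays constant is usually just events in flight.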

Check individual plugin stats (input, filter, output):

bash

curl -s 'http://localhost:9600/_node/stats/pipelines/<pipeline_name>?pretty'
# Replace <pipeline_name> with your pipeline id (e.g. "main")

Check the persistent queue if enabled:

bash

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | grep -A5 "queue"
# queue.events_count > 0 and growing = output is backing up (Sentinel connection issue)

Check JVM and memory pressure:

bash

curl -s 'http://localhost:9600/_node/stats/jvm?pretty' | grep -E "heap|gc"
# heap_used_percent staying above ~85% points to GC pressure, which can stall the pipeline and cause UDP inputs to lose events

Stage 4 — Inspect the pipeline configuration

bash

sudo ls /etc/logstash/conf.d/
sudo cat /etc/logstash/conf.d/*.conf

Verify the input block is listening on the right port/protocol:

input {
  syslog { port => 514 }          # direct syslog input
  # or
  tcp { port => 5044 codec => json_lines }
  # or
  beats { port => 5044 }          # if a Beats shipper (e.g. Filebeat) forwards the logs
}

Check filter blocks for drop conditions:

bash

sudo grep -n "drop\|discard\|if.*drop" /etc/logstash/conf.d/*.conf

A misconfigured if condition with a drop {} action silently discards events — this is a common cause of missing logs.

Verify the output block points to the right Sentinel workspace:

output {
  microsoft-sentinel-logstash-output-plugin {
    client_app_Id => "..."
    client_app_secret => "..."
    tenant_id => "..."
    data_collection_endpoint => "https://<dce>.ingest.monitor.azure.com"
    dcr_immutable_id => "DCR-..."
    dcr_stream_name => "Custom-..."
  }
}

Confirm the data_collection_endpoint, dcr_immutable_id, and dcr_stream_name values match exactly what is configured in your Azure Data Collection Rule.

Stage 5 — Test the output path to Sentinel independently

Verify network connectivity from Logstash to the DCE endpoint:

bash

curl -v https://<your-dce>.ingest.monitor.azure.com
# Any HTTP response (even a 403 or 404) shows the network path works; a connection refused or timeout means a network/firewall issue

Check if the Microsoft Sentinel plugin itself is installed:

bash

sudo /usr/share/logstash/bin/logstash-plugin list | grep sentinel
# Should return the plugin named in your output block, e.g. microsoft-sentinel-logstash-output-plugin

Test authentication independently (confirms the app registration credentials work):

bash

curl -X POST \
  "https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token" \
  -d "client_id=<client_app_id>&client_secret=<secret>&scope=https://monitor.azure.com//.default&grant_type=client_credentials"
# A valid access_token in the response confirms auth is working

If this returns an error, the app registration credentials in the Logstash config are wrong or expired — this is a common silent failure that causes events to be dropped at the output stage with no obvious indicator in the main log.
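One thing the curl form above does not handle is URL-encoding: a client secret containing characters such as & or + will be mangled when pasted straight into -d. The sketch below builds the same client-credentials request with the standard library so the fields are encoded properly. The function names are hypothetical, and the tenant, client id, and secret are placeholders you would supply.

```python
import json
import urllib.parse
import urllib.request

SCOPE = "https://monitor.azure.com//.default"  # same scope as the curl example


def token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) the client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,  # urlencode escapes &, +, = and similar
        "scope": SCOPE,
        "grant_type": "client_credentials",
    }).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def has_access_token(req):
    """Send the request and report whether the response carries an access_token.

    Rejected credentials surface as urllib.error.HTTPError rather than False.
    """
    with urllib.request.urlopen(req, timeout=10) as resp:
        return "access_token" in json.load(resp)
```

Calling has_access_token(token_request(tenant, app_id, secret)) from the Logstash host exercises the same credentials and the same outbound network path the plugin relies on.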

Stage 6 — Safe live traffic test (production-safe)

If you still can't pinpoint where events are being lost, you can temporarily add a file output to the pipeline alongside the existing Sentinel output, without removing it. This writes events to a local file so you can confirm events are reaching the output stage:

output {
  microsoft-sentinel-logstash-output-plugin { ... }   # existing — leave untouched
  file {
    path => "/tmp/logstash-debug-%{+YYYY-MM-dd}.log"
  }
}

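Reload the Logstash config, either by sending SIGHUP to the Logstash process (for example sudo systemctl kill -s HUP logstash) or by letting config.reload.automatic pick up the change if it is enabled in logstash.yml, then watch the file. The stock Logstash systemd unit typically does not support systemctl reload, and the API on port 9600 has no reload endpoint.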

bash

# the %{+YYYY-MM-dd} in the path is rendered in UTC from each event's @timestamp
tail -f /tmp/logstash-debug-$(date -u +%Y-%m-%d).log

If events appear here, the pipeline is working up to the output stage and the issue is specifically in the Sentinel plugin or connectivity. If the file stays empty, the problem is in the input or filter stage.

Important: Remove the file output and reload after you've confirmed — writing to disk at production volume will fill your disk quickly.



Log Pipeline Testing


synthetic log generator
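A minimal sketch of what such a generator could look like follows. It assumes RFC 3164 style lines sent over UDP to a Logstash syslog input, and every name in it (test-host, the message bodies, the default target) is a placeholder. Priority 134 corresponds to facility local0 at severity info.

```python
import socket
from datetime import datetime


def format_syslog(host, message, when, pri=134):
    """Format one RFC 3164 style line: <PRI>Mmm dd HH:MM:SS host message."""
    day = f"{when.day:2d}"  # RFC 3164 pads single-digit days with a space
    stamp = f"{when.strftime('%b')} {day} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {host} {message}"


def synthetic_batch(count, when):
    """Build `count` numbered test lines, every fifth one a healthcheck."""
    lines = []
    for i in range(count):
        body = "healthcheck ok" if i % 5 == 0 else f"synthetic event seq={i}"
        lines.append(format_syslog("test-host", body, when))
    return lines


def send_udp(lines, target=("127.0.0.1", 514)):
    """Send each line as its own UDP datagram and return how many were sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("utf-8"), target)
    return len(lines)


if __name__ == "__main__":
    batch = synthetic_batch(10, datetime.now())
    print(f"sent {send_udp(batch)} datagrams")
```

Because the example configuration drops anything matching /healthcheck/, a batch of 10 lines should produce 8 events at the output. Comparing the number sent with the number that arrive is a simple end-to-end test of both parsing and the drop rule.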

