Kernel Observability
William Patterson  

Monitor Packet Drops with eBPF

You feel the pressure when network flows stall, and packet drop monitoring with eBPF is the fastest way to surface the kernel’s reason for loss so you can act with confidence.

We outline a practical path—attach an eBPF program to a tracepoint, capture events in the Linux kernel, and export clear flow-level data to dashboards.

This approach moves you from guessing to precise root cause work—showing real drop reasons across TCP, UDP, SCTP, and ICMP families.

Key Takeaways

  • Hooking tracepoints lets you capture loss reasons right where they occur in the kernel.
  • We send event data from eBPF programs to user space and expose metrics for dashboards.
  • OpenShift Network Observability maps these signals into filters, graphs, and topology views.
  • RHEL 9.2+ kernels support the drop-reason API; privileged agents are required.
  • You’ll get concrete steps, code paths (Go exporter and Python/BCC), and test methods to validate results.

What you’ll build and why packet drop insights matter for modern networks

I’ll walk you through building a lightweight exporter that turns kernel tracepoints into clear, actionable signals for teams. The exporter attaches a small program to tcp_retransmit_skb and skb/kfree_skb, captures events, and enriches each record with TCP state and flags.

The goal is practical: expose Prometheus metrics over HTTP on port 2112 and provide a simple endpoint for dashboards and alerts. That gives you not just counts, but reasons—NO_SOCKET, PKT_TOO_SMALL, and similar causes—so ops can triage faster.

  • Attach tracepoint, filter in-kernel, push compact data to user space.
  • Export metrics and labels for flow-level analysis and topology views.
  • Keep object files and config files organized with consistent naming.

Tracepoint | What it shows | Typical output
tcp_retransmit_skb | Retransmit events and TCP state | Retransmit rate, flow id, flags
skb/kfree_skb | Why the skb was freed (drop reason) | Cause labels (NO_SOCKET, PKT_TOO_SMALL)
Exporter | Prometheus endpoint | HTTP :2112, metrics and basic status

For a quick start with tools and examples, see this guide to BCC tooling. In OpenShift, these signals map to UI filters—Fully dropped vs. Containing drops, TCP state filters, and top-cause graphs—so teams jump from metric to root cause without guesswork.

Requirements, kernel support, and environment setup

Start by confirming your Linux kernel features and access model — this keeps the setup safe and reproducible.

First, verify the kernel baseline. RHEL 9.2+ exposes the standardized drop-reason tracepoint that gives high-fidelity loss reasons. Older kernels lack that API and deliver less detail.

Privileges and safe access

Production hosts need deliberate privilege choices. In OpenShift, enable PacketDrop by creating a FlowCollector with ebpf.privileged: true and features: PacketDrop.

Limit privileged runs to validated nodes and set resource requests to protect other services. On Ubuntu 21.10+ and 22.04+ unprivileged BPF is disabled by default; re-enable with sysctl kernel.unprivileged_bpf_disabled=0 only for short-term dev work.

Tools and file layout

Choose a toolchain that fits your team. For Go builds, generate vmlinux.h with bpftool and compile C code using clang/LLVM. For rapid testing, install python3-bpfcc, bpfcc-tools, and libbpfcc for Python/BCC workflows.

  • Place object and header files in a stable folder (src/bpf/, build/, config/).
  • Plan port exposure for metrics (default exporter port 2112) and confirm user space can reach exported data.
  • Document which tracepoints and functions your programs attach to for reproducibility.

Area | Action | Why it matters
Kernel check | Confirm RHEL 9.2+ or equivalent | Enables standardized drop-reason tracepoints for better data
Privileges | Use ebpf.privileged: true only after review | Protects security posture and resource budgets
Toolchain | clang/LLVM, bpftool, Prometheus, Go or Python/BCC | Compiles code, generates headers, and exports metrics

How it works: eBPF programs, maps, and user space pipelines

Let’s unpack how tracepoints and maps cooperate to move structured data from kernel space into a running exporter.

We attach small programs to two tracepoints — skb/kfree_skb for drop events and tcp_retransmit_skb for retransmissions. That captures the event where the kernel records a reason or retransmit action.

Inside the kernel, these programs write compact records into eBPF maps. A perf buffer or ring buffer then moves those records into user space with minimal overhead.

Portable builds and data flow

We use BPF CO-RE with a generated vmlinux.h header so the same code runs across Linux kernel variants. That keeps the program portable and maintainable.

  • Maps stage structured fields — addrs, ports, reasons, and tcp state.
  • Perf vs ring buffers — choose ring buffers on newer kernels for throughput and lower syscall cost.
  • User space reads via a non-blocking loop, decodes events, and updates Prometheus metrics on :2112.

Component | Role | Notes
Tracepoints | Emit event when kernel frees skb or retransmits TCP | skb/kfree_skb and tcp_retransmit_skb
eBPF maps | Stage structured event data | Used for aggregation and short-term state
Perf/Ring buffer | Transport events to user space | Ring buffer preferred on modern kernels
User space exporter | Decode, label, and expose metrics | HTTP :2112 for Prometheus scrapes

We keep object files, program code, and config files together to simplify CI and rollout. The user space loop decodes events quickly, avoids blocking, and handles clean shutdowns so no data is lost during upgrades.
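
To make that loop concrete, here is a minimal Go sketch using the cilium/ebpf ring buffer reader. The map name (events), the dropEvent layout, and the handler are illustrative placeholders rather than the exact pipeline code; adjust them to match your kernel-side program.

package exporter

import (
	"bytes"
	"encoding/binary"
	"errors"
	"log"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/ringbuf"
)

// dropEvent mirrors the record the kernel program writes into the ring buffer.
// The layout here is hypothetical; it must match your eBPF C struct exactly.
type dropEvent struct {
	SrcAddr uint32
	DstAddr uint32
	SrcPort uint16
	DstPort uint16
	State   uint8
	Reason  uint32
}

// readDrops runs the user-space loop: read a record, decode it, hand it off.
func readDrops(events *ebpf.Map, handle func(dropEvent)) error {
	rd, err := ringbuf.NewReader(events)
	if err != nil {
		return err
	}
	defer rd.Close()

	for {
		rec, err := rd.Read()
		if errors.Is(err, ringbuf.ErrClosed) {
			return nil // clean shutdown: the reader was closed during upgrade or exit
		}
		if err != nil {
			log.Printf("ringbuf read: %v", err)
			continue
		}
		var ev dropEvent
		if err := binary.Read(bytes.NewReader(rec.RawSample), binary.LittleEndian, &ev); err != nil {
			log.Printf("decode: %v", err)
			continue
		}
		handle(ev) // e.g. increment a labeled Prometheus counter
	}
}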

Packet drop monitoring with eBPF in OpenShift Network Observability

Turning on PacketDrop in OpenShift surfaces kernel signals so you can separate host-stack issues from OVS pipeline behavior.

[Image: an OpenShift Network Observability dashboard showing packet drop metrics collected via eBPF probes.]

Enabling PacketDrop: FlowCollector spec with privileged eBPF

Enable PacketDrop by applying a FlowCollector with ebpf.privileged: true and features: PacketDrop. This single change turns on tracepoint-driven programs that feed labeled events into the flow pipeline.
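
For reference, a minimal FlowCollector sketch might look like the following; the apiVersion and exact field layout depend on your Network Observability operator version, so treat this as illustrative and check the operator's FlowCollector reference before applying it.

apiVersion: flows.netobserv.io/v1beta2   # version varies by operator release
kind: FlowCollector
metadata:
  name: cluster
spec:
  agent:
    ebpf:
      privileged: true        # required for the PacketDrop tracepoint hooks
      features:
        - PacketDrop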

Drop categories: core subsystem vs OVS-based reasons

Two reason families appear: core subsystem (SKB_DROP_REASON) and OVS-based reasons on supported kernels. Seeing both side-by-side helps you tell host issues from pipeline actions.

OCP UI enhancements and overview panels

The console adds filters—Fully dropped, Containing drops, Without drops, and All—plus selectors for Packet drop TCP state and latest cause. Overview panels show total dropped rate, top states, and top causes.

  • Topology edges with drops render in red for quick spotting.
  • Export file snapshots let you attach evidence to postmortems.
  • Expect ~22% vCPU and ~9% memory uplift for the flowlogs-pipeline process.

Feature | What you get | Notes
FlowCollector flag | Privileged tracepoint programs | Set ebpf.privileged: true
Reason categories | Core subsystem vs OVS | Consistent tracepoint naming
UI & graphs | Filters, top rates, causes | Red topology edges; exportable file views

Hands-on path with Go: load eBPF, attach tracepoints, expose Prometheus

We’ll use Go and cilium/ebpf to load verified bytecode, bind tracepoints, and turn events into labeled metrics you can chart.

Loading and verifying bytecode

Compile your C into an object file and generate vmlinux.h with bpftool:

bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
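
Then compile the kernel-side C into an object file with clang; the file names below are illustrative, and the flags follow the usual CO-RE build convention:

clang -O2 -g -target bpf -c src/bpf/drop_monitor.bpf.c -o build/drop_monitor.bpf.o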

In Go, call ebpf.LoadCollectionSpec and ebpf.NewCollectionWithOptions with verbose verifier logs enabled. That gives early feedback if the kernel rejects a function or map.
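
A minimal sketch of that load step, assuming a recent cilium/ebpf release (the object path is illustrative):

package exporter

import (
	"errors"
	"log"

	"github.com/cilium/ebpf"
)

// loadObjects loads the compiled eBPF object and returns the collection,
// printing the verifier's output if the kernel rejects a program.
func loadObjects(path string) (*ebpf.Collection, error) {
	spec, err := ebpf.LoadCollectionSpec(path) // e.g. "build/drop_monitor.bpf.o"
	if err != nil {
		return nil, err
	}
	coll, err := ebpf.NewCollectionWithOptions(spec, ebpf.CollectionOptions{
		Programs: ebpf.ProgramOptions{
			LogLevel: ebpf.LogLevelInstruction, // verbose verifier feedback
		},
	})
	if err != nil {
		var verr *ebpf.VerifierError
		if errors.As(err, &verr) {
			log.Printf("verifier output:\n%+v", verr)
		}
		return nil, err
	}
	return coll, nil
}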

Attaching tracepoints and reading events

Attach via link.Tracepoint("tcp", "tcp_retransmit_skb", ...). Create a perf reader with perf.NewReader(coll.Maps["events"], os.Getpagesize()).

In user space run a non-blocking loop: read raw records, decode fields, and update Prometheus metrics with meaningful labels — source, destination, tcp state, and reason name.
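
Here is a sketch of the attach-and-read path built on those calls; the program name (trace_retransmit) and map name (events) are placeholders for whatever your object file defines:

package exporter

import (
	"log"
	"os"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/perf"
)

// attachAndRead binds the tracepoint and pumps perf events to a handler.
func attachAndRead(coll *ebpf.Collection, handle func([]byte)) error {
	// "trace_retransmit" stands in for your SEC("tracepoint/tcp/tcp_retransmit_skb") function.
	tp, err := link.Tracepoint("tcp", "tcp_retransmit_skb", coll.Programs["trace_retransmit"], nil)
	if err != nil {
		return err
	}
	defer tp.Close()

	rd, err := perf.NewReader(coll.Maps["events"], os.Getpagesize())
	if err != nil {
		return err
	}
	defer rd.Close()

	for {
		rec, err := rd.Read()
		if err != nil {
			return err // returns perf.ErrClosed when the reader is closed on shutdown
		}
		if rec.LostSamples > 0 {
			log.Printf("lost %d samples; consider a larger per-CPU buffer", rec.LostSamples)
			continue
		}
		handle(rec.RawSample) // decode fields and update labeled Prometheus metrics
	}
}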

Prometheus scrape config and HTTP exporter

Serve metrics with promhttp on :2112. Add a simple scrape job in prometheus.yml that targets 127.0.0.1:2112. Use descriptive metric names and source/dest labels for easy queries.
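
A minimal exporter sketch; the metric and label names are illustrative rather than a fixed schema:

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// dropEvents counts kernel drop events, labeled for flow-level queries.
var dropEvents = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "packet_drop_events_total",
		Help: "Packet drop events observed via kernel tracepoints.",
	},
	[]string{"source", "destination", "tcp_state", "reason"},
)

func main() {
	prometheus.MustRegister(dropEvents)

	// The event loop calls something like this for each decoded record:
	// dropEvents.WithLabelValues(src, dst, state, reason).Inc()

	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}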

Step | What | Why
Build | Compile C → object file | Deterministic artifacts for CI
Load | ebpf.LoadCollectionSpec + NewCollectionWithOptions | Verifier logs surface issues early
Attach | link.Tracepoint & perf reader | Low-overhead event flow to user space
  • Structure source code and files so CI produces the same object for dev and prod.
  • Test TCP behavior with tc netem to inject loss/delay and validate that counters rise.
  • Checklist: load without warnings, attach tracepoints, confirm events flow, validate Prometheus collection.
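
From the host, you can spot-check the load and attach steps with bpftool (run with the same privileges used to load the programs):

bpftool prog show
bpftool link list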

Alternative path with Python/BCC: rapid prototyping and DNS visibility

I like to prototype with Python/BCC when I want quick feedback. It’s a fast way to join network analytics and process context without a full build pipeline.

Use socket filters to capture DNS (UDP 53) traffic and attach kprobes to functions like execve to learn which process generated an event. Emit structured records with BPF_PERF_OUTPUT and read them in user space to correlate queries with PIDs and names.

Socket filters, kprobes, and mapping to processes

Attach a packet filter that parses IP/UDP headers and applies a Berkeley Packet Filter expression for efficiency.

In the same program add kprobes that capture process context. Use helpers—bpf_get_current_pid_tgid and bpf_get_current_comm—to join network data with the running process.

Privilege considerations and enabling unprivileged BPF

On Ubuntu 21.10+ and 22.04 LTS, unprivileged BPF is disabled by default. Enable it temporarily with:

sudo sysctl kernel.unprivileged_bpf_disabled=0

Only use that setting for short-term dev work—revert it for production to preserve security posture.

  • I recommend installing python3-bpfcc, bpfcc-tools, libbpfcc, and linux-headers-$(uname -r).
  • Keep prototype source code and header files together for quick edits before porting to compiled code.
  • Read perf buffers in a loop and handle backpressure to avoid lost events.

Action | What it captures | Why it helps
Socket filter | DNS UDP 53 packets | Low-cost filtering with Berkeley Packet Filter syntax
Kprobe (execve) | Process context at syscall | Maps network events to process name and PID
BPF_PERF_OUTPUT | Structured events | Reliable transport to user space for correlation

Analyzing and visualizing drops across flows, tables, and topology

When red edges appear in a topology view, they point you straight to the resource that needs attention. I begin in the Traffic flows table where sent bytes and packets show as green and failed counts show in red.

Open a side panel for any flow to see cause labels, TCP state, and links to documentation. That panel ties flow-level data to the application and the destination so you can act fast.

Traffic flows: bytes/packets vs drops and side-panel details

Compare high-throughput flows with those that have elevated failure rates. Use labels — cause, state, source, destination — to answer application-level questions without writing queries.

Topology view: highlighting edges with red drop indicators

Topology marks failing edges in red so you can follow the path upstream or to a destination process. That visual cue reduces time to isolate issues across resources.

  • Example panels: top dropped rate, top causes, and top TCP states for quick trend spotting.
  • Process context helps separate app restarts from infra faults.
  • Capture screenshots or export views to share evidence during incident reviews.

View | What to check | Why it helps
Traffic table | Bytes, packets, failed counts | Find flows with mismatched throughput vs success
Side panel | Cause, state, docs link | Root-cause context without extra queries
Topology | Red edges to destination/process | Quickly identify upstream or downstream issues

Workflow: spot red signals, open the side panel, correlate with process and time, then apply corrective action — configuration, code change, or resource adjustment.

Testing, troubleshooting, and practical scenarios

A quick lab test can tell you whether the exporter, kernel hooks, or network are at fault.

Simulate loss and latency

Use netem to inject controlled loss and delay and validate that TCP counters and metrics move as expected.

Example:
tc qdisc add dev eth0 root netem loss 10% delay 100ms — be careful: high values can break SSH. Roll back with tc qdisc del dev eth0 root.

Common causes and what they reveal

NO_SOCKET signals an unreachable destination port — generate traffic to a closed port and confirm the reason appears in metrics.
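
For example, send a request to any port with no listener (port 81 here is arbitrary) and watch the NO_SOCKET counter:

curl --max-time 2 http://<target-host>:81/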

PKT_TOO_SMALL usually points to parsing or MTU issues. OVS_DROP_LAST_ACTION (RHEL 9.2+) indicates policy or pipeline decisions.

Performance footprint and system impact

Expect the flowlogs-pipeline (FLP) to use ~22% more vCPU and ~9% more memory with PacketDrop enabled; other components usually rise by a smaller margin.

  • Validate tracepoint attachments by checking config files and logs for permission or name errors.
  • Avoid high label cardinality — keep metric names stable and filter labels that cause explosion.
  • If the program loop lags, raise buffer sizes or simplify per-event work to prevent lost events.
  • Isolate faults by toggling one variable at a time: exporter, netem, or kernel setting.

Scenario | Indicator | Quick remediation
Closed destination port | NO_SOCKET in metrics and UI | Confirm service, adjust firewall or port config
Small frames / parsing | PKT_TOO_SMALL logged | Check MTU, fragment handling, and parser code
OVS policy | OVS_DROP_LAST_ACTION seen on RHEL 9.2+ | Review OVS flows and ACL rules; test with policy off

Next steps to deepen your eBPF monitoring practice

Start small and iterate. Add DNS socket filters or process-mapping probes with Python/BCC to broaden visibility beyond TCP. Prototype quickly, then port stable logic to CO‑RE builds so your code stays portable across Linux kernel versions.

Script vmlinux.h generation and keep a tidy file layout—object files, headers, and configs—so builds remain reproducible. Measure performance as you add probes and track system resource use in OpenShift.

Document alerts, runbooks, and clear names for each metric and function. Use dashboards for rising retransmissions or specific reasons and tune thresholds with real incident data.

Finally, review results in retrospectives—what worked, what cost too much—and prioritize the next iteration. That close-the-loop habit scales knowledge and reduces time to repair.

FAQ

What is the goal of Monitor Packet Drops with eBPF?

I want to show how kernel-level observability lets you capture and classify lost network traffic in real time. That includes attaching programs to tracepoints, collecting reasons from kernel subsystems and Open vSwitch, and exporting labeled metrics for dashboards and alerts.

Why do drop insights matter for modern networks?

Visibility into lost traffic helps you separate congestion, socket issues, or policy drops from application bugs. This improves incident triage, capacity planning, and security investigations — and it helps owners prioritize fixes based on real impact.

What kernel features and versions do I need?

Use a recent Linux kernel that exposes drop-reason tracepoints — RHEL 9.2+ and comparable kernels include the relevant hooks. You also benefit from CONFIG_BPF and tracepoint support compiled into the kernel for reliable data collection.

What privileges and safety practices are required to enable this on production hosts?

You need elevated privileges to load eBPF programs and manage maps. Prefer policies that limit capabilities, use verified bytecode, run collectors with least privilege, and enable runtime seccomp or SELinux profiles. Consider using privileged DaemonSets only where policy allows.

Which tools are recommended for building the pipeline?

I typically use clang/LLVM for compiling, bpftool for inspection, and cilium/ebpf when writing Go. Python with BCC is great for quick prototyping. For metrics and visualization, use Prometheus and Grafana and expose an HTTP exporter on a known port such as 2112.

How do tracepoints like skb/kfree_skb and tcp_retransmit_skb help?

These tracepoints capture lifecycle events for sk_buffs and TCP retransmits — giving you context about why memory was freed or why data was resent. Attaching programs there yields granular reason codes and metadata to label each event.

How do bpf maps, perf buffers, and ring buffers move data to user space?

Maps store counters and state, while perf and ring buffers stream event data efficiently to user space consumers. Maps are good for aggregated metrics; perf/ring buffers suit high-frequency event delivery to exporters or analyzers.

What is BPF CO-RE and why use vmlinux.h?

CO-RE (Compile Once — Run Everywhere) makes eBPF bytecode portable across kernel versions. Using a vmlinux.h or BTF-enabled headers lets the loader resolve kernel structure offsets safely without recompiling for each kernel build.

How do I turn kernel events into Prometheus metrics and labels?

Read events in your user-space exporter, map reason codes and metadata to metric labels (namespace, pod, source/destination, port, protocol), and increment counters or histograms. Ensure cardinality stays controlled to protect Prometheus performance.

How does OpenShift enable PacketDrop with Network Observability?

OpenShift’s FlowCollector can enable privileged eBPF probes when PacketDrop is selected. The collector configures kernel attachments and aggregates reasons across nodes, integrating results with the OCP UI and observability panels.

What are common drop categories I’ll see in OpenShift?

Expect core kernel categories and OVS-specific reasons. Examples include socket-level rejections, MTU or malformed packet issues, and Open vSwitch decisions such as policy or action-based drops. Each category maps to a defined reason code.

What UI features help explore drop data in OCP?

Filters, TCP state breakdowns, and the latest-cause field let you drill into incidents. Overview panels show dropped rate, top states, and top causes so teams can quickly find hotspots and affected workloads.

How do I load eBPF bytecode with Go and cilium/ebpf?

Compile CO-RE bytecode with clang/LLVM, embed or load the ELF with cilium/ebpf, verify maps and program types, and attach to tracepoints. Then read events from perf or ring buffers and translate them into Prometheus metrics.

How do I attach to tracepoints and label metrics in Go?

Use the cilium/ebpf library to create links to tracepoints and start a polling loop for perf events. Enrich each record with metadata — pod name, namespace, source IP, dest port — then increment labeled counters exposed to Prometheus.

What Prometheus configuration do I need for scraping?

Run an HTTP exporter on a stable port (commonly 2112), expose a /metrics endpoint, and add a scrape job in Prometheus targeting node or pod endpoints. Use relabeling to add cluster or node labels if needed.

When should I choose Python/BCC instead of Go?

Use Python/BCC for rapid prototyping, ad hoc debugging, or when you need quick DNS and process-level visibility. It simplifies kprobe/socket filter creation but is less suited for production-grade exporters than Go-based tooling.

How can socket filters and kprobes help with visibility and DNS issues?

Socket filters let you inspect traffic at the socket layer; kprobes instrument specific kernel functions to attribute packets to processes or syscalls. Together they reveal which process or DNS lookup led to a drop or retransmit.

What privilege trade-offs exist when enabling unprivileged BPF?

Unprivileged BPF lowers the need for root but restricts which program types you can load. It’s safer for shared environments but may block tracepoints or map types needed for full observability. Evaluate against your security policy.

How do I analyze drops across flows, tables, and topology?

Correlate metrics by flow identifiers and topology metadata. Use side panels to show per-flow bytes, packets, and drops. Highlight edges with high drop rates in topology maps to find chokepoints or misconfigurations.

How can I simulate loss or latency to validate monitoring?

Use tc netem to inject delay, packet loss, or reordering on test nodes. Run traffic generators and verify that your probes report the expected reasons and rates. This helps confirm instrumentation correctness before production rollout.

What are frequent kernel and OVS drop causes I should expect?

You’ll commonly see rejections like NO_SOCKET, PKT_TOO_SMALL, and OVS_DROP_LAST_ACTION. Each maps to a specific subsystem reason — use kernel docs or OVS source to map numeric codes to human-friendly labels.

What performance footprint should I expect from PacketDrop collectors?

Properly tuned probes add modest CPU and memory overhead. On constrained hosts you may see small FLP vCPU and memory deltas. Measure in staging and use sampling, aggregation, and map eviction to limit resource use.

What are practical next steps to deepen an eBPF monitoring practice?

Start with a small footprint collector in a dev namespace, validate traces with tc netem, iterate on label cardinality, and integrate telemetry into Prometheus/Grafana. Expand to OpenShift flows once you’ve validated correctness and performance.