LegionTrap

Why the Name

Legion

A reference to the French Foreign Legion, where Stefan served. Beyond that personal history, a legion implies structure, discipline, and a systematic approach to difficult problems. That operational mindset — preparation, precision, and methodical execution — shapes how this project is built and how investigations are approached.

Trap

In cybersecurity, a trap is a mechanism for detection, observation, and understanding. Traps do not simply block threats — they reveal how threats operate. The goal here is not only to defend, but to investigate: to understand attacker behaviour, expose hidden patterns, and build knowledge from direct engagement with real security problems.

Together, the name reflects the project's character: disciplined investigation into how threats operate.

Why it exists

The Indicator Problem

Most threat intelligence is organised around indicators: IP addresses, domain names, file signatures. The operational logic is simple — observe a bad thing, add it to a list, block it.

The problem is that the cost of changing an indicator has collapsed. A new IP address is essentially free. A new domain costs a few dollars. A new server in a different country takes minutes. AI tooling now lets attackers rotate infrastructure, regenerate attack variants, and cycle credential lists at industrial scale. The useful life of any given indicator, meaning how long it stays relevant, is getting shorter every year.

Defenders end up on a treadmill. Block the IP. The attacker rotates. Block the next one. The list-based model has a structural problem: it tracks exactly the things that are cheapest to change.

Imagine trying to identify a burglar who keeps hitting houses in your neighbourhood. The traditional approach is to write down their licence plate. That works once. Then they get a new car, and the licence plate is useless.

The alternative is to describe how they operate: they prefer corner houses, arrive between 2 and 3 in the morning, use a specific technique. That description survives the new car — and the one after that.

LegionTrap builds the behavioural description. Not the licence plate.

Behavioural patterns change slowly because they reflect real investment. The tools an attacker has refined, the sequences they have learned, the timing their infrastructure produces — changing all of that costs real time and money. A behavioural fingerprint built from months of observation stays useful for months or years. A list of bad IP addresses can be worthless in hours.

How It Works

LegionTrap has two structurally isolated paths. The ingest path runs on every event and builds behavioural intelligence automatically. The reasoning path runs on operator request. Removing the reasoning path leaves the ingest path fully functional.

Honeypot Observation

A honeypot is a deliberately exposed server with no legitimate users. Anyone who connects is doing something they shouldn't. LegionTrap accepts everything the honeypots observe through its HTTP API.

Behavioural Fingerprinting

For every source observed, LegionTrap builds a fingerprint across five dimensions: timing, probe sequence, protocol behaviour, credential patterns, and target selection. This describes how an attacker operates, not where they came from.
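To make the five dimensions concrete, here is a minimal sketch of what such a fingerprint could look like as a data structure. The `Fingerprint` class, the `build_fingerprint` helper, and the event keys (`ts`, `path`, `port`, `user_agent`, `credential`) are hypothetical names chosen for illustration; the source does not describe LegionTrap's actual schema or how each dimension is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical five-dimension fingerprint for one observed source."""
    source_id: str
    timing: tuple[float, ...]            # gaps between events, in seconds
    probe_sequence: tuple[str, ...]      # ordered paths that were probed
    protocol_behaviour: frozenset[str]   # here: user-agent strings seen
    credential_patterns: frozenset[str]  # credentials that were attempted
    target_selection: frozenset[int]     # ports that were targeted


def build_fingerprint(source_id: str, events: list[dict]) -> Fingerprint:
    """Summarise one source's events; the event keys are assumptions."""
    ordered = sorted(events, key=lambda e: e["ts"])
    gaps = tuple(round(b["ts"] - a["ts"], 3) for a, b in zip(ordered, ordered[1:]))
    return Fingerprint(
        source_id=source_id,
        timing=gaps,
        probe_sequence=tuple(e["path"] for e in ordered),
        protocol_behaviour=frozenset(e["user_agent"] for e in ordered),
        credential_patterns=frozenset(
            e["credential"] for e in ordered if e.get("credential")
        ),
        target_selection=frozenset(e["port"] for e in ordered),
    )
```

Note that no field in this sketch records the source address itself. The description is built from what the source did, which is the point of the approach.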

Campaign Clustering

Fingerprints that share behavioural characteristics are grouped into campaigns using a deterministic similarity algorithm. The same data always produces the same result. The reasoning behind every assignment is stored and auditable.
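One way to get both determinism and auditability is sketched below. It is not LegionTrap's actual algorithm, which the source does not specify. The sketch assumes each dimension has been reduced to a set of discrete features, compares sets with Jaccard similarity, and clusters greedily in sorted order so that input order cannot change the result. The names `compare` and `cluster`, the unweighted mean, and the 0.6 threshold are all illustrative assumptions.

```python
DIMENSIONS = (
    "timing", "probe_sequence", "protocol_behaviour",
    "credential_patterns", "target_selection",
)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare(fp_a: dict, fp_b: dict) -> dict:
    """Per-dimension scores plus their unweighted mean."""
    scores = {d: jaccard(fp_a[d], fp_b[d]) for d in DIMENSIONS}
    scores["overall"] = sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
    return scores


def cluster(fingerprints: dict[str, dict], threshold: float = 0.6):
    """Greedy clustering in sorted key order, so input order never matters."""
    campaigns: list[list[str]] = []  # each campaign is a list of source ids
    audit: list[dict] = []           # one record per assignment decision
    for source_id in sorted(fingerprints):
        best_idx, best_scores = None, None
        for idx, members in enumerate(campaigns):
            # Compare against the campaign's first member (its representative).
            scores = compare(fingerprints[source_id], fingerprints[members[0]])
            if scores["overall"] >= threshold and (
                best_scores is None or scores["overall"] > best_scores["overall"]
            ):
                best_idx, best_scores = idx, scores
        if best_idx is None:
            campaigns.append([source_id])  # nothing similar enough: new campaign
            best_idx = len(campaigns) - 1
        else:
            campaigns[best_idx].append(source_id)
        audit.append({"source": source_id, "campaign": best_idx, "scores": best_scores})
    return campaigns, audit
```

The `audit` list is what makes an assignment inspectable: each record keeps the per-dimension scores that led to it, or `None` when the source started a new campaign.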

Campaign Lifecycle

Campaigns move through states: active, dormant, reactivated, historical. When a dormant campaign's behavioural pattern reappears with new infrastructure, the platform detects the reactivation. Intelligence accumulates over time.
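The lifecycle can be read as a small state machine. The sketch below is one possible reading, not the platform's actual rules: the 30-day and 180-day thresholds, the `next_state` function, and the choice to treat a returning historical campaign as reactivated are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    DORMANT = "dormant"
    REACTIVATED = "reactivated"
    HISTORICAL = "historical"


# Hypothetical thresholds; the source does not state the real ones.
DORMANT_AFTER = timedelta(days=30)
HISTORICAL_AFTER = timedelta(days=180)


def next_state(current: State, last_seen: datetime, now: datetime,
               matched_new_activity: bool) -> State:
    """Return the campaign's state after one evaluation pass."""
    if matched_new_activity:
        # A quiet campaign whose pattern shows up again is a reactivation,
        # even if the activity comes from infrastructure never seen before.
        if current in (State.DORMANT, State.HISTORICAL):
            return State.REACTIVATED
        return current
    idle = now - last_seen
    if idle >= HISTORICAL_AFTER:
        return State.HISTORICAL
    if idle >= DORMANT_AFTER:
        return State.DORMANT
    return current
```

Because the match is made on behaviour, the `matched_new_activity` flag can be true for a source address the platform has never recorded.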

Actor Intelligence

Operators create actor profiles and link campaigns to presumed responsible parties. The system suggests connections based on fingerprint similarity. Attribution decisions are always made by the operator — never assigned automatically.

AI Reasoning

On operator request, an AI layer reads structured, deterministic data and produces natural-language campaign summaries and threat briefs. AI is a writer, not a judge. It produces no automatic actions and every output is stored with a full audit trail.

What It Can Do Today

Ingest and Enrich

Accept events from honeypots via HTTP API
Batch ingest with schema validation and deduplication
GeoIP enrichment on every event: country, city, ASN
Audit logging for all ingest operations
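Batch ingest with validation and deduplication might look roughly like the sketch below. The required fields in `REQUIRED_FIELDS`, the `ingest_batch` function, and the use of a SHA-256 content hash as the deduplication key are assumptions for illustration; the source does not document the real event schema or the real deduplication rule.

```python
import hashlib
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {
    "sensor_id": str,
    "src_ip": str,
    "ts": (int, float),
    "path": str,
}


def event_key(event: dict) -> str:
    """Stable content hash used to recognise duplicate events."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest_batch(batch: list[dict], seen: set[str]) -> dict:
    """Validate each event, skip duplicates, and report what happened."""
    accepted, rejected, duplicates = [], [], 0
    for index, event in enumerate(batch):
        bad_fields = [
            name for name, kind in REQUIRED_FIELDS.items()
            if not isinstance(event.get(name), kind)
        ]
        if bad_fields:
            rejected.append({"index": index, "bad_fields": bad_fields})
            continue
        key = event_key(event)
        if key in seen:
            duplicates += 1
            continue
        seen.add(key)
        accepted.append(event)
    return {"accepted": accepted, "rejected": rejected, "duplicates": duplicates}
```

Passing the `seen` set in from the caller lets deduplication span several batches, and the returned counts are the kind of detail an ingest audit log would record.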

Behavioural Intelligence

5-dimension behavioural fingerprints per attacker
Deterministic campaign clustering with lifecycle management
Reactivation detection when dormant campaigns resurface
Behavioural stability scoring and drift alerting
Actor profiles with read-only suggestion engine

AI Reasoning

Operator-triggered campaign summaries
Multi-campaign threat briefs with time-window filtering
Three backends: Claude API, Ollama (local), or disabled
Every output stored immutably with full audit trail
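The shape of an operator-triggered summary with an optional backend could be sketched as follows. This is an illustration under assumptions, not the platform's code: `Backend`, `StoredSummary`, and `summarise` are hypothetical names, the backend is modelled as a plain callable from prompt to text, and "disabled" is modelled as passing `None`.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

Backend = Callable[[str], str]  # takes a prompt, returns prose


@dataclass(frozen=True)  # frozen: a stored record cannot be mutated in place
class StoredSummary:
    campaign_id: str
    requested_by: str
    created_at: str
    input_sha256: str  # ties the text to the exact facts it was given
    text: str


def summarise(campaign: dict, operator: str,
              backend: Optional[Backend]) -> Optional[StoredSummary]:
    """Run only on explicit operator request; None means reasoning is disabled."""
    if backend is None:
        return None
    facts = json.dumps(campaign, sort_keys=True)
    text = backend("Summarise this campaign using only these facts:\n" + facts)
    return StoredSummary(
        campaign_id=campaign["id"],
        requested_by=operator,
        created_at=datetime.now(timezone.utc).isoformat(),
        input_sha256=hashlib.sha256(facts.encode("utf-8")).hexdigest(),
        text=text,
    )
```

The function only reads the campaign and returns a record. It changes no campaign state, which matches the idea that the reasoning layer writes text and takes no action.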

Export and Integration

Firewall blocklists: pf.conf and UFW formats
STIX 2.1: Indicators, Campaign SDOs, Relationship SDOs
ATT&CK Navigator export
Privacy masking: HMAC hashing or octet mask on export
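The two privacy masking options named above can be sketched with the Python standard library. The function names `hmac_mask` and `octet_mask`, the choice of SHA-256, and the default of keeping three octets are assumptions for illustration; the source does not say which hash or mask width LegionTrap uses. The octet mask here handles IPv4 only.

```python
import hashlib
import hmac
import ipaddress


def hmac_mask(ip: str, key: bytes) -> str:
    """Replace an address with a keyed hash.

    The same key always maps the same address to the same token, so masked
    exports can still be correlated, but the token cannot be recomputed
    without the key.
    """
    return hmac.new(key, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def octet_mask(ip: str, keep_octets: int = 3) -> str:
    """Zero the trailing octets of an IPv4 address (keep_octets=3 gives a /24)."""
    network = ipaddress.IPv4Network(f"{ip}/{keep_octets * 8}", strict=False)
    return str(network.network_address)
```

For example, `octet_mask("198.51.100.77")` returns `"198.51.100.0"`. The HMAC variant hides the address entirely while staying consistent across exports made with the same key; the octet mask keeps the network visible and hides only the host.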

The Principles Behind It

Behaviour over indicators

IP addresses are what attackers use. Behaviour is how they operate. LegionTrap tracks the second, because it survives the rotation of the first. A behavioural fingerprint built from months of observation stays useful long after every observed IP address has been replaced.

Explainability over scores

Every clustering decision is stored with the per-dimension similarity scores that produced it. An analyst who disagrees with a campaign assignment can inspect the exact evidence and reasoning behind it. The system justifies its conclusions — it does not just produce them.

Operator judgment over automation

LegionTrap surfaces information. Operators make decisions. No attribution is assigned automatically. No action is triggered without operator review. AI analysis is generated on request — it never acts on its own. The platform is an intelligence aid, not an autonomous system.

Intelligence compounds over time

The longer LegionTrap runs, the more valuable it becomes. Behavioural history accumulates. Campaign reactivations become recognisable. The institutional memory the platform builds — about specific attackers, their evolution, their dormancy patterns — cannot be purchased from any vendor. It is built by continuing to run.

Honest limitations

What It Cannot Do Yet

Predict future activity: The platform tells you what happened and what is happening. It does not yet forecast which dormant campaigns are likely to return, or when a campaign's behaviour is about to shift. The longitudinal data accumulating today is designed to support this when sufficient history exists.
Share intelligence across deployments: Each LegionTrap deployment is an island. If two operators are both being targeted by the same group, neither knows. Behavioural federation (sharing fingerprint patterns without sharing raw data) is Phase 8. The design is complete; it awaits pilot operators.
Support multiple users: The platform is designed for a single operator. Role-based access, team collaboration, and multi-user dashboards are not currently implemented.
Handle multiple sensor types reliably: The platform was designed and tested with one honeypot format. Operators running multiple or different sensor types may encounter rough edges in schema handling and event normalisation.
Generate detection rules for non-firewall defences: The platform generates firewall blocklists. It does not yet generate Sigma rules for SIEM systems, detection rules for intrusion detection systems, or host-based monitor alerts. These are planned.

Long-term vision

Where the Project Is Going

LegionTrap is being built with a specific long-term thesis: as AI makes traditional indicators cheaper to rotate, behavioural intelligence becomes the primary surviving category of threat intelligence. The platform is a bet on that future, built to be the infrastructure layer for it.

Near term — Phase 8: Behavioural federation. The architecture for sharing behavioural fingerprints across independent deployments — without sharing raw events, source IPs, or operator identity — is fully designed. Phase 8 begins when two real operators agree to a pilot bilateral exchange. Each deployment would benefit from the other's observed patterns without surrendering the data sovereignty that defines why they chose a self-hosted platform.

Medium term: Predictive intelligence. The longitudinal fingerprint history the platform is accumulating today is specifically designed to eventually support forecasting — which dormant campaigns are likely to return, when a campaign's behaviour is about to shift. This is not yet built. The data structures that will make it possible are already being populated.

Long term: Compounding sovereign intelligence. A LegionTrap deployment that has been running for two or three years holds institutional memory about specific attackers, their evolution, and their dormancy patterns that no commercial product can replicate — because no commercial product has access to that specific operator's observations. The longer it runs, the harder it is to replace. That compounding is the long-term moat.

Active Development

Phases 0–7 complete · v0.34.0

1,553 tests across 71 test files. Behavioural fingerprinting, campaign clustering, actor intelligence, AI reasoning, and export pipelines operational.

Current focus

Phase 7 complete: Actor intelligence, per-campaign weight calibration, drift alerting (8 PRs merged)
Phase 8 preparation: Documenting federation prerequisites, awaiting pilot operators
Deferred exports: Sigma rules and MISP packages, planned from Phase 4
Public documentation: This project page and repository README

Next milestone

Phase 8 pilot — bilateral federation between two operators

LegionTrap is not being built as a finished product.

It is being built as an evolving record of learning, investigation, and continuous improvement.

The objective is not to appear knowledgeable. The objective is to understand more tomorrow than today.
