Brain & NPU Detection

Brain — Hardware Detection & Unified Inference

The Brain module (core/brain/) provides automatic hardware detection and a unified inference API shared by all product tiers.

Hardware Detection

At startup, HookProbe detects any available AI accelerator and selects the matching inference backend, falling back to the CPU when none is found.

Supported Accelerators (13 Types)

High-End ($250-$1,600)

Hardware	TOPS	Engine	Use Case
Apple M4/M4 Pro	38	CoreML + llama.cpp	Full local 70B LLM
NVIDIA Jetson Orin	67	TensorRT	GPU-accelerated detection
Qualcomm QCS8550	48	LiteRT + QNN	Industrial edge AI

Mid Tier ($60-$200)

Hardware	TOPS	Engine	Use Case
RPi 5 + Hailo-8	26	HailoRT	7 ms inference, best RPi option
RPi 5 + Hailo-8L	13	HailoRT	Budget RPi AI HAT
Intel NPU (Meteor Lake)	11	OpenVINO	NUC/laptop deployment
Qualcomm QCS6490	12	LiteRT + QNN	Radxa Dragon Q6A

Entry Tier ($35-$100)

Hardware	TOPS	Engine	Use Case
Radxa ROCK 5B+ / Orange Pi 5	6	RKNN	Cheapest NPU path
Google Coral	4	LiteRT + EdgeTPU	USB/M.2 accelerator
BeagleY-AI	4	TIDL	Open hardware
Khadas VIM4	3.2	AML NPU	Amlogic platform
CPU-only	0.5-2	llama.cpp / sklearn	Always-available fallback

Detection Priority

The detection function probes in order:

1. /proc/device-tree/compatible: SoC identification (Jetson, RK3588, TI AM67A)
2. /dev/hailo0: Hailo NPU device node
3. /sys/class/accel/: Intel NPU driver (x86_64 only)
4. /dev/rknpu: Rockchip NPU
5. /dev/apex_0: Google Coral TPU
6. CPU fallback with SIMD detection (NEON/AVX2/AVX-512)
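
The probe order above can be sketched as a single function that checks each path in turn and returns on the first match. This is a minimal illustration, not the actual core.brain.hw_detect code: the function name, the returned labels, and the device-tree substrings in SOC_HINTS are all assumptions, and SIMD detection for the CPU fallback is omitted.

```python
import platform
from pathlib import Path
from typing import Callable

DEVICE_TREE = "/proc/device-tree/compatible"

# Illustrative substrings only (assumed, not taken from HookProbe):
# the compatible file holds NUL-separated "vendor,model" strings.
SOC_HINTS = [
    ("tegra", "jetson"),
    ("rk3588", "rk3588"),
    ("am67a", "ti-am67a"),
]


def detect_accelerator(
    exists: Callable[[str], bool] = lambda p: Path(p).exists(),
    read_bytes: Callable[[str], bytes] = lambda p: Path(p).read_bytes(),
    machine: str = platform.machine(),
) -> str:
    """Return a label for the first accelerator found, in documented order."""
    # 1. SoC identification from the device tree.
    if exists(DEVICE_TREE):
        compatible = read_bytes(DEVICE_TREE).decode("ascii", "replace").lower()
        for needle, label in SOC_HINTS:
            if needle in compatible:
                return label
    # 2. Hailo NPU device node.
    if exists("/dev/hailo0"):
        return "hailo"
    # 3. Intel NPU driver, only meaningful on x86_64.
    if machine == "x86_64" and exists("/sys/class/accel"):
        return "intel-npu"
    # 4. Rockchip NPU.
    if exists("/dev/rknpu"):
        return "rknn"
    # 5. Google Coral TPU.
    if exists("/dev/apex_0"):
        return "coral"
    # 6. Nothing matched: fall back to the CPU.
    return "cpu"
```

Passing `exists` and `read_bytes` as parameters keeps the sketch testable on a machine without any of these devices. Because the device-tree check only returns on a known SoC, a Raspberry Pi 5 with a Hailo HAT still falls through to the /dev/hailo0 probe.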

Usage

# CLI
hookprobe-ctl hw-info

# Python
from core.brain.hw_detect import detect_hardware
hw = detect_hardware()
print(f"{hw.accelerator.value}: {hw.tops} TOPS")
print(f"Recommended tier: {hw.tier_recommendation}")
print(f"LLM model: {hw.llm_recommendation}")

Mock Mode (Testing)

HOOKPROBE_MOCK_NPU=hailo-8l hookprobe-ctl hw-info
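
How the override is honoured internally is not documented here. One plausible shape, shown purely as an assumption, is a thin wrapper that consults HOOKPROBE_MOCK_NPU before running the real probe. The function name `resolve_accelerator` and its parameters are hypothetical.

```python
import os
from typing import Callable, Mapping


def resolve_accelerator(
    detect: Callable[[], str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the mocked accelerator name if set, otherwise run the real probe."""
    # Hypothetical behaviour: an empty or missing variable means "no mock".
    mock = environ.get("HOOKPROBE_MOCK_NPU", "").strip().lower()
    if mock:
        return mock
    return detect()
```

In a pytest suite the same variable can be set per test with the `monkeypatch.setenv` fixture method, so individual tests can simulate different accelerators without touching real hardware.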

Inference Bridge

The InferenceBridge provides a unified API across all backends:

from core.brain.inference_bridge import InferenceBridge

bridge = InferenceBridge(tier='fortress')

# Anomaly detection (24-dim HYDRA features)
result = bridge.classify(feature_vector)
# → ClassifyResult(score=0.73, label='malicious', backend='cpu-sklearn')

# LLM text generation
text = bridge.generate("Analyze this network anomaly...")
# → GenerateResult(text='...', backend='local_llm')

# RAG embedding
vector = bridge.embed("XDP packet filtering")
# → [0.012, -0.034, ...] (768-dim)
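
To make the idea of one API over several backends concrete, here is a self-contained sketch of the pattern: the bridge picks the first available backend and wraps its score in a common result type. Everything in it is an assumption for illustration (the `Backend` protocol, the stand-in backends, the 0.5 threshold, and the 'benign' label); only the `ClassifyResult` field names come from the example above.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass
class ClassifyResult:
    score: float
    label: str
    backend: str


class Backend(Protocol):
    name: str

    def available(self) -> bool: ...
    def score(self, features: Sequence[float]) -> float: ...


class UnavailableNpuBackend:
    """Stand-in for an accelerator that is not present on this machine."""
    name = "npu"

    def available(self) -> bool:
        return False

    def score(self, features: Sequence[float]) -> float:
        raise RuntimeError("NPU not present")


class MeanScoreBackend:
    """Stand-in CPU backend: uses the feature mean as an anomaly score."""
    name = "cpu-mean"

    def available(self) -> bool:
        return True

    def score(self, features: Sequence[float]) -> float:
        return sum(features) / len(features)


class SketchBridge:
    def __init__(self, backends: Sequence[Backend], threshold: float = 0.5):
        # Backends are listed in priority order; take the first usable one.
        self._backend = next(b for b in backends if b.available())
        self._threshold = threshold

    def classify(self, features: Sequence[float]) -> ClassifyResult:
        score = self._backend.score(features)
        label = "malicious" if score >= self._threshold else "benign"
        return ClassifyResult(score, label, self._backend.name)
```

Callers only ever see `ClassifyResult`, so swapping the backend list does not change calling code; the `backend` field records which implementation actually ran.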

Tier-Aware Backend Selection

Tier	Classification	LLM	Embedding
Sentinel	Rule-based thresholds	None	None
Guardian	CPU sklearn + NPU	SmolLM-135M (80MB)	Cloud (Gemini)
Fortress	CPU/NPU sklearn	TinyLlama-1.1B (670MB)	Local or cloud
Nexus	GPU/NPU sklearn	Phi-3 to Llama-70B	Local
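
The table above maps naturally onto a lookup keyed by tier name. The sketch below encodes the same rows as data; the `TierBackends` type and the `backends_for` helper are hypothetical names and do not describe how InferenceBridge stores this mapping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierBackends:
    classification: str
    llm: Optional[str]        # None means the tier has no LLM
    embedding: Optional[str]  # None means the tier has no embedding backend


# Values copied from the tier table; structure is illustrative only.
TIER_BACKENDS = {
    "sentinel": TierBackends("Rule-based thresholds", None, None),
    "guardian": TierBackends("CPU sklearn + NPU", "SmolLM-135M", "Cloud (Gemini)"),
    "fortress": TierBackends("CPU/NPU sklearn", "TinyLlama-1.1B", "Local or cloud"),
    "nexus": TierBackends("GPU/NPU sklearn", "Phi-3 to Llama-70B", "Local"),
}


def backends_for(tier: str) -> TierBackends:
    """Look up a tier case-insensitively, failing loudly on unknown names."""
    try:
        return TIER_BACKENDS[tier.lower()]
    except KeyError:
        known = ", ".join(sorted(TIER_BACKENDS))
        raise ValueError(f"unknown tier {tier!r}; expected one of: {known}") from None
```

A caller can check `backends_for("sentinel").llm is None` before offering text generation, rather than discovering the missing backend at call time.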

Benchmarks

Run the benchmark suite to measure your hardware:

hookprobe-ctl benchmark --quick
# or
./scripts/run-benchmark.sh --output results.json

Example results (ARM64 CPU, 4 cores):

Latency: 0.002 ms median (classification)
Throughput: 469,127 classifications/sec
Memory: 33 MB peak RSS
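
The two headline figures are consistent with each other: a 0.002 ms median implies roughly 1 / 0.000002 s = 500,000 calls per second on one thread, close to the reported throughput. For a quick cross-check outside the official suite, a timing loop along these lines could be used. It is a rough sketch rather than the method used by hookprobe-ctl benchmark, and the `benchmark` function and its parameters are assumptions.

```python
import statistics
import time
from typing import Any, Callable, Dict


def benchmark(
    fn: Callable[[Any], Any],
    arg: Any,
    iterations: int = 10_000,
    warmup: int = 1_000,
) -> Dict[str, float]:
    """Time repeated calls to fn(arg); report median latency and throughput."""
    # Warm-up calls let caches and lazy initialisation settle first.
    for _ in range(warmup):
        fn(arg)

    samples_ns = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn(arg)
        samples_ns.append(time.perf_counter_ns() - start)

    total_s = sum(samples_ns) / 1e9
    return {
        "median_ms": statistics.median(samples_ns) / 1e6,
        # Guard against a zero total on clocks with coarse resolution.
        "throughput_per_s": iterations / total_s if total_s > 0 else float("inf"),
    }
```

For example, `benchmark(bridge.classify, feature_vector)` would time the classification call shown earlier. Per-call timing adds overhead of its own at microsecond scale, so results from this loop should be read as an upper bound on latency.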