ControlKit
A compiler toolchain for turning control-system policies into deterministic C and Rust artifacts for embedded deployment.
ControlKit is built as a compact compiler workflow: typed IR, policy frontends, code generators, optimization passes, verification reports, closed-loop benchmarks, and a CLI. The deployment step becomes explicit, inspectable, and testable instead of living as an ad hoc translation from notebook math to firmware source.
Control Philosophy
A controller is a small program with physical consequences. ControlKit treats it that way: typed at the boundary, explicit about dimensions, deterministic in generated code, and measured before deployment.
The goal is not to replace modeling tools or hide the math. The goal is to make the deployment boundary boring: preserve the control law, reject impossible shapes early, emit source a firmware engineer can read, and benchmark the artifact that will actually run on target hardware.
Background
Control algorithms are frequently designed in simulation-first environments, then manually translated into embedded code. That translation step is fragile: matrix shapes, actuator bounds, solver loops, and floating-point assumptions all need to survive the move from design artifact to target source.
ControlKit treats that move as a compiler problem. The user gives the tool a controller specification. A frontend validates the policy, lowers it into a typed internal representation, and then target backends generate deployable source. The benchmark runner compares the generated result against a Python reference path and records static estimates.
Architecture
The core pipeline is deliberately small: specs enter through YAML or Python APIs, frontends enforce controller-specific contracts, IR nodes preserve shape and semantic information, verification checks reject unsafe shapes or unstable feedback, optimization passes rewrite safe expression trees, and backends emit deterministic C or Rust.
The important design choice is that policy semantics do not leak directly into backend string generation. LQR, MPC-lite, and RL policies each get validated structures before code generation begins.
Compiler Contracts
ControlKit's main technical bet is that embedded control code should be generated from explicit contracts rather than from loose Python objects. Each compiler layer adds a narrow boundary: frontend specs validate user intent, IR nodes validate dimensions, verification checks stability and constraints, backends validate target support, and benchmarks validate runtime behavior against a reference.
Shape contract
Matrices, vectors, gain matrices, bounds, costs, and neural-network layers carry dimensions that are checked before code generation.
Policy contract
LQR lowers to feedback control, MPC lowers to a finite-horizon projected-gradient solver, and RL lowers to fixed-shape MLP inference.
Target contract
C and Rust backends expose the same control_step(input, output)
surface while preserving target-specific implementation details.
Validation contract
Controllers can be verified for dimensions, closed-loop stability, constraints, numerical robustness, and generated-code consistency hooks.
Intermediate Representation
The first serious layer is the IR. It knows about scalar constants, vectors, matrices, matrix-vector multiplication, addition, subtraction, clipping, linear systems, control laws, finite-horizon MPC controllers, and fixed-shape RL policies.
IRModule(
name="rl_balance",
policy="rl",
rl_policies=(RlPolicyIR(...),),
)
IR construction validates dimensions immediately. If a gain matrix cannot multiply the state vector, if MPC bounds do not match the control dimension, or if an MLP layer has a ragged weight matrix, the compiler fails before any target code is emitted.
MatVecMul, addition,
subtraction, negation, and clipping as typed expression nodes.
x_dot = Ax + Bu for continuous
systems and x_next = Ax + Bu for discrete-time systems.
# LQR feedback lowered into IR semantics
u = clip(-(K @ x), u_min, u_max)
# MPC-lite generated solver semantics
for iteration in range(solver_iterations):
X[0] = x
X[k + 1] = A @ X[k] + B @ U[k]
grad_u[k] = r * U[k] + B.T @ lambda[k + 1]
U[k] = clip(U[k] - step_size * grad_u[k], u_min, u_max)
# RL policy semantics
for layer in layers:
y = W @ x + b
x = activation(y)
Validation
Most useful compiler errors happen before generation. ControlKit tries to reject invalid
controllers where the mistake is still named in control-system terms: a bad
B matrix, a mismatched gain, a wrong saturation vector, or an incompatible
neural layer.
state_dim.
control_dim.
Frontends
Frontends keep user-facing specs small and policy-specific. LQR specs describe a gain matrix, MPC-lite specs describe discrete dynamics and box constraints, and RL specs point to fixed-shape JSON weights.
u = -Kx, with optional saturation.
x_next = Ax + Bu.
Custom Example
Here is a complete one-input controller workflow for a tiny two-state room model. The
controller reads the current temperature error and error rate, computes
u = -Kx, clips the actuator command, emits C, and benchmarks the generated
artifact.
1. Write the spec
# custom_room_lqr.yaml
# x[0] = temperature error, x[1] = error rate
# u[0] = heater/cooler command clipped to [-1, 1]
policy: lqr
name: room_temperature
state_dim: 2
control_dim: 1
state_name: x
control_name: u
K:
- [0.75, 0.18]
u_min: [-1.0]
u_max: [1.0]
2. Inspect and validate
# Fail early if dimensions or bounds are wrong.
controlkit validate custom_room_lqr.yaml
# Inspect the lowered module before generation.
controlkit inspect custom_room_lqr.yaml
# Expected shape summary:
# policy: lqr
# state_dim: 2
# control_dim: 1
# controllers: 1
3. Compile to C
# Emit deterministic C source and header files.
controlkit compile custom_room_lqr.yaml \
--target c \
--output build/room_c
# Generated artifact surface:
# void control_step(
# const float x[CONTROLKIT_STATE_DIM],
# float u[CONTROLKIT_CONTROL_DIM]
# );
4. Benchmark the artifact
# Compare Python reference latency with generated C.
controlkit benchmark custom_room_lqr.yaml \
--output build/room_bench \
--iterations 1000 \
--warmup-iterations 10
# Report includes:
# python_ns_per_call
# c_ns_per_call
# operation_count
# memory_footprint_bytes
Aerospace Case Study
Consider a small satellite attitude loop where a reaction wheel applies torque around a single body axis. The controller state is angle error and angular-rate error; the output is a bounded wheel torque command. This is an illustrative deployment example, not a flight-certified controller.
Plant abstraction
# State vector:
# x[0] = attitude error, radians
# x[1] = angular-rate error, rad/s
#
# Control:
# u[0] = reaction-wheel torque command
#
# Control law:
# u = clip(-Kx, -0.025, 0.025)
Controller spec
policy: lqr
name: reaction_wheel_axis
state_dim: 2
control_dim: 1
state_name: attitude_error
control_name: wheel_torque
K:
- [0.042, 0.318]
u_min: [-0.025]
u_max: [0.025]
Generated boundary
# Firmware calls one deterministic function.
void control_step(
const float x[2],
float u[1]
);
# No heap allocation.
# Fixed loop bounds.
# Saturation emitted inline.
What gets checked
# K must be 1 x 2.
# u_min/u_max must have length 1.
# generated C is benchmarked against Python.
# report includes latency and operation count.
Backends
The C backend emits standalone .h and .c files. The Rust
backend emits fixed-array source that keeps a #![no_std]-compatible style
where possible. Both expose the same basic control surface:
void control_step(
const float x[CONTROLKIT_STATE_DIM],
float u[CONTROLKIT_CONTROL_DIM]
);
Generated C uses straightforward loops and float arrays. Generated Rust
uses array references and mutable output buffers. RL Tanh is target-specific: C calls
tanhf, while Rust emits a small deterministic approximation to avoid a
standard-library math dependency.
for (int row = 0; row < CONTROLKIT_CONTROL_DIM; row++) {
float acc = 0.0f;
for (int col = 0; col < CONTROLKIT_STATE_DIM; col++) {
acc += K[row][col] * x[col];
}
u[row] = -acc;
}
#![no_std]
pub fn control_step(
x: &[f32; CONTROLKIT_STATE_DIM],
u: &mut [f32; CONTROLKIT_CONTROL_DIM],
) {
// generated fixed-shape controller body
}
The generated code favors boring loops over cleverness. That is intentional: embedded controller code should be inspectable, deterministic, and easy to compare against the mathematical controller that produced it.
Optimization
The optimizer performs symbolic simplification over the expression IR. The pass is small, but it establishes the compiler shape: rewrite only when the transformation preserves controller semantics and improves generated work.
Constant folding
Pure constant expressions collapse before backend emission.
Algebraic identities
Rules like x + 0 -> x, x * 1 -> x, and
x * 0 -> 0 reduce noise.
Dead computation
Unused expression nodes can be pruned from generated work.
Loop unrolling
Small controllers can opt into expanded loops when the target benefits.
before: add(matvec(K, x), vector([0.0, 0.0]))
after: matvec(K, x)
Benchmarks
ControlKit now includes a closed-loop benchmark suite, not just isolated function timing. Each case has a problem statement, deterministic model, controller spec, local runner, expected results, and JSON/Markdown reports.
{
"benchmark_name": "double_integrator_lqr",
"controller_type": "lqr",
"dt": 0.05,
"horizon_steps": 120,
"mean_runtime_us": 1.26,
"max_runtime_us": 5.50,
"p95_runtime_us": 1.66,
"final_state_norm": 0.002575,
"max_state_norm": 1.043344,
"total_control_effort": 1.410541,
"passed": true,
"failure_reason": ""
}
The suite measures controller runtime, final stabilization error, maximum state norm, total control effort, and pass/fail criteria. When generated C is available, the runner also records generated-code runtime beside the Python reference path.
Run one case
# Closed-loop simulation + report output.
controlkit benchmark \
benchmarks/double_integrator_lqr/controller.yaml
Run the suite
# Runs every folder under benchmarks/.
controlkit benchmark --all
Report files
outputs/benchmarks//results.json
outputs/benchmarks//report.md
Verification
Verification adds a static assurance layer before deployment. It checks matrix shapes, closed-loop stability, actuator and state constraints, numerical robustness, and keeps a generated-code consistency hook ready for target execution adapters.
A is square, B matches A, and K is compatible with u = -Kx.
A_cl = A - BK; discrete systems require spectral radius below one, continuous systems require negative real eigenvalues.
# Emit machine-readable and human-readable verification reports.
controlkit verify benchmarks/double_integrator_lqr/controller.yaml
{
"controller_name": "double_integrator_lqr",
"system_type": "discrete",
"passed": true,
"stability_margin": 0.048685,
"warnings": [
"closed-loop eigenvalues are close to the stability boundary"
],
"recommendations": [
"Review warnings before deploying to constrained hardware."
]
}
Takeaways
ControlKit is intentionally not a giant modeling environment. It is a transparent compiler layer for the deployment boundary: validate the controller, preserve its semantics in IR, emit readable source, and benchmark the result.
The project is still pre-alpha, but the shape is now visible: control policies can be treated like programs, and embedded controller deployment can look much more like a compiler workflow.