Quick Stack — Stackfile Reference Guide

A Stackfile is a YAML file that wires together all the components of a DAM deployment: hardware sources and sinks, a policy, guard parameters, safety boundaries, and tasks. You point DAM at a Stackfile and it handles everything else — connection lifecycle, observation assembly, guard orchestration, and hardware dispatch.

No Python code is required for a Tier 1 deployment (built-in guards + built-in adapters). Python callbacks and custom guards are opt-in for Tier 2 and Tier 3 deployments.


Minimal Stackfile

The smallest valid Stackfile that runs the motion guard with a single boundary:

version: "1"

guards:
  - L1: motion
    phase: 0
  - L2: execution
    phase: 1
  - L3: hardware
    always: true

safety:
  control_frequency_hz: 30
  enforcement_mode: enforce

boundaries:
  joint_position_limits:
    layer: L1
    type: single
    nodes:
      - callback: joint_position_limits
        params:
          upper: [1.57, 1.57, 1.57, 1.57, 1.57, 0.08]
          lower: [-1.57, -1.57, -1.57, -1.57, -1.57, 0.0]

tasks:
  demo:
    boundaries: [joint_position_limits]

To run it:

dam run my_stackfile.yaml --task demo

Or in Python:

from dam.runtime.guard_runtime import GuardRuntime

runtime = GuardRuntime.from_stackfile("my_stackfile.yaml")
runtime.register_source("arm", my_source)
runtime.register_policy(my_policy)
runtime.register_sink(my_sink)
runtime.start_task("demo")

for _ in range(100):
    result = runtime.step()

Full Field Reference

Top-level keys accepted by a Stackfile:

| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| version | string | no | "1" | Stackfile schema version |
| hardware | object | no | | Hardware sources, sinks, and joint presets |
| policy | object | no | | Policy adapter type and parameters |
| guards | object | no | {} | Guard enable flags and parameters |
| fallbacks | object | no | built-ins | Named fallback strategies. Each entry references a registered @dam.fallback type. |
| boundaries | object | no | {} | Named boundary containers |
| tasks | object | no | {} | Named tasks referencing boundary containers |
| safety | object | no | see below | Global safety settings |
| runtime | object | no | see below | Control loop settings |
| loopback | object | no | | MCAP loopback buffer (Phase 2, requires Rust) |
| risk_controller | object | no | | Windowed risk aggregation (Phase 2, requires Rust) |
| simulation | object | no | | Optional simulation/source settings |
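
To illustrate how these keys relate, here is a minimal sketch of two consistency checks on an already-parsed Stackfile (a plain dict): unknown top-level keys, and tasks that reference boundary containers that are not defined. The helper name check_stackfile is hypothetical and is not part of DAM; the real schema validation may check far more than this.

```python
# Top-level keys taken from the reference table above.
KNOWN_KEYS = {
    "version", "hardware", "policy", "guards", "fallbacks", "boundaries",
    "tasks", "safety", "runtime", "loopback", "risk_controller", "simulation",
}


def check_stackfile(doc: dict) -> list[str]:
    """Return human-readable problems (hypothetical helper, not DAM's validator)."""
    # Flag keys that the reference table does not list.
    problems = [f"unknown top-level key: {key}" for key in sorted(set(doc) - KNOWN_KEYS)]
    boundaries = doc.get("boundaries") or {}
    # Every boundary name used by a task must exist under `boundaries`.
    for task_name, task in (doc.get("tasks") or {}).items():
        for ref in (task or {}).get("boundaries", []):
            if ref not in boundaries:
                problems.append(f"task {task_name!r} references missing boundary {ref!r}")
    return problems


doc = {
    "version": "1",
    "boundaries": {"joint_position_limits": {"layer": "L1", "type": "single"}},
    "tasks": {"demo": {"boundaries": ["joint_position_limits", "missing_one"]}},
    "extras": {},
}
for problem in check_stackfile(doc):
    print(problem)
```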

safety section defaults

| Field | Type | Default | Description |
|---|---|---|---|
| always_active | string or list[string] | [] | Boundary container name(s) active in all tasks |
| no_task_behavior | string | "emergency_stop" | Fallback when no task is active |
| control_frequency_hz | float | 30.0 | Target control loop frequency |
| max_obs_age_sec | float | 0.1 | Maximum observation age before stale warning |
| cycle_budget_ms | float | 20.0 | Per-cycle time budget; excess triggers watchdog |
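
The two timing thresholds can be read as simple comparisons. The sketch below is only an illustration of that reading; the function names are hypothetical and the actual DAM checks may differ (for example in which clock they use).

```python
def is_stale(obs_timestamp: float, now: float, max_obs_age_sec: float = 0.1) -> bool:
    """True when the observation is older than max_obs_age_sec (assumed semantics)."""
    return (now - obs_timestamp) > max_obs_age_sec


def over_budget(cycle_duration_sec: float, cycle_budget_ms: float = 20.0) -> bool:
    """True when one control cycle took longer than cycle_budget_ms (assumed semantics)."""
    return cycle_duration_sec * 1000.0 > cycle_budget_ms


# At 30 Hz a cycle lasts about 33.3 ms, so the default 20 ms budget leaves headroom.
period_ms = 1000.0 / 30.0
print(f"period: {period_ms:.1f} ms")
print(is_stale(obs_timestamp=0.0, now=0.15))
print(over_budget(cycle_duration_sec=0.025))
```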

fallbacks section

Fallback implementations are registered in Python with @dam.fallback, independently from boundary callbacks. The Stackfile defines named fallback strategies, and boundary nodes reference them by name.

fallbacks:
  emergency_stop:
    type: emergency_stop
  hold_position:
    type: hold_position
  slow_payload:
    type: slow_down
    escalate_to: emergency_stop
    escalate_after_seconds: 5.0
    params:
      scale: 0.5

Built-in fallback types are emergency_stop, hold_position, wait_and_retry, slow_down, and retreat. A fallback can opt into L3 hardware monitoring with monitors_hardware; emergency_stop is terminal, so hardware monitoring is disabled for it.
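
The escalate_to and escalate_after_seconds fields suggest a time-based handover from one fallback to another. The sketch below shows one plausible reading: once a fallback has been active for at least escalate_after_seconds, its escalate_to target takes over. This is an assumption about the semantics, and resolve_fallback is a hypothetical helper, not a DAM API.

```python
# Mirror of the YAML example above as a plain dict.
FALLBACKS = {
    "emergency_stop": {"type": "emergency_stop"},
    "hold_position": {"type": "hold_position"},
    "slow_payload": {
        "type": "slow_down",
        "escalate_to": "emergency_stop",
        "escalate_after_seconds": 5.0,
        "params": {"scale": 0.5},
    },
}


def resolve_fallback(name: str, active_for_sec: float, fallbacks: dict = FALLBACKS) -> str:
    """Return the fallback that should be in effect (assumed escalation semantics)."""
    spec = fallbacks[name]
    target = spec.get("escalate_to")
    deadline = spec.get("escalate_after_seconds")
    # Escalate only when both fields are set and the deadline has passed.
    if target is not None and deadline is not None and active_for_sec >= deadline:
        return target
    return name


print(resolve_fallback("slow_payload", active_for_sec=2.0))
print(resolve_fallback("slow_payload", active_for_sec=6.0))
```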

runtime section defaults

| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | "passive" | "managed" (built-in loop) or "passive" (caller drives step()) |
| control_frequency_hz | float | 30.0 | Target frequency for managed mode |
| max_obs_age_sec | float | 0.1 | Stale observation threshold |
| cycle_budget_ms | float | 20.0 | Cycle budget in managed mode |

Guard Builtin Reference

guards.builtin.motion (L1)

Enforces joint position limits, velocity limits, and acceleration limits. Clamps action proposals rather than rejecting them when possible. Rejects when the end-effector is outside workspace bounds.

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable/disable this guard |
| upper | list[float] | required | Joint upper limits [rad], one per joint |
| lower | list[float] | required | Joint lower limits [rad], one per joint |
| max_velocities | list[float] | null | Per-joint velocity limits [rad/s] |
| max_acceleration | list[float] | null | Per-joint acceleration limits [rad/s²] |
| bounds | list[list[float]] | null | [[xmin, xmax], [ymin, ymax], [zmin, zmax]] in metres |
| params.velocity_scale | float | 1.0 | Scale factor applied on top of hardware preset limits (Phase 2) |

Clamping behaviour:

  • Joint positions outside limits are clamped to the nearest limit.
  • Velocities exceeding max_velocities are scaled proportionally (all joints scaled by the same ratio).
  • Acceleration violations scale the target velocity back so the implied acceleration stays within limits.
  • Workspace violations always result in REJECT (cannot clamp the end-effector back into bounds).
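
The position, velocity, and workspace rules above can be sketched in a few lines of plain Python. This is an illustration of the described behaviour, not DAM's implementation; the function names are hypothetical, and the acceleration rule is omitted for brevity.

```python
def clamp_positions(target, lower, upper):
    """Clamp each joint target to its nearest limit."""
    return [min(max(q, lo), hi) for q, lo, hi in zip(target, lower, upper)]


def scale_velocities(velocities, max_velocities):
    """Scale all joints by the same ratio so that no joint exceeds its limit."""
    # Largest |v| / limit across joints; values above 1.0 mean a violation.
    worst = max(abs(v) / limit for v, limit in zip(velocities, max_velocities))
    if worst <= 1.0:
        return list(velocities)
    return [v / worst for v in velocities]


def in_workspace(point, bounds):
    """True when the end-effector point lies inside [[xmin, xmax], [ymin, ymax], [zmin, zmax]]."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, bounds))


print(clamp_positions([2.0, -2.0, 0.5], [-1.57] * 3, [1.57] * 3))
print(scale_velocities([2.0, 1.0], [1.0, 1.0]))
# A point outside the bounds cannot be clamped, so the guard rejects instead.
print(in_workspace([0.0, 0.5, 0.1], [[-0.35, 0.35], [-0.05, 0.45], [0.01, 0.40]]))
```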

guards.builtin.ood (L0)

Out-of-distribution gate. Checks the reconstruction error of the full observation against the training distribution. High reconstruction error indicates the robot is in an unfamiliar state and the policy output cannot be trusted.

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable/disable this guard |
| params.reconstruction_threshold | float | 0.05 | Maximum allowed reconstruction error |
| params.temporal_smoothing_frames | int | 3 | Consecutive OOD frames required before rejecting; absorbs lighting, shadow, and brief occlusion false positives |
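
The interaction between the two parameters can be sketched as a streak counter: a frame counts as OOD when its reconstruction error exceeds the threshold, and the gate rejects only after the configured number of consecutive OOD frames. OodGate is a hypothetical class written for this illustration; computing the reconstruction error itself is out of scope here.

```python
class OodGate:
    """Reject only after N consecutive over-threshold frames (illustrative sketch)."""

    def __init__(self, reconstruction_threshold: float = 0.05, temporal_smoothing_frames: int = 3):
        self.threshold = reconstruction_threshold
        self.required = temporal_smoothing_frames
        self.streak = 0  # consecutive OOD frames seen so far

    def update(self, reconstruction_error: float) -> bool:
        """Feed one frame's error; return True when the gate rejects."""
        if reconstruction_error > self.threshold:
            self.streak += 1
        else:
            self.streak = 0  # a single in-distribution frame resets the streak
        return self.streak >= self.required


gate = OodGate()
errors = [0.09, 0.08, 0.01, 0.09, 0.09, 0.09]
decisions = [gate.update(e) for e in errors]
print(decisions)
```

The brief spike in the first two frames is absorbed because the third frame resets the streak; only the final run of three OOD frames triggers a rejection.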

guards.builtin.execution (L2)

Task-level boundary enforcement. Dispatches the active boundary node's L2 callback and enforces node timeouts.

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable/disable this guard |

Checks (in order):

  1. callback: calls the node's registered L2 boundary callback; rejects on violation.
  2. timeout_sec: rejects if the node has been active longer than the timeout.
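
The check order above can be sketched as follows. The node layout, the check_node helper, and the joint_speed_ok callback are assumptions made for this example; they only mimic the callback-then-timeout ordering and do not reflect DAM's internal callback signature.

```python
import math


def joint_speed_ok(observation: dict, max_speed: float) -> bool:
    """Hypothetical L2 callback: joint velocity norm must stay within max_speed [rad/s]."""
    norm = math.sqrt(sum(v * v for v in observation["joint_velocities"]))
    return norm <= max_speed


def check_node(node: dict, observation: dict, active_for_sec: float):
    """Return (ok, failed_check). The callback runs first, then the timeout."""
    if not node["callback"](observation, **node.get("params", {})):
        return False, "callback"
    timeout = node.get("timeout_sec")
    if timeout is not None and active_for_sec > timeout:
        return False, "timeout_sec"
    return True, None


node = {"callback": joint_speed_ok, "params": {"max_speed": 0.15}, "timeout_sec": 10.0}
print(check_node(node, {"joint_velocities": [0.1, 0.0]}, active_for_sec=1.0))
print(check_node(node, {"joint_velocities": [0.3, 0.0]}, active_for_sec=20.0))
print(check_node(node, {"joint_velocities": [0.1, 0.0]}, active_for_sec=20.0))
```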

Built-in L2 callbacks include task_joint_speed_limit, task_workspace_bounds, check_gripper_clear, and task_gripper_command_guard (clamps incompatible gripper commands to no-op). For pick-and-place, the default stack uses a left-to-right task_gripper_sequence list: close is allowed only in a 15 cm left pick cube, transfer allows no open/close command, and open is allowed only in a 15 cm right place cube about 20 cm away.

L2 results are emitted as /dam/L2 guard messages with event_class: "task". List containers fan out per active boundary/node name, so replay and the console can show which task phase produced the result.


Boundary Container Types

A boundary defines the safety envelope active during a task. Boundaries consist of nodes grouped in containers of one of three types.

single — one static node

The simplest container. Holds one node that is active for the entire task.

boundaries:
  safe_idle:
    layer: L1
    type: single
    nodes:
      - callback: joint_velocity_limit
        params:
          max_velocities: [0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
        fallback: emergency_stop

list — sequential node progression

Nodes are activated in order. Call runtime.advance_container("name") to advance the container to its next node. Useful for multi-phase tasks (reach → grasp → lift → place).

boundaries:
  pick_place_approach:
    layer: L2
    type: list
    loop: false   # if true, wraps back to node 0 after the last node
    nodes:
      - callback: task_workspace_bounds
        params:
          bounds:
            - [-0.35, 0.35]
            - [-0.05, 0.45]
            - [0.01, 0.40]
        fallback: hold_position
        timeout_sec: 15.0

      - callback: task_workspace_bounds
        params:
          bounds:
            - [-0.20, 0.20]
            - [0.05, 0.35]
            - [0.01, 0.15]
        fallback: hold_position
        timeout_sec: 8.0

      - callback: task_joint_speed_limit
        params:
          max_speed: 0.15
        fallback: hold_position
        timeout_sec: 10.0
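
A minimal model of list progression, including the loop flag, might look like the sketch below. ListContainer is a hypothetical class, not DAM's container type. In particular, staying on the last node when loop is false is an assumption; the source only states what happens when loop is true.

```python
class ListContainer:
    """Sequential node progression with optional wrap-around (illustrative sketch)."""

    def __init__(self, nodes, loop: bool = False):
        if not nodes:
            raise ValueError("a list container needs at least one node")
        self.nodes = list(nodes)
        self.loop = loop
        self.index = 0  # node 0 is active when the task starts

    @property
    def active(self):
        return self.nodes[self.index]

    def advance(self):
        """Move to the next node; wrap to node 0 if loop is set, else stay on the last node."""
        if self.index + 1 < len(self.nodes):
            self.index += 1
        elif self.loop:
            self.index = 0
        return self.active


phases = ["approach", "descend", "grasp"]
once = ListContainer(phases, loop=False)
looping = ListContainer(phases, loop=True)
print([once.advance() for _ in range(4)])
print([looping.advance() for _ in range(4)])
```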

graph — arbitrary DAG transitions

Nodes form a directed graph, and transitions are triggered programmatically. Graph containers require Python setup: they are not yet supported via from_stackfile. Until Phase 3, use list for sequential multi-phase tasks.

boundaries:
  recovery_graph:
    layer: L2
    type: graph
    nodes:
      - callback: task_joint_speed_limit
        params:
          max_speed: 0.3
        fallback: hold_position
      - callback: task_joint_speed_limit
        params:
          max_speed: 0.05
        fallback: emergency_stop

Node fields

| Field | Type | Description |
|---|---|---|
| callback | string | Built-in or registered callback name to evaluate |
| params | object | Callback-specific settings such as speed limits or workspace bounds |
| fallback | string | Optional named fallback strategy to use when the node violates |
| timeout_sec | float | Optional maximum time a node may remain active |

Common params fields

| Field | Type | Description |
|---|---|---|
| max_speed | float | Maximum joint velocity norm [rad/s] |
| max_velocities | list[float] | Per-joint velocity limit [rad/s] |
| bounds | list[list[float]] | [[xmin, xmax], [ymin, ymax], [zmin, zmax]] [m] |
| upper | list[float] | Per-joint position upper limit [rad] |
| lower | list[float] | Per-joint position lower limit [rad] |
| max_force_n | float | Maximum force norm [N] |

Fallback strategies (per node)

| Name | Behaviour |
|---|---|
| emergency_stop | Immediately stop all motion. Sets hardware E-Stop if available. |
| hold_position | Command the robot to hold its current joint positions. |
| retreat | Move at low speed along a predefined retreat trajectory. |

Hardware Section

The hardware section declares physical or virtual hardware interfaces.

LeRobot example (SO-ARM101)

hardware:
  preset: so101_follower    # auto-loads joint names and factory limits

  joints:                   # optional calibration overrides
    shoulder_pan:
      limits_rad: [-2.09, 2.09]
    gripper:
      limits_rad: [0.0, 0.044]

  sources:
    follower_arm:
      type: motor
      port: /dev/tty.usbmodem5AA90244141
      robot_type: so101_follower
      id: my_follower_arm

    top_cam:
      type: opencv
      index_or_path: 0
      width: 640
      height: 480
      fps: 30

  sinks:
    follower_command:
      ref: sources.follower_arm   # bidirectional — same robot instance

OpenCV camera options are source-level fields. Put index_or_path, width, height, and fps directly under the camera source; params: is reserved for boundary and guard configuration.

For a LeRobot motor source, robot_type selects the concrete LeRobot robot configuration and its default calibration namespace. For so101_follower, the default calibration directory is ~/.cache/huggingface/lerobot/calibration/robots/so101_follower/, and the configured id selects the JSON file within that directory. Set calibration_path only when using an explicit alternate calibration location.

ROS2 example

hardware:
  sources:
    joint_states:
      type: ros2
      topic: /joint_states
      msg_type: sensor_msgs/JointState
      mapping:
        joint_positions: position
        joint_velocities: velocity

  sinks:
    joint_commands:
      type: ros2
      topic: /joint_trajectory_controller/joint_trajectory
      msg_type: trajectory_msgs/JointTrajectory

Loading a Stackfile in Python

High-level (with LeRobot runner)

from dam.runner.lerobot import LeRobotRunner

# from_stackfile automates registry and adapter construction
runner = LeRobotRunner.from_stackfile("examples/stackfiles/so101.yaml")
runner.run("pick_and_place")  # runs managed loop until KeyboardInterrupt

Low-level (GuardRuntime directly)

from dam.runtime.guard_runtime import GuardRuntime

runtime = GuardRuntime.from_stackfile("my_stackfile.yaml")

# Register your adapters (named)
runtime.register_source("arm", my_source_adapter)
runtime.register_policy(my_policy_adapter)
runtime.register_sink(my_sink_adapter)

# Start a task (activates its boundary containers)
runtime.start_task("pick_and_place")

# Step manually (passive mode)
n_cycles = 100  # example cycle count
for _ in range(n_cycles):
    result = runtime.step()
    print(result.risk_level, result.was_clamped, result.was_rejected)

runtime.stop_task()

Programmatic construction

Prefer Stackfiles for normal use. They are easier to inspect, validate, review, and share than hand-built runtime objects. Use the low-level Python APIs only when you are writing DAM itself, building a test fixture, or adding a new adapter/guard implementation.


Loopback Logging (MCAP)

Configure continuous recording of cycle records (observations, actions, guard results) to MCAP files for post-mortem analysis and incident investigation.

loopback:
  backend: mcap                    # "mcap" (recommended) or "pickle"
  output_dir: /data/robot/sessions # Directory for session files
  window_sec: 10.0                 # Ring buffer depth for violation context
  capture_on_violation: true       # Always on (controls image capture)
  rotate_mb: 500.0                 # Rotate file every 500 MB
  rotate_minutes: 60.0             # Or every 60 minutes (whichever comes first)
  max_queue_depth: 256             # Records buffered before drop (never blocks main loop)
  capture_images_on_clamp: false   # Also capture images on CLAMP? (expensive; default off)

Key settings:

  • output_dir: Must be writable. Missing paths are created on first write.
  • window_sec: Ring buffer depth in seconds. Larger values keep more frames (and images) per violation. Example: 30 s at 50 Hz is 1500 frames, roughly 10–30 MB per violation with cameras.
  • rotate_mb / rotate_minutes: File rotation policy. With 500 MB and 60 min, the file rotates on whichever limit is reached first.
  • capture_images_on_clamp: Default off because motion clamps can be frequent (boundary probing). Enable only for detailed debugging.
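
The rotation rule, the window arithmetic, and the non-blocking queue can each be sketched with the standard library. All three helpers are hypothetical and exist only for this illustration; treating 1 MB as 1,000,000 bytes is an assumption, and the real recorder may count differently.

```python
import queue


def should_rotate(bytes_written: int, elapsed_sec: float,
                  rotate_mb: float = 500.0, rotate_minutes: float = 60.0) -> bool:
    """Rotate when either the size or the age limit is reached, whichever comes first."""
    return bytes_written >= rotate_mb * 1_000_000 or elapsed_sec >= rotate_minutes * 60.0


def frames_in_window(window_sec: float, control_frequency_hz: float) -> int:
    """Number of cycle records held by a ring buffer of window_sec seconds."""
    return round(window_sec * control_frequency_hz)


def enqueue_or_drop(q: queue.Queue, record) -> bool:
    """Try to enqueue without blocking; return False if the record was dropped."""
    try:
        q.put_nowait(record)
        return True
    except queue.Full:
        return False


print(should_rotate(bytes_written=600_000_000, elapsed_sec=10.0))
print(frames_in_window(window_sec=30.0, control_frequency_hz=50.0))

# A tiny queue stands in for max_queue_depth: the third record is dropped.
q = queue.Queue(maxsize=2)
print([enqueue_or_drop(q, i) for i in range(3)])
```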

Output: Each session is a single .mcap file containing:

  • /dam/cycle: control loop summary (pass / clamp / reject / fault)
  • /dam/obs: sensor state (joint angles, EE pose, force/torque)
  • /dam/action: proposal and validated action
  • /dam/L0 through /dam/L3: per-layer guard results
  • /dam/latency: per-layer latency aggregates
  • /dam/images/{cam}: camera frames (on violation or clamp)

See Loopback Logging for full schema, API, and analysis tools.


Hot Reload

DAM can reload boundary constraints and guard parameters from a modified Stackfile without stopping the control loop. Changes are applied atomically at the start of the next cycle.

from dam.config.hot_reload import StackfileWatcher

watcher = StackfileWatcher(
    path="my_stackfile.yaml",
    on_change=runtime.apply_pending_reload,
    poll_interval_s=0.5,
)
watcher.start()

# Edit my_stackfile.yaml on disk — changes take effect within ~0.5s
# ...

watcher.stop()

Only static config-pool parameters (guard limits, boundary constraints) are reloaded. The guard class structure and task definitions are not changed at runtime.
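
One common way to get "applied atomically at the start of the next cycle" is a pending slot that the watcher thread fills and the control loop swaps in before it evaluates anything. The sketch below shows that pattern under stated assumptions: ReloadableConfig is a hypothetical class, and nothing here describes how runtime.apply_pending_reload is actually implemented.

```python
import threading


class ReloadableConfig:
    """Pending-slot pattern: writers submit, the control loop swaps at cycle start."""

    def __init__(self, initial: dict):
        self._active = dict(initial)
        self._pending = None
        self._lock = threading.Lock()

    def submit(self, new_config: dict) -> None:
        """Called from the watcher thread; stores the new config without applying it."""
        with self._lock:
            self._pending = dict(new_config)

    def begin_cycle(self) -> dict:
        """Called once at the start of each control cycle; swaps in any pending config."""
        with self._lock:
            if self._pending is not None:
                self._active, self._pending = self._pending, None
        return self._active


cfg = ReloadableConfig({"max_speed": 0.15})
first = cfg.begin_cycle()          # cycle 1 sees the original limits
cfg.submit({"max_speed": 0.05})    # file edited mid-cycle: nothing changes yet
second = cfg.begin_cycle()         # cycle 2 picks up the new limits as a whole
print(first["max_speed"], second["max_speed"])
```

Because the swap replaces the whole dict in one step, a cycle never observes a half-updated set of limits.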


Stackfile Validation

Validate a Stackfile against the schema without running the control loop:

dam validate my_stackfile.yaml

CI automatically validates all Stackfiles under examples/stackfiles/ on every push.