Two Tracks, One Floor: A Commissioning Framework for Autonomous Mining

By: Maghee McMullen, Senior Consultant & Partner, mule.Bot

The autonomy industry has a demo problem. Systems that look great in controlled conditions get stamped production-ready and shipped to sites. Then the site happens to them: not as a single dramatic failure, but as a grinding accumulation of edge cases, dispatch timeouts, degraded-mode surprises, and operator confusion. The operation erodes until someone pulls the plug, or until someone gets hurt first. In my experience the root cause is almost always the same: commissioning wasn’t a system. It was a checklist somebody made up between mobilization and demo day.

I’ve spent years deploying autonomous haul trucks, from single-truck quarry pilots to multi-fleet mining operations. This article is about what I’ve learned building commissioning systems for autonomous and semi-autonomous machines (ASAMs, in ISO 17757 terms): taking a deployment from “the machine completes a cycle” to “the system runs a shift.” The thinking is grounded in STPA (System-Theoretic Process Analysis), SpaceX’s approach of treating reality as the validation system, and field failures I wish I’d caught earlier. None of this is hypothetical.

The Site Is the Test System

Your sim environment, your lab bench, your factory acceptance test — useful for catching the obvious stuff before the machine touches dirt. But they are not the validation system. The site is. The haul road with its actual grade, dust, mixed traffic, and wireless propagation is the only thing that tells you what your system actually does.

Stage-gated commissioning structures that exposure. Each stage introduces progressively more reality, with defined pass criteria. When a gate fails, you know which variable broke it because you introduced it on purpose. Without gates, failures compound and you’re solving three things at once without knowing which fix worked.
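A minimal Python sketch of the gating idea, assuming each stage can be reduced to one newly introduced variable and one pass check. The `Gate` class, the stage names, and the hard-coded results are hypothetical placeholders, not a real commissioning plan.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Gate:
    name: str                    # stage description
    new_variable: str            # the single piece of reality this stage adds
    passes: Callable[[], bool]   # pass criterion, evaluated from field results


def first_failed_gate(gates: List[Gate]) -> Optional[Gate]:
    """Run gates in order and stop at the first failure.

    Because each gate adds exactly one variable, the returned gate names
    the variable that most likely broke the system.
    """
    for gate in gates:
        if not gate.passes():
            return gate
    return None


# Hypothetical stages; the lambdas stand in for real pass criteria.
gates = [
    Gate("empty truck, flat road", "path following", lambda: True),
    Gate("loaded truck, flat road", "payload", lambda: True),
    Gate("loaded truck, ramp", "grade", lambda: False),
    Gate("loaded truck, ramp, mixed traffic", "traffic", lambda: True),
]
failed = first_failed_gate(gates)
print(failed.new_variable if failed else "all gates passed")  # grade
```

Stopping at the first failure is the point of the design: later gates are never run on top of an unexplained failure, so there is only one new variable to investigate.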

Two Tracks, Fundamentally Different

Safety commissioning answers one question: what is required to prevent harm to people? The standard is binary. Either the system handles a condition safely, or it doesn’t operate in that condition. No negotiation, no schedule pressure.

The part people get wrong: safety commissioning does not require the ASAM to be productive, or even to complete its mission. It requires that off-nominal conditions (machine off path, sensor degraded, comms lost, person in the autonomous operating zone, or AOZ) don’t result in harm. A truck that stops dead and needs a manual reset every time it loses RTK fix has failed production commissioning. It has not failed safety commissioning. Different problems, different owners.

Production commissioning asks: is the system useful? Cycle time, spotting accuracy, dispatch reliability, shift availability. These matter for whether the deployment survives economically. But they’re not safety criteria, and a slow machine can be iterated while running safely. A machine with open safety gaps cannot run at all.
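The asymmetry between the two tracks can be sketched as a decision function. This is only an illustration: the dictionary keys are hypothetical, and a real safety track carries many more criteria than the single one shown.

```python
from typing import Dict


def deployment_decision(safety: Dict[str, bool],
                        production: Dict[str, bool]) -> str:
    """Safety is a hard gate; production only changes what you iterate on."""
    # An empty safety record counts as unverified, which is not a pass.
    if not safety or not all(safety.values()):
        return "do not operate"
    if all(production.values()):
        return "operate"
    return "operate and iterate on production"


# The truck that stops dead on every RTK loss and needs a manual reset.
safety = {"rtk_loss_reaches_safe_state": True}
production = {"recovers_without_manual_reset": False}
print(deployment_decision(safety, production))
# operate and iterate on production
```

Note that no production result, however good, can change a "do not operate" outcome.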

Safety Commissioning: Four Threads

Safety criteria come from hazard analysis (STPA or equivalent), not from intuition or perceived importance. Every requirement you call safety-critical inherits full rigor: traceability, independent verification, formal closure. If it doesn’t trace to preventing harm, it doesn’t belong on the safety track. Misclassification dilutes rigor on what actually matters.

The floor is fixed. The path is iterative: isolate variables, test, understand the failure, fix, re-test. What follows are four commissioning threads, organized by the question each answers.

[Figure: “Safety commissioning: four threads.” Four quadrants: Stop Architecture (top left), Safe to Approach (top right), Degraded-Mode Response (bottom left), Envelope Enforcement (bottom right).]

The four threads of safety commissioning. Each thread is a question, and every thread must pass repeatably under production-representative conditions before the safety floor is established.

Stop Architecture

Can you stop this machine, every time, regardless of software state?

Three elements, one thread.

The safety relay is the hardwired backbone: de-energize it and the machine stops regardless of what software is doing. Test with the supervisory system disabled. If the relay’s behavior depends on healthy software, it’s not a safety relay.

Safe stop performance means specific numbers: maximum stopping distance from maximum speed on maximum grade at maximum payload, park brake applied, no resumption without deliberate human action. If you want certifiable performance, ISO 3450, the earth-moving machinery standard for brake system performance requirements and test procedures, is a good guideline. I’ve seen systems that were never tested loaded on the 12% ramp they operate on. Test at envelope limits.

The auto/manual interlock governs mode transitions: no autonomous commands in manual mode, manual override without added latency, and mode physically indicated so anyone approaching can see it. Ambiguous interlock state is a kill hazard. I don’t use that term loosely.
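To see why a stop test on flat ground says little about the ramp, here is a rough constant-deceleration estimate in Python. It is a simplified physics sketch, not the ISO 3450 test procedure: it assumes constant brake force, constant speed during the actuation delay, and no brake fade, and the speed, deceleration, and delay values are made up for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def stopping_distance_m(speed_kph: float, grade_pct: float,
                        level_decel_mps2: float, delay_s: float = 0.5) -> float:
    """Estimate downhill stopping distance with a constant-deceleration model.

    grade_pct is the downhill grade in percent (12 means a 12% ramp).
    level_decel_mps2 is the deceleration the brakes achieve on level ground
    at the tested payload; delay_s is the time before the brakes take effect.
    """
    v = speed_kph / 3.6                        # convert to m/s
    slope = math.atan(grade_pct / 100.0)       # grade angle in radians
    # Gravity along the slope works against the brakes when heading downhill.
    net_decel = level_decel_mps2 - G * math.sin(slope)
    if net_decel <= 0:
        raise ValueError("brakes cannot decelerate the machine on this grade")
    return v * delay_s + v * v / (2.0 * net_decel)


# Assumed values: 40 km/h, 2.5 m/s^2 level deceleration at full payload.
flat = stopping_distance_m(40, 0, 2.5)
ramp = stopping_distance_m(40, 12, 2.5)
print(f"flat: {flat:.1f} m, 12% downhill: {ramp:.1f} m")
```

Even in this idealized model the downhill figure is far longer than the flat one, and the model is optimistic. The measured number on the actual ramp, loaded, is the one that counts.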

Safe to Approach

Can a person walk up to this machine without risk?

Distinct from “stopped.” A machine receiving dispatch commands with a watchdog that will resume on timeout is not safe to approach. I’ve watched this happen. Safe to approach requires: mission suspended and not silently resumable, motion commands blocked at the safety controller level, park brake applied, and state verifiable by the approaching person without a radio call. ISO 17757 is the reference to measure this against. Test by triggering the mode from every entry condition, then attempting motion via every channel. If anything moves, you’re not commissioned.
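The difference between “stopped” and “safe to approach” can be written as a conjunction of conditions. A minimal sketch, assuming the machine exposes these five signals; the `ApproachState` fields are hypothetical names, not an ISO 17757 data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ApproachState:
    mission_suspended: bool
    auto_resume_armed: bool                    # e.g. a watchdog that resumes on timeout
    motion_blocked_at_safety_controller: bool
    park_brake_applied: bool
    state_visible_locally: bool                # readable on foot, no radio call needed


def unmet_conditions(s: ApproachState) -> List[str]:
    """Return every condition that blocks 'safe to approach' (empty = safe)."""
    unmet = []
    if not s.mission_suspended:
        unmet.append("mission not suspended")
    if s.auto_resume_armed:
        unmet.append("mission can resume silently")
    if not s.motion_blocked_at_safety_controller:
        unmet.append("motion not blocked at safety controller")
    if not s.park_brake_applied:
        unmet.append("park brake not applied")
    if not s.state_visible_locally:
        unmet.append("state not verifiable by the approaching person")
    return unmet


# Stopped, braked, and indicated, but a watchdog will resume the mission.
stopped_only = ApproachState(True, True, True, True, True)
print(unmet_conditions(stopped_only))  # ['mission can resume silently']
```

Returning the list of unmet conditions rather than a bare boolean makes a failed test point at the specific gap.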

Degraded-Mode Response

When something breaks, does the machine reach a safe state deterministically?

GNSS loss, perception sensor loss, comms dropout, actuator fault: each must produce a defined safe state every time. Test by deliberate injection under load, on grade, mid-cycle. Not parked on flat ground. A GNSS dropout on a loaded 10% ramp is a fundamentally different event from one on flat ground at rest: the dynamics, the time-to-safe-state, and the machine’s physical state during transition are all different. Any failure mode that produces inconsistent results across repeated injections is not commissioned. Non-deterministic safety behavior is not safety behavior.
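One way to make “inconsistent results across repeated injections” checkable is to group injection results by fault and test condition and flag any group that did not end in the single defined safe state every time. A sketch under assumed names; the fault labels, condition labels, and state strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (fault, test condition, observed final state)
Injection = Tuple[str, str, str]


def uncommissioned_cases(results: Iterable[Injection],
                         expected: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (fault, condition) pairs whose repeated injections did not
    all end in the one safe state defined for that fault."""
    observed: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for fault, condition, final_state in results:
        observed[(fault, condition)].add(final_state)
    # A fault with no defined safe state is flagged too (expected.get -> None).
    return sorted(
        key for key, states in observed.items()
        if states != {expected.get(key[0])}
    )


expected = {"gnss_loss": "stopped_park_brake"}
results = [
    ("gnss_loss", "flat, at rest", "stopped_park_brake"),
    ("gnss_loss", "flat, at rest", "stopped_park_brake"),
    ("gnss_loss", "10% ramp, loaded", "stopped_park_brake"),
    ("gnss_loss", "10% ramp, loaded", "rolling_back"),
]
print(uncommissioned_cases(results, expected))
# [('gnss_loss', '10% ramp, loaded')]
```

A check like this can only expose non-determinism in the injections you actually ran. An empty result is evidence, not proof, so the number of repetitions per condition still has to be chosen deliberately.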

Envelope Enforcement

Does the system keep the right things inside the box and wrong things out?

Operational design domain (ODD) boundaries and access control are one thread. The machine must stop before exiting its authorized envelope: geofence, speed zones, condition thresholds. If boundary logic can be defeated by GNSS drift or a path planner edge case, that’s an open gap. I’ve seen geofences fail when an RTK base station is resurveyed and reported positions jump 3 meters. Your enforcement has to handle the sensor, not just the geometry. For access control, every unauthorized AOZ entry must produce a defined response. Test at detection limits: a person emerging from behind a berm in site dust conditions, not a high-vis worker standing in front of the sensor suite in daylight.
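“Handle the sensor, not just the geometry” can be illustrated with a geofence check that shrinks the fence by stopping distance plus position error and treats a physically implausible position jump as a stop condition. This is a sketch, not a production design: the rectangular fence, local metric coordinates, and all thresholds are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]                 # local east, north in meters
Fence = Tuple[float, float, float, float]   # xmin, ymin, xmax, ymax


def inside_with_margin(p: Point, fence: Fence, margin_m: float) -> bool:
    """True if p is inside the fence shrunk by margin_m on every side."""
    xmin, ymin, xmax, ymax = fence
    return (xmin + margin_m <= p[0] <= xmax - margin_m
            and ymin + margin_m <= p[1] <= ymax - margin_m)


def implausible_jump(prev: Point, curr: Point, dt_s: float,
                     max_speed_mps: float, tolerance_m: float = 0.5) -> bool:
    """True if the reported move exceeds what the machine could have driven."""
    moved = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
    return moved > max_speed_mps * dt_s + tolerance_m


def must_stop(prev: Point, curr: Point, dt_s: float, fence: Fence,
              stop_dist_m: float, pos_error_m: float,
              max_speed_mps: float) -> bool:
    """Stop on a suspect position fix or when the margin is used up."""
    margin = stop_dist_m + pos_error_m
    return (implausible_jump(prev, curr, dt_s, max_speed_mps)
            or not inside_with_margin(curr, fence, margin))


fence = (0.0, 0.0, 500.0, 200.0)
# A 3 m jump in 0.1 s at a 15 m/s speed limit is not plausible motion.
print(must_stop((250.0, 100.0), (253.0, 100.0), 0.1, fence, 30.0, 1.0, 15.0))  # True
```

The jump check matters because a shifted position solution can leave the reported point comfortably inside the fence while the real machine is somewhere else.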

Safety commissioning is complete when every thread passes, repeatably, under production-representative conditions. Not before.

Production Commissioning

This track is explicitly iterative. Representative criteria for a haul truck ASAM: cycle time within 10–20% of manned baseline at equivalent payload and grade. Spotting error under one meter. Dispatch latency under 2 seconds at peak load. Fewer than 0.5 unplanned interventions per operating hour. Availability above 85% across a month. Each can be improved post-go-live. None is a reason to halt if the safety floor holds.
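The representative criteria above translate directly into a threshold check. A minimal sketch: the metric key names are hypothetical, and the 20% cycle-time margin assumes the outer end of the 10–20% range is the limit.

```python
from typing import Dict, List


def production_gaps(m: Dict[str, float], manned_cycle_s: float,
                    cycle_margin: float = 0.20) -> List[str]:
    """Return the production criteria that are currently missed.

    A non-empty result is an iteration backlog, not a reason to halt,
    provided the safety floor holds.
    """
    gaps = []
    if m["cycle_s"] > manned_cycle_s * (1.0 + cycle_margin):
        gaps.append("cycle time")
    if m["spotting_error_m"] >= 1.0:
        gaps.append("spotting accuracy")
    if m["dispatch_latency_s"] >= 2.0:
        gaps.append("dispatch latency")
    if m["interventions_per_hour"] >= 0.5:
        gaps.append("interventions")
    if m["availability"] <= 0.85:
        gaps.append("availability")
    return gaps


# Made-up month of data against a 1200 s manned cycle.
month = {
    "cycle_s": 1500.0,
    "spotting_error_m": 0.6,
    "dispatch_latency_s": 1.2,
    "interventions_per_hour": 0.3,
    "availability": 0.82,
}
print(production_gaps(month, manned_cycle_s=1200.0))
# ['cycle time', 'availability']
```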

From Truck Works to Shift Works

The gap between a machine that completes a cycle and a system that runs a shift is not closed by demo hours. It’s closed by structured commissioning: a safety track with a hard floor, and a production track that iterates on top of it. A safety floor that’s genuinely established (not assumed, not skimmed because the demo looked good, not deferred because the schedule said go) is what lets you build a real operation without worrying about the foundation cracking.

The site is the test system. Commission it like you mean it.