Fleet-Level Position Management for Open Autonomy: Provisioning, Datum Strategy, and Telemetry Done Right
By: Aaron Nathan, CEO, Point One Navigation
Open autonomy is a compelling promise: purpose-built robots, vehicles, and machines from different vendors operating together on shared infrastructure, coordinated by open standards rather than locked into a single OEM stack. But that promise often fails when positioning breaks, and in mixed-fleet environments, positioning breaks in ways that are surprisingly easy to miss.
A centimeter-level error that goes unreported doesn't announce itself. A robot continues working. Its telemetry looks fine. The systematic bias in its reported position only becomes visible when it scrapes a wall, drifts into a neighboring work zone, or paints a line two inches off-center for an entire shift. By then, the cause is hard to trace.
This article is a practitioner's guide to preventing that class of failure, from the moment you provision your first device to the architecture decisions that determine whether your fleet scales gracefully or accumulates technical debt. The patterns here apply regardless of which correction provider, receiver hardware, or autonomy stack you're running. They're designed to work across open, mixed-fleet environments where no single vendor owns the full system.
Why Positioning Is the Infrastructure Layer Nobody Owns
In ISO 23725's flexible boundary model for autonomous mobile equipment, the positioning function sits at the intersection of machine, fleet manager, and site; no single layer fully owns it, and that's intentional. What the standard doesn't prescribe is how to make that handoff clean.
In practice, most mixed-fleet deployments end up with positioning managed ad hoc. Each OEM handles corrections delivery for their equipment, datum assumptions go undocumented, and telemetry from different platforms arrives in coordinate systems that almost, but not quite, agree. The result is a positioning architecture that works fine during single-vendor pilots and degrades unpredictably as the fleet grows.
The fix isn't a centralized positioning authority. It's treating the correction delivery, provisioning, datum, and telemetry layers as infrastructure with explicit design decisions, ones you make once, document clearly, and apply consistently across every device on the network.
Part 1: RTK Corrections Delivery: RTCM, NTRIP, and What Actually Matters
Real-time kinematic corrections are what turn a GNSS receiver's meter-level position into centimeter-level accuracy. Getting that correction signal to every device in your fleet, reliably, at low latency, without bandwidth surprises, is the unsexy infrastructure problem that determines whether your autonomy stack works in production.
The RTCM/NTRIP Stack
The industry standard for corrections delivery is RTCM messages over NTRIP (Networked Transport of RTCM via Internet Protocol). RTCM is the message format; NTRIP is the delivery protocol. Together they're well-supported across receiver hardware from virtually every manufacturer, which makes them the right default for mixed-fleet environments.
A few design decisions matter at this layer:
Message type selection. RTCM 3.x MSM (Multiple Signal Messages) are the modern standard. MSM4 provides good carrier phase data for most applications; MSM7 adds full-resolution observables for high-precision or multi-frequency receivers. If your fleet includes a mix of receiver grades, MSM7 is the conservative choice: lower-spec receivers will use what they can, and higher-spec receivers won't be constrained.
Update rate. 1 Hz is the default for most NTRIP streams and is sufficient for most applications. High-speed autonomous platforms (faster than roughly 5 m/s in dynamic environments) benefit from higher update rates, but this is a receiver capability question as much as a network question; confirm both ends support it before specifying it.
Latency budget. End-to-end correction latency, from the reference station observation to when your rover applies it, should stay below 2–3 seconds for real-time applications. Beyond that, the corrections begin to describe atmospheric conditions that no longer match your rover's environment. Monitor this in production; cellular connectivity and cloud routing can introduce spikes that don't show up in lab testing.
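One way to enforce that budget in production is a rolling monitor on correction age. The sketch below is illustrative, not a specific product API: the class name, the 2.5-second threshold, and the window size are all assumptions you would tune for your own deployment.

```python
from collections import deque

class CorrectionLatencyMonitor:
    """Rolling watch on end-to-end correction age (hypothetical helper).

    Thresholds are illustrative: a budget of ~2-3 s matches the guidance
    above; the window size controls how long a spike stays visible.
    """

    def __init__(self, budget_s=2.5, window=60):
        self.budget_s = budget_s
        self.ages = deque(maxlen=window)  # most recent observed ages

    def record(self, observed_at_s, applied_at_s):
        """Log one correction: reference-station observation time vs.
        the time the rover applied it, both as epoch seconds."""
        age = applied_at_s - observed_at_s
        self.ages.append(age)
        return age

    def over_budget(self):
        """True if any recent correction exceeded the latency budget."""
        return bool(self.ages) and max(self.ages) > self.budget_s
```

The point of keeping a window rather than just the latest value is that cellular and cloud-routing spikes are transient; you want them to show up in fleet telemetry even if the very next correction arrived on time.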
Connection resilience. NTRIP connections drop. Your device firmware or middleware needs a reconnection strategy with exponential backoff, a fallback mountpoint, and clear state management around what the system does during the reconnect interval. An autonomous system that freezes or faults on a 30-second NTRIP outage has a resilience problem; one that gracefully degrades to float or switches to a secondary correction source is more deployable.
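A minimal reconnection loop along those lines might look like the following sketch. The `connect` callable, mountpoint names, and retry limits are hypothetical; the jitter exists so a site-wide outage doesn't end with every device in the fleet hammering the caster at the same instant.

```python
import random
import time

def reconnect_with_backoff(connect, mountpoints, base_s=1.0,
                           cap_s=60.0, max_attempts=8):
    """Retry an NTRIP connection with exponential backoff and jitter,
    rotating through primary and fallback mountpoints.

    `connect(mountpoint)` is a caller-supplied function returning True
    on success; this interface is illustrative, not a real library API.
    Returns the mountpoint that connected, or None if all attempts fail,
    at which point the caller should degrade gracefully (float solution
    or a secondary correction source) rather than fault.
    """
    delay = base_s
    for attempt in range(max_attempts):
        mp = mountpoints[attempt % len(mountpoints)]  # rotate fallbacks
        if connect(mp):
            return mp
        # jittered sleep avoids synchronized fleet-wide reconnect storms
        time.sleep(delay * (0.5 + random.random() / 2))
        delay = min(delay * 2, cap_s)
    return None
```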
Bandwidth Considerations
Full observation-space (OSR) corrections, the kind used in single-baseline and network RTK, typically run 5–15 kbps per device. By contrast, VRS-style corrections typically run 1–3 kbps. For a fleet of 20 devices on a site with limited cellular backhaul, that difference adds up. If you're deploying in bandwidth-constrained environments (underground, remote, satellite-connected), factor corrections bandwidth into your network architecture before you discover it in the field.
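The arithmetic is simple enough to put in a planning script. This sketch just multiplies out the figures above (no multicast assumed, since each NTRIP client holds its own stream):

```python
def fleet_corrections_bandwidth_kbps(devices, per_device_kbps):
    """Aggregate corrections bandwidth: each device pulls its own stream."""
    return devices * per_device_kbps

# 20 devices on full OSR streams at the high end of the range (15 kbps each)
osr_total = fleet_corrections_bandwidth_kbps(20, 15)   # 300 kbps sustained
# the same fleet on VRS-style streams at the high end (3 kbps each)
vrs_total = fleet_corrections_bandwidth_kbps(20, 3)    # 60 kbps sustained
```

A 300 kbps sustained load is trivial on good LTE and very much not trivial on a shared satellite backhaul, which is why this belongs in the network design rather than as a field discovery.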
Part 2: Device Provisioning and Fleet Grouping
Corrections delivery is a network problem. Device provisioning is a fleet management problem. They overlap, and handling that overlap cleanly is what separates a 5-robot pilot from a 200-robot fleet operation.
Mountpoints and Credentials at Scale
Each device connecting to an NTRIP caster needs a mountpoint (defining which correction stream to use) and credentials (authentication to use it). In small deployments, these get hardcoded. In larger ones, hardcoded credentials create operational problems: rotating a compromised credential requires touching every device individually, adding a new site requires manual reconfiguration, and auditing who's connected to what requires manual inspection. Sharing one set of credentials across two devices connected simultaneously, for example, can cost days or weeks of wasted work: one of the devices will quietly operate without RTK-corrected position data.
The better pattern is centralized credential management via API, with devices pulling their mountpoint assignments dynamically at connection time. This decouples the physical device from the correction stream assignment: you can move a device from one site to another (potentially with different mountpoints, different datum configurations, or different correction providers) through a configuration change rather than a firmware update.
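In code, the pattern looks roughly like this. The endpoint path, response fields, and auth scheme below are hypothetical placeholders; adapt them to your correction provider's actual provisioning API.

```python
import json
import urllib.request

def parse_assignment(cfg):
    """Map a (hypothetical) provisioning API response to NTRIP client
    settings. Field names are illustrative, not a real provider schema."""
    return {
        "host": cfg["caster_host"],
        "mountpoint": cfg["mountpoint"],
        "username": cfg["ntrip_user"],
        "password": cfg["ntrip_password"],
    }

def fetch_device_assignment(api_base, device_id, api_token):
    """Pull this device's mountpoint assignment at boot instead of
    hardcoding it, so site moves and credential rotation become
    server-side configuration changes. Endpoint path is illustrative."""
    req = urllib.request.Request(
        f"{api_base}/devices/{device_id}/assignment",
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_assignment(json.load(resp))
```

Keeping the response parsing separate from the HTTP call also makes the configuration logic testable without a live caster.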
Grouping Strategy
Mixed fleets benefit from explicit grouping at the provisioning layer: by site, by work zone within a site, by vehicle class, or by correction type. Grouping enables:
Selective mountpoint assignment. High-precision applications (grading, drilling, line painting) can be directed to denser correction streams; logistics robots with looser accuracy requirements can use broader-area coverage.
Telemetry aggregation. When a positioning anomaly occurs, knowing which devices share the same correction stream and the same atmospheric window is essential for root cause analysis.
Staged rollouts. When you update correction configuration (changing mountpoints, rotating credentials, switching correction providers), group-level rollout lets you validate on a subset before fleet-wide deployment.
Modern correction provider APIs support device grouping natively. If yours doesn't, implement it at your fleet management layer rather than treating every device as an independent entity. Or better yet, use a corrections network that supports dynamic provisioning out of the box.
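If you do end up implementing grouping yourself, the core resolution logic is small. This sketch (device names, group names, and mountpoints are all made up) shows group-based assignment with a per-device override, which is also how a staged rollout works: override one canary device before changing the whole group.

```python
def resolve_mountpoint(device_id, device_groups, group_mountpoints,
                       overrides=None):
    """Resolve a device's correction stream: a per-device override wins,
    otherwise fall back to its group's assignment."""
    overrides = overrides or {}
    if device_id in overrides:
        return overrides[device_id]
    return group_mountpoints[device_groups[device_id]]

# illustrative fleet: a line painter needs dense corrections, a tug doesn't
groups = {"paint-07": "precision", "tug-12": "logistics"}
mounts = {"precision": "SITE_A_DENSE", "logistics": "SITE_A_WIDE"}

# staged rollout: move one canary onto the new stream before the group
canary = {"paint-07": "SITE_A_DENSE_V2"}
```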
Provisioning Checklist
Credentials issued per-device or per-group, not shared across fleet
Mountpoint assignment managed centrally, overridable per device
Connection state (connected / float / fixed / disconnected) surfaced to fleet manager
Reconnection logic tested under simulated outage conditions
Bandwidth per-device profiled against available site connectivity
Part 3: Datum Strategy: The Source of Fleet Disagreement
Datum is the coordinate reference framework your positioning data lives in. Get it wrong, or get it inconsistently applied across different devices and software layers, and two machines that are both "accurate" can still disagree about where they are by decimeters.
This is the most commonly underspecified layer in mixed-fleet deployments.
The Core Issue
Most GNSS receivers natively output positions in WGS84 (the reference frame GPS is built on). But most site operations work in a local coordinate system: a state plane coordinate system, a site grid, or a project-specific datum. Converting between WGS84 and your local system involves a datum transformation that, if applied differently by different components of your system, introduces a systematic offset that looks like positioning error but isn't.
In a single-vendor system, this usually gets handled internally and invisibly. In a mixed-fleet system, you may have:
Robots outputting in WGS84
A fleet manager working in a local coordinate system
A site survey in a third reference frame
Correction streams configured for yet another datum
Each mismatch is a potential offset. A 10–30 cm datum mismatch, easily achievable if transformation parameters are applied inconsistently, is large enough to cause real operational problems in precision applications.
Practical Datum Decisions
Choose a single site datum and document it. Before deploying any autonomous equipment, establish the site coordinate system, the datum it's tied to, and the transformation parameters (or the grid shift file) needed to convert from GNSS output to site coordinates. This SOP-style document should be part of site setup, not an afterthought.
Push the transformation to a single layer. Ideally, datum transformation happens once, either at the correction provider layer (if using a mountpoint configured in your site datum), at the fleet manager layer, or at a positioning middleware layer. Avoid having multiple components each applying partial transformations.
Verify with ground truth. Before commissioning autonomous operations, drive or walk a device over a set of surveyed control points and verify that reported positions match known positions in your site coordinate system. A 2 cm error at a control point is acceptable; a 15 cm systematic offset indicates a datum problem.
Document mountpoint datum. When using NTRIP, each mountpoint has an associated datum. This is listed in the NTRIP source table. If your fleet software is applying a coordinate transformation but your correction provider is already outputting in a different datum, you may be double-transforming. Check this explicitly.
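The control point verification above distills to one statistic: is the error at your surveyed points random noise, or a consistent shift? A hedged sketch (the 10 cm threshold and field names are illustrative; units are metres in local east/north coordinates):

```python
import math

def control_point_check(pairs):
    """Compare (reported, surveyed) (east, north) position pairs in metres.

    A small RMS with a near-zero mean offset looks like normal measurement
    noise; a large, consistent mean offset is the signature of a datum or
    transformation problem. The 0.10 m flag threshold is illustrative.
    """
    n = len(pairs)
    de = [rep[0] - sur[0] for rep, sur in pairs]
    dn = [rep[1] - sur[1] for rep, sur in pairs]
    mean = (sum(de) / n, sum(dn) / n)  # systematic component
    rms = math.sqrt(sum(e * e + m * m for e, m in zip(de, dn)) / n)
    return {
        "mean_offset_m": mean,
        "rms_m": rms,
        "datum_suspect": math.hypot(*mean) > 0.10,
    }
```

Run at commissioning, this catches exactly the failure mode described earlier: every point off by the same 15 cm in the same direction is a transformation bug, not a bad receiver.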
Datum Checklist
Site coordinate system defined and documented before deployment
All devices and software layers confirmed to use consistent datum
Transformation layer identified (correction provider, fleet manager, or middleware)
Control point verification completed at commissioning
Mountpoint datum confirmed against fleet manager datum assumption
Part 4: Telemetry Architecture: Privacy, Integrity, and Diagnostics
Position telemetry from autonomous fleets is operationally valuable and legally sensitive. Done well, it powers predictive maintenance, root cause analysis, and multi-fleet coordination. Done poorly, it creates privacy liability, generates noise that obscures real signals, and makes post-incident debugging harder than it should be.

What to Capture
Not all position data is equally useful. A raw stream of positions at 10 Hz from every device in your fleet creates storage and processing demands that exceed most organizations' capacity to act on it. A more practical architecture captures:
Solution quality metadata alongside position. Fix type (fixed, float, standalone), number of satellites, HDOP/PDOP, correction age, and baseline length are more diagnostically useful than raw positions for most fleet management purposes. A device reporting "float solution for 45 minutes" is a flag; a device reporting consistent "fixed, 1.2 cm" is not.
Event-triggered full telemetry. High-frequency position logs are most valuable when something goes wrong. Designing telemetry to capture full-resolution data around events (a fault condition, an unexpected stop, a position jump) rather than continuously gives you the data you need without the storage overhead.
Correction source traceability. Which mountpoint was in use, which correction provider, and at what latency: all of this should be logged alongside position data. When a positioning anomaly occurs, the first diagnostic question is whether it was a correction delivery problem, an atmospheric event, or a receiver issue. You can't answer that without knowing what corrections were active.
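The event-triggered pattern is commonly implemented as a rolling pre-event buffer that only gets flushed when something trips. A sketch, with illustrative field names and an illustrative trigger condition (loss of fix, or stale corrections):

```python
from collections import deque

class EventTriggeredLogger:
    """Hold a rolling window of full-rate position records in memory and
    persist it only when an event fires. Routine operation costs nothing
    beyond the buffer; incidents arrive with their pre-event context."""

    def __init__(self, seconds=30, rate_hz=10):
        self.buffer = deque(maxlen=seconds * rate_hz)
        self.captures = []  # stand-in for durable storage / upload

    def record(self, sample):
        # sample: dict with position, fix_type, correction_age_s,
        # mountpoint, etc. (field names illustrative)
        self.buffer.append(sample)
        if sample["fix_type"] != "fixed" or sample["correction_age_s"] > 3.0:
            self.on_event(reason="quality_drop")

    def on_event(self, reason):
        """Snapshot the pre-event window for post-incident analysis."""
        self.captures.append({"reason": reason, "window": list(self.buffer)})
```

A real implementation would also keep recording for some post-event interval and deduplicate rapid re-triggers, but the buffer-and-flush core is the part that saves the storage.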
Privacy Architecture
Fleet position telemetry can expose operational patterns (production rates, shift schedules, site layout) that operators may consider confidential. In multi-tenant or contractor environments, one party's telemetry may reveal information another party has a legitimate interest in protecting.
A privacy-preserving telemetry architecture separates:
Operational telemetry (real-time correction status, fix quality, connectivity), needed by fleet managers and correction providers for service monitoring
Historical position logs, needed by site operators for production reporting, but potentially sensitive
Full raw data, needed only for post-incident debugging, should be access-controlled
Data minimization is the right default: capture what you need for operational purposes, retain what you need for diagnostic purposes, and treat high-resolution historical position data as sensitive by default. This is increasingly a regulatory requirement in some jurisdictions, not just a best practice.
Diagnostic Replayability
When a positioning failure occurs (and in long-running autonomous operations, it eventually will), the ability to reconstruct exactly what happened is critical. This requires:
Logged raw correction data (or the ability to retrieve it from your provider)
Device logs with fix type, satellite state, and correction metadata at the relevant timestamps
A clear chain from "what correction was delivered" to "what position was reported"
Some correction architectures make this easier than others. Single-baseline RTK, for example, offers a fully deterministic correction chain: you can retrieve the raw observables from the reference station, replay them against your rover's observation log, and exactly reproduce the positioning result. More complex network correction architectures may not offer the same replayability, so it’s worth asking your provider explicitly before you're debugging a field incident at 2 AM.
Telemetry Checklist
Fix quality metadata logged alongside position
Correction source (mountpoint, provider, latency) logged with position
Event-triggered high-resolution logging defined
Data retention policy documented by data type
Post-incident replay capability confirmed with correction provider
Access controls defined for historical position data
Part 5: Scaling from Pilot to Fleet: What Changes
A 5-robot pilot and a 200-robot fleet face the same technical requirements. What changes is the cost of getting them wrong.
Credential management works fine when you're manually configuring five devices but becomes a vulnerability when you're managing two hundred. Automate it early.
Datum inconsistencies that produce a 10 cm offset on one robot are still a 10 cm offset on two hundred robots, but now they're producing 200 bad data points per second instead of five. Establish your datum strategy at pilot, not at scale.
Bandwidth assumptions made during a pilot with five devices and good LTE coverage may not hold at a 50-device deployment in a building with three LTE dead zones. Model bandwidth per device, not just aggregate.
Excursion events, the tail of the accuracy distribution where positioning degrades beyond application tolerance, occur rarely per device but frequently across a large fleet. A correction architecture that produces a positioning excursion 0.5% of the time will produce roughly one excursion per device per 200 operating hours. Across a 100-device fleet running 10 hours/day, that's an excursion somewhere in the fleet roughly every 2 hours. Design your safety architecture accordingly, and evaluate correction providers on 99th-percentile tail performance, not median accuracy.
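The fleet-level arithmetic in that paragraph is worth making explicit, since it's the calculation you'd repeat for your own fleet size and excursion rate:

```python
# Back-of-envelope excursion arithmetic using the figures above.
excursions_per_device_hour = 1 / 200  # ~0.5% excursion rate per device
fleet_size = 100
operating_hours_per_day = 10

# expected excursions somewhere in the fleet, per day
expected_per_day = excursions_per_device_hour * fleet_size * operating_hours_per_day
# average operating hours between fleet-wide excursions
hours_between = operating_hours_per_day / expected_per_day
```

With these inputs the fleet sees about five excursions per day, one roughly every two operating hours: rare per device, routine per fleet, which is why the safety architecture and the provider's tail performance both matter.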
Network density beneath your correction provider becomes more visible at scale. When one device in a fleet has a bad positioning day, it might be equipment. When five devices in the same geographic area have bad days simultaneously, it's almost certainly atmospheric, and the question is whether your correction network had adequate station density to sample and correct for that atmospheric event. Ask your provider for inter-station spacing in your deployment region, not just coverage area.
A Framework for Mixed-Fleet Positioning Architecture
Putting it together, a positioning architecture that scales across open, mixed-fleet environments rests on five decisions made explicitly rather than by default:
Corrections delivery protocol and quality: RTCM/NTRIP with MSM7, monitored latency, resilient reconnection logic, and bandwidth budgeted per device.
Provisioning model: Per-device or per-group credentials managed via API, mountpoint assignment centralized and auditable, connection state surfaced to fleet manager.
Datum strategy: Single site datum documented before deployment, transformation applied at one layer, verified against control points at commissioning.
Telemetry architecture: Fix quality metadata logged, correction source traced, high-resolution data captured on events, historical data access-controlled.
Network density evaluation: Correction provider assessed on inter-station spacing and 99th-percentile tail performance, not just coverage area or correction type.
None of these decisions require a particular vendor, a particular receiver, or a particular correction type. They're architectural decisions that work across the open autonomy stack, and making them explicitly is what separates positioning infrastructure that scales from positioning infrastructure that surprises you.
Aaron Nathan is CEO of Point One Navigation, which builds precise positioning infrastructure for autonomous systems, including GNSS correction services used in mining, construction, and logistics.