Blog Post / Insight

Scaling an IoT Gateway from One to Three Devices: What Breaks First

Reading time: 12 min read

A practical deep dive into what changes when a Rust-based IoT gateway and audit engine move from a single STM32 device to a 3-device setup: identity, buffering, reconnect behavior, battery constraints, and evidence quality.

Most IoT demos start with one device and one happy-path flow. That is useful for proving a concept, but it hides the operational complexity that appears the moment a gateway has to handle multiple field devices with unstable connectivity and uneven power conditions.

This post is my working blueprint for a 3-device demo using a Rust IoT gateway and the Combotto audit engine. The current baseline is one battery-powered STM32 device (no enclosure) connected over Wi-Fi and powered from a power bank.

The goal is not a polished product demo. The goal is to validate how quickly reliability and auditability can degrade when moving from one edge node to several.

Why 1 to 3 devices matters

A single device mostly tests correctness. Three devices start testing operations:

  • Are identity and topic design clean enough to avoid cross-device confusion?
  • Does reconnect logic create duplicates, data gaps, or backpressure spikes?
  • Do battery and network differences change message quality over time?
  • Can the audit engine still produce deterministic evidence when behavior diverges per device?

The first production risks often show up here, not at 100 devices.

Demo setup (target architecture)

Initial setup for the demo slice:

  • 1 gateway service (Rust) for secure ingestion, buffering, replay, and forwarding.
  • 3 field devices publishing telemetry over MQTT with per-device identity.
  • 1 MQTT broker as edge boundary.
  • 1 audit engine run path to turn observed system posture into findings + evidence.
  • Battery-powered operating constraints to surface reconnect and runtime drift behavior.

I currently have one STM32 in this setup and will incrementally add two more device identities to make failure modes visible.

Rust gateway deep dive: what has to change

Moving to multiple devices changes the gateway responsibilities in concrete ways:

1) Device identity and topic namespace

The gateway needs strict topic conventions that encode tenant/site/device identity and message type. If topic design is loose, data from multiple devices becomes hard to trust during incident analysis.

What I will enforce in the demo:

  • Stable per-device IDs in topic path and payload envelope.
  • Explicit message classing (telemetry, health, control-ack).
  • Validation on ingest to reject malformed or ambiguous routes.
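To make the ingest validation concrete, here is a minimal sketch of what strict route parsing could look like. The topic scheme (`tenant/site/device-id/msg-class`), the class names, and all identifiers are illustrative assumptions, not the actual Combotto conventions:

```rust
// Hypothetical topic scheme: <tenant>/<site>/<device-id>/<msg-class>.
// Anything that does not parse cleanly is rejected on ingest.

#[derive(Debug, PartialEq)]
enum MsgClass {
    Telemetry,
    Health,
    ControlAck,
}

#[derive(Debug, PartialEq)]
struct Route {
    tenant: String,
    site: String,
    device_id: String,
    class: MsgClass,
}

fn parse_route(topic: &str) -> Option<Route> {
    let parts: Vec<&str> = topic.split('/').collect();
    // Exactly four non-empty segments; no wildcards, no extras.
    if parts.len() != 4 || parts.iter().any(|p| p.is_empty()) {
        return None;
    }
    let class = match parts[3] {
        "telemetry" => MsgClass::Telemetry,
        "health" => MsgClass::Health,
        "control-ack" => MsgClass::ControlAck,
        _ => return None, // ambiguous message class: reject
    };
    Some(Route {
        tenant: parts[0].to_string(),
        site: parts[1].to_string(),
        device_id: parts[2].to_string(),
        class,
    })
}

fn main() {
    assert!(parse_route("acme/lab-a/stm32-01/telemetry").is_some());
    assert!(parse_route("acme/lab-a/telemetry").is_none()); // missing device id
    assert!(parse_route("acme//stm32-01/telemetry").is_none()); // empty segment
}
```

The point of the `Option` return is that a malformed route is dropped before it can pollute per-device analysis, rather than being forwarded with a best-guess identity.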

2) Buffering and replay correctness

A write-ahead durability path is useful only if replay is deterministic.

At 3 devices, I expect to stress:

  • Ordering guarantees across reconnect windows.
  • Deduplication rules when the same payload is retried.
  • Backpressure behavior when one device reconnects aggressively.

The key output is not raw throughput. It is whether replayed events can be explained and traced under failure.
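One way to keep retried payloads explainable is a dedup key per device. This sketch assumes each device stamps a monotonically increasing sequence number into its payload envelope, which is an assumption about the envelope design, not a description of the current gateway:

```rust
use std::collections::HashSet;

// Dedup sketch: an event is identified by (device_id, seq).
// A retried payload hits the same key and is dropped.
struct Deduper {
    seen: HashSet<(String, u64)>,
}

impl Deduper {
    fn new() -> Self {
        Deduper { seen: HashSet::new() }
    }

    // Returns true if the event is new and should be forwarded,
    // false if it is a retry of an already-accepted payload.
    fn accept(&mut self, device_id: &str, seq: u64) -> bool {
        self.seen.insert((device_id.to_string(), seq))
    }
}

fn main() {
    let mut d = Deduper::new();
    assert!(d.accept("stm32-01", 1));
    assert!(!d.accept("stm32-01", 1)); // retry: dropped
    assert!(d.accept("stm32-02", 1)); // same seq, different device: kept
}
```

A production version would bound the set (e.g. a sliding window per device) and persist it across gateway restarts, but the invariant is the same: replay must be idempotent per device identity.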

3) Reconnect behavior under uneven power/network

Battery-powered edge devices will reconnect with different timing patterns. The gateway needs to absorb this without turning transient instability into persistent noise.

For this demo, I will track:

  • Retry cadence per device.
  • Time-to-recovery after broker interruption.
  • Duplicate/replay ratio before and after reconnect events.
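The retry cadence the gateway tolerates can be bounded with capped exponential backoff, so one aggressively reconnecting device cannot flood the broker. The base and cap values below are placeholders for the demo, not measured numbers:

```rust
// Per-device retry cadence sketch: exponential backoff with a cap.
fn backoff_ms(attempt: u32, base_ms: u64, cap_ms: u64) -> u64 {
    // Clamp the shift so the multiplier cannot overflow.
    let delay = base_ms.saturating_mul(1u64 << attempt.min(16));
    delay.min(cap_ms)
}

fn main() {
    assert_eq!(backoff_ms(0, 500, 30_000), 500);
    assert_eq!(backoff_ms(3, 500, 30_000), 4_000);
    assert_eq!(backoff_ms(10, 500, 30_000), 30_000); // capped
}
```

Adding randomized jitter on top of this is usually worth it once several devices lose the broker at the same moment, otherwise they all retry in lockstep.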

Audit engine deep dive: evidence quality at multi-device boundary

The audit engine has to do more than report a binary pass/fail. It needs to preserve evidence quality as topology complexity rises.

1) Per-device evidence traceability

Each finding should map to a specific device identity, timestamp window, and transport path so remediation work can be assigned quickly.
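As a sketch of what "evidence that can be assigned" might look like as a data shape, here is an illustrative record. The field names and the rendered format are assumptions for the demo, not the audit engine's actual schema:

```rust
// Evidence record sketch: every finding carries enough context to be
// routed to an owner without re-running the audit.
#[derive(Debug)]
struct Evidence {
    device_id: String,
    window_start_ms: u64,
    window_end_ms: u64,
    transport_path: String, // e.g. "mqtt->gateway" (illustrative)
    finding: String,
}

fn format_evidence(e: &Evidence) -> String {
    format!(
        "[{}] {}..{} via {}: {}",
        e.device_id, e.window_start_ms, e.window_end_ms, e.transport_path, e.finding
    )
}

fn main() {
    let e = Evidence {
        device_id: "stm32-01".into(),
        window_start_ms: 0,
        window_end_ms: 1_000,
        transport_path: "mqtt->gateway".into(),
        finding: "telemetry gap".into(),
    };
    println!("{}", format_evidence(&e));
}
```

The discipline this enforces is that a finding with no device, window, or path simply cannot be constructed, which keeps evidence quality from silently degrading as devices are added.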

2) Run-to-run delta clarity

As the demo evolves, the most valuable signal is change over time:

  • What regressed from the baseline run?
  • What improved after gateway adjustments?
  • Which risk remains unstable because of power/network constraints?
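If findings carry stable IDs, the regressed/improved split above reduces to set differences between two runs. This is a minimal sketch under that assumption; how findings actually get stable IDs is the harder, unshown part:

```rust
use std::collections::HashSet;

// Run-to-run delta sketch: findings are compared as sets of stable IDs.
// Present now but not in baseline = regressed; the reverse = fixed.
fn delta<'a>(
    baseline: &'a HashSet<String>,
    current: &'a HashSet<String>,
) -> (Vec<&'a String>, Vec<&'a String>) {
    let regressed: Vec<_> = current.difference(baseline).collect();
    let fixed: Vec<_> = baseline.difference(current).collect();
    (regressed, fixed)
}

fn main() {
    let baseline: HashSet<String> = ["F-topic-drift".to_string()].into_iter().collect();
    let current: HashSet<String> = ["F-replay-dup".to_string()].into_iter().collect();
    let (regressed, fixed) = delta(&baseline, &current);
    assert_eq!(regressed, vec![&"F-replay-dup".to_string()]);
    assert_eq!(fixed, vec![&"F-topic-drift".to_string()]);
}
```

Findings that flap between runs (appearing in `fixed` one run and `regressed` the next) are exactly the "unstable because of power/network constraints" bucket called out above.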

3) Operator-ready reporting

The report output should stay useful for both:

  • Technical remediation (engineers)
  • Priority and risk decisions (leadership)

That split is essential if this demo is used in real customer conversations.

What became harder at 3 devices (expected)

These are the failure surfaces I expect to become visible quickly:

  • Topic hygiene and identity drift
  • Reconnect storm side effects
  • Hidden data-quality differences between devices
  • Audit evidence becoming noisy if normalization is weak
  • Debugging time increasing without strong observability signals

Architecture and flow

I will include a final architecture diagram and telemetry/evidence sequence in the published version.

Placeholder for the section assets:

  • TODO: system context diagram (device -> mqtt -> rust gateway -> storage/forwarding -> audit engine).
  • TODO: sequence flow for reconnect + replay + audit evidence capture.

Demo video walkthrough

I will embed a video walkthrough showing:

  • Live device publishing across 3 identities
  • Gateway ingest + buffering + replay behavior
  • Audit output (findings, evidence, and deltas)

Video placeholder:

  • TODO: add hosted video link/embed.

Lessons learned (living section)

As I build the 3-device slice, I will update this section with concrete outcomes:

  1. What worked immediately
  2. What broke first
  3. What required architecture changes
  4. What should be hardened before production

Why this matters for production IoT teams

The point of this demo is to make early-stage operational risks visible while the system is still cheap to change.

If your team is moving from a single-device proof-of-concept toward a real field rollout, this is exactly the point where an architecture audit and hardening sprint prevent expensive rework later.

If you want help auditing or hardening your gateway path, send your current topology (devices, broker, cloud path) and I can propose a focused scope.

Have a question about this blog post?

If you’re considering an IoT Infrastructure Audit or reliability sprint, send a short message about your devices and current setup. You’ll get a same-day reply with clear next steps.

Phone: +45 22 39 34 91 or email: tb@combotto.io.

Typical response: same business day.