Live demo
A real chain, three peer daemons, one prompt.
The interesting bit of IntelNav is the chain: a transformer split across multiple machines, each holding a contiguous range of layers. Below, three intelnav-node daemons cover layers 6..24 of Qwen 2.5 · 0.5B. The chat client runs layers 0..6 locally and forwards activations through them.
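To make the partitioning concrete, here is a rough sketch (not IntelNav's actual code) of how split points like the ones in the banner map to contiguous layer slices, assuming 24 transformer layers:

```python
def layer_ranges(splits, n_layers):
    """Turn split points into contiguous [lo, hi) layer slices.

    The chat client owns the first slice; each peer daemon owns
    one of the rest. Illustrative only -- not IntelNav's code.
    """
    bounds = [0, *splits, n_layers]
    return list(zip(bounds, bounds[1:]))

# splits=[6, 12, 18] over 24 layers, as in the demo banner:
print(layer_ranges([6, 12, 18], 24))
# -> [(0, 6), (6, 12), (12, 18), (18, 24)]
```

The first range is the client's local front slice; the remaining three are the slices held by the peers on ports 17717, 17718, and 17719.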
Chat, over the chain
asciicast · ~65s · the TUI in --mode network: type a prompt, hidden states stream through three peers, tokens come back.
Note the banner: peer chain 127.0.0.1:17717,17718,17719 · splits=[6, 12, 18]. The chat client owns layers 0..6 plus the head; each peer owns one slice of the middle. Qwen 0.5B is the smallest model we ship, and its arithmetic is bad (the answer in the cast is 349, which is wrong); the point of the cast is the infrastructure, not the model.
The orchestration, behind the scenes
asciicast · ~80s · setup → start three daemons → status → drive a prompt through the chain → stop.
Run the same demo on your machine
The recording above is one-line reproducible. Build the binaries, run local-swarm.sh setup to prepare the sandbox, start to spawn the three daemons, and ask to drive a prompt through the chain.
```
bash scripts/local-swarm.sh setup \
  && bash scripts/local-swarm.sh start \
  && bash scripts/local-swarm.sh ask 'what is 17 squared?'
```

The script lives at scripts/local-swarm.sh. The protocol on the wire is the production one; the "sandbox" is just that all four parties live on the same machine. Swap the loopback addresses for real hosts and you have a multi-machine swarm.
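Conceptually, driving a prompt through the chain is a loop over the peers. The sketch below is illustrative only; the names (PeerSession, front, head) are invented and do not reflect IntelNav's actual API:

```python
class PeerSession:
    """Stands in for a TCP session to one intelnav-node peer."""

    def __init__(self, layers):
        self.layers = layers              # the peer's layer slice

    def forward(self, hidden):
        for layer in self.layers:         # apply the slice in order
            hidden = layer(hidden)
        return hidden


def ask(prompt, front, peers, head, tokenizer):
    hidden = front(tokenizer(prompt))     # front slice runs locally
    for peer in peers:                    # one hop per peer slice
        hidden = peer.forward(hidden)
    return head(hidden)                   # head samples the token
```

The real client streams activations over the wire per hop and samples token by token; this sketch only shows the shape of the chain walk.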
What you're looking at
- local-swarm.sh setup: writes three peer directories under /tmp/intelnav-swarm/peer-{a,b,c}/, each with its own config and a kept_ranges.json sidecar that declares the layer range it owns.
- start: spawns three intelnav-node processes on ports 17717, 17718, and 17719. Each binds libp2p, registers the control RPC, and brings up a forward listener for chains that hit its slice.
- ask: runs intelnav --mode network ask with a hard-coded peers list (the sandbox skips the DHT lookup). The chat client loads its front slice (layers 0..6), opens TCP sessions to each peer, and runs the chain. Activations flow through every hop, the head samples a token, and the response comes back.
- Tokens stream upstream. The demo uses Qwen 2.5 · 0.5B because that's the smallest model we ship; its arithmetic isn't reliable, and that isn't what the cast is for.
- stop: sends SIGTERM. The daemons drain any in-flight chains (there are none here) and exit cleanly.
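The casts don't show the sidecar's contents. Purely as a sketch of the idea, a kept_ranges.json for peer-a, which owns layers 6..12, might look something like this; the field names here are guesses, not IntelNav's actual schema:

```json
{
  "model": "qwen2.5-0.5b",
  "kept_ranges": [[6, 12]]
}
```

Whatever the exact schema, the point is that each daemon's directory declares up front which contiguous slice of layers it serves, so the chat client can plan the hop order.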
The first-run UX
Fresh install, end to end.
For completeness, here is what installing IntelNav from scratch looks like before any of the network features come into play. Auto-config, contribution gate, /models picker, real download from HuggingFace, real prompt.
asciicast · ~140s real time · loops automatically