Migration: v2.4 → v3.0
TL;DR — for most consumers, no changes are required. v3.0 is a backwards-compatible major-version bump. The major changed because v3.0 introduces a substantial new capability — distributed inference / device offload. The v2.x consumer-facing API surface — new DVAI(config), dvai.initialize(), dvai.baseUrl, the OpenAI HTTP wire — is preserved unchanged.
Set offload: { enabled: true } to opt into the new behaviour, and you'll want to read the Distributed Inference guide. Otherwise, your code keeps working.
What changed
Added
- Distributed inference / device offload. Opt-in via new DVAI({ offload: { enabled: true, ... } }). The library can discover peer devices on the LAN (mDNS), pair with them via QR scan plus a self-hosted rendezvous server for the internet path, and route inference requests to a more-capable peer when the local device's tok/s falls below minLocalCapability (default 10).
- Capability assessment. await dvai.probeCapability() runs a cold-run probe against the active backend to measure decode tok/s for the current model. await dvai.getCapability(modelId?) returns the cached score, or a heuristic estimate if no probe has run. Cache lives per-runtime (IndexedDB / FS / native equivalents).
- Pairing handshake. First-time peer-to-peer offload triggers a user-approval prompt via the host-app-supplied onPairingRequest callback. Approved pairings are persisted with HMAC-signed request authentication and a 30-day inactivity TTL.
- /v1/dvai/* HTTP endpoints. health, capability, peers, probe, handshake, pair-qr, pair-scan. All namespaced; the OpenAI surface (/v1/chat/completions etc.) is unchanged.
- X-DVAI-Offload per-request header. never | prefer (default) | require. Lets consumers override the offload policy per-request, which is useful for privacy-sensitive prompts (never) or strict-quality requirements (require).
- @dvai-bridge/core new dep: @noble/curves ^1.6.0. Small, audited, no native deps. It powers the X25519 key exchange in the rendezvous-pairing flow.
- rendezvous/ at monorepo root: a self-hostable WebSocket relay server. Deployable independently. See self-hosting rendezvous.
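The routing rule described above (offload when local tok/s falls below minLocalCapability, subject to the X-DVAI-Offload policy) can be pictured with a small decision function. This is only a sketch under stated assumptions: chooseRoute, PeerInfo, Route, and the exact semantics given to prefer and require are hypothetical names and guesses for illustration, not the library's internals.

```typescript
// Hypothetical types for illustration; none of these names come from the library.
type OffloadPolicy = "never" | "prefer" | "require";

interface PeerInfo {
  deviceName: string;
  tokPerSec: number; // assumed: the peer's measured decode tok/s for the same model
}

type Route =
  | { target: "local" }
  | { target: "peer"; peer: PeerInfo }
  | { target: "error"; reason: string };

function chooseRoute(
  localTokPerSec: number,
  peers: PeerInfo[],
  policy: OffloadPolicy = "prefer", // the documented default policy
  minLocalCapability: number = 10,  // the documented default threshold
): Route {
  // "never": keep the prompt on this device regardless of speed.
  if (policy === "never") return { target: "local" };

  // Find the fastest known peer, if there is one.
  let best: PeerInfo | undefined;
  for (const p of peers) {
    if (best === undefined || p.tokPerSec > best.tokPerSec) best = p;
  }

  // "require" (assumed semantics): fail rather than silently run locally.
  if (policy === "require") {
    return best
      ? { target: "peer", peer: best }
      : { target: "error", reason: "no paired peer available" };
  }

  // "prefer": offload only when local is below the threshold and a faster peer exists.
  if (localTokPerSec < minLocalCapability && best && best.tokPerSec > localTokPerSec) {
    return { target: "peer", peer: best };
  }
  return { target: "local" };
}
```

With the defaults, a device measuring 4 tok/s and a paired peer at 40 tok/s would be routed to the peer, while the same device with policy never stays local.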
Changed
- Version bump major: 2.x → 3.0. No behavioural change for code that doesn't set offload. Same DVAIConfig API; same OpenAI HTTP contract; same backends.
- DVAIConfig gains an optional offload?: OffloadConfig. Unset = v2.x behaviour exactly.
- DVAI instance gains probeCapability(), getCapability(modelId?), getPeers() methods. Existing methods are unchanged.
- The embedded HTTP server now also serves the /v1/dvai/* namespace on top of the existing OpenAI endpoints. No collision risk, because the namespace is new.
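To make the two HTTP surfaces concrete, here is a minimal sketch of helpers that build a /v1/dvai/* URL and an OpenAI-style chat request carrying the X-DVAI-Offload header. The helper names (dvaiUrl, chatRequest) are hypothetical. Whether dvai.baseUrl already ends in /v1 is not stated here, so the sketch normalises both forms. The HTTP method each /v1/dvai/* endpoint expects is also not specified, so only URLs are built for those.

```typescript
// Endpoint names as listed in the changelog; the helper names are hypothetical.
const DVAI_ENDPOINTS = [
  "health", "capability", "peers", "probe", "handshake", "pair-qr", "pair-scan",
] as const;
type DvaiEndpoint = (typeof DVAI_ENDPOINTS)[number];
type OffloadPolicy = "never" | "prefer" | "require";

// Strip trailing slashes and an optional trailing "/v1" so both base URL styles work.
function normaliseOrigin(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "").replace(/\/v1$/, "");
}

// Build the URL for one of the namespaced DVAI endpoints.
function dvaiUrl(baseUrl: string, endpoint: DvaiEndpoint): string {
  return `${normaliseOrigin(baseUrl)}/v1/dvai/${endpoint}`;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a chat-completions request that overrides the offload policy for this call only.
function chatRequest(baseUrl: string, policy: OffloadPolicy, body: unknown): BuiltRequest {
  return {
    url: `${normaliseOrigin(baseUrl)}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-DVAI-Offload": policy,
      },
      body: JSON.stringify(body),
    },
  };
}
```

A privacy-sensitive prompt could then be sent with `const req = chatRequest(dvai.baseUrl, "never", payload); await fetch(req.url, req.init);`, where payload is an ordinary OpenAI chat-completions body.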
Removed
Nothing. v3.0 is purely additive.
Deprecated
Nothing. Every v2.x API surface remains supported.
Migrating per stack
JavaScript / TypeScript (@dvai-bridge/core + @dvai-bridge/react + @dvai-bridge/vanilla)
// v2.x: unchanged
const dvai = new DVAI({ backend: "auto" });
await dvai.initialize();

// v3.0: opt into offload
const dvai = new DVAI({
  backend: "auto",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com", // optional
    onPairingRequest: async (peer) => myAppUiConfirm(peer.deviceName),
  },
});
await dvai.initialize();

Don't set offload and your v2.x code runs unchanged.
iOS native (@dvai-bridge/ios)
The Swift SDK gains OffloadConfig on StartOptions:
// v3.0
let server = try await DVAIBridge.shared.start(
    StartOptions(
        backend: .auto,
        modelPath: "/path/to/model.gguf",
        offload: OffloadConfig(
            enabled: true,
            discoverLAN: true,
            minLocalCapability: 10,
            rendezvousUrl: URL(string: "wss://rendezvous.myapp.com")
        )
    )
)

// onPairingRequest is exposed as an AsyncSequence:
for await request in DVAIBridge.shared.pairingRequests {
    let approved = await myUiConfirm(request.peerDeviceName)
    request.respond(approved: approved)
}

Android native (co.deepvoiceai:dvai-bridge)
Kotlin gains OffloadConfig in StartOptions:
// v3.0: one-time bootstrap in Application.onCreate()
DVAIBridge.init(applicationContext)

// then anywhere from a coroutine:
val server = DVAIBridge.start(
    StartOptions(
        backend = BackendKind.Auto,
        modelPath = "/path/to/model.gguf",
        offload = OffloadConfig(
            enabled = true,
            discoverLAN = true,
            minLocalCapability = 10.0,
            rendezvousUrl = "wss://rendezvous.myapp.com",
        ),
    ),
)

// pairingRequests is a SharedFlow<PairingRequest>:
lifecycleScope.launch {
    DVAIBridge.pairingRequests.collect { req ->
        val approved = myUiConfirm(req.peerDeviceName)
        req.respond(approved)
    }
}

React Native (@dvai-bridge/react-native)
// v3.0
const state = await DVAIBridge.start({
  backend: BackendKind.Auto,
  modelPath: "/path/to/model.gguf",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com",
  },
});

DVAIBridge.addListener("pairingRequest", async (req) => {
  const approved = await myUiConfirm(req.peerDeviceName);
  await DVAIBridge.respondToPairing(req.id, approved);
});

Flutter (dvai_bridge)
// v3.0
final state = await DVAIBridge.instance.start(
  backend: BackendKind.auto,
  modelPath: '/path/to/model.gguf',
  offload: OffloadConfig(
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: 'wss://rendezvous.myapp.com',
  ),
);

DVAIBridge.instance.pairingRequests.listen((req) async {
  final approved = await myUiConfirm(req.peerDeviceName);
  await req.respond(approved: approved);
});

Capacitor (@dvai-bridge/capacitor + @dvai-bridge/capacitor-{llama,foundation,mediapipe,mlx})
// v3.0
import { DVAIBridge } from "@dvai-bridge/capacitor";

const server = await DVAIBridge.start({
  backend: "llama",
  modelPath: "/path/to/model.gguf",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com",
  },
});

await DVAIBridge.addListener("pairingRequest", async (req) => {
  const approved = await myUiConfirm(req.peerDeviceName);
  await DVAIBridge.respondToPairing(req.id, approved);
});

Call addListener("pairingRequest") only after start() has succeeded: the listener is dispatched on the active backend plugin, which does not exist before then.
.NET (DVAIBridge + DVAIBridge.Desktop / .iOS / .Android)
// v3.0
var server = await DVAIBridge.Shared.StartAsync(new StartOptions
{
    Backend = BackendKind.Auto,
    ModelPath = "/path/to/model.gguf",
    Offload = new OffloadConfig
    {
        Enabled = true,
        DiscoverLAN = true,
        MinLocalCapability = 10,
        RendezvousUrl = new Uri("wss://rendezvous.myapp.com"),
    },
});

await foreach (var req in DVAIBridge.Shared.PairingRequests)
{
    var approved = await MyUiConfirm(req.PeerDeviceName);
    await req.RespondAsync(approved);
}

Self-hosting the rendezvous server (optional)
Want the internet path for offload — devices on different networks pairing via QR scan? You need to self-host a rendezvous server. The server lives in rendezvous/ at the monorepo root. Deployable in 5 minutes via the one-click buttons (Railway / DigitalOcean) or any Node-22+ host.
Skip the deploy — or skip rendezvousUrl in your config — and the internet path is disabled. Your apps fall back to LAN discovery only. That's the right choice for many use-cases.
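The relationship between the config and the available discovery paths can be summarised in a few lines. This is a sketch only: OffloadConfigSketch and activePaths are hypothetical names, and treating discoverLAN as on unless explicitly set to false is a guess at the default based on the "fall back to LAN discovery" wording above, not a documented behaviour.

```typescript
// Hypothetical, trimmed-down view of the offload config for illustration.
interface OffloadConfigSketch {
  enabled: boolean;
  discoverLAN?: boolean;   // assumed default: on when omitted
  rendezvousUrl?: string;  // omit to disable the internet path
}

type DiscoveryPath = "lan" | "rendezvous";

// Report which peer-discovery paths a given config would leave active.
function activePaths(cfg?: OffloadConfigSketch): DiscoveryPath[] {
  // No offload block, or enabled: false, means v2.x behaviour: no discovery at all.
  if (!cfg || !cfg.enabled) return [];

  const paths: DiscoveryPath[] = [];
  if (cfg.discoverLAN !== false) paths.push("lan");
  if (cfg.rendezvousUrl) paths.push("rendezvous");
  return paths;
}
```

Under these assumptions, a config with only enabled: true yields the LAN path alone, and adding rendezvousUrl adds the internet path.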
Operational notes
- Cache locations. First-run probe + pairing data persist per-runtime:
  - Browser: IndexedDB under dvai-bridge:capability:v1 and pairings-v1.
  - Node / Electron: ~/.cache/dvai-bridge/ (or %LOCALAPPDATA%\dvai-bridge\).
  - iOS / Mac Catalyst: Application Support/dvai-bridge/.
  - Android: applicationContext.cacheDir/dvai-bridge/.
  - .NET Desktop: Environment.SpecialFolder.LocalApplicationData/dvai-bridge/.
- @dvai-bridge/core adds one new dependency: @noble/curves. Tiny (~5 KB), audited, no native deps.
- mDNS for LAN discovery uses multicast-dns (Node), declared as an optional dependency. Install it for LAN discovery on the JS side; native SDKs use platform-native mDNS (NWBrowser / NsdManager / Makaretu).
- Browser apps can be offload sources (they can request inference from a peer) but NOT targets. Browsers can't accept inbound HTTP reliably. LAN discovery is a no-op in browsers; they only see rendezvous-paired peers.
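Persisted pairings carry a 30-day inactivity TTL, so anything reading the pairing store needs some notion of expiry. The following is a minimal sketch of such a check. The StoredPairing shape, the field names, and the helper names are hypothetical, and the library's actual on-disk format may differ.

```typescript
// 30 days in milliseconds, matching the documented inactivity TTL.
const PAIRING_TTL_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical record shape; the real persisted format is not documented here.
interface StoredPairing {
  peerId: string;
  lastUsedAt: number; // epoch milliseconds of the last successful offload
}

// A pairing is expired once it has been idle for longer than the TTL.
function isPairingExpired(p: StoredPairing, now: number = Date.now()): boolean {
  return now - p.lastUsedAt > PAIRING_TTL_MS;
}

// Drop expired pairings, e.g. when loading the store at startup.
function pruneExpired(pairings: StoredPairing[], now: number = Date.now()): StoredPairing[] {
  return pairings.filter((p) => !isPairingExpired(p, now));
}
```

Passing now explicitly keeps the check deterministic in tests. A peer whose pairing has been pruned would presumably go through the approval prompt again.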
When NOT to upgrade
If your app:
- Runs on a single device class with capability above the model's needs;
- Has no need for cross-device inference;
- Doesn't want to manage even an optional rendezvous server;
then v2.4.x continues to work fine. The v3.0 backwards compatibility guarantee means upgrading later is also painless — you can hold off.
Reporting issues
The v3.0 line is new. Hit an edge case — mDNS-blocked enterprise networks, captive portals, NAT traversal failures via rendezvous, multi-NIC hosts, IPv6 mDNS? File an issue at https://github.com/dvai-global/dvai-bridge/issues with [v3.0] in the title. The 2-device LAN/internet test rig in docs/development/distributed-inference-testing.md is the canonical repro template.
