Migration: v2.4 → v3.0

TL;DR: for most consumers, no changes are required. v3.0 is a backwards-compatible major-version bump; the major version changed only because v3.0 introduces a substantial new capability, distributed inference (device offload). The v2.x consumer-facing API surface (new DVAI(config), dvai.initialize(), dvai.baseUrl, the OpenAI HTTP wire format) is preserved unchanged.

To opt into the new behaviour, set offload: { enabled: true } and read the Distributed Inference guide. Otherwise, your code keeps working unchanged.

What changed

Added

  • Distributed inference / device offload. Opt-in via new DVAI({ offload: { enabled: true, ... } }). The library can discover peer devices on the LAN (mDNS), pair with them via QR scan + a self-hosted rendezvous server for the internet path, and route inference requests to a more-capable peer when the local device's tok/s falls below minLocalCapability (default 10).
  • Capability assessment. await dvai.probeCapability() performs a cold-run probe against the active backend to measure decode tok/s for the current model. await dvai.getCapability(modelId?) returns the cached score, or a heuristic estimate if no probe has run. The cache is stored per-runtime (IndexedDB / FS / native equivalents).
  • Pairing handshake. The first peer-to-peer offload between two devices triggers a user-approval prompt via the host-app-supplied onPairingRequest callback. Approved pairings are persisted, later requests between paired devices are authenticated with HMAC signatures, and a pairing expires after 30 days of inactivity.
  • /v1/dvai/* HTTP endpoints. health, capability, peers, probe, handshake, pair-qr, pair-scan. All namespaced — the OpenAI surface (/v1/chat/completions etc.) is unchanged.
  • X-DVAI-Offload per-request header. never | prefer (default) | require. Lets consumers override the offload policy per-request — useful for privacy-sensitive prompts (never) or strict-quality requirements (require).
  • @dvai-bridge/core new dep: @noble/curves ^1.6.0. Small, audited, no native deps — powers the X25519 key exchange in the rendezvous-pairing flow.
  • rendezvous/ at monorepo root — self-hostable WebSocket relay server. Deployable independently. See self-hosting rendezvous.
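
The interaction between minLocalCapability and the X-DVAI-Offload policy can be sketched as a small decision function. This is an illustration only, not the library's routing code: chooseRoute, PeerInfo and the Route values are hypothetical names, and the treatment of require (fail when no faster peer is available rather than fall back to local) is an assumption.

```typescript
// Hypothetical types and names; the library's internal routing is not documented here.
type OffloadPolicy = "never" | "prefer" | "require";
type Route = "local" | "peer" | "error";

interface PeerInfo {
  deviceName: string;
  tokPerSec: number; // measured or estimated decode speed for the current model
}

function chooseRoute(
  policy: OffloadPolicy,
  localTokPerSec: number,
  peers: PeerInfo[],
  minLocalCapability = 10, // default threshold stated in the release notes
): { route: Route; peer?: PeerInfo } {
  // "never": keep the prompt on this device no matter how slow it is.
  if (policy === "never") return { route: "local" };

  // Pick the fastest peer that actually beats the local device.
  const best = peers
    .filter((p) => p.tokPerSec > localTokPerSec)
    .sort((a, b) => b.tokPerSec - a.tokPerSec)[0];

  if (policy === "require") {
    // Assumption: "require" fails rather than silently running locally.
    return best ? { route: "peer", peer: best } : { route: "error" };
  }

  // "prefer": offload only when local speed is below the threshold
  // and a more capable peer exists.
  if (localTokPerSec < minLocalCapability && best) {
    return { route: "peer", peer: best };
  }
  return { route: "local" };
}
```

Under these assumptions, a device decoding at 4 tok/s with a 30 tok/s peer available offloads under prefer, stays local under never, and a device at 12 tok/s stays local under prefer because it is above the default threshold.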

Changed

  • Major version bump: 2.x → 3.0. No behavioural change for code that doesn't set offload. Same DVAIConfig API; same OpenAI HTTP contract; same backends.
  • DVAIConfig gains an optional offload?: OffloadConfig. Unset = v2.x behaviour exactly.
  • DVAI instance gains probeCapability(), getCapability(modelId?), getPeers() methods. Existing methods are unchanged.
  • The embedded HTTP server now also serves the /v1/dvai/* namespace on top of the existing OpenAI endpoints. No collision risk — the namespace is new.
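
The "cached score, or a heuristic estimate if no probe has run" behaviour of getCapability(modelId?) can be pictured with the following sketch. It is not the library's implementation: CapabilityCache, CapabilityScore and the estimate callback are hypothetical names, and an in-memory Map stands in for the per-runtime persistent cache.

```typescript
// Hypothetical sketch of cache-or-estimate lookup; not the library's code.
interface CapabilityScore {
  tokPerSec: number;
  source: "probe" | "heuristic";
}

// Decode speed is tokens generated divided by elapsed seconds.
function toTokPerSec(tokenCount: number, elapsedMs: number): number {
  return tokenCount / (elapsedMs / 1000);
}

class CapabilityCache {
  private scores = new Map<string, CapabilityScore>();

  // Called after a probe finishes; stores the measured decode speed.
  recordProbe(modelId: string, tokenCount: number, elapsedMs: number): void {
    this.scores.set(modelId, {
      tokPerSec: toTokPerSec(tokenCount, elapsedMs),
      source: "probe",
    });
  }

  // Returns the cached probe result if present, otherwise a rough estimate.
  get(modelId: string, estimate: (modelId: string) => number): CapabilityScore {
    return (
      this.scores.get(modelId) ?? {
        tokPerSec: estimate(modelId),
        source: "heuristic",
      }
    );
  }
}
```

The source field is an illustrative addition: it lets a caller tell a measured score from an estimate before deciding whether running a probe is worth the cold-run cost.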

Removed

Nothing. v3.0 is purely additive.

Deprecated

Nothing. Every v2.x API surface remains supported.

Migrating per stack

JavaScript / TypeScript (@dvai-bridge/core + @dvai-bridge/react + @dvai-bridge/vanilla)

ts
// v2.x — unchanged
const dvai = new DVAI({ backend: "auto" });
await dvai.initialize();

// v3.0: opt into offload (use this instead of the v2.x construction above; don't declare both in one scope)
const dvai = new DVAI({
  backend: "auto",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com",  // optional
    onPairingRequest: async (peer) => myAppUiConfirm(peer.deviceName),
  },
});
await dvai.initialize();

Don't set offload and your v2.x code runs unchanged.
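
Per-request overrides go through the X-DVAI-Offload header on the ordinary OpenAI-style endpoint. The helper below is a minimal sketch of building such a request: buildChatRequest is a hypothetical name, "local-model" is a placeholder model id, and it assumes dvai.baseUrl does not already end in /v1.

```typescript
type OffloadPolicy = "never" | "prefer" | "require";

// Builds an OpenAI-style chat completion request carrying the X-DVAI-Offload header.
// Assumptions: baseUrl has no trailing /v1, and "local-model" is a placeholder id.
function buildChatRequest(baseUrl: string, prompt: string, policy: OffloadPolicy) {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const url = `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`;
  const init = {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-DVAI-Offload": policy,
    },
    body: JSON.stringify({
      model: "local-model",
      messages: [{ role: "user", content: prompt }],
    }),
  };
  return { url, init };
}

// Usage (requires an initialized DVAI instance):
// const { url, init } = buildChatRequest(dvai.baseUrl, "Summarise my notes", "never");
// const res = await fetch(url, init);
```

Sending never keeps a privacy-sensitive prompt on the local device even when offload is enabled globally; omitting the header leaves the default, prefer, in effect.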

iOS native (@dvai-bridge/ios)

The Swift SDK gains OffloadConfig on StartOptions:

swift
// v3.0
let server = try await DVAIBridge.shared.start(
  StartOptions(
    backend: .auto,
    modelPath: "/path/to/model.gguf",
    offload: OffloadConfig(
      enabled: true,
      discoverLAN: true,
      minLocalCapability: 10,
      rendezvousUrl: URL(string: "wss://rendezvous.myapp.com")
    )
  )
)

// Pairing requests are exposed as an AsyncSequence. Run this loop in its own Task,
// since it suspends until the sequence ends:
for await request in DVAIBridge.shared.pairingRequests {
  let approved = await myUiConfirm(request.peerDeviceName)
  request.respond(approved: approved)
}

Android native (co.deepvoiceai:dvai-bridge)

Kotlin gains OffloadConfig in StartOptions:

kotlin
// v3.0 — Application.onCreate(): one-time bootstrap
DVAIBridge.init(applicationContext)

// then anywhere from a coroutine:
val server = DVAIBridge.start(
  StartOptions(
    backend = BackendKind.Auto,
    modelPath = "/path/to/model.gguf",
    offload = OffloadConfig(
      enabled = true,
      discoverLAN = true,
      minLocalCapability = 10.0,
      rendezvousUrl = "wss://rendezvous.myapp.com",
    ),
  ),
)

// pairingRequests is a SharedFlow<PairingRequest>:
lifecycleScope.launch {
  DVAIBridge.pairingRequests.collect { req ->
    val approved = myUiConfirm(req.peerDeviceName)
    req.respond(approved)
  }
}

React Native (@dvai-bridge/react-native)

ts
// v3.0
const state = await DVAIBridge.start({
  backend: BackendKind.Auto,
  modelPath: "/path/to/model.gguf",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com",
  },
});

DVAIBridge.addListener("pairingRequest", async (req) => {
  const approved = await myUiConfirm(req.peerDeviceName);
  await DVAIBridge.respondToPairing(req.id, approved);
});

Flutter (dvai_bridge)

dart
// v3.0
final state = await DVAIBridge.instance.start(
  backend: BackendKind.auto,
  modelPath: '/path/to/model.gguf',
  offload: OffloadConfig(
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: 'wss://rendezvous.myapp.com',
  ),
);

DVAIBridge.instance.pairingRequests.listen((req) async {
  final approved = await myUiConfirm(req.peerDeviceName);
  await req.respond(approved: approved);
});

Capacitor (@dvai-bridge/capacitor + @dvai-bridge/capacitor-{llama,foundation,mediapipe,mlx})

ts
// v3.0
import { DVAIBridge } from "@dvai-bridge/capacitor";

const server = await DVAIBridge.start({
  backend: "llama",
  modelPath: "/path/to/model.gguf",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com",
  },
});

await DVAIBridge.addListener("pairingRequest", async (req) => {
  const approved = await myUiConfirm(req.peerDeviceName);
  await DVAIBridge.respondToPairing(req.id, approved);
});

addListener("pairingRequest") needs a successful start() first — the listener is dispatched on the active backend plugin.

.NET (DVAIBridge + DVAIBridge.Desktop / .iOS / .Android)

csharp
// v3.0
var server = await DVAIBridge.Shared.StartAsync(new StartOptions
{
    Backend = BackendKind.Auto,
    ModelPath = "/path/to/model.gguf",
    Offload = new OffloadConfig
    {
        Enabled = true,
        DiscoverLAN = true,
        MinLocalCapability = 10,
        RendezvousUrl = new Uri("wss://rendezvous.myapp.com"),
    },
});

await foreach (var req in DVAIBridge.Shared.PairingRequests)
{
    var approved = await MyUiConfirm(req.PeerDeviceName);
    await req.RespondAsync(approved);
}

Self-hosting the rendezvous server (optional)

Want the internet path for offload, where devices on different networks pair via QR scan? You need to self-host a rendezvous server. The server lives in rendezvous/ at the monorepo root and can be deployed in about 5 minutes via the one-click buttons (Railway / DigitalOcean) or on any Node 22+ host.

Skip the deploy — or skip rendezvousUrl in your config — and the internet path is disabled. Your apps fall back to LAN discovery only. That's the right choice for many use-cases.
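
For readers unfamiliar with the X25519 key exchange used in the rendezvous-pairing flow, the sketch below shows the core idea: two devices end up with the same secret while exchanging only public keys. It uses Node's built-in crypto module rather than @noble/curves so that it runs without extra dependencies. The pairing wire format and how the library uses the resulting secret are not covered here.

```typescript
import { generateKeyPairSync, diffieHellman } from "node:crypto";

// Each device generates its own X25519 key pair.
const deviceA = generateKeyPairSync("x25519");
const deviceB = generateKeyPairSync("x25519");

// Each side combines its own private key with the other side's public key.
const secretA = diffieHellman({
  privateKey: deviceA.privateKey,
  publicKey: deviceB.publicKey,
});
const secretB = diffieHellman({
  privateKey: deviceB.privateKey,
  publicKey: deviceA.publicKey,
});

// Both devices now hold the same 32-byte secret without ever transmitting it.
const sharedSecretsMatch = secretA.equals(secretB);
```

Because only public keys need to travel between the devices, a relay that merely forwards them does not learn the shared secret.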

Operational notes

  • Cache locations. First-run probe + pairing data persist per-runtime:
    • Browser: IndexedDB under dvai-bridge:capability:v1 and pairings-v1.
    • Node / Electron: ~/.cache/dvai-bridge/ (or %LOCALAPPDATA%\dvai-bridge\ on Windows).
    • iOS / Mac Catalyst: Application Support/dvai-bridge/.
    • Android: applicationContext.cacheDir/dvai-bridge/.
    • .NET Desktop: Environment.SpecialFolder.LocalApplicationData/dvai-bridge/.
  • @dvai-bridge/core adds one new dependency: @noble/curves. Tiny (~5 KB), audited, no native deps.
  • On Node, LAN discovery uses the multicast-dns package, which is declared as an optional dependency; install it to enable LAN discovery on the JS side. Native SDKs use platform-native mDNS (NWBrowser / NsdManager / Makaretu).
  • Browser apps can be offload sources — request inference from a peer — but NOT targets. Browsers can't accept inbound HTTP reliably. LAN discovery is a no-op in browsers; they only see rendezvous-paired peers.
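
The pairing data persisted in the locations above is what HMAC request signing and the 30-day inactivity TTL operate on. The following is a minimal sketch of how such a check could look, not the library's implementation: the Pairing record shape, the function names and the choice of SHA-256 are all assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical record shape; the library's stored format is not documented here.
interface Pairing {
  peerId: string;
  sharedKey: Buffer; // secret established during pairing
  lastUsedAt: number; // epoch milliseconds of the last authenticated request
}

// Signs a request body with the pairing's shared key (SHA-256 is an assumption).
function signRequest(pairing: Pairing, body: string): string {
  return createHmac("sha256", pairing.sharedKey).update(body).digest("hex");
}

// Accepts a request only if the pairing is still fresh and the signature matches.
function verifyRequest(
  pairing: Pairing,
  body: string,
  signatureHex: string,
  now: number = Date.now(),
): boolean {
  // Expired through inactivity: the peer would have to pair again.
  if (now - pairing.lastUsedAt > THIRTY_DAYS_MS) return false;

  const expected = Buffer.from(signRequest(pairing, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== received.length) return false;

  const ok = timingSafeEqual(expected, received);
  if (ok) pairing.lastUsedAt = now; // activity resets the inactivity clock
  return ok;
}
```

Comparing signatures with timingSafeEqual rather than === avoids leaking how many leading bytes matched, which is the usual reason to prefer it for MAC checks.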

When NOT to upgrade

If your app:

  • Runs on a single device class with capability above the model's needs;
  • Has no need for cross-device inference;
  • Doesn't want to manage even an optional rendezvous server;

then v2.4.x continues to work fine and you can hold off. Because v3.0 is backwards compatible, upgrading later is also painless.

Reporting issues

The v3.0 line is new. Hit an edge case — mDNS-blocked enterprise networks, captive portals, NAT traversal failures via rendezvous, multi-NIC hosts, IPv6 mDNS? File an issue at https://github.com/dvai-global/dvai-bridge/issues with [v3.0] in the title. The 2-device LAN/internet test rig in docs/development/distributed-inference-testing.md is the canonical repro template.

See also