Quickstart: Capacitor
Run a local LLM end to end inside a Capacitor 6, 7, or 8 app. By the end of this page, your app talks to an OpenAI-compatible HTTP endpoint served by a native HTTP server bound to 127.0.0.1.
Prerequisites
- A Capacitor 6, 7, or 8 app. npx cap doctor should report green.
- Node.js 20+ and pnpm. npm or yarn work too.
- iOS: Xcode 16+ with the iOS 17+ SDK. Apple Foundation Models needs iOS 26+ at runtime; see the backend matrix below.
- Android: Android Studio with NDK r27+ and JDK 21. compileSdk 35 (or the current Capacitor 8 default) on your app module.
1. Install the packages
@dvai-bridge/capacitor is a thin JS routing shim. It does nothing on its own. You also install one or more backend plugins — each ships native code.
# Required: the JS shim + at least one backend.
pnpm add @dvai-bridge/capacitor @dvai-bridge/capacitor-llama
# Optional: framework wrapper.
pnpm add @dvai-bridge/core @dvai-bridge/react # or @dvai-bridge/vanilla

Three backend plugins are available:
| Package | Backend | Platforms | Pick when… |
|---|---|---|---|
| @dvai-bridge/capacitor-llama | llama.cpp | iOS + Android | You want GGUF support, the broadest model selection, and optional vision via mmproj. |
| @dvai-bridge/capacitor-foundation | Apple Foundation Models | iOS 26+ | You want zero-download text inference on Apple silicon devices. |
| @dvai-bridge/capacitor-mediapipe | MediaPipe LLM Inference | Android | You want Google's .task runtime, including vision-capable Gemma variants. |
Mix freely. You only start() one backend at a time. But installing both capacitor-llama and capacitor-mediapipe lets you pick at runtime — by platform, or by user setting.
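
A minimal sketch of that runtime choice, by platform first and user setting second. The helper name pickBackend and its options are hypothetical and not part of the shim. The platform string is assumed to come from Capacitor.getPlatform() in @capacitor/core, and the backend names match the values passed to start().

```typescript
// Hypothetical helper: not part of @dvai-bridge/capacitor.
type Backend = "llama" | "foundation" | "mediapipe";
type Platform = "ios" | "android" | "web";

interface PickOptions {
  platform: Platform;    // e.g. the value returned by Capacitor.getPlatform()
  installed: Backend[];  // backend plugins your app actually installed
  preferred?: Backend;   // optional user setting
}

function pickBackend({ platform, installed, preferred }: PickOptions): Backend {
  // Platform support as listed in the table above.
  // Assumption: this quickstart lists no backend for the web platform.
  const supported: Record<Platform, Backend[]> = {
    ios: ["llama", "foundation"],
    android: ["llama", "mediapipe"],
    web: [],
  };
  const candidates = supported[platform].filter(
    (b) => installed.indexOf(b) !== -1,
  );
  if (candidates.length === 0) {
    throw new Error(`No installed backend supports platform "${platform}"`);
  }
  // Honor the user setting only when it is usable on this platform.
  if (preferred && candidates.indexOf(preferred) !== -1) return preferred;
  return candidates[0];
}
```

The sketch does not check the OS version, so an app that installs the foundation backend would still need its own iOS 26+ check before preferring it.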
2. cap sync
After installing backend plugins:
npx cap sync

This step is mandatory. It does two things:
- iOS: adds the plugin's Package.swift / podspec to your Xcode project. The first build pulls Hummingbird (HTTP server) and swift-nio transitively. Run pnpm cap open ios and let CocoaPods / SwiftPM resolve once.
- Android: registers the plugin's Gradle module, merges its AndroidManifest.xml (which declares the network_security_config.xml whitelisting cleartext to 127.0.0.1 / localhost), and links the prebuilt libllama.so / MediaPipe native libs.
You don't need to touch your app's network_security_config.xml. The plugin merges its own.
3. First-run code
Minimal example. Runs from any framework. Assumes you've used @dvai-bridge/capacitor's downloadModel() helper, or shipped a .gguf file through your own download path — see step 4 for the helper.
import { DVAIBridge } from "@dvai-bridge/capacitor";

const { baseUrl, port, modelId } = await DVAIBridge.start({
  backend: "llama",
  modelPath: "/data/user/0/com.example.app/files/dvai-models/llama-3.2-1b.gguf",
  contextSize: 2048,
  gpuLayers: 99,
  // Optional: override the default port if 38883 is taken.
  // httpBasePort: 38883,
});
console.log(`DVAI ready on ${baseUrl} (model=${modelId})`);

baseUrl is the local URL, typically http://127.0.0.1:38883/v1. Pass it to any OpenAI-compatible client:
const res = await fetch(`${baseUrl}/chat/completions`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: modelId,
    messages: [{ role: "user", content: "Why is the sky blue?" }],
    stream: false,
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);

For streaming, set stream: true and parse SSE:
const res = await fetch(`${baseUrl}/chat/completions`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: modelId,
    messages: [{ role: "user", content: "Tell me a story." }],
    stream: true,
  }),
});
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact when they span two chunks.
  const chunk = decoder.decode(value, { stream: true });
  // Each SSE event begins "data: " and is JSON-encoded.
  // The final event is "data: [DONE]".
  console.log(chunk);
}

When the user closes the screen, call DVAIBridge.stop(). It releases the model and frees memory. stop() is idempotent.
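
The streaming loop above logs raw chunks, and a network chunk can end in the middle of an SSE event. A minimal sketch of buffering chunks and extracting complete events follows. The names SseBuffer and deltaText are hypothetical, and the sketch assumes the server follows the OpenAI streaming shape (choices[0].delta.content), separates events with a blank line ("\n\n"), and puts one data: line in each event.

```typescript
// Hypothetical helper: buffers decoded text and returns complete SSE payloads.
class SseBuffer {
  private pending = "";

  // Feed one decoded chunk; returns the "data:" payloads it completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const payloads: string[] = [];
    let sep = this.pending.indexOf("\n\n");
    // Assumption: events are separated by a blank line using "\n" newlines.
    while (sep !== -1) {
      const event = this.pending.slice(0, sep);
      this.pending = this.pending.slice(sep + 2);
      for (const line of event.split("\n")) {
        if (line.slice(0, 5) === "data:") payloads.push(line.slice(5).trim());
      }
      sep = this.pending.indexOf("\n\n");
    }
    return payloads;
  }
}

// Returns the text delta of one payload, or null for the final "[DONE]" event.
function deltaText(payload: string): string | null {
  if (payload === "[DONE]") return null;
  const json = JSON.parse(payload);
  return json.choices?.[0]?.delta?.content ?? "";
}
```

Inside the read loop, pass each decoded chunk to push(), append every non-null deltaText() result to the UI, and stop reading once a payload returns null.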
4. Downloading a model with downloadModel
Most apps can't ship multi-GB GGUF files inside the bundle. The shim includes a resumable, sha256-verified downloader. It caches into the platform's app-data directory.
import { DVAIBridge } from "@dvai-bridge/capacitor";

const sub = await DVAIBridge.addProgressListener((e) => {
  if (e.phase === "loading" && e.percent != null) {
    setUiProgress(e.percent);
  }
});
const { path } = await DVAIBridge.downloadModel({
  url: "https://huggingface.co/<org>/<repo>/resolve/main/llama-3.2-1b-instruct.Q4_K_M.gguf",
  sha256: "<lowercase-hex-sha256-of-the-file>",
  destFilename: "llama-3.2-1b.gguf",
  // Optional: gated HF repos.
  // headers: { Authorization: `Bearer ${hfToken}` },
  onProgress: (e) => {
    if (e.bytesTotal) {
      console.log(`${e.bytesReceived}/${e.bytesTotal}`);
    }
  },
});
await sub.remove();
await DVAIBridge.start({ backend: "llama", modelPath: path });

Behavior:
- File already exists with a matching sha256? Returns immediately with { cached: true }.
- Otherwise streams an HTTP Range download into <destFilename>.partial, computing sha256 as bytes arrive.
- On final mismatch, deletes the partial + final paths and throws ChecksumMismatchError. Safe to retry.
- iOS: marks the file isExcludedFromBackupKey = true so it doesn't bloat iCloud backups.
For hosting, multi-file models, and disk-space pre-checks, see Model distribution.
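
The sha256 option expects the lowercase hex digest of the hosted file. A minimal sketch of computing that digest chunk by chunk, in the same spirit as the downloader's hash-as-bytes-arrive behavior, follows. It uses Node's built-in node:crypto module and is meant for a build or release script rather than the WebView. The function name sha256OfChunks is hypothetical and this is not the plugin's own implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical release-script helper: hashes data incrementally.
function sha256OfChunks(chunks: Array<Uint8Array | string>): string {
  const hash = createHash("sha256");
  for (const chunk of chunks) {
    // The final digest does not depend on where the chunk boundaries fall.
    hash.update(chunk);
  }
  // digest("hex") returns lowercase hex, the form the sha256 option expects.
  return hash.digest("hex");
}
```

For a multi-GB GGUF file, feed the chunks from a file read stream instead of holding the whole file in memory.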
5. Common errors
| Symptom | Cause | Fix |
|---|---|---|
| [DVAI] modelPath is required for backend "llama" | Caller didn't pass a path. | Provide modelPath (or use downloadModel first). The foundation backend is the exception: it manages the model itself. |
| [DVAI] Failed to bind any port in range 38883..38898 | Another DVAI instance, dev server, or unrelated process is on those ports. | Pass httpBasePort: 49000 (or any free port) and bump httpMaxPortAttempts if you need a wider scan. |
| [DVAI] Backend "foundation" selected but the corresponding plugin is not installed | You called start({ backend: "foundation" }) without installing @dvai-bridge/capacitor-foundation. | Install the matching backend package and re-run npx cap sync. |
| [DVAI] Apple Foundation Models is iOS-only | Selected foundation on Android. | Branch on Capacitor.getPlatform() and pick llama / mediapipe on Android. |
| Cleartext error on Android emulator (API < 28). | Custom networkSecurityConfig overrides ours with cleartextTrafficPermitted=false. | Either remove your override or merge in <domain includeSubdomains="true">127.0.0.1</domain>. The plugin's manifest entry uses tools:replace but a host-app explicit override still wins. |
iOS does not need any Info.plist keys for loopback HTTP. ATS exempts 127.0.0.1 by default. NSLocalNetworkUsageDescription is unrelated and not needed.
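
To see which ports a given httpBasePort and httpMaxPortAttempts pair would cover, a small sketch can help. The helper candidatePorts is hypothetical. It assumes the native server tries consecutive ports starting at the base port, and it assumes a default of 16 attempts, inferred only from the 38883..38898 range in the error message above.

```typescript
// Hypothetical helper: lists the ports a sequential scan would try.
// Assumptions: consecutive ports from the base, default of 16 attempts.
function candidatePorts(basePort = 38883, maxAttempts = 16): number[] {
  if (basePort < 1 || maxAttempts < 1) {
    throw new RangeError("basePort and maxAttempts must be positive");
  }
  const ports: number[] = [];
  for (let i = 0; i < maxAttempts; i++) {
    const port = basePort + i;
    if (port > 65535) break; // stay inside the valid TCP port range
    ports.push(port);
  }
  return ports;
}
```

For example, candidatePorts(49000, 4) lists 49000 through 49003, which is useful when checking that a custom range does not collide with a dev server.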
6. Choosing a backend
| Need | Pick |
|---|---|
| Text completion, broadest model choice | llama |
| Vision (image_url content parts) | mediapipe (vision-capable Gemma) or llama + mmproj (Phase 2) |
| Audio (input_audio content parts) | llama with a multimodal GGUF that has a native audio encoder (Phase 2) |
| Zero-download text on iOS 26+ | foundation |
| Embeddings | llama with embeddingMode: true |
| Apple-managed privacy posture | foundation |
See Multimodal for the full per-backend modality matrix and content-part shapes. Tested models has concrete model recommendations per tier.
7. Distributed inference (offload) — v3.0+
Version 3.0+ of the Capacitor plugin exposes the distributed-inference configuration. Pass an offload block to start() to enable LAN / internet peer discovery and offload work when local hardware can't keep up. See the Distributed Inference guide for the full description.
import { DVAIBridge } from "@dvai-bridge/capacitor";

const server = await DVAIBridge.start({
  backend: "llama",
  modelPath: "/path/to/model.gguf",
  offload: {
    enabled: true,
    discoverLAN: true,
    minLocalCapability: 10,
    rendezvousUrl: "wss://rendezvous.myapp.com", // optional, internet path
  },
});

The JS-side OffloadConfig.onPairingRequest callback can't cross the Capacitor plugin boundary, so inbound pairing requests arrive via an event listener. Respond with respondToPairing(requestId, approved):
const handle = await DVAIBridge.addListener("pairingRequest", async (req) => {
  const approved = await myUiConfirm(req.peerDeviceName);
  await DVAIBridge.respondToPairing(req.id, approved);
});
// Tear down when finished:
await handle.remove();

addListener("pairingRequest") needs a successful start() first, because the listener is dispatched on the active backend plugin. Without a registered listener, inbound pairing requests are denied after the request's expiresAt deadline.
Next steps
- Model distribution: hosting, sha256, multi-file GGUF + mmproj download patterns, gated HF repos, disk-space pre-checks.
- Multimodal: image / audio content parts, error semantics, per-backend support matrix.
- Tested models: the curated list we exercise in CI and pre-release smoke tests.
- Native backend overview: architecture and migration notes from the deprecated llama-cpp-capacitor package.
- Distributed Inference guide: peer discovery, capability scoring, pairing handshake, and the /v1/dvai/* endpoints.
