OpenAI-Compatible Client

LeapOpenAIClient / leap-openai-client (introduced in v0.10.0) is a small, dependency-light client for any OpenAI-compatible chat-completions endpoint — OpenAI itself, OpenRouter, vLLM, llama-server, or your own proxy. It ships in the same SDK release as LeapSDK, so you can route requests between an on-device LFM and a cloud model from a single app.

When to use it

Hybrid on-device + cloud routing. Run small / fast models on-device with LeapSDK, fall back to a larger cloud model for hard prompts.
Standardised cloud API. Talk to any OpenAI-compatible backend without pulling in a heavier OpenAI SDK.
Streaming first. SSE streaming is the only mode — non-streaming requests aren’t exposed. streamChatCompletion(...) forces stream = true on the outgoing request regardless of the stream field on the ChatCompletionRequest you pass in.

Add the dependency

Setup is listed per platform in this order: iOS / macOS (SPM), Android (Gradle), JVM (Gradle), Kotlin/Native (Gradle).

Add the LeapOpenAIClient product to your target. See the Quick Start for the full SPM setup.

dependencies: [
    .package(url: "https://github.com/Liquid4All/leap-sdk.git", from: "0.10.8")
]

targets: [
    .target(
        name: "YourApp",
        dependencies: [
            .product(name: "LeapOpenAIClient", package: "leap-sdk"),
        ]
    )
]

In Swift sources, import LeapOpenAIClient. The Darwin (URLSession) Ktor engine is bundled — no extra HTTP setup needed.

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.8")
    implementation("ai.liquid.leap:leap-openai-client:0.10.8")
}

Bundles an OkHttp-engine Ktor client. No extra HTTP setup needed.

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.8")
    implementation("ai.liquid.leap:leap-openai-client:0.10.8")
}

JVM support landed in v0.10.7; the jvm artifact was missing from v0.10.0 through v0.10.6. Pure-Maven JVM projects should depend on the -jvm artifact directly: ai.liquid.leap:leap-openai-client-jvm:0.10.8. Bundles the CIO Ktor engine.

dependencies {
    implementation("ai.liquid.leap:leap-sdk:0.10.8")
    implementation("ai.liquid.leap:leap-openai-client:0.10.8")
}

Targets linuxX64, linuxArm64, mingwX64 (Windows native), and wasmJs (browser via Ktor Js engine, added in v0.10.7).

Basic usage

Examples are given first in Swift (iOS / macOS), then in Kotlin (all platforms).

New in v0.10.8. SKIE is now applied to leap-sdk-openai-client, matching LeapSDK / LeapModelDownloader / LeapUI. Swift consumers get a real AsyncSequence, exhaustive onEnum(of:) switching, nested Kotlin class names (ChatCompletionEvent.Delta instead of the previously flattened ChatCompletionEventDelta), and an OpenAiClient(config:) convenience init, so the OpenAiClientKt. prefix is no longer needed. To stay on the pre-SKIE manual-collector surface, pin to 0.10.7; otherwise use the v0.10.8 surface below.

import LeapOpenAIClient

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-…",
        baseUrl: "https://api.openai.com/v1"
    )
)

let request = ChatCompletionRequest(
    model: "gpt-4o-mini",
    messages: [
        ChatMessage.System(content: "You are a helpful assistant."),
        ChatMessage.User(content: "What is the capital of Japan?")
    ],
    temperature: 0.7
)

for try await event in client.streamChatCompletion(request: request) {
    switch onEnum(of: event) {
    case .delta(let delta):
        print(delta.content, terminator: "")
    case .done(let done):
        if let usage = done.usage { print("\nTokens: \(usage.totalTokens)") }
    case .error(let err):
        print("\nError: \(err.message)")
    }
}

client.close()  // closes the underlying URLSession-backed HttpClient

onEnum(of:) gives exhaustive switching: if a new ChatCompletionEvent case is added, the Swift compiler reports an error until the switch handles it.

SKIE bridges Kotlin Flow to a Swift AsyncSequence with Failure = Never, so transport-level failures (network drop, TLS error, etc.) silently terminate the stream rather than throwing. Only HTTP-level errors (non-2xx response) arrive as in-band .error events; malformed SSE chunks are logged and skipped. If you need to react to silent termination, track whether .done was observed and treat a stream that ended without it as a transport failure.

import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig

val client = OpenAiClient(
    config = OpenAiClientConfig(
        apiKey = "sk-…",
        baseUrl = "https://api.openai.com/v1",
    )
)

val request = ChatCompletionRequest(
    model = "gpt-4o-mini",
    messages = listOf(
        ChatMessage.System("You are a helpful assistant."),
        ChatMessage.User("What is the capital of Japan?"),
    ),
    temperature = 0.7,
)

client.streamChatCompletion(request).collect { event ->
    when (event) {
        is ChatCompletionEvent.Delta -> print(event.content)
        is ChatCompletionEvent.Done  -> event.usage?.let { println("\nTokens: ${it.totalTokens}") }
        is ChatCompletionEvent.Error -> println("\nError: ${event.message}")
    }
}

client.close()

Configuration

OpenAiClientConfig is a Kotlin data class bridged identically on every platform.

data class OpenAiClientConfig(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)

Field	Default	Notes
`apiKey`	— (required)	Sent as `Authorization: Bearer <apiKey>`.
`baseUrl`	`https://api.openai.com/v1`	Override for OpenRouter, a self-hosted backend, etc.
`chatCompletionsPath`	`/chat/completions`	Appended to `baseUrl`.
`extraHeaders`	`{}`	Merged into every request — e.g. OpenRouter’s `HTTP-Referer`.
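To make the table concrete, here is a minimal sketch of how these four fields could combine into a request URL and header set. It is not the SDK's implementation: `ClientConfigSketch`, `resolvedUrl`, and `resolvedHeaders` are hypothetical names, and the sketch assumes the path is joined to `baseUrl` by plain string concatenation and that `extraHeaders` are added alongside the `Authorization` header.

```kotlin
// Local mirror of the fields documented above, so the sketch runs without the SDK.
// Hypothetical type; the real class is OpenAiClientConfig.
data class ClientConfigSketch(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)

// Assumption: "appended to baseUrl" means plain concatenation.
fun ClientConfigSketch.resolvedUrl(): String = baseUrl + chatCompletionsPath

// Assumption: the bearer token is set from apiKey and extraHeaders are merged in.
fun ClientConfigSketch.resolvedHeaders(): Map<String, String> =
    mapOf("Authorization" to "Bearer $apiKey") + extraHeaders

fun main() {
    val openRouter = ClientConfigSketch(
        apiKey = "test-key",
        baseUrl = "https://openrouter.ai/api/v1",
        extraHeaders = mapOf("HTTP-Referer" to "https://yourapp.example.com"),
    )
    println(openRouter.resolvedUrl())
    println(openRouter.resolvedHeaders().keys)
}
```

If the concatenation assumption holds, a `baseUrl` with a trailing slash would produce a double slash before `chat/completions`, which is worth checking when a self-hosted backend answers with 404.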

OpenRouter

Examples are given first in Swift (iOS / macOS), then in Kotlin (all platforms).

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-or-…",
        baseUrl: "https://openrouter.ai/api/v1",
        extraHeaders: [
            "HTTP-Referer": "https://yourapp.example.com",
            "X-Title": "Your App"
        ]
    )
)

val client = OpenAiClient(
    OpenAiClientConfig(
        apiKey = "sk-or-…",
        baseUrl = "https://openrouter.ai/api/v1",
        extraHeaders = mapOf(
            "HTTP-Referer" to "https://yourapp.example.com",
            "X-Title" to "Your App",
        ),
    )
)

Self-hosted vLLM / llama-server

Examples are given first in Swift (iOS / macOS), then in Kotlin (all platforms).

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "anything",  // Required by config but typically unused
        baseUrl: "http://10.0.0.42:8000/v1"
    )
)

val client = OpenAiClient(
    OpenAiClientConfig(
        apiKey = "anything",
        baseUrl = "http://10.0.0.42:8000/v1",
    )
)

Request shape

ChatCompletionRequest covers standard OpenAI fields plus a few OpenRouter-specific extensions. OpenRouter-only fields are silently ignored by stock OpenAI-compatible APIs.

data class ChatCompletionRequest(
    val model: String,
    val messages: List<ChatMessage>,
    val temperature: Double? = null,
    val topP: Double? = null,
    val maxCompletionTokens: Int? = null,   // Preferred for newer OpenAI versions
    val maxTokens: Int? = null,             // Legacy alias — some custom backends still require it
    val frequencyPenalty: Double? = null,
    val presencePenalty: Double? = null,
    val stop: List<String>? = null,
    val stream: Boolean = true,
    // OpenRouter extensions:
    val topK: Int? = null,
    val repetitionPenalty: Double? = null,
    val minP: Double? = null,
    val topA: Double? = null,
    val transforms: List<String>? = null,
    val models: List<String>? = null,
    val route: String? = null,
    val provider: ProviderPreferences? = null,
)

ChatMessage (the OpenAI-client one, distinct from LeapSDK.ChatMessage) is a sealed type with three cases — System, User, Assistant.
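As an illustration of the request shape, the following sketch builds a multi-turn request that also sets a few of the OpenRouter extensions. Some details are assumptions rather than documented behaviour: `buildFollowUpRequest` is a hypothetical helper, `ChatMessage.Assistant` is assumed to take its text positionally like `System` and `User`, the model slugs are illustrative, and `"fallback"` is the `route` value OpenRouter has documented for use with `models` (check its current docs before relying on it).

```kotlin
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage

// Hypothetical helper showing a follow-up turn with prior history.
fun buildFollowUpRequest(): ChatCompletionRequest =
    ChatCompletionRequest(
        model = "openai/gpt-4o-mini", // illustrative OpenRouter-style slug
        messages = listOf(
            ChatMessage.System("Answer in one sentence."),
            ChatMessage.User("What is the capital of Japan?"),
            // Assumption: Assistant takes its content like System and User do.
            ChatMessage.Assistant("Tokyo."),
            ChatMessage.User("And of France?"),
        ),
        temperature = 0.2,
        maxCompletionTokens = 128,
        stop = listOf("\n\n"),
        // OpenRouter extensions; leave them null for other backends.
        topK = 40,
        models = listOf("openai/gpt-4o-mini", "mistralai/mistral-small"), // illustrative
        route = "fallback",
    )
```

Because every optional field defaults to `null`, a request aimed at a plain OpenAI-compatible backend can simply leave the extension block out.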

Response shape

streamChatCompletion(request) returns a Flow<ChatCompletionEvent> in Kotlin. Since v0.10.8, SKIE bridges it to a Swift AsyncSequence, so Swift consumers can iterate it with for try await event in client.streamChatCompletion(request: ...). Events:

Variant	Meaning
`Delta(content: String)`	Text chunk from the model. May be empty for role-only deltas.
`Done(usage: Usage?)`	Stream finished. `usage` is non-`null` when the API includes token counts.
`Error(message: String)`	Non-2xx HTTP response. Malformed SSE chunks are logged and skipped — they do not emit an `Error` event.

data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)
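Since only streaming is exposed, a caller that wants the whole reply at once has to buffer the `Delta` events itself. A minimal sketch of that, under stated assumptions: `BufferedCompletion`, `applyEvent`, and `completeBuffered` are hypothetical names, `Usage` is assumed to live in the same package as the event types, and `ChatCompletionEvent` is assumed to be sealed with exactly the three variants in the table.

```kotlin
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.Usage // assumed package
import kotlinx.coroutines.flow.fold

// Hypothetical result holder; not part of the SDK.
data class BufferedCompletion(
    val text: String = "",
    val usage: Usage? = null,
    val error: String? = null,
    val sawDone: Boolean = false,
)

// Pure reducer: folds one event into the running result.
fun BufferedCompletion.applyEvent(event: ChatCompletionEvent): BufferedCompletion =
    when (event) {
        is ChatCompletionEvent.Delta -> copy(text = text + event.content)
        is ChatCompletionEvent.Done -> copy(usage = event.usage, sawDone = true)
        is ChatCompletionEvent.Error -> copy(error = event.message)
    }

// Hypothetical helper: drains the stream and returns everything it saw.
suspend fun OpenAiClient.completeBuffered(
    request: ChatCompletionRequest,
): BufferedCompletion =
    streamChatCompletion(request).fold(BufferedCompletion()) { acc, event ->
        acc.applyEvent(event)
    }
```

The `sawDone` flag lets the caller tell a complete reply from a stream that ended early: a result with `sawDone == false` and `error == null` never received a `Done` event. String concatenation keeps the sketch short; a `StringBuilder` would be the better choice for long replies.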

Hybrid routing example

Route simple prompts to a small on-device LFM; escalate harder prompts to a cloud model.

Examples are given in Swift (iOS / macOS), then Kotlin (Android), then Kotlin (JVM / native).

import Combine  // ObservableObject
import LeapModelDownloader
import LeapOpenAIClient

@MainActor
final class HybridChatViewModel: ObservableObject {
    private let onDevice: Conversation
    private let cloud: OpenAiClient

    init(onDevice: Conversation, cloud: OpenAiClient) {
        self.onDevice = onDevice
        self.cloud = cloud
    }

    func send(_ text: String, useCloud: Bool) async throws {
        if useCloud {
            // Cloud path — same SKIE-bridged surface as on-device since v0.10.8.
            let request = ChatCompletionRequest(
                model: "gpt-4o-mini",
                messages: [ChatMessage.User(content: text)]
            )
            for try await event in cloud.streamChatCompletion(request: request) {
                if case let .delta(d) = onEnum(of: event) { appendChunk(d.content) }
            }
        } else {
            // On-device path.
            let userMessage = LeapSDK.ChatMessage(role: .user, textContent: text)
            for try await response in onDevice.generateResponse(message: userMessage) {
                if case let .chunk(c) = onEnum(of: response) { appendChunk(c.text) }
            }
        }
    }

    private func appendChunk(_ text: String) { /* … */ }

    deinit { cloud.close() }
}

import ai.liquid.leap.Conversation
import ai.liquid.leap.message.MessageResponse
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.message.ChatMessage
import ai.liquid.leap.message.ChatMessageContent
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.launch

class HybridChatViewModel(
    private val onDevice: Conversation,
    private val cloud: OpenAiClient,
) : ViewModel() {

    fun send(text: String, useCloud: Boolean) = viewModelScope.launch {
        if (useCloud) {
            val request = ChatCompletionRequest(
                model = "gpt-4o-mini",
                messages = listOf(CloudChatMessage.User(text)),
            )
            cloud.streamChatCompletion(request).collect { event ->
                if (event is ChatCompletionEvent.Delta) appendChunk(event.content)
            }
        } else {
            val message = ChatMessage(
                role = ChatMessage.Role.USER,
                content = listOf(ChatMessageContent.Text(text)),
            )
            onDevice.generateResponse(message).collect { resp ->
                if (resp is MessageResponse.Chunk) appendChunk(resp.text)
            }
        }
    }

    private fun appendChunk(text: String) { /* … */ }

    override fun onCleared() {
        super.onCleared()
        cloud.close()
    }
}

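import ai.liquid.leap.Conversation
import ai.liquid.leap.message.MessageResponse
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
import ai.liquid.leap.openai.OpenAiClient

suspend fun hybridSend(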
    onDevice: Conversation,
    cloud: OpenAiClient,
    text: String,
    useCloud: Boolean,
) {
    if (useCloud) {
        val request = ChatCompletionRequest(
            model = "gpt-4o-mini",
            messages = listOf(CloudChatMessage.User(text)),
        )
        cloud.streamChatCompletion(request).collect { event ->
            if (event is ChatCompletionEvent.Delta) print(event.content)
        }
    } else {
        onDevice.generateResponse(text).collect { resp ->
            if (resp is MessageResponse.Chunk) print(resp.text)
        }
    }
}
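The examples above take `useCloud` as a parameter and leave the decision to the caller. One way to make that decision is a small heuristic on the prompt itself. The sketch below is purely illustrative: `RoutingPolicy` and `shouldUseCloud` are hypothetical names, and the length threshold and keyword list are placeholders to tune against your own models.

```kotlin
// Hypothetical routing policy; the defaults are placeholders, not recommendations.
data class RoutingPolicy(
    val maxOnDeviceChars: Int = 400,
    val escalationHints: List<String> = listOf("step by step", "prove", "refactor"),
)

// Returns true when the prompt should go to the cloud model.
fun shouldUseCloud(
    prompt: String,
    cloudAvailable: Boolean,
    policy: RoutingPolicy = RoutingPolicy(),
): Boolean {
    // With no connectivity there is nothing to escalate to.
    if (!cloudAvailable) return false
    // Long prompts are treated as hard.
    if (prompt.length > policy.maxOnDeviceChars) return true
    // So are prompts containing any of the escalation hints.
    val lowered = prompt.lowercase()
    return policy.escalationHints.any { it in lowered }
}

fun main() {
    println(shouldUseCloud("What is the capital of Japan?", cloudAvailable = true))
    println(shouldUseCloud("Prove that sqrt(2) is irrational.", cloudAvailable = true))
    println(shouldUseCloud("Prove that sqrt(2) is irrational.", cloudAvailable = false))
}
```

The result can be passed straight through as the `useCloud` argument. Keeping the policy in a data class makes the thresholds easy to adjust or load from remote config without touching the call sites.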

See Cloud AI Comparison for a side-by-side feature breakdown.

Lifecycle

OpenAiClient(config:) is a Kotlin factory function (fun OpenAiClient(config:)), exported as a SKIE-bundled Swift convenience init since v0.10.8. It creates an HttpClient internally and ties it to the returned client, so call close() when you’re done.

Guidance is given first for Swift (iOS / macOS), then for Kotlin (all platforms).

deinit { client.close() }

The lower-level constructor that accepts an externally-managed HttpClient is part of the Kotlin/Ktor surface and isn’t a useful entry point from Swift — the Ktor engine machinery isn’t bridged into the public Swift API. Use OpenAiClient(config:) and let the SDK own the session. If multiple consumers share a client, share the OpenAiClient instance and close() once at teardown.

override fun onCleared() {
    super.onCleared()
    client.close()
}

If you need to share an HttpClient across multiple clients (e.g., you already manage one for other Ktor-based code), use the lower-level constructor that takes a pre-built HttpClient — you then own its lifetime and shouldn’t call close() on the OpenAiClient:

import io.ktor.client.HttpClient
import io.ktor.client.engine.okhttp.OkHttp

val shared = HttpClient(OkHttp)  // your own instance
val client = OpenAiClient(config = config, httpClient = shared)
// Don't call client.close() — you own `shared` and decide when it dies
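Outside a ViewModel, for example in a JVM command-line tool, there is no lifecycle callback to hang `close()` on. A minimal sketch of a scoped helper for the case where the SDK owns the HttpClient follows; `withOpenAiClient` is a hypothetical name, not an SDK function.

```kotlin
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig

// Hypothetical helper: scopes a client to one block and always closes it.
suspend fun <T> withOpenAiClient(
    config: OpenAiClientConfig,
    block: suspend (OpenAiClient) -> T,
): T {
    // The factory creates and owns the underlying HttpClient.
    val client = OpenAiClient(config = config)
    try {
        return block(client)
    } finally {
        // Runs on success, on failure, and on coroutine cancellation.
        client.close()
    }
}
```

A call site would look like `withOpenAiClient(config) { client -> client.streamChatCompletion(request).collect { /* … */ } }`. Do not use this pattern with the constructor that takes a shared HttpClient, since in that case you own the HttpClient and closing is your decision.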
