Conceptual Overview
Introduction
Orga AI is a multimodal AI that sees, hears, and speaks in real time, enabling voice and video interactions via WebRTC at https://api.orga-ai.com/v1.
The Orga AI API and SDKs for Web and React Native make it easy to build conversational apps with features like real-time transcription (powered by Whisper) and customizable modalities (audio-only or audio+video).
This page covers the core concepts (sessions, authentication, ICE configuration, and DataChannel events) and explains how the SDK abstracts WebRTC complexity.
Developers must implement a backend proxy to securely handle API calls, while the SDK’s context provider and hooks (e.g., useOrgaAI) streamline client-side integration.
Start here, then dive into the detailed SDK documentation or the cURL quickstart:
- React Native SDK
- Web SDK
- cURL Quickstart
Core Concepts
Sessions
A session is a WebRTC-based connection to Orga AI for real-time multimodal interactions (audio-only or audio+video).
Key steps:
- Exchange an API key for an ephemeral token (POST /v1/realtime/client-secrets).
- Fetch ICE servers for WebRTC connectivity (GET /v1/realtime/ice-config).
- Negotiate SDP to start the session (handled by the SDK).
- Send and receive data (e.g., transcriptions) via WebRTC DataChannels.
SDK Simplification: The SDKs abstract WebRTC peer connections and SDP negotiation.
Example:
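The first two steps can be sketched as the ordered pair of HTTP calls a backend might make. The endpoint paths come from this page; the Bearer authorization scheme, the response field names (`ephemeral_token`, `iceServers`), and every helper name are assumptions made for illustration, so check the API reference for the real request and response shapes.

```typescript
// Minimal fetch-like signature so the sketch runs in any runtime; the built-in
// fetch of Node 18+ or a browser satisfies it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
}

const BASE_URL = "https://api.orga-ai.com/v1";

// Step 1: exchange the long-lived API key for an ephemeral token.
// ASSUMPTION: the API key is sent as a Bearer credential.
function clientSecretRequest(apiKey: string): HttpRequest {
  return {
    url: `${BASE_URL}/realtime/client-secrets`,
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Step 2: fetch ICE servers, authenticating with the ephemeral token.
// ASSUMPTION: the token is also sent as a Bearer credential.
function iceConfigRequest(ephemeralToken: string): HttpRequest {
  return {
    url: `${BASE_URL}/realtime/ice-config`,
    method: "GET",
    headers: { Authorization: `Bearer ${ephemeralToken}` },
  };
}

// Sends one request and returns the parsed JSON body, failing on non-2xx responses.
async function sendJson(
  fetchImpl: FetchLike,
  req: HttpRequest
): Promise<Record<string, unknown>> {
  const res = await fetchImpl(req.url, { method: req.method, headers: req.headers });
  if (!res.ok) {
    throw new Error(`${req.method} ${req.url} failed with status ${res.status}`);
  }
  return (await res.json()) as Record<string, unknown>;
}

// Runs steps 1 and 2 in order; SDP negotiation and DataChannels are left to the SDK.
async function fetchSessionCredentials(fetchImpl: FetchLike, apiKey: string) {
  const secret = await sendJson(fetchImpl, clientSecretRequest(apiKey));
  const token = secret["ephemeral_token"]; // ASSUMPTION: hypothetical field name
  if (typeof token !== "string") {
    throw new Error("Token response did not contain the expected field");
  }
  const ice = await sendJson(fetchImpl, iceConfigRequest(token));
  // ASSUMPTION: hypothetical field name for the ICE server list
  return { ephemeralToken: token, iceServers: ice["iceServers"] };
}
```

On a server this would be called as `fetchSessionCredentials(fetch, apiKey)`, with the key read from server-side configuration. Injecting the fetch function keeps the call order easy to test without network access.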
Authentication
Authenticate with ephemeral JWTs so that API keys never reach the client. Use a backend proxy to call /v1/realtime/client-secrets, then pass the returned token to /v1/realtime/ice-config.
Why Proxy? Exposing API keys client-side risks abuse (per OWASP API security guidelines). A proxy keeps keys server-side, returning tokens/ICE configs to the client.
SDK Simplification: The SDK’s init function accepts a proxy callback, handling token/ICE retrieval internally.
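A proxy callback on the client side might look like the sketch below. The route `/api/orga/session`, the POST method, and the `{ ephemeralToken, iceServers }` response shape are hypothetical: they are whatever your own backend defines. How the callback is handed to the SDK's init function is deliberately not shown, since the exact option name belongs to the SDK reference.

```typescript
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// ASSUMPTION: hypothetical shape returned by your own proxy route.
interface SessionConfig {
  ephemeralToken: string;
  iceServers: IceServer[];
}

// Validates the untrusted JSON from the proxy before it is handed to the SDK.
function parseSessionConfig(payload: unknown): SessionConfig {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("Proxy response is not a JSON object");
  }
  const { ephemeralToken, iceServers } = payload as Record<string, unknown>;
  if (typeof ephemeralToken !== "string" || ephemeralToken.length === 0) {
    throw new Error("Proxy response is missing the ephemeral token");
  }
  if (!Array.isArray(iceServers)) {
    throw new Error("Proxy response is missing the ICE server list");
  }
  return { ephemeralToken, iceServers: iceServers as IceServer[] };
}

// Minimal fetch-like signature; the browser's built-in fetch satisfies it.
type ProxyFetch = (
  url: string,
  init: { method: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// The callback itself: it talks only to your backend and never sees the API key.
async function fetchSessionConfig(
  fetchImpl: ProxyFetch,
  proxyUrl = "/api/orga/session" // hypothetical route on your own backend
): Promise<SessionConfig> {
  const res = await fetchImpl(proxyUrl, { method: "POST" });
  if (!res.ok) {
    throw new Error(`Proxy request failed with status ${res.status}`);
  }
  return parseSessionConfig(await res.json());
}
```

Validating the payload in one place means a misconfigured proxy fails with a clear error instead of surfacing later as a WebRTC connection failure.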
WebRTC and ICE Configuration
Orga AI uses WebRTC for low-latency communication. ICE servers (STUN/TURN), fetched via /v1/realtime/ice-config, ensure connectivity across NATs and firewalls.
SDK Simplification: The SDK passes ICE configs to RTCPeerConnection automatically after calling your proxy function. No manual WebRTC setup needed.
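For readers curious about what the SDK does on their behalf, the sketch below shows how a list of ICE servers maps onto the standard WebRTC configuration object. The input shape and the helper name `toRtcConfiguration` are assumptions; only the `iceServers` option of `RTCPeerConnection` is part of the WebRTC standard, and the SDK's internal implementation may differ.

```typescript
// Mirrors the fields of the standard RTCIceServer dictionary.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Structurally compatible with the standard RTCConfiguration dictionary.
interface RtcConfig {
  iceServers: IceServer[];
}

// Accepts only the URI schemes defined for STUN and TURN servers.
const ICE_URL_PATTERN = /^(stun|turn)s?:/;

// Normalizes `urls` to an array and drops entries that contain non-ICE URLs.
function toRtcConfiguration(servers: IceServer[]): RtcConfig {
  const iceServers = servers
    .map((server) => ({
      ...server,
      urls: Array.isArray(server.urls) ? server.urls : [server.urls],
    }))
    .filter(
      (server) =>
        server.urls.length > 0 && server.urls.every((url) => ICE_URL_PATTERN.test(url))
    );
  return { iceServers };
}

// In a browser, the result can be passed straight to the standard constructor:
//   const peer = new RTCPeerConnection(toRtcConfiguration(servers));
```

With the SDK none of this is required; the sketch only illustrates where the fetched ICE servers end up.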
DataChannel Events
Once connected, Orga AI sends real-time events over WebRTC DataChannels, like:
- session.update: Runtime adjustments (e.g., model, voice, temperature, modalities).
- response.output_item.done: Orga AI’s transcription results.
- conversation.item.input_audio_transcription.completed: User transcription results.
SDK Simplification: Use the useOrgaAI hook to access events without parsing raw DataChannel messages. See SDK API Reference > Hooks for details.
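To show the kind of work the hook saves, here is a rough sketch of parsing and routing raw DataChannel messages by hand. The event names come from the list above; the assumption that each message is a JSON object with a string `type` field, the `transcript` field used in the usage note, and all helper names are hypothetical.

```typescript
type EventPayload = Record<string, unknown>;
type EventHandler = (payload: EventPayload) => void;

interface ParsedEvent {
  type: string;
  payload: EventPayload;
}

// Parses one raw message; returns null for anything that is not a typed JSON object.
// ASSUMPTION: events are JSON objects carrying a string `type` field.
function parseDataChannelMessage(raw: string): ParsedEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const payload = data as EventPayload;
  const type = payload["type"];
  return typeof type === "string" ? { type, payload } : null;
}

// Routes a raw message to the handler registered for its event type.
// Returns true only when a handler was found and invoked.
function dispatchOrgaEvent(raw: string, handlers: Map<string, EventHandler>): boolean {
  const parsedEvent = parseDataChannelMessage(raw);
  if (parsedEvent === null) {
    return false;
  }
  const handler = handlers.get(parsedEvent.type);
  if (handler === undefined) {
    return false;
  }
  handler(parsedEvent.payload);
  return true;
}

// With a raw RTCDataChannel the wiring would be roughly:
//   channel.onmessage = (msg) => dispatchOrgaEvent(String(msg.data), handlers);
```

A handler map for user transcriptions could be registered as `new Map([["conversation.item.input_audio_transcription.completed", (payload) => console.log(payload["transcript"])]])`, where `transcript` is again an assumed field name.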
Secure Proxy Setup
To prevent API key exposure, implement a backend proxy (e.g., Node.js, Python, Go) to call /v1/realtime/client-secrets and /v1/realtime/ice-config. Pass the token and ICE servers to the SDK’s init function.
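The proxy route itself can stay small. The framework-agnostic sketch below focuses on what the route returns: it accepts only POST, sends back only the short-lived token and ICE servers, and hides upstream error details. The handler name, the status codes, the response shape, and the injected `getCredentials` function (which would wrap the two upstream calls using the server-side API key) are all assumptions for illustration.

```typescript
// ASSUMPTION: hypothetical shape of what the upstream calls produced.
interface SessionCredentials {
  ephemeralToken: string;
  iceServers: unknown[];
}

interface ProxyResponse {
  status: number;
  body: Record<string, unknown>;
}

// Supplied by your server code; it is the only place that touches the API key.
type CredentialsProvider = () => Promise<SessionCredentials>;

async function handleSessionRequest(
  method: string,
  getCredentials: CredentialsProvider
): Promise<ProxyResponse> {
  if (method !== "POST") {
    return { status: 405, body: { error: "Method not allowed" } };
  }
  try {
    const { ephemeralToken, iceServers } = await getCredentials();
    // Copy only the two short-lived fields so nothing else can leak to the client.
    return { status: 200, body: { ephemeralToken, iceServers } };
  } catch {
    // Upstream error text may include sensitive details, so return a generic message.
    return { status: 502, body: { error: "Failed to create session" } };
  }
}
```

The returned status and body can be written out with whatever server framework you use, for example Express or the built-in Node.js `http` module.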
