Conceptual Overview

Introduction

Orga AI is a multimodal AI that sees, hears, and speaks in real time. It enables voice and video interactions over WebRTC, with the API served at https://api.orga-ai.com/v1.

The Orga AI API and SDKs for Web and React Native make it easy to build conversational apps with features like real-time transcription (powered by Whisper) and customizable modalities (audio-only or audio+video).

This page covers the core concepts (sessions, authentication, ICE configuration, and DataChannel events) and explains how the SDK abstracts WebRTC complexity.

Developers must implement a backend proxy to securely handle API calls, while the SDK’s context provider and hooks (e.g., useOrgaAI) streamline client-side integration.

Start here, then dive into the detailed SDK documentation or the cURL Quickstart:

React Native SDK

Web SDK

cURL Quickstart

Core Concepts

Sessions

A session is a WebRTC-based connection to Orga AI for real-time multimodal interactions (audio-only or audio+video).

Key steps:

  • Exchange an API key for an ephemeral token (POST /v1/realtime/client-secrets).

  • Fetch ICE servers for WebRTC connectivity (GET /v1/realtime/ice-config).

  • Negotiate SDP to start the session (handled by SDK).

  • Send/receive data (e.g., transcriptions) via WebRTC DataChannels.
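
The first two steps can be sketched as plain request descriptors, which makes it easy to see which credential goes with which call. This is a minimal sketch, not the SDK's implementation: the full URLs are assumed by joining the paths above with the host from the introduction, the Bearer header format is an assumption, and the names RequestDescriptor, clientSecretRequest, and iceConfigRequest are hypothetical.

```typescript
// Host taken from the introduction; paths taken from the key steps above.
// ASSUMPTION: the two combine into the full URLs built below.
const ORGA_HOST = "https://api.orga-ai.com";

// Hypothetical shape describing an HTTP request without sending it.
interface RequestDescriptor {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
}

// Step 1 (server-side only): exchange the long-lived API key for an ephemeral token.
// ASSUMPTION: the API key is sent as a Bearer token; check the API reference.
function clientSecretRequest(apiKey: string): RequestDescriptor {
  if (apiKey.trim() === "") throw new Error("apiKey must not be empty");
  return {
    method: "POST",
    url: `${ORGA_HOST}/v1/realtime/client-secrets`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Step 2: fetch ICE servers, authenticating with the ephemeral token from step 1.
// ASSUMPTION: the ephemeral token is also sent as a Bearer token.
function iceConfigRequest(ephemeralToken: string): RequestDescriptor {
  if (ephemeralToken.trim() === "") throw new Error("ephemeralToken must not be empty");
  return {
    method: "GET",
    url: `${ORGA_HOST}/v1/realtime/ice-config`,
    headers: { Authorization: `Bearer ${ephemeralToken}` },
  };
}
```

Only the second descriptor carries a credential that could reasonably be used outside the backend; the API key in the first should never leave the server.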

SDK Simplification: The SDKs abstract WebRTC peer connections and SDP negotiation.

Example:

Authentication

Authenticate with ephemeral (short-lived) JWTs so that API keys never reach the client. Use a backend proxy to call /v1/realtime/client-secrets, then pass the returned token to /v1/realtime/ice-config.

Why Proxy? Exposing API keys client-side risks abuse (per OWASP API security guidelines). A proxy keeps keys server-side, returning tokens/ICE configs to the client.

SDK Simplification: The SDK’s init function accepts a proxy callback, handling token/ICE retrieval internally.

WebRTC and ICE Configuration

Orga AI uses WebRTC for low-latency communication. ICE servers (STUN/TURN), fetched via /v1/realtime/ice-config, provide connectivity across NATs and firewalls.

SDK Simplification: The SDK passes ICE configs to RTCPeerConnection automatically after calling your proxy function. No manual WebRTC setup needed.
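
For readers who want to see what the SDK is doing on their behalf, the following sketch validates an ICE payload and shapes it into the configuration object that RTCPeerConnection expects. It rests on an assumption: that the /ice-config response is a JSON object with an iceServers array in the standard WebRTC format. The names IceServer, PeerConfig, and toPeerConfig are hypothetical and not part of the SDK.

```typescript
// Mirrors the standard WebRTC ICE server entry (urls, username, credential).
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// ICE server URLs use the stun:, stuns:, turn:, or turns: schemes.
const ICE_URL_SCHEME = /^(stun|stuns|turn|turns):/;

// ASSUMPTION: the ice-config response looks like { iceServers: [...] }.
function toPeerConfig(payload: unknown): PeerConfig {
  const servers = (payload as { iceServers?: unknown } | null)?.iceServers;
  if (!Array.isArray(servers)) {
    throw new Error("ice-config payload has no iceServers array");
  }
  const iceServers = servers.map((entry: unknown): IceServer => {
    const { urls, username, credential } = (entry ?? {}) as {
      urls?: unknown;
      username?: unknown;
      credential?: unknown;
    };
    // Normalize a single URL string into a one-element list.
    const list = typeof urls === "string" ? [urls] : urls;
    if (
      !Array.isArray(list) ||
      list.length === 0 ||
      !list.every((u) => typeof u === "string" && ICE_URL_SCHEME.test(u))
    ) {
      throw new Error("each ICE server needs at least one stun:/turn: URL");
    }
    const server: IceServer = { urls: list as string[] };
    // TURN servers usually carry credentials; STUN servers usually do not.
    if (typeof username === "string") server.username = username;
    if (typeof credential === "string") server.credential = credential;
    return server;
  });
  return { iceServers };
}
```

In a browser, the result could be passed to new RTCPeerConnection(config). With the SDK this step is not needed.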

DataChannel Events

Once connected, Orga AI sends real-time events over WebRTC DataChannels, such as:

  • session.update: Runtime adjustments (e.g., model, voice, temperature, modalities).

  • response.output_item.done: Orga AI’s transcription results.

  • conversation.item.input_audio_transcription.completed: User transcription results.

SDK Simplification: Use the useOrgaAI hook to access events without parsing raw DataChannel messages. See SDK API Reference > Hooks for details.
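
To illustrate what parsing raw DataChannel messages involves when the hook is not used, here is a minimal router sketch. It assumes each message is a JSON text frame with a string type field matching the event names listed above; that wire format is an assumption, and the names OrgaEvent, Handler, and EventRouter are hypothetical rather than SDK exports.

```typescript
// Event names taken from the list above.
type OrgaEventType =
  | "session.update"
  | "response.output_item.done"
  | "conversation.item.input_audio_transcription.completed";

// ASSUMPTION: every message is a JSON object with a string `type` field.
interface OrgaEvent {
  type: string;
  [key: string]: unknown;
}

type Handler = (event: OrgaEvent) => void;

class EventRouter {
  private handlers = new Map<string, Handler[]>();

  // Register a handler for one event type; returns `this` for chaining.
  on(type: OrgaEventType, handler: Handler): this {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
    return this;
  }

  // Parse one raw message and run matching handlers.
  // Returns true if at least one handler ran, false otherwise
  // (malformed JSON, missing type, or no handler registered).
  dispatch(raw: string): boolean {
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return false;
    }
    if (typeof parsed !== "object" || parsed === null) return false;
    const type = (parsed as { type?: unknown }).type;
    if (typeof type !== "string") return false;
    const list = this.handlers.get(type) ?? [];
    for (const handler of list) handler(parsed as OrgaEvent);
    return list.length > 0;
  }
}
```

With a raw RTCDataChannel, the router could be wired up as channel.onmessage = (e) => router.dispatch(e.data). Malformed or unknown messages are ignored rather than thrown, so one bad frame does not interrupt the session.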

Secure Proxy Setup

To prevent API key exposure, implement a backend proxy (e.g., Node.js, Python, Go) to call /v1/realtime/client-secrets and /v1/realtime/ice-config. Pass the token and ICE servers to the SDK’s init function.
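
The core of such a proxy might look like the sketch below: a function that runs on the backend, calls both endpoints, and returns only the token and ICE servers. Several things here are assumptions rather than documented behavior: the Bearer header format, the response field names ephemeral_token and iceServers, and the full URLs. The names HttpFn, SessionConfig, and createSessionConfig are hypothetical. The HTTP function is injected so the logic can be exercised without network access.

```typescript
// Minimal fetch-like contract so the sketch does not depend on a specific runtime.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpFn = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<HttpResponse>;

// What the proxy hands back to the client: never the API key itself.
interface SessionConfig {
  ephemeralToken: string;
  iceServers: unknown[];
}

// ASSUMPTION: host from the introduction joined with the documented paths.
const REALTIME_BASE = "https://api.orga-ai.com/v1/realtime";

async function createSessionConfig(apiKey: string, http: HttpFn): Promise<SessionConfig> {
  // 1. Exchange the API key for an ephemeral token (server-side only).
  const secretRes = await http(`${REALTIME_BASE}/client-secrets`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` }, // ASSUMPTION: Bearer auth
  });
  if (!secretRes.ok) throw new Error(`client-secrets failed: ${secretRes.status}`);
  // ASSUMPTION: the token is returned under `ephemeral_token`.
  const { ephemeral_token } = ((await secretRes.json()) ?? {}) as { ephemeral_token?: unknown };
  if (typeof ephemeral_token !== "string") throw new Error("no ephemeral token in response");

  // 2. Use the ephemeral token to fetch ICE servers.
  const iceRes = await http(`${REALTIME_BASE}/ice-config`, {
    method: "GET",
    headers: { Authorization: `Bearer ${ephemeral_token}` },
  });
  if (!iceRes.ok) throw new Error(`ice-config failed: ${iceRes.status}`);
  // ASSUMPTION: ICE servers are returned under `iceServers`.
  const { iceServers } = ((await iceRes.json()) ?? {}) as { iceServers?: unknown };
  if (!Array.isArray(iceServers)) throw new Error("no iceServers in response");

  return { ephemeralToken: ephemeral_token, iceServers };
}
```

On a runtime with a global fetch (for example Node 18 or later), fetch can be passed as the second argument, and the function can back a single authenticated route that the client's proxy callback calls. Returning both values from one route keeps the client to a single round trip.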