Real-time AI, without the infrastructure headache.
Connect any AI model to a Fishjam room as a participant. It listens, speaks, and processes media — just like a human. You build the intelligence, we handle the transport.

Invite AI to the conversation
An agent joins a room, receives media tracks, and sends audio back — just like anyone else.
Your model, our infrastructure
No special APIs, no separate audio pipelines. An AI agent connects to a Fishjam room using the same SDKs as any other participant — it receives media tracks, processes them, and sends audio back in real time.
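The receive, process, send loop described above can be sketched roughly as follows. This is a minimal illustration, not the Fishjam SDK: `AgentConnection`, `Model`, `runAgent`, and `InMemoryConnection` are hypothetical names, and the model step is synchronous for brevity, whereas a real model call would normally be asynchronous.

```typescript
// Hypothetical shapes for illustration; these are not Fishjam SDK types.
type AudioChunk = Uint8Array;

// A model takes a chunk of room audio and returns reply audio, or null to stay silent.
type Model = (chunk: AudioChunk) => AudioChunk | null;

interface AgentConnection {
  onAudio(handler: (chunk: AudioChunk) => void): void; // audio arriving from the room
  sendAudio(chunk: AudioChunk): void; // audio the agent sends back
}

// Wire the loop: receive a chunk, process it, send the reply (if any) back.
function runAgent(conn: AgentConnection, model: Model): void {
  conn.onAudio((chunk) => {
    const reply = model(chunk);
    if (reply !== null) conn.sendAudio(reply);
  });
}

// In-memory stand-in for a room connection, so the loop can run without a network.
class InMemoryConnection implements AgentConnection {
  readonly sent: AudioChunk[] = [];
  private handlers: Array<(chunk: AudioChunk) => void> = [];

  onAudio(handler: (chunk: AudioChunk) => void): void {
    this.handlers.push(handler);
  }

  sendAudio(chunk: AudioChunk): void {
    this.sent.push(chunk);
  }

  // Simulate a chunk of participant audio arriving from the room.
  deliver(chunk: AudioChunk): void {
    for (const handler of this.handlers) handler(chunk);
  }
}
```

Keeping the connection behind a small interface like this is one way to test agent logic locally before pointing it at a real room.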
One agent, many roles
Fishjam is modular real-time media infrastructure. Use one product or combine several, depending on your needs.
Voice bots
Let an AI agent handle inbound voice calls, answer questions, or guide users through a flow — before handing off to a human when needed.
Live transcription
Stream audio to a transcription model and get real-time text output. Useful for accessibility, note-taking, or building searchable session records.
Content moderation
Monitor live audio or video streams with an AI agent that flags content in real time, without interrupting the session.
Custom AI pipelines
Connect any model, chain multiple agents, or build fully custom audio processing workflows. Fishjam handles the media transport, you own the logic.
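As a rough sketch of what chaining could look like in your own logic, the example below composes two processing stages, a transcription step feeding a moderation step. Everything here is an assumption made for illustration: `Stage`, `chain`, `transcribeStub`, and `moderate` are hypothetical names, and the transcription stage is a stub that decodes bytes as UTF-8 text in place of a real speech model.

```typescript
// A stage turns an input into an output, or returns null to drop it.
type Stage<I, O> = (input: I) => O | null;

// Compose two stages; if the first drops the input, the second never runs.
function chain<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input) => {
    const mid = first(input);
    return mid === null ? null : second(mid);
  };
}

interface ModerationResult {
  text: string;
  flagged: boolean;
}

// Stub standing in for a transcription model: treats the bytes as UTF-8 text.
const transcribeStub: Stage<Uint8Array, string> = (chunk) =>
  chunk.length === 0 ? null : new TextDecoder().decode(chunk);

// Toy moderation stage: flags text containing any blocked word.
const blockedWords = ["spam"];
const moderate: Stage<string, ModerationResult> = (text) => ({
  text,
  flagged: blockedWords.some((word) => text.toLowerCase().includes(word)),
});

// The combined pipeline: audio chunk in, moderation result out.
const pipeline = chain(transcribeStub, moderate);
```

Because each stage has the same shape, further steps can be added by calling `chain` again on the result.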
Use ready integrations or implement your own
Connect the model you already use or bring your own.

Connect to Gemini Live with minimal setup: no code hunting, no custom plumbing. Add one import and start building.

While Vapi handles the voice AI, Fishjam handles the real-time transport. Connect them and your agent joins the room — ready to help you.
More integrations
More integrations are on the way. Stay tuned so you don't miss your new favourite. Or, if you have a specific model in mind, let us know!
Create, connect, go live
Just three steps to an AI-powered session.
Create a room
Set up a Fishjam room and generate tokens for your participants and your AI agent.
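// Add an agent to the room identified by roomId.
const { agent: fishjamAgent } = await fishjam.createAgent(roomId);
// Create a track for the audio the agent sends back.
const agentTrack = fishjamAgent.createTrack();
Connect AI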
Connect your AI model to the room as a participant. It starts receiving audio tracks immediately.
// Open a live session with the model and ask for audio responses.
const session = await genai.live.connect({
  model: "gemini-3.1-flash-live-preview",
  config: {
    responseModalities: [Modality.AUDIO],
  },
  callbacks: {
    ...
  },
});
Go live
Human participants join. The agent listens, responds, and processes media in real time.
// Forward each audio chunk from the room to the model session, base64-encoded.
fishjamAgent.on("trackData", ({ data }) => {
  session.sendRealtimeInput({
    audio: {
      data: Buffer.from(data).toString("base64"),
      mimeType: FishjamGemini.inputMimeType,
    },
  });
});
Transparent pricing, AI included.
Fishjam uses a usage-based pricing model across all products — from video conferencing and live streaming to AI voice agents. No hidden infrastructure fees. No surprise scaling costs.
Start for free, scale as you grow.
View pricing details