sup.ai.audio

The sup.ai.audio package provides AI-powered audio understanding and interpretation.

// Load an audio file
const audio = sup.audio("audio.mp3");

// Basic audio interpretation
const description = sup.ai.audio.interpret(audio);
console.log(description); // "A person speaking clearly about technology with background music..."

// Custom prompt for specific analysis
const instruments = sup.ai.audio.interpret(
  audio,
  "What musical instruments can you hear in this audio?"
);

// Analyze speech content
const transcript = sup.ai.audio.interpret(
  audio,
  "Please provide a detailed transcript of the speech in this audio."
);

Methods

sup.ai.audio.interpret()

`(audio: SupAudio, prompt?: string)` `→ string`

const audio = sup.audio("audio.wav");
const description = sup.ai.audio.interpret(audio);

Returns a text interpretation of an audio file using AI audio understanding. Depending on the prompt, it can transcribe speech, describe sound effects, identify musical instruments, or provide a more detailed analysis of the audio.

Parameters:

audio (SupAudio): The audio file to analyze
prompt (optional string): Custom instructions for the AI analysis. If omitted, a default prompt is used that requests both a transcription and an audio description.

Returns: A string containing the AI’s interpretation of the audio
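
To make the optional prompt parameter concrete, here is a minimal sketch of a wrapper that falls back to the default behavior when no usable prompt is given. The helper name interpretAudio is hypothetical and not part of the sup API; it takes sup as an argument only so that it can be tried outside the platform with a stand-in object. It assumes interpret() returns its string synchronously, as the signature above suggests.

```javascript
// Hypothetical helper, not part of the sup API.
// Calls sup.ai.audio.interpret() with a prompt only when one is actually given.
function interpretAudio(sup, audio, prompt) {
  // Treat a missing, empty, or whitespace-only prompt as "no prompt" so the
  // documented default (transcription plus audio description) applies.
  const trimmed = typeof prompt === "string" ? prompt.trim() : "";
  return trimmed === ""
    ? sup.ai.audio.interpret(audio)
    : sup.ai.audio.interpret(audio, trimmed);
}
```

Inside the platform this would be called as interpretAudio(sup, sup.audio("audio.wav"), userPrompt), where userPrompt may be undefined or blank.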

Examples:

// Basic interpretation - transcribes speech and describes audio
const audio = sup.audio("recording.mp3");
const description = sup.ai.audio.interpret(audio);

// Custom analysis for music
const musicAnalysis = sup.ai.audio.interpret(
  audio,
  "Identify the musical genre, instruments, and mood of this audio."
);

// Focus on speech transcription
const transcript = sup.ai.audio.interpret(
  audio,
  "Please provide an accurate transcript of all spoken words in this audio."
);

// Identify sound effects
const soundEffects = sup.ai.audio.interpret(
  audio,
  "What sound effects or non-speech audio can you hear? Describe them in detail."
);
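
The examples above call interpret() several times on the same audio with different prompts. One way to group such calls is sketched below. The helper runPrompts and the shape of its prompts argument are assumptions made for illustration, not part of the sup API; sup is passed in so the function can be exercised with a stand-in object.

```javascript
// Hypothetical helper, not part of the sup API.
// Runs several named prompts against one audio clip and collects the results.
function runPrompts(sup, audio, prompts) {
  const results = {};
  for (const [name, prompt] of Object.entries(prompts)) {
    // One interpret() call per prompt; each result is stored under its name.
    results[name] = sup.ai.audio.interpret(audio, prompt);
  }
  return results;
}
```

For instance, runPrompts(sup, audio, { genre: "Identify the musical genre.", transcript: "Transcribe all spoken words." }) would return an object with genre and transcript keys. Each prompt triggers a separate interpretation, so a single combined prompt may be preferable when the number of calls matters.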

Notes

The AI model used for audio interpretation is Gemini 3 Flash, which supports both speech transcription and general audio understanding.
Audio files can be loaded using sup.audio() with URLs or file paths.
The default behavior (when no custom prompt is provided) includes both speech transcription and detailed audio description.
Supported formats include MP3, WAV, M4A, and other common audio formats.
For best results with speech transcription, use clear audio with minimal background noise.
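
As a rough illustration of the format note, the sketch below checks whether a path or URL ends in one of the three extensions named above before it is handed to sup.audio(). The helper hasKnownAudioExtension and its extension list are assumptions for illustration only. Because the notes say other common formats are also supported, a false result means "not one of the three named formats", not "unsupported".

```javascript
// Only the formats named explicitly in the notes; the real list is longer.
const KNOWN_AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a"]);

// Hypothetical helper, not part of the sup API.
function hasKnownAudioExtension(pathOrUrl) {
  // Drop any query string or fragment before looking at the extension.
  const clean = pathOrUrl.split(/[?#]/)[0];
  const dot = clean.lastIndexOf(".");
  if (dot === -1) return false;
  // Compare case-insensitively so "CLIP.WAV" and "clip.wav" are treated alike.
  return KNOWN_AUDIO_EXTENSIONS.has(clean.slice(dot + 1).toLowerCase());
}
```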
