Class: BuiltInAdaptiveVadProvider
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:92
Built-in voice activity detection (VAD) provider backed by the
AdaptiveVAD engine and EnvironmentalCalibrator.
This is the default VAD provider in AgentOS and requires no external dependencies or API keys. It operates entirely locally on raw audio frames.
How It Works
- The EnvironmentalCalibrator continuously estimates the ambient noise floor and spectral profile from incoming audio frames.
- The AdaptiveVAD uses the calibrator's noise profile to set dynamic thresholds for speech detection: louder environments get higher thresholds to avoid false positives.
- Each processFrame() call returns a SpeechVadDecision with isSpeech, confidence, the raw VAD result, and the current noise profile.
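The interplay between a tracked noise floor and a dynamic threshold can be illustrated with a minimal, self-contained sketch. This is not the AdaptiveVAD or EnvironmentalCalibrator implementation: the names NoiseFloorTracker, isSpeechFrame, smoothing, ratio, and minThreshold are hypothetical, and the sketch uses plain frame energy where the real engines also consider spectral shape.

```typescript
// Hypothetical illustration only; not the AgentOS AdaptiveVAD algorithm.

// Root-mean-square energy of one frame (0 for an empty frame).
function frameRms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  const sumSquares = frame.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / frame.length);
}

// Tracks the ambient noise floor with an exponential moving average.
class NoiseFloorTracker {
  private floor: number | null = null;
  private readonly smoothing: number;

  constructor(smoothing = 0.05) {
    this.smoothing = smoothing;
  }

  // Null until at least one frame has been observed.
  get value(): number | null {
    return this.floor;
  }

  update(rms: number): void {
    this.floor =
      this.floor === null ? rms : this.floor + this.smoothing * (rms - this.floor);
  }
}

// A frame counts as speech when its energy exceeds a threshold that scales
// with the tracked noise floor, so louder rooms need louder speech.
function isSpeechFrame(
  frame: Float32Array,
  tracker: NoiseFloorTracker,
  ratio = 3,
  minThreshold = 0.01,
): boolean {
  const rms = frameRms(frame);
  const floor = tracker.value ?? rms;
  const threshold = Math.max(minThreshold, floor * ratio);
  const speech = rms > threshold;
  // Adapt the floor only on non-speech frames so speech does not inflate it.
  if (!speech) tracker.update(rms);
  return speech;
}
```

In this sketch, a frame at amplitude 0.5 is flagged as speech after the tracker has seen a quiet room (floor near 0.001), but not after it has seen a noisy room (floor near 0.2), because the threshold rises to about 0.6.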
Configuration Defaults
- Sample rate: 16 kHz (standard for voice pipelines)
- Frame duration: 20 ms (320 samples per frame at 16 kHz)
- VAD and calibration: Use sensible defaults from the underlying engines
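The 320-sample figure follows from sampleRate * frameDurationMs / 1000. A small helper makes the arithmetic explicit for other configurations; samplesPerFrame is a hypothetical name used for illustration and is not part of the AgentOS API.

```typescript
// Hypothetical helper: number of samples in one frame.
function samplesPerFrame(sampleRate: number, frameDurationMs: number): number {
  const samples = (sampleRate * frameDurationMs) / 1000;
  // Reject combinations that do not yield a whole number of samples.
  if (!Number.isInteger(samples) || samples <= 0) {
    throw new RangeError(
      `sampleRate=${sampleRate} and frameDurationMs=${frameDurationMs} do not give a whole frame`,
    );
  }
  return samples;
}

console.log(samplesPerFrame(16_000, 20)); // 320 (the defaults)
console.log(samplesPerFrame(48_000, 10)); // 480
```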
See BuiltInAdaptiveVadProviderConfig for configuration options.
See AdaptiveVAD for the underlying VAD algorithm.
See EnvironmentalCalibrator for the noise profiling engine.
Example
const vad = new BuiltInAdaptiveVadProvider({
  sampleRate: 16_000,
  frameDurationMs: 20,
  vad: { minSpeechDurationMs: 100 },
});
const frame = new Float32Array(320); // 20ms at 16kHz
// ... fill frame with audio samples ...
const decision = vad.processFrame(frame);
if (decision.isSpeech) {
  console.log(`Speech detected (confidence: ${decision.confidence})`);
}
Implements
SpeechVadProvider
Constructors
Constructor
new BuiltInAdaptiveVadProvider(config?): BuiltInAdaptiveVadProvider
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:133
Creates a new BuiltInAdaptiveVadProvider.
Initializes both the environmental calibrator and the adaptive VAD engine with the provided or default configuration.
Parameters
config?
BuiltInAdaptiveVadProviderConfig = {}
Optional VAD configuration. All fields default to standard values suitable for 16kHz mono voice audio.
Returns
BuiltInAdaptiveVadProvider
Example
// Default configuration (16kHz, 20ms frames)
const vad = new BuiltInAdaptiveVadProvider();
// Custom configuration (48kHz, 10ms frames, i.e. 480 samples per frame)
const customVad = new BuiltInAdaptiveVadProvider({
  sampleRate: 48_000,
  frameDurationMs: 10,
  vad: { minSpeechDurationMs: 200 },
});
Properties
displayName
readonly displayName: "AgentOS Adaptive VAD" = 'AgentOS Adaptive VAD'
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:97
Human-readable display name for UI and logging.
Implementation of
SpeechVadProvider.displayName
id
readonly id: "agentos-adaptive-vad" = 'agentos-adaptive-vad'
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:94
Unique provider identifier used for registration and resolution.
Implementation of
SpeechVadProvider.id
Methods
getNoiseProfile()
getNoiseProfile(): NoiseProfile | null
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:207
Returns the current environmental noise profile estimated by the calibrator.
The noise profile includes the estimated noise floor RMS, spectral shape,
and confidence metrics. Returns null if insufficient audio has been
processed for a reliable estimate.
Returns
NoiseProfile | null
The current noise profile, or null if not yet calibrated.
Example
const profile = vad.getNoiseProfile();
if (profile) {
  console.log(`Noise floor: ${profile.noiseFloorRms}`);
}
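For logging, a raw RMS value is often easier to read in decibels relative to full scale. The sketch below assumes that noiseFloorRms is a linear amplitude on the same [-1.0, 1.0] scale as the input samples, which the source does not state. rmsToDbfs and describeNoiseFloor are hypothetical helpers, and the parameter type lists only the one field used in the example above.

```typescript
// Hypothetical helpers; assumes noiseFloorRms is a linear amplitude where 1.0 is full scale.

// Convert a linear RMS amplitude to dBFS (0 dBFS = full scale).
function rmsToDbfs(rms: number): number {
  return rms > 0 ? 20 * Math.log10(rms) : -Infinity;
}

// Format a noise profile for logs, handling the not-yet-calibrated case.
function describeNoiseFloor(profile: { noiseFloorRms: number } | null): string {
  if (profile === null) return "calibrating";
  return `${rmsToDbfs(profile.noiseFloorRms).toFixed(1)} dBFS`;
}
```

Under that assumption, the result of vad.getNoiseProfile() could be passed to describeNoiseFloor directly, since the null case is handled inside the helper.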
Implementation of
SpeechVadProvider.getNoiseProfile
processFrame()
processFrame(frame): SpeechVadDecision
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:162
Process a single audio frame and return a speech/non-speech decision.
This method must be called sequentially with consecutive audio frames. The VAD maintains internal state (speech onset tracking, hangover counters) that depends on temporal continuity between frames.
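One way to satisfy the ordering requirement is to slice a longer buffer into consecutive, non-overlapping frames and feed them in order. The sketch below is an assumption-laden illustration: FrameVad is a hypothetical structural type covering only the method and fields used here, and detectSpeechFrames is not part of the AgentOS API.

```typescript
// Hypothetical minimal shape of a VAD; only what this sketch needs.
interface FrameVad {
  processFrame(frame: Float32Array): { isSpeech: boolean; confidence: number };
}

// Feed consecutive frames in order and collect the per-frame decisions.
function detectSpeechFrames(
  vad: FrameVad,
  audio: Float32Array,
  frameLength: number,
): boolean[] {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const decisions: boolean[] = [];
  for (let start = 0; start + frameLength <= audio.length; start += frameLength) {
    // subarray() returns a view over the same memory, so no samples are copied.
    const frame = audio.subarray(start, start + frameLength);
    decisions.push(vad.processFrame(frame).isSpeech);
  }
  // A trailing partial frame is dropped rather than zero-padded.
  return decisions;
}
```

Dropping the trailing partial frame keeps every call at the expected length; a streaming caller would instead carry the leftover samples into the next buffer.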
Parameters
frame
Float32Array
A Float32Array of audio samples for one frame. The expected
length is sampleRate * frameDurationMs / 1000 (e.g. 320 for 16kHz/20ms).
Samples should be normalized to the range [-1.0, 1.0].
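Many capture paths deliver 16-bit signed PCM rather than floats. Assuming that is the input format (the source does not say where frames come from), a conversion along these lines produces samples in the expected range; int16ToFloat32 is a hypothetical helper name.

```typescript
// Hypothetical helper: convert 16-bit signed PCM to floats in [-1.0, 1.0).
function int16ToFloat32(pcm: Int16Array): Float32Array {
  // Dividing by 32768 maps -32768 to exactly -1.0 and 32767 to just under 1.0.
  return Float32Array.from(pcm, (sample) => sample / 32768);
}
```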
Returns
SpeechVadDecision
A decision object with isSpeech, confidence, the raw VAD result,
and the current environmental noise profile.
Example
const frame = new Float32Array(320);
// ... fill with audio samples ...
const decision = vad.processFrame(frame);
console.log(decision.isSpeech, decision.confidence);
Implementation of
SpeechVadProvider.processFrame
reset()
reset(): void
Defined in: packages/agentos/src/hearing/providers/BuiltInAdaptiveVadProvider.ts:186
Reset the VAD state for a new audio session.
Clears internal counters (speech onset tracking, hangover timers) so the VAD starts fresh. Call it when starting a new conversation turn or after a significant audio gap. It does NOT reset the environmental calibrator, so the noise profile persists across resets.
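To see why stale counters matter, consider a toy hangover gate that keeps reporting speech for a few frames after the raw detector goes quiet. This is a hypothetical illustration of the general technique, not the counters AdaptiveVAD actually keeps; HangoverGate and hangoverFrames are invented names.

```typescript
// Hypothetical illustration of hangover smoothing; not the AgentOS implementation.
class HangoverGate {
  private remaining = 0;
  private readonly hangoverFrames: number;

  constructor(hangoverFrames = 3) {
    this.hangoverFrames = hangoverFrames;
  }

  // Takes the raw per-frame decision and returns the smoothed one.
  push(rawSpeech: boolean): boolean {
    if (rawSpeech) {
      this.remaining = this.hangoverFrames; // re-arm the hangover window
      return true;
    }
    if (this.remaining > 0) {
      this.remaining -= 1; // still inside the hangover window
      return true;
    }
    return false;
  }

  // Clear the temporal state so a new session starts from silence.
  reset(): void {
    this.remaining = 0;
  }
}
```

Without the reset, the first silent frames of a new turn would still be reported as speech because the hangover window from the previous turn has not yet expired.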
Returns
void
Example
// Start a new conversation turn
vad.reset();