VoiceAgentRAG: Solving the RAG Latency Bottleneck in Real-Time Voice Agents Using Dual-Agent Architectures

Abstract

Dual-agent memory router: a Slow Thinker predicts likely follow-up topics and prefetches the matching chunks into a FAISS cache with sub-millisecond lookups, while a Fast Talker reads only from that cache. This removes vector-database latency from the real-time voice path.
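
The split between the two agents can be illustrated with a minimal Python sketch. It is not the authors' implementation: a plain dictionary stands in for the FAISS cache, and `REMOTE_DB`, `FOLLOW_UPS`, `ChunkCache`, `slow_thinker`, and `fast_talker` are hypothetical names introduced here for illustration. The follow-up prediction is a fixed lookup table, where a real system would presumably use a model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChunkCache:
    """In-memory stand-in for the local FAISS cache (topic -> chunks)."""
    store: Dict[str, List[str]] = field(default_factory=dict)

    def put(self, topic: str, chunks: List[str]) -> None:
        self.store[topic] = chunks

    def get(self, topic: str) -> List[str]:
        # A miss returns an empty list; in this sketch the Fast Talker
        # never falls back to the remote store.
        return self.store.get(topic, [])


# Hypothetical remote vector DB: the slow path kept off the answer loop.
REMOTE_DB: Dict[str, List[str]] = {
    "pricing": ["Plan A costs $10/month."],
    "refunds": ["Refunds are issued within 14 days."],
}

# Hypothetical follow-up map (assumption: stands in for topic prediction).
FOLLOW_UPS: Dict[str, List[str]] = {"pricing": ["refunds"]}


def slow_thinker(current_topic: str, cache: ChunkCache) -> None:
    """Predict likely follow-up topics and prefetch their chunks."""
    for topic in FOLLOW_UPS.get(current_topic, []):
        if topic in REMOTE_DB:
            cache.put(topic, REMOTE_DB[topic])


def fast_talker(topic: str, cache: ChunkCache) -> List[str]:
    """Answer-time retrieval: reads the cache only, no remote call."""
    return cache.get(topic)


cache = ChunkCache()
before = fast_talker("refunds", cache)  # miss: nothing prefetched yet
slow_thinker("pricing", cache)          # user is currently asking about pricing
after = fast_talker("refunds", cache)   # hit: chunk was prefetched
```

In this sketch the remote lookup happens only inside `slow_thinker`, so the answer path in `fast_talker` touches nothing slower than the local cache. In a real deployment the Slow Thinker would run concurrently with the conversation rather than as a direct call.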
