Payload CMS APIs
Explore APIs managed through Payload CMS.

Text Translation
Text translation is the process of converting written content from one language into another while preserving its meaning, tone, and intent. It involves more than just replacing words; translators must consider grammar, cultural nuances, idiomatic expressions, and context to ensure the target text reads naturally and accurately reflects the original. This can be done manually by human translators, who rely on linguistic expertise and cultural understanding, or through machine translation systems, which use algorithms and large language models to generate translations quickly. Effective translation balances accuracy with fluency, making the text accessible and engaging for the intended audience.

Speech to Text
Sarvam’s Speech to Text API transforms spoken words into precise, readable text with exceptional speed and accuracy. It works seamlessly with both live audio streams and recorded files, handling India’s rich variety of languages, accents, and background noise. Designed for demanding environments like customer support, media production, and accessibility services, it also supports custom vocabularies and domain-specific optimizations for even higher accuracy.

ASR Translate
Sarvam’s ASR Translate API merges speech recognition and translation into a single, streamlined process. It listens to spoken input, transcribes it, and translates the result into another language, which makes it well suited to bridging communication gaps in multilingual contexts. With deep support for India’s linguistic diversity and resilience to noisy or informal speech, it is designed to keep translations contextually accurate and ready for real-world use.
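
The "recognize, then translate" flow described above can be illustrated with a small sketch. This is not Sarvam's client or API: `make_asr_translate`, `fake_transcribe`, `fake_translate`, and the language code `en-IN` are hypothetical names chosen for illustration, and the two stand-in functions only mimic what a real speech recognizer and translator would return.

```python
from typing import Callable, Dict

# Hypothetical type aliases: a transcriber maps audio bytes to text,
# a translator maps (text, target language code) to translated text.
Transcriber = Callable[[bytes], str]
Translator = Callable[[str, str], str]


def make_asr_translate(transcribe: Transcriber, translate: Translator):
    """Combine the two stages so callers see a single step."""

    def run(audio: bytes, target_language: str) -> Dict[str, str]:
        transcript = transcribe(audio)                       # stage 1: speech -> text
        translation = translate(transcript, target_language)  # stage 2: text -> text
        return {
            "transcript": transcript,
            "translation": translation,
            "target_language": target_language,
        }

    return run


# Stand-ins for illustration only; a real service would run models here.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_language: str) -> str:
    glossary = {("namaste", "en-IN"): "hello"}
    return glossary.get((text, target_language), text)


asr_translate = make_asr_translate(fake_transcribe, fake_translate)
result = asr_translate(b"namaste", "en-IN")
```

Keeping the two stages as injected functions means either one can be swapped (for example, a different translator) without changing the caller, which is one plausible reason to expose the pair behind a single entry point.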

Text to Speech
Bring your text to life with Sarvam’s Text to Speech API, which generates natural, expressive voices in multiple Indian languages and dialects. Perfect for applications such as voice assistants, automated phone systems, and accessibility tools, this API offers customization over pitch, pace, and pronunciation. It’s built to deliver clear, human-like audio output while integrating effortlessly into voice-driven products.
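
As a rough sketch of what "customization over pitch, pace, and pronunciation" might look like from the caller's side, the snippet below validates a request before it would be sent. Everything here is an assumption for illustration: the `TtsRequest` class, its field names, the default language code, and the allowed ranges are hypothetical and are not taken from Sarvam's documentation.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TtsRequest:
    """Hypothetical request shape; field names and ranges are assumptions."""

    text: str
    language_code: str = "hi-IN"  # assumed code format
    pitch: float = 0.0            # assumed range: -1.0 (lower) to 1.0 (higher)
    pace: float = 1.0             # assumed range: 0.5 (slower) to 2.0 (faster)

    def __post_init__(self) -> None:
        # Reject obviously invalid input before any network call is made.
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not -1.0 <= self.pitch <= 1.0:
            raise ValueError("pitch must be between -1.0 and 1.0")
        if not 0.5 <= self.pace <= 2.0:
            raise ValueError("pace must be between 0.5 and 2.0")


def build_payload(request: TtsRequest) -> dict:
    """Turn the validated request into a JSON-serializable dictionary."""
    return asdict(request)


payload = build_payload(TtsRequest(text="namaste", pace=1.2))
```

Validating on construction keeps bad values from reaching the service at all; the actual parameter names and limits should be taken from the API's own reference.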