India's Open Voice Foundation Model — a sub-1B parameter SLM over discrete audio tokens, covering 22 scheduled languages and ~30 major dialects, released as open infrastructure.
India can build world-class voice infrastructure for its 22 scheduled languages on a fraction of the global-frontier budget — by collecting better data, not more data.
Svara v3 is a proof of methodology. The model, dataset, codec, sentence bank, evaluation benchmark, and recording playbook are all released as open infrastructure — for any community in any country to build voice AI for languages currently invisible to commercial AI.
Kenpath builds and operates AI platforms at population scale across agriculture, governance, dairy, accessibility, and edge retail. Live impact across our flagship deployments.
Six AI/ML engineers report to the Chief Data Scientist; per-variety native linguists and language experts engaged on contract. Domain-expert advisors from IIT Jammu and IISc Bangalore.
B.E. Computer Science, Osmania University. 20+ years engineering leadership; previously NIIT and Manipal. Founded Kenpath in 2021. Strategic lead since inception; partnerships with EkStep (Nandan Nilekani), Gates Foundation, Intel, and IIIT Bangalore.
Research-driven data scientist specialising in NLP, speech AI, and agentic systems. Led research and engineering for Svara v1 and Voiceclone-beta. Will lead v3 model architecture, data design, training, and evaluation.
Architected and scaled the production inference stack (vLLM, FastAPI, Docker, GGUF, CUDA decoders) supporting streaming audio. Will lead v3 inference, mobile-app pipelines, and on-device deployment via OpenVINO and NPU optimisation.
Kenpath is the technology partner across these engagements — owning architecture, model fine-tuning, platform engineering, and production operations.
Voice-first agentic AI for crop, pest, weather, mandi, and govt schemes. Real-time advisory by phone — no smartphone, no app required.
2.5M+ farmers · 20K+ daily interactions
India's national Agri DPI for population-scale farmer advisory. Network-of-networks architecture connecting states. Launched at AI Impact Summit 2026.
140M+ target reach · 1.2M+ in first month
Conversational AI and Gujarati voice layer for India's largest dairy cooperative. Real-time guidance on cattle health and productivity. Endorsed by PM at AI Impact 2026.
3.6M+ women dairy farmers · Gujarat
Conversational AI surfacing sectoral policies, incentives, and approval pathways for investors. Built on UP's portal and Niveshmitra single-window framework.
64 industries · 17+ ministries integrated
Built end-to-end with the Centre for Exponential Change. AI product suite (MAP, LENS, COMPASS, THREAD), voice + WhatsApp interfaces, SaaS for systems change.
50+ organisations · 87%+ accuracy feedback
ML-Ops pipeline for automated retraining on edge devices. OpenTelemetry observability + multi-tenant SaaS architecture. Vision models for produce recognition.
65%+ checkout time reduced · <100ms latency

The target is not volume but design. v3 will collect roughly 600–800 hours of speech engineered for what TTS actually needs — phonetic and prosodic coverage, studio-grade fidelity, fair-paid native speakers, linguist-validated annotation, and open release under community-rooted governance.
Te Hiku Media built te reo Māori ASR that beats commercial systems by collecting 300 hours in 10 days through community-led campaigns and governing it under the Kaitiakitanga License.
~1100 carefully designed sentences with full diphone coverage (the recipe behind the CMU Arctic corpora) produce studio-quality TTS — proof that designed minimal data outperforms brute-force scale for synthesis.
Just 1 hour of neutral speech and 30 minutes per emotion, syllabically balanced, yields usable expressive TTS in low-resource Indian languages. Validated by Interspeech 2025 PEFT work.
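The "designed minimal data" claims above all reduce to a coverage optimisation: choose the smallest sentence set whose phonetic units (e.g. diphones) cover a target inventory. A minimal sketch of the greedy set-cover heuristic, using toy Latin-script sentences and characters as stand-ins for phones; real sentence banks would operate over phonemised Indic text:

```python
def diphones(sentence: str) -> set[str]:
    """Adjacent-unit pairs; here characters stand in for phones."""
    s = sentence.replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}

def greedy_cover(candidates: list[str], targets: set[str]) -> list[str]:
    """Greedy set cover: repeatedly pick the sentence that covers
    the most still-uncovered target units."""
    chosen, uncovered = [], set(targets)
    while uncovered:
        best = max(candidates, key=lambda s: len(diphones(s) & uncovered))
        gain = diphones(best) & uncovered
        if not gain:  # remaining targets appear in no candidate
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

bank = ["the cat sat", "a red fox ran", "dogs bark loudly"]
targets = set().union(*(diphones(s) for s in bank))
selected = greedy_cover(bank, targets)
print(len(selected), "sentences cover", len(targets), "diphones")
```

The greedy heuristic is the standard approximation for set cover; in practice the candidate pool is a large scraped or authored corpus and the targets also include prosodic contexts, not just diphones.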
Seven open-source artefacts. Every deliverable releases openly to the global research community — particularly to communities of low-resource and indigenous-language speakers worldwide who can adopt the same methodology.
| Artefact | License | Description |
|---|---|---|
| Svara v3 TTS Model | Apache 2.0 | Sub-1B parameter multilingual TTS covering ~50 Indian language varieties; trained from scratch on the v3 designed corpus |
| Svara-NeuCodec-Indic | Inherits upstream | NeuCodec audio codec finetuned on Indic dialect audio; tokens represent retroflex consonants, aspirated stops, nasal vowels, phonation distinctions |
| Svara v3 Speech Dataset | CC-BY-SA · CARE | ~600–800 hours of studio audio (48 kHz/24-bit) and mobile audio (24 kHz), ~50 varieties, ~120 fair-paid native speakers, full annotation |
| Svara Sentence Bank | CC-BY-SA | Per-variety designed sentence sets (~500–1000 utterances each), built using set-cover algorithms over phonetic and prosodic targets |
| Svara Eval Benchmark | Open | Multi-dimensional eval suite — phonological fidelity, morphological integrity, semantic adequacy, sociolinguistic appropriateness, practical usability |
| Svara Methodology Pack | Apache 2.0 | Sentence-design toolkit (set-cover algorithm), mobile recording app, neural-denoising cleanup pipeline, evaluation harness, recording playbook |
| Svara Inference Toolkit | Apache 2.0 | Production-ready package: Docker, vLLM, FastAPI, GGUF, CUDA decoders, ElevenLabs-compatible REST API, streaming audio + zero-shot voice cloning |
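Because the inference toolkit exposes an ElevenLabs-compatible REST surface, existing client code can be re-pointed at a self-hosted endpoint. A hedged sketch of what such a request could look like, where the base URL, voice ID, and model ID are illustrative placeholders rather than published values:

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/v1"   # hypothetical self-hosted Svara endpoint
VOICE_ID = "svara-hi-female-01"         # illustrative voice ID

# ElevenLabs-style request body: text plus model and voice settings.
payload = {
    "text": "नमस्ते, आपका स्वागत है।",
    "model_id": "svara-v3",             # illustrative model ID
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
}

req = Request(
    url=f"{BASE_URL}/text-to-speech/{VOICE_ID}/stream",
    data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
    headers={"Content-Type": "application/json", "xi-api-key": "YOUR_KEY"},
    method="POST",
)
print(req.full_url)
```

Streaming responses would arrive as chunked audio on the same endpoint; a client would read the response body incrementally rather than buffering the full clip.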
Svara is a foundation layer that powers voice experiences across sectors. Sub-1B and open-source — deploys on-device, on-premise, or in the cloud with no vendor lock-in.
Voice-first farmer advisory in dialects farmers actually speak — Awadhi for UP, Marwari for Rajasthan, Rayalaseema Telugu for southern Andhra Pradesh.
Multilingual IVR and voice bots for central, state, and municipal helplines across 22 scheduled languages. Grievance redressal, scheme discovery, tax helplines.
Symptom checkers, ICDS and Poshan advisory, tele-consultation support, vaccination reminders in regional languages and for low-literacy populations.
NCERT and state-board AI tutors that read textbooks, explain concepts, and converse in mother tongue. Spoken English practice for rural and semi-urban students.
Screen readers, navigation assistants, and learning tools for the visually impaired. Extends Blimey's 100K+ lives to all Indian languages.
Voice-based banking, insurance, and micro-credit interfaces for UPI, PMJDY, PMFBY, Jan Suraksha. Fraud-awareness voice bots for rural customers.
Contact centres, customer service, telehealth, food delivery, and ride-hailing agents via LiveKit. Real-time voice-to-voice translation for cross-border CX.
Sub-1B variant runs on Intel Core Ultra NPUs, Qualcomm NPUs, and commodity Android. Fully offline voice AI for privacy-sensitive workloads.
Cash grant + IndiaAI compute subsidy via empanelled cloud providers. The 12-month plan is structured into four 3-month release cycles with a public deliverable each cycle.
| Phase | Months | GPU Type | GPU Count & Duration |
|---|---|---|---|
| Codec finetuning + experiment infra | 1–3 | H100 SXM | 8 GPUs for 3 months |
| Stage 1 base TTS training · 22 scheduled languages | 4–6 | H100 SXM | 16 GPUs for 3 months |
| Full v3 base + Stage 2 dialect adaptations + ablations | 7–12 | H100 SXM | 24 GPUs for 6 months |
| Eval, demo, research inference hosting | 1–12 | H100 SXM | 4 GPUs continuous (avg) |
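Taken together the phases sum to a bounded H100 budget. A quick tally, assuming the final phase runs the full six months of months 7–12 and converting at roughly 730 hours per month:

```python
# (gpus, months) per row of the compute table above
phases = {
    "codec finetuning + infra":     (8, 3),
    "stage-1 base TTS":             (16, 3),
    "full base + dialect adapters": (24, 6),   # months 7-12
    "eval / demo hosting":          (4, 12),   # continuous average
}

gpu_months = sum(g * m for g, m in phases.values())
gpu_hours = gpu_months * 730                   # ~730 hours per month
print(gpu_months, "H100 GPU-months, about", gpu_hours, "GPU-hours")
```

This kind of explicit tally is what an IndiaAI compute subsidy request would be priced against.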
Each cycle ends with a public milestone release on HuggingFace and GitHub — partners, governments, and the developer community can adopt capabilities as they come online.
NeuCodec finetuned on Indic dialect audio; sentence-design toolkit released; mobile recording app v1 with on-device quality screening; community-governance framework documented; first 5 stage-1 sentence banks crafted with native linguists. Public GitHub release.
Studio recording complete for 22 scheduled Indian languages (~250 hrs total); Svara v3 base model trained from scratch on stage-1 corpus; multi-dimensional evaluation benchmark v1 published; first per-variety sentence-bank releases; preview model on HuggingFace.
Studio + mobile recording for 25–30 major dialects (Awadhi, Bhojpuri, Magahi, Marwari, Garhwali, Sylheti, Telangana, Rayalaseema, Madurai, Kongu, Varhadi, etc.); per-dialect LoRA adaptation; expanded benchmark; second public release of v3.
Full v3 stack covering ~50 varieties under Apache 2.0; complete Indic dialectal voice benchmark publication; OpenVINO and Qualcomm NPU optimisation for on-device deployment; technical paper submission; national developer workshops.
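Cycle 3's per-dialect adaptation keeps the release tractable because LoRA ships only a low-rank pair (A, B) per dialect, adapting a frozen weight W as W + (α/r)·B·A instead of duplicating the full matrix. A toy arithmetic sketch of the per-layer saving, with illustrative dimensions rather than Svara's actual layer sizes:

```python
# Toy dimensions; real TTS transformer layers are larger.
d_out, d_in, r = 1024, 1024, 8

full = d_out * d_in            # params to finetune the whole matrix W
lora = r * d_in + d_out * r    # params in the low-rank pair (A, B)

print(f"full finetune: {full:,} params per layer")
print(f"LoRA rank-{r}: {lora:,} params per layer "
      f"({100 * lora / full:.2f}% of full)")
```

At rank 8 each dialect adapter is under 2% of the base layer's parameters, which is what makes shipping 25–30 dialect adaptations alongside one base model practical.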
Ministry of Agriculture (Govt of India), Government of Maharashtra, Government of Uttar Pradesh, Intel, EkStep Foundation (Nandan Nilekani), People+ AI, Wadhwani Foundation, Gates Foundation (via programme sponsors), IIIT Bangalore (CoSS), IIT Jammu, Selco, Enable India, Apurva AI, Open Inclusion, MetLife, and others.
| Name | Email | Phone |
|---|---|---|
| Muneender Jillella, Founder & CEO | muneender@kenpath.io | +91 98867 35532 |
| Aditya Chhabra, Chief Data Scientist | aditya@kenpath.io | +91 81780 52333 |
| Thyagarajan Babu, CTO | rajan@kenpath.io | +91 98454 56279 |