India's Open TTS Data Infrastructure — methodology, dataset, codec, sentence bank, benchmark, and sub-1B SLM designed to scale to 100+ Indian language varieties. Released as open infrastructure under a permissive license.
India has open ASR infrastructure for 100+ languages. It has nothing equivalent for TTS. svara builds that — with better data, not more data.
svara is a proof of methodology designed for 100+ Indian language varieties. Model, dataset, codec, sentence bank, eval benchmark, and recording playbook — all released as open infrastructure.
Strong foundations exist for Indic speech, and they enabled svara v1 training. But gaps remain at production scale: limited code-mixing, word skipping in non-Hindi languages, long-form instability, prosody degradation over long contexts, text normalisation gaps, and inconsistency in low-resource languages. No expressivity or voice design.
English-centric or thin per-language. No dialect granularity, no expressivity, no Indic phonetic fidelity in codec, no on-device.
LLM-based discrete audio tokens. Expressivity & emotion control, voice cloning, code-switching, Indic-finetuned codec, robust text normalisation (WFST), 100+ varieties, sub-1B on-device.
A founding team that has shipped AI to millions.
Visionary engineering leader with 20+ years across enterprise and public-sector tech. Previously led engineering at NIIT and Manipal. Drives Kenpath's partnerships with EkStep (Nandan Nilekani), Gates Foundation, Intel, and IIIT Bangalore.
Decades building robust, scalable systems for mission-critical deployments. Architected svara's production inference stack (vLLM, FastAPI, Docker, GGUF, CUDA). Leads platform engineering, on-device deployment via OpenVINO and NPU.
NLP, speech AI, agentic systems. Led svara v1 and Voiceclone-beta. Research collaborator on Urbanization from Within (Oxford University Press, 2026) and Efficient Neural Geo-referencing of Map Collections (2023). Leads svara SLM architecture, training, and evaluation.
Strategy and client engagement leader driving Kenpath's partnerships with governments, enterprises, and mission-driven organisations. Translates complex AI capabilities into measurable outcomes at scale.
Experienced project manager steering Kenpath's engagements from first pilot to production deployment. Brings rigour to planning, delivery, and stakeholder coordination — ensuring complex AI programs ship on schedule.
Kenpath builds and operates AI platforms at population scale across agriculture, governance, dairy, accessibility, and edge retail.
AI for Farmer Producer Orgs — processing, packaging, branding, market access.
Wadhwani Foundation
Multilingual agentic interface for policies, subsidies & approvals.
64 industries · 17+ ministries
Edge checkout with real-time produce recognition. ML-Ops on-device.
<100ms · 65%+ faster
AI suite with voice + WhatsApp. SaaS across India, Brazil, Africa.
50+ orgs · 87%+ accuracy
AI-powered learning & navigation tools for the visually impaired.
Enable India · 100K+ lives
Personalised career journeys using conversational intelligence.
UK · Neuroscience-driven
Bharat-VISTAAR & Amul AI recognised by PM Modi. Launched by Union Agriculture Minister. Kenpath as tech partner.
Panel with C.V. Sridhar (Amaravati Quantum Mission, AP), Anusha Dandapani & Sameer Chauhan (UNICC), Frugal AI Hub.
With EkStep Foundation (Nandan Nilekani) — designing voice AI for India's national citizen helplines at scale.
Aditya Chhabra serves on the Ministry of Education evaluation committee for the TANUH AI CoE for Healthcare at IISc.
Presenting India's sovereign voice AI and population-scale DPI to the international AI community.
Showcasing population-scale AI platforms and open-source svara TTS.
With Chief Secretary Maharashtra Vikas Rastogi. Hosted by Govts of India & Maharashtra and AIAIC.
National Agri DPI launched by Union Agriculture Minister in Jaipur. Kenpath as core technology partner.
Expressive emotion tags bring natural warmth, humour, and feeling to synthesized speech.
kenpath.io/en/open-source/svara
Deterministic WFST-based text normalisation for 19 Indian languages. Critical preprocessing for any TTS pipeline — converts numbers, dates, currency, measurements into natural spoken form at 1–5ms latency. Open-sourced under Apache 2.0.
1–5ms WFST traversal vs 500ms local LLM vs 2000ms API LLM. Same input always produces same output — no hallucinations, no temperature tuning. Runs on CPU anywhere. Born from real-time TTS needs with svara.
Built on NVIDIA NeMo Text Processing framework (Pynini). Extended by Kenpath for all 19 Indic languages with language-specific rules for each semiotic class.
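The determinism argument above can be sketched in plain Python. This is a toy rule cascade, not svara's actual grammar: one rewrite rule per semiotic class, applied in a fixed order, so the same input always yields the same output. The real pipeline compiles such rules into Pynini WFSTs per language; the rupee rule here spells digits instead of expanding place value, purely to keep the sketch small.

```python
import re

# Toy deterministic normaliser illustrating the WFST rule-cascade idea.
# One rewrite per semiotic class, applied in a fixed order -- no model,
# no sampling, so identical inputs always produce identical outputs.
# Illustrative only: svara's real grammars are compiled Pynini WFSTs.

ONES = "zero one two three four five six seven eight nine".split()

def spell_digits(s: str) -> str:
    # Spell each digit; a full grammar would expand place value instead.
    return " ".join(ONES[int(d)] for d in s)

RULES = [
    # currency: "₹45" -> "four five rupees" (digit-by-digit in this sketch)
    (re.compile(r"₹(\d+)"), lambda m: spell_digits(m.group(1)) + " rupees"),
    # plain cardinals: "108" -> "one zero eight"
    (re.compile(r"\d+"), lambda m: spell_digits(m.group())),
]

def normalise(text: str) -> str:
    for pattern, rewrite in RULES:
        text = pattern.sub(rewrite, text)
    return text

print(normalise("Call 108 for help, fee ₹45"))
# -> "Call one zero eight for help, fee four five rupees"
```

Because the cascade is a pure function over rules, it runs in microseconds on any CPU, which is where the 1–5ms figure (for full WFST traversal) comes from.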
Existing Indic TTS datasets cover literary registers of major languages — but miss dialects, prosodic diversity, emotion, and code-switching. TTS doesn't need 100,000 hours. It needs ~2 hours of phonetically designed data per variety — that's what makes 100+ varieties viable.
Trained svara v1 on SYSPIN, IndicTTS, RASA, SPICOR — ~2,000 hours across 22 languages. These are excellent but have a ceiling: standard literary registers, limited dialect granularity, limited prosodic and emotional diversity.
~1000 phonetically designed sentences per variety via set-cover algorithms. ~200 parallel core sentences across all varieties as cross-lingual anchors for voice cloning and dialect transfer. Composed natively, never translated.
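The set-cover selection described above can be sketched with the standard greedy heuristic: repeatedly pick the sentence that covers the most still-uncovered target phones. The candidate sentences and phone sets below are hypothetical; real selection also weights prosodic targets and phone frequency.

```python
# Greedy set-cover sketch for sentence-bank design: choose few
# sentences whose combined phone inventory covers a target phone set.

def select_sentences(candidates, target_phones):
    """candidates: list of (sentence, phone_set). Returns chosen sentences."""
    uncovered = set(target_phones)
    chosen = []
    while uncovered:
        # Pick the sentence covering the most still-uncovered phones.
        best_sentence, best_phones = max(
            candidates, key=lambda c: len(c[1] & uncovered)
        )
        if not (best_phones & uncovered):
            break  # remaining phones are not coverable by any candidate
        chosen.append(best_sentence)
        uncovered -= best_phones
    return chosen

# Hypothetical candidates with IPA phone sets.
candidates = [
    ("sentence A", {"k", "a", "ʈ"}),    # contains a retroflex
    ("sentence B", {"kʰ", "a", "n"}),   # contains an aspirated stop
    ("sentence C", {"a", "n"}),
]
target = {"k", "a", "ʈ", "kʰ", "n"}
print(select_sentences(candidates, target))  # -> ['sentence A', 'sentence B']
```

The greedy heuristic is a natural fit here because it gives a provable logarithmic approximation to the optimal cover, which is why ~500–1000 designed sentences can cover a variety's phonetic and prosodic targets.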
Studio (48kHz/24-bit) trains the TTS model. Mobile app (24kHz, on-device neural denoising) trains the codec. You don't need a studio in every village — mobile tier makes tribal and remote dialects viable.
Base model trained on 22 languages. Each new dialect = ~2 hours of designed data + 1 LoRA adapter. 100 varieties ≈ 200 hours total. Fair-paid native speakers, 2–4 per variety, gender and age balanced.
A sub-1B SLM over discrete audio tokens — small enough for on-device, powerful enough for 100+ varieties. Evaluation goes beyond MOS and WER to measure what actually matters for Indian languages.
Language model generates audio token sequences, decoded to waveform. Single-GPU trainable. Deployable on Android, Intel Core Ultra / Xeon / Panther Lake CPUs, Qualcomm NPUs via GGUF. Near-zero per-call cost for Bhashini population-scale distribution.
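The generate-then-decode flow above can be illustrated with a toy decoder: the language model emits discrete codec token ids, and the codec maps each id to a short waveform frame. The codebook values and frame size here are made up; a real neural codec decodes tokens through a learned decoder network, not a lookup table.

```python
# Toy sketch of discrete-token TTS decoding. The LM's output is a
# sequence of codec token ids; decoding concatenates one waveform
# frame per token. Codebook and FRAME are hypothetical -- real codecs
# use learned decoders and frames of e.g. 320 samples at 16 kHz.

FRAME = 4  # samples per token in this sketch

# Hypothetical codebook: token id -> waveform frame (float samples).
CODEBOOK = {
    0: [0.0, 0.0, 0.0, 0.0],
    1: [0.1, 0.3, 0.1, -0.2],
    2: [-0.3, 0.2, 0.4, 0.0],
}

def decode(token_ids):
    """Concatenate codebook frames into one waveform (list of samples)."""
    wav = []
    for t in token_ids:
        wav.extend(CODEBOOK[t])
    return wav

tokens = [1, 2, 0]            # as if emitted by the language model
wav = decode(tokens)
assert len(wav) == len(tokens) * FRAME
```

Because generation is just next-token prediction over a small vocabulary, the same model runs under any LM runtime, which is what makes GGUF/NPU on-device deployment feasible.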
Finetune audio codec on Indic speech so tokens natively represent retroflex consonants, aspirated stops, nasal vowels, breathy phonation — sounds that English-trained codecs handle poorly. Released as standalone open infrastructure.
Stage 1: Train base model from scratch on 22 scheduled languages. Stage 2: Per-dialect LoRA adaptation from ~2 hours. Cross-lingual transfer via parallel core sentence bank — same speaker, same meaning, different language.
Phonological fidelity (retroflex, aspiration, nasals). Morphological integrity. Semantic adequacy. Sociolinguistic appropriateness — disaggregated by age, register, code-switching. Community-led usability with paid native evaluators.
Seven open-source artefacts, released under IndiaAI-approved licensing terms and uploaded to AIKosh — so any community, in India or globally, can extend coverage to their own language using the same methodology and tools.
Sub-1B multilingual model. ~50 varieties in 12 months, scales to 100+. On-device via NPU/GGUF.
Finetuned on Indic audio — tokens natively represent retroflex, aspirated stops, nasal vowels.
Studio (48 kHz) + mobile audio. ~50 varieties, ~120 fair-paid native speakers, full annotation.
Per-variety sets (~500–1000 sentences) via set-cover algorithms over phonetic & prosodic targets.
Phonological fidelity, morphology, semantics, sociolinguistic appropriateness, community-led usability.
Sentence-design toolkit, mobile recording app, neural-denoising pipeline, eval harness, playbook.
Production-ready: Docker, vLLM, FastAPI, GGUF, CUDA, REST API, streaming & voice cloning.
All artefacts uploaded to AIKosh, HuggingFace & GitHub under IndiaAI-approved licensing.
Built for Bharat. Runs anywhere — on a feature phone, a cloud server, or an edge device.
Voice-first farmer advisory in dialects farmers actually speak — Awadhi for UP, Marwari for Rajasthan, Telangana Telugu for Telangana.
Multilingual IVR and voice bots for central, state, and municipal helplines. Grievance redressal, scheme discovery, tax helplines.
Symptom checkers, ICDS and Poshan advisory, tele-consultation support, vaccination reminders for low-literacy populations.
NCERT and state-board AI tutors that read textbooks, explain concepts, and converse in mother tongue. Spoken English practice for rural students.
Screen readers, navigation assistants, and audio description of visual content for the visually impaired in native Indian languages.
Voice-based banking, insurance, and micro-credit interfaces. Fraud-awareness voice bots for rural customers in their native language.
Contact centres, voice-described product catalogues, content narration for media and e-commerce — real-time voice-to-voice translation via LiveKit.
Runs on Intel Xeon / Panther Lake CPUs, Core Ultra NPUs, Qualcomm NPUs, and commodity Android. Fully offline voice AI for privacy-sensitive workloads.
Each cycle ends with a public milestone release on HuggingFace, AIKosh & GitHub.
Personnel 1.2Cr · Mobile app + tooling 0.4Cr · Ops + legal 0.2Cr · Workshops + papers 0.2Cr
Speaker fees 1.2Cr · Linguistic design 1.0Cr · Studio + mobile rig 0.4Cr · Field travel 0.2Cr · Community eval 0.2Cr
Via empanelled cloud providers. H100 SXM at ~INR 92/GPU/hour subsidised rate.
Managed inference for banks, PSUs, telecom, healthcare. Domain-specific voice agents. Custom voice creation. Bhashini distribution at near-zero per-call cost. 500K+ organic downloads validate enterprise demand.
H100 SXM required for from-scratch training of sub-1B core across 50+ varieties with long audio sequences and large batch sizes. Phased ramp: 8 GPUs while collecting data → 16 for base training → 24 at peak multi-variety intensity.
Ministry of Agriculture (Govt of India), Government of Maharashtra, Government of Uttar Pradesh, Intel, EkStep Foundation (Nandan Nilekani), People+ AI, Wadhwani Foundation, Gates Foundation (via programme sponsors), IIIT Bangalore (CoSS), IIT Jammu, Selco, Enable India, Apurva AI, Open Inclusion, MetLife, and others.
| Name | Email | Phone |
|---|---|---|
| Muneender Jillella, Founder & CEO | muneender@kenpath.io | +91 98867 35532 |
| Aditya Chhabra, Chief Data Scientist | aditya@kenpath.io | +91 81780 52333 |
| Thyagarajan Babu, CTO | rajan@kenpath.io | +91 98454 56279 |
kenpath.io
huggingface.co/kenpath
github.com/Kenpath
Permissive license