Flagship effort

Teach the moon to speak Koshur

Zoon means moon. It is also the largest open speech corpus ever attempted for Kashmiri — built so the language can live inside the voice technology the rest of the world takes for granted.

The idea

A voice the machines have never been taught to hear — until now.

How Zoon is built

Two halves of one voice.

Expressive text-to-speech (TTS)

700 hours, full of feeling

Around 8 dedicated speakers record 700 hours — not flat studio reading, but speech carrying real emotion: laughter, lament, lullaby. Enough for a voice that actually sounds Kashmiri.

Speech-to-text

500 hours, 500 voices

500 hours gathered from about 500 everyday speakers — every district, dialect, age and accent — so recognition works for the whole valley, not just newsreaders.

Curate & align

Cleaned, labelled, aligned

Every clip is transcribed, timestamped and quality-checked — turning raw recordings into data a model can actually learn from.

Release openly

Free for everyone

The finished corpus is released openly for researchers, developers and dreamers — anyone who wants to build Kashmiri into their tools.

Your part

Add your voice

The speech-to-text half needs hundreds of ordinary speakers. Five minutes of reading aloud is a real contribution.

By the numbers

A corpus the size of a language

700 hours of expressive TTS recordings
8 dedicated voices, full of feeling
500 hours of speech-to-text data
~500 speakers from across the valley

Lend your voice

Your accent belongs in the corpus.

No studio, no signup fuss — just your phone and your voice. Every dialect you add makes Kashmiri speech tech work for more people.

Start recording