Resources, demos, and tools
Demonstration of Hebrew Text-to-Speech using ChatterBox AI with Phonikud integration. Features multilingual zero-shot voice cloning and emotion control, with quality reported to outperform ElevenLabs.
New Hebrew TTS model with natural speech and accurate pronunciation. Uses a fine-tuned Gemma 3 LLM to interpret unvocalized text, features zero-shot voice cloning, and delivers high-quality synthesis with low latency for real-time applications.
Demonstration of seamless Hebrew and English mixing in the same input. Part of ongoing improvements to build the most powerful Hebrew TTS model.
Demonstration of Phonikud integrated with voice cloning approaches. Phonikud provides accurate Hebrew pronunciation modeling that combines well with voice cloning techniques for personalized speech synthesis.
New Hebrew TTS model based on StyleTTS2 with accurate IPA transcription and stress markers. Optimized for local deployment on simple hardware.
Advanced Hebrew TTS model based on Zonos architecture, trained on Phonikud IPA and SASpeech datasets. Features zero-shot voice cloning and multilingual support with high-quality 44kHz output.
Demonstration of emotion control in ChatterBox Hebrew TTS. Shows how the same Hebrew sentence can be synthesized with different emotional expressions, showcasing the model's advanced prosodic capabilities.
Fine-tuned from the ivrit.ai Whisper Large v3 Turbo model to transcribe Hebrew speech into IPA phonetic representation. Trained on the ILSpeech dataset, it reaches ~90% accuracy on Hebrew phonetic transcription for speech recognition applications.
Studio-quality Hebrew speech dataset with two male speakers. Includes clean text and phoneme annotations in LJSpeech format, phonemized using Phonikud.
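For readers unfamiliar with the LJSpeech layout mentioned above: the original LJSpeech corpus ships a pipe-delimited metadata file with one utterance per line. The following is a minimal sketch of reading such a file, assuming a hypothetical three-column layout of `file_id|text|phonemes`; the actual column order and file name in this dataset are not stated here and should be checked against the dataset itself.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Utterance(NamedTuple):
    file_id: str   # audio file stem, e.g. "0001" (hypothetical naming)
    text: str      # clean Hebrew text
    phonemes: str  # phoneme annotation for the same utterance


def parse_metadata(lines: Iterable[str]) -> List[Utterance]:
    """Parse pipe-delimited metadata lines (assumed: file_id|text|phonemes)."""
    # QUOTE_NONE keeps quotation marks in the text as literal characters.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {len(row)}: {row!r}")
        rows.append(Utterance(*(field.strip() for field in row)))
    return rows


# Usage with an in-memory sample; a real file would be opened with
# open(path, encoding="utf-8", newline="") instead.
sample = io.StringIO("0001|שלום עולם|ʃalˈom olˈam\n")
for utt in parse_metadata(sample):
    print(utt.file_id, utt.phonemes)
```

Raising on a wrong field count is a deliberate choice in this sketch: a stray `|` inside the text would otherwise shift the phoneme column silently.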
Large-scale Hebrew speech dataset with single-speaker audio at 44.1kHz. Enhanced from OpenSLR with Hebrew diacritics and Phonikud-generated phonemes.
Clean Hebrew text dataset based on Common Crawl containing modern Hebrew content from across the internet. Enhanced with diacritics, stress marks, and morphological information.
Training dataset used to create the first version of Phonikud. Contains clean Hebrew sentences with nikud and phonetic marks, with manual corrections for high-frequency words.
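A corpus of vocalized sentences like this one can also supply the unvocalized side of a training pair by removing the marks. The sketch below is one way to do that with the Python standard library, assuming the marks of interest are Unicode nonspacing marks in the Hebrew block (nikud and cantillation); it is not Phonikud's own preprocessing, and any phonetic marks encoded as ordinary characters would be left untouched.

```python
import unicodedata

# Unicode Hebrew block: U+0590..U+05FF
HEBREW_BLOCK = range(0x0590, 0x0600)


def strip_hebrew_marks(text: str) -> str:
    """Drop nonspacing marks (category Mn) that fall inside the Hebrew block."""
    return "".join(
        ch
        for ch in text
        if not (ord(ch) in HEBREW_BLOCK and unicodedata.category(ch) == "Mn")
    )


vocalized = "שָׁלוֹם"
# Pair each vocalized sentence with its unvocalized form.
pair = (strip_hebrew_marks(vocalized), vocalized)
print(pair)
```

Letters, punctuation such as maqaf, and non-Hebrew text pass through unchanged, since only category `Mn` code points in the block are removed.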
Dataset of text and phonemes that can be used to train G2P models. Contains Hebrew text with diacritics paired with IPA phonetic transcriptions. Includes hedc4-phonemes (2M lines) and knesset_phonemes (5M lines) generated with Phonikud.
Visual presentation that explains the challenges of the Hebrew writing system and how Phonikud solves the phonetic ambiguity problem. Demonstrates multiple pronunciations of the same Hebrew text.
Comprehensive Hebrew TTS benchmark comparing 17 models by Word Error Rate (WER) and Character Error Rate (CER). Features an interactive WER-vs-CER scatter plot and uses whisper-heb-ipa to evaluate on hand-annotated samples from the SASpeech dataset.
Benchmark comparing Hebrew G2P models using WER and CER metrics on IPA transcriptions. Includes an interactive scatter plot and leaderboard showing how different models perform. Add your model!
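Both benchmarks above report WER and CER. As a reminder of what those numbers measure, here is a minimal sketch that computes them from Levenshtein edit distance over whitespace-separated words and over characters. The benchmarks' own tokenization and text normalization are not described here, so the choices below (splitting on whitespace, counting spaces as characters) are assumptions.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete r from the reference
                    curr[j - 1] + 1,     # insert h into the reference
                    prev[j - 1] + cost,  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


reference = "ʃalˈom olˈam"
hypothesis = "ʃalˈom olam"  # stress marker missing in the second word
print(wer(reference, hypothesis))  # 0.5: one of two words is wrong
print(cer(reference, hypothesis))  # 1/12: one of twelve characters is missing
```

The example shows why the two metrics are plotted against each other: a single dropped stress marker costs half the words but only a small fraction of the characters.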
Fast Text-to-Speech in Hebrew with Phonetic Control. Enter unvocalized Hebrew text to generate speech with control over text, diacritics, and phonemes.
Local AI assistant powered by Phonikud for natural Hebrew speech synthesis. Wake it up with "Picovoice!" and have conversations with full offline TTS capabilities.
Interactive Hebrew phonemization tool that converts text to phonemes using Phonikud. Choose between different phoneme schemas and optionally add diacritics to Hebrew text for accurate pronunciation.
Python library for Hebrew text-to-speech using Phonikud with Piper and StyleTTS2 support. Installs with pip and uses ONNX models for efficient offline Hebrew speech synthesis. Includes examples; released under a non-commercial license.
Join our Discord community to discuss Text-to-Speech, Grapheme-to-Phoneme conversion, Hebrew linguistics, and collaborate on advancing Hebrew speech technology.
End-to-end Hebrew G2P model based on ByT5 architecture and Phonikud. Provides state-of-the-art grapheme-to-phoneme conversion with pre-trained checkpoints available on Hugging Face.
Hebrew TTS with ChatterBox and Phonikud integration. Built on ChatterBox's leading open-source voice cloning AI model with multilingual support, emotion control, and superior quality that outperforms ElevenLabs in blind evaluations.
High-performance Rust library for Hebrew diacritics and phonetic marks. Built on an enhanced Dicta model, it processes a sentence in about 0.1s and offers memory safety and two output modes (nikud male and nikud haser).