Show HN: Moonshine Open-Weights STT models – higher accuracy than WhisperLargev3 (github.com)

64 points by petewarden 2 hours ago | 11 comments

Karrot_Kream 18 minutes ago [-]

According to the OpenASR Leaderboard [1], looks like Parakeet V2/V3 and Canary-Qwen (a Qwen finetune) handily beat Moonshine. All 3 models are open, but Parakeet is the smallest of the 3. I use Parakeet V3 with Handy and it works great locally for me.

[1]: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard

ac29 47 minutes ago [-]

No idea why 'sudo pip install --break-system-packages moonshine-voice' is the recommended way to install on raspi?

The authors do acknowledge this, though, and give a slightly too complex way to do it with uv in an example project (FYI, you don't need to source anything if you use uv run)

asqueella 22 minutes ago [-]

For those wondering about the language support, currently English, Arabic, Japanese, Korean, Mandarin, Spanish, Ukrainian, Vietnamese are available (most in Base size = 58M params)

armcat 49 minutes ago [-]

This is awesome, well done guys, I’m gonna try it as my ASR component on the local voice assistant I’ve been building https://github.com/acatovic/ova. The tiny streaming latencies you show look insane

pzo 32 minutes ago [-]

Haven't tested it yet, but I'm wondering how it will behave with a lot of IT jargon and tech acronyms. For that reason I mostly had to run an LLM after STT, but that was slowing down Parakeet inference. Otherwise it sometimes had problems properly detecting terms like CoreML, int8, fp16, half float, ARKit, AVFoundation, ONNX etc.
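For context, a cheaper alternative to a full LLM pass is a glossary-based fuzzy correction step. The sketch below is not the approach described in the comment above and is not part of Moonshine or Parakeet. It only uses Python's standard `difflib`, and the `GLOSSARY` list and `fix_jargon` helper are hypothetical names chosen for illustration.

```python
import difflib
import re

# Hypothetical glossary of domain terms the transcript is expected to contain.
GLOSSARY = ["CoreML", "int8", "fp16", "ARKit", "AVFoundation", "ONNX"]


def fix_jargon(text, glossary=GLOSSARY, cutoff=0.8):
    """Replace words that closely match a glossary term with its canonical spelling."""
    # Map lowercase spellings back to the canonical form for case-insensitive matching.
    lowered = {term.lower(): term for term in glossary}

    def replace(match):
        word = match.group(0)
        hits = difflib.get_close_matches(word.lower(), list(lowered), n=1, cutoff=cutoff)
        return lowered[hits[0]] if hits else word

    # Only alphanumeric runs are considered; punctuation and spacing are left untouched.
    return re.sub(r"[A-Za-z0-9]+", replace, text)


print(fix_jargon("export the coreml model to onnx with fp16"))
```

This works word by word, so it cannot rejoin terms the recognizer split apart (for example "core ml"), and short near-matches such as "int" may be over-corrected to "int8". Raising `cutoff` trades recall for fewer false replacements.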

g-mork 43 minutes ago [-]

How does this compare to Parakeet, which runs wonderfully on CPU?

sroussey 26 minutes ago [-]

onnx models for browser possible?

lostmsu 1 hour ago [-]

How does it compare to Microsoft VibeVoice ASR https://news.ycombinator.com/item?id=46732776 ?

cyanydeez 2 hours ago [-]

No LICENSE no go

bangaladore 1 hour ago [-]

There is a license blurb in the readme.

> This code, apart from the source in core/third-party, is licensed under the MIT License, see LICENSE in this repository.

> The English-language models are also released under the MIT License. Models for other languages are released under the Moonshine Community License, which is a non-commercial license.

> The code in core/third-party is licensed according to the terms of the open source projects it originates from, with details in a LICENSE file in each subfolder.

altruios 1 hour ago [-]

Reading through readme.md: "License: This code, apart from the source in core/third-party, is licensed under the MIT License, see LICENSE in this repository.

The English-language models are also released under the MIT License. Models for other languages are released under the Moonshine Community License, which is a non-commercial license.

The code in core/third-party is licensed according to the terms of the open source projects it originates from, with details in a LICENSE file in each subfolder."
