Had Claude test it out on 3 videos. Worked at 5-8x realtime. The beauty of it is that it works on all videos, not just the ones with transcripts. Combine it with YouTube search and LLM takeaways from transcripts, and you have super-efficient content consumption. There are SaaS products that charge 1 cent per video for those with transcripts. There is a viable product in here somewhere, methinks.
mrkn1 2 minutes ago [-]
Thanks for running it, Niraj. I see something similar on my machine, which still surprises me every time lol
spudlyo 2 hours ago [-]
So, this project consists of a ~175 line README and a ~500 line Python program that glues yt-dlp and Kroko together. Neat.
I guess if it encourages you to install and figure out how to use ffmpeg, yt-dlp, kroko, numpy, and onnx that's a good thing. Sometimes just knowing a thing is possible is a huge benefit.
mrkn1 44 minutes ago [-]
thank you. You nailed the actual value, that's right. The real win is just knowing you can do this on a laptop CPU, offline, no GPU or cloud bill. There are tiny done-for-you details, like rescaling token timestamps back to real time after the atempo speedup so --timestamps doesn't lie to you, but they are minor.
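To make the timestamp detail concrete, here is a minimal sketch of that rescaling, not yapsnap's actual code. It assumes the audio was sped up by a constant factor (which is what ffmpeg's atempo filter does), so a time measured on the fast audio maps back to real time by multiplying by that same factor. The function name, the (text, start, end) tuple layout, and the 2.0 factor are all hypothetical.

```python
def rescale_timestamps(tokens, speed):
    """Map token times measured on sped-up audio back to original time.

    tokens: list of (text, start_sec, end_sec) tuples (hypothetical layout).
    speed:  the constant tempo factor the audio was sped up by, e.g. 2.0.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    # One second of sped-up audio covers `speed` seconds of the original.
    return [(text, start * speed, end * speed) for text, start, end in tokens]


# A token heard at 1.0s-1.5s in 2x audio was spoken at 2.0s-3.0s originally.
fast_tokens = [("hello", 1.0, 1.5), ("world", 1.5, 2.25)]
real_tokens = rescale_timestamps(fast_tokens, speed=2.0)
print(real_tokens)
```

If the speedup were applied in several chained stages, the factors would multiply, so the same function applies with the combined factor.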
iririririr 7 minutes ago [-]
I see the value as a centralized anti-content-blocker.
This repo is now a good way to centralize hacks around the sure-to-come blockers those platforms will add to prevent download.
Just like uBlockOrigin was a way to centralize all the "just run this greasemonkey script" comments, I can see this getting a huge following for people who really value transcriptions.
mrkn1 3 minutes ago [-]
I appreciate the perspective! That's a higher ceiling than I'd put on it, but if it gets there, awesome. PRs welcome!
charcircuit 38 minutes ago [-]
Most of these platforms already have transcriptions built in.
mrkn1 35 minutes ago [-]
YouTube has transcripts on most videos, not all. The others don't expose them. If you mean the "transcript APIs" for TikTok/IG/X, they are all transcribing audio like yapsnap does. If you have a way to pull native ones, let me know, genuinely curious.
charcircuit 4 minutes ago [-]
YouTube's is transcribing the audio too. The others do expose them as subtitles as the video is playing.
mrkn1 50 seconds ago [-]
Yes, fair point: it's ASR too, just cached and exposed. I meant to draw the line at whether the transcript is fetchable or not.