Prototyping Component Re-Use, and the Simplest Whisper Wrapper
Hi!
In the last issue I mentioned I've been working on some more computer-y side-projects, and here's the first one: Vineyard — an exploration of prototype-based component re-use.
This idea has been brewing in the back of my mind for years. At the moment it still feels undercooked, but I'm curious to get some feedback and hear your thoughts. Go check it out: there's a longer write-up and a live version to play with ↗.
A friend recently made a dictation app in a weekend ↗.
Here's my 15-minute version:
#!/usr/bin/env bash
set -euo pipefail
# Assign first, then export, so a failed read of the key file aborts under set -e.
OPENAI_API_KEY="$(cat ~/.openai-key)"
export OPENAI_API_KEY
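# A private temp directory sidesteps BSD mktemp (macOS), which may not expand X's followed by a suffix.
tmpdir=$(mktemp -d)
tmp="$tmpdir/recording.wav"
trap 'rm -rf "$tmpdir"' EXIT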
echo "Recording... (press Ctrl-C to stop)"
rec --no-show-progress -c 1 -b 16 -r 16000 "$tmp"
# Resample only if the recording did not come out at 16 kHz.
if [ "$(soxi -r "$tmp")" != "16000" ]; then
  sox "$tmp" -q -r 16000 "${tmp%.wav}_16k.wav"
  mv "${tmp%.wav}_16k.wav" "$tmp"
fi
text=$(openai api audio.transcriptions.create -m whisper-1 -f "$tmp" --response-format text)
# Quote the transcript so whitespace and glob characters reach the clipboard intact.
printf '%s' "$text" | pbcopy
printf "\nTranscript:\n%s\n" "$text"
Adding a global shortcut in Hammerspoon ↗ was another 5 minutes on top of that.
Yes, it sends the audio files to OpenAI and requires a network connection. I don't care; for my limited use it's more than OK.
It would be easy to speed up the recorded audio before beaming it up to OpenAI to save on some of the cost ↗.
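A rough sketch of that speed-up step, not something I've wired into the script. It assumes sox's tempo effect keeps speech intelligible enough for Whisper and that the API bills by audio duration. The helper names fast_path and speed_up and the default factor of 2.0 are made up for illustration.

```shell
# Hypothetical helper: derive the output path, e.g. rec.wav -> rec_fast.wav.
fast_path() {
  printf '%s_fast.wav' "${1%.wav}"
}

# Hypothetical helper: write a sped-up copy of a WAV file and print its path.
# The 2.0 default is an assumption; tune it against transcript quality.
speed_up() {
  local src="$1" factor="${2:-2.0}" out
  out="$(fast_path "$src")"
  # sox's tempo effect changes playback speed while preserving pitch.
  sox -q "$src" "$out" tempo "$factor"
  printf '%s\n' "$out"
}
```

In the script above this would slot in right before the transcription call, something like fast=$(speed_up "$tmp"), with "$fast" passed to -f instead of "$tmp". The sped-up copy is a second temp file, so it would need cleaning up too.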
Worth Checking Out
What I've been reading lately:
- The Annotated Flatland ↗ (and now I wish every book had a Talmudic rubrication ↗)
- Logic and Design In Art, Science, And Mathematics ↗
- Category Theory for Programmers ↗
On the web:
- Ink & Switch has a new essay on Malleable Software ↗
- Beyond Code Generation: LLM-supported Exploration of the Program Design Space ↗
- What I Learned Building Whole Earth AI ↗
- Design for 3D-Printing ↗
- Semi-Formal Programming ↗
- Scrappy ↗ from John Chang and Pontus Granström (with a nice write-up ↗)
- Agentic Engineering in Action with Mitchell Hashimoto ↗