We Automated Our IDE With AppleScript and We're Not Sorry
I wrote about my two-pass development methodology last week. The system works brilliantly: Gemini builds, Opus reviews, Opus retros, and the codebase gets better every sprint. But I left out the part where executing it manually was driving me insane.
Every sprint has three stages: build, review, retro. Each stage means opening a new chat (or staying in the current one for retro), switching models, typing a slash command, and hitting enter. Across a full project with 14 sprints, that is 42 transitions. Forty-two times I have to context-switch away from whatever I am doing, click into Antigravity, open a chat, type /sprint-07-review, and wait.
So I spent one session building a tool to do it for me.
Why Not Just Use the CLI?
This is the first question everyone asks. Claude has a CLI. Gemini has a CLI. Why not script against those?
Simple answer: money.
My Antigravity Ultra subscription gives me unlimited usage in the GUI. Every message to Gemini 3.1 Pro, every Opus 4.6 review, all included. If I routed the same work through CLI APIs, I would be paying per token. For a 14-sprint project with long architecture documents in context, that adds up to hundreds of dollars.
GUI automation is free. It just requires some creativity.
The Architecture (If You Can Call It That)
Sprint Runner has four moving parts:
- A Python HTTP listener running on localhost:7890
- AppleScript wrappers that control Antigravity's window
- A notify script that gets called at the end of each workflow
- A CLI entrypoint (sprint) that ties it together
The flow looks like this:
You run: sprint run 3 --project ~/Dev/uptrail/hardened
Listener starts → triggers build in Antigravity
↓
Gemini runs /sprint-03-build workflow
↓
Workflow's // turbo block calls notify.sh
↓
notify.sh POSTs to Slack (":rocket: BUILD complete")
notify.sh POSTs to localhost:7890/notify (stage=build, status=complete)
↓
Listener receives completion → waits 2 seconds → triggers review
↓
AppleScript focuses window → opens new chat → pastes /sprint-03-review → submits
↓
Opus runs the review workflow → notify.sh fires again
↓
Listener triggers retro (in SAME chat, Opus needs the review context)
↓
Opus runs retro → notify.sh → Listener marks sprint as COMPLETE
↓
Final Slack message: ":checkered_flag: Sprint 03 FULLY AUTOMATED"
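The stage ordering in the flow above can be sketched in a few lines of Python. This is a minimal illustration, not Sprint Runner's actual code: the function names are assumptions, and only the stage order and the /sprint-NN-stage command format come from the flow itself.

```python
from typing import Optional

# Stage order taken from the flow above; the names below are illustrative.
STAGES = ["build", "review", "retro"]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None when the sprint is done."""
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    if index + 1 < len(STAGES):
        return STAGES[index + 1]
    return None


def slash_command(sprint: int, stage: str) -> str:
    """Format the workflow command, e.g. sprint 3 and 'review' give '/sprint-03-review'."""
    return f"/sprint-{sprint:02d}-{stage}"
```

When next_stage returns None, the retro has just finished and the sprint can be marked complete.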
The trick is the dual-post pattern in notify.sh. Every workflow already had a Slack notification at the end. I just added a second curl to the local listener:
# 1. Post to Slack (human notification)
curl -s -X POST -H 'Content-type: application/json' \
-d "{\"text\": \"$MESSAGE\"}" \
"$SLACK_URL" > /dev/null 2>&1 || true
# 2. Post to local sprint listener (triggers next stage)
curl -s -X POST -H 'Content-type: application/json' \
-d "{\"stage\": \"$STAGE\", \"status\": \"$STATUS\"}" \
"http://127.0.0.1:${LISTENER_PORT}/notify" > /dev/null 2>&1 || true
Both posts are fire-and-forget (the || true means if the listener is not running, nothing breaks). The Slack notification exists for me. The localhost post exists for the machine.
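For the receiving end of that second curl, here is a rough sketch of what a listener could look like using only Python's standard library. The /notify path and the stage and status fields come from the script above; NotifyHandler, completed_stages, on_stage_complete, and make_server are hypothetical names, and the real listener would trigger the next stage where this one just records the event.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative record of finished stages; a real listener would trigger the next stage.
completed_stages = []


def on_stage_complete(stage: str) -> None:
    """Placeholder hook, called once per completed stage."""
    completed_stages.append(stage)


class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/notify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        if isinstance(payload, dict) and payload.get("status") == "complete":
            on_stage_complete(str(payload.get("stage", "")))
        self.send_response(204)  # nothing to return; curl discards the output anyway
        self.end_headers()

    def log_message(self, format, *args) -> None:
        pass  # silence the default per-request logging


def make_server(port: int = 7890) -> HTTPServer:
    """Bind to loopback only, so nothing off the machine can post stage events."""
    return HTTPServer(("127.0.0.1", port), NotifyHandler)
```

Calling make_server().serve_forever() would run it on port 7890, matching the address notify.sh posts to.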
The AppleScript Disasters
Getting the AppleScript right was the most painful part. A few things I learned the hard way:
Antigravity registers as "Electron" in macOS System Events. Not "Antigravity." Not "antigravity." Just "Electron." If you have VS Code, Slack, or any other Electron app open, you have to match by window title, not process name. I match against the project directory name in the window title to find the right one.
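The title-matching step is simple enough to sketch in Python. This assumes the window titles have already been collected (for example from System Events) and that the title contains the project directory name somewhere; pick_project_window and the sample titles in the usage note are made up for illustration.

```python
from pathlib import Path
from typing import Iterable, Optional


def pick_project_window(titles: Iterable[str], project_dir: str) -> Optional[str]:
    """Return the first window title that contains the project directory's name."""
    project_name = Path(project_dir).name  # "~/Dev/uptrail/hardened" -> "hardened"
    for title in titles:
        if project_name in title:
            return title
    return None
```

Given titles like "Slack" and "notes.md - hardened", a project directory of ~/Dev/uptrail/hardened would select the second one. A substring match can misfire if two open projects share a name, so a stricter match may be worth it in practice.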
Clipboard paste beats keystroke typing. My first attempt used keystroke "/sprint-03-build" which sends characters one at a time. This is catastrophically fragile. If I type anything on my keyboard mid-sequence, the characters interleave. The fix: write the command to the clipboard with pbcopy, then simulate Cmd+V. One atomic paste. No interleaving possible.
You need to press Enter twice. When you type a / in Antigravity's chat, it opens an autocomplete dropdown showing your available workflows. The first Enter selects the highlighted workflow. The second Enter actually submits the message. Miss that, and your command just sits in the input field doing nothing while you wonder why Gemini has gone quiet.
-- Paste from clipboard (atomic, won't mix with user typing)
keystroke "v" using {command down}
delay 1.5
-- First Enter: select from autocomplete dropdown
key code 36
delay 1.0
-- Second Enter: submit the message
key code 36
Retro stays in the same chat. This one matters for correctness. The retro workflow needs to see the review output. It categorises the FAILs and updates PATTERNS.md. If I opened a new chat for retro, Opus would lose all that context. So the trigger script checks: if the stage is retro, skip the "open new chat" step and send the command directly into the current conversation.
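Putting the paste, the double Enter, and the retro rule together, a Python wrapper around pbcopy and osascript might look roughly like this. It is a sketch under assumptions: build_trigger_lines and run_trigger are hypothetical names, the new-chat keystroke is a caller-supplied placeholder because the actual shortcut is not shown here, and the window-focus step is left out.

```python
import subprocess
from typing import List, Optional


def build_trigger_lines(stage: str, new_chat_keystroke: Optional[str]) -> List[str]:
    """Return the System Events statements that submit whatever is on the clipboard."""
    lines: List[str] = []
    if stage != "retro" and new_chat_keystroke:
        # Placeholder: the caller supplies the statement that opens a new chat.
        lines += [new_chat_keystroke, "delay 1.0"]
    lines += [
        'keystroke "v" using {command down}',  # atomic paste
        "delay 1.5",
        "key code 36",  # first Enter: select from the autocomplete dropdown
        "delay 1.0",
        "key code 36",  # second Enter: submit the message
    ]
    return lines


def run_trigger(command: str, stage: str, new_chat_keystroke: Optional[str] = None) -> None:
    """Copy the slash command to the clipboard, then drive the keyboard (macOS only)."""
    subprocess.run(["pbcopy"], input=command.encode(), check=True)
    statements = ['tell application "System Events"']
    statements += build_trigger_lines(stage, new_chat_keystroke)
    statements.append("end tell")
    args = ["osascript"]
    for statement in statements:
        args += ["-e", statement]  # osascript joins repeated -e options into one script
    subprocess.run(args, check=True)
```

Keeping the script-building step separate from the step that runs it means the retro rule can be checked without touching the GUI.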
Independence Is the Point
One design decision I am genuinely proud of: the listener is a standalone Python process. It does not depend on Claude Code, the terminal session that started it, or anything else. You can kill your terminal, close Claude Code, go make a coffee. The listener keeps running, waiting for the next notify.sh call.
sprint run 3 --project ~/Dev/uptrail/hardened
That command starts the listener (if it is not already running), triggers the build stage, and exits. From that point, the listener handles everything. I can check progress with sprint status or tail the log with sprint log.
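One way to get the "start it if it is not already running, then walk away" behaviour is to probe the port and spawn the listener in its own session. The sketch below assumes a POSIX system and a listener script at a path you pass in; listener_is_running, start_listener_detached, and ensure_listener are illustrative names, not the tool's real ones.

```python
import socket
import subprocess
import sys


def listener_is_running(port: int = 7890, host: str = "127.0.0.1") -> bool:
    """True if something already accepts TCP connections on the listener port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def start_listener_detached(script_path: str) -> subprocess.Popen:
    """Launch the listener in its own session so closing the terminal does not kill it."""
    return subprocess.Popen(
        [sys.executable, script_path],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: detach from the controlling terminal
    )


def ensure_listener(script_path: str, port: int = 7890) -> bool:
    """Start the listener only when the port is free; return True if it was started."""
    if listener_is_running(port):
        return False
    start_listener_detached(script_path)
    return True
```

The port probe only shows that something is listening, not that it is the right process, which is an acceptable trade-off for a single-user tool on loopback.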
The state is just a JSON file on disk:
{
"status": "running",
"project_name": "hardened",
"project_dir": "/Users/mitsi/Dev/uptrail/hardened",
"sprint": "01",
"current_stage": "build",
"started_at": "2026-03-07T01:32:43.540289"
}
No database. No Redis. No Docker. It is a Python script, some bash, and some AppleScript. About 400 lines total.
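Reading and writing a state file like the one above takes very little code. The following is a sketch rather than the tool's implementation: the function names are assumptions, the "idle" fallback status is invented for the example, and the write-then-rename step is one common way to avoid a half-written file if the process dies mid-save.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write the state atomically: dump to a temp file, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, path)  # atomic when both paths are on the same filesystem


def load_state(path: Path) -> dict:
    """Return the saved state, or a placeholder if no file exists yet."""
    if not path.exists():
        return {"status": "idle"}  # hypothetical default, not from the original tool
    return json.loads(path.read_text())


def advance(path: Path, stage: str) -> dict:
    """Record the stage that is now running and persist it."""
    state = load_state(path)
    state["current_stage"] = stage
    save_state(path, state)
    return state
```

A status command then only needs to call load_state and print the result.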
What It Actually Feels Like
I ran it for real today on the hardened project. Typed the command, switched to something else, and got Slack pings as each stage completed. Build done. Review done. Retro done. Sprint complete.
The retro automatically updated PATTERNS.md with the issues Opus found. The PATTERNS.md file then feeds back into the next sprint's build prompt, so Gemini avoids the same mistakes. The whole feedback loop (build, review, retro, learn) ran without me touching the keyboard once.
It is not elegant. The delays are hardcoded. The AppleScript is brittle. If Antigravity changes its keyboard shortcut for "new chat," the whole thing breaks. I would never ship this to anyone.
But it saved me 42 manual transitions on this project alone. And every future project gets the same benefit for free.
Setup Is Intentionally Simple
To wire up a new project, you run ./setup ~/Dev/your-project. It copies notify.sh into the project's scripts directory. Then you add one line to the end of each workflow's // turbo block:
Run `./scripts/notify.sh build complete ":rocket: Sprint 03 BUILD complete"`
That is it. The workflow does its normal thing, and at the very end, fires off two HTTP requests. One for your phone. One for the machine.
Closing Thought
There is a spectrum between "fully manual" and "production-grade automation." Most people think they need to be at one extreme or the other. But the middle, scrappy scripts that automate the tedious bits while you retain full control, is where most of the leverage actually lives.
Sprint Runner is held together with clipboard hacks and hardcoded delays. It will probably break the next time Antigravity updates. And it has already saved me more time than it took to build.
What repetitive workflow are you three AppleScript commands away from never doing again?