Enterprise architecture review · hiring brief

Zucchini Chat
spawner repo · architectural & security review

Decision brief for hiring the author of zucchini-chat/zucchini-spawner. Business context, architecture diagrams, SWOT, and an interview cheat sheet — distilled from a line-by-line read of the public mirror at v2.3.2.

Reviewer · senior enterprise architect
Subject · github.com/zucchini-chat/zucchini-spawner
Scope · 3,571 LOC Rust · v2.3.2
Date · 2026-05-26

01 · Business description

What Zucchini Chat is

Zucchini Chat is a chat-style mobile remote control for coding agents. An iOS app on a developer's phone talks to Claude Code running on the developer's own laptop or desktop — the way Telegram talks to a bot, but the "bot" is an agentic IDE.

Target segment: B2C / prosumer

Client surface: iOS first

Host platforms: macOS · Linux

The three‑sided product

iOS app — closed‑source. Chat UI, agent‑status, push notifications, key paste / pairing.
Backend — closed‑source. Postgres + PowerSync streaming sync rules, a Rust auth service, Stytch (email OTP / Apple / Google), Cloudflare R2 for attachment blobs, APNs for push.
Spawner — open‑sourced in this repo. A small Rust daemon that lives on the developer's machine, decrypts messages with the user's key, forks the claude CLI in the right project directory, and streams responses back encrypted.

Trust model in one sentence

Message bodies are end‑to‑end encrypted with a per‑user key K_user that never leaves the user's devices. The backend operator sees ciphertext and metadata only. The spawner is the only Zucchini code that ever touches plaintext, and it lives on the user's own hardware. [trust root] [E2E]

Strategic position

Zucchini sits in a young, fast‑moving category: mobile front‑ends for desktop coding agents. Adjacent / competitive: GitHub mobile review flows, Cursor's web/mobile clients, Conductor / VibeTunnel / Termius, and the agent‑on‑a‑laptop variants emerging around Claude Code and Codex. The defensible wedge is the E2E posture (backend‑blind) combined with a polished iOS experience, not a backend SaaS moat.

02 · Architecture diagrams

2.1 — System topology

End-to-end view: iOS → backend → spawner → Claude Code. The spawner is the only component that ever sees plaintext.

flowchart LR
  classDef closed fill:#1a2028,stroke:#3a414c,color:#aab2bd
  classDef open fill:#13321b,stroke:#7ee787,color:#e6e9ee
  classDef external fill:#1a1f2e,stroke:#79c0ff,color:#cdd6e3

  subgraph User["Developer"]
    iOS["iOS app<br/>(closed source)"]:::closed
    Mac["Mac / Linux dev box"]
  end

  subgraph Backend["Cloudflare-hosted backend (closed source)"]
    Auth["Rust auth svc<br/>Stytch · OTP · OAuth"]:::closed
    PS["PowerSync stream<br/>per-user buckets"]:::closed
    PG[("Postgres<br/>ciphertext + metadata")]:::closed
    R2[("Cloudflare R2<br/>encrypted attachment blobs")]:::external
    APNs["APNs"]:::external
  end

  subgraph Host["Developer machine"]
    Spawner["zucchini-spawner<br/>(this repo, open)"]:::open
    Claude["claude CLI<br/>--dangerously-skip-permissions"]:::open
    FS[("~/projects, ~/.ssh<br/>~/.claude/, K_user")]:::open
  end

  iOS -- "encrypted writes" --> PS
  iOS <-- "encrypted reads" --> PS
  iOS -- "/auth/* (Stytch)" --> Auth
  PS <--> PG
  iOS -- "presigned PUT" --> R2

  Spawner -- "POST /sync/stream<br/>read ciphertext" --> PS
  Spawner -- "POST /api/writes<br/>write ciphertext" --> PG
  Spawner -- "POST /api/blobs/download-url<br/>then presigned GET" --> R2
  Spawner -- "POST /auth/token<br/>spawner-token → short JWT" --> Auth

  Spawner -- "decrypt & fork" --> Claude
  Claude -- "stdout stream-json" --> Spawner
  Claude -- "reads/writes" --> FS

  Auth -. "push agent-finished" .-> APNs
  APNs -. "notification" .-> iOS

2.2 — End‑to‑end encryption envelope

XChaCha20-Poly1305 with random 24-byte nonces. Per‑(user, machine) keys; backend sees only base64 ciphertext in messages.body and blob bytes in R2.

sequenceDiagram
  autonumber
  participant iOS as iOS app
  participant BE as Backend (PowerSync + R2)
  participant SP as Spawner
  participant CL as Claude Code

  Note over iOS,SP: K_user established once at pairing<br/>(today: pasted; planned: SAS-verified ECDH)
  iOS->>iOS: serialize { text, attachments[] } → JSON
  iOS->>iOS: XChaCha20-Poly1305 encrypt with K_user<br/>(24B nonce ‖ ciphertext ‖ tag)
  iOS->>BE: write messages.body (base64 ciphertext)
  iOS->>BE: PUT encrypted blob to R2 via presigned URL

  BE-->>SP: stream PUT messages row (ciphertext only)
  SP->>SP: lookup KUser by user_id<br/>(file ~/.zucchini-spawner/key_)
  SP->>SP: decrypt envelope JSON
  SP->>BE: POST /api/blobs/download-url
  BE-->>SP: presigned R2 GET
  SP->>BE: GET blob ciphertext
  SP->>SP: decrypt blob → ~/.zucchini-spawner/attachments/
  SP->>CL: spawn claude --print --output-format stream-json<br/>cd project && cat prompt | claude
  CL-->>SP: stdout JSON frames
  SP->>SP: encrypt each frame with K_user
  SP->>BE: POST /api/writes (batched ciphertext)
  BE-->>iOS: stream PUT to iOS (ciphertext)
  iOS->>iOS: decrypt and render
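
The envelope layout named in the diagram (24-byte nonce ‖ ciphertext ‖ tag) can be illustrated with a small framing helper. This is a minimal sketch, not the spawner's code: `Envelope` and `split_envelope` are hypothetical names, the input is assumed to be the bytes obtained after base64-decoding messages.body, and the 16-byte tag length is the standard Poly1305 tag size. The AEAD decryption itself is left to a vetted crypto library and is not shown.

```rust
/// Sizes for XChaCha20-Poly1305: 24-byte nonce, 16-byte Poly1305 tag.
pub const NONCE_LEN: usize = 24;
pub const TAG_LEN: usize = 16;

/// Borrowed view of one envelope laid out as nonce ‖ ciphertext ‖ tag.
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Envelope<'a> {
    pub nonce: &'a [u8],
    pub ciphertext: &'a [u8],
    pub tag: &'a [u8],
}

/// Split raw envelope bytes (already base64-decoded) into their three parts.
/// Returns None when the input is too short to hold a nonce and a tag.
pub fn split_envelope(raw: &[u8]) -> Option<Envelope<'_>> {
    if raw.len() < NONCE_LEN + TAG_LEN {
        return None;
    }
    let (nonce, rest) = raw.split_at(NONCE_LEN);
    let (ciphertext, tag) = rest.split_at(rest.len() - TAG_LEN);
    Some(Envelope { nonce, ciphertext, tag })
}
```

Rejecting short inputs before any decryption attempt keeps malformed rows from reaching the AEAD call at all; an empty ciphertext (exactly 40 bytes) is still a well-formed envelope.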

2.3 — Spawner internals (Tokio process model)

A single Tokio runtime, several long‑lived tasks behind a select-loop. Cleanly separated I/O concerns; drain semantics on update.

flowchart TB
  classDef task fill:#13171c,stroke:#7ee787,color:#e6e9ee
  classDef ext fill:#1a1f2e,stroke:#79c0ff,color:#cdd6e3
  classDef state fill:#2b1a1a,stroke:#ff7b72,color:#e6e9ee

  Main["main select-loop<br/>(SIGTERM · SIGINT · revoked)"]:::task

  PSL["powersync::run<br/>NDJSON stream reader"]:::task
  WR["writer::run<br/>batched POST /api/writes<br/>retry+backoff, idle-drain"]:::task
  UP["updater::run_update_loop<br/>poll version.txt, swap binary"]:::task
  HB["heartbeat task<br/>60s PATCH machines"]:::task
  PW["power::start_wake_watcher<br/>IOKit full-wake only"]:::task
  SUP["Supervisor<br/>per-chat agent tasks"]:::task

  KS[("KeyStore<br/>~/.zucchini-spawner/key_")]:::state
  ST[("state.json<br/>buckets cursor + projects")]:::state

  PSAPI["api.zucchini.chat<br/>/sync/stream · /api/writes · /auth/token"]:::ext
  R2["Cloudflare R2"]:::ext
  CLAUDE["claude CLI<br/>via /bin/zsh -lic (login shell)"]:::ext
  SYS["launchd · systemd --user"]:::ext

  Main -- spawns --> PSL
  Main -- spawns --> WR
  Main -- spawns --> UP
  Main -- spawns --> HB
  Main -- spawns --> PW
  Main -- spawns --> SUP

  PSL --> PSAPI
  WR --> PSAPI
  UP --> PSAPI
  HB --> PSAPI
  SUP --> CLAUDE
  SUP -- attachments --> R2
  SUP -- decrypt/encrypt --> KS
  WR -- encrypt body --> KS
  Main -- persist --> ST
  SYS -- supervises --> Main
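
The writer task's "batched POST, retry+backoff, idle-drain" behaviour can be sketched without Tokio. The following is an illustrative model under stated assumptions, not the code in writer::run: `backoff_delay` and `WriteQueue` are hypothetical names, the doubling schedule and the cap are assumptions, and the network call is omitted entirely.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
/// The doubling schedule is an assumption made for this sketch.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Pending ciphertext rows waiting to be sent in batches.
pub struct WriteQueue {
    pending: VecDeque<String>,
    max_batch: usize,
}

impl WriteQueue {
    pub fn new(max_batch: usize) -> Self {
        WriteQueue { pending: VecDeque::new(), max_batch }
    }

    /// Queue one already-encrypted row.
    pub fn push(&mut self, ciphertext: String) {
        self.pending.push_back(ciphertext);
    }

    /// Remove and return up to `max_batch` rows, oldest first.
    pub fn next_batch(&mut self) -> Vec<String> {
        let n = self.pending.len().min(self.max_batch);
        self.pending.drain(..n).collect()
    }

    /// True when nothing is left to send, i.e. a binary swap would lose no writes.
    pub fn is_drained(&self) -> bool {
        self.pending.is_empty()
    }
}
```

In this model an updater would wait for `is_drained()` before swapping the binary, which is one plausible reading of the "drain semantics on update" described above.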

2.4 — Supply chain & release pipeline

What protects the user from a malicious build. Sigstore-keyless signing is shipped; verification on the consuming side is not yet wired up — see the Tier-1 concern in the cheat sheet.

flowchart LR
  classDef ok fill:#13321b,stroke:#7ee787,color:#e6e9ee
  classDef gap fill:#3a1a1a,stroke:#ff7b72,color:#e6e9ee
  classDef ext fill:#1a1f2e,stroke:#79c0ff,color:#cdd6e3

  Dev["Author pushes vX.Y.Z tag"]:::ok
  GHA["GitHub Actions<br/>release.yml · 4 targets"]:::ok
  Apple["Apple Developer ID codesign<br/>hardened runtime + timestamp"]:::ok
  Fulcio["Sigstore Fulcio<br/>keyless OIDC certificate"]:::ext
  Rekor["Rekor<br/>public transparency log"]:::ext
  GHR["GitHub Release<br/>binary + .sigstore bundle"]:::ok
  CDN["api.zucchini.chat/install/<br/>+ zucchini-spawner-version.txt"]:::ok

  Updater["src/updater.rs<br/>curl -sf version.txt<br/>curl -f binary<br/>chmod +x · mv"]:::gap
  Verify["cosign verify-blob<br/>NOT YET IMPLEMENTED"]:::gap

  Dev --> GHA --> Apple --> Fulcio --> Rekor
  GHA --> GHR --> CDN --> Updater
  Updater -. "should call but doesn't" .-> Verify
  Verify -. "would pin to refs/tags/v.* identity" .-> Fulcio
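
The missing step in the diagram amounts to a gate between download and swap. A minimal sketch of that gate follows, assuming the new binary is first staged next to the installed one; `install_verified` is a hypothetical helper, and the `verify` callback stands in for whatever signature check is eventually wired up (for example an invocation of cosign verify-blob, whose arguments are deliberately not shown here).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Install `staged` over `target` only if `verify` accepts the staged file.
/// `verify` is a placeholder for a real signature check; this sketch does not
/// implement any cryptography itself.
pub fn install_verified<F>(staged: &Path, target: &Path, verify: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> Result<(), String>,
{
    if let Err(reason) = verify(staged) {
        // Remove the rejected download so it cannot be picked up later.
        let _ = fs::remove_file(staged);
        return Err(io::Error::new(io::ErrorKind::InvalidData, reason));
    }
    // rename is atomic when both paths are on the same filesystem.
    fs::rename(staged, target)
}
```

The point of the shape is ordering: the running binary is never touched until verification has returned Ok, so a failed or skipped check leaves the previous version in place.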

03 · SWOT analysis

Strengths (internal · positive)

Engineering craftsmanship. ~3.5k lines of idiomatic Rust; comments document past incidents with production cost numbers ("16k Sentry events / 1342 restarts"). Top‑decile signal.
Modern crypto choices. XChaCha20‑Poly1305 AEAD with 192‑bit random nonces, per‑(user, machine) keys, 0600 at rest. No homegrown crypto.
Honest threat modelling. README enumerates current gaps with ✅/⏳ markers before a reviewer can. Rare.
Operational discipline. Idle‑drain on autoupdate, retry+backoff, wake‑aware reconnect, IOKit power assertions held only while agents run, codesign DR pinned for TCC stability.
Defensible product framing. Backend‑blind E2E posture is a real differentiator vs. SaaS‑middleman alternatives.
Mature licensing. FSL‑1.1‑MIT: source‑available now, with each release converting to MIT two years after it is published; signals legal thoughtfulness.

Weaknesses (internal · negative)

Autoupdate has no signature verification today. src/updater.rs just curls a binary and mvs it over the running exe. The whole "you can verify what binary runs" pitch is currently aspirational. [tier 1]
--dangerously-skip-permissions is hardcoded. Any prompt → full RCE on the dev box with the user's shell env (AWS creds, secrets agents, etc.). Not in the README threat model. [tier 1]
Sentry with send_default_pii: true and a hardcoded DSN. Posture mismatch for an E2E product. [tier 1]
K_user delivered via CLI argument at install (acknowledged). Leaks via clipboard, /proc/.../cmdline, shell history.
Zero tests in the public mirror for ~3.5k LOC of security‑critical code. Tests presumably live upstream — needs verification.
Plaintext prompts in /tmp with default umask; cleaned on normal paths but leak on panic.
No --locked in CI; minor but cheap to fix for a trust‑root binary.
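
One way to keep K_user out of argv and shell history is to read it from standard input at install time. The sketch below is an assumption-laden illustration, not the installer's behaviour: `read_key_hex` is a hypothetical helper, and hex encoding of the key is assumed purely so the example needs no external crates (the real transfer encoding is not stated in this review). The 32-byte length is the XChaCha20-Poly1305 key size.

```rust
use std::io::BufRead;

/// XChaCha20-Poly1305 uses 256-bit keys.
pub const KEY_LEN: usize = 32;

/// Read one line from `input` and decode it as a hex-encoded key.
/// Hex is an assumption made for this sketch.
pub fn read_key_hex<R: BufRead>(mut input: R) -> Result<[u8; KEY_LEN], String> {
    let mut line = String::new();
    input.read_line(&mut line).map_err(|e| e.to_string())?;
    let hex = line.trim();
    if hex.len() != KEY_LEN * 2 {
        return Err(format!("expected {} hex characters", KEY_LEN * 2));
    }
    if !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err("key contains a non-hex character".to_string());
    }
    let mut key = [0u8; KEY_LEN];
    for (i, byte) in key.iter_mut().enumerate() {
        // Safe to slice by byte offset: every character is ASCII.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|e| e.to_string())?;
    }
    Ok(key)
}
```

A caller would pass `std::io::stdin().lock()`, so the key never appears in the process command line. This does not address the clipboard exposure, which only a pairing protocol can remove.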

Opportunities (external · positive)

Category timing. Mobile front‑ends for desktop agents are early; first‑mover wedge is real.
Enterprise / regulated angle. "Backend operator can't read your code" is a marketable differentiator in finance, healthcare, gov.
Multi‑agent expansion. Architecture is agent‑agnostic — Codex, Aider, Gemini CLI would slot in with minimal change.
Shared‑host mode. Per‑user key file structure (key_<uid>) already anticipates multi‑user dev boxes / cloud workstations.
Compliance story. Sigstore + transparency log + reproducible builds is a credible SBOM / supply‑chain narrative for SOC2 / FedRAMP‑adjacent buyers — once verification ships.
Android port. Mostly a client‑side question; the trust model is platform‑independent.

Threats (external · negative)

Anthropic / OpenAI native mobile. If Anthropic or OpenAI ships a first‑party "mobile remote", the value prop compresses to E2E + multi‑agent. Plan for this.
CLI surface volatility. Tight coupling to claude's stream‑json shape (frame‑type substring matches, compactMetadata.postTokens). One CLI rev can break the spawner.
Supply‑chain attack on autoupdate. Until cosign verification ships, a backend compromise = RCE on every paired dev box. Reputational kill‑shot if exploited.
Regulatory drift. Any future "remote code execution agent" regulation (EU AI Act, etc.) lands here squarely.
PowerSync dependency. Vendor concentration on a small commercial sync layer; bus factor.
Solo / small team bus factor. Comment style and bug‑story specificity suggest a single primary author. Mitigation should be a hiring priority.

04 · Interview cheat sheet

How to use this

Three tiers of questions. Tier 1 must be answered before an offer; Tier 2 should be answered before scoping the first project; Tier 3 is upside / depth. Each item lists the question, what a strong answer sounds like, and what would be a red flag.

Tier 1 — must discuss before offer

Tier 1 · trust narrative gap

"You opened the source publicly before the autoupdater verifies signatures. Walk me through why."

Green Pragmatic sequencing: ship transparency first, build trust progressively; verification is on the next sprint; risk is documented in README; internal channel exists to contain blast radius until then.

Red Dismisses the gap, conflates "we have the .sigstore file" with "we verify it," or treats this as marketing.

Tier 1 · agent permissions

"Why is `--dangerously-skip-permissions` the right default? What does scoped tooling look like in this product?"

Green Owns the trade‑off: remote agency is the product, the right mitigation is identity (signed updates) + scope (per‑chat allowlists, sandboxing) on the roadmap. Knows the blast radius (full shell env).

Red Surprised by the question or hand‑waves "users opt in to remote control."

Tier 1 · ops posture

"Sentry has `send_default_pii: true` and a hardcoded DSN. For an E2E product. Talk me through that."

Green Recognises the mismatch, has a plan to scrub or move to self‑hosted error sink, can explain what PII actually flows today.

Red Was unaware, or argues PII is fine because "it's only IPs."

Tier 1 · key management

"Walk me through K_user end‑to‑end: generation, transfer to a new machine, revocation, lost laptop."

Green Knows the SAS‑ECDH plan in detail; explains revocation = backend cancels spawner token (410) → spawner self‑uninstalls and wipes its key copy; has a recovery story.

Red Vague on the pairing protocol or treats "user retypes the key" as recovery.
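
The revocation path in the green answer (410 → self‑uninstall and key wipe) is destructive, so it helps to see how narrow the trigger should be. This is a hypothetical sketch: `TokenOutcome` and `classify_token_status` are illustrative names, and the treatment of every status other than 2xx and 410 as retryable is an assumption, not documented spawner behaviour.

```rust
/// What the spawner should do after a token request.
#[derive(Debug, PartialEq, Eq)]
pub enum TokenOutcome {
    /// Token issued; carry on.
    Proceed,
    /// Transient or ambiguous failure; keep the key and retry later.
    Retry,
    /// Backend explicitly revoked this spawner; wipe the key and uninstall.
    Revoked,
}

/// Map an HTTP status from the token endpoint to an action.
/// Only an explicit 410 leads to the destructive path.
pub fn classify_token_status(status: u16) -> TokenOutcome {
    match status {
        200..=299 => TokenOutcome::Proceed,
        410 => TokenOutcome::Revoked,
        _ => TokenOutcome::Retry,
    }
}
```

Keeping 401, 403, and 5xx on the retry path means an outage or a misconfigured proxy cannot cause a key wipe, which bears directly on the falsely‑revoked recovery question.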

Tier 2 — must discuss before first sprint

Tier 2 · testing

"The public mirror has zero tests. Show me the upstream test suite — unit, integration, e2e."

Green Walks through a tiered test harness, including an e2e box (the comments reference one); explains why mirror excludes them.

Red "We test in production."

Tier 2 · shell coupling

"You spawn claude via `sh -lic`. Why not exec it directly with explicit env?"

Green Knows the trade: login shell is needed for nvm/asdf/direnv PATH discovery; trade is acceptable because the agent is already trusted as the user; would consider a hybrid.

Red Didn't realise the agent inherits the entire shell init.
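
For reference during the discussion, the "exec directly with explicit env" alternative might look like the sketch below. It is an assumption, not a proposal the author has made: `claude_command` and `ENV_ALLOWLIST` are hypothetical, the allowlist contents are illustrative, and no claude flags are set because the exact invocation is outside what this review can confirm.

```rust
use std::path::Path;
use std::process::Command;

/// Variables the child is allowed to inherit. Illustrative only.
pub const ENV_ALLOWLIST: &[&str] = &["HOME", "PATH", "LANG", "TERM"];

/// Build a command that runs `claude` in `project_dir` with a cleared
/// environment plus the allowlisted variables taken from `parent_env`.
pub fn claude_command(project_dir: &Path, parent_env: &[(String, String)]) -> Command {
    let mut cmd = Command::new("claude");
    // Start from an empty environment instead of inheriting the shell's.
    cmd.current_dir(project_dir).env_clear();
    for (name, value) in parent_env {
        if ENV_ALLOWLIST.contains(&name.as_str()) {
            cmd.env(name, value);
        }
    }
    cmd
}
```

The cost is the one the green answer names: without a login shell, PATH entries added by nvm, asdf, or direnv are not discovered, so a hybrid (resolve PATH once via the shell, then exec with an allowlist) is the likely middle ground.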

Tier 2 · CLI coupling

"Your stdout parser substring-matches on frame types. How do you handle a `claude` CLI breaking change?"

Green Pins claude versions on the host (or surfaces version), has a smoke‑test harness, can describe the last breaking change they absorbed.

Red Assumes the CLI surface is stable.
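
A more drift‑tolerant alternative to substring matching is to classify on the parsed frame type and count what is not recognised. The sketch below assumes the `type` field has already been extracted by a JSON parser; `FrameKind`, `classify`, and `DriftMonitor` are hypothetical names, and the four type strings are assumptions about the CLI's stream‑json output that should be checked against the pinned claude version.

```rust
/// Frame categories the spawner cares about, plus an explicit catch-all.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameKind {
    System,
    Assistant,
    User,
    ResultFrame,
    Unknown,
}

/// Classify an already-parsed frame type by exact match.
pub fn classify(frame_type: &str) -> FrameKind {
    match frame_type {
        "system" => FrameKind::System,
        "assistant" => FrameKind::Assistant,
        "user" => FrameKind::User,
        "result" => FrameKind::ResultFrame,
        _ => FrameKind::Unknown,
    }
}

/// Counts unrecognised frames so a CLI change shows up as a metric
/// rather than as silently dropped output.
#[derive(Debug, Default)]
pub struct DriftMonitor {
    pub seen: u64,
    pub unknown: u64,
}

impl DriftMonitor {
    pub fn observe(&mut self, frame_type: &str) -> FrameKind {
        let kind = classify(frame_type);
        self.seen += 1;
        if kind == FrameKind::Unknown {
            self.unknown += 1;
        }
        kind
    }

    pub fn drift_suspected(&self) -> bool {
        self.unknown > 0
    }
}
```

Exact matching avoids false positives when a type name appears inside message text, and the unknown counter gives a smoke test something concrete to assert on.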

Tier 2 · prompt‑file hygiene

"Prompts are decrypted to `/tmp/zucchini-prompt-*.txt` with default umask. Walk me through why that's safe."

Green Acknowledges it's not ideal; explains the (short) lifecycle; has a plan to move to O_TMPFILE or per‑user dir.

Red "It's /tmp, it's fine."
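
A low‑cost hardening that a strong answer might describe is creating the prompt file with mode 0600 and tying its deletion to a guard value. The sketch below is Unix‑only and hypothetical: `PromptFile` is an illustrative type, and the caller is assumed to choose an unpredictable path inside a per‑user directory.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::{Path, PathBuf};

/// A plaintext prompt on disk that is removed when the guard is dropped.
pub struct PromptFile {
    path: PathBuf,
}

impl PromptFile {
    /// Create `path` exclusively with owner-only permissions and write the prompt.
    pub fn create(path: &Path, prompt: &str) -> io::Result<Self> {
        let mut file = OpenOptions::new()
            .write(true)
            .create_new(true) // fail rather than follow a pre-existing file
            .mode(0o600) // requested mode; the umask can only remove bits
            .open(path)?;
        // Build the guard first so a failed write still cleans up the file.
        let guard = PromptFile { path: path.to_path_buf() };
        file.write_all(prompt.as_bytes())?;
        Ok(guard)
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for PromptFile {
    fn drop(&mut self) {
        // Best effort: runs on normal exit and on panic unwinding.
        let _ = fs::remove_file(&self.path);
    }
}
```

Drop does not run if the process is killed with SIGKILL or built with panic=abort, so this narrows the leak window rather than closing it; O_TMPFILE or an unlinked file descriptor would be the stronger fix on Linux.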

Tier 3 — depth & upside

Tier 3 · scar tissue

"Tell me about the worst production bug you shipped in this codebase. How did you find it?"

Green The systemd ${VAR} story (or similar) — restart loop seen at 1342 restarts / 16k Sentry events; explains the diagnosis and the fix (stage script on disk, not inline argv).

Red Can't name one.

Tier 3 · system design

"You chose PowerSync over rolling your own diff‑stream. Trade‑offs?"

Green Bucket model, op_id cursors, checksum semantics, the PUT → REMOVE → PUT quirk on row updates, and a clear‑eyed view of the lock‑in risk.

Red Doesn't know what a bucket is.
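
To make the bucket, cursor, and PUT → REMOVE → PUT vocabulary concrete for interviewers, here is a deliberately simplified model. It is not the PowerSync wire format: `Op` and `Bucket` are hypothetical types, checksums are omitted, and op ids are assumed to start at 1 and increase monotonically.

```rust
use std::collections::BTreeMap;

/// One operation in a bucket's log (simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Put { row_id: String, body: String },
    Remove { row_id: String },
}

/// Local view of one bucket: current rows plus the last applied op id.
#[derive(Debug, Default)]
pub struct Bucket {
    pub rows: BTreeMap<String, String>,
    pub cursor: u64,
}

impl Bucket {
    /// Apply `op` if it is newer than the cursor. Returns false for replays,
    /// which makes reconnect-and-resume idempotent.
    pub fn apply(&mut self, op_id: u64, op: Op) -> bool {
        if op_id <= self.cursor {
            return false;
        }
        match op {
            Op::Put { row_id, body } => {
                self.rows.insert(row_id, body);
            }
            Op::Remove { row_id } => {
                self.rows.remove(&row_id);
            }
        }
        self.cursor = op_id;
        true
    }
}
```

In this model a row update that arrives as PUT, REMOVE, PUT briefly makes the row disappear, so a consumer that reacts per operation (for example by cancelling an agent on REMOVE) would misbehave; reacting to the state after the whole batch is applied avoids that.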

Tier 3 · roadmap judgement

"If you had three engineers for one quarter, what ships?"

Green Cosign verification, SAS pairing, scoped tool permissions — in that order, because each is gating the next narrative claim. Or a defensible counter‑argument.

Red Feature work that doesn't close the security narrative gap.

Tier 3 · org fit

"This codebase reads like one author. How do you onboard a second?"

Green Has thought about it — comment style as onboarding, paired threat‑modelling, the upstream monorepo structure as a forcing function.

Red Hasn't.

Quick read — green flags vs. red flags

Green flags already visible in the repo

Comments document why, not what; many cite past incidents.
SIGTERM-then-SIGKILL with grace, correct process group semantics.
Channel‑level idle‑drain before binary swap.
Codesign designated‑requirement pinning for TCC stability.
Power management aware of dark wake vs. full wake.
Migration code for legacy single‑key → per‑user key.
Threat model written in plain English in the README.

Red flags that would surface in interview

Can't explain the cosign‑gap honestly.
Treats --dangerously-skip-permissions as solved by encryption.
No mental model of which secrets the spawner inherits from the shell.
Surprised the autoupdater isn't verifying signatures.
"Tests live upstream" but can't actually produce them.
Doesn't have a recovery story for a falsely‑revoked spawner that wiped K_user.

05 · Verdict

Hire signal: strong, conditional on Tier‑1 answers.

This is above‑average senior engineering. Idiomatic Rust, correct concurrency primitives, real operational scar‑tissue documented as code comments, modern crypto, and a credible supply‑chain pipeline. The single load‑bearing gap is the distance between the public security narrative and the current code — and it's documented honestly in the repo itself.

If the candidate's Tier‑1 answers don't dodge, this is a strong staff‑level hire who can lead the security‑critical surface of a small startup. If they do dodge, you've learned that the narrative outran the code without the author noticing — and that's a different role conversation.

Level scale: no hire · junior · mid · senior (conditional) · staff (conditional)

Confidence: medium‑high on craft, medium on architecture (cannot evaluate backend / iOS halves), medium on team‑fit (one repo isn't a team signal).