How It Works - Architecture Overview

At the core of DevLingo is a “tiered lookup” system: it starts with fast local lookups and escalates to AI reasoning only when needed. Most lookups complete in milliseconds, and even those that reach the AI tier target a response within 2 seconds.

Complete Workflow

User selects text in any app
    ↓
[Trigger] Press ⌘⇧D hotkey
    ↓
[Extract] TextExtractor captures:
    • The selected text itself
    • 50 characters of surrounding context (so the AI can disambiguate the meaning)
    • Current app's bundleIdentifier (Xcode / Slack / GitHub, etc.)
    ↓
[Classify] InputTypeDetector determines input type:
    • Chinese → Express mode
    • Single word → Word mode
    • 2-4 words with no sentence structure → Phrase mode
    • Complete sentence ≤20 words → Sentence mode
    • Multiple sentences or >20 words → Paragraph mode
    ↓
[Lookup] LookupCoordinator performs tiered lookup (details below)
    ↓
[Render] FloatingPanelController:
    • Pops up an NSPanel floating window below the cursor
    • Displays a skeleton screen (loading animation)
    • Selects the appropriate View subclass based on mode (WordView / PhraseView / SentenceView, etc.)
    • When data arrives, smoothly fades in the results
    ↓
[Interact] User can:
    • Click the play button to hear pronunciation
    • Click a word for a recursive lookup
    • Save to "Word Book" (SwiftData)
    • Close the floating window and continue working

Tiered Lookup System

This is the key to DevLingo’s performance. Lookups are performed in priority order, returning as soon as a result is found:

Tier 1: Local Technical Vocabulary (<50ms)

For English input, the local SQLite database is queried first. It contains 85+ pre-loaded common development terms:

idempotent, deployment, microservice, containerization,
latency, throughput, cache invalidation, API gateway,
circuit breaker, distributed tracing, ...

This tier answers roughly 30% of developer vocabulary queries.

Tier 2: Local Development Phrase Database (<100ms)

For 2-4 word phrases, the local phrase database is queried (50+ pre-loaded phrases):

yak shaving, bikeshedding, rubber ducking, code smell,
low-hanging fruit, technical debt, nerd sniping, ...

This tier answers roughly 20% of phrase queries.

Tier 3: SwiftData Cache (<10ms)

Words, phrases, and sentences the user has previously looked up are cached locally in SwiftData. If found, results are returned immediately.

User looked up "idempotent" → next lookup <10ms

Hit rate is roughly 50% (depending on usage history).

Tier 4: Claude API (0.5-2s)

If the first three tiers all miss, the Claude API is called for a structured response.

Request: {
  "text": "gracefully degrade",
  "mode": "phrase",
  "context": "when upstream dependencies are unavailable",
  "sourceApp": "Xcode",
  "userLanguage": "zh-CN"
}

Response: {
  "type": "phrase", // compound verb
  "definition_en": "...",
  "definition_zh": "...",
  "examples": [...],
  "pronunciation": {...},
  "register": "technical",
  "l1_tips": "..."
}
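The payloads above could be modeled as typed structures so that malformed responses are caught early. The following is a minimal sketch in Python (chosen to match the pseudocode style on this page), assuming only the field names shown in the example. The names `LookupRequest`, `LookupResponse`, and `parse_response` are hypothetical, and fields whose shape is elided above (`examples`, `pronunciation`) are kept as opaque values.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any


@dataclass
class LookupRequest:
    text: str
    mode: str
    context: str
    sourceApp: str
    userLanguage: str

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII text readable on the wire
        return json.dumps(asdict(self), ensure_ascii=False)


@dataclass
class LookupResponse:
    type: str
    definition_en: str
    definition_zh: str
    examples: Any        # shape elided in the example above
    pronunciation: Any   # shape elided in the example above
    register: str
    l1_tips: str


def parse_response(raw: str) -> LookupResponse:
    """Parse a JSON response and fail loudly if a documented field is absent."""
    data = json.loads(raw)
    names = [f.name for f in fields(LookupResponse)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Unknown extra fields are ignored rather than rejected.
    return LookupResponse(**{name: data[name] for name in names})
```

Validating on arrival means a partial or malformed model response surfaces as one clear error instead of a blank panel later in rendering.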

:::note Why the Tiered Design

• 95% of common vocabulary hits the first three tiers, returning in <100ms
• The remaining 5% (rare or new terms) requires the Claude API, but still takes only 0.5-2 seconds
• Without a network, the local database and cache remain available
• Saves user quota: Claude API pricing is usage-based, and the tiered system reduces API calls by 95%

:::
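The "return on first hit" behavior of the tiers can be sketched as an ordered list of lookup functions. This is an illustration only: the tier names, the in-memory dictionaries, and `remote_lookup` are hypothetical stand-ins for the SQLite databases, the SwiftData cache, and the Claude API call.

```python
from typing import Callable, List, Optional, Tuple

# A tier is a (name, lookup function) pair; the function returns None on a miss.
Tier = Tuple[str, Callable[[str], Optional[dict]]]


def tiered_lookup(text: str, tiers: List[Tier]) -> Optional[Tuple[str, dict]]:
    """Try each tier in priority order and return as soon as one hits."""
    for name, lookup in tiers:
        result = lookup(text)
        if result is not None:
            return name, result
    return None  # every tier missed (for example, offline and not cached)


# Hypothetical stand-ins for the four tiers.
tech_terms = {"idempotent": {"definition_en": "safe to repeat"}}
phrases = {"yak shaving": {"definition_en": "a chain of tangential tasks"}}
cache: dict = {}


def remote_lookup(text: str) -> Optional[dict]:
    # Placeholder for the network call to the backend.
    result = {"definition_en": f"(remote definition of {text!r})"}
    cache[text] = result  # store the answer so the cache tier serves it next time
    return result


tiers: List[Tier] = [
    ("local-terms", tech_terms.get),
    ("local-phrases", phrases.get),
    ("cache", cache.get),
    ("remote", remote_lookup),
]
```

With this setup, the first lookup of an unknown phrase is answered by the `remote` tier and the second by the `cache` tier, mirroring the "next lookup <10ms" example above.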

Backend Architecture

The DevLingo backend is deployed on Cloudflare Workers edge nodes to handle API requests.

Mac App
  ↓ HTTPS (Bearer token)
Cloudflare Workers (Edge)
  ├─ API Gateway (Hono router)
  ├─ Auth Middleware (JWT verification)
  ├─ Lookup Endpoint (/api/lookup)
  │   └─ Claude API Client (proxies AI requests)
  ├─ TTS Endpoint (/api/tts)
  │   └─ Google Cloud TTS (proxies pronunciation synthesis)
  └─ Data Sync Endpoints
      └─ Cloudflare D1 (user database)

Key Components

Hono Framework: Lightweight HTTP framework optimized for Cloudflare Workers
D1 Database: SQLite on Cloudflare, stores user Word Book and sync data
KV Store: Session tokens, rate limiting cache, API response cache
Claude API Proxy: The Mac app sends requests to Claude through Workers, avoiding local API key exposure
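The request path through the lookup endpoint (authenticate, check the response cache, then forward upstream) might look roughly like the sketch below. It is written in Python for consistency with the pseudocode on this page, whereas the real Worker runs on Hono. `LookupProxy`, `verify_token`, and `call_model` are hypothetical names, the dictionary stands in for a KV namespace, and the cache key deliberately ignores context as a simplification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class LookupProxy:
    verify_token: Callable[[str], Optional[str]]  # returns a user id, or None if invalid
    call_model: Callable[[dict], dict]            # server-side client; the API key stays here
    response_cache: Dict[str, dict] = field(default_factory=dict)  # stand-in for KV

    def handle(self, bearer_token: str, payload: dict) -> Tuple[int, dict]:
        # 1. Auth: reject requests without a valid token before doing any work.
        if self.verify_token(bearer_token) is None:
            return 401, {"error": "unauthorized"}

        # 2. Cache: key on mode plus normalized text (simplification: ignores context).
        key = f"{payload['mode']}:{payload['text'].strip().lower()}"
        cached = self.response_cache.get(key)
        if cached is not None:
            return 200, cached

        # 3. Upstream: forward to the model and remember the answer.
        result = self.call_model(payload)
        self.response_cache[key] = result
        return 200, result
```

Because the client only ever holds a bearer token, the upstream API key lives solely inside `call_model` on the server side.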

Text Extraction & Context

TextExtractor uses AXUIElement (the macOS Accessibility API) to capture the selected text and its surroundings. For example:

User selects "idempotent" in Slack

Extracted result:
{
  "selectedText": "idempotent",
  "beforeContext": "We need to make sure this endpoint is ",
  "afterContext": " for retry requests.",
  "sourceApp": "com.tinyspeck.slackmacgap",  // Slack
  "fullSentence": "We need to make sure this endpoint is idempotent for retry requests."
}

This context is sent to Claude so the AI can tell that “idempotent” here refers to an API design concept rather than another sense of the word.
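Once the full text and the selection range are known, slicing out the context is straightforward. The sketch below assumes the selection is given as character offsets into the surrounding text; the function name `extract_context` and the `window` parameter are hypothetical, and the Accessibility API calls themselves are out of scope here.

```python
def extract_context(full_text: str, start: int, end: int, window: int = 50) -> dict:
    """Return the selection plus up to `window` characters on each side."""
    if not (0 <= start <= end <= len(full_text)):
        raise ValueError("selection range is outside the text")
    return {
        "selectedText": full_text[start:end],
        # max(0, ...) keeps the slice from wrapping around near the start of the text
        "beforeContext": full_text[max(0, start - window):start],
        # slicing past the end of a string is safe in Python and simply truncates
        "afterContext": full_text[end:end + window],
    }
```

Applied to the Slack sentence above with "idempotent" selected, this yields the same `beforeContext` and `afterContext` values as the extracted result shown, because both sides are shorter than the 50-character window.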

Input Type Detection Algorithm

if text contains Chinese:
  → Express mode
elif word count == 1:
  → Word mode
elif word count in [2, 3, 4]:
  if contains full sentence structure:
    → Sentence mode
  else:
    → Phrase mode
elif sentence count > 1 or word count > 20:
  → Paragraph mode
else:
  → Sentence mode

:::tip Smart Detection

Input detection happens locally and instantly, with no API call, so the user never notices a delay from the “determine input type” step.

:::
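A runnable version of the detection rules could look like the following sketch. The pseudocode does not specify how "contains Chinese" or "full sentence structure" are decided, so the checks below are assumptions: a basic CJK code point range, and a capitalized start plus terminal punctuation. `detect_mode` and `looks_like_sentence` are hypothetical names, and callers are assumed to pass non-empty text.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")           # CJK Unified Ideographs (basic block only)
SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")  # terminal punctuation, then whitespace or end


def looks_like_sentence(text: str) -> bool:
    # Assumption: a capitalized first character plus terminal punctuation marks a sentence.
    stripped = text.strip()
    return bool(stripped) and stripped[0].isupper() and stripped[-1] in ".!?"


def detect_mode(text: str) -> str:
    if CJK.search(text):
        return "express"
    words = text.split()
    sentences = [s for s in SENTENCE_END.split(text.strip()) if s.strip()]
    if len(words) == 1:
        return "word"
    if 2 <= len(words) <= 4:
        return "sentence" if looks_like_sentence(text) else "phrase"
    if len(sentences) > 1 or len(words) > 20:
        return "paragraph"
    return "sentence"
```

As in the pseudocode, the 2-4 word check runs before the sentence count, so a very short two-sentence selection is classified as a sentence rather than a paragraph.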

Floating Panel Presentation

FloatingPanelController creates an NSPanel (floating window):

Properties:
• Appears above all apps (Level: screenSaver)
• Position: directly below the cursor (y offset +20px to avoid obstruction)
• Initial state: skeleton screen (loading placeholder)
• Data arrival: fade in, destroy skeleton
• Interaction: does not block background apps (mouse events outside the panel's controls pass through)
• Close methods: click outside, press ESC, auto-close after 5 minutes of inactivity

Performance Targets

Local database lookup: <50ms (99th percentile)
Cache hit: <10ms
Full end-to-end (including API): <2s (99th percentile)
Floating window appearance: <300ms (perceived as fast by users)
TTS audio generation: <2s on first request, <100ms for cached audio

:::note Caching Strategy

TTS audio is also cached locally and in KV; pronunciation for the same word is only synthesized once.

:::
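Targets stated at the 99th percentile can be checked against recorded latencies with a small helper. The sketch below uses the nearest-rank definition of a percentile, which is one common choice and an assumption here, since the page does not say how its percentiles are measured. `percentile`, `meets_target`, and the `P99_TARGETS_MS` keys are hypothetical names.

```python
import math
from typing import List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Targets in milliseconds, taken from the 99th-percentile entries in the list above.
P99_TARGETS_MS = {"local_lookup": 50, "end_to_end": 2000}


def meets_target(name: str, samples_ms: List[float]) -> bool:
    return percentile(samples_ms, 99) < P99_TARGETS_MS[name]
```

With 100 samples, a single slow outlier still passes a 99th-percentile target, while two outliers fail it, which is the practical difference between a percentile target and a hard maximum.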

Data Privacy

User text
  ↓
Mac App (HTTPS encrypted)
  ↓
Cloudflare Edge (instant processing)
  ├─ Claude API request (ephemeral, not used for training)
  └─ Result cached in KV (auto-expires)
  ↓
Result returned to Mac
  ↓
Local SwiftData storage (on-device, optional cloud sync to D1)

Text content is not stored unless the user explicitly saves it to the Word Book. See Data Privacy & Security for details.

DevLingo’s architecture philosophy: “Speed first, AI as a complement, privacy above all.”