How It Works - Architecture Overview
At the core of DevLingo is an intelligent “tiered lookup” system: it starts with fast local lookups and escalates to AI reasoning only when needed. As a result, most lookups complete in milliseconds, and even complex ones are targeted to finish within 2 seconds.
Complete Workflow
User selects text in any app
  ↓
[Trigger] Press ⌘⇧D hotkey
  ↓
[Extract] TextExtractor captures:
  • The selected text itself
  • 50 characters of surrounding context (for AI to understand context)
  • Current app's bundleIdentifier (Xcode / Slack / GitHub, etc.)
  ↓
[Classify] InputTypeDetector determines input type:
  • Chinese → Express mode
  • Single word → Word mode
  • 2-4 words with no sentence structure → Phrase mode
  • Complete sentence ≤20 words → Sentence mode
  • Multiple sentences or >20 words → Paragraph mode
  ↓
[Lookup] LookupCoordinator performs tiered lookup (details below)
  ↓
[Render] FloatingPanelController:
  • Pops up an NSPanel floating window below the cursor
  • Displays a skeleton screen (loading animation)
  • Selects the appropriate View subclass based on mode (WordView / PhraseView / SentenceView, etc.)
  • When data arrives, smoothly fades in the results
  ↓
[Interact] User can:
  • Click the play button to hear pronunciation
  • Click a word for a recursive lookup
  • Save to "Word Book" (SwiftData)
  • Close the floating window and continue working

Tiered Lookup System
This is the key to DevLingo’s performance. Lookups are performed in priority order, returning as soon as a result is found:
Tier 1: Local Technical Vocabulary (<50ms)
For English input, the local SQLite database is queried first. It contains 85+ pre-loaded common development terms:
  idempotent, deployment, microservice, containerization,
  latency, throughput, cache invalidation, API gateway,
  circuit breaker, distributed tracing, ...
This tier answers roughly 30% of developer vocabulary queries, with the fastest response of the two local databases.
Tier 2: Local Development Phrase Database (<100ms)
For 2-4 word phrases, the local phrase database is queried (50+ pre-loaded phrases):
  yak shaving, bikeshedding, rubber ducking, code smell,
  low-hanging fruit, technical debt, nerd sniping, ...
This tier answers roughly 20% of phrase queries.
Tier 3: SwiftData Cache (<10ms)
Words, phrases, and sentences the user has previously looked up are cached locally in SwiftData. If found, results are returned immediately.
  User looked up "idempotent" → next lookup <10ms
The hit rate is roughly 50%, depending on usage history.
Tier 4: Claude API (0.5-2s)
If the first three tiers all miss, the Claude API is called for a structured response.
Request:
{
  "text": "gracefully degrade",
  "mode": "phrase",
  "context": "when upstream dependencies are unavailable",
  "sourceApp": "Xcode",
  "userLanguage": "zh-CN"
}
Response:
{
  "type": "phrase", // compound verb
  "definition_en": "...",
  "definition_zh": "...",
  "examples": [...],
  "pronunciation": {...},
  "register": "technical",
  "l1_tips": "..."
}
:::note Why the Tiered Design
- Roughly 95% of common vocabulary lookups are answered by the first three tiers, returning in <100ms
- The remaining 5% or so (rare or new terms) require the Claude API, but still take only 0.5-2 seconds
- Without a network connection, the local databases and cache remain available
- Saves user quota: Claude API pricing is usage-based, and the tiered system reduces API calls by roughly 95%
:::
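The priority order described above amounts to a first-hit loop over the tiers. The TypeScript below is a minimal sketch of that control flow only, not DevLingo's actual LookupCoordinator. The names `Tier`, `dictionaryTier`, and `tieredLookup` are hypothetical, plain objects stand in for SQLite and SwiftData, and the tiers are synchronous for brevity, whereas a real Claude API tier would be asynchronous.

```typescript
// All names below are hypothetical; they are not taken from the DevLingo codebase.
type Mode = "word" | "phrase" | "sentence" | "paragraph" | "express";

interface LookupResult {
  definition: string;
  tier: string; // name of the tier that produced the answer
}

// A tier returns null on a miss so the coordinator can fall through.
// Real tiers (SQLite, SwiftData, the Claude API) would be asynchronous;
// synchronous functions keep the control flow easy to follow here.
interface Tier {
  name: string;
  lookup(text: string, mode: Mode): string | null;
}

function normalize(text: string): string {
  return text.trim().toLowerCase();
}

// Wrap a plain dictionary as a tier that only answers for the given modes.
function dictionaryTier(name: string, entries: Record<string, string>, modes: Mode[]): Tier {
  return {
    name,
    lookup(text, mode) {
      if (modes.indexOf(mode) === -1) return null;
      const key = normalize(text);
      const found = Object.prototype.hasOwnProperty.call(entries, key);
      return found ? entries[key] : null;
    },
  };
}

// Walk the tiers in priority order and stop at the first hit.
function tieredLookup(text: string, mode: Mode, tiers: Tier[]): LookupResult | null {
  for (const tier of tiers) {
    const definition = tier.lookup(text, mode);
    if (definition !== null) return { definition, tier: tier.name };
  }
  return null;
}

// Example wiring that mirrors the four tiers in this document.
const demoTiers: Tier[] = [
  dictionaryTier("local-vocabulary", { idempotent: "safe to repeat" }, ["word"]),
  dictionaryTier("local-phrases", { "yak shaving": "a chain of side tasks" }, ["phrase"]),
  dictionaryTier("cache", {}, ["word", "phrase", "sentence"]),
  // Stand-in for the Claude API tier: it always answers.
  { name: "remote", lookup: (text) => `AI explanation of "${text}"` },
];
```

Because each tier reports a miss as `null`, adding, removing, or reordering tiers does not change the coordinator, which is one plausible reason to structure the lookup this way.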
Backend Architecture
The DevLingo backend is deployed on Cloudflare Workers edge nodes to handle API requests.
Mac App
  ↓ HTTPS (Bearer token)
Cloudflare Workers (Edge)
  ├─ API Gateway (Hono router)
  ├─ Auth Middleware (JWT verification)
  ├─ Lookup Endpoint (/api/lookup)
  │   └─ Claude API Client (proxies AI requests)
  ├─ TTS Endpoint (/api/tts)
  │   └─ Google Cloud TTS (proxies pronunciation synthesis)
  └─ Data Sync Endpoints
      └─ Cloudflare D1 (user database)

Key Components
- Hono Framework: Lightweight HTTP framework optimized for Cloudflare Workers
- D1 Database: SQLite on Cloudflare, stores user Word Book and sync data
- KV Store: Session tokens, rate limiting cache, API response cache
- Claude API Proxy: The Mac app sends requests to Claude through Workers, avoiding local API key exposure
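To illustrate why proxying keeps the API key off the Mac, here is a hedged sketch of how a Worker might turn a lookup request into an upstream call to Anthropic's Messages API. The endpoint URL and the `x-api-key` and `anthropic-version` headers belong to the public Messages API; everything else (the function name `buildClaudeRequest`, the prompt wording, the `max_tokens` value, and passing the model name as a parameter) is an assumption and not DevLingo's actual code.

```typescript
// Request fields mirror the lookup request example in this document.
interface LookupRequest {
  text: string;
  mode: string;
  context: string;
  sourceApp: string;
  userLanguage: string;
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the upstream Claude call on the Worker. The API key would come from a
// Worker secret, so it is never shipped inside the Mac app.
// The prompt wording and max_tokens value are assumptions for illustration.
function buildClaudeRequest(req: LookupRequest, apiKey: string, model: string): UpstreamRequest {
  const prompt =
    `Explain the ${req.mode} "${req.text}" for a ${req.userLanguage} speaker. ` +
    `Surrounding context: "${req.context}". Source app: ${req.sourceApp}. ` +
    `Reply with JSON only.`;
  return {
    url: "https://api.anthropic.com/v1/messages",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model,
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

The Mac app only ever sends its own Bearer token to the Worker; the function above is the single place where the Claude key is attached.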
Text Extraction & Context
TextExtractor uses AXUIElement (the macOS Accessibility API) to capture the selection together with its surroundings. For example:
User selects "idempotent" in Slack
Extracted result:
{
  "selectedText": "idempotent",
  "beforeContext": "We need to make sure this endpoint is ",
  "afterContext": " for retry requests.",
  "sourceApp": "com.tinyspeck.slackmacgap", // Slack
  "fullSentence": "We need to make sure this endpoint is idempotent for retry requests."
}
This context is sent to Claude so the AI can tell that “idempotent” here refers to an API design concept rather than some other sense of the word.
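The before/after split shown above can be reproduced with plain string slicing once the selection range is known. The sketch below assumes the full text and the selection offsets are already available (in the app they would come from the Accessibility API); `extractContext` and its parameters are hypothetical names, and the 50-character radius is the figure given in the workflow section.

```typescript
interface ExtractedContext {
  selectedText: string;
  beforeContext: string;
  afterContext: string;
}

// Slice up to `radius` characters on each side of the selection.
// `start` and `end` are UTF-16 offsets into `fullText` (end is exclusive).
function extractContext(
  fullText: string,
  start: number,
  end: number,
  radius: number = 50,
): ExtractedContext {
  if (start < 0 || end > fullText.length || start > end) {
    throw new RangeError("selection is outside the text");
  }
  return {
    selectedText: fullText.slice(start, end),
    // Math.max keeps the window from running past the beginning of the text.
    beforeContext: fullText.slice(Math.max(0, start - radius), start),
    // slice() already clamps at the end of the text.
    afterContext: fullText.slice(end, end + radius),
  };
}
```

Offsets are counted in UTF-16 code units, so a production version would need extra care not to cut through a surrogate pair (for example an emoji) at the window edges.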
Input Type Detection Algorithm
if text contains Chinese:
    → Express mode
elif word count == 1:
    → Word mode
elif word count in [2, 3, 4]:
    if contains full sentence structure:
        → Sentence mode
    else:
        → Phrase mode
elif sentence count > 1 or word count > 20:
    → Paragraph mode
else:
    → Sentence mode
:::tip Smart Detection
Input detection happens locally and instantly, with no API call. This ensures the user never notices any delay from the “determine input type” step.
:::
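The decision order above translates almost line for line into code. The following TypeScript is a sketch of that logic rather than the app's InputTypeDetector. Two parts are assumptions because the document does not define them: "contains Chinese" is approximated by the CJK Unified Ideographs range, and "full sentence structure" is approximated by terminal punctuation.

```typescript
type InputMode = "express" | "word" | "phrase" | "sentence" | "paragraph";

// CJK Unified Ideographs block; a simplification of "contains Chinese".
const CHINESE = /[\u4e00-\u9fff]/;

// Heuristic stand-in for "full sentence structure": terminal punctuation.
// The document does not say how the real detector decides this.
function looksLikeSentence(text: string): boolean {
  return /[.!?]$/.test(text);
}

// Count chunks separated by sentence-ending punctuation followed by
// whitespace or the end of the text, so "v1.2" is not split.
function countSentences(text: string): number {
  return text.split(/[.!?]+(?:\s+|$)/).filter((part) => part.trim() !== "").length;
}

// Mirrors the branch order of the pseudocode above.
function detectInputMode(raw: string): InputMode {
  const text = raw.trim();
  if (CHINESE.test(text)) return "express";
  const words = text.split(/\s+/).filter((word) => word !== "");
  if (words.length === 1) return "word";
  if (words.length >= 2 && words.length <= 4) {
    return looksLikeSentence(text) ? "sentence" : "phrase";
  }
  if (countSentences(text) > 1 || words.length > 20) return "paragraph";
  return "sentence";
}
```

Everything here is string inspection with no I/O, which is consistent with the point that classification adds no noticeable delay.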
Floating Panel Presentation
FloatingPanelController creates an NSPanel (floating window):
Properties:
  • Appears above all apps (Level: screenSaver)
  • Position: directly below the cursor (y offset +20px to avoid obstruction)
  • Initial state: skeleton screen (loading skeleton)
  • Data arrival: fade in, destroy skeleton
  • Interaction: fully passes through mouse events (doesn't block background apps)
  • Close methods: click outside, press ESC, auto-close after 5 minutes of inactivity

Performance Targets
- Local database lookup: <50ms (99th percentile)
- Cache hit: <10ms
- Full end-to-end (including API): <2s (99th percentile)
- Floating window appearance: <300ms (perceived as fast by users)
- TTS audio generation: <2s on the first request, <100ms for subsequent cached requests
:::note Caching Strategy
TTS audio is also cached locally and in KV; pronunciation for the same word is only synthesized once.
:::
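One way to get "synthesized only once" behaviour is a two-level read-through cache: check the local store, then the shared store, and call the synthesizer only when both miss. The sketch below is an assumption about how that could look, not DevLingo's implementation. In-memory objects stand in for the on-device cache and for Cloudflare KV, audio is represented as a string, and `AudioStore`, `ttsCacheKey`, and `getPronunciation` are hypothetical names.

```typescript
// Hypothetical store interface; in-memory objects stand in for the local
// cache and for Cloudflare KV.
interface AudioStore {
  get(key: string): string | undefined;
  set(key: string, audio: string): void;
}

function memoryStore(): AudioStore {
  const data: Record<string, string> = {};
  return {
    get: (key) => (Object.prototype.hasOwnProperty.call(data, key) ? data[key] : undefined),
    set: (key, audio) => {
      data[key] = audio;
    },
  };
}

// The same word and voice must map to the same key, regardless of case or padding.
function ttsCacheKey(word: string, voice: string): string {
  return `tts:${voice}:${word.trim().toLowerCase()}`;
}

// Read through the local store, then the shared store, and synthesize
// only when both miss. Results are written back to both levels.
function getPronunciation(
  word: string,
  voice: string,
  local: AudioStore,
  shared: AudioStore,
  synthesize: (word: string, voice: string) => string,
): string {
  const key = ttsCacheKey(word, voice);
  const localHit = local.get(key);
  if (localHit !== undefined) return localHit;
  let audio = shared.get(key);
  if (audio === undefined) {
    audio = synthesize(word, voice);
    shared.set(key, audio);
  }
  local.set(key, audio);
  return audio;
}
```

In the architecture described here the two levels live on different machines (the Mac and the Worker), so a real version would split this function across the client and the `/api/tts` endpoint.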
Data Privacy
User text
  ↓
Mac App (HTTPS encrypted)
  ↓
Cloudflare Edge (instant processing)
  ├─ Claude API request (ephemeral, not used for training)
  └─ Result cached in KV (auto-expires)
  ↓
Result returned to Mac
  ↓
Local SwiftData storage (on-device, optional cloud sync to D1)
Text content is not stored unless the user explicitly saves it to the Word Book. See Data Privacy & Security for details.
DevLingo’s architecture philosophy: “Speed first, AI as a complement, privacy above all.”