diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md deleted file mode 100644 index faf19fa..0000000 --- a/ARCHITECTURE.md +++ /dev/null @@ -1,432 +0,0 @@ -# wttr.in Architecture Documentation - -## System Overview - -wttr.in is a console-oriented weather forecast service with a hybrid Python/Go architecture: -- **Go proxy layer** (cmd/): LRU caching proxy with prefetching -- **Python backend** (lib/, bin/): Weather data fetching, formatting, rendering -- **Static assets** (share/): Translations, templates, emoji, help files - -## Request Flow - -``` -Client Request - ↓ -Go Proxy (port 8082) - LRU cache + prefetch - ↓ (cache miss) -Python Backend (port 8002) - Flask/gevent - ↓ -Location Resolution (GeoIP/IP2Location/IPInfo) - ↓ -Weather API (met.no or WorldWeatherOnline) - ↓ -Format & Render (ANSI/HTML/PNG/JSON/Prometheus) - ↓ -Response (cached with TTL 1000-2000s) -``` - -## Component Breakdown - -### 1. Go Proxy Layer (cmd/) - -**Files:** -- `cmd/srv.go` - Main HTTP server (port 8082) -- `cmd/processRequest.go` - Request processing & caching logic -- `cmd/peakHandling.go` - Peak time prefetching (cron-based) - -**Responsibilities:** -- LRU cache (12,800 entries, 1000-1500s TTL) -- Cache key: `UserAgent:Host+URI:ClientIP:AcceptLanguage` -- Prefetch popular requests at :24 and :54 past the hour -- Forward cache misses to Python backend (127.0.0.1:8002) -- Handle concurrent requests (InProgress flag prevents thundering herd) - -**Key Logic:** -- `dontCache()`: Skip caching for cyclic requests (location contains `:`) -- `getCacheDigest()`: Generate cache key from request metadata -- `processRequest()`: Main request handler with cache-aside pattern -- `savePeakRequest()`: Record requests at :30 and :00 for prefetching - -### 2. 
Python Backend (bin/, lib/) - -#### Entry Points (bin/) - -**bin/srv.py** - Main Flask application -- Listens on port 8002 (configurable via WTTRIN_SRV_PORT) -- Routes: `/`, `/{location}`, `/files/`, `/favicon.ico` -- Uses gevent WSGI server for async I/O -- Delegates to `wttr_srv.wttr()` for all weather requests - -**bin/proxy.py** - Weather API proxy (separate service) -- Caches weather API responses -- Transforms met.no/WWO data to standard JSON -- Test mode support (WTTRIN_TEST env var) -- Handles translations for weather conditions - -**bin/geo-proxy.py** - Geolocation service proxy -- Not examined in detail (separate microservice) - -#### Core Logic (lib/) - -**lib/wttr_srv.py** - Main request handler -- `wttr(location, request)` - Entry point for all weather queries -- `parse_request()` - Parse location, language, format from request -- `_response()` - Generate response (checks cache, calls renderers) -- Rate limiting (300/min, 3600/hour, 24*3600/day per IP) -- ThreadPool (25 workers) for PNG rendering -- Two-phase processing: fast path (cache/static) then full path - -**lib/parse_query.py** - Query string parsing -- Parse single-letter options: `n`=narrow, `m`=metric, `u`=imperial, `T`=no-terminal, etc. 
-- Parse PNG filenames: `City_200x_lang=ru.png` → structured dict -- Serialize/deserialize query state (base64+zlib for short URLs) -- Metric vs imperial logic (US IPs default to imperial) - -**lib/location.py** - Location resolution -- `location_processing()` - Main entry point -- IP → Location: GeoIP2 (MaxMind), IP2Location API, IPInfo API -- Location normalization (lowercase, strip special chars) -- Geolocator service (localhost:8004) for GPS coordinates -- IATA airport code support -- Alias resolution (share/aliases file) -- Blacklist checking (share/blacklist file) -- Hemisphere detection for moon phases -- Special prefixes: - - `~` = search term (use geolocator) - - `@` = domain name (resolve to IP first) - - No prefix = exact location name - -**lib/globals.py** - Configuration -- Environment variables: WTTR_MYDIR, WTTR_GEOLITE, WTTR_WEGO, etc. -- File paths: cache dirs, static files, translations -- API keys: IP2Location, IPInfo, WorldWeatherOnline -- Constants: NOT_FOUND_LOCATION, PLAIN_TEXT_AGENTS, QUERY_LIMITS -- IP location order: geoip → ip2location → ipinfo - -**lib/cache.py** - LRU cache (Python side) -- In-memory LRU (10,000 entries, pylru) -- File cache for large responses (>80 bytes) -- TTL: 1000-2000s (randomized) -- Cache key: `UserAgent:QueryString:ClientIP:Lang` -- Dynamic timestamp replacement: `%{{NOW(timezone)}}` - -**lib/limits.py** - Rate limiting -- Per-IP query limits (minute/hour/day buckets) -- Whitelist support -- Returns 429 on limit exceeded - -#### View Renderers (lib/view/) - -**lib/view/wttr.py** - Main weather view -- Calls `wego` (Go binary) for weather rendering -- Passes flags: -inverse, -wind_in_ms, -narrow, -lang, -imperial -- Post-processes output (location name, formatting) -- Converts to HTML if needed - -**lib/view/line.py** - One-line format -- Formats: 1, 2, 3, 4, or custom with % notation -- Custom format codes: %c=condition, %t=temp, %h=humidity, %w=wind, etc. 
-- Supports multiple locations (`:` separated) - -**lib/view/v2.py** - Data-rich v2 format -- Experimental format with more detail -- Moon phase, astronomical times, temperature graphs -- Terminal-only, English-only - -**lib/view/moon.py** - Moon phase view -- Uses `pyphoon-lolcat` for rendering -- Supports date selection: `Moon@2016-12-25` - -**lib/view/prometheus.py** - Prometheus metrics -- Exports weather data as Prometheus metrics -- Format: `p1` - -#### Formatters (lib/fmt/) - -**lib/fmt/png.py** - PNG rendering -- Converts ANSI terminal output to PNG images -- Uses pyte (terminal emulator) + PIL -- Transparency support -- Font rendering - -**lib/fmt/unicodedata2.py** - Unicode handling -- Character width calculations for terminal rendering - -#### Other Modules (lib/) - -**lib/translations.py** - i18n support -- 54 languages supported -- Weather condition translations -- Help file translations (share/translations/) -- Language detection from Accept-Language header - -**lib/constants.py** - Weather constants -- Weather codes (WWO API) -- Condition mappings -- Emoji mappings - -**lib/buttons.py** - HTML UI elements -- Add interactive buttons to HTML output - -**lib/fields.py** - Data field extraction -- Parse weather API responses - -**lib/weather_data.py** - Weather data structures - -**lib/airports.py** - IATA code handling - -**lib/metno.py** - met.no API client -- Norwegian Meteorological Institute API -- Transforms to standard JSON format - -### 3. 
Static Assets (share/) - -**share/translations/** - 54 language files -- Format: `{lang}.txt` (weather conditions) -- Format: `{lang}-help.txt` (help pages) - -**share/emoji/** - Weather emoji PNGs -- Used for PNG rendering - -**share/static/** - Web assets -- favicon.ico -- style.css -- example images - -**share/templates/** - Jinja2 templates -- index.html (HTML output wrapper) - -**share/** - Data files -- `aliases` - Location aliases (from:to format) -- `blacklist` - Blocked locations -- `list-of-iata-codes.txt` - Airport codes -- `help.txt` - English help -- `bash-function.txt` - Shell integration -- `translation.txt` - Translation info page - -## API Endpoints - -### Weather Queries - -- `GET /` - Weather for IP-based location -- `GET /{location}` - Weather for specific location -- `GET /{location}.png` - PNG image output -- `GET /{location}?{options}` - Weather with options - -### Special Pages - -- `GET /:help` - Help page -- `GET /:bash.function` - Shell function -- `GET /:translation` - Translation info -- `GET /:iterm2` - iTerm2 integration - -### Static Files - -- `GET /files/{path}` - Static assets -- `GET /favicon.ico` - Favicon - -## Query Parameters - -### Single-letter Options (combined in query string) - -- `A` - Force ANSI output -- `n` - Narrow output -- `m` - Metric units -- `M` - m/s for wind speed -- `u` - Imperial units -- `I` - Inverted colors -- `t` - Transparency (PNG) -- `T` - No terminal sequences -- `p` - Padding -- `0-3` - Number of days -- `q` - No caption -- `Q` - No city name -- `F` - No follow line - -### Named Parameters - -- `lang={code}` - Language override -- `format={fmt}` - Output format (1-4, v2, j1, p1, custom) -- `view={view}` - View type (alias for format) -- `period={sec}` - Update interval for cyclic locations - -### PNG Filename Format - -`{location}_{width}x{height}_{options}_lang={lang}.png` - -Example: `London_200x_t_lang=ru.png` - -## Output Formats - -1. **ANSI** - Terminal with colors/formatting -2. 
**Plain text** - No ANSI codes (T option) -3. **HTML** - Web browser output -4. **PNG** - Image file -5. **JSON** (j1) - Machine-readable data -6. **Prometheus** (p1) - Metrics format -7. **One-line** (1-4) - Compact formats -8. **v2** - Data-rich experimental format - -## External Dependencies - -### Weather APIs - -- **met.no** (Norwegian Meteorological Institute) - Primary, free -- **WorldWeatherOnline** - Fallback, requires API key - -### Geolocation - -- **GeoLite2** (MaxMind) - Free GeoIP database (required) -- **IP2Location** - Commercial API (optional, needs key) -- **IPInfo** - Commercial API (optional, needs key) -- **Geolocator service** - localhost:8004 (GPS coordinates) - -### External Binaries - -- **wego** (we-lang) - Go weather rendering binary -- **pyphoon-lolcat** - Moon phase rendering - -### Python Libraries - -- Flask - Web framework -- gevent - Async I/O -- geoip2 - GeoIP lookups -- geopy - Geocoding -- requests - HTTP client -- PIL - Image processing -- pyte - Terminal emulator -- pytz - Timezone handling -- pylru - LRU cache - -### Go Libraries - -- github.com/hashicorp/golang-lru - LRU cache -- github.com/robfig/cron - Cron scheduler - -## Configuration - -### Environment Variables - -- `WTTR_MYDIR` - Installation directory -- `WTTR_GEOLITE` - Path to GeoLite2-City.mmdb -- `WTTR_WEGO` - Path to wego binary -- `WTTR_LISTEN_HOST` - Bind address (default: "") -- `WTTR_LISTEN_PORT` - Port (default: 8002) -- `WTTR_USER_AGENT` - Custom user agent -- `WTTR_IPLOCATION_ORDER` - IP location method order -- `WTTRIN_SRV_PORT` - Override listen port -- `WTTRIN_TEST` - Enable test mode - -### API Key Files - -- `~/.wwo.key` - WorldWeatherOnline API key -- `~/.ip2location.key` - IP2Location API key -- `~/.ipinfo.key` - IPInfo token -- `~/.wegorc` - Wego configuration (JSON) - -### Data Directories - -- `/wttr.in/cache/ip2l/` - IP location cache -- `/wttr.in/cache/png/` - PNG cache -- `/wttr.in/cache/lru/` - LRU file cache -- `/wttr.in/cache/proxy-wwo/` - 
Weather API cache -- `/wttr.in/log/` - Log files - -## Caching Strategy - -### Three-tier Cache - -1. **Go LRU** (12,800 entries, 1000-1500s TTL) - - In-memory, fastest - - Full HTTP responses - - Shared across all requests - -2. **Python LRU** (10,000 entries, 1000-2000s TTL) - - In-memory for small responses (<80 bytes) - - File-backed for large responses - - Per-process cache - -3. **File Cache** - - IP location cache (persistent) - - Weather API cache (persistent) - - PNG cache (persistent) - -### Cache Keys - -- Go: `UserAgent:Host+URI:ClientIP:AcceptLanguage` -- Python: `UserAgent:QueryString:ClientIP:Lang` - -### Cache Invalidation - -- TTL-based expiration (no manual invalidation) -- Randomized TTL prevents thundering herd -- Non-cacheable: cyclic requests (location contains `:`) - -## Prefetching - -- Cron jobs at :24 and :54 past the hour -- Records popular requests at :30 and :00 -- Spreads prefetch over 5 minutes (300s) -- Prevents cache expiry during peak times - -## Rate Limiting - -- Per-IP limits: 300/min, 3600/hour, 24*3600/day -- Whitelist support (MY_EXTERNAL_IP) -- Returns HTTP 429 on limit exceeded -- Implemented in Python layer only - -## Error Handling - -- Location not found → "not found" location (fallback weather) -- API errors → 503 Service Unavailable -- Malformed requests → 500 Internal Server Error (HTML) or error message (text) -- Blocked locations → 403 Forbidden -- Rate limit → 429 Too Many Requests - -## Logging - -- Main log: `/wttr.in/log/main.log` -- Debug log: `/tmp/wttr.in-debug.log` -- Go proxy logs to stdout - -## Testing - -- No unit tests -- Integration test: `test/query.sh` - - Makes HTTP requests to running server - - Compares SHA1 hashes of responses - - Test data in `test/test-data/signatures` -- CI: flake8 linting only (no actual tests run) - -## Known Issues & Limitations - -1. No unit test coverage -2. v2 format is experimental (terminal-only, English-only) -3. Moon phase Unicode ambiguity (hemisphere-dependent) -4. 
Hardcoded IP whitelist (MY_EXTERNAL_IP) -5. Multiple cache layers with different keys -6. Mixed Python/Go codebase -7. External binary dependencies (wego, pyphoon) -8. Requires external geolocator service (port 8004) -9. File cache grows unbounded -10. No cache warming on startup - -## Performance Characteristics - -- Go proxy handles ~12,800 cached requests in memory -- Python backend spawns 25 threads for PNG rendering -- Gevent provides async I/O for Python -- Prefetching reduces latency during peak times -- File cache avoids memory pressure for large responses -- Rate limiting prevents abuse - -## Security Considerations - -- IP-based rate limiting -- Location blacklist -- No authentication required -- User-provided location names passed to external APIs -- File cache uses MD5 hashes (not cryptographic) -- No input sanitization for location names -- Trusts X-Forwarded-For header diff --git a/DATA_FLOW.md b/DATA_FLOW.md deleted file mode 100644 index 461f2c3..0000000 --- a/DATA_FLOW.md +++ /dev/null @@ -1,641 +0,0 @@ -# wttr.in Data Flow Documentation - -## Request Processing Flow - -### 1. Initial Request - -``` -Client → Go Proxy (port 8082) -``` - -**Input:** -- HTTP request with location in URL -- Headers: User-Agent, Accept-Language, X-Forwarded-For -- Query parameters - -**Go Proxy Actions:** -1. Extract cache key: `UserAgent:Host+URI:ClientIP:AcceptLanguage` -2. Check if request is cacheable (no `:` in location) -3. Look up in LRU cache (12,800 entries) - -**Cache Hit Path:** -``` -Go Proxy → Check expiry → Return cached response -``` - -**Cache Miss Path:** -``` -Go Proxy → Set InProgress flag → Forward to Python Backend -``` - -### 2. Python Backend Processing - -``` -Go Proxy → Python Backend (port 8002) → Flask Router -``` - -**Flask Routes:** -- `/` → `wttr_srv.wttr(None, request)` -- `/{location}` → `wttr_srv.wttr(location, request)` -- `/:help`, `/:bash.function`, etc. → Static file handlers - -### 3. 
Request Parsing (wttr_srv.py) - -**Phase 1: Fast Path (Cache + Static)** - -```python -parse_request(location, request, query, fast_mode=True) - ↓ -_response(parsed_query, query, fast_mode=True) - ↓ -Check Python LRU cache - ↓ -Check if static page (:help, :bash.function, etc.) - ↓ -Return if found, else continue to slow path -``` - -**Phase 2: Full Processing** - -```python -parse_request(location, request, query, fast_mode=False) - ↓ -Location Processing - ↓ -_response(parsed_query, query, fast_mode=False) - ↓ -Render weather - ↓ -Cache and return -``` - -### 4. Location Processing (location.py) - -**Input:** Location string, Client IP - -**Processing Steps:** - -``` -1. Detect location type - ├─ Empty/MyLocation → Use client IP - ├─ IP address → Resolve to location - ├─ @domain → Resolve domain to IP, then location - ├─ ~search → Use geolocator service - ├─ Moon → Special moon handler - └─ Name → Use as-is - -2. Normalize location - ├─ Lowercase - ├─ Replace _ and + with space - └─ Remove special chars (!@#$*;:\) - -3. Check aliases (share/aliases) - └─ from:to mapping - -4. Check blacklist (share/blacklist) - └─ Return 403 if blocked - -5. Resolve location - ├─ IP → Location (GeoIP/IP2Location/IPInfo) - ├─ Name → GPS coords (Geolocator service) - └─ IATA code → Airport location - -6. Get hemisphere (for moon queries) - └─ GPS latitude > 0 = North -``` - -**Output:** -- `location` - Normalized location or GPS coords -- `override_location_name` - Display name -- `full_address` - Full address from geolocator -- `country` - Country name -- `query_source_location` - Client's location (city, country) -- `hemisphere` - True=North, False=South - -### 5. IP to Location Resolution - -**Method Priority (configurable via WTTR_IPLOCATION_ORDER):** - -``` -1. GeoIP (MaxMind GeoLite2) - ├─ Read from GeoLite2-City.mmdb - ├─ Extract city and country - └─ Fast, local, free - -2. 
IP2Location API (optional) - ├─ HTTP GET to api.ip2location.com - ├─ Requires API key (~/.ip2location.key) - ├─ Cache result in /wttr.in/cache/ip2l/{ip} - └─ Format: city;country - -3. IPInfo API (optional) - ├─ HTTP GET to ipinfo.io - ├─ Requires token (~/.ipinfo.key) - ├─ Cache result in /wttr.in/cache/ip2l/{ip} - └─ JSON response - -Fallback: NOT_FOUND_LOCATION ("not found") -``` - -**Caching:** -- File cache: `/wttr.in/cache/ip2l/{ip_address}` -- Format: `city;country` or `location;country;extra;city` -- Persistent across restarts - -### 6. Geolocator Service - -**For search terms (~location) and non-ASCII names:** - -``` -Python Backend → HTTP GET localhost:8004/{location} - ↓ -Geolocator Service (separate microservice) - ↓ -Returns JSON: -{ - "latitude": 48.8582602, - "longitude": 2.29449905432, - "address": "Tour Eiffel, 5, Avenue Anatole France..." -} -``` - -**Used for:** -- `~Eiffel Tower` → GPS coordinates -- `~Kilimanjaro` → GPS coordinates -- Non-ASCII location names -- IATA airport codes - -### 7. 
Weather Data Fetching - -**Two data sources (configured via WWO_KEY):** - -#### Option A: met.no (Norwegian Meteorological Institute) - -``` -Python Backend → metno.py - ↓ -HTTP GET to api.met.no - ↓ -Parse XML/JSON response - ↓ -Transform to standard JSON format - ↓ -Return weather data -``` - -**Advantages:** -- Free, no API key required -- High quality data -- No rate limits - -#### Option B: WorldWeatherOnline (WWO) - -``` -Python Backend → bin/proxy.py (separate service) - ↓ -Check proxy cache (/wttr.in/cache/proxy-wwo/) - ↓ -If miss: HTTP GET to api.worldweatheronline.com - ↓ -Cache response - ↓ -Return weather data -``` - -**Advantages:** -- More locations supported -- Historical data available - -**Disadvantages:** -- Requires API key (~/.wwo.key) -- Rate limited (500 queries/day free tier) - -**Weather Data Structure:** -```json -{ - "current_condition": [{ - "temp_C": "22", - "temp_F": "72", - "weatherCode": "122", - "weatherDesc": [{"value": "Overcast"}], - "windspeedKmph": "7", - "humidity": "76", - ... - }], - "weather": [ - { - "date": "2025-12-17", - "maxtempC": "25", - "mintempC": "18", - "hourly": [...] - } - ] -} -``` - -### 8. Weather Rendering - -**Route to appropriate renderer based on query:** - -``` -parsed_query → Determine view type - ↓ - ├─ format=1,2,3,4 → view/line.py (one-line format) - ├─ format=j1 → Return raw JSON - ├─ format=p1 → view/prometheus.py - ├─ format=v2 → view/v2.py (data-rich) - ├─ location=Moon → view/moon.py - └─ default → view/wttr.py (main view) -``` - -#### Main View (view/wttr.py) - -``` -get_wetter(parsed_query) - ↓ -Call wego binary (Go program) - ├─ Pass flags: -city, -lang, -imperial, -narrow, etc. 
- ├─ wego fetches weather data - ├─ wego renders ANSI output - └─ Return ANSI text - ↓ -Post-process output - ├─ Add location name override - ├─ Add "not found" message if needed - └─ Format for display - ↓ -If HTML output: - └─ Convert ANSI to HTML (ansi2html.sh) -``` - -**wego Command Example:** -```bash -/path/to/we-lang \ - --city=London,GB \ - -lang=en \ - -imperial \ - -narrow \ - -location_name="London" -``` - -#### One-Line View (view/line.py) - -``` -wttr_line(query, parsed_query) - ↓ -Get weather data (JSON) - ↓ -Parse format string - ├─ Predefined: 1, 2, 3, 4 - └─ Custom: %c, %t, %h, %w, etc. - ↓ -Replace format codes with data - ├─ %c → Weather emoji - ├─ %t → Temperature - ├─ %h → Humidity - └─ etc. - ↓ -Return formatted string -``` - -**Format Examples:** -- `format=3` → `London: ⛅️ +7°C` -- `format=%l:+%c+%t` → `London: ⛅️ +7°C` - -#### Moon View (view/moon.py) - -``` -get_moon(parsed_query) - ↓ -Parse date from location (Moon@2016-12-25) - ↓ -Call pyphoon-lolcat binary - ├─ Pass date parameter - └─ Return ASCII moon phase art - ↓ -Return moon phase output -``` - -#### v2 View (view/v2.py) - -``` -Experimental data-rich format - ↓ -Get weather data - ↓ -Render: - ├─ Temperature graph (ASCII) - ├─ Precipitation graph (ASCII) - ├─ Moon phases (4 days) - ├─ Current conditions (detailed) - ├─ Astronomical times (dawn, sunrise, etc.) - └─ GPS coordinates - ↓ -Return formatted output -``` - -#### Prometheus View (view/prometheus.py) - -``` -Get weather data (JSON) - ↓ -Convert to Prometheus metrics format - ├─ temperature_feels_like_celsius{forecast="current"} 7 - ├─ humidity_percent{forecast="current"} 65 - └─ etc. - ↓ -Return metrics text -``` - -### 9. 
PNG Rendering (fmt/png.py) - -**For .png requests:** - -``` -ANSI text output - ↓ -Spawn thread (ThreadPool, 25 workers) - ↓ -render_ansi(output, options) - ↓ -Create virtual terminal (pyte) - ├─ Feed ANSI sequences - └─ Capture terminal state - ↓ -Render to image (PIL) - ├─ Draw characters with font - ├─ Apply colors from ANSI codes - └─ Apply transparency if requested - ↓ -Return PNG bytes - ↓ -Cache in /wttr.in/cache/png/ -``` - -**Options:** -- `t` - Transparency (150) -- `transparency={0-255}` - Custom transparency -- `{width}x{height}` - Image dimensions - -### 10. Translation (translations.py) - -**Language Detection:** - -``` -1. Check subdomain (de.wttr.in → lang=de) -2. Check lang parameter (?lang=de) -3. Check Accept-Language header -4. Default to English -``` - -**Translation Files:** -- `share/translations/{lang}.txt` - Weather conditions -- `share/translations/{lang}-help.txt` - Help pages - -**Translation Process:** - -``` -Weather condition text (English) - ↓ -Look up in translations.py:TRANSLATIONS dict - ↓ -Find translation for target language - ↓ -Return translated text -``` - -**Example:** -```python -TRANSLATIONS = { - "en": {"Partly cloudy": "Partly cloudy"}, - "de": {"Partly cloudy": "Teilweise bewölkt"}, - "fr": {"Partly cloudy": "Partiellement nuageux"} -} -``` - -### 11. 
Caching (cache.py) - -**Python LRU Cache:** - -``` -Request → Generate cache signature - ↓ -signature = f"{user_agent}:{query_string}:{client_ip}:{lang}" - ↓ -Check in-memory LRU (10,000 entries) - ↓ -If found and not expired: - ├─ If value starts with "file:" or "bfile:" - │ └─ Read from /wttr.in/cache/lru/{md5_hash} - └─ Return value - ↓ -If not found: - ├─ Generate response - ├─ If response > 80 bytes: - │ ├─ Write to /wttr.in/cache/lru/{md5_hash} - │ └─ Store "file:{md5_hash}" in LRU - └─ Else: Store value directly in LRU - ↓ -Set expiry: current_time + random(1000, 2000) seconds - ↓ -Return response -``` - -**Dynamic Timestamps:** - -``` -Cached response with %{{NOW(timezone)}} - ↓ -On retrieval: Replace with current time in timezone - ↓ -Example: %{{NOW(Europe/London)}} → 14:32:15+0000 -``` - -### 12. Response Wrapping - -**Final response preparation:** - -``` -Response text/bytes - ↓ -Determine content type - ├─ PNG → image/png - ├─ HTML → text/html - └─ ANSI/text → text/plain - ↓ -Add buttons (if HTML and not format query) - ├─ Add interactive UI elements - └─ Wrap in HTML template - ↓ -Set HTTP headers - ├─ Content-Type - ├─ Cache-Control (PNG only) - └─ Access-Control-Allow-Origin: * - ↓ -Return Flask response -``` - -### 13. Go Proxy Caching - -**After Python backend returns:** - -``` -Python Backend → Response - ↓ -Go Proxy receives response - ↓ -If status code 200 or 304: - ├─ Store in LRU cache - ├─ Set expiry: current_time + random(1000, 1500) seconds - └─ Remove InProgress flag - ↓ -Else (error): - └─ Remove from cache - ↓ -Return response to client -``` - -### 14. 
Peak Request Prefetching - -**Cron-based prefetching:** - -``` -Every hour at :30 and :00 - ↓ -Record incoming requests in sync.Map - ↓ -At :24 and :54 (5 minutes before peak) - ↓ -Iterate through recorded requests - ↓ -For each request: - ├─ Spawn goroutine - ├─ Call processRequest() (refreshes cache) - ├─ Sleep (spread over 300 seconds) - └─ Delete from sync.Map - ↓ -Cache is warm for peak time -``` - -**Peak Times:** -- :30 past the hour (recorded at :30, prefetched at :24) -- :00 on the hour (recorded at :00, prefetched at :54) - -## Data Structures - -### Parsed Query - -```python -{ - "location": "London,GB", # Normalized location - "orig_location": "London", # Original input - "override_location_name": None, # Display name override - "full_address": "London, UK", # Full address - "country": "GB", # Country code - "query_source_location": ("Paris", "France"), # Client location - "hemisphere": True, # North=True, South=False - "lang": "en", # Language code - "view": None, # View type (v2, etc.) 
- "html_output": False, # HTML vs ANSI - "png_filename": None, # PNG filename if .png request - "ip_addr": "1.2.3.4", # Client IP - "user_agent": "curl/7.68.0", # User agent - "request_url": "http://...", # Full request URL - - # Query options - "use_metric": True, # Metric units - "use_imperial": False, # Imperial units - "use_ms_for_wind": False, # m/s for wind - "narrow": False, # Narrow output - "inverted_colors": False, # Inverted colors - "no-terminal": False, # Plain text - "no-caption": False, # No caption - "no-city": False, # No city name - "no-follow-line": False, # No follow line - "days": "3", # Number of days - "transparency": None, # PNG transparency - "padding": False, # Add padding - "force-ansi": False, # Force ANSI -} -``` - -### Cache Entry (Go) - -```go -type responseWithHeader struct { - InProgress bool // Request being processed - Expires time.Time // Expiration time - Body []byte // Response body - Header http.Header // HTTP headers - StatusCode int // HTTP status code -} -``` - -### Cache Entry (Python) - -```python -{ - "val": "response text" or "file:md5hash", - "expiry": 1702834567.123 # Unix timestamp -} -``` - -## Error Handling Flow - -### Location Not Found - -``` -Location resolution fails - ↓ -Set location = NOT_FOUND_LOCATION ("not found") - ↓ -Fetch weather for default location (Oymyakon) - ↓ -Append "not found" message in user's language - ↓ -Return response -``` - -### API Error - -``` -Weather API returns error - ↓ -Log error - ↓ -If HTML output: - └─ Return malformed-response.html (500) -Else: - └─ Return "capacity limit reached" message (503) -``` - -### Rate Limit Exceeded - -``` -Check IP against limits (300/min, 3600/hour, 86400/day) - ↓ -If exceeded: - └─ Return 429 with error message -``` - -### Blocked Location - -``` -Check location against blacklist - ↓ -If blocked: - └─ Return 403 Forbidden -``` - -## Performance Optimizations - -1. **Two-tier caching** (Go + Python) -2. 
**Fast path** (cache + static files checked first) -3. **File cache** for large responses (>80 bytes) -4. **Prefetching** at peak times -5. **ThreadPool** for PNG rendering (25 workers) -6. **Gevent** for async I/O in Python -7. **LRU eviction** prevents memory bloat -8. **Randomized TTL** prevents thundering herd -9. **InProgress flag** prevents duplicate work -10. **IP location caching** (persistent file cache) diff --git a/IMPERIAL_UNITS.md b/IMPERIAL_UNITS.md deleted file mode 100644 index fa56fc5..0000000 --- a/IMPERIAL_UNITS.md +++ /dev/null @@ -1,98 +0,0 @@ -# Imperial Units Implementation - -## Overview - -The application automatically selects between metric and imperial (USCS) units based on the request context, matching the behavior of the original Python implementation. - -## Unit Selection Priority - -The system determines which units to use in the following priority order: - -1. **Explicit query parameter** (highest priority) - - `?u` - Force imperial units (°F, mph, inches, inHg) - - `?m` - Force metric units (°C, km/h, mm, hPa) - -2. **Language parameter** - - `?lang=us` - Use imperial units (for us.wttr.in subdomain) - -3. **Client IP geolocation** - - Requests from US IP addresses automatically use imperial units - - Uses GeoIP database to detect country code - -4. **Default** - - Metric units for all other cases - -## Implementation Details - -### Code Changes - -1. **GeoIP Module** (`src/location/geoip.zig`) - - Added `isUSIP()` method to detect US IP addresses - - Queries MaxMind database for country ISO code - - Returns `true` if country code is "US" - -2. **Handler** (`src/http/handler.zig`) - - Added `geoip` to `HandleWeatherOptions` - - Implements priority logic for unit selection - - Extracts client IP from headers (X-Forwarded-For, X-Real-IP) - - Passes `use_imperial` flag to all renderers - -3. 
**Renderers** (all updated to support imperial units) - - `ansi.zig` - Shows °F instead of °C - - `line.zig` - Shows °F and mph for formats 1-4 - - `custom.zig` - Converts all units (temp, wind, precip, pressure) - - `v2.zig` - Shows imperial units in detailed format - - `json.zig` - Already outputs both units (no changes needed) - -### Conversions - -- Temperature: `°F = °C × 9/5 + 32` -- Wind speed: `mph = km/h × 0.621371` -- Precipitation: `inches = mm × 0.0393701` -- Pressure: `inHg = hPa × 0.02953` - -## Testing - -### Unit Tests - -All renderers have unit tests verifying imperial units: -- `test "render with imperial units"` in ansi.zig -- `test "format 2 with imperial units"` in line.zig -- `test "render custom format with imperial units"` in custom.zig -- `test "render v2 format with imperial units"` in v2.zig -- `test "imperial units selection logic"` in handler.zig -- `test "isUSIP detects US IPs"` in geoip.zig - -### Integration Testing - -```bash -# Test with lang=us -curl "http://localhost:8002/London?lang=us&format=2" -# Output: 51.5074,-0.1278: ☁️ 54°F 🌬️SW19mph - -# Test with explicit ?u -curl "http://localhost:8002/London?format=2&u" -# Output: 51.5074,-0.1278: ☁️ 54°F 🌬️SW19mph - -# Test metric override -curl "http://localhost:8002/London?lang=us&format=2&m" -# Output: 51.5074,-0.1278: ☁️ 12°C 🌬️SW30km/h - -# Test from US IP (automatic detection) -curl -H "X-Forwarded-For: 1.1.1.1" "http://localhost:8002/London?format=2" -# Output: Uses imperial as 1.1.1.1 is detected as US IP -``` - -## Documentation Updates - -- **API_ENDPOINTS.md** - Added "Unit System Defaults" section explaining priority -- **README.md** - Updated "Implemented Features" to mention auto-detection -- **IMPERIAL_UNITS.md** - This document - -## Compatibility - -This implementation matches the original Python behavior in `lib/parse_query.py`: -- Same priority order -- Same detection logic -- Same unit conversions -- Compatible with existing clients and scripts diff --git 
a/TARGET_ARCHITECTURE.md b/TARGET_ARCHITECTURE.md index ab5db9f..4133e28 100644 --- a/TARGET_ARCHITECTURE.md +++ b/TARGET_ARCHITECTURE.md @@ -5,10 +5,13 @@ Single Zig binary (`wttr`) with simplified architecture: - One HTTP server (karlseguin/http.zig) - One caching layer (LRU + file-backed) + Note: There are separate caches for IP->location, geocoding, and weather, + but they all live in a single layer, which avoids stacking multiple file-based + cache layers. - Pluggable weather provider interface - Rate limiting as injectable middleware - English-only (i18n can be added later) -- No cron-based prefetching (operational concern) +- No cron-based prefetching (if you want that, just hit the endpoint on a schedule) ## System Diagram @@ -31,13 +34,14 @@ Weather Provider (interface) ├─ MetNo (default) └─ Mock (tests) ↓ +Cache Store + ↓ Renderer ├─ ANSI ├─ JSON ├─ Line - └─ (PNG/HTML later) - ↓ -Cache Store + ├─ HTML + └─ (PNG later) ↓ Response ``` @@ -902,14 +906,8 @@ WTTR_LISTEN_PORT=8080 ./zig-out/bin/wttr ``` **Docker:** -```dockerfile -FROM alpine:latest -COPY zig-out/bin/wttr /usr/local/bin/wttr -COPY GeoLite2-City.mmdb /data/GeoLite2-City.mmdb -ENV WTTR_GEOLITE_PATH=/data/GeoLite2-City.mmdb -EXPOSE 8002 -CMD ["/usr/local/bin/wttr"] -``` + +See the dockerfile in ./docker. This is a FROM scratch image based on a musl static binary. ## Performance Targets @@ -924,15 +922,14 @@ CMD ["/usr/local/bin/wttr"] - >1,000 req/s (uncached) **Memory:** -- Base: <50MB +- Base: <50MB (currently 9MB) - With 10,000 cache entries: <200MB -- Binary size: <5MB +- Binary size: <5MB (ReleaseSmall currently less than 2MB) ## Future Enhancements (Not in Initial Version) 1. **i18n Support** - Add translation system 2. **PNG Rendering** - Add image output -3. **HTML Output** - Add web browser support 4. **v2 Format** - Data-rich experimental format 5. **Prometheus Format** - Metrics output 6. **Moon Phases** - Special moon queries @@ -940,49 +937,31 @@ CMD ["/usr/local/bin/wttr"] 8. 
**Metrics/Observability** - Prometheus metrics, structured logging
 9. **Admin API** - Cache stats, health checks

-## Migration from Current System
-
-**Phase 1: Parallel Deployment**
-- Deploy Zig version on different port (8003)
-- Route 1% of traffic to Zig version
-- Monitor errors and performance
-- Compare outputs
-
-**Phase 2: Gradual Rollout**
-- Increase traffic to 10%, 25%, 50%, 100%
-- Monitor at each step
-- Rollback if issues
-
-**Phase 3: Cutover**
-- Switch all traffic to Zig version
-- Keep Python/Go as backup for 1 week
-- Decommission old system
-
 ## Success Criteria

 **Functional:**
-- [ ] All core endpoints work (/, /{location}, /:help)
-- [ ] ANSI output matches current system
-- [ ] JSON output matches current system
-- [ ] One-line formats work
-- [ ] Location resolution works (IP, name, coordinates)
-- [ ] Rate limiting works
-- [ ] Caching works
+- [x] All core endpoints work (/, /{location}, /:help)
+- [x] ANSI output matches current system
+- [x] JSON output matches current system
+- [x] One-line formats work
+- [x] Location resolution works (IP, name, coordinates)
+- [x] Rate limiting works
+- [x] Caching works

 **Performance:**
-- [ ] Latency ≤ current system
-- [ ] Throughput ≥ current system
-- [ ] Memory ≤ current system
-- [ ] Binary size <10MB
+- [x] Latency ≤ current system
+- [x] Throughput ≥ current system
+- [x] Memory ≤ current system
+- [x] Binary size <10MB (under ReleaseSafe)

 **Quality:**
 - [ ] >80% test coverage
 - [ ] No memory leaks (valgrind clean)
 - [ ] No crashes under load
-- [ ] Clean error handling
+- [x] Clean error handling

 **Operational:**
-- [ ] Single binary deployment
-- [ ] Simple configuration
-- [ ] Good logging
-- [ ] Easy to debug
+- [x] Single binary deployment
+- [x] Simple configuration
+- [x] Good logging
+- [x] Easy to debug
diff --git a/src/http/help.zig b/src/http/help.zig
index f95ebcf..b0dbc89 100644
--- a/src/http/help.zig
+++ b/src/http/help.zig
@@ -1,39 +1,86 @@
 const std = @import("std");

 pub
const help_page =
-    \\wttr.in - Weather Forecast Service
+    \\wttr - Weather Forecast Service
+    \\
+    \\Based on wttr.in. items below with an * are not implemented in this version
     \\
     \\Usage:
-    \\  curl wttr.in                 # Weather for your location
-    \\  curl wttr.in/London          # Weather for London
-    \\  curl wttr.in/~Eiffel+Tower   # Weather for special location
-    \\  curl wttr.in/@github.com     # Weather for domain location
-    \\  curl wttr.in/muc             # Weather for airport (IATA code)
     \\
-    \\Query Parameters:
-    \\  ?format=FORMAT               Output format (1,2,3,4,j1,p1,v2)
-    \\  ?lang=LANG                   Language code (en,de,fr,etc)
-    \\  ?u                           Use USCS units
-    \\  ?m                           Use metric units
-    \\  ?t                           Transparency for PNG
+    \\    $ curl                  # current location
+    \\    $ curl /muc             # weather in the Munich airport
     \\
-    \\Special Endpoints:
-    \\  /:help                       This help page
-    \\  /:translation                Translation information
+    \\Supported location types:
     \\
-    \\Examples:
-    \\  curl wttr.in/Paris?format=3
-    \\  curl wttr.in/Berlin?lang=de
-    \\  curl wttr.in/Tokyo?m
+    \\    /paris                  # city name
+    \\    /~Eiffel+tower          # any location (+ for spaces)
+    \\    /Москва                 # Unicode name of any location in any language
+    \\    /muc                    # airport code (3 letters)
+    \\    /@stackoverflow.com     # domain name
+    \\    /94107                  # area codes
+    \\    /-78.46,106.79          # GPS coordinates
     \\
-    \\For more information visit: https://github.com/chubin/wttr.in
+    \\Moon phase information (not yet implemented):
     \\
+    \\    /moon                   # Moon phase (add ,+US or ,+France for these cities)
+    \\    /moon@2016-10-25        # Moon phase for the date (@2016-10-25)
+    \\
+    \\Query string units:
+    \\
+    \\    m                       # metric (SI) (used by default everywhere except US)
+    \\    u                       # USCS (used by default in US)
+    \\    M                       # * show wind speed in m/s
+    \\
+    \\Query string view options:
+    \\
+    \\    0                       # only current weather
+    \\    1                       # current weather + today's forecast
+    \\    2                       # current weather + today's + tomorrow's forecast
+    \\    A                       # ignore User-Agent and force ANSI output format (terminal)
+    \\    d                       # * restrict output to standard console font glyphs
+    \\    F                       # * do not show the "Follow"
line (not necessary - this version does not have a follow line)
+    \\    n                       # * narrow version (only day and night)
+    \\    q                       # * quiet version (no "Weather report" text)
+    \\    Q                       # * superquiet version (no "Weather report", no city name)
+    \\    T                       # switch terminal sequences off (no colors)
+    \\
+    \\PNG options:
+    \\
+    \\    /paris.png              # generate a PNG file
+    \\    p                       # add frame around the output
+    \\    t                       # transparency 150
+    \\    transparency=...        # transparency from 0 to 255 (255 = not transparent)
+    \\    background=...          # background color in form RRGGBB, e.g. 00aaaa
+    \\
+    \\Options can be combined:
+    \\
+    \\    /Paris?0pq
+    \\    /Paris?0pq&lang=fr
+    \\    /Paris_0pq.png          # in PNG the file mode are specified after _
+    \\    /Rome_0pq_lang=it.png   # long options are separated with underscore
+    \\
+    \\* Localization:
+    \\
+    \\    $ curl fr.wttr.in/Paris
+    \\    $ curl wttr.in/paris?lang=fr
+    \\    $ curl -H "Accept-Language: fr" wttr.in/paris
+    \\
+    \\* Supported languages:
+    \\
+    \\    am ar af be bn ca da de el et fr fa gl hi hu ia id it lt mg nb nl oc pl pt-br ro ru ta tr th uk vi zh-cn zh-tw (supported)
+    \\    az bg bs cy cs eo es eu fi ga hi hr hy is ja jv ka kk ko ky lv mk ml mr nl fy nn pt pt-br sk sl sr sr-lat sv sw te uz zh zu he (in progress)
+    \\
+    \\Special URLs:
+    \\
+    \\    /:help                  # show this page
+    \\    /:bash.function         # show recommended bash function wttr()
+    \\    /:translation           # show the information about the translators
 ;

 pub const translation_page =
-    \\wttr.in Translation
+    \\wttr Translation
     \\
-    \\wttr.in is currently translated into 54 languages.
+    \\wttr is currently translated into 54 languages.
     \\
     \\NOTE: Translations are not implemented in this version!
     \\
diff --git a/src/location/resolver.zig b/src/location/resolver.zig
index 68b6566..48afc6d 100644
--- a/src/location/resolver.zig
+++ b/src/location/resolver.zig
@@ -24,6 +24,19 @@ pub const LocationType = enum {
     domain_name,
 };

+/// Primary way to resolve a string to some sort of location.
The string
+/// can represent:
+///
+/// * An IP address:
+///   Uses GeoIp, which checks the GeoLite2 database, then falls back to
+///   ip2location.info if not found. ip2location has a permanent cache
+/// * Domain name:
+///   Resolves the domain name to an IP address, then follows the IP flow
+/// * Airport code:
+///   Uses Airports, which uses OpenFlights data to determine location
+/// * Place name (also "special location", when a user uses '~' as the query):
+///   Uses the Nominatim (OpenStreetMap) online service to resolve. This also
+///   has a permanent cache
 pub const Resolver = struct {
     allocator: std.mem.Allocator,
     geoip: ?*GeoIp,