# wttr.in Data Flow Documentation

## Request Processing Flow

### 1. Initial Request

```
Client → Go Proxy (port 8082)
```

**Input:**
- HTTP request with location in URL
- Headers: User-Agent, Accept-Language, X-Forwarded-For
- Query parameters

**Go Proxy Actions:**
1. Build cache key: `UserAgent:Host+URI:ClientIP:AcceptLanguage`
2. Check if request is cacheable (no `:` in location)
3. Look up in LRU cache (12,800 entries)

**Cache Hit Path:**
```
Go Proxy → Check expiry → Return cached response
```

**Cache Miss Path:**
```
Go Proxy → Set InProgress flag → Forward to Python Backend
```

### 2. Python Backend Processing

```
Go Proxy → Python Backend (port 8002) → Flask Router
```

**Flask Routes:**
- `/` → `wttr_srv.wttr(None, request)`
- `/{location}` → `wttr_srv.wttr(location, request)`
- `/:help`, `/:bash.function`, etc. → Static file handlers

### 3. Request Parsing (wttr_srv.py)

**Phase 1: Fast Path (Cache + Static)**
```python
parse_request(location, request, query, fast_mode=True)
    ↓
_response(parsed_query, query, fast_mode=True)
    ↓
Check Python LRU cache
    ↓
Check if static page (:help, :bash.function, etc.)
    ↓
Return if found, else continue to slow path
```

**Phase 2: Full Processing**
```python
parse_request(location, request, query, fast_mode=False)
    ↓
Location Processing
    ↓
_response(parsed_query, query, fast_mode=False)
    ↓
Render weather
    ↓
Cache and return
```

### 4. Location Processing (location.py)

**Input:** Location string, Client IP

**Processing Steps:**
```
1. Detect location type
   ├─ Empty/MyLocation → Use client IP
   ├─ IP address → Resolve to location
   ├─ @domain → Resolve domain to IP, then location
   ├─ ~search → Use geolocator service
   ├─ Moon → Special moon handler
   └─ Name → Use as-is

2. Normalize location
   ├─ Lowercase
   ├─ Replace _ and + with space
   └─ Remove special chars (!@#$*;:\)

3. Check aliases (share/aliases)
   └─ from:to mapping

4. Check blacklist (share/blacklist)
   └─ Return 403 if blocked

5. Resolve location
   ├─ IP → Location (GeoIP/IP2Location/IPInfo)
   ├─ Name → GPS coords (Geolocator service)
   └─ IATA code → Airport location

6. Get hemisphere (for moon queries)
   └─ GPS latitude > 0 = North
```

**Output:**
- `location` - Normalized location or GPS coords
- `override_location_name` - Display name
- `full_address` - Full address from geolocator
- `country` - Country name
- `query_source_location` - Client's location (city, country)
- `hemisphere` - True=North, False=South

### 5. IP to Location Resolution

**Method Priority (configurable via WTTR_IPLOCATION_ORDER):**
```
1. GeoIP (MaxMind GeoLite2)
   ├─ Read from GeoLite2-City.mmdb
   ├─ Extract city and country
   └─ Fast, local, free

2. IP2Location API (optional)
   ├─ HTTP GET to api.ip2location.com
   ├─ Requires API key (~/.ip2location.key)
   ├─ Cache result in /wttr.in/cache/ip2l/{ip}
   └─ Format: city;country

3. IPInfo API (optional)
   ├─ HTTP GET to ipinfo.io
   ├─ Requires token (~/.ipinfo.key)
   ├─ Cache result in /wttr.in/cache/ip2l/{ip}
   └─ JSON response

Fallback: NOT_FOUND_LOCATION ("not found")
```

**Caching:**
- File cache: `/wttr.in/cache/ip2l/{ip_address}`
- Format: `city;country` or `location;country;extra;city`
- Persistent across restarts

### 6. Geolocator Service

**For search terms (~location) and non-ASCII names:**
```
Python Backend → HTTP GET localhost:8004/{location}
    ↓
Geolocator Service (separate microservice)
    ↓
Returns JSON:
{
  "latitude": 48.8582602,
  "longitude": 2.29449905432,
  "address": "Tour Eiffel, 5, Avenue Anatole France..."
}
```

**Used for:**
- `~Eiffel Tower` → GPS coordinates
- `~Kilimanjaro` → GPS coordinates
- Non-ASCII location names
- IATA airport codes

### 7. Weather Data Fetching

**Two data sources (configured via WWO_KEY):**

#### Option A: met.no (Norwegian Meteorological Institute)
```
Python Backend → metno.py
    ↓
HTTP GET to api.met.no
    ↓
Parse XML/JSON response
    ↓
Transform to standard JSON format
    ↓
Return weather data
```

**Advantages:**
- Free, no API key required
- High quality data
- No rate limits

#### Option B: WorldWeatherOnline (WWO)
```
Python Backend → bin/proxy.py (separate service)
    ↓
Check proxy cache (/wttr.in/cache/proxy-wwo/)
    ↓
If miss: HTTP GET to api.worldweatheronline.com
    ↓
Cache response
    ↓
Return weather data
```

**Advantages:**
- More locations supported
- Historical data available

**Disadvantages:**
- Requires API key (~/.wwo.key)
- Rate limited (500 queries/day free tier)

**Weather Data Structure:**
```json
{
  "current_condition": [{
    "temp_C": "22",
    "temp_F": "72",
    "weatherCode": "122",
    "weatherDesc": [{"value": "Overcast"}],
    "windspeedKmph": "7",
    "humidity": "76",
    ...
  }],
  "weather": [
    {
      "date": "2025-12-17",
      "maxtempC": "25",
      "mintempC": "18",
      "hourly": [...]
    }
  ]
}
```

### 8. Weather Rendering

**Route to appropriate renderer based on query:**

```
parsed_query → Determine view type
    ↓
    ├─ format=1,2,3,4 → view/line.py (one-line format)
    ├─ format=j1 → Return raw JSON
    ├─ format=p1 → view/prometheus.py
    ├─ format=v2 → view/v2.py (data-rich)
    ├─ location=Moon → view/moon.py
    └─ default → view/wttr.py (main view)
```

#### Main View (view/wttr.py)
```
get_wetter(parsed_query)
    ↓
Call wego binary (Go program)
    ├─ Pass flags: -city, -lang, -imperial, -narrow, etc.
    ├─ wego fetches weather data
    ├─ wego renders ANSI output
    └─ Return ANSI text
    ↓
Post-process output
    ├─ Add location name override
    ├─ Add "not found" message if needed
    └─ Format for display
    ↓
If HTML output:
    └─ Convert ANSI to HTML (ansi2html.sh)
```

**wego Command Example:**
```bash
/path/to/we-lang \
    --city=London,GB \
    -lang=en \
    -imperial \
    -narrow \
    -location_name="London"
```

#### One-Line View (view/line.py)
```
wttr_line(query, parsed_query)
    ↓
Get weather data (JSON)
    ↓
Parse format string
    ├─ Predefined: 1, 2, 3, 4
    └─ Custom: %c, %t, %h, %w, etc.
    ↓
Replace format codes with data
    ├─ %c → Weather emoji
    ├─ %t → Temperature
    ├─ %h → Humidity
    └─ etc.
    ↓
Return formatted string
```

**Format Examples:**
- `format=3` → `London: ⛅️ +7°C`
- `format=%l:+%c+%t` → `London: ⛅️ +7°C`

#### Moon View (view/moon.py)
```
get_moon(parsed_query)
    ↓
Parse date from location (Moon@2016-12-25)
    ↓
Call pyphoon-lolcat binary
    ├─ Pass date parameter
    └─ Return ASCII moon phase art
    ↓
Return moon phase output
```

#### v2 View (view/v2.py)
```
Experimental data-rich format
    ↓
Get weather data
    ↓
Render:
    ├─ Temperature graph (ASCII)
    ├─ Precipitation graph (ASCII)
    ├─ Moon phases (4 days)
    ├─ Current conditions (detailed)
    ├─ Astronomical times (dawn, sunrise, etc.)
    └─ GPS coordinates
    ↓
Return formatted output
```

#### Prometheus View (view/prometheus.py)
```
Get weather data (JSON)
    ↓
Convert to Prometheus metrics format
    ├─ temperature_feels_like_celsius{forecast="current"} 7
    ├─ humidity_percent{forecast="current"} 65
    └─ etc.
    ↓
Return metrics text
```

### 9. PNG Rendering (fmt/png.py)

**For .png requests:**
```
ANSI text output
    ↓
Spawn thread (ThreadPool, 25 workers)
    ↓
render_ansi(output, options)
    ↓
Create virtual terminal (pyte)
    ├─ Feed ANSI sequences
    └─ Capture terminal state
    ↓
Render to image (PIL)
    ├─ Draw characters with font
    ├─ Apply colors from ANSI codes
    └─ Apply transparency if requested
    ↓
Return PNG bytes
    ↓
Cache in /wttr.in/cache/png/
```

**Options:**
- `t` - Transparency (150)
- `transparency={0-255}` - Custom transparency
- `{width}x{height}` - Image dimensions

### 10. Translation (translations.py)

**Language Detection:**
```
1. Check subdomain (de.wttr.in → lang=de)
2. Check lang parameter (?lang=de)
3. Check Accept-Language header
4. Default to English
```

**Translation Files:**
- `share/translations/{lang}.txt` - Weather conditions
- `share/translations/{lang}-help.txt` - Help pages

**Translation Process:**
```
Weather condition text (English)
    ↓
Look up in translations.py:TRANSLATIONS dict
    ↓
Find translation for target language
    ↓
Return translated text
```

**Example:**
```python
TRANSLATIONS = {
    "en": {"Partly cloudy": "Partly cloudy"},
    "de": {"Partly cloudy": "Teilweise bewölkt"},
    "fr": {"Partly cloudy": "Partiellement nuageux"}
}
```

### 11. Caching (cache.py)

**Python LRU Cache:**
```
Request → Generate cache signature
    ↓
signature = f"{user_agent}:{query_string}:{client_ip}:{lang}"
    ↓
Check in-memory LRU (10,000 entries)
    ↓
If found and not expired:
    ├─ If value starts with "file:" or "bfile:"
    │   └─ Read from /wttr.in/cache/lru/{md5_hash}
    └─ Return value
    ↓
If not found:
    ├─ Generate response
    ├─ If response > 80 bytes:
    │   ├─ Write to /wttr.in/cache/lru/{md5_hash}
    │   └─ Store "file:{md5_hash}" in LRU
    └─ Else: Store value directly in LRU
    ↓
Set expiry: current_time + random(1000, 2000) seconds
    ↓
Return response
```

**Dynamic Timestamps:**
```
Cached response with %{{NOW(timezone)}}
    ↓
On retrieval: Replace with current time in timezone
    ↓
Example: %{{NOW(Europe/London)}} → 14:32:15+0000
```

### 12. Response Wrapping

**Final response preparation:**
```
Response text/bytes
    ↓
Determine content type
    ├─ PNG → image/png
    ├─ HTML → text/html
    └─ ANSI/text → text/plain
    ↓
Add buttons (if HTML and not format query)
    ├─ Add interactive UI elements
    └─ Wrap in HTML template
    ↓
Set HTTP headers
    ├─ Content-Type
    ├─ Cache-Control (PNG only)
    └─ Access-Control-Allow-Origin: *
    ↓
Return Flask response
```

### 13. Go Proxy Caching

**After Python backend returns:**
```
Python Backend → Response
    ↓
Go Proxy receives response
    ↓
If status code 200 or 304:
    ├─ Store in LRU cache
    ├─ Set expiry: current_time + random(1000, 1500) seconds
    └─ Remove InProgress flag
    ↓
Else (error):
    └─ Remove from cache
    ↓
Return response to client
```

### 14. Peak Request Prefetching

**Cron-based prefetching:**
```
Every hour at :30 and :00
    ↓
Record incoming requests in sync.Map
    ↓
At :24 and :54 (6 minutes before each peak)
    ↓
Iterate through recorded requests
    ↓
For each request:
    ├─ Spawn goroutine
    ├─ Call processRequest() (refreshes cache)
    ├─ Sleep (spread over 300 seconds)
    └─ Delete from sync.Map
    ↓
Cache is warm for peak time
```

**Peak Times:**
- :30 past the hour (recorded at :30, prefetched at :24)
- :00 on the hour (recorded at :00, prefetched at :54)

## Data Structures

### Parsed Query
```python
{
    "location": "London,GB",              # Normalized location
    "orig_location": "London",            # Original input
    "override_location_name": None,       # Display name override
    "full_address": "London, UK",         # Full address
    "country": "GB",                      # Country code
    "query_source_location": ("Paris", "France"),  # Client location
    "hemisphere": True,                   # North=True, South=False
    "lang": "en",                         # Language code
    "view": None,                         # View type (v2, etc.)
    "html_output": False,                 # HTML vs ANSI
    "png_filename": None,                 # PNG filename if .png request
    "ip_addr": "1.2.3.4",                 # Client IP
    "user_agent": "curl/7.68.0",          # User agent
    "request_url": "http://...",          # Full request URL

    # Query options
    "use_metric": True,                   # Metric units
    "use_imperial": False,                # Imperial units
    "use_ms_for_wind": False,             # m/s for wind
    "narrow": False,                      # Narrow output
    "inverted_colors": False,             # Inverted colors
    "no-terminal": False,                 # Plain text
    "no-caption": False,                  # No caption
    "no-city": False,                     # No city name
    "no-follow-line": False,              # No follow line
    "days": "3",                          # Number of days
    "transparency": None,                 # PNG transparency
    "padding": False,                     # Add padding
    "force-ansi": False,                  # Force ANSI
}
```

### Cache Entry (Go)
```go
type responseWithHeader struct {
	InProgress bool        // Request being processed
	Expires    time.Time   // Expiration time
	Body       []byte      // Response body
	Header     http.Header // HTTP headers
	StatusCode int         // HTTP status code
}
```

### Cache Entry (Python)
```python
{
    "val": "response text",    # or "file:md5hash" when the body is stored on disk
    "expiry": 1702834567.123   # Unix timestamp
}
```

## Error Handling Flow

### Location Not Found
```
Location resolution fails
    ↓
Set location = NOT_FOUND_LOCATION ("not found")
    ↓
Fetch weather for default location (Oymyakon)
    ↓
Append "not found" message in user's language
    ↓
Return response
```

### API Error
```
Weather API returns error
    ↓
Log error
    ↓
If HTML output:
    └─ Return malformed-response.html (500)
Else:
    └─ Return "capacity limit reached" message (503)
```

### Rate Limit Exceeded
```
Check IP against limits (300/min, 3600/hour, 86400/day)
    ↓
If exceeded:
    └─ Return 429 with error message
```

### Blocked Location
```
Check location against blacklist
    ↓
If blocked:
    └─ Return 403 Forbidden
```

## Performance Optimizations

1. **Two-tier caching** (Go + Python)
2. **Fast path** (cache + static files checked first)
3. **File cache** for large responses (>80 bytes)
4. **Prefetching** at peak times
5. **ThreadPool** for PNG rendering (25 workers)
6. **Gevent** for async I/O in Python
7. **LRU eviction** prevents memory bloat
8. **Randomized TTL** prevents thundering herd
9. **InProgress flag** prevents duplicate work
10. **IP location caching** (persistent file cache)
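
### Sketch: LRU Cache with Randomized TTL

The LRU eviction and randomized TTL items above can be illustrated with a small, self-contained sketch. This is not the code in `cache.py`: the class name `JitteredLRUCache`, its method names, and the injectable `clock` and `rng` parameters are assumptions made for illustration. Only the entry layout (`val`, `expiry`), the 10,000-entry limit, and the 1000 to 2000 second expiry window are taken from this document.

```python
import random
import time
from collections import OrderedDict


class JitteredLRUCache:
    """Hypothetical in-memory LRU cache whose entries expire after a random TTL."""

    def __init__(self, max_entries=10_000, ttl_range=(1000, 2000),
                 clock=time.time, rng=random.randint):
        self._entries = OrderedDict()  # signature -> {"val": ..., "expiry": ...}
        self._max_entries = max_entries
        self._ttl_range = ttl_range
        self._clock = clock  # injectable so tests can control time
        self._rng = rng      # injectable so tests can control the jitter

    def get(self, signature):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(signature)
        if entry is None:
            return None
        if entry["expiry"] <= self._clock():
            # Expired: drop the entry so the caller regenerates the response.
            del self._entries[signature]
            return None
        self._entries.move_to_end(signature)  # mark as most recently used
        return entry["val"]

    def store(self, signature, value):
        """Cache a value with an expiry drawn at random from ttl_range."""
        ttl = self._rng(*self._ttl_range)  # jitter spreads expirations over time
        self._entries[signature] = {"val": value, "expiry": self._clock() + ttl}
        self._entries.move_to_end(signature)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return value


if __name__ == "__main__":
    cache = JitteredLRUCache()
    signature = "curl/7.68.0:/London:1.2.3.4:en"
    cache.store(signature, "London: +7°C")
    print(cache.get(signature))
```

Because each entry gets its own random expiry, responses cached during the same burst of traffic do not all expire in the same second, so the backend is not hit by a synchronized wave of regenerations.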