# wttr.in Data Flow Documentation

## Request Processing Flow

### 1. Initial Request

```
Client → Go Proxy (port 8082)
```

**Input:**
- HTTP request with location in URL
- Headers: User-Agent, Accept-Language, X-Forwarded-For
- Query parameters

**Go Proxy Actions:**
1. Extract cache key: `UserAgent:Host+URI:ClientIP:AcceptLanguage`
2. Check if request is cacheable (no `:` in location)
3. Look up in LRU cache (12,800 entries)

**Cache Hit Path:**
```
Go Proxy → Check expiry → Return cached response
```

**Cache Miss Path:**
```
Go Proxy → Set InProgress flag → Forward to Python Backend
```

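
The key layout and the cacheability rule above can be illustrated with a short sketch. The proxy itself is written in Go; this Python version only mirrors the documented layout, and the function names as well as the reading of `Host+URI` as plain concatenation are assumptions.

```python
def cache_key(user_agent: str, host: str, uri: str,
              client_ip: str, accept_language: str) -> str:
    """Build a key in the UserAgent:Host+URI:ClientIP:AcceptLanguage layout.

    Assumption: "Host+URI" means the host followed directly by the URI.
    """
    return f"{user_agent}:{host}{uri}:{client_ip}:{accept_language}"


def is_cacheable(location: str) -> bool:
    """Locations containing ':' (for example ':help') bypass the cache."""
    return ":" not in location
```

For example, a `curl` request for `/London` from `1.2.3.4` would produce one key per user agent, client IP and language, so two clients asking for the same city do not share an entry.
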
### 2. Python Backend Processing

```
Go Proxy → Python Backend (port 8002) → Flask Router
```

**Flask Routes:**
- `/` → `wttr_srv.wttr(None, request)`
- `/{location}` → `wttr_srv.wttr(location, request)`
- `/:help`, `/:bash.function`, etc. → Static file handlers

### 3. Request Parsing (wttr_srv.py)

**Phase 1: Fast Path (Cache + Static)**

```
parse_request(location, request, query, fast_mode=True)
↓
_response(parsed_query, query, fast_mode=True)
↓
Check Python LRU cache
↓
Check if static page (:help, :bash.function, etc.)
↓
Return if found, else continue to slow path
```

**Phase 2: Full Processing**

```
parse_request(location, request, query, fast_mode=False)
↓
Location Processing
↓
_response(parsed_query, query, fast_mode=False)
↓
Render weather
↓
Cache and return
```

### 4. Location Processing (location.py)

**Input:** Location string, Client IP

**Processing Steps:**

```
1. Detect location type
   ├─ Empty/MyLocation → Use client IP
   ├─ IP address → Resolve to location
   ├─ @domain → Resolve domain to IP, then location
   ├─ ~search → Use geolocator service
   ├─ Moon → Special moon handler
   └─ Name → Use as-is

2. Normalize location
   ├─ Lowercase
   ├─ Replace _ and + with space
   └─ Remove special chars (!@#$*;:\)

3. Check aliases (share/aliases)
   └─ from:to mapping

4. Check blacklist (share/blacklist)
   └─ Return 403 if blocked

5. Resolve location
   ├─ IP → Location (GeoIP/IP2Location/IPInfo)
   ├─ Name → GPS coords (Geolocator service)
   └─ IATA code → Airport location

6. Get hemisphere (for moon queries)
   └─ GPS latitude > 0 = North
```

**Output:**
- `location` - Normalized location or GPS coords
- `override_location_name` - Display name
- `full_address` - Full address from geolocator
- `country` - Country name
- `query_source_location` - Client's location (city, country)
- `hemisphere` - True=North, False=South

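
Steps 2 and 6 above are simple enough to sketch directly. This is a minimal illustration of the documented rules, not the code in `location.py`; the function names are hypothetical.

```python
# Characters listed under "Remove special chars" in step 2.
SPECIAL_CHARS = "!@#$*;:\\"


def normalize_location(location: str) -> str:
    """Lowercase, turn '_' and '+' into spaces, then drop special characters."""
    location = location.lower().replace("_", " ").replace("+", " ")
    return "".join(ch for ch in location if ch not in SPECIAL_CHARS)


def is_northern_hemisphere(latitude: float) -> bool:
    """Step 6: a latitude above zero counts as North (True)."""
    return latitude > 0
```

With these rules `New_York` and `new+york` normalize to the same string, which is what lets the alias and blacklist lookups in steps 3 and 4 work on a single canonical form.
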
### 5. IP to Location Resolution

**Method Priority (configurable via WTTR_IPLOCATION_ORDER):**

```
1. GeoIP (MaxMind GeoLite2)
   ├─ Read from GeoLite2-City.mmdb
   ├─ Extract city and country
   └─ Fast, local, free

2. IP2Location API (optional)
   ├─ HTTP GET to api.ip2location.com
   ├─ Requires API key (~/.ip2location.key)
   ├─ Cache result in /wttr.in/cache/ip2l/{ip}
   └─ Format: city;country

3. IPInfo API (optional)
   ├─ HTTP GET to ipinfo.io
   ├─ Requires token (~/.ipinfo.key)
   ├─ Cache result in /wttr.in/cache/ip2l/{ip}
   └─ JSON response

Fallback: NOT_FOUND_LOCATION ("not found")
```

**Caching:**
- File cache: `/wttr.in/cache/ip2l/{ip_address}`
- Format: `city;country` or `location;country;extra;city`
- Persistent across restarts

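
The priority chain and the two cache-line formats can be sketched as follows. The resolver names, the `Resolver` callable type and the shape of the fallback return value are assumptions made for illustration; only the field orders and the `"not found"` constant come from the description above.

```python
from typing import Callable, Dict, List, Optional, Tuple

NOT_FOUND_LOCATION = "not found"

# A resolver takes an IP and returns (city, country), or None if it has no answer.
Resolver = Callable[[str], Optional[Tuple[str, str]]]


def parse_cache_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse 'city;country' or 'location;country;extra;city' into (city, country)."""
    parts = line.strip().split(";")
    if len(parts) == 2:
        city, country = parts
    elif len(parts) == 4:
        _location, country, _extra, city = parts
    else:
        return None
    return city, country


def resolve_ip(ip: str, order: List[str],
               resolvers: Dict[str, Resolver]) -> Tuple[str, Optional[str]]:
    """Try each configured method in order; fall back to NOT_FOUND_LOCATION."""
    for name in order:
        resolver = resolvers.get(name)
        if resolver is None:  # method not configured (for example, no API key)
            continue
        result = resolver(ip)
        if result is not None:
            return result
    return NOT_FOUND_LOCATION, None
```

Passing the order as a list keeps the sketch close to the idea of a configurable `WTTR_IPLOCATION_ORDER`: changing the priority means changing the list, not the code.
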
### 6. Geolocator Service

**For search terms (~location) and non-ASCII names:**

```
Python Backend → HTTP GET localhost:8004/{location}
↓
Geolocator Service (separate microservice)
↓
Returns JSON:
{
  "latitude": 48.8582602,
  "longitude": 2.29449905432,
  "address": "Tour Eiffel, 5, Avenue Anatole France..."
}
```

**Used for:**
- `~Eiffel Tower` → GPS coordinates
- `~Kilimanjaro` → GPS coordinates
- Non-ASCII location names
- IATA airport codes

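
A small sketch of the client side of this exchange is shown below. It only builds the URL and parses a response of the shape shown above; the function names are hypothetical, and percent-encoding the location is an assumption about how the backend escapes search terms.

```python
import json
from typing import Tuple
from urllib.parse import quote

GEOLOCATOR_BASE = "http://localhost:8004"  # port taken from the flow above


def geolocator_url(location: str) -> str:
    """Build the lookup URL; the location is percent-encoded (assumption)."""
    return f"{GEOLOCATOR_BASE}/{quote(location)}"


def parse_geolocator_response(body: str) -> Tuple[float, float, str]:
    """Extract (latitude, longitude, address) from the JSON response body."""
    data = json.loads(body)
    return float(data["latitude"]), float(data["longitude"]), data.get("address", "")
```
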
### 7. Weather Data Fetching

**Two data sources (configured via WWO_KEY):**

#### Option A: met.no (Norwegian Meteorological Institute)

```
Python Backend → metno.py
↓
HTTP GET to api.met.no
↓
Parse XML/JSON response
↓
Transform to standard JSON format
↓
Return weather data
```

**Advantages:**
- Free, no API key required
- High quality data
- No rate limits

#### Option B: WorldWeatherOnline (WWO)

```
Python Backend → bin/proxy.py (separate service)
↓
Check proxy cache (/wttr.in/cache/proxy-wwo/)
↓
If miss: HTTP GET to api.worldweatheronline.com
↓
Cache response
↓
Return weather data
```

**Advantages:**
- More locations supported
- Historical data available

**Disadvantages:**
- Requires API key (~/.wwo.key)
- Rate limited (500 queries/day free tier)

**Weather Data Structure:**
```json
{
  "current_condition": [{
    "temp_C": "22",
    "temp_F": "72",
    "weatherCode": "122",
    "weatherDesc": [{"value": "Overcast"}],
    "windspeedKmph": "7",
    "humidity": "76",
    ...
  }],
  "weather": [
    {
      "date": "2025-12-17",
      "maxtempC": "25",
      "mintempC": "18",
      "hourly": [...]
    }
  ]
}
```

### 8. Weather Rendering

**Route to appropriate renderer based on query:**

```
parsed_query → Determine view type
↓
  ├─ format=1,2,3,4 → view/line.py (one-line format)
  ├─ format=j1 → Return raw JSON
  ├─ format=p1 → view/prometheus.py
  ├─ format=v2 → view/v2.py (data-rich)
  ├─ location=Moon → view/moon.py
  └─ default → view/wttr.py (main view)
```

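
The routing table above amounts to a small dispatch function. The sketch below returns the name of the target module rather than calling it; the `format` key and the function name are hypothetical, and the real dispatch logic may inspect the query differently.

```python
def select_view(parsed_query: dict) -> str:
    """Return the renderer that the routing table would pick for a query."""
    fmt = parsed_query.get("format") or ""        # hypothetical key name
    location = parsed_query.get("location") or ""

    if fmt in ("1", "2", "3", "4"):
        return "view/line.py"
    if fmt == "j1":
        return "raw JSON"
    if fmt == "p1":
        return "view/prometheus.py"
    if fmt == "v2":
        return "view/v2.py"
    # "Moon" may carry a date suffix such as "Moon@2016-12-25".
    if location.lower().split("@")[0] == "moon":
        return "view/moon.py"
    return "view/wttr.py"
```
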
#### Main View (view/wttr.py)

```
get_wetter(parsed_query)
↓
Call wego binary (Go program)
  ├─ Pass flags: -city, -lang, -imperial, -narrow, etc.
  ├─ wego fetches weather data
  ├─ wego renders ANSI output
  └─ Return ANSI text
↓
Post-process output
  ├─ Add location name override
  ├─ Add "not found" message if needed
  └─ Format for display
↓
If HTML output:
  └─ Convert ANSI to HTML (ansi2html.sh)
```

**wego Command Example:**
```bash
/path/to/we-lang \
    --city=London,GB \
    -lang=en \
    -imperial \
    -narrow \
    -location_name="London"
```

#### One-Line View (view/line.py)

```
wttr_line(query, parsed_query)
↓
Get weather data (JSON)
↓
Parse format string
  ├─ Predefined: 1, 2, 3, 4
  └─ Custom: %c, %t, %h, %w, etc.
↓
Replace format codes with data
  ├─ %c → Weather emoji
  ├─ %t → Temperature
  ├─ %h → Humidity
  └─ etc.
↓
Return formatted string
```

**Format Examples:**
- `format=3` → `London: ⛅️ +7°C`
- `format=%l:+%c+%t` → `London: ⛅️ +7°C`

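
The substitution step can be sketched with a single regular expression. This is an illustration under stated assumptions: the preset for `3` is inferred from the two equivalent examples above, `+` in the format is treated as a URL-encoded space, and the data is passed in as a dict keyed by format letter, which is not necessarily how `view/line.py` organizes it.

```python
import re

# Only preset "3" is inferred here, from the equivalence of the two examples.
PRESETS = {"3": "%l: %c %t"}


def render_line(fmt: str, data: dict) -> str:
    """Expand a preset or custom format string using per-letter values."""
    template = PRESETS.get(fmt, fmt).replace("+", " ")  # '+' is a URL-encoded space
    # Replace each %x with data["x"]; unknown codes are left untouched.
    return re.sub(r"%(.)", lambda m: str(data.get(m.group(1), m.group(0))), template)
```

Usage: `render_line("%l:+%c+%t", {"l": "London", "c": "⛅️", "t": "+7°C"})` expands each code in turn, and the `+` inside the temperature value is preserved because only the template is rewritten.
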
#### Moon View (view/moon.py)

```
get_moon(parsed_query)
↓
Parse date from location (Moon@2016-12-25)
↓
Call pyphoon-lolcat binary
  ├─ Pass date parameter
  └─ Return ASCII moon phase art
↓
Return moon phase output
```

#### v2 View (view/v2.py)

```
Experimental data-rich format
↓
Get weather data
↓
Render:
  ├─ Temperature graph (ASCII)
  ├─ Precipitation graph (ASCII)
  ├─ Moon phases (4 days)
  ├─ Current conditions (detailed)
  ├─ Astronomical times (dawn, sunrise, etc.)
  └─ GPS coordinates
↓
Return formatted output
```

#### Prometheus View (view/prometheus.py)

```
Get weather data (JSON)
↓
Convert to Prometheus metrics format
  ├─ temperature_feels_like_celsius{forecast="current"} 7
  ├─ humidity_percent{forecast="current"} 65
  └─ etc.
↓
Return metrics text
```

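
A minimal sketch of the conversion, assuming the two metrics shown above map to `FeelsLikeC` and `humidity` in the `current_condition` entry. `FeelsLikeC` is an assumed key name (only `humidity` appears in the sample data in this document), and the function name is hypothetical.

```python
from typing import Dict

# Metric name -> key in the "current_condition" entry.
# "FeelsLikeC" is an assumed key; "humidity" appears in the sample weather data.
METRICS: Dict[str, str] = {
    "temperature_feels_like_celsius": "FeelsLikeC",
    "humidity_percent": "humidity",
}


def to_prometheus(current: Dict[str, str], forecast: str = "current") -> str:
    """Render the known metrics in the Prometheus text exposition style."""
    lines = []
    for metric, key in METRICS.items():
        if key in current:
            lines.append(f'{metric}{{forecast="{forecast}"}} {current[key]}')
    return "\n".join(lines) + "\n"
```
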
### 9. PNG Rendering (fmt/png.py)

**For .png requests:**

```
ANSI text output
↓
Spawn thread (ThreadPool, 25 workers)
↓
render_ansi(output, options)
↓
Create virtual terminal (pyte)
  ├─ Feed ANSI sequences
  └─ Capture terminal state
↓
Render to image (PIL)
  ├─ Draw characters with font
  ├─ Apply colors from ANSI codes
  └─ Apply transparency if requested
↓
Return PNG bytes
↓
Cache in /wttr.in/cache/png/
```

**Options:**
- `t` - Transparency (150)
- `transparency={0-255}` - Custom transparency
- `{width}x{height}` - Image dimensions

### 10. Translation (translations.py)

**Language Detection:**

```
1. Check subdomain (de.wttr.in → lang=de)
2. Check lang parameter (?lang=de)
3. Check Accept-Language header
4. Default to English
```

**Translation Files:**
- `share/translations/{lang}.txt` - Weather conditions
- `share/translations/{lang}-help.txt` - Help pages

**Translation Process:**

```
Weather condition text (English)
↓
Look up in translations.py:TRANSLATIONS dict
↓
Find translation for target language
↓
Return translated text
```

**Example:**
```python
TRANSLATIONS = {
    "en": {"Partly cloudy": "Partly cloudy"},
    "de": {"Partly cloudy": "Teilweise bewölkt"},
    "fr": {"Partly cloudy": "Partiellement nuageux"}
}
```

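
The four-step detection order and the dictionary lookup can be sketched as below. The function names and the `supported` parameter are hypothetical; the header parsing is deliberately simplified (it takes languages in the order listed and ignores `q` weights), and falling back to the English text for a missing translation is an assumption.

```python
from typing import Optional, Set

TRANSLATIONS = {
    "en": {"Partly cloudy": "Partly cloudy"},
    "de": {"Partly cloudy": "Teilweise bewölkt"},
    "fr": {"Partly cloudy": "Partiellement nuageux"},
}


def detect_language(host: str, lang_param: Optional[str],
                    accept_language: Optional[str], supported: Set[str]) -> str:
    """Apply the order: subdomain, ?lang= parameter, Accept-Language, English."""
    host = host.lower()
    subdomain = host.split(".")[0]
    if host.endswith(".wttr.in") and subdomain in supported:
        return subdomain
    if lang_param and lang_param in supported:
        return lang_param
    if accept_language:
        for item in accept_language.split(","):
            # "de-DE;q=0.9" -> "de"; q weights are ignored in this sketch.
            code = item.split(";")[0].strip().lower().split("-")[0]
            if code in supported:
                return code
    return "en"


def translate(text: str, lang: str) -> str:
    """Look up a condition text; fall back to the English text if missing."""
    return TRANSLATIONS.get(lang, {}).get(text, text)
```
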
### 11. Caching (cache.py)

**Python LRU Cache:**

```
Request → Generate cache signature
↓
signature = f"{user_agent}:{query_string}:{client_ip}:{lang}"
↓
Check in-memory LRU (10,000 entries)
↓
If found and not expired:
  ├─ If value starts with "file:" or "bfile:"
  │  └─ Read from /wttr.in/cache/lru/{md5_hash}
  └─ Return value
↓
If not found:
  ├─ Generate response
  ├─ If response > 80 bytes:
  │  ├─ Write to /wttr.in/cache/lru/{md5_hash}
  │  └─ Store "file:{md5_hash}" in LRU
  └─ Else: Store value directly in LRU
↓
Set expiry: current_time + random(1000, 2000) seconds
↓
Return response
```

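
The flow above combines an in-memory LRU with file spillover for larger values. The sketch below shows one way the pieces could fit together; it is not the code in `cache.py`. The class name is hypothetical, the MD5 is taken over the signature (the source does not say what is hashed), only text values are handled (no `bfile:` branch), and the cache directory is supplied by the caller.

```python
import hashlib
import os
import random
import time
from collections import OrderedDict
from typing import Optional

MIN_SIZE_FOR_FILE = 80  # responses larger than this many bytes go to disk


class ResponseCache:
    def __init__(self, cache_dir: str, max_entries: int = 10_000) -> None:
        self.cache_dir = cache_dir
        self.max_entries = max_entries
        self.entries: "OrderedDict[str, dict]" = OrderedDict()

    def store(self, signature: str, value: str) -> None:
        if len(value.encode("utf-8")) > MIN_SIZE_FOR_FILE:
            # Assumption: the file name is the MD5 of the signature.
            digest = hashlib.md5(signature.encode("utf-8")).hexdigest()
            with open(os.path.join(self.cache_dir, digest), "w", encoding="utf-8") as f:
                f.write(value)
            stored = "file:" + digest
        else:
            stored = value
        # Randomized TTL between 1000 and 2000 seconds.
        expiry = time.time() + random.randint(1000, 2000)
        self.entries[signature] = {"val": stored, "expiry": expiry}
        self.entries.move_to_end(signature)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict least recently used

    def get(self, signature: str) -> Optional[str]:
        entry = self.entries.get(signature)
        if entry is None or entry["expiry"] < time.time():
            return None
        self.entries.move_to_end(signature)  # mark as recently used
        val = entry["val"]
        if val.startswith("file:"):
            path = os.path.join(self.cache_dir, val[len("file:"):])
            with open(path, encoding="utf-8") as f:
                return f.read()
        return val
```

Keeping only a short `file:` reference in memory bounds the size of the LRU regardless of how large the rendered responses are, at the cost of one file read per hit.
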
**Dynamic Timestamps:**

```
Cached response with %{{NOW(timezone)}}
↓
On retrieval: Replace with current time in timezone
↓
Example: %{{NOW(Europe/London)}} → 14:32:15+0000
```

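
The placeholder expansion can be sketched with a regular expression and the standard `zoneinfo` module (Python 3.9+). The function name and the injectable `now` and `lookup` parameters are assumptions added to make the sketch easy to test; the output format follows the `14:32:15+0000` example above.

```python
import re
from datetime import datetime, timezone, tzinfo
from typing import Callable, Optional
from zoneinfo import ZoneInfo

NOW_PATTERN = re.compile(r"%\{\{NOW\(([^)]*)\)\}\}")


def expand_now(text: str, now: Optional[datetime] = None,
               lookup: Callable[[str], tzinfo] = ZoneInfo) -> str:
    """Replace each %{{NOW(tz)}} placeholder with the time in that timezone."""
    moment = datetime.now(timezone.utc) if now is None else now

    def _replace(match: "re.Match[str]") -> str:
        zone = lookup(match.group(1))  # e.g. ZoneInfo("Europe/London")
        return moment.astimezone(zone).strftime("%H:%M:%S%z")

    return NOW_PATTERN.sub(_replace, text)
```

Because the placeholder is expanded on retrieval, a cached response can stay in the cache for its full TTL and still show the current time.
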
### 12. Response Wrapping

**Final response preparation:**

```
Response text/bytes
↓
Determine content type
  ├─ PNG → image/png
  ├─ HTML → text/html
  └─ ANSI/text → text/plain
↓
Add buttons (if HTML and not format query)
  ├─ Add interactive UI elements
  └─ Wrap in HTML template
↓
Set HTTP headers
  ├─ Content-Type
  ├─ Cache-Control (PNG only)
  └─ Access-Control-Allow-Origin: *
↓
Return Flask response
```

### 13. Go Proxy Caching

**After Python backend returns:**

```
Python Backend → Response
↓
Go Proxy receives response
↓
If status code 200 or 304:
  ├─ Store in LRU cache
  ├─ Set expiry: current_time + random(1000, 1500) seconds
  └─ Remove InProgress flag
↓
Else (error):
  └─ Remove from cache
↓
Return response to client
```

### 14. Peak Request Prefetching

**Cron-based prefetching:**

```
Every hour at :30 and :00
↓
Record incoming requests in sync.Map
↓
At :24 and :54 (6 minutes before each peak)
↓
Iterate through recorded requests
↓
For each request:
  ├─ Spawn goroutine
  ├─ Call processRequest() (refreshes cache)
  ├─ Sleep (spread over 300 seconds)
  └─ Delete from sync.Map
↓
Cache is warm for peak time
```

**Peak Times:**
- :30 past the hour (recorded at :30, prefetched at :24)
- :00 on the hour (recorded at :00, prefetched at :54)

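
The timing logic can be illustrated in a few lines. The proxy implements this in Go with goroutines and a `sync.Map`; the Python sketch below only shows the bucketing by minute and one possible way to spread requests over the 300-second window. Even spacing is an assumption (the real spread may be randomized), and both function names are hypothetical.

```python
from typing import List, Optional


def peak_bucket(minute: int) -> Optional[str]:
    """Return which peak a request belongs to, based on its arrival minute."""
    if minute == 30:
        return ":30"
    if minute == 0:
        return ":00"
    return None  # not recorded


def prefetch_delays(count: int, window_seconds: float = 300.0) -> List[float]:
    """Evenly spaced start delays so `count` prefetches fill the window."""
    if count <= 0:
        return []
    return [i * window_seconds / count for i in range(count)]
```

Starting at :24 and spreading the work over 300 seconds means the last prefetch begins before :29, which leaves the cache warm when the :30 peak arrives.
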
## Data Structures

### Parsed Query

```python
{
    "location": "London,GB",                       # Normalized location
    "orig_location": "London",                     # Original input
    "override_location_name": None,                # Display name override
    "full_address": "London, UK",                  # Full address
    "country": "GB",                               # Country code
    "query_source_location": ("Paris", "France"),  # Client location
    "hemisphere": True,                            # North=True, South=False
    "lang": "en",                                  # Language code
    "view": None,                                  # View type (v2, etc.)
    "html_output": False,                          # HTML vs ANSI
    "png_filename": None,                          # PNG filename if .png request
    "ip_addr": "1.2.3.4",                          # Client IP
    "user_agent": "curl/7.68.0",                   # User agent
    "request_url": "http://...",                   # Full request URL

    # Query options
    "use_metric": True,                            # Metric units
    "use_imperial": False,                         # Imperial units
    "use_ms_for_wind": False,                      # m/s for wind
    "narrow": False,                               # Narrow output
    "inverted_colors": False,                      # Inverted colors
    "no-terminal": False,                          # Plain text
    "no-caption": False,                           # No caption
    "no-city": False,                              # No city name
    "no-follow-line": False,                       # No follow line
    "days": "3",                                   # Number of days
    "transparency": None,                          # PNG transparency
    "padding": False,                              # Add padding
    "force-ansi": False,                           # Force ANSI
}
```

### Cache Entry (Go)

```go
type responseWithHeader struct {
    InProgress bool        // Request being processed
    Expires    time.Time   // Expiration time
    Body       []byte      // Response body
    Header     http.Header // HTTP headers
    StatusCode int         // HTTP status code
}
```

### Cache Entry (Python)

```python
{
    "val": "response text",     # or "file:md5hash" when the body is stored on disk
    "expiry": 1702834567.123    # Unix timestamp
}
```

## Error Handling Flow

### Location Not Found

```
Location resolution fails
↓
Set location = NOT_FOUND_LOCATION ("not found")
↓
Fetch weather for default location (Oymyakon)
↓
Append "not found" message in user's language
↓
Return response
```

### API Error

```
Weather API returns error
↓
Log error
↓
If HTML output:
  └─ Return malformed-response.html (500)
Else:
  └─ Return "capacity limit reached" message (503)
```

### Rate Limit Exceeded

```
Check IP against limits (300/min, 3600/hour, 86400/day)
↓
If exceeded:
  └─ Return 429 with error message
```

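
A per-IP limiter with the three windows above could look like the sketch below. It uses sliding windows over stored timestamps, which is an assumption: the source only gives the limits, not whether the real counters are sliding or fixed. The class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple

# (window in seconds, max requests) pairs: 300/min, 3600/hour, 86400/day.
LIMITS: Tuple[Tuple[int, int], ...] = ((60, 300), (3600, 3600), (86400, 86400))


class RateLimiter:
    def __init__(self, limits: Tuple[Tuple[int, int], ...] = LIMITS) -> None:
        self.limits = limits
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return False when the caller should answer with HTTP 429."""
        now = time.time() if now is None else now
        hits = self.hits[ip]
        longest = max(window for window, _ in self.limits)
        # Drop timestamps that are older than the longest window.
        while hits and hits[0] <= now - longest:
            hits.popleft()
        for window, limit in self.limits:
            recent = sum(1 for t in hits if t > now - window)
            if recent >= limit:
                return False
        hits.append(now)
        return True
```
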
### Blocked Location

```
Check location against blacklist
↓
If blocked:
  └─ Return 403 Forbidden
```

## Performance Optimizations

1. **Two-tier caching** (Go + Python)
2. **Fast path** (cache + static files checked first)
3. **File cache** for large responses (>80 bytes)
4. **Prefetching** at peak times
5. **ThreadPool** for PNG rendering (25 workers)
6. **Gevent** for async I/O in Python
7. **LRU eviction** prevents memory bloat
8. **Randomized TTL** prevents thundering herd
9. **InProgress flag** prevents duplicate work
10. **IP location caching** (persistent file cache)