wttr/ARCHITECTURE.md
Emil Lerch ea88a88e31
update architecture cache section to reflect recent changes
2026-01-09 13:54:32 -08:00

# wttr Architecture
## Overview
Single Zig binary, including:
- HTTP server utilizing [http.zig](https://github.com/karlseguin/http.zig)
- L1 memory/L2 file caching scheme with a single directory for
  * Geocoding (place name -> coordinates) as a permanent cache
  * IP -> location (via [GeoLite2](https://github.com/P3TERX/GeoLite.mmdb) with fallback to [IP2Location](https://ip2location.io)) as a permanent cache
  * Weather as a temporary cache
- Pluggable weather provider interface
- Rate limiting middleware
The idea was to keep most or all of the features of [wttr.in](https://wttr.in)
while vastly simplifying the architecture of that system, which had evolved
significantly over time. This is a full rewrite of that system.
## System Diagram
```
Client Request
      ↓
HTTP Server (http.zig)
      ↓
Rate Limiter (middleware)
      ↓
Router
      ↓
Request Handler
      ↓
Location Resolver
      ↓
Provider interface cache check
      ↓ (miss)
Weather Provider (interface)
  ├─ MetNo (default)
  └─ Mock (tests)
      ↓
Cache Store
      ↓
Renderer
  ├─ ANSI
  ├─ JSON
  ├─ Line
  └─ v2
      ↓
Response
```
## Module Structure
```
src/
├── main.zig # Entry point, server setup
├── Config.zig # Configuration
├── http/
│ ├── server.zig # HTTP server wrapper
│ ├── router.zig # Route matching
│ ├── handler.zig # Request handlers
│ └── middleware.zig # Rate limiter
├── cache/
│ ├── cache.zig # Cache interface
│ ├── lru.zig # In-memory LRU
│ └── file.zig # File-backed storage
├── location/
│ ├── resolver.zig # Main location resolution
│ ├── GeoLite2.zig # GeoIP wrapper
│ └── Ip2location.zig # IP2Location fallback
├── weather/
│ ├── provider.zig # Weather provider interface
│ ├── MetNo.zig # met.no implementation
│ └── types.zig # Weather data structures
└── render/
  ├── ansi.zig # ANSI terminal output
  ├── json.zig # JSON output
  └── line.zig # One-line format
```
## Core Components
### HTTP Server
**Responsibilities:**
- Listen on configured port
- Parse HTTP requests
- Route to handlers
- Apply middleware (rate limiting)
- Return responses
**Dependencies:**
- [http.zig](https://github.com/karlseguin/http.zig)
**Routes:**
```
GET / → weather for IP location
GET /{location} → weather for location
GET /:help → help page
```
### Rate Limiter
**Algorithm:** Token Bucket
**Configuration:**
```zig
pub const RateLimitConfig = struct {
    capacity: u32 = 300, // Max tokens in bucket
    refill_rate: u32 = 5, // Tokens per second
    refill_interval_ms: u64 = 200, // Refill every 200ms
};
```
**Implementation:**
- HashMap of IP → TokenBucket
- Each request consumes 1 token
- Tokens refill at configured rate (default: 5/second)
- Bucket capacity: 300 tokens (allows bursts)
- Periodic cleanup of old buckets (not accessed in 1 hour)
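
The token bucket behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's middleware: `TokenBucket`, `init`, and `tryConsume` are hypothetical names, the per-IP HashMap and the cleanup pass are omitted, and the caller is assumed to supply a millisecond clock.

```zig
const std = @import("std");

/// Hypothetical per-client bucket; parameters mirror RateLimitConfig.
pub const TokenBucket = struct {
    capacity: u32,
    refill_rate: u32, // tokens per second
    refill_interval_ms: u64, // assumed to be non-zero
    tokens: u32,
    last_refill_ms: u64,

    pub fn init(capacity: u32, refill_rate: u32, refill_interval_ms: u64, now_ms: u64) TokenBucket {
        return .{
            .capacity = capacity,
            .refill_rate = refill_rate,
            .refill_interval_ms = refill_interval_ms,
            .tokens = capacity, // start full so an initial burst is allowed
            .last_refill_ms = now_ms,
        };
    }

    /// Returns true and consumes one token if the request may proceed.
    pub fn tryConsume(self: *TokenBucket, now_ms: u64) bool {
        self.refill(now_ms);
        if (self.tokens == 0) return false;
        self.tokens -= 1;
        return true;
    }

    fn refill(self: *TokenBucket, now_ms: u64) void {
        if (now_ms <= self.last_refill_ms) return;
        const intervals = (now_ms - self.last_refill_ms) / self.refill_interval_ms;
        if (intervals == 0) return;
        // Tokens added per interval: 5/s at 200ms intervals gives 1 token.
        const per_interval = @as(u64, self.refill_rate) * self.refill_interval_ms / 1000;
        const added = intervals *| per_interval; // saturating multiply
        const total = @min(@as(u64, self.tokens) +| added, @as(u64, self.capacity));
        self.tokens = @intCast(total);
        // Advance by whole intervals only, so partial intervals are not lost.
        self.last_refill_ms += intervals * self.refill_interval_ms;
    }
};
```

With the defaults shown in `RateLimitConfig`, a client can burst up to 300 requests and then sustain roughly 5 requests per second.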
### Cache
**Single layer, two-tier cache system (L1 memory + L2 file):**
**Interface:**
```zig
pub const Cache = struct {
    lru: Lru, // L1: In-memory cache
    cache_dir: ?[]const u8, // L2: Optional file-backed storage

    pub fn get(self: *Cache, key: []const u8) ?[]const u8;
    pub fn put(self: *Cache, key: []const u8, value: []const u8, ttl: u64) !void;
};
```
**Storage Strategy:**
- **L1 (Memory)**: LRU cache with configurable max entries
- **L2 (Disk)**: Optional file-backed storage for persistence
  - Files stored as JSON with key, value, and expiration timestamp
  - On eviction from L1, data remains in L2
  - On cache miss in L1, checks L2 and promotes to L1 if found
- **TTL**: 1000-2000s (randomized to prevent thundering herd)
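
The randomized TTL can be illustrated with a short sketch. `jitteredExpiry` is a hypothetical helper, not a function from the wttr source; it assumes a uniform draw over the TTL window, which spreads expirations so that entries cached at the same moment do not all expire together.

```zig
const std = @import("std");

/// Hypothetical helper: returns an absolute expiry time in seconds, using a
/// TTL drawn uniformly from [min_ttl_s, max_ttl_s].
pub fn jitteredExpiry(random: std.Random, now_s: i64, min_ttl_s: i64, max_ttl_s: i64) i64 {
    const ttl = random.intRangeAtMost(i64, min_ttl_s, max_ttl_s);
    return now_s + ttl;
}
```

A caller would pass something like `prng.random()` from a seeded `std.Random.DefaultPrng`, together with the current Unix time and the bounds 1000 and 2000.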
**Cache Locations:**
All caches default to `$XDG_CACHE_HOME/wttr` (typically `~/.cache/wttr`).
1. **Weather Response Cache**
   - Location: `$WTTR_CACHE_DIR` (default: `~/.cache/wttr/`)
   - Format: JSON (one file for each entry/location)
   - Size: 10,000 entries (configurable via `WTTR_CACHE_SIZE`)
   - Expiration: 1000-2000 seconds (randomized)
   - Eviction: LRU
   - Implementation: `src/cache/Cache.zig`, `src/cache/Lru.zig`
2. **Geocoding Cache**
   - Purpose: Caches results from nominatim API to minimize API calls
   - Location: `$WTTR_GEOCACHE_FILE` (default: `~/.cache/wttr/geocache.json`)
   - Format: JSON (single file with all entries)
   - Expiration: None (persists indefinitely)
   - Implementation: `src/location/GeoCache.zig`
3. **IP2Location Cache**
   - Purpose: Caches results from IP2Location.io API to minimize API calls
   - Location: `$IP2LOCATION_CACHE_FILE` (default: `~/.cache/wttr/ip2location.cache`)
   - Format: Text file with header `#Ip2location:v2` followed by CSV lines: `ip,lat,lon,name`.
     The name is stored last so that it can contain any bytes, including commas
   - Expiration: None (persists indefinitely)
   - Storage: In-memory hash map (u128 IP → Location) + append-only file
   - Implementation: `src/location/Ip2location.zig` (internal Cache struct)
4. **GeoIP Database**
   - Location: `$WTTR_GEOLITE_PATH` (default: `~/.cache/wttr/GeoLite2-City.mmdb`)
   - Auto-downloaded if missing
   - Implementation: `src/location/GeoLite2.zig`
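
To show why the IP2Location cache keeps the name in the last CSV column, here is a minimal parsing sketch for one `ip,lat,lon,name` line. `Entry` and `parseLine` are hypothetical names, and the IP is kept as a raw slice because the on-disk encoding of the `ip` field is not specified here.

```zig
const std = @import("std");

/// Hypothetical parsed form of one cache line.
pub const Entry = struct {
    ip: []const u8,
    lat: f64,
    lon: f64,
    name: []const u8,
};

/// Splits on the first three commas only; everything after the third comma
/// is the name, so commas inside the name need no escaping.
pub fn parseLine(line: []const u8) !Entry {
    const c1 = std.mem.indexOfScalar(u8, line, ',') orelse return error.InvalidLine;
    const c2 = std.mem.indexOfScalarPos(u8, line, c1 + 1, ',') orelse return error.InvalidLine;
    const c3 = std.mem.indexOfScalarPos(u8, line, c2 + 1, ',') orelse return error.InvalidLine;
    return .{
        .ip = line[0..c1],
        .lat = try std.fmt.parseFloat(f64, line[c1 + 1 .. c2]),
        .lon = try std.fmt.parseFloat(f64, line[c2 + 1 .. c3]),
        .name = line[c3 + 1 ..],
    };
}
```

Because only the first three separators are significant, appending a new line never requires quoting or escaping the location name.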
### Location Resolver (`src/location/resolver.zig`)
**Responsibilities:**
- Parse location from URL
- Normalize location names
- Resolve IP to location (GeoIP)
- Resolve names to coordinates (geocoding)
- Handle special prefixes (~, @)
**GeoIP:**
- Uses MaxMind GeoLite2 database
- Auto-downloads if missing
- Fallback to IP2Location API
**Ip2Location:**
- Uses [Ip2Location](https://ip2location.io)
- API key not required. Without a key, requests are limited to 1k/day. Because
  GeoLite2 handles most of the mapping, and results are cached, this should be
  fine for most needs
- API key provides 50k/month requests
- Set via `IP2LOCATION_API_KEY` environment variable
**Geocoding:**
- Uses nominatim, part of the OpenStreetMap project
- API key not required
### Weather Provider
**Interface (pluggable):**
```zig
pub const WeatherProvider = struct {
    ptr: *anyopaque,
    vtable: *const VTable,
    cache: *Cache,

    pub const VTable = struct {
        fetchRaw: *const fn (ptr: *anyopaque, allocator: std.mem.Allocator, coords: Coordinates) anyerror![]const u8,
        parse: *const fn (ptr: *anyopaque, allocator: std.mem.Allocator, raw: []const u8) anyerror!types.WeatherData,
        deinit: *const fn (ptr: *anyopaque) void,
    };

    pub fn fetch(self: WeatherProvider, location: []const u8) !WeatherData;
};
```
Note that the interface will handle result caching according to the description
of the cache section above. Fetch and parse are two separate functions in this
interface to assist with unit tests.
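
As an illustration of how a `ptr` + `vtable` interface like this is typically wired up, the sketch below uses a trimmed-down stand-in with a single vtable entry. `Provider` and `MockProvider` are hypothetical simplifications (no allocator, cache, `parse`, or `deinit`), not the project's actual types.

```zig
const std = @import("std");

pub const Coordinates = struct { latitude: f64, longitude: f64 };

/// Trimmed-down stand-in for the interface above: one vtable entry only.
pub const Provider = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    pub const VTable = struct {
        fetchRaw: *const fn (ptr: *anyopaque, coords: Coordinates) anyerror![]const u8,
    };

    /// Dispatches to whichever implementation is behind `ptr`.
    pub fn fetchRaw(self: Provider, coords: Coordinates) ![]const u8 {
        return self.vtable.fetchRaw(self.ptr, coords);
    }
};

/// Hypothetical test double that returns a canned payload.
pub const MockProvider = struct {
    payload: []const u8,
    calls: usize = 0,

    const vtable = Provider.VTable{ .fetchRaw = fetchRawImpl };

    pub fn provider(self: *MockProvider) Provider {
        return .{ .ptr = self, .vtable = &vtable };
    }

    fn fetchRawImpl(ptr: *anyopaque, coords: Coordinates) anyerror![]const u8 {
        _ = coords; // the mock ignores the requested location
        const self: *MockProvider = @ptrCast(@alignCast(ptr));
        self.calls += 1;
        return self.payload;
    }
};
```

Keeping the raw fetch separate from parsing means a test can feed a recorded payload straight into the parser without any network access.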
**MetNo Implementation:**
- Fetches from Met.no API
- Performs timezone conversions at ingestion, so results are in the timezone
of the target location
- Groups forecast data by local date
- No API key necessary, but requires METNO_TOS_IDENTIFYING_EMAIL environment
variable to be set
**Provider vs Renderer Responsibilities:**
**Weather Provider:**
- Fetch raw weather data from external APIs
- Parse API responses into structured types
- Perform timezone conversions to the timezone of the weather location
once at ingestion time
- Group forecast data by local date (not UTC date)
- Store both UTC time and local time in forecast data
**Renderer:**
- Format weather data for display
- Select appropriate hourly forecasts for display
- Apply unit conversions (metric/imperial). Conversion functions are in the core
weather types, but the renderer is responsible for calling them
- Handle partial days with missing data
- Format dates and times for human readability
- Should NOT perform timezone calculations
**Key principle:** Timezone conversions happen once at the provider level
**Implementation details:**
- Core data structures use `zeit.Time` and `zeit.Date` types for type safety
- `HourlyForecast` contains both `time: zeit.Time` (UTC) and `local_time: zeit.Time`
- MetNo provider converts the location to timezone offsets through `src/location/timezone_offsets.zig`
  * This code uses a pre-computed timezone offset lookup table
  * It is auto-generated by a Python script
  * It is *NOT* precise, currently calculating the timezone based on the longitude
    only. This could be up to 2 hours different from the actual timezone. However,
    the purpose here is to report weather to a granularity of morning/noon/evening/night
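
The longitude-only approximation can be sketched in a few lines. This is an assumption about the underlying idea, not the generated lookup table in `timezone_offsets.zig`: `approxUtcOffsetHours` is a hypothetical function that treats every 15 degrees of longitude as one hour.

```zig
const std = @import("std");

/// Hypothetical: approximate UTC offset in whole hours from longitude alone.
/// Political timezone boundaries and daylight saving time are ignored.
pub fn approxUtcOffsetHours(longitude: f64) i8 {
    // The Earth rotates 360 degrees in 24 hours, so 15 degrees is one hour.
    const hours = @round(longitude / 15.0);
    return @intFromFloat(std.math.clamp(hours, -12.0, 12.0));
}
```

An estimate like this is coarse, but it is enough to sort hourly forecasts into morning, noon, evening, and night buckets.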
### Renderers
There are currently 5 renderers:
* formatted
* json
* v2
* custom
* line
**Formatted Renderer:**
```zig
pub const FormattedRenderer = struct {
    pub fn render(allocator: Allocator, data: WeatherData, options: RenderOptions) ![]const u8;
};
```
This is the most complex of the renderers, and is the default when the user
doesn't specifically choose something else. It displays the full table of
current conditions and a forecast of up to 3 days, in plain text, ANSI, or HTML
format. Currently, these formats are coded separately. It would be possible to
produce only ANSI and HTML, then strip out the extra markup to get plain text,
but this is **not** how the code works today. Also, the original project has an
option to avoid characters some brain-dead terminals can't handle. That has not
been implemented. When it is, it will likely be a search/replace over the few
Unicode characters that the output uses.
**JSON Renderer:**
```zig
pub const JsonRenderer = struct {
    pub fn render(allocator: Allocator, data: WeatherData) ![]const u8;
};
```
Simply uses the `std.json.fmt` API to render the underlying data type. No
attempt has been made to match wttr.in's data or format (though it probably
should).
**One-Line Renderer:**
```zig
pub const LineRenderer = struct {
    pub fn render(allocator: Allocator, data: WeatherData, format: []const u8) ![]const u8;
};
// Formats:
// 1: "London: ⛅️ +7°C"
// 2: "London: ⛅️ +7°C 🌬↗11km/h"
// 3: "London: ⛅️ +7°C 🌬↗11km/h 💧65%"
// Custom: "%l: %c %t" → "London: ⛅️ +7°C"
```
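
A custom format string such as `%l: %c %t` could be expanded along these lines. This is a minimal sketch under stated assumptions: `LineFields` and `expand` are hypothetical, only the three specifiers shown above are handled, and the values are pre-formatted strings rather than the project's `WeatherData`.

```zig
const std = @import("std");

/// Hypothetical pre-formatted inputs; not the project's WeatherData.
pub const LineFields = struct {
    location: []const u8,
    condition: []const u8,
    temperature: []const u8,
};

/// Expands %l, %c and %t into `out` and returns the written slice.
/// Unknown specifiers are copied through unchanged.
pub fn expand(out: []u8, format: []const u8, f: LineFields) ![]const u8 {
    var n: usize = 0;
    var i: usize = 0;
    while (i < format.len) : (i += 1) {
        var piece: []const u8 = format[i .. i + 1];
        if (format[i] == '%' and i + 1 < format.len) {
            const replacement: ?[]const u8 = switch (format[i + 1]) {
                'l' => f.location,
                'c' => f.condition,
                't' => f.temperature,
                else => null,
            };
            if (replacement) |r| {
                piece = r;
                i += 1; // skip the specifier character
            }
        }
        if (n + piece.len > out.len) return error.NoSpaceLeft;
        @memcpy(out[n..][0..piece.len], piece);
        n += piece.len;
    }
    return out[0..n];
}
```

Writing into a caller-supplied buffer keeps the sketch allocation-free; the `render` signature above takes an allocator instead, so the real renderer presumably builds its output differently.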
## Network Calls
The application makes network calls to the following services:
### 1. GeoLite2 Database Download
- **URL:** `https://github.com/P3TERX/GeoLite.mmdb/raw/download/GeoLite2-City.mmdb`
- **Purpose:** Download MaxMind GeoLite2 database if missing
- **Frequency:** Only on first run or when database is missing
### 2. IP2Location API
- **URL:** `https://api.ip2location.io/?key={API_KEY}&ip={IP_ADDRESS}`
- **Purpose:** Fallback geolocation when MaxMind lookup fails
- **Frequency:** Only when the MaxMind lookup fails (`IP2LOCATION_API_KEY` is optional)
- **Rate Limit:** 1k requests/day without an API key; 50k requests/month with one
- **Caching:** Results cached persistently
### 3. Nominatim Geocoding API
- **URL:** `https://nominatim.openstreetmap.org/search?q={LOCATION}&format=json&limit=1`
- **Purpose:** Convert city/location names to coordinates
- **Frequency:** When user requests weather by city name
- **Rate Limit:** Maximum 1 request per second
- **Caching:** Results cached persistently
### 4. Met.no Weather API
- **URL:** `https://api.met.no/weatherapi/locationforecast/2.0/compact?lat={LAT}&lon={LON}`
- **Purpose:** Fetch weather forecast data
- **Frequency:** Every request (unless cached)
- **Rate Limit:** No explicit limit, respectful usage required
- **Caching:** Results cached in memory (L1 LRU) and, when a cache directory is configured, on disk (L2)
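
As a small illustration of the Met.no URL template above, the request URL can be built with `std.fmt.bufPrint`. `metNoUrl` is a hypothetical helper, and rounding coordinates to four decimals is an assumption made here (it also keeps nearby lookups on the same URL), not something confirmed from the wttr source.

```zig
const std = @import("std");

/// Hypothetical helper: formats the locationforecast URL into `buf`.
/// Coordinates are rounded to four decimal places (an assumption).
pub fn metNoUrl(buf: []u8, lat: f64, lon: f64) ![]u8 {
    return std.fmt.bufPrint(
        buf,
        "https://api.met.no/weatherapi/locationforecast/2.0/compact?lat={d:.4}&lon={d:.4}",
        .{ lat, lon },
    );
}
```

A 128-byte stack buffer is comfortably large enough for this URL, so no allocation is needed.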
## Dependencies
**External (Zig packages):**
- HTTP Server: [http.zig](https://github.com/karlseguin/http.zig)
- Time utilities: [zeit](https://github.com/rockorager/zeit)
**External (other):**
- Airport code -> location mapping: [Openflights](https://github.com/jpatokal/openflights)
- IP address -> location mapping: [GeoLite2 City Database](https://github.com/maxmind/libmaxminddb)
- Moon phase calculations (vendored): [Phoon](https://acme.com/software/phoon/)
- Astronomical calculations (vendored): [Sunriset](http://www.stjarnhimlen.se/comp/sunriset.c)
  - Note: a small change was made to the original to provide the ability to
    skip putting main() into the object file
## Performance Targets
**Latency:**
- Cache hit: <1ms
- Cache miss: <100ms
- P95: <150ms
- P99: <300ms
**Throughput:**
- >10,000 req/s (cached)
- >1,000 req/s (uncached)
**Memory:**
- Base: <50MB (currently 9MB)
- With 10,000 cache entries: <200MB
- Binary size: <5MB (currently <2MB in ReleaseSmall)