add todo regarding cache ttl on old merge data

Emil Lerch 2026-05-21 08:41:52 -07:00
parent 16048489dd
commit 7deb49254a
Signed by: lobo
GPG key ID: A7B62D657EF764F8

TODO.md

@@ -452,6 +452,45 @@ Eastern rather than a rolling window. This would avoid unnecessary refetches
during the trading day and ensure a fetch shortly after close gets fresh data.
Probably alleviated by the cron job approach.
## Cache TTL semantics on merge writes — priority LOW
The `writeMerged` primitive in `cache/store.zig` rewrites `dividends.srf` /
`splits.srf` with `expires = now + ttl` whenever it adds a new record or
upgrades fields on an existing one. This is conceptually wrong: TTL should
reflect "when do we expect new information from the primary provider?",
which is a property of the conversation with that provider — not of the
file's last-modification time. Adding a 25-year-old historical dividend
that Tiingo just supplied tells us nothing about Polygon's freshness; we
shouldn't bump the file's expiry as a side effect.
The cleaner design:
- Cache file's `#!expires=` reflects "when did Polygon (the primary) last
say `here's everything I have`?"
- Tiingo merge writes preserve the existing expires, only rewriting records.
- Only `fetchCached`'s post-Polygon-fetch write bumps expires.
In practice the current behavior has caused exactly one observable problem: a
one-time TTL herd on 2026-06-04, when the new merge code's first run added
pre-2010 Tiingo backfill across 23+ symbols in a single overnight burst and
every affected file received `expires = now + 14d` from that same night's
clock. We manually re-staggered (`stagger_cache_ttls.py`) and moved on.
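For reference, one way a re-stagger like that could work is sketched below.
This is not the contents of `stagger_cache_ttls.py`; the function name, the
hash-based offset, and the choice to shorten expiries rather than extend them
are all assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def staggered_expiry(symbol: str, now: datetime, ttl_days: int = 14) -> datetime:
    """Hypothetical helper: pick a stable per-symbol expiry inside the TTL window."""
    digest = hashlib.sha256(symbol.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a whole-day offset in [0, ttl_days).
    offset_days = int.from_bytes(digest[:4], "big") % ttl_days
    # Expiry lands between now + 1 day and now + ttl_days, never later than
    # the original TTL, so no file is allowed to go staler than before.
    return now + timedelta(days=offset_days + 1)


if __name__ == "__main__":
    now = datetime(2026, 6, 4, tzinfo=timezone.utc)
    for sym in ["AAPL", "MSFT", "KO"]:  # example tickers only
        print(sym, staggered_expiry(sym, now).date())
```

Hashing the symbol keeps the offset deterministic, so rerunning the script
would not reshuffle expiries that were already spread out.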
Steady-state risk: minimal. The merge primitive's "skip if nothing changed"
branch means no-op refreshes don't bump expires. New entries from genuinely
new dividends are spread across the calendar by the dividends themselves
(quarterly cadence varies per ticker). Field upgrades stop firing once
Polygon's metadata is in place.
When this could matter again:
- Adding a third source for div/splits (TTL semantics get murkier).
- Wiping and rebuilding the server cache (one-time herd recurs).
- A long pause in nightly refreshes followed by a backlog of merge writes.
The fix would be small: thread an optional `expires_override` parameter into
`writeMerged` and have the merge path call `serializeWithMeta` with the
existing expires (from the read) when `source_hint` isn't the primary.
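A minimal Python sketch of that shape follows, purely to illustrate the
semantics. The real primitive lives in `cache/store.zig`; the `CacheFile`
type, the record layout, the `merge_from` wrapper, and the `source_is_primary`
flag (standing in for the `source_hint` check) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CacheFile:
    expires: int  # Unix seconds; stands in for the `#!expires=` header
    records: dict = field(default_factory=dict)  # keyed by date (assumed layout)


def write_merged(cache: CacheFile, incoming: dict, now: int, ttl: int,
                 expires_override: Optional[int] = None) -> bool:
    """Merge incoming records into cache; return True if the file was rewritten."""
    merged = dict(cache.records)
    changed = False
    for key, rec in incoming.items():
        # Existing fields win; incoming only adds records or fills missing fields.
        combined = {**rec, **merged.get(key, {})}
        if merged.get(key) != combined:
            merged[key] = combined
            changed = True
    if not changed:
        # "Skip if nothing changed": records and expires are left untouched.
        return False
    cache.records = merged
    # With an override the caller's expiry is kept; otherwise the TTL is bumped.
    cache.expires = expires_override if expires_override is not None else now + ttl
    return True


def merge_from(cache: CacheFile, incoming: dict, now: int, ttl: int,
               source_is_primary: bool) -> bool:
    """Non-primary sources pass the existing expiry through as the override."""
    override = None if source_is_primary else cache.expires
    return write_merged(cache, incoming, now, ttl, expires_override=override)
```

Under these assumptions a Tiingo backfill of an old dividend rewrites the
records but leaves `expires` where the last primary fetch put it, while a
primary write that changes anything still sets `expires = now + ttl`.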
## On-demand server-side fetch for new symbols
Currently the server's SRF endpoints (`/candles`, `/dividends`, etc.) are pure