From 7deb49254af57f1cba35f07a4889ef23b3ae537d Mon Sep 17 00:00:00 2001
From: Emil Lerch
Date: Thu, 21 May 2026 08:41:52 -0700
Subject: [PATCH] add todo regarding cache ttl on old merge data

---
 TODO.md | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/TODO.md b/TODO.md
index b5afca0..bf32f33 100644
--- a/TODO.md
+++ b/TODO.md
@@ -452,6 +452,45 @@ Eastern rather than a rolling window. This would avoid unnecessary refetches
 during the trading day and ensure a fetch shortly after close gets fresh data.
 Probably alleviated by the cron job approach.
 
+## Cache TTL semantics on merge writes — priority LOW
+
+The `writeMerged` primitive in `cache/store.zig` rewrites `dividends.srf` /
+`splits.srf` with `expires = now + ttl` whenever it adds a new record or
+upgrades fields on an existing one. This is conceptually wrong: TTL should
+reflect "when do we expect new information from the primary provider?",
+which is a property of the conversation with that provider — not of the
+file's last-modification time. Adding a 25-year-old historical dividend
+that Tiingo just supplied tells us nothing about Polygon's freshness; we
+shouldn't bump the file's expiry as a side effect.
+
+The cleaner design:
+
+- Cache file's `#!expires=` reflects "when did Polygon (the primary) last
+  say `here's everything I have`?"
+- Tiingo merge writes preserve the existing expires, only rewriting records.
+- Only `fetchCached`'s post-Polygon-fetch write bumps expires.
+
+In practice the current behavior caused exactly one observable problem: a
+one-time TTL herd on 2026-06-04 when the new merge code's first run added
+pre-2010 Tiingo backfill across 23+ symbols in a single overnight burst,
+and they all inherited that day's clock for `expires = now + 14d`. We
+manually re-staggered (`stagger_cache_ttls.py`) and moved on.
+
+Steady-state risk: minimal. The merge primitive's "skip if nothing changed"
+branch means no-op refreshes don't bump expires. New entries from genuinely
+new dividends are spread across the calendar by the dividends themselves
+(quarterly cadence varies per ticker). Field upgrades stop firing once
+Polygon's metadata is in place.
+
+When this could matter again:
+- Adding a third source for div/splits (TTL semantics get murkier).
+- Wiping and rebuilding the server cache (one-time herd recurs).
+- A long pause in nightly refreshes followed by a backlog of merge writes.
+
+Fix would be small: thread `?expires_override` into `writeMerged` and have
+the merge path call `serializeWithMeta` with the existing expires (from the
+read) when source_hint isn't the primary.
+
 ## On-demand server-side fetch for new symbols
 
 Currently the server's SRF endpoints (`/candles`, `/dividends`, etc.) are pure