zfin/docs/explanation/caching.md
Emil Lerch 74fc219afd
add docs/guides
2026-06-22 14:53:53 -07:00

Caching and data freshness

zfin makes a lot of API calls on your behalf -- prices, dividends, earnings, ETF holdings -- against providers with strict free-tier limits. Aggressive caching is what keeps it fast and keeps you well under those limits. This page explains how it works so the --refresh-data flag makes sense.

The fetch path

Every cacheable data request (everything except live quotes) walks the same tiers, stopping at the first one that can satisfy it:

  1. Local cache. Look for ~/.cache/zfin/<SYMBOL>/<type>.srf. If the file exists and is within its TTL, deserialize and return -- no network at all.
  2. Shared server (optional). On a miss or stale entry, if ZFIN_SERVER is set, zfin asks that server before any provider; a hit is written into your local cache and served from there, so no provider call happens. See Server sync.
  3. Provider. Otherwise zfin fetches from the upstream provider, writes the result to the cache, and returns it.

Freshness is decided by the cache file's modification time versus the TTL for that data type. The cache directory defaults to ~/.cache/zfin and is set with ZFIN_CACHE_DIR.
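As a rough illustration of that freshness rule, here is a minimal Python sketch (zfin itself is a Zig project, so this is not its code). The helper names `cache_dir`, `cache_path`, and `is_fresh` are hypothetical, as are the `<type>` spellings used as dictionary keys; the TTL values come from the table later on this page.

```python
import os
import time
from pathlib import Path
from typing import Optional

# TTLs in seconds for a few data types. The key names are hypothetical;
# the durations are the ones documented on this page.
TTL_SECONDS = {
    "candles_daily": 23 * 3600 + 45 * 60,  # 23h45m
    "dividends": 14 * 24 * 3600,           # 14 days
    "options": 3600,                       # 1 hour
}


def cache_dir() -> Path:
    """ZFIN_CACHE_DIR if set, otherwise ~/.cache/zfin."""
    override = os.environ.get("ZFIN_CACHE_DIR")
    return Path(override) if override else Path.home() / ".cache" / "zfin"


def cache_path(symbol: str, data_type: str) -> Path:
    """<cache dir>/<SYMBOL>/<type>.srf"""
    return cache_dir() / symbol / f"{data_type}.srf"


def is_fresh(path: Path, ttl_seconds: float, now: Optional[float] = None) -> bool:
    """True when the file exists and its modification time is within the TTL."""
    try:
        mtime = path.stat().st_mtime
    except FileNotFoundError:
        return False  # a missing file is simply a cache miss
    now = time.time() if now is None else now
    return (now - mtime) <= ttl_seconds
```

Because only the modification time is consulted, nothing inside the file has to be parsed to decide whether it can be served.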

The --refresh-data policy decides which tiers run:

  • auto (default) walks all three.
  • force skips the local cache and the server, going straight to the provider, then re-caches the result.
  • never stops at the local cache: it returns cached data even if stale, and never touches the server or a provider.
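The three policies can be pictured as one function that decides which tiers to visit. This is a hedged Python sketch, not zfin's implementation: `fetch` and its four callables are hypothetical stand-ins for the local cache, the shared server, and the provider.

```python
def fetch(policy, read_local, ask_server, ask_provider, write_local):
    """Walk the fetch tiers according to the --refresh-data policy.

    read_local(allow_stale) -> data or None
    ask_server()            -> data or None (None when ZFIN_SERVER is unset)
    ask_provider()          -> data
    write_local(data)       -> None
    """
    if policy == "never":
        # Offline: serve whatever is cached, stale or not; no network.
        return read_local(True)

    if policy == "auto":
        data = read_local(False)  # tier 1: fresh local cache only
        if data is not None:
            return data
        data = ask_server()       # tier 2: optional shared server
        if data is not None:
            write_local(data)
            return data
    elif policy != "force":
        raise ValueError(f"unknown policy: {policy!r}")

    # Tier 3: reached by "auto" after two misses, or directly by "force".
    data = ask_provider()
    write_local(data)
    return data
```

Note that `force` never reads the cache but still writes to it, so a later `auto` run benefits from the refreshed data.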

Time-to-live by data type

Different data ages at different rates, so each type has its own TTL:

| Data type | TTL | Why |
|---|---|---|
| Daily candles | ~24h (23h45m) | One bar per trading day; slightly under 24h for cron jitter |
| Dividends | 14 days | Declared well in advance |
| Splits | 14 days | Rare corporate events |
| Options | 1 hour | Prices move continuously when markets are open |
| Earnings | 30 days* | Quarterly; smart-refreshed around announcements |
| ETF profiles | ~30 days | Holdings and weights change slowly |
| Quotes | never cached | Meant to be a live price check |

* Earnings smart refresh: even inside the 30-day window, cached earnings re-fetch automatically once an earnings date has passed but the cache still lacks the actual result -- so numbers appear promptly after an announcement without daily polling.
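The smart-refresh rule boils down to a small predicate. The sketch below is illustrative Python under stated assumptions: the `EarningsEvent` fields and the function name are hypothetical, and "has passed" is read here as "strictly before today", which may differ from what zfin actually does.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Sequence

EARNINGS_TTL = timedelta(days=30)


@dataclass
class EarningsEvent:
    report_date: date
    actual_eps: Optional[float]  # None until the result has been reported


def earnings_needs_refresh(
    events: Sequence[EarningsEvent], cache_age: timedelta, today: date
) -> bool:
    """Refresh when the TTL has expired, or when a past event still lacks its result."""
    if cache_age > EARNINGS_TTL:
        return True
    return any(e.report_date < today and e.actual_eps is None for e in events)
```

A cache that only contains future dates, or past dates with results filled in, stays valid for the full 30 days.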

Quotes are never cached

Because quotes exist to give you a live price, they're never served from cache. The practical consequence: in offline mode (--refresh-data=never) the quote command has nothing to serve, while candle-based commands like perf work fine from cached history.

Incremental candle updates

Price history isn't re-downloaded wholesale. When the cached candles have gone stale, zfin fetches only candles newer than the last cached date and appends them, using a small candles_meta.srf companion file to track the last date and source provider. A ten-year history costs one big fetch the first time and tiny top-ups thereafter.
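A minimal sketch of that top-up, in Python for illustration only. The `top_up` function, the `(date, close)` tuple shape, and the `fetch_since` callable are all hypothetical; the real on-disk format is the .srf file plus its candles_meta.srf companion.

```python
from datetime import date
from typing import Callable, List, Optional, Tuple

Candle = Tuple[date, float]  # (trading day, close); a simplified stand-in


def top_up(
    cached: List[Candle],
    fetch_since: Callable[[Optional[date]], List[Candle]],
) -> Tuple[List[Candle], Optional[date]]:
    """Append only candles newer than the last cached date.

    Returns the merged history and the new last date, which is the kind of
    value a metadata companion file would record.
    """
    last = cached[-1][0] if cached else None
    fetched = fetch_since(last)  # None means "no history yet, fetch everything"
    # Drop anything the provider returns that we already have.
    newer = [c for c in fetched if last is None or c[0] > last]
    new_last = newer[-1][0] if newer else last
    return cached + newer, new_last
```

Filtering on the date keeps the append idempotent even if the provider's range is inclusive of the last cached day.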

Negative caching

When a provider permanently fails for a symbol -- a nonexistent ticker, say -- zfin records a negative cache entry so it doesn't retry the same dead lookup on every run. (Transient failures like rate limits are not cached this way; they're retried.)
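The distinction between permanent and transient failures can be sketched as follows. This is illustrative Python, not zfin's code: the two exception classes and `lookup` are hypothetical, and an in-memory set stands in for whatever zfin writes to disk as a negative entry.

```python
class PermanentError(Exception):
    """The provider says this symbol will never resolve (e.g. unknown ticker)."""


class TransientError(Exception):
    """A temporary problem such as a rate limit; worth retrying later."""


def lookup(symbol, negative_cache, provider):
    """Return provider data, or None for a symbol known (or found) to be dead."""
    if symbol in negative_cache:
        return None  # skip the provider entirely
    try:
        return provider(symbol)
    except PermanentError:
        negative_cache.add(symbol)  # remember, so the next run does not retry
        return None
    # TransientError is deliberately not caught: nothing is recorded,
    # so the caller is free to retry.
```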

Rate limiting

Each provider has a client-side token-bucket limiter sized to its free-tier ceiling (e.g. Polygon 5/min, FMP 250/day). When you'd exceed the rate, zfin blocks until a token is available rather than firing a request that would 429. This is why a --refresh-data=force run across many symbols can pace itself instead of failing. Limits are listed in Data providers and API keys.
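For readers unfamiliar with token buckets, here is a small single-threaded Python sketch of the idea. It is not zfin's limiter: the `TokenBucket` class and its parameters are hypothetical, and the clock and sleep functions are injectable only so the behaviour can be exercised without real waiting.

```python
import time


class TokenBucket:
    """Allow `capacity` requests per `per_seconds`, blocking when empty."""

    def __init__(self, capacity, per_seconds, clock=time.monotonic, sleep=time.sleep):
        self.capacity = capacity
        self.rate = capacity / per_seconds  # tokens regained per second
        self.tokens = float(capacity)       # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Take one token, sleeping first if the bucket is empty."""
        self._refill()
        if self.tokens < 1:
            self.sleep((1 - self.tokens) / self.rate)
            self._refill()
            # Guard against float rounding leaving us a hair under one token.
            self.tokens = max(self.tokens, 1.0)
        self.tokens -= 1
```

With a bucket sized like the Polygon example above, `TokenBucket(5, 60)`, the first five calls to `acquire()` return immediately and the sixth waits roughly twelve seconds.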

Server sync (ZFIN_SERVER)

ZFIN_SERVER points zfin at an optional zfin-server instance -- a shared cache that sits between your local cache and the upstream providers, and is the second tier of the fetch path. On a local miss, zfin requests GET {ZFIN_SERVER}/<SYMBOL>/<type> (candles, dividends, splits, options, earnings, classification, ETF metrics, and EDGAR entity facts), and a hit is written straight into your local cache.
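The request URL follows directly from that pattern. A hedged Python sketch: `server_url` is a hypothetical helper, the trailing-slash handling and percent-encoding are assumptions, and returning None when the variable is unset mirrors the "silently no-ops" behaviour this page describes.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def server_url(
    symbol: str, data_type: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Build {ZFIN_SERVER}/<SYMBOL>/<type>, or None when no server is configured."""
    env = os.environ if env is None else env
    base = env.get("ZFIN_SERVER")
    if not base:
        return None  # server sync is optional; callers skip this tier
    # Assumption: tolerate a trailing slash and percent-encode path segments.
    return f"{base.rstrip('/')}/{quote(symbol, safe='')}/{quote(data_type, safe='')}"
```

For example, with `ZFIN_SERVER` set to a placeholder such as `http://cache.example:8080`, `server_url("AAPL", "dividends")` yields `http://cache.example:8080/AAPL/dividends`.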

Why bother: the server is warmed once -- say by a cron job on one machine -- and then every client draws from it instead of each spending its own provider quota, so a household or a fleet of machines shares one set of API-key budgets and gets faster cold starts. For the portfolio price load, the server is queried in parallel across symbols, with per-symbol provider fallback only for what it can't supply.
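The portfolio load described above, parallel server queries followed by per-symbol fallback, might look something like this in Python. It is a sketch under assumptions: `load_prices` and both callables are hypothetical, and a thread pool is just one way to get the parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def load_prices(symbols, from_server, from_provider, max_workers=8):
    """Ask the server for every symbol in parallel, then fill gaps from the provider.

    from_server(symbol)   -> data or None when the server cannot supply it
    from_provider(symbol) -> data
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with `symbols`.
        server_results = list(pool.map(from_server, symbols))

    prices = {}
    for symbol, result in zip(symbols, server_results):
        # Provider quota is spent only on what the server lacked.
        prices[symbol] = result if result is not None else from_provider(symbol)
    return prices
```

The fallback runs sequentially here so that provider calls stay easy to rate-limit.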

It is entirely optional: when ZFIN_SERVER is unset, every server-sync path silently no-ops and zfin runs local-cache-then-provider. Live quotes are never served by the server (they aren't cached anywhere), and --refresh-data=force bypasses the server to re-fetch from the provider.

Controlling it

You rarely need to intervene -- auto does the right thing. When you do:

  • --refresh-data=force re-fetches everything (after a close, or to clear suspected bad data).
  • --refresh-data=never goes fully offline.
  • zfin cache stats shows what's cached; zfin cache clear wipes it (everything re-fetches next run).

See Offline use and refreshing data.