# Caching and data freshness

zfin makes a lot of API calls on your behalf -- prices, dividends,
earnings, ETF holdings -- against providers with strict free-tier
limits. Aggressive caching is what keeps it fast and keeps you well
under those limits. This page explains how it works so the
[`--refresh-data`](../guides/offline-and-refresh.md) flag makes sense.

## The fetch path

Every data request walks the same tiers, stopping at the first one that
can satisfy it:

1. **Local cache.** Look for `~/.cache/zfin/<SYMBOL>/<type>.srf`. If the
   file exists and is within its TTL, deserialize and return -- no
   network at all.
2. **Shared server** *(optional)*. On a miss or stale entry, if
   `ZFIN_SERVER` is set, zfin asks that server before any provider; a
   hit is written into your local cache and served from there, so no
   provider call happens. See [Server sync](#server-sync-zfin_server).
3. **Provider.** Otherwise zfin fetches from the upstream provider,
   writes the result to the cache, and returns it.

Freshness is decided by the cache file's modification time versus the
TTL for that data type. The cache directory defaults to `~/.cache/zfin`
and is set with `ZFIN_CACHE_DIR`.

The `--refresh-data` policy decides which tiers run:

- `auto` (default) walks all three.
- `force` skips the local cache and the server, going straight to the
  provider, then re-caches the result.
- `never` stops at the local cache: it returns cached data even if
  stale, and never touches the server or a provider.

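
The tiers and policies above can be sketched in a few lines of Python. This is an illustration of the described behavior, not zfin's actual code: `Refresh`, `fetch`, and the two callables `from_server` and `from_provider` are hypothetical names, and the cache file is treated as opaque bytes.

```python
import time
from enum import Enum
from pathlib import Path
from typing import Callable, Optional


class Refresh(Enum):
    AUTO = "auto"
    FORCE = "force"
    NEVER = "never"


# Hypothetical fetchers: each returns the payload, or None if it has nothing.
Fetcher = Callable[[], Optional[bytes]]


def _store(path: Path, data: bytes) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)


def fetch(path: Path, ttl_seconds: float, policy: Refresh,
          from_server: Optional[Fetcher], from_provider: Fetcher) -> Optional[bytes]:
    """Walk local cache -> shared server -> provider under a refresh policy."""
    cached = path.exists()
    # Freshness is the file's modification time compared with the type's TTL.
    fresh = cached and (time.time() - path.stat().st_mtime) < ttl_seconds

    if policy is Refresh.NEVER:
        # Offline: serve whatever is cached, even if stale; no network.
        return path.read_bytes() if cached else None

    if policy is Refresh.AUTO:
        if fresh:
            return path.read_bytes()            # tier 1: fresh local cache
        if from_server is not None:             # tier 2: only if a server is configured
            data = from_server()
            if data is not None:
                _store(path, data)              # a server hit lands in the local cache
                return data

    # Tier 3: provider. Under FORCE this is the only tier that runs.
    data = from_provider()
    if data is not None:
        _store(path, data)
    return data
```

Passing `from_server=None` models an unset `ZFIN_SERVER`: the function then degrades to local-cache-then-provider.
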
## Time-to-live by data type

Different data ages at different rates, so each type has its own TTL:

| Data type     | TTL           | Why                                                         |
|---------------|---------------|-------------------------------------------------------------|
| Daily candles | ~24h (23h45m) | One bar per trading day; slightly under 24h for cron jitter |
| Dividends     | 14 days       | Declared well in advance                                    |
| Splits        | 14 days       | Rare corporate events                                       |
| Options       | 1 hour        | Prices move continuously when markets are open              |
| Earnings      | 30 days\*     | Quarterly; smart-refreshed around announcements             |
| ETF profiles  | ~30 days      | Holdings and weights change slowly                          |
| Quotes        | never cached  | Meant to be a live price check                              |

\* **Earnings smart refresh:** even inside the 30-day window, cached
earnings re-fetch automatically once an earnings date has passed but
the cache still lacks the actual result -- so numbers appear promptly
after an announcement without daily polling.

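
As a rough sketch, the table and the earnings footnote reduce to two small predicates. The dictionary keys, function names, and the `(report_date, actual)` tuple shape are assumptions made for illustration; the ETF value uses a flat 30 days where the table says "~30 days".

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

# TTLs copied from the table above; the key names are illustrative.
TTL = {
    "candles_daily": timedelta(hours=23, minutes=45),
    "dividends": timedelta(days=14),
    "splits": timedelta(days=14),
    "options": timedelta(hours=1),
    "earnings": timedelta(days=30),
    "etf_profile": timedelta(days=30),
    # No "quotes" entry: quotes are never cached.
}


def is_fresh(data_type: str, age: timedelta) -> bool:
    """True if a cache entry of this age is still inside its TTL."""
    ttl = TTL.get(data_type)
    return ttl is not None and age < ttl


def earnings_need_refresh(age: timedelta,
                          events: List[Tuple[date, Optional[float]]],
                          today: date) -> bool:
    """Smart refresh: re-fetch if the TTL expired, or if any earnings date
    has already passed while its actual result is still missing."""
    if not is_fresh("earnings", age):
        return True
    return any(report_date < today and actual is None
               for report_date, actual in events)
```

A two-day-old earnings file whose last report date was yesterday, with no actual figure recorded, would be re-fetched; the same file with the figure filled in would not.
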
## Quotes are never cached

Because quotes exist to give you a live price, they're never served
from cache. The practical consequence: in offline mode
(`--refresh-data=never`) the [`quote`](../reference/cli/quote.md)
command has nothing to serve, while candle-based commands like
[`perf`](../reference/cli/perf.md) work fine from cached history.

## Incremental candle updates

Price history isn't re-downloaded wholesale. When cached candles go
stale, zfin fetches only candles newer than the last cached date and
appends them, using a small `candles_meta.srf` companion file to track
the last date and source provider. A ten-year history costs one big
fetch the first time and tiny top-ups thereafter.

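
A minimal sketch of the top-up, assuming candles are kept sorted by date and the provider can be asked for everything since a given day. `Candle`, `top_up`, and `fetch_since` are hypothetical names; the real metadata lives in `candles_meta.srf`, whose format is not shown here.

```python
from datetime import date
from typing import Callable, List, Optional, Tuple

# Simplified candle: (trading day, close). Real candles carry more fields.
Candle = Tuple[date, float]


def top_up(cached: List[Candle], last_date: Optional[date],
           fetch_since: Callable[[Optional[date]], List[Candle]]
           ) -> Tuple[List[Candle], Optional[date]]:
    """Fetch only candles after last_date and append them to the cache.

    last_date is None on the very first run, which triggers the one big fetch.
    """
    fetched = fetch_since(last_date)
    # Guard against providers that return the boundary day again.
    new = [c for c in fetched if last_date is None or c[0] > last_date]
    merged = cached + new
    new_last = merged[-1][0] if merged else last_date
    return merged, new_last
```

The returned `new_last` is what would be written back to the companion metadata so the next run knows where to resume.
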
## Negative caching

When a provider permanently fails for a symbol -- a nonexistent
ticker, say -- zfin records a negative cache entry so it doesn't retry
the same dead lookup on every run. (Transient failures like rate limits
are not cached this way; they're retried.)

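
The distinction between permanent and transient failures might look like the sketch below. The exception classes, the `lookup` function, and the `negative` marker file name are all invented for illustration; how zfin actually stores a negative entry is not described on this page.

```python
from pathlib import Path
from typing import Callable, Optional


class PermanentError(Exception):
    """The provider says the symbol can never be served (e.g. unknown ticker)."""


class TransientError(Exception):
    """A temporary failure such as a rate limit; worth retrying later."""


def lookup(symbol: str, cache_dir: Path,
           fetch: Callable[[str], bytes]) -> Optional[bytes]:
    marker = cache_dir / symbol / "negative"   # hypothetical marker file
    if marker.exists():
        return None                            # known-dead lookup: skip the provider
    try:
        return fetch(symbol)
    except PermanentError:
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.touch()                         # remember the failure for later runs
        return None
    # TransientError is deliberately not caught: nothing is recorded,
    # so the caller is free to retry.
```
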
## Rate limiting

Each provider has a client-side token-bucket limiter sized to its
free-tier ceiling (e.g. Polygon 5/min, FMP 250/day). When you'd exceed
the rate, zfin blocks until a token is available rather than firing a
request that would 429. This is why a `--refresh-data=force` run across
many symbols can pace itself instead of failing. Limits are listed in
[Data providers and API keys](../reference/providers.md).

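
For readers unfamiliar with the technique, here is a generic blocking token bucket. It is a textbook sketch rather than zfin's limiter: the class name, constructor parameters, and the injectable `clock` and `sleep` hooks are assumptions for the example.

```python
import time


class TokenBucket:
    """Blocking token bucket: `capacity` requests per `period_seconds`."""

    def __init__(self, capacity: int, period_seconds: float,
                 clock=time.monotonic, sleep=time.sleep):
        self.capacity = capacity
        self.rate = capacity / period_seconds   # tokens regained per second
        self.tokens = float(capacity)           # start full, so bursts are allowed
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self) -> None:
        """Take one token, sleeping until one is available."""
        self._refill()
        # Small epsilon so float rounding cannot leave us just under one token.
        while self.tokens < 1 - 1e-9:
            self.sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens = max(0.0, self.tokens - 1)


# Sized like the examples in the text: 5 per minute, 250 per day.
polygon_limiter = TokenBucket(5, 60.0)
fmp_limiter = TokenBucket(250, 86_400.0)
```

Calling `acquire()` before each request means the sixth call within a minute waits about twelve seconds instead of producing a 429.
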
## Server sync (`ZFIN_SERVER`)

`ZFIN_SERVER` points zfin at an optional
[zfin-server](https://git.lerch.org/lobo/zfin-server) instance -- a
shared cache that sits between your local cache and the upstream
providers, and is the second tier of [the fetch path](#the-fetch-path).
On a local miss, zfin requests `GET {ZFIN_SERVER}/<SYMBOL>/<type>`
(candles, dividends, splits, options, earnings, classification, ETF
metrics, and EDGAR entity facts), and a hit is written straight into
your local cache.

Why bother: the server is warmed once -- say by a cron job on one
machine -- and then every client draws from it instead of each spending
its own provider quota, so a household or a fleet of machines shares one
set of API-key budgets and gets faster cold starts. For the portfolio
price load, the server is queried in parallel across symbols, with
per-symbol provider fallback only for what it can't supply.

It is entirely optional: when `ZFIN_SERVER` is unset, every server-sync
path silently no-ops and zfin runs local-cache-then-provider. Live
quotes are never served by the server (they aren't cached anywhere), and
`--refresh-data=force` bypasses the server to re-fetch from the provider.

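
The parallel portfolio load with per-symbol fallback could be sketched as follows. `load_prices`, `from_server`, and `from_provider` are hypothetical; the two callables stand in for the HTTP request to the server and the upstream provider call, and a thread pool is just one way to get the parallelism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def load_prices(symbols: Iterable[str],
                from_server: Callable[[str], Optional[bytes]],
                from_provider: Callable[[str], Optional[bytes]],
                max_workers: int = 8) -> Dict[str, bytes]:
    """Ask the shared server for every symbol in parallel, then fall back to
    the provider only for the symbols the server could not supply."""
    symbols = list(symbols)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with symbols.
        server_results = list(pool.map(from_server, symbols))

    prices: Dict[str, bytes] = {}
    for symbol, data in zip(symbols, server_results):
        if data is None:
            data = from_provider(symbol)   # per-symbol fallback
        if data is not None:
            prices[symbol] = data
    return prices
```

Provider calls stay sequential in this sketch, which keeps them easy to pace with a rate limiter.
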
## Controlling it

You rarely need to intervene -- `auto` does the right thing. When you
do:

- `--refresh-data=force` re-fetches everything (useful after the market
  close, or to clear suspected bad data).
- `--refresh-data=never` goes fully offline.
- [`zfin cache stats`](../reference/cli/cache.md) shows what's cached;
  `zfin cache clear` wipes it (everything re-fetches next run).

See [Offline use and refreshing data](../guides/offline-and-refresh.md).