# Future Work

No work here is blocking - we're in a good state. Items below are
ordered roughly by priority within each section. Priority labels
(`HIGH` / `MEDIUM` / `LOW`) mark items that deserve explicit
ranking; unlabeled items are "someday, if the mood strikes."

## Projections: future enhancements

- **Goal-seek over distribution horizon for W1 - priority LOW.**
  Today the W1 ("set spending, find date") workflow reports the
  earliest retirement at each user-configured `(horizon, confidence)`
  cell. The philosophically correct version asks "when have I
  accumulated enough wealth that the projection shows a 95%
  probability of success withdrawing X per year from retirement
  until age-of-death?" - i.e. goal-seek across both `accumulation_years`
  AND `distribution_years` simultaneously, anchored to a configured
  age-of-death. NP-shaped search; not worth optimizing until
  someone wants it.
- **Per-person `retirement_age` - priority LOW.**
  V1 of the accumulation-phase spec chose Option A: a single
  household retirement boundary derived from the oldest configured
  birthdate. Households where one earner retires significantly
  earlier than the other would benefit from per-person
  `retirement_age` fields on each `type::birthdate` record, with
  contributions stopped per-person.
- **Historical projection overlay follow-ups.** The base
  `--overlay-actuals` overlay shipped (CLI tip + TUI primary surface).
  Open enhancements:
  - Historical `metadata.srf` / `projections.srf` for back-dated
    runs. Today the overlay re-runs against current classifications
    and assumptions; for historically faithful what-the-model-said-then
    output we'd check out the git-tracked versions of those files
    at the as-of commit and load those instead. Edge case until
    classifications materially drift.
  - Contribution-attribution overlay. Today's actuals line includes
    contributions implicitly; the bands assume modeled contributions
    that may or may not match reality. A "decompose actuals into
    market return vs contributions" annotation would clarify how
    much of the trajectory was the model being right vs new money
    arriving on schedule.
- **Better composition basis for imported-only as-of.** Today
  the imported-only path uses today's allocations scaled by
  `imported_liquid / today_total_liquid`. That's the simplest
  thing that could work, but it's "today's mix back-dated" -
  it ignores everything we know about the historical context.
  Specifically: `imported_values.srf` already carries an
  `expected_return` field per row that the user captured at
  that date in their source spreadsheet. We could:
  - Use the imported `expected_return` as a sanity check
    against the simulation's per-position weighted return
    (warn or clamp if they diverge wildly - the spreadsheet's
    number reflects what the user actually saw at the time).
  - Use the imported `expected_return` to bias the
    stock/bond split inference: a higher expected return
    implies a higher historical equity weighting than today's
    mix probably reflects.
  - Reach further: derive a synthetic stock/bond split from
    the imported `expected_return` directly, treating it as
    a weighted average of SPY and AGG returns at that date
    and solving for the weights. That gives a
    per-imported-row composition that's locally faithful
    instead of one-mix-fits-all.

  None of these are urgent - the current "today's mix scaled"
  approximation is documented as such and the bands still
  render meaningfully - but each would tighten the historical
  faithfulness one notch. Pick whichever has the highest
  payoff vs. complexity when this gets revisited.
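As a rough illustration of the "reach further" idea above: deriving a synthetic stock/bond split from one blended expected return is a single linear equation. If `r = w * r_stock + (1 - w) * r_bond`, then `w = (r - r_bond) / (r_stock - r_bond)`. The sketch below is Python rather than zfin code. The function name, the clamp to `[0, 1]`, and the sample returns are all assumptions made for illustration; none of them come from `imported_values.srf`.

```python
def solve_equity_weight(expected_return: float,
                        stock_return: float,
                        bond_return: float) -> float:
    """Solve r = w*stock + (1-w)*bond for the equity weight w, clamped to [0, 1].

    Hypothetical helper: the stock and bond proxy returns (e.g. SPY and AGG
    at the as-of date) would have to be supplied by the caller.
    """
    spread = stock_return - bond_return
    if spread == 0:
        raise ValueError("stock and bond returns are equal; the split is undetermined")
    weight = (expected_return - bond_return) / spread
    # Clamp: a blended return outside the range spanned by the two proxies
    # cannot be expressed as a long-only mix of them.
    return min(1.0, max(0.0, weight))


# Made-up numbers, for illustration only.
equity = solve_equity_weight(expected_return=0.07, stock_return=0.09, bond_return=0.04)
print(f"equity {equity:.0%}, bonds {1 - equity:.0%}")
```

Clamping is one possible policy. Warning instead of clamping would fit the sanity-check bullet equally well.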

## Investigate: detailed 401(k) contributions data source

Found a more detailed contributions screen on at least one
employer-sponsored 401(k) provider portal - distinct from the
standard positions/holdings view we already pull from. Worth
investigating whether this unlocks better attribution than what
we get from the positions CSV alone, and whether other 401(k)
providers expose similar screens.

Open questions to answer when picking this up:

- Which screen specifically (path / URL within the portal)? Is there
  an export option, or is it view-only / scrape territory?
- What fields does it expose (employee pre-tax, employer match,
  after-tax / mega-backdoor, by-pay-period dates, per-fund
  allocations)?
- Refresh cadence - per-paycheck, daily, on-demand?
- Can it be auto-discovered like the existing audit CSVs, or
  is it manual-entry territory?

If the export is structured and recurring, this could feed a
401(k)-specific contributions classifier that bypasses the lot-diff
heuristic for that account, similar to how `cash_is_contribution`
opts ESPP/HSA accounts into cash-based attribution.

Related: ESPP-style accrual blind spot in the "Audit: manual-check
accounts mechanism" section above.

## On-demand server-side fetch for new symbols

Currently the server's SRF endpoints (`/candles`, `/dividends`, etc.) are pure
cache reads - they 404 if the data isn't already on disk. New symbols only get
populated when added to the portfolio and picked up by the next cron refresh.

Consider: on a cache miss, instead of blocking the HTTP response with a
multi-second provider fetch, kick off an async background fetch (or just
auto-add the symbol to the portfolio) and return 404 as usual. The next
request - or the next cron run - would then have the data. This gives
"instant-ish gratification" for new symbols without the downsides of
synchronous fetch-on-miss (latency, rate limit contention, unbounded cache
growth from arbitrary tickers).

Note that this approach does nothing to eliminate the API keys a fully
functioning system needs. A more aggressive option would be to treat
`ZFIN_SERVER` as the sole source of record, but that would make the process
more opaque while we wait for candles (for example) to populate. The server
could address this by spawning a thread to fetch the data and returning
202 Accepted, which the client could then poll. Maybe this is a better
long-term approach?
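A minimal sketch of the miss-then-background-fetch flow, written in Python for brevity rather than as server code. Every name here (`BackgroundFetchCache`, `fetch`, `miss_status`, `wait`) is hypothetical. The status returned on a miss is a parameter so the same shape covers both "return 404 as usual" and the 202-and-poll variant. It deliberately ignores the rate-limit and cache-growth concerns raised above.

```python
import threading
from typing import Callable, Optional


class BackgroundFetchCache:
    """Serve cache hits; on a miss, start one background fetch and return at once."""

    def __init__(self, fetch: Callable[[str], bytes], miss_status: int = 202):
        self._fetch = fetch              # hypothetical slow provider call
        self._miss_status = miss_status  # 202 invites polling; 404 keeps today's contract
        self._cache: dict[str, bytes] = {}
        self._in_flight: dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def get(self, symbol: str) -> tuple[int, Optional[bytes]]:
        with self._lock:
            if symbol in self._cache:
                return 200, self._cache[symbol]
            if symbol not in self._in_flight:
                # Deduplicate: at most one fetch per symbol at a time.
                worker = threading.Thread(
                    target=self._populate, args=(symbol,), daemon=True
                )
                self._in_flight[symbol] = worker
                worker.start()
        return self._miss_status, None

    def _populate(self, symbol: str) -> None:
        try:
            data = self._fetch(symbol)  # runs without holding the lock
            with self._lock:
                self._cache[symbol] = data
        finally:
            with self._lock:
                self._in_flight.pop(symbol, None)

    def wait(self, symbol: str, timeout: float = 5.0) -> None:
        """Demo helper: block until any in-flight fetch for `symbol` ends."""
        with self._lock:
            worker = self._in_flight.get(symbol)
        if worker is not None:
            worker.join(timeout)


cache = BackgroundFetchCache(fetch=lambda s: f"candles for {s}".encode())
first_status, _ = cache.get("NEWSYM")      # miss: fetch starts in the background
cache.wait("NEWSYM")                       # a real client would poll instead
second_status, body = cache.get("NEWSYM")  # hit once the worker has finished
```

The in-flight map is what keeps repeated requests for the same missing symbol from stacking up provider calls while the first fetch is still running.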

## Support Tiingo paid plan - priority LOW

zfin hardwires Tiingo to free-tier assumptions: the provider is
constructed with `RateLimiter.perHour(io, 50)` in `Tiingo.init`
(`providers/tiingo.zig`), and the only Tiingo surface is end-of-day
candles plus the corporate actions that ride along in the same
response. A user who pays for a Tiingo plan ($30/mo Power tier and
up) gets nothing for it today - the same 50/hour throttle, the same
EOD-only data. "Support the paid plan" is the umbrella for unlocking
what that subscription actually buys: higher rate limits and
real-time IEX quotes. The two are coupled (real-time polling only
makes sense once the limit is raised), which is why they belong in
one entry rather than two.

### Tier-aware rate limiting

The 50/hour cap is hardcoded in `Tiingo.init`
(`RateLimiter.perHour(io, 50)`), and the module docstring bakes in
"Free tier: 50 requests/hour and 1,000 requests/day." Paid tiers
raise both ceilings substantially, so a paying subscriber is being
throttled far below their entitlement. Today only the hourly bucket
is wired; the daily ceiling isn't enforced at all (the docstring
notes it's "far from binding" for bursty EOD usage - real-time
polling changes that calculus).

Work:

- Make the Tiingo limits configurable instead of hardcoded. Options:
  explicit `ZFIN_TIINGO_RATE_PER_HOUR` (and per-day) numeric env
  knobs, or a coarser `ZFIN_TIINGO_PLAN` = `free` (default) |
  `power` | ... that maps to known limits. Lean toward explicit
  numeric overrides so we aren't chasing Tiingo's published per-tier
  numbers as they drift.
- `RateLimiter` already supports arbitrary `init(io, max, window_ns)`
  plus `perDay`/`perHour` convenience ctors, so the limiter side is
  cheap. Decide whether a paid plan needs both an hourly and a daily
  bucket enforced, or whether hourly alone stays sufficient.
- Caveat from `RateLimiter`'s own docs: the bucket is in-memory and
  per-process - it caps a single run's burst, not usage across
  separate launches in the same window. Sustained real-time polling
  (below) makes cross-process usage likelier, so revisit whether
  per-process accounting is still good enough.
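To make the "explicit numeric overrides" option concrete, here is a small Python sketch of how the limits might be resolved from the environment. It is illustrative only: `ZFIN_TIINGO_RATE_PER_HOUR` is the name proposed above, while `ZFIN_TIINGO_RATE_PER_DAY`, `TiingoLimits`, and `resolve_limits` are hypothetical. Silently falling back to the default on an invalid value is an assumption, and a real implementation might prefer to warn. The override value in the example is made up, not a published Tiingo limit.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

FREE_TIER_PER_HOUR = 50  # the cap that is hardcoded today


@dataclass(frozen=True)
class TiingoLimits:
    per_hour: int
    per_day: Optional[int]  # None means no daily bucket is enforced, as today


def _positive_int(env: Mapping[str, str], name: str,
                  default: Optional[int]) -> Optional[int]:
    """Parse a positive integer override; use `default` when unset or invalid."""
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def resolve_limits(env: Mapping[str, str] = os.environ) -> TiingoLimits:
    per_hour = _positive_int(env, "ZFIN_TIINGO_RATE_PER_HOUR", FREE_TIER_PER_HOUR)
    # Hypothetical variable name for the per-day knob.
    per_day = _positive_int(env, "ZFIN_TIINGO_RATE_PER_DAY", None)
    return TiingoLimits(per_hour=per_hour, per_day=per_day)


# Made-up override value, for illustration only.
limits = resolve_limits({"ZFIN_TIINGO_RATE_PER_HOUR": "5000"})
print(limits)
```

With nothing set, the result matches today's behavior: 50 per hour and no daily bucket.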

### Real-time IEX quotes (was: configurable live-quote provider)

The TUI refresh key (`r`) values the portfolio with live intraday
quotes via `DataService.loadLiveQuotes` (`service.zig`), which is
Yahoo-only: Yahoo is keyless, consolidated, and stays off every
rate-limit budget, so bursty refresh traffic costs nothing. The
tradeoffs are that Yahoo's unofficial feed is ~15-minute delayed and
"can break without notice."

Tiingo's IEX endpoint (`/iex/?tickers=A,B,C`) is a strong opt-in
alternative for a paid subscriber: it's genuinely real-time (IEX
last-sale, no 15-min delay), official/keyed, and bills per HTTP
request - one call returns the whole portfolio (confirmed
empirically: a 2-ticker batch decrements the daily quota by 1, not
2). Fields map cleanly: `tngoLast` to price, `prevClose` to
day-change. Caveats: IEX is a single venue (~2-3% of volume), so
`tngoLast` can sit stale between prints on illiquid names, and IEX
doesn't trade mutual funds, so those fall back to the candle close.

Proposal: a config knob (env var, e.g. `ZFIN_LIVE_QUOTE_PROVIDER` =
`yahoo` (default) | `tiingo`) that switches `loadLiveQuotes` to a new
`Tiingo.fetchQuotes(tickers)` batched call. A paid subscriber who
wants real-time and mashes `r` a lot (or once we add streaming)
reuses their existing `TIINGO_API_KEY` and gets real-time coverage;
everyone else keeps the keyless Yahoo default.

Implementation notes:

- `Tiingo.fetchQuotes` returns an array whose order is NOT guaranteed
  to match the request order, so key results by the returned
  `ticker` field, not by position.
- Live quotes share Tiingo's token bucket, so this is the concrete
  reason the tier-aware rate-limiting work above has to land first
  (or alongside): a batched quote call is only 1 request, but heavy
  `r` use plus candle refreshes draining the free 50/hour bucket is
  exactly the contention that raising the paid-tier limit relieves.
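The "key by `ticker`, not by position" note can be sketched as follows, in Python for brevity. The field names `ticker`, `tngoLast`, and `prevClose` are the ones named above; `index_quotes`, the output shape, and the case-normalization are assumptions. The sample rows are invented to show out-of-order results and a missing mutual fund, and are not a captured Tiingo response.

```python
from typing import Optional


def index_quotes(rows: list[dict],
                 requested: list[str]) -> dict[str, Optional[dict]]:
    """Map each requested ticker to {price, day_change}, or None if absent."""
    by_ticker: dict[str, dict] = {}
    for row in rows:
        # Defensive case-normalization (an assumption about the payload).
        ticker = str(row.get("ticker", "")).upper()
        last = row.get("tngoLast")
        prev = row.get("prevClose")
        if not ticker or last is None:
            continue  # unusable row; the caller falls back for this symbol
        by_ticker[ticker] = {
            "price": last,
            "day_change": None if prev is None else last - prev,
        }
    # None signals "no live quote": fall back to the candle close,
    # which is the expected path for mutual funds.
    return {symbol: by_ticker.get(symbol.upper()) for symbol in requested}


# Invented rows, deliberately in a different order than the request.
sample_rows = [
    {"ticker": "MSFT", "tngoLast": 410.0, "prevClose": 405.5},
    {"ticker": "AAPL", "tngoLast": 190.25, "prevClose": 191.0},
]
quotes = index_quotes(sample_rows, ["AAPL", "MSFT", "VTSAX"])
```

Iterating over the requested list rather than the response makes a missing symbol an explicit `None` instead of a silent gap.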

### Websocket streaming (follow-on)

Tiingo's IEX websocket would be the natural follow-on for true
push-based real-time, replacing poll-on-`r` entirely. Materially
bigger than the REST quote path (persistent connection, reconnect
handling, a background task feeding the TUI) and squarely a
paid-plan feature. Sequence it after the REST quote path proves out.

## Analysis: dividend equity / income-shaped equity - think about it

Dividend-equity ETFs (SCHD, VYM, DGRO, NOBL, SDY, VIG, etc.)
bucket as Equity in `analysis.bucketSector`. That's correct for
risk-exposure analysis - they drop with the market in a
2008-style crash, regardless of the dividend stream - but it
loses the income-vs-growth distinction that retirement-planning
tools care about.

Open question: is there a useful second dimension to add?
Possibilities:

- **Yield-weighted breakdown.** Aggregate `current_yield` per
  position, weight by market value, report a portfolio-level
  yield. Doesn't change the asset-class taxonomy; adds a new
  metric.
- **Income coverage of expenses.** "My dividends + bond coupons
  cover X% of projected retirement spending." Closer to what the
  income-side framing actually wants - answers the question
  rather than redefining the buckets.
- **Income-equity sub-bucket within Equity.** A sub-row in the
  Asset Category breakdown, not a 5th top-level bucket. Would
  need a way to mark funds as "income-shaped" - probably a
  per-symbol opt-in in `metadata.srf`.

Not a bug. Not blocking anything. Could end up being a feature.
This is a note to revisit after using the 4-bucket view for a
while and seeing whether the missing dimension actually matters
in practice.

Resist the temptation to:

- **Add a 5th top-level bucket** ("Income Equity" / "Dividend
  Equity"). The 4-bucket view is already the right answer for
  "how much equity exposure do I have?". A 5th bucket
  fragments the headline number.
- **Override SCHD to Fixed Income.** Wrong on risk grounds.
  SCHD will lose 35-45% in an equity crash; treating it as FI
  makes the user think they have downside protection they don't.
- **Add per-symbol "intent" metadata** (`held_for_income::true`).
  Smell of putting framing into data. Intent is a property of
  the holder's strategy, not the security.

If a fix lands, it's probably a separate analysis section (yield
breakdown, income coverage) - not a change to the asset-class
taxonomy.
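For reference, the yield breakdown and income coverage ideas reduce to two small formulas: portfolio yield is `sum(mv_i * yield_i) / sum(mv_i)`, and coverage is annual income divided by projected annual spending. A Python sketch follows. The `Position` type, the function names, the placeholder symbols, and every number are made up for illustration. Only the `current_yield` field name comes from the text above, and treating it as an annual fraction is an assumption. Neither function touches the asset-class buckets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    symbol: str
    market_value: float
    current_yield: float  # assumed to be an annual fraction, e.g. 0.035


def annual_income(positions: list[Position]) -> float:
    """Projected yearly income: each position's value times its yield."""
    return sum(p.market_value * p.current_yield for p in positions)


def portfolio_yield(positions: list[Position]) -> float:
    """Market-value-weighted yield across all positions."""
    total = sum(p.market_value for p in positions)
    if total <= 0:
        return 0.0
    return annual_income(positions) / total


def income_coverage(positions: list[Position], annual_spending: float) -> float:
    """Fraction of projected annual spending covered by portfolio income."""
    if annual_spending <= 0:
        raise ValueError("annual_spending must be positive")
    return annual_income(positions) / annual_spending


# Placeholder holdings with made-up values and yields.
positions = [
    Position("DIVIDEND_FUND", 100_000.0, 0.035),
    Position("BOND_FUND", 50_000.0, 0.04),
    Position("BROAD_MARKET_FUND", 50_000.0, 0.0125),
]
weighted = portfolio_yield(positions)
coverage = income_coverage(positions, annual_spending=61_250.0)
print(f"portfolio yield {weighted:.2%}, income covers {coverage:.0%} of spending")
```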

The following items are acknowledged but not prioritized. Listed here
so they don't get lost; pick up opportunistically.

### Infra / performance

- **HTTP connection pooling.** Parallel server sync in `loadAllPrices`
  spawns up to 8 threads, each with its own HTTP connection. Could
  reuse connections to reduce TCP handshake overhead. Only matters
  with very large portfolios (100+ symbols) hitting `ZFIN_SERVER`.
- **Streaming cache deserialization.** Cache store reads entire files
  into memory (`readFileAlloc` with 50MB limit). For portfolios with
  10+ years of daily candles, this could use significant memory.
  Keep current approach unless memory becomes a real problem.