IO-as-an-interface refactor across the codebase.

The big shifts:

- std.io → std.Io, std.fs → std.Io.Dir/File, std.process.Child → spawn/run.
- Juicy Main: pub fn main(init: std.process.Init) gives gpa, io, arena, environ_map up front. main.zig + the build/ scripts use it directly.
- Threading io through everywhere that touches the outside world (HTTP, files, stderr, sleep, terminal detection). Functions taking `io` now announce side effects at the call site — the smell is the feature.
- date math takes `as_of: Date`, not `today: Date`. Caller resolves `--as-of` flag vs wall-clock at the boundary; the function operates on whatever date it's given. Every "today" parameter renamed and the as_of: ?Date + today: Date pattern collapsed.
- now_s: i64 (or before_s/after_s pairs) for sub-second metadata fields like snapshot captured_at, audit cadence, formatAge/fmtTimeAgo. Also pure and testable.
- legitimate Timestamp.now callers (cache TTL math, FetchResult timestamps, rate limiter, per-frame TUI "now" captures) gain `// wall-clock required: ...` comments justifying the read.

Test discovery: replaced the local refAllDeclsRecursive with bare std.testing.refAllDecls(@This()). Sema-pulling main.zig's top-level decls reaches every test file transitively through the import graph; no explicit _ = @import(...) lines needed.

Cleanup along the way:

- Dropped DataService.allocator()/io() accessor methods; renamed the fields to drop the base_ prefix. Callers use self.allocator and self.io directly.
- Dropped now-vestigial io parameters from buildSnapshot, analyzePortfolio, compareSchwabSummary, compareAccounts, buildPortfolioData, divs.display, quote.display, parsePortfolioOpts, aggregateLiveStocks, renderEarningsLines, capitalGainsIndicator, aggregateDripLots, printLotRow, portfolio.display, printSnapNote.
- Dropped the unused contributions.computeAttribution date-form wrapper (only computeAttributionSpec is called).
- formatAge/fmtTimeAgo take (before_s, after_s) instead of io and reading the clock internally.
- parseProjectionsConfig uses an internal stack-buffer FixedBufferAllocator instead of an allocator parameter.
- ThreadSafeAllocator wrappers in cache concurrency tests dropped (0.16's DebugAllocator is thread-safe by default).
- analyzePortfolio bug surfaced by the rename: snapshot.zig was passing wall-clock today instead of as_of, mis-valuing cash/CDs for historical backfills.

83 new unit tests added due to removal of IO, bringing coverage from 58% -> 64%
292 lines
14 KiB
Markdown
# Future Work

No work here is blocking — we're in a good state. Items below are
ordered roughly by priority within each section. Priority labels
(`HIGH` / `MEDIUM` / `LOW`) mark items that deserve explicit
ranking; unlabeled items are "someday, if the mood strikes."

**Next up:** configurable benchmark symbols (low-effort win) and the
manual-check accounts mechanism (medium effort, real user value).

## Projections: future enhancements

- **Configurable benchmark symbols — priority MEDIUM.** Currently
  hardcoded SPY + AGG. Route through `projections.srf` as a
  `type::config,benchmark::SYMBOL` record (or similar). Low effort.
- **Configurable return cap per position — priority MEDIUM.**
  Default: none; cap outliers like NVDA. Should route through
  `projections.srf` cleanly.
- **"Not Retired Yet" accumulation-phase mode — priority HIGH.**
  Accumulation phase with contributions before retirement.
  Separate from life events — the simulation has two distinct phases:
  accumulation (base spending = 0, contributions applied) and
  distribution (searched-for withdrawal + spending model). Config:
  `retirement_age::60` or `retirement_in::10`, plus
  `annual_contribution::100000`. Safe withdrawal search only applies
  to the distribution phase. Chart shows portfolio growing during
  accumulation, peaking at retirement, then drawing down.
- Configurable MIN period selection (currently 3Y/5Y/10Y, exclude 1Y)
- Multiple spending models: flat (current), decreasing (1-2% real annual decrease,
  Blanchett "spending smile"). Late-life healthcare better modeled as a life event.
- Unclassified position handling in allocation split (warn user)
- Historical projection comparison: re-run projections from any past snapshot date,
  overlay actual portfolio trajectory from subsequent snapshots onto the projected
  percentile bands. Shows how reality tracked against the model. Data is already
  available in history/*.srf snapshots — just need to load a historical portfolio
  value and re-run `computePercentileBands` with that starting point, then plot
  actual values from later snapshots as a line overlaid on the bands.

  `zfin projections --as-of <DATE>` already reruns the simulation
  against a past snapshot (the prerequisite for this overlay). What's
  missing is the overlay itself — loading multiple downstream snapshots
  and plotting their net-worth trajectory on the same chart.

  **Deferred to ~2027.** Needs a practical volume of real snapshots
  (currently building up; meaningful backtest requires 12+ months).
  Backfilling from git history is not viable — the lot-level state on
  portfolio.srf at a past commit is insufficient to reconstruct the
  full transaction+contribution picture. Revisit once there are 12+
  months of continuous snapshot data.

  Also consider: `metadata.srf` and `projections.srf` classifications /
  assumptions drift over time. For back-dated runs we currently use
  the current versions of both; historical git-tracked versions could
  be checked out and loaded instead. Edge case for now.
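
As a rough illustration of the two-phase accumulation/distribution idea in the "Not Retired Yet" item above, here is a minimal deterministic sketch (Python, for illustration only; zfin itself is Zig). The function name, the fixed `annual_return`, and the fixed `annual_withdrawal` are assumptions: the real simulation produces percentile bands and searches for the withdrawal, neither of which is modeled here. `retirement_in` and `annual_contribution` mirror the proposed config keys.

```python
def simulate_two_phase(start_balance, retirement_in, annual_contribution,
                       annual_withdrawal, annual_return, horizon_years):
    """Return year-end balances for a fixed-return, two-phase run.

    Hypothetical helper: one deterministic path, not a Monte Carlo run.
    """
    balances = []
    balance = float(start_balance)
    for year in range(horizon_years):
        if year < retirement_in:
            # Accumulation: base spending is 0, contributions are applied.
            balance = balance * (1 + annual_return) + annual_contribution
        else:
            # Distribution: withdraw first, then grow what is left.
            balance = max(0.0, (balance - annual_withdrawal) * (1 + annual_return))
        balances.append(balance)
    return balances


if __name__ == "__main__":
    path = simulate_two_phase(1_000_000, 10, 100_000, 200_000, 0.0, 15)
    for year, value in enumerate(path, start=1):
        print(year, round(value))
```

With a zero return, this example grows to 2,000,000 at the end of year 10 and then draws down by 200,000 per year, which is the grow, peak, draw-down shape the chart is expected to show.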

## Audit: manual-check accounts mechanism (NYL, Kelly's ESPP, etc.) — priority HIGH

Some accounts/positions can't be reconciled from broker CSVs and need a
human-in-the-loop reminder at the audit step. Examples:

- **NY Life** — no CSV export at all. Values only live in periodic
  statements.
- **Kelly's ESPP** — accrued payroll-deduction cash doesn't appear in the
  Fidelity positions CSV until the purchase date hits (typically every
  6 months). Between purchases the cash is a real contribution that
  `zfin audit` can't see.
- Future: treat as an open category.

The existing `update_cadence::weekly|monthly|quarterly|none` field already
sort-of covers this, but has two gaps:

1. It fires off the last *git-detected change*, not the last *human
   review*. For NYL, the value sometimes hasn't changed in months — so
   git never fires, cadence never trips.
2. ESPP needs weekly-ish attention while accumulating cash between
   purchases, but the accrued balance is invisible to the CSV audit.

### Options

A. **New `update_cadence::manual` variant** — always fires every audit
   run until silenced. Blunt but zero design work.

B. **`last_refreshed::YYYY-MM-DD` field on `accounts.srf`** — explicit
   human-review timestamp, decoupled from git-detected changes. Audit
   compares `today - last_refreshed` against the cadence. User bumps
   the field when they check the statement. Probably the most
   correct fit for NYL.

C. **Sticky TODO list** — a `todos.srf` or `todo::` field on accounts
   that audit always surfaces until cleared. General-purpose; also
   covers "remember to rebalance on 5/15".
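
Option B's comparison could look something like the sketch below (Python, for illustration only). The day counts per cadence and the `needs_review` name are assumptions; only the cadence names come from the existing `update_cadence` field. The date is passed in as `as_of` so the check stays pure and testable.

```python
from datetime import date

# Assumed thresholds in days; the actual cadence windows are not specified here.
CADENCE_DAYS = {"weekly": 7, "monthly": 30, "quarterly": 90}


def needs_review(last_refreshed: date, cadence: str, as_of: date) -> bool:
    """True when the last human review is older than the cadence allows."""
    if cadence == "none":
        return False  # account opted out of cadence reminders
    return (as_of - last_refreshed).days > CADENCE_DAYS[cadence]


if __name__ == "__main__":
    # 45 days since the statement was last checked, monthly cadence.
    print(needs_review(date(2025, 1, 1), "monthly", date(2025, 2, 15)))
```

Because the check keys off `last_refreshed` rather than a git-detected change, an unchanged NYL value would still trip the reminder once the window passes.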

### ESPP-specific follow-through

ESPP is also a contribution-attribution blind spot. If Kelly's paystub
deducts $X/week but the cash lot doesn't reach `portfolio.srf` until
the purchase date, the attribution math is under-counting contributions
and over-counting the purchase-week gain. Possible fixes are discussed
in the "Contributions diff" TODO below — option C there (per-account
`cash_is_contribution`) would make manually-entered ESPP
cash additions count correctly.

## In-kind transfer support (`type::in_kind`) — priority MEDIUM

`transaction_log.srf` parses `type::in_kind` records but the
contributions matcher always rejects them with "in-kind transfers
not yet supported in v1." In-kind movements need per-symbol
matching across accounts: an in-kind transfer of 100 VTI shares
from Acct A to Acct B shows up as `lot_removed` on A + `new_stock`
on B (or a `rollup_delta` share increase if B already had a VTI
lot), neither of which can be matched by the current
amount-based cash matcher.

Proposed: a second pass in `matchTransfers` that iterates
`type::in_kind` records and looks for same-symbol matches across
`lot_removed` on `from` + `new_stock`/`rollup_delta` on `to`
within the window. Gated on share-count and open_price sanity so
a partial transfer doesn't false-positive against an unrelated
edit.
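
A minimal sketch of that second pass, under stated assumptions (Python, for illustration only): the `Change` and `InKind` shapes, the tolerances, and the `match_in_kind` name are hypothetical, and `changes` is assumed to be already restricted to the matching window. Only the change kinds (`lot_removed`, `new_stock`, `rollup_delta`) come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    kind: str                    # "lot_removed", "new_stock", or "rollup_delta"
    account: str
    symbol: str
    shares: float
    open_price: Optional[float] = None


@dataclass(frozen=True)
class InKind:
    symbol: str
    shares: float
    from_account: str
    to_account: str


def _find(changes, used, kinds, account, symbol, shares, tol):
    """Index of the first unused change matching kind, account, symbol, shares."""
    for i, c in enumerate(changes):
        if (i not in used and c.kind in kinds and c.account == account
                and c.symbol == symbol and abs(c.shares - shares) <= tol):
            return i
    return None


def match_in_kind(records, changes, share_tol=1e-6, price_tol=0.01):
    """Pair each in-kind record with one removal and one addition."""
    used, matches = set(), []
    for rec in records:
        out_i = _find(changes, used, {"lot_removed"},
                      rec.from_account, rec.symbol, rec.shares, share_tol)
        in_i = _find(changes, used, {"new_stock", "rollup_delta"},
                     rec.to_account, rec.symbol, rec.shares, share_tol)
        if out_i is None or in_i is None:
            continue
        a, b = changes[out_i].open_price, changes[in_i].open_price
        # open_price sanity gate; skipped when either side carries no price.
        if a is not None and b is not None and abs(a - b) > price_tol:
            continue
        used.update((out_i, in_i))
        matches.append((rec, out_i, in_i))
    return matches
```

The share-count gate is what keeps a 40-share record from matching a 100-share removal, and the price gate rejects a same-size purchase at a different basis.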

Driver: when the user starts moving positions between accounts
directly (e.g. Roth conversion of already-held shares, 401k →
rollover IRA in-kind) rather than liquidating and re-buying.

## Torn SRF files from server sync (root cause unknown)

**Status:** Root cause still unidentified. We have mitigations and
diagnostics in place that keep torn responses from corrupting the
cache, but we don't yet know *why* responses arrive torn. Until we
have a root cause, this is not resolved — it's mitigated.

Mitigations landed so far:

- `syncFromServer` (`src/service.zig`) validates responses via
  `cache.Store.looksCompleteSrf` before `writeRaw`. Torn HTTP bodies
  (empty, missing `#!srfv1` header, or no trailing newline) are
  rejected with a warn-level log and NOT written to cache.
- HTTP responses are checked for an `ETag` sha256 header; on mismatch
  we retry the request once before giving up and falling back to the
  provider.
- Read-path self-heal: on SRF parse failure during read, the cache
  entry is invalidated so a subsequent refresh can repair without
  user intervention.
- Diagnostics: richer error capture around the sync path. So far,
  HTTP transit is the dominant source of torn responses — but that's
  an observation, not a root cause.
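
For readers unfamiliar with the first two mitigations, the checks can be sketched roughly as follows (Python, for illustration only; this is not the code in `src/service.zig`). Assumptions: the `#!srfv1` header is the first bytes of the body, the `ETag` value is a hex sha256 digest that may be quoted, and the sketch retries once when either check fails, which is a simplification of the behavior described above. All function names here are hypothetical.

```python
import hashlib

SRF_HEADER = b"#!srfv1"


def looks_complete_srf(body: bytes) -> bool:
    """Reject empty bodies, a missing header, or a missing trailing newline."""
    return bool(body) and body.startswith(SRF_HEADER) and body.endswith(b"\n")


def etag_matches(body: bytes, etag: str) -> bool:
    """Compare the body's sha256 against an (assumed hex, maybe quoted) ETag."""
    expected = etag.strip().removeprefix("W/").strip('"').lower()
    return hashlib.sha256(body).hexdigest() == expected


def fetch_validated(fetch, retries: int = 1):
    """Call fetch() up to 1 + retries times; return a validated body or None.

    `fetch` is a hypothetical callable returning (body, etag).
    """
    for _ in range(1 + retries):
        body, etag = fetch()
        if looks_complete_srf(body) and etag_matches(body, etag):
            return body
    return None  # caller skips the cache write and falls back to the provider
```

A `None` result corresponds to the "reject and do not write to cache" path; nothing in the sketch explains why a body arrives torn in the first place.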

**Remaining work:**

- Identify root cause. Candidates to investigate: proxy/load-balancer
  behavior, HTTP keepalive reuse, partial reads on the server side,
  client-side buffer handling. The etag retry tells us whether the
  problem is per-request or persistent; dig into the diagnostics
  output when the next occurrence is captured.
- Once root cause is known, decide whether the current mitigations
  are sufficient or whether a targeted fix is needed. The
  mitigations may end up being the whole answer, but we can't
  conclude that without understanding the underlying cause.

(Content-Length validation was considered and rejected: once the
server starts compressing response bodies, Content-Length reflects
the compressed byte count, not the decoded payload, so it's not a
reliable integrity check.)

## Market-aware cache TTL for daily candles

Daily candle TTL is currently 23h45m, but candle data only becomes meaningful
after the market close. Investigate keying the cache freshness to ~4:30 PM
Eastern rather than a rolling window. This would avoid unnecessary refetches
during the trading day and ensure a fetch shortly after close gets fresh data.
Probably alleviated by the cron job approach.
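
One way to express "fresh if fetched after the most recent close cutoff" is sketched below (Python, for illustration only). Assumptions: both timestamps are naive datetimes already converted to US Eastern time by the caller, market holidays are ignored, and the function names are hypothetical.

```python
from datetime import datetime, time, timedelta

CUTOFF = time(16, 30)  # ~4:30 PM Eastern, per the note above


def last_cutoff(now_et: datetime) -> datetime:
    """Most recent weekday cutoff at or before now_et (holidays ignored)."""
    candidate = datetime.combine(now_et.date(), CUTOFF)
    if candidate > now_et:
        candidate -= timedelta(days=1)  # today's cutoff has not happened yet
    while candidate.weekday() >= 5:     # Saturday = 5, Sunday = 6
        candidate -= timedelta(days=1)
    return candidate


def is_fresh(fetched_at_et: datetime, now_et: datetime) -> bool:
    """A cached candle file is fresh if it was fetched after the last cutoff."""
    return fetched_at_et >= last_cutoff(now_et)
```

Under this rule a file fetched Tuesday evening stays fresh all of Wednesday's trading day and goes stale at Wednesday's cutoff, instead of expiring on a rolling 23h45m timer.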

## On-demand server-side fetch for new symbols

Currently the server's SRF endpoints (`/candles`, `/dividends`, etc.) are pure
cache reads — they 404 if the data isn't already on disk. New symbols only get
populated when added to the portfolio and picked up by the next cron refresh.

Consider: on a cache miss, instead of blocking the HTTP response with a
multi-second provider fetch, kick off an async background fetch (or just
auto-add the symbol to the portfolio) and return 404 as usual. The next
request — or the next cron run — would then have the data. This gives
"instant-ish gratification" for new symbols without the downsides of
synchronous fetch-on-miss (latency, rate limit contention, unbounded cache
growth from arbitrary tickers).

Note that this process doesn't do anything to eliminate all the API keys
that are necessary for a fully functioning system. A more aggressive view
would be to treat ZFIN_SERVER as a 100% source of record, but that would
introduce some opacity to the process as we wait for candles (for example) to
populate. This could be solved on the server by spawning a thread to fetch the
data, then returning 202 Accepted, which could then be polled client side. Maybe
this is a better long term approach?
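
The 202-and-poll variant could be shaped roughly like this (Python, for illustration only). The class, its methods, and the in-memory dict standing in for the on-disk cache are all hypothetical; the sketch does not address rate limiting or cache growth from arbitrary tickers.

```python
import threading


class SymbolEndpoint:
    """Cache-read endpoint that schedules a background fetch on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical provider call: symbol -> body
        self._cache = {}         # symbol -> body (stands in for the disk cache)
        self._pending = {}       # symbol -> in-flight worker thread
        self._lock = threading.Lock()

    def get(self, symbol):
        """Return (status, body): 200 on a hit, 202 while a fetch is pending."""
        with self._lock:
            if symbol in self._cache:
                return 200, self._cache[symbol]
            if symbol not in self._pending:
                worker = threading.Thread(
                    target=self._fill, args=(symbol,), daemon=True)
                self._pending[symbol] = worker
                worker.start()
        return 202, None

    def _fill(self, symbol):
        body = None
        try:
            body = self._fetch(symbol)
        finally:
            with self._lock:
                if body is not None:
                    self._cache[symbol] = body
                self._pending.pop(symbol, None)  # allow a later retry

    def wait_idle(self):
        """Block until in-flight fetches finish (mainly useful in tests)."""
        with self._lock:
            workers = list(self._pending.values())
        for worker in workers:
            worker.join()
```

A client would poll `get` until the status flips from 202 to 200. Repeated requests for the same missing symbol share one worker, so a burst of polls does not multiply provider calls.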

## Low-priority items

The following items are acknowledged but not prioritized. Listed here
so they don't get lost; pick up opportunistically.

### UX

- **CLI options command UX.** The `options` command auto-expands only
  the nearest monthly expiration and lists others collapsed. Reconsider
  the interaction model — e.g. allow specifying an expiration date,
  showing all monthlies expanded by default, or filtering by strategy
  (covered calls, spreads).
- **TUI: toggle to last symbol keybind.** A single-key toggle that
  flips between the current symbol and the previously selected one
  (like `cd -` in bash or `Ctrl+^` in vim). Store `last_symbol` on
  `App`; on symbol change, stash the previous. Useful for
  eyeball-comparing performance/risk data between two symbols.
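
The last-symbol toggle amounts to a two-slot swap. A minimal sketch (Python, for illustration only; the method names are hypothetical, while `App` and `last_symbol` follow the item above):

```python
class App:
    def __init__(self, symbol):
        self.symbol = symbol
        self.last_symbol = None  # nothing to toggle back to yet

    def select(self, symbol):
        """Change symbol, stashing the previous one."""
        if symbol != self.symbol:
            self.last_symbol = self.symbol
            self.symbol = symbol

    def toggle_last(self):
        """Flip between the current and previously selected symbol."""
        if self.last_symbol is not None:
            self.select(self.last_symbol)
```

Routing the toggle through `select` means pressing the key twice returns to where you started, the same way `cd -` does.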

### Data quality

- **Fix `enrich` command for international funds.** `deriveMetadata`
  in `src/commands/enrich.zig` misclassifies international ETFs:
  1. `geo` uses Alpha Vantage's `Country` field, which is the *fund
     issuer's* domicile (USA for all US-listed ETFs), not the fund's
     investment geography. Every US-domiciled international fund gets
     `geo::US`.
  2. `asset_class` short-circuits to `"ETF"` when
     `asset_type == "ETF"`, or falls through to a US-market-cap
     heuristic that always produces `"US Large Cap"` /
     `"US Mid Cap"` / `"US Small Cap"`.

  Known misclassified tickers (all came back as
  `geo::US, asset_class::US Large Cap`):
  - **FRDM** — Freedom 100 Emerging Markets ETF → should be
    `geo::Emerging Markets, asset_class::Emerging Markets`
  - **HFXI** — NYLI FTSE International Equity Currency Neutral ETF
    → should be
    `geo::International Developed, asset_class::International Developed`
  - **IDMO** — Invesco S&P International Developed Momentum ETF →
    should be
    `geo::International Developed, asset_class::International Developed`
  - **IVLU** — iShares MSCI International Developed Value Factor
    ETF → should be
    `geo::International Developed, asset_class::International Developed`

  The Alpha Vantage OVERVIEW endpoint doesn't provide fund geography
  data. Options: use the ETF_PROFILE holdings/country data to infer
  geography, parse the fund name for keywords ("International",
  "Emerging", "ex-US"), or accept that `enrich` is a scaffold and
  emit a `# TODO` comment for ETFs instead of silently misclassifying.
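
The fund-name keyword option is the cheapest of the three to prototype. A minimal sketch (Python, for illustration only; `infer_geo` is a hypothetical name and the keyword-to-label mapping is an assumption based on the four tickers listed above):

```python
def infer_geo(fund_name: str):
    """Guess investment geography from a fund name, or None if unsure."""
    name = fund_name.lower()
    # "Emerging" is checked first so it wins over a generic "International".
    if "emerging" in name:
        return "Emerging Markets"
    if "international" in name or "ex-us" in name or "ex us" in name:
        return "International Developed"
    return None  # caller could emit a `# TODO` instead of guessing


if __name__ == "__main__":
    for fund in ("Freedom 100 Emerging Markets ETF",
                 "NYLI FTSE International Equity Currency Neutral ETF"):
        print(fund, "->", infer_geo(fund))
```

Returning `None` rather than a default keeps the "emit a `# TODO`" fallback available for names the keywords do not cover.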

### Options / valuation

- **Per-account covered call adjustment.** `adjustForCoveredCalls` in
  `valuation.zig` operates on portfolio-wide aggregated allocations.
  It matches sold calls against total underlying shares across all
  accounts. This is wrong — calls in one account can only cover
  shares in that same account. Fixing means restructuring
  `portfolioSummary`, since `Allocation` is currently
  account-agnostic. Low priority — naked calls are rare, and calls
  are typically in the same account as the underlying.
- **Covered call adjustment O(N*M) loop.** `adjustForCoveredCalls`
  has a nested loop — for each allocation, it iterates all lots to
  find matching option contracts. Fine for personal portfolios
  (<1000 lots). Pre-indexing options by underlying would help if
  someone had a very large options-heavy portfolio.
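
Pre-indexing by underlying replaces the nested loop with one pass over lots plus a constant-time lookup per allocation. A minimal sketch (Python, for illustration only): the dict-based lot shape, the sign convention for sold contracts, and both function names are hypothetical, and what the adjustment does with the covered share count is out of scope here.

```python
from collections import defaultdict

CONTRACT_MULTIPLIER = 100  # standard US equity option contract size


def sold_call_shares(lots):
    """One O(M) pass: shares covered by sold calls, keyed by underlying."""
    totals = defaultdict(int)
    for lot in lots:
        # Hypothetical lot shape: dicts with "kind", "underlying", "contracts",
        # where a negative contract count means the call was sold.
        if lot.get("kind") == "call" and lot.get("contracts", 0) < 0:
            totals[lot["underlying"]] += -lot["contracts"] * CONTRACT_MULTIPLIER
    return totals


def covered_shares(allocations, lots):
    """Map each allocation symbol to the share count covered by sold calls."""
    index = sold_call_shares(lots)  # built once, reused for every allocation
    return {
        symbol: min(shares, index.get(symbol, 0))
        for symbol, shares in allocations.items()
    }
```

Total work drops from O(N*M) to O(N + M), which only matters at the portfolio sizes the item above already calls unlikely.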

### Analysis / correctness

- **Analysis account/asset-class total mismatch.** The "By Account"
  and "By Tax Type" sections in the analysis command sum to slightly
  more than "Asset Class" (~0.6% error). Likely a discrepancy between
  how the lot-level account loop values cash, CDs, or options vs how
  the asset-class section computes them via `portfolio.totalCash()` /
  `totalCdFaceValue()`. Per-account values themselves are correct
  after the price_ratio fix.

### Audit

- **Audit large-lot threshold tuning.** `src/commands/audit.zig` uses
  `audit_large_lot_threshold: f64 = 10_000.0` as the cutoff for
  "surface this new lot for confirmation." Revisit if $10k proves too
  aggressive (ESPP accruals spam the report) or too permissive (large
  DRIP confirmations slip past). If runtime tuning becomes necessary,
  a `--large-lot <amount>` flag or a global
  `audit_large_lot_threshold` field on `accounts.srf` would be
  reasonable extensions.

### Infra / performance

- **HTTP connection pooling.** Parallel server sync in `loadAllPrices`
  spawns up to 8 threads, each with its own HTTP connection. Could
  reuse connections to reduce TCP handshake overhead. Only matters
  with very large portfolios (100+ symbols) hitting ZFIN_SERVER.
- **Streaming cache deserialization.** Cache store reads entire files
  into memory (`readFileAlloc` with 50MB limit). For portfolios with
  10+ years of daily candles, this could use significant memory.
  Keep current approach unless memory becomes a real problem.