Autotrader Field Notes
Vol. 2 - 2026-05-08 - VM Experiment Edition

An autonomous trading agent, two weeks in.

A paper-trading experiment on Indian equities, where Claude itself runs the loop on a free GCP VM and edits its own strategy between polls. Plus the parts that actually broke.

Executive Summary

+8.05% over 8 closing sessions, but the bigger story is what failed in between - and what that says about deploying autonomous AI in production.

The strategy is doing fine. Eight closing prints between Apr 27 and May 8 trace a steady upward grind, with one big leg up on May 4 (+4.16%) when the agent caught and fixed a volume-filter bug mid-session. Of the 171 trades on the books, 4 are audit-trail reversals that don't represent actual decisions. Today (May 8) closed at Rs 108,049 - 8 TARGETs hit, 6 STOPs fired, net realized +Rs 833.

The harder problems were operational. A corporate action (VEDL demerger) booked a phantom Rs 7,750 loss on stale Kite quotes. The agent loop fired three trades on stale post-close data on May 5. The cron-based loop has died twice; on May 7 it was dead through the entire trading session. Most of these have been fixed in code; one (loop liveness) is still a manual-restart problem.

Run to date: +8.05%
Trades logged: 171
Sessions missed: 2
Reverse-corrections: 4

01

What this experiment is.

The setup is a paper-trading simulation of a Nifty 500 momentum strategy, with real Indian transaction costs modeled on every trade (STT, GST, stamp duty, brokerage). No real money. Capital starts at Rs 100,000, and a hard floor at Rs 90,000 forces full liquidation if breached.
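As a rough illustration of the hard-floor rule, here is a minimal sketch. The function names and the `{ticker: quantity}` position layout are assumptions made for the example, not the contents of live.py, and "breached" is read here as "strictly below the floor".

```python
STARTING_CAPITAL = 100_000  # Rs, from the text
HARD_FLOOR = 90_000         # Rs, from the text


def portfolio_value(cash, positions, last_prices):
    """Cash plus mark-to-market value of open positions ({ticker: quantity})."""
    return cash + sum(qty * last_prices[ticker] for ticker, qty in positions.items())


def floor_breached(cash, positions, last_prices, floor=HARD_FLOOR):
    """True when total portfolio value has fallen below the hard floor."""
    return portfolio_value(cash, positions, last_prices) < floor


if __name__ == "__main__":
    # Illustrative numbers only: Rs 50,000 cash plus 100 shares of a made-up ticker.
    print(floor_breached(50_000, {"AAA": 100}, {"AAA": 399.0}))
```

A check like this would presumably run on every poll, before any new entries are considered.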

What makes it unusual: the trading agent is Claude itself, running on a GCP e2-micro VM in a persistent tmux session. Every five minutes during market hours, the agent runs python live.py to fetch prices and execute signals from strategy.py. If patterns in the output suggest a strategy change, the agent edits strategy.py, commits, and pushes. The next poll picks up the new code because each run of python live.py is a fresh process that loads strategy.py anew.

So there are two loops: a fast price loop (5 min) and a slow self-edit loop (driven by accumulated evidence in learnings.md).

02

The VM experiment, in one diagram.

Layer       | What it is                                      | Cost / week
------------|-------------------------------------------------|-----------------
VM          | GCP e2-micro, Ubuntu 24.04, IST timezone        | Rs 0 (free tier)
Persistence | tmux session, survives SSH disconnect           | -
Agent       | Claude Code with /loop 5m, edits strategy.py    | Max plan
Data        | Kite Connect API (Zerodha), real-time quotes    | Rs 125
Auth        | Daily OAuth: ~30s manual ritual at 8 AM IST     | 5 sessions

The daily auth is the only thing that can't be automated - Zerodha's terms of service forbid scripted logins. Everything else runs untouched.

03

Cumulative performance.

Eight closing prints since Apr 27, the start of run 2 (run 1 was a different strategy on backtested data). Each entry is a real session-end value, with two carve-outs: May 1 was a market holiday and the bar represents an unchanged hold; May 5 was reversed entirely after the post-close incident.

Date    | Session | Close      | Day %  | Note
--------|---------|------------|--------|------------------------------------------------------------
Apr 27  | Mon     | Rs 100,968 | +0.97% | Run 2 begins
Apr 28  | Tue     | Rs 102,291 | +1.31% | -
Apr 29  | Wed     | Rs 102,209 | -0.08% | -
Apr 30  | Thu     | Rs 102,128 | -0.08% | VEDL demerger - phantom Rs 7,750 reversed
May 1   | Fri     | Rs 102,075 | -0.05% | Market holiday - no session
May 2-3 | Sat-Sun | -          | -      | Weekend
May 4   | Mon     | Rs 106,326 | +4.16% | Volume-floor fix shipped mid-session
May 5   | Tue     | -          | -      | Loop stalled all day. 3 phantom post-close trades reversed.
May 6   | Wed     | Rs 107,415 | +1.02% | Loop died after 13:39 IST. Session partial.
May 7   | Thu     | -          | -      | Loop dead all day. No polls fired.
May 8   | Fri     | Rs 108,049 | +0.59% | Loop restarted 08:34 IST. 8 TARGETs, 6 STOPs, +Rs 833 realized.

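The Day % column and the run-to-date figure can be re-derived from the closes alone. The sketch below does the arithmetic, taking the Rs 100,000 starting capital as the base for Apr 27. The helper names are illustrative and not taken from the project.

```python
START = 100_000  # Rs, starting capital of run 2

# (date label, closing portfolio value in Rs); days without a close are skipped
CLOSES = [
    ("Apr 27", 100_968), ("Apr 28", 102_291), ("Apr 29", 102_209),
    ("Apr 30", 102_128), ("May 1", 102_075), ("May 4", 106_326),
    ("May 6", 107_415), ("May 8", 108_049),
]


def pct_change(previous, current):
    """Percentage change from previous to current."""
    return (current / previous - 1) * 100


def day_returns(closes, start):
    """Return (label, day % rounded to 2 places), each vs the prior recorded close."""
    rows, previous = [], start
    for label, close in closes:
        rows.append((label, round(pct_change(previous, close), 2)))
        previous = close
    return rows


if __name__ == "__main__":
    for label, pct in day_returns(CLOSES, START):
        print(f"{label}: {pct:+.2f}%")
    print(f"Run to date: {pct_change(START, CLOSES[-1][1]):+.2f}%")
```

Because May 5 and May 7 have no recorded close, the May 6 figure is measured against May 4 and the May 8 figure against May 6, which is how the table reads as well.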
04

Issues, in chronological order.

Each is recorded with what happened, what was fixed, and whether the underlying class of bug is closed.

Apr 30, 13:00 IST
Fixed in code

VEDL demerger - stale-LTP phantom loss.

Vedanta went ex-demerger on Apr 30. Kite's pre-open price discovery session returned a stale LTP of Rs 773.60. The strategy bought 10 shares at that price; the real opening cross was Rs 289.50; the position immediately stopped out for a Rs 7,750 "loss" against a price that never traded.

Fix: a new corporate_actions.py module reads NSE's ex-date feed and blocks any ticker whose corporate-action subject indicates a price gap (demerger, split, bonus, capital reduction). The Apr 30 trades were preserved in the trade log for audit, with a single ADJUSTMENT_VEDL_DEMERGER entry adding back the precise net cash impact.
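A corporate-action block of this kind might look roughly like the sketch below. The record layout (`symbol`, `ex_date`, `subject`) and the function name are assumptions for illustration. The actual NSE feed format and the real contents of corporate_actions.py are not shown in these notes.

```python
from datetime import date

# Subject keywords that imply a price gap on the ex-date (list taken from the text).
PRICE_GAP_KEYWORDS = ("demerger", "split", "bonus", "capital reduction")


def blocked_tickers(actions, today):
    """Return the set of symbols that should not be traded today.

    `actions` is a list of dicts with hypothetical keys:
    'symbol' (str), 'ex_date' (datetime.date), 'subject' (free text).
    """
    blocked = set()
    for action in actions:
        subject = action["subject"].lower()
        has_price_gap = any(keyword in subject for keyword in PRICE_GAP_KEYWORDS)
        if action["ex_date"] == today and has_price_gap:
            blocked.add(action["symbol"])
    return blocked


if __name__ == "__main__":
    # Made-up feed rows for illustration.
    sample = [
        {"symbol": "VEDL", "ex_date": date(2026, 4, 30), "subject": "Demerger"},
        {"symbol": "XYZ", "ex_date": date(2026, 4, 30), "subject": "Interim Dividend"},
    ]
    print(blocked_tickers(sample, date(2026, 4, 30)))
```

A plain dividend is deliberately not blocked here, since the text only lists actions that cause a price gap. Whether the real module blocks only on the ex-date itself or for a wider window is not stated.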

May 4, 10:00 IST
Fixed in code

Volume-floor regression - blind to morning candidates.

Primer - what's a volume ratio?

Volume = number of shares that changed hands. Daily bar = one row of OHLC + volume for one trading day. Volume ratio = today's volume divided by the 20-day average; a liquidity filter that asks "is this stock active enough to trade?" The strategy required ratio >= 0.5 (today must run at least half of normal activity).

The bug: Kite's "today" bar is the partial bar still being built. At 10:00 IST it only contains ~45 min of volume, so the ratio comes out tiny (0.05-0.15) for almost every stock - even ones trading enthusiastically. The filter rejected nearly the whole market as illiquid.

Four consecutive polls between 09:36 and 09:59 produced zero BUY signals despite 60% cash and a clearly receptive market. A funnel diagnostic on the 09:18 cache showed 407 tickers passing every filter except this one, and only 1 clearing the 0.5x volume gate.

Fix: lowered the floor from 0.5x to 0.15x. Translation: "if a stock has at least 15% of its normal volume so far this morning, it's tradeable." Workaround rather than deep fix - the cleaner answer is to use yesterday's completed bar - but it's one line and ships immediately.
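To make the partial-bar problem concrete, the sketch below computes the ratio both ways: against today's still-forming bar, and against yesterday's completed bar (the cleaner alternative mentioned above, not the fix that shipped). Function names and the sample volumes are made up for illustration.

```python
def volume_ratio(bar_volume, reference_volumes):
    """Bar volume divided by the mean of the reference window."""
    return bar_volume / (sum(reference_volumes) / len(reference_volumes))


def ratio_partial_bar(daily_volumes, window=20):
    """Ratio using today's partial bar (last element) vs the prior `window` days."""
    if len(daily_volumes) < window + 1:
        raise ValueError("not enough history")
    return volume_ratio(daily_volumes[-1], daily_volumes[-1 - window:-1])


def ratio_completed_bar(daily_volumes, window=20):
    """Ratio using yesterday's completed bar vs the `window` days before it."""
    if len(daily_volumes) < window + 2:
        raise ValueError("not enough history")
    return volume_ratio(daily_volumes[-2], daily_volumes[-2 - window:-2])


if __name__ == "__main__":
    # 21 ordinary days of 1,000,000 shares, then a partial bar early in the session.
    history = [1_000_000] * 21 + [100_000]
    print(ratio_partial_bar(history))    # small, although the stock is trading normally
    print(ratio_completed_bar(history))  # reflects a full day's activity
```

With these made-up numbers the partial-bar ratio falls well under the original 0.5 floor while the completed-bar ratio sits at normal levels, which is the shape of the bug described above.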

Same-day result: five afternoon TARGETs landed in the post-fix window (NETWEB +Rs 240, IKS +Rs 283, ZENTEC +Rs 321, plus SAILIFE and USHAMART earlier). The session closed up +4.16% on the day, the run's best single day, taking the run to +6.33%. Without the fix the afternoon would have been mostly idle drift.

May 5, 16:55 IST
Reverted

Post-close phantom trades - loop fired on stale Kite quotes.

The cron driving /loop aged out somewhere between May 4 evening and May 5 morning. The agent loop ran zero polls during market hours. When it resumed at 16:55 IST after the session closed, live.py correctly logged "Market: closed" but still passed the (now stale) quotes through to strategy.py, which generated three signals on those prices: an RKFORGE TARGET sell and two BUYs.

All three trades were reverse-corrected by a one-shot script. Cash restored to the May 4 closing value, positions restored, May 5's daily_values entry dropped, three REVERSAL_MAY5 audit entries appended.
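The reverse-correction pattern described here (keep the original entries, append offsetting ones) can be sketched as follows. The entry fields, the helper names, and the amounts are hypothetical. The real trade log format is not shown in these notes.

```python
def replay_cash(starting_cash, log):
    """Recompute cash by replaying every entry's cash impact in order."""
    cash = starting_cash
    for entry in log:
        cash += entry["cash_delta"]
    return cash


def append_reversal(log, index, tag):
    """Append an audit entry that cancels log[index] without deleting or editing it."""
    original = log[index]
    log.append({
        "type": tag,
        "ticker": original["ticker"],
        "cash_delta": -original["cash_delta"],
        "reverses": index,  # pointer back to the entry being corrected
    })


if __name__ == "__main__":
    # Made-up phantom trades: one sell (cash in) and one buy (cash out).
    log = [
        {"type": "SELL", "ticker": "AAA", "cash_delta": 1000.0},
        {"type": "BUY", "ticker": "BBB", "cash_delta": -500.0},
    ]
    for i in range(2):
        append_reversal(log, i, "REVERSAL")
    print(replay_cash(10_000.0, log))
```

Replaying the full log after the reversals returns cash to its pre-incident value, while the phantom trades stay visible for audit.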

May 6, morning
Fixed in code

Post-close defense, shipped to live.py.

Same class of bug as May 1 (holiday) and May 5 (post-close): live.py running the strategy regardless of market state. Now live.py early-returns with no fetch, no strategy call, and no state save when the market isn't open. State is intentionally left untouched so the day's true close survives any number of post-close polls.

Commit f85e79e: "live: skip fetch + strategy + state save when market is not open."
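The guard in that commit is not reproduced in these notes. A minimal sketch of the same idea, assuming regular NSE hours of 09:15 to 15:30 IST and a hand-maintained holiday set, might look like this. `run_poll` and its callback parameters are hypothetical stand-ins for whatever live.py actually does.

```python
from datetime import date, datetime, time, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))  # India has no DST
MARKET_OPEN, MARKET_CLOSE = time(9, 15), time(15, 30)
HOLIDAYS = {date(2026, 5, 1)}  # assumption: maintained by hand


def is_market_open(now):
    """True only on a non-holiday weekday within regular trading hours (IST)."""
    now = now.astimezone(IST)
    if now.weekday() >= 5 or now.date() in HOLIDAYS:
        return False
    return MARKET_OPEN <= now.time() < MARKET_CLOSE


def run_poll(now, fetch_quotes, run_strategy, save_state):
    """Skip fetch, strategy, and state save entirely when the market is not open."""
    if not is_market_open(now):
        return "skipped"
    quotes = fetch_quotes()
    signals = run_strategy(quotes)
    save_state(signals)
    return "ran"


if __name__ == "__main__":
    late = datetime(2026, 5, 5, 16, 55, tzinfo=IST)  # the May 5 post-close poll
    print(run_poll(late, lambda: {}, lambda q: [], lambda s: None))
```

Returning before the fetch, rather than after it, is what keeps stale quotes from ever reaching the strategy, and skipping the state save is what preserves the day's true close.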

May 6-7, 13:39 onwards
Open

Loop liveness - tmux + Claude session died silently.

The last poll on May 6 was at 13:39 IST. The loop produced no polls for the rest of the May 6 session, all of May 7, and the start of May 8. On inspection there was no tmux server running on the VM and no claude processes alive. The root cause has not been identified: candidates are a Claude Code crash, an OOM kill (the e2-micro has 1 GB RAM + 2 GB swap), or the cron simply expiring again.

The current detection method is "user notices the lack of new log entries." That is the open work item. A simple watchdog (a cron entry that checks the log file's mtime and emails on staleness) would close it.
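The staleness check at the heart of such a watchdog is small. The sketch below is one possible shape, assuming a 15-minute threshold (three missed 5-minute polls) and a placeholder log path. Alert delivery (email or otherwise) is left out, and none of these names come from the project.

```python
import os
import time

MAX_AGE_SECONDS = 15 * 60  # assumption: three missed 5-minute polls
LOG_PATH = "live.log"      # placeholder path, not the project's real log file


def seconds_since_modified(path, now=None):
    """Age of the file's last modification, in seconds."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def log_is_stale(path, max_age=MAX_AGE_SECONDS, now=None):
    """True if the log has not been written recently, or does not exist at all."""
    try:
        return seconds_since_modified(path, now) > max_age
    except FileNotFoundError:
        return True


if __name__ == "__main__":
    if log_is_stale(LOG_PATH):
        # A real watchdog would send an email or push notification here.
        print(f"ALERT: {LOG_PATH} has not been updated in over {MAX_AGE_SECONDS // 60} minutes")
```

Run from the system crontab rather than from inside the tmux session, a check like this keeps working when the agent itself is dead. It would also need to be limited to market hours, otherwise it would fire every evening.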

May 8, 08:34 IST
Recovered

Manual restart, fresh token, full session traded.

tmux session recreated, Claude Code launched with --dangerously-skip-permissions, /loop reissued. First poll at 09:17 cleared three overnight positions (INTELLECT TARGET +445, LTF TARGET +347, LLOYDSME STOP -355). Loop ran cleanly through the full session - 14 trades total, 8 TARGETs (+Rs 2,622), 6 STOPs (-Rs 1,789), net realized +Rs 833. Closed at Rs 108,049.

05

What the experiment has actually taught.

Stale data is the dominant risk, not bad strategy.

Three of the four reverse-corrections were stale-data incidents (VEDL demerger LTP, May 5 post-close, and conceptually the May 1 holiday hold). The actual strategy logic - momentum + breakout, -1.5% stop, +2% target - has held up across two weeks without intervention. The losses that mattered were caused by the system trusting a price that wasn't a real market price.

The agent loop is more brittle than the trading loop.

Every failure in the post-mortem is a meta-system failure: cron expiry, tmux death, stale tokens, post-close polling. The strategy-layer fixes are simple Python edits that take five minutes. The infrastructure-layer fixes are harder because they require touching pieces (cron, tmux session lifecycle, Claude Code CLI) whose failures the agent doesn't naturally see.

Self-editing has worked, with restraint.

The agent has shipped 12 strategy edits across the run. The most consequential one (volume-floor) was triggered by accumulated evidence across four polls, not by a single losing trade - which is exactly the discipline written into the loop prompt. No commit has been reverted on strategy grounds; the only reverts have been data-correction reversals.

Free infrastructure is enough.

e2-micro at Rs 0/month carries the entire load. The only paid component is Kite Connect at Rs 500/month for real-time quotes. For an experiment whose value is in the loop dynamics rather than the strategy alpha, that cost profile is about right.

06

Why this experiment matters beyond P&L.

The strategy returns are the least interesting thing about this run. The transferable lessons are about how autonomous AI systems behave in production-shaped environments - what fails, what holds, and what the real success metric actually is.

Six takeaways that generalize.

01

Stale data is more dangerous than bad logic.

The VEDL demerger and post-close trades were not strategy mistakes. They were the system trusting prices that were not valid tradable prices. Almost every production AI failure follows the same pattern: bad inputs, stale state, missing context - not the model being dumb.

02

Agent reliability is mostly boring infra.

The only open problem is loop liveness - cron, tmux, and Claude dying silently. In real systems "is the agent still alive and doing what we think?" is a first-class product and infrastructure question, not an afterthought.

03

Guardrails beat intelligence.

The best fixes were dumb and deterministic: do not trade when the market is closed, block corporate-action tickers, refuse strategy execution if market state is invalid. Deterministic rails around the agent matter more than clever prompting.

04

Self-editing only works inside narrow permissions.

The agent can modify strategy.py. Nothing else. That single constraint is probably why this experiment is interesting rather than chaotic. For real-world agentic systems, the permission boundary is the product.

05

Logs and audit trails are not optional.

The four reverse-corrections were possible because every trade was preserved and corrections appended as new audit entries. In any real-money setup: append-only logs, state snapshots, replayability, and explainable actions are non-negotiable.

06

P&L is a weak success metric here.

The valuable metric is whether the system discovered bugs, improved its own behavior, avoided repeating mistakes, and operated autonomously for longer stretches. That is the actual agentic-system question - not what return number landed.

How unreal is the +8.05%?

Pretty unreal. Not useless - but very far from "this would make 8% in real money."

The honest framing: the simulated loop has found profitable-looking trades under simplified execution assumptions.

Not this: "This strategy would return +8% in real markets."

Yes this: "The agent loop discovered, fixed, and avoided repeating six classes of operational bug across two weeks."

The biggest gaps between paper and real money:

Execution price is worse.

Paper trades assume you get the observed quote. Reality has bid-ask spread, slippage, partial fills, and fast price movement after the signal.

Market impact appears sooner.

For Rs 1L on liquid Nifty 500 names, impact is small - but on aggressive entries or thin names, your own order moves the fill price.

Stops and targets don't trigger cleanly.

A -1.5% stop in simulation sounds precise. In real markets, price gaps through the stop, liquidity vanishes, or the order fills materially worse.

Latency matters.

A 5-minute polling loop is coarse. By the time the agent sees a breakout, computes, and the order fills, the edge may be gone.

Stale quotes become losses.

You saw this in paper form with VEDL and the post-close trades. With real money these are not audit reversals - they are losses unless the broker or order layer prevents them.

Operational and tax overhead is real.

Beyond modeled transaction costs: tax treatment, reconciliation, broker reports, failed orders, compliance. Cumulative drag matters.

Psychology changes the experiment.

With real money you intervene, panic, override, or stop the agent after a drawdown. The experiment ceases to be about the agent and becomes about you.

Minimum upgrades before real money
  1. Paper trade against actual bid/ask, or apply conservative slippage assumptions to every fill.
  2. Add a broker-style order simulator with explicit states: pending, filled, rejected, partial-fill.
  3. Add hard liveness monitoring on the agent itself - alarms when polls stop arriving.
  4. Add "no trade" guards for stale quotes, corporate actions, post-close, low liquidity, and unusual gaps.
  5. Run a shadow mode: emit trades live, compare against real executable prices, do not place orders.
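Upgrade 1 is the cheapest to prototype. The sketch below applies a flat slippage haircut to every fill and lets a long position's stop fill at the worse of the stop level and the price actually traded. The 10 bps default and both function names are assumptions for illustration, not calibrated values. The May 8 trade list already shows stops realized at -2.4% and -3.1% against a nominal -1.5% stop, which is the effect the second function models.

```python
def fill_price(quote, side, slippage_bps=10):
    """Quote adjusted against the trader: buys fill higher, sells fill lower."""
    if side not in ("BUY", "SELL"):
        raise ValueError(f"unknown side: {side}")
    adjustment = quote * slippage_bps / 10_000
    return quote + adjustment if side == "BUY" else quote - adjustment


def stop_fill_price(stop_level, traded_price):
    """Exit price for a long position whose stop has triggered.

    If the market gapped through the stop, the fill is the traded price,
    not the stop level.
    """
    return min(stop_level, traded_price)


if __name__ == "__main__":
    # Illustrative entry at a quote of 100 with a -1.5% stop at 98.50.
    entry = fill_price(100.0, "BUY")
    exit_quote = stop_fill_price(98.5, 96.9)  # price gapped below the stop
    exit_fill = fill_price(exit_quote, "SELL")
    print(f"entry {entry:.2f}, exit {exit_fill:.2f}")
```

Even this crude model turns a nominal -1.5% stop into a noticeably larger realized loss once the price gaps and slippage applies on both legs.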

The actual lesson

Autonomous agents are only as good as the operational envelope around them.

07

Where it stands at the close.

Markets closed at 15:30 IST. Final portfolio: Rs 108,049 (+0.59% on the day, +8.05% run-to-date). Six positions held overnight, Rs 15,157 in cash. Loop ran clean from 09:17 onwards with no gaps after the morning restart.

14 exits through the day - the morning was the busiest stretch, with nine exits between 09:17 and 09:40. Around midday, two stops (ACMESOLAR -Rs 235, LUPIN -Rs 199) were partly offset by a DABUR target hit at 11:54 (+Rs 277), and there was a fresh BRITANNIA entry.

Time  | Action | Ticker     | P&L     | Reason
------|--------|------------|---------|-------------
09:17 | SELL   | INTELLECT  | +Rs 446 | Target +3.7%
09:17 | SELL   | LTF        | +Rs 348 | Target +2.4%
09:17 | SELL   | LLOYDSME   | -Rs 356 | Stop -2.4%
09:21 | SELL   | NIVABUPA   | +Rs 277 | Target +2.0%
09:21 | SELL   | NUVAMA     | +Rs 284 | Target +2.2%
09:26 | SELL   | M&M        | +Rs 250 | Target +2.0%
09:31 | SELL   | THERMAX    | +Rs 334 | Target +2.6%
09:36 | SELL   | SONATSOFTW | -Rs 498 | Stop -3.1%
09:40 | SELL   | BSE        | +Rs 408 | Target +2.7%
10:33 | SELL   | ABLBL      | -Rs 272 | Stop -1.9%
10:56 | SELL   | GRAVITA    | -Rs 231 | Stop -1.5%
11:39 | SELL   | ACMESOLAR  | -Rs 235 | Stop -1.6%
11:54 | SELL   | DABUR      | +Rs 277 | Target +2.2%
12:03 | SELL   | LUPIN      | -Rs 199 | Stop -1.5%

Realized today: +Rs 833. 8 TARGETs (+Rs 2,622) vs 6 STOPs (-Rs 1,789). Holding overnight: ANANDRATHI, SONACOMS, MAHABANK, PIDILITIND, MGL, BRITANNIA.