Project 08 · Real-Time & Distributed Systems

IB Data Automation System

Resilient large-scale data pipeline.

01 · The challenge

A financial brokerage affiliate needed complete data on thousands of referred trading accounts, but the source portal had no API, only a slow, paginated web interface that had to be checked by hand.

02 · What I built

A scheduled automation system that logs into the portal, extracts every account record across all pages and sub-accounts, deduplicates the data, and delivers it as a structured feed on a recurring cycle, with no manual involvement.
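
The deduplication step can be sketched roughly as below. This is a minimal illustration, not the production code: the record shape and the field names `accountId` and `subAccountId` are assumptions made for the example, as is the rule that the most recently scraped copy of a record wins.

```javascript
// Sketch only: `accountId` / `subAccountId` are hypothetical field names.
// Records sharing a key collapse into one; the latest copy replaces
// earlier ones but keeps the position of the first occurrence.
function dedupeRecords(
  records,
  keyOf = (r) => `${r.accountId}::${r.subAccountId ?? ""}`
) {
  const byKey = new Map();
  for (const record of records) {
    byKey.set(keyOf(record), record);
  }
  return [...byKey.values()];
}
```

Keying on a `Map` keeps the pass linear in the number of records, which matters little at a few thousand rows but avoids a nested comparison loop.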

03 · The hard part

Reliable extraction from a hostile, API-less interface. I engineered around bot detection, dynamic UI overlays, and a subtle pagination-state bug that silently skipped records on account switches. Event-driven waits replace fragile fixed delays, so every page is captured correctly.
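
A rough sketch of the two ideas, under stated assumptions: the `portal` object, its methods, and the `"tableRendered"` event are hypothetical stand-ins for whatever the browser-automation layer exposes, and the example assumes the skipped records came from the page index carrying over when the account changed. The actual cause and fix in the real system may differ.

```javascript
const { once } = require("node:events");

// Run `action`, then wait until `emitter` fires `eventName`.
// The listener is attached BEFORE the action runs so a fast event
// cannot be missed. There is no fixed sleep; the timeout is only a
// safety limit. (AbortSignal.timeout needs Node 17.3+ or 16.14+.)
async function actAndWait(emitter, eventName, action, timeoutMs = 10_000) {
  const signal = AbortSignal.timeout(timeoutMs);
  const settled = once(emitter, eventName, { signal });
  settled.catch(() => {}); // avoid an unhandled rejection if `action` throws
  await action();
  const [payload] = await settled;
  return payload;
}

// Hypothetical portal interface: switchAccount, goToPage, nextPage,
// currentPage, readRows, hasNextPage, plus an `events` emitter.
async function collectAllRecords(portal, accountIds) {
  const records = [];
  for (const accountId of accountIds) {
    await actAndWait(portal.events, "tableRendered", () =>
      portal.switchAccount(accountId)
    );
    // Assumed bug: the page index survives the account switch, so
    // earlier pages of the new account would be skipped. Reset it.
    if (portal.currentPage() !== 1) {
      await actAndWait(portal.events, "tableRendered", () =>
        portal.goToPage(1)
      );
    }
    while (true) {
      records.push(...portal.readRows());
      if (!portal.hasNextPage()) break;
      await actAndWait(portal.events, "tableRendered", () =>
        portal.nextPage()
      );
    }
  }
  return records;
}
```

In a real headless-browser setup the awaited signal would more likely be a selector appearing or a network response completing; the point is only that each step proceeds when the page reports it is ready, not after an arbitrary delay.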

04 · The outcome

4,300+ account records, each with twenty-plus data fields, extracted reliably every cycle. A manual, repeated task became a zero-touch automated pipeline.

05 · Stack

Node.js · headless browser automation · scheduled jobs · structured data delivery