QBReader overnight

for Geoffrey · 2026-05-21

David ran six subagents against your codebase overnight on a branch called ai-overnight-improvements (off main, not pushed). Eleven commits landed — 7 real bug fixes and 4 a11y/style tweaks. Below are the executive summary and full reports. Start with Morning Report if you only read one.

Morning Report

Overnight Report — `ai-overnight-improvements`

Six subagents ran in parallel for ~13 minutes against a fresh branch off main. This file is the executive summary; the detailed reports live in 01_BUGS.md through 06_ARCHITECTURE.md.

What's on the branch

Branch: ai-overnight-improvements (off main, 11 commits, not pushed).

Working tree is clean. The AI_OVERNIGHT_REPORTS/ directory is intentionally untracked — you can decide whether to commit it, delete it, or move it.

6c88f0ce fix: initiated-votekick log message swallowed by username argument
b67e4540 fix: guard Room.sendToSocket against missing socket
49dae858 fix: honor threePartBonuses filter in random-bonus selection
119a01f7 a11y: add autocomplete hints to account form fields
ef57a756 fix: starred-question lookup sort order was silently ignored
01b3012b fix: reveal tossup answer when all remaining players have buzzed
56451097 fix: votekick logic - allow superpowers to count, only kick on success
272f0b0a Revert accidental inclusion of WIP votekick changes
d4a6833b fix: correct operator precedence in modaq metadata field
777a329f a11y: add labels and screen-reader text to icon-only controls
e3cae38c style: fix navbar background and logo in night mode

Stat: git diff main..HEAD --stat shows 22 files / +72 / −50 lines — all tightly scoped, no rewrites.

272f0b0a is a self-revert: the design subagent's empty-state commit accidentally absorbed working-tree edits the bug subagent was making to the votekick logic, and was reverted immediately. The intended votekick fix landed cleanly later in 56451097. The ServerMultiplayerRoomMixin.js diff in HEAD vs main matches the committed votekick fix and contains no unintended changes — verified.

TL;DR by area

| Area | Verdict | Highest-priority item |
| --- | --- | --- |
| Bugs | Several real ones, 7 fixed | Tossup answer was never revealed if a buzzer disconnected (the same bug that's on fix-buzz-count-after-leave) |
| Security | Needs urgent attention | Password hashing is triple-SHA256, not bcrypt; JWT/cookie secrets default to literal 'secret'/'salt' if env vars unset |
| Performance | Mostly OK; a few real wins available | Unanchored regex on /api/query does full collection scans (10–50× win with text index) |
| UX / a11y | Functional but rough edges; 4 fixes | Night-mode navbar was a giant light-blue stripe with a black wordmark; fixed |
| Features | 13 ranked proposals | Post-game Claude-API tutor + spaced-repetition queue rank highest |
| Architecture | Solid for current scale; risks at 5K+ | In-memory multiplayer state lost on every Heroku restart (and there's a daily 8am UTC one) |

1. Bugs fixed (7)

In rough order of how-bad-was-this:

  1. 01b3012b — Tossup answer never revealed when buzzes outnumber sockets. quizbowl/TossupRoom.js:135. This is the same class of bug that the fix-buzz-count-after-leave branch was created to fix; the subagent re-derived it independently from scratch. Reasonable to retire that branch.

  2. d4a6833b — modaq metadata field was always wrong due to ternary precedence: `${cat} - ${sub}` + alt ? ` - ${alt}` : ''. The left string is always truthy, so the field was either ` - ${alt}` or '', never `${cat} - ${sub}`. Pure logic bug.

  3. 56451097 — Votekick logic: ineligibility check ignored superpowers, and initiating a votekick added the target to kickedUserList before the vote was successful, so a single initiation kicked the target on reconnect. Both fixed.

  4. ef57a756 — Starred-question lookups were passing an aggregation pipeline stage as the options arg to find(), silently dropping the sort. Replaced with { sort: { _id: -1 } }.

  5. 49dae858 — threePartBonuses filter was destructured by callers and validated by the route but never read by getRandomBonuses, so 4+ part bonuses could still leak through.

  6. b67e4540 — Room.sendToSocket crashed on a missing socket (synchronous TypeError that escaped the WebSocket message handler). Guarded.

  7. 6c88f0ce — initiated-vk toast rendered the whole sentence in the username slot and "undefined" in the message slot. Mis-split args.
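
The precedence bug behind d4a6833b is easy to reproduce in isolation. The following is a minimal sketch, not the actual modaq code: the values of `cat`, `sub`, and `alt` are made up for illustration.

```javascript
// Hypothetical values; the real inputs are not shown in this report.
const cat = 'Science';
const sub = 'Biology';
const alt = 'Genetics';

// Buggy: `+` binds tighter than `?:`, so this parses as
// ((`${cat} - ${sub}` + alt) ? ` - ${alt}` : '').
const buggy = `${cat} - ${sub}` + alt ? ` - ${alt}` : '';

// Fixed: parenthesize the conditional so only the suffix is optional.
const fixed = `${cat} - ${sub}` + (alt ? ` - ${alt}` : '');

console.log(JSON.stringify(buggy)); // " - Genetics"
console.log(fixed); // Science - Biology - Genetics
```

The category and subcategory never reach the output in the buggy form, because the whole concatenation is consumed as the ternary's condition.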

Bugs proposed but NOT fixed (6) — see 01_BUGS.md

Worth your judgment:

2. Security — needs urgent attention

Five high-severity findings (02_SECURITY.md). The two I'd act on this week:

Also high but less urgent:

Medium-severity items worth looking at: JWT tokens have no exp claim, reset/verify tokens live in process memory (and don't survive the daily 8am restart!), no CSRF protection, admin-role check is binary (no scoping), and there's dangerouslySetInnerHTML on admin pages that takes question data raw.

3. Performance — modest, real wins

Two are worth doing soon:

Speculative but plausible:

Full report in 03_PERFORMANCE.md includes bundle sizes, query patterns, and 14 ranked findings.

4. UX / a11y — 4 fixes landed

Committed:

Proposed (not done):

Full report in 04_DESIGN.md.

5. Feature proposals — top picks

From 05_FEATURES.md, ranked:

  1. Personalized question recommendations [M] — surface tossups from the user's weakest categories.
  2. Post-game Claude tutor [M] — Sonnet, with prompt caching on question metadata. Cost estimate ~$0.02–0.05 per round. Explains clue chains, suggests drills. This is the highest-leverage AI feature for a quizbowl site; nothing else is close.
  3. Spaced-repetition queue [M] — Anki-style review of missed clues and answers.
  4. Time-series stats by category [M] — trending power/neg/conversion rates with weekly granularity.
  5. Embeddable QOTD widget [S] — for school/club sites and Discord. Cheap viral surface.

Second tier: tournament brackets, TTS reading, public leaderboards, bot opponents, spectator mode.

Long-term: question-authoring UI, Discord OAuth, better errata workflow.

Explicitly not recommended by the subagent: native mobile app (PWA is better ROI), AI question synthesis (community-written questions are better), classroom mode (spectator mode covers it).

6. Architecture — solid for now, predictable cliffs

From 06_ARCHITECTURE.md. The site is well-designed for ~100–500 concurrent players. Risks past that:

The architecture review's top-5 refactors-by-leverage list is the right 6-month roadmap if you want one.


What I'd do first

If your friend wants a punch-list for the morning:

Today (security):

  1. Confirm SECRET, SALT, SECRET_KEY_1, SECRET_KEY_2 are set in Heroku config vars. Strip defaults from server/authentication.js:16-17 and app.js:32.
  2. npm update ws → 8.21+.
  3. Add httpOnly: true, secure: true, sameSite: 'strict' to cookieSession in app.js.

This week:

  4. Plan the bcrypt migration. Soft migration with dual-verifier is the sane path.
  5. Review and merge the 7 bug-fix commits on ai-overnight-improvements (or cherry-pick the ones you trust).
  6. Decide on the proposed-bugs in 01_BUGS.md — most are real but product-sensitive.

This month:

  7. Add a Mongo text index for question_sanitized / answer_sanitized. This is a single command and gives the biggest user-visible perf win.
  8. Move reset/verify tokens to a Mongo collection with a TTL index — the daily 8am restart currently invalidates every in-flight email link.
  9. Write the first multiplayer-room integration test.

Next quarter:

  10. Redis-backed multiplayer state + WebSocket reconnect.
  11. Pick one AI feature to ship (post-game tutor is the obvious one).
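
One way to approach the "strip defaults" step under Today is to fail at startup when a secret is missing, rather than falling back to a literal. This is a sketch under assumptions: `requireEnv` is a hypothetical helper, and only the environment variable names and cookie flags come from the punch-list.

```javascript
// Hypothetical helper: read a required secret or refuse to start.
function requireEnv (name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Cookie flags from the punch-list, as a plain options object that could be
// spread into the existing cookieSession(...) call in app.js.
const cookieOptions = { httpOnly: true, secure: true, sameSite: 'strict' };
```

A call such as `requireEnv('SECRET')` would then replace any `process.env.SECRET || 'secret'` style fallback, so a misconfigured dyno crashes loudly instead of running with a guessable key.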


Trust-but-verify notes

I spot-checked the agents' work but not exhaustively:

- The committed code diff is tightly scoped and all 22 changed files are consistent with the reports. No surprises in the ServerMultiplayerRoomMixin diff against main after the revert.
- I verified the most alarming security claim (triple-SHA256 hashing) by reading server/authentication.js directly. Confirmed.
- The bug-fix commits passed npm run lint per the agent (semistandard --fix). I did not run the test suite — there isn't one.
- Performance estimates ("10-50× win", "50-70% WS traffic reduction") are the subagent's gut estimates from code-reading, not measurements. Treat them as "this is worth measuring" not "this is the number."
- Bundle / file sizes in the perf report were measured.


Sleep well — the heavy lifting waiting for you is the bcrypt migration and confirming production secrets. Everything else can move at your pace.

Bugs

Bug Audit

Fixed (committed)

Proposed (needs review)

Not bugs but suspicious

Security

Security Audit

TL;DR

QBReader has a solid foundation with parameterized database queries, salted password storage, and authentication checks on protected endpoints. However, there are critical issues with weak default credentials, missing cookie security flags, weak password hashing (triple SHA256 rather than bcrypt), and an outdated WebSocket dependency with a known vulnerability. The XSS surface is limited but exists in specific admin pages.

Findings

High severity

2. Session cookies missing httpOnly and secure flags

3. Weak password hashing (triple SHA256 without bcrypt)
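
A soft migration with a dual verifier might look roughly like the sketch below. Two assumptions are loud here: the legacy scheme is modeled as SHA-256 applied three times to salt + password (the real construction in server/authentication.js may differ), and Node's built-in scrypt stands in for bcrypt so the example needs no dependencies. The `user` record shape is hypothetical.

```javascript
const { createHash, randomBytes, scryptSync, timingSafeEqual } = require('node:crypto');

// ASSUMPTION: legacy scheme = sha256 applied three times to salt + password.
function legacyHash (password, salt) {
  let digest = salt + password;
  for (let i = 0; i < 3; i++) {
    digest = createHash('sha256').update(digest).digest('hex');
  }
  return digest;
}

// Stand-in for bcrypt: scrypt from the standard library, stored as "salt:hash".
function slowHash (password) {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 32);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

function slowVerify (password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return timingSafeEqual(actual, expected);
}

// Dual verifier: accept either format and upgrade legacy records on success.
// `user` is a hypothetical record: { scheme: 'legacy' | 'scrypt', hash, salt }.
function verifyAndUpgrade (user, password) {
  if (user.scheme === 'scrypt') {
    return { ok: slowVerify(password, user.hash), user };
  }
  const candidate = Buffer.from(legacyHash(password, user.salt), 'hex');
  const stored = Buffer.from(user.hash, 'hex');
  const ok = candidate.length === stored.length && timingSafeEqual(candidate, stored);
  if (!ok) return { ok: false, user };
  return { ok: true, user: { scheme: 'scrypt', hash: slowHash(password) } };
}
```

The caller would persist the returned `user` after a successful login, so accounts move to the slow hash gradually without a forced password reset.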

4. Outdated ws package with known vulnerability (GHSA-58qx-3vcg-4xpx)

5. User enumeration via password reset endpoint

Medium severity

6. JWT tokens lack expiration (no exp claim)

7. Password reset tokens stored in-memory without persistence

8. No CSRF protection on state-changing endpoints

9. Admin endpoint authorization only checks user role, not ownership

10. Unvalidated user input in client-side admin pages (HTML injection)

The `removeParentheses()` function only strips trailing parentheses; it doesn't HTML-escape. If `myBuzz.answer` contains `<script>` or `<img onerror=...>`, it will execute.

- **Why it matters:** An admin viewing results could be XSS'd if a user crafted a malicious answer string (e.g., during buzzing). Admin token would be stolen.
- **Fix:** Escape all user data before inserting into HTML:

```javascript
Answer: ${escapeHTML(removeParentheses(myBuzz.answer))}
```
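
For reference, an escaping helper of the kind this fix relies on can be very small. The version below is a hypothetical equivalent, not a copy of the project's own `escapeHTML()` utility, which was not reproduced in this report.

```javascript
// Hypothetical equivalent of the project's escapeHTML() utility.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHTML (value) {
  // Replace each HTML-significant character with its entity.
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

With this in place, an answer string such as `<img src=x onerror="alert(1)">` is rendered as visible text instead of being parsed as markup.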

11. Stored XSS in admin category reports via dangerouslySetInnerHTML

Low severity / hygiene

12. Rate limiting is lenient and applies equally to all endpoints

13. No security headers (CSP, X-Frame-Options, X-Content-Type-Options)

14. CORS enabled without origin whitelist

15. Incomplete validation of email and username input

16. No database connection error handling for authentication

17. WebSocket messages lack input validation and can cause XSS

18. No logging of security-relevant events

19. No rate limiting on POST /api/report-question

20. Regex DoS possible in database queries
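
For finding 20, a common mitigation is to escape user input before it reaches `$regex`, so it is matched literally. The sketch below assumes a hypothetical `prefixQuery` builder; only `$regex` itself is MongoDB's operator name.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegExp (text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical query builder: a case-sensitive pattern anchored with ^
// is also the form that can make use of an ordinary index on the field.
function prefixQuery (field, userInput) {
  return { [field]: { $regex: '^' + escapeRegExp(userInput) } };
}
```

Escaping removes the attacker's ability to submit pathological patterns such as nested quantifiers, since every metacharacter is treated as plain text.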

What looks good

Parameterized database queries: All MongoDB queries use proper operators ($in, $regex with options) instead of string concatenation. No NoSQL injection vectors found.

Password storage with salt: Passwords are salted (though not with bcrypt).

Authorization checks on protected endpoints: Admin and user-specific routes properly verify checkToken() before allowing access.

Email verification flow: Email links expire in 15 minutes and are one-time use (token cleared after use).

Stripe webhook signature verification: stripe.webhooks.constructEvent() properly validates HMAC signatures.

Object ID validation: Admin and user-facing endpoints validate ObjectId format before querying (new ObjectId() with try/catch).

Response status codes: Consistent use of proper HTTP status codes (401 for auth failure, 403 for forbidden, 400 for bad input).

User enumeration partially mitigated on login: Login endpoint returns 401 for both bad username and bad password (doesn't reveal which).

HTML escaping in some client code: The escapeHTML() utility is used correctly in several admin pages (compare.js, category-stats.js).

DOMPurify installed: Package is installed (though not used client-side currently).

Out of scope / not checked

Performance

Performance & Efficiency Audit

TL;DR

The qbreader site has several preventable performance bottlenecks. The biggest wins are: (1) full-collection scans via unanchored regex in search queries—these should use wildcard indexes or be rewritten to avoid regex; (2) synchronous file I/O on every page load in ssi-middleware.js; (3) per-connection timers in multiplayer rooms that keep running even when idle; (4) no HTTP cache headers on cacheable assets; (5) broadcast of huge state objects on every event instead of deltas. The frontend is reasonably optimized (webpack code-split, production mode), but there's DOM thrashing on multiplayer chat.


High-impact opportunities (worth doing)

1. Unanchored regex searches cause full collection scans

2. Synchronous readFileSync blocks the event loop on every non-API request
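
One possible shape for the fix is a small promise cache in front of an asynchronous read. This is a sketch only: `createFileCache` is a hypothetical name, and it assumes the SSI partials do not change while the process is running.

```javascript
const { readFile } = require('node:fs/promises');

// Cache the in-flight or resolved promise per path so each file is read once.
// ASSUMPTION: the files do not change while the process is running.
function createFileCache (load = (path) => readFile(path, 'utf8')) {
  const cache = new Map();
  return function read (path) {
    if (!cache.has(path)) {
      const pending = load(path);
      // Drop failed reads so a later request can retry.
      pending.catch(() => cache.delete(path));
      cache.set(path, pending);
    }
    return cache.get(path);
  };
}
```

The loader is injectable so the cache can be exercised without touching the disk; in the middleware, `await read(partialPath)` would replace the `readFileSync` call.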

3. Per-room timer intervals keep running idle rooms alive
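
The idea of stopping timers for empty rooms can be sketched as follows. `TimedRoom` and its methods are hypothetical and do not mirror the real room classes.

```javascript
// Hypothetical room wrapper: the tick timer exists only while players are present.
class TimedRoom {
  constructor (onTick, intervalMs = 1000) {
    this.onTick = onTick;
    this.intervalMs = intervalMs;
    this.players = new Set();
    this.timer = null;
  }

  join (userId) {
    this.players.add(userId);
    if (this.timer === null) {
      this.timer = setInterval(this.onTick, this.intervalMs);
    }
  }

  leave (userId) {
    this.players.delete(userId);
    // Last player gone: stop ticking until someone joins again.
    if (this.players.size === 0 && this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }
}
```

Tying the timer's lifetime to the player set means an idle room costs no scheduled work at all.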

4. Leaderboard aggregation scans entire per-tossup-data and per-bonus-data collections

5. Full room state broadcast on every message, no delta updates
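
A delta update can start as a shallow diff of top-level keys. The sketch below is illustrative; `diffState` and `applyDelta` are hypothetical names, and it does not handle keys that are removed from the state.

```javascript
// Shallow diff: return only the top-level keys whose values changed.
function diffState (previous, next) {
  const delta = {};
  for (const key of Object.keys(next)) {
    if (!Object.is(previous[key], next[key])) {
      delta[key] = next[key];
    }
  }
  return delta;
}

// The client merges a delta into its local copy of the room state.
function applyDelta (state, delta) {
  return { ...state, ...delta };
}
```

The server would keep the last state it broadcast, send `diffState(last, current)` instead of the full object, and fall back to a full snapshot on join or reconnect.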

6. No HTTP Cache-Control headers on static assets or cacheable endpoints

7. Frequency list aggregation doesn't limit output size early


Medium-impact

8. Regex patterns in set.name search could be optimized

9. DOM churn in multiplayer chat updates

10. Rate limiter uses Set per socket, could use token bucket or sliding window
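
A token bucket, one of the alternatives named in this finding, fits in a few lines. This is a sketch with hypothetical names; the clock is injectable so the limiter can be tested without waiting.

```javascript
// Token bucket: up to `capacity` tokens, refilled at `refillPerSecond`.
class TokenBucket {
  constructor (capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true if the message is allowed, false if the bucket is empty.
  tryRemove () {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

Each socket would hold one bucket, which needs only two numbers of state per connection and allows short bursts up to `capacity`.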


Nits / micro-optimizations

11. Object.keys(this.players) called multiple times per votekick

12. Votekick threshold calculation iterates over all players

13. Banner and kick cleanup interval runs every 5 minutes on every room

14. Payload size for connection-acknowledged message is large


Numbers I gathered

Bundle / File sizes

Query patterns observed

Startup / Database


What looks well-tuned already

Positive patterns:

  1. Multiplayer rooms use Promise.all() to parallelize queries — getTossupQuery and getBonusQuery run in parallel in /home/david/code/website/database/qbreader/get-query.js:144.
  2. Random tossup/bonus endpoints use $sample aggregation stage — avoids sorting and is ~3-4× faster than the general query with randomize option.
  3. Webpack is in production mode — should minify and tree-shake, though the output isn't visible in the repo (likely built at deploy time).
  4. Code-splitting is configured — webpack has separate entry points for tossups, bonuses, multiplayer, db explorer, etc.
  5. Express CORS is only on /api — doesn't add overhead to HTML routes.
  6. Room cleanup on disconnect is present — players are removed (or marked offline) to prevent unbounded memory growth.
  7. WebSocket payload size limit enforced — WEBSOCKET_MAX_PAYLOAD = 10 KB in server.js to prevent huge messages.
  8. Async/await throughout — no blocking I/O in hot paths except ssi-middleware.
  9. Rate limiting per socket in multiplayer — basic protection against message spam.

Recommendations by priority

| Priority | Issue | Action | Est. Effort | Est. Impact |
| --- | --- | --- | --- | --- |
| 🔴 High | No text/wildcard indexes on search fields | Create MongoDB indexes; consider search redesign | Medium | Very High |
| 🔴 High | Sync file reads in ssi-middleware | Cache rendered HTML; use async reads | Low | High |
| 🔴 High | Idle room timers keep running | Add checks to stop timers when no players | Low | High |
| 🔴 High | No HTTP cache headers | Add Cache-Control to static routes | Low | High |
| 🟡 Medium | Large state broadcasts over WebSocket | Implement delta updates | High | Medium |
| 🟡 Medium | Leaderboard scans entire stats collections | Add caching + index optimization | Medium | Medium |
| 🟡 Medium | Frequency list doesn't early-limit | Optimize aggregation pipeline | Low | Low-Medium |

Cold Start & Heroku-specific notes

Design / UX

Design & UX Audit

Scope: client/, scss/, and the SSI partials in client/ssi/. Pages sampled: home (/), tossups (/play/tossups/), bonuses (/play/bonuses/), multiplayer lobby (/play/mp/), multiplayer room (/play/mp/room.html), database search (/db/), account pages (/user/*), about, settings, 404.

Implemented (committed)

Proposed (not implemented)

Quick wins

Bigger projects

Accessibility report

Findings by severity.

High

Medium

Low

Mobile/responsive findings

Notes

The lint script (npm run lint) runs semistandard --fix, which made small autofixes to files unrelated to this audit while I worked (server/database files in the working tree). Those unstaged fixes are genuinely good (one is a real operator-precedence bug in database/qbreader/get-packet.js); they were authored by another session that already committed similar fixes on this branch. I avoided touching them. If npm run lint is part of the regular workflow, expect those autofixes to appear in subsequent commits.

Features

Feature Proposals for QBReader

Top Picks (Start Here)

1. Personalized Question Recommendation Engine [M]


2. Post-Game AI Tutor (Claude API) [M]


3. Spaced Repetition Queue for Missed Clues [M]


4. Granular Time-Series Stats: Power/Neg/Conversion by Category Over Time [M]


5. Embeddable "Question of the Day" Widget [S]


Solid Second Tier

6. Tournament Mode: Bracketed Multiplayer Rooms [L]


7. Audio Reading (TTS) for Solo Practice [M]


8. Public Leaderboards + Season Ladders [M]


9. Bot Opponents for Solo Multiplayer Feel [L]


10. Spectator Mode for Multiplayer Rooms [M]


Ambitious / Longer-Term

11. Question Authoring/Editor UI for Writers [L]


12. Discord OAuth + Result Posting [M]


13. Better Question Reporting + Errata Workflow [S]


Mobile App (Native iOS/Android)

AI Question Generation (Synthetics)

Real-Time Classroom Mode (Teacher Controls)

Mobile PWA (Install to Home Screen)


Implementation Roadmap Suggestion

Phase 1 (Weeks 1–2): Quick wins + foundational

  1. Embeddable QOTD widget (S) — easy viral growth
  2. Personalized recommendations (M) — high engagement ROI
  3. Better reporting + errata workflow (S–M) — trust & community

Phase 2 (Weeks 3–6): Depth + engagement

  1. Post-game AI Tutor (M) — premium feature, Claude integration
  2. Spaced repetition queue (M) — retention mechanism
  3. Time-series stats (M) — analytics depth

Phase 3 (Weeks 7–10): Social/Competitive

  1. Public leaderboards + seasons (M) — gamification
  2. Spectator mode (M) — tournament readiness
  3. Bot opponents (L) — solo multiplayer

Phase 4 (Weeks 11+): Ecosystem

  1. Tournament mode (L) — capture organizations
  2. Question authoring UI (L) — crowdsource content
  3. Discord integration (M) — embed in community
  4. Audio TTS (M) — match real tournament feel


Claude API Notes

For AI Tutor (Proposal #2)

For Recommendations (Proposal #1)


Quick Implementation Checklist

Architecture

Architecture Review: qbreader/website

Architecture Map

Deployment topology: Single Node.js dyno on Heroku with daily 8 AM UTC restart. MongoDB Atlas cluster stores questions and user accounts. Static assets served via Express.

Key components:

  1. Entry point (server.js): HTTP server + WebSocket server (ws library) both on same port
  2. Express app (app.js): Middleware stack: trust proxy → hostname/HTTPS enforcement → cookie sessions → rate limiting → IP filtering → routes
  3. Routes (routes/):
     - /api/*: Question data, queries, frequency lists, webhooks
     - /auth/*: Login, signup, email verification, password reset
     - /db/*: Question browser, stats
     - /play/*: Static HTML pages for singleplayer/multiplayer
     - /admin/*: Admin tools
     - /user/*: Profile management
  4. Multiplayer layer (server/multiplayer/):
     - handle-wss-connection.js: Entry point for all WebSocket connections
     - Global state: tossupBonusRooms = {} (in-memory map of all active rooms)
     - Connection per-IP tracking: connectionsByIp = Map()
     - Room model: ServerTossupBonusRoom extends mixins applied to TossupBonusRoom
     - Mixin chain: ServerMultiplayerRoomMixin(TossupBonusRoom) → TossupBonusRoom inherits from BonusRoomMixin(TossupRoomMixin(QuestionRoom))
  5. Game logic (quizbowl/): Shared pure classes (Room, QuestionRoom, TossupRoom, BonusRoom) for room state and game rules
  6. Database (database/): MongoDB client singleton (databases.js). Three DBs: qbreader (questions), account-info (users), geoword (geo questions)
  7. Frontend (client/): Vanilla ES6 + React JSX (built via webpack). No global state manager; state lives in DOM + websocket messages. Local storage for preferences.
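
The mixin chain in the multiplayer layer can be illustrated with class-factory mixins. The class and mixin names below come from the component list; the `capabilities()` method and every method body are placeholders invented for this sketch, not the real implementation.

```javascript
// Class names follow the chain described above; method bodies are placeholders.
class QuestionRoom {
  capabilities () { return ['question']; }
}

// A mixin is a function that takes a base class and returns an extended class.
const TossupRoomMixin = (Base) => class extends Base {
  capabilities () { return [...super.capabilities(), 'tossup']; }
};

const BonusRoomMixin = (Base) => class extends Base {
  capabilities () { return [...super.capabilities(), 'bonus']; }
};

class TossupBonusRoom extends BonusRoomMixin(TossupRoomMixin(QuestionRoom)) {}

const ServerMultiplayerRoomMixin = (Base) => class extends Base {
  capabilities () { return [...super.capabilities(), 'multiplayer']; }
};

class ServerTossupBonusRoom extends ServerMultiplayerRoomMixin(TossupBonusRoom) {}

console.log(new ServerTossupBonusRoom().capabilities());
// [ 'question', 'tossup', 'bonus', 'multiplayer' ]
```

Each layer calls `super` before adding its own behavior, which is why the shared game logic in quizbowl/ can stay free of server concerns.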

What Works Well


Smells & Concerns

High Priority

Medium Priority

Low Priority


Scaling Cliffs

  1. 10x users (5,000 concurrent):
     - A single dyno's CPU will saturate. At roughly 500 concurrent WebSocket connections per dyno, you need 10 dynos minimum.
     - Mitigation: Add a sticky-sessions load balancer (Heroku's native LB does this). But in-memory tossupBonusRooms must be sharded across dynos or moved to Redis. Huge lift.

  2. 100x questions (1M+):
     - /api/query endpoints do full collection scans with filters. MongoDB will slow down. Indexes on category, difficulty, setYear exist (assumed from schema) but aren't visible. Aggregation pipeline queries are likely missing.
     - Mitigation: Index audit, explain() on slow queries, move complex queries to aggregation pipelines. MongoDB Atlas M5+ cluster for auto-scaling reads.

  3. 10x rooms (2,000 simultaneous):
     - Iterating the in-memory tossupBonusRooms map (e.g., to broadcast "server restarting") is O(n) in the number of rooms and gets noticeably slower. Cleanup loop runs every 5 min anyway.
     - Mitigation: Move rooms to Redis with TTL. Much faster iteration, natural cleanup. Rooms become cluster-aware.

  4. Long-lived connections (up to 24 hours between daily restarts):
     - WebSocket memory leaks possible. Each socket holds references to room, player, team objects. If a player joins/leaves 100 times (unlikely but possible between restarts), those objects might not be GC'd if there are circular refs.
     - Mitigation: Memory profiler on production daily. Watch heap size in New Relic. Add a "room purge" endpoint that deletes empty rooms.

Reliability Concerns


Testing Assessment

What's there: None (no test files found).

What's missing:

  1. Unit tests for ServerMultiplayerRoomMixin: Each message type (ban, votekick, give-answer, etc.) should have a unit test verifying it mutates room state correctly. Currently, bugs here are caught only in production or manual play.
  2. Integration test for room lifecycle: Join → load question → buzz → answer → score → leave. Ensures multiplayer protocol is consistent end-to-end.
  3. Auth tests: Password hashing, token generation, email verification token lifecycle.
  4. Frontend state tests: TossupClient state transitions (not started → reading → buzz → reveal → score).
  5. Load test: Spawn 100 WS connections to a room, measure latency and errors.

Where tests would pay off most:

- server/multiplayer/ServerMultiplayerRoomMixin.js (477 lines, high-risk)
- quizbowl/TossupBonusRoom.js (game rules)
- server/authentication.js (auth is a single point of failure)
- Any route that modifies a user's data


Top 5 Refactors by Leverage

1. Extract message router from ServerMultiplayerRoomMixin (Effort: medium, Impact: high)

Break the 477-line mixin into smaller, testable handlers:

- MessageRouter class: routes message.type to handler methods
- BanManager: votekick, ban, mute logic
- RateLimitManager: per-user token bucket

Each handler is a 20-50 line class with one responsibility. Instantly testable. At 10x users, you can tweak one without fear.
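
A router of the kind proposed here could be as small as the sketch below. The `MessageRouter` name comes from the proposal; its methods, the handler signature, and the `room.log` field in the usage example are assumptions.

```javascript
// Hypothetical router: maps message.type to a registered handler.
class MessageRouter {
  constructor () {
    this.handlers = new Map();
  }

  on (type, handler) {
    this.handlers.set(type, handler);
    return this; // allow chaining
  }

  // Returns true if a handler ran, false for unknown message types.
  dispatch (room, userId, message) {
    const handler = this.handlers.get(message.type);
    if (!handler) return false;
    handler(room, userId, message);
    return true;
  }
}

// Usage with a made-up 'chat' handler.
const router = new MessageRouter().on('chat', (room, userId, message) => {
  room.log.push(`${userId}: ${message.text}`);
});
```

Because handlers are plain functions registered by type, each one can be unit-tested with a stub room and no WebSocket at all.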

2. Add Redis for room state (Effort: high, Impact: high)

Move tossupBonusRooms to Redis with TTL. Benefits:

- Rooms survive dyno restart (if saved before close)
- Multi-dyno awareness: each room is "owned" by a dyno ID, and that room's players are routed to the same dyno
- Natural cleanup: rooms expire after 1 hour of inactivity
- Scales to 10x rooms

This is the bottleneck at scale.

3. Implement WebSocket reconnect (Effort: medium, Impact: medium)

Client stores room state in sessionStorage, server stores game state in room. On WS close:

- Client waits 1-5s, reconnects with room name + userId
- Server matches existing player, resends current game state
- Player rejoins without losing progress

Gains ~50% of players back after brief disconnects. Simple but high value.
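
The 1-5s wait could be implemented as a clamped exponential backoff. Exponential growth is an assumption of this sketch, as is the `reconnectDelayMs` name; the proposal only specifies the 1-5 second window.

```javascript
// Exponential backoff clamped to the 1-5 s window described above.
// `attempt` is the zero-based count of reconnect attempts so far.
function reconnectDelayMs (attempt, baseMs = 1000, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

console.log([0, 1, 2, 3].map((n) => reconnectDelayMs(n)));
// [ 1000, 2000, 4000, 5000 ]
```

Capping the delay keeps a player from waiting longer than the window, while the early short delays recover quickly from momentary blips.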

4. Add comprehensive error handling to socket message loop (Effort: low, Impact: high)

Wrap message handlers in try/catch, log, send error to client, optionally close socket. Prevents silent failures. Takes 1 hour, catches 80% of runtime bugs.
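
A wrapper along these lines would cover the try/catch, log, and error-reply steps. It is a sketch for synchronous handlers only: `withErrorBoundary` is a hypothetical name, `socket.send` mirrors the ws API, and the error payload shape is made up.

```javascript
// Wrap a message handler so a thrown error is logged and reported to the
// client instead of escaping the WebSocket message loop.
function withErrorBoundary (handler, log = console.error) {
  return function safeHandler (socket, rawMessage) {
    try {
      // JSON.parse is inside the try block, so malformed input is caught too.
      handler(socket, JSON.parse(rawMessage));
    } catch (error) {
      log('message handler failed:', error);
      socket.send(JSON.stringify({ type: 'error', message: 'Invalid or unprocessable message' }));
    }
  };
}
```

An async handler would additionally need its returned promise awaited or given a `.catch`, since a rejection does not pass through this `try` block.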

5. Extract database access layer (Effort: medium, Impact: medium)

Create QuestionService, UserService classes that encapsulate all MongoDB queries. Routes call questionService.query(), not raw MongoDB. Decouples routes from schema, enables easy swaps (MongoDB → Postgres, for example). Starts with one service, grows to cover all data access.


Summary

qbreader is well-structured for a small team and does the core job (multiplayer quizbowl) reliably for 100-500 concurrent players. The mixin pattern and shared game logic are elegant. But it trades scalability and resilience for simplicity:

- In-memory state + daily restart means no redundancy, no graceful failure modes.
- No tests mean refactors are risky.
- WebSocket reconnect missing means a 5s network blip is game-over.
- 477-line mixin is a hidden bomb: hard to understand, risky to change.

For your next 6 months, prioritize: (1) error handling + tests in multiplayer, (2) Redis for rooms (enables multi-dyno), (3) reconnect. Together, these three unlock scaling to 5,000 concurrent players and let the site survive brief outages. Rewriting from scratch is not needed; steady refactoring works.