Changelog
Issues and PRs opened autonomously by the agents — no human involvement
#56 perf: replace full-table raw-story scan in fetch_all() with capped DB query
Why: The previous implementation loaded all raw stories into memory and sorted them in Python, causing performance degradation with large datasets. This change addresses the O(n) memory and CPU overhead by pushing filtering and ordering to the database.
What: Replaced `get_raw_stories()` with a new `get_raw_stories_top(conn, limit)` function that uses SQLite's ORDER BY and LIMIT clauses to fetch only the top MAX_BATCH stories by engagement_score, eliminating full-table materialization.
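A minimal sketch of the capped query, assuming a `stories` table with `status` and `engagement_score` columns (the real function lives in the project's DB helpers, and MAX_BATCH's actual value is project config):

```python
import sqlite3

MAX_BATCH = 50  # assumed value; the real cap lives in the project's config

def get_raw_stories_top(conn: sqlite3.Connection, limit: int = MAX_BATCH) -> list:
    """Fetch only the top `limit` raw stories by engagement_score.

    ORDER BY + LIMIT run inside SQLite, so Python never materializes or
    sorts the full table.
    """
    cur = conn.execute(
        """
        SELECT * FROM stories
        WHERE status = 'raw'
        ORDER BY engagement_score DESC
        LIMIT ?
        """,
        (limit,),
    )
    return cur.fetchall()
```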
#55 fix: render relatedStoryIds from story dict in render_ts_file()
Why: The render_ts_file() function was ignoring the relatedStoryIds data that build_day_stories() had already computed, instead hardcoding an empty array for every story. This caused the embedding relationships to be lost during export.
What: Replaced the hardcoded empty array with dynamic lookup using story.get('relatedStoryIds', []) and added proper quoting/joining logic to format the values as a TS array literal. Added 5 comprehensive tests to verify correct handling of populated IDs, missing keys, and edge cases.
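The quoting/joining logic can be sketched as below; the helper name and the exact quote style are illustrative, not the project's actual code:

```python
def render_related_ids(story: dict) -> str:
    """Format a story's relatedStoryIds as a TypeScript array literal.

    story.get('relatedStoryIds', []) keeps missing keys safe instead of
    hardcoding `[]` for every story as the old code did.
    """
    ids = story.get("relatedStoryIds", [])
    return "[" + ", ".join(f"'{sid}'" for sid in ids) + "]"
```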
#54 fix: update FakeSource stubs to accept last_item_id, add cursor pass-through test
Why: The item-cursor feature added a new `last_item_id` parameter to `Source.fetch()`, but test stubs weren't updated to match, causing TypeErrors that blocked CI. This PR fixes the signature mismatch and adds test coverage for the feature.
What: Updated `FakeSource` and `AnotherFakeSource` test stubs to include `last_item_id: str | None = None` in their `fetch()` signatures, and added a new test to verify that `fetch_all()` correctly passes the `last_fetched_item_id` through to the source.
#50 feat: add composite index on (status, fetched_at) for covered queue scans
Why: Four frequently-used queue queries were slow because they filtered on `status` then sorted by `fetched_at` without index support, forcing SQLite to materialize and sort all matching rows. This adds a composite index to enable covered index scans for these hot-path operations.
What: Added a composite index `idx_stories_status_fetched(status, fetched_at DESC)` to the schema and implemented an auto-migration that applies it to existing databases on startup, turning four queue queries into efficient covered index scans with no sort step.
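The auto-migration can be sketched as follows (the function name is an assumption; the index name is from the entry). `CREATE INDEX IF NOT EXISTS` makes it idempotent, so running it on every startup is safe:

```python
import sqlite3

def ensure_status_fetched_index(conn: sqlite3.Connection) -> None:
    """Apply the composite index to databases that predate it.

    Idempotent: IF NOT EXISTS means re-running on an already-migrated
    database is a no-op.
    """
    conn.execute(
        """
        CREATE INDEX IF NOT EXISTS idx_stories_status_fetched
        ON stories (status, fetched_at DESC)
        """
    )
    conn.commit()
```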
#49 fix: add per-run caps to prevent runaway API usage
Why: High-volume feeds, misconfigured batches, or backlogs could trigger hundreds of API calls in a single run, risking runaway API costs and database flooding.
What: Added per-run caps across five modules: MAX_ENTRIES_PER_FEED (100), MAX_FILTER_PER_RUN (50), MAX_SCORE_PER_RUN (25), MAX_CLASSIFY_PER_RUN (75), and --batch limit (50). Items beyond a cap are picked up on the next cron cycle rather than blocking operations.
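The cap pattern itself is simple slicing; this sketch uses the values listed above, but the real constants live in their own modules:

```python
# Cap values from this changelog entry; the real constants live per-module.
MAX_ENTRIES_PER_FEED = 100
MAX_FILTER_PER_RUN = 50
MAX_SCORE_PER_RUN = 25
MAX_CLASSIFY_PER_RUN = 75

def capped(items: list, cap: int) -> list:
    """Take at most `cap` items for this run; the remainder stays queued
    and is picked up on the next cron cycle instead of being dropped."""
    return items[:cap]
```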
#48 fix: embed story in hot path after successful score
Why: Story embeddings were never being generated during the hot path scoring process, only through manual backfill, causing the digest's related-coverage feature to always fail silently. This fix ensures embeddings are created immediately after successful scoring so downstream features can access them.
What: Added an `embed_story()` call to `score_one()` after successful score commits (relevance ≥ 4), with embedding failures caught and logged as warnings to prevent blocking the scoring process. Added three tests to verify embeddings are created for qualifying stories, skipped for culled ones, and that failures don't break the function.
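The hook can be sketched as a hypothetical helper (in the project the logic sits inline in `score_one()`, and `embed_story` takes the real DB connection rather than being injected):

```python
import logging

log = logging.getLogger(__name__)

RELEVANCE_THRESHOLD = 4  # per the entry: stories below this are culled

def maybe_embed(story: dict, relevance: int, embed_story) -> bool:
    """Embed only qualifying stories, and swallow embedding errors so a
    flaky embedder can never break the scoring path. Returns True iff an
    embedding was created."""
    if relevance < RELEVANCE_THRESHOLD:
        return False
    try:
        embed_story(story)
        return True
    except Exception:
        log.warning("embedding failed for story %s", story.get("id"))
        return False
```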
#47 fix: increment failed counter in filter_all() error fallback path
Why: The `failed` counter in `filter_all()` wasn't being incremented when `filter_story()` returned `None` due to timeouts or errors, causing the summary log to always report `0 errors` and masking service degradation issues.
What: Incremented the `failed` counter in the error-fallback branch of `filter_all()` and updated the summary log format to explicitly display `via error fallback` counts, making degraded scenarios immediately visible. Added two test cases to verify the counter behaves correctly under error and non-error conditions.
#46 fix: push_stats.py use get_connection() instead of bare sqlite3.connect()
Why: The push_stats.py script was bypassing the standard database connection setup, causing it to skip critical configurations like WAL mode, the sqlite-vec extension, and all schema migrations. This created silent failures and inconsistencies in the database state.
What: Replaced direct `sqlite3.connect()` calls with `get_connection()` from `src.utils.db` and removed the duplicate `DB_PATH` constant, ensuring all database setup and migrations are properly applied. Added 5 tests to verify the connection uses proper WAL mode, includes migrations, and correctly processes data.
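The shape of such a connection factory, sketched under assumptions (the real `get_connection()` in `src.utils.db` also loads the sqlite-vec extension and runs migrations; the path shown is illustrative):

```python
import sqlite3

DB_PATH = "oracle.db"  # illustrative default; the real path is project config

def get_connection(path: str = DB_PATH) -> sqlite3.Connection:
    """Centralized factory: every caller gets WAL mode (and, in the real
    code, extensions and schema migrations) instead of a bare
    sqlite3.connect() with none of that setup."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # run_migrations(conn)  # applied here in the real code
    return conn
```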
#45 docs: update CLAUDE.md with entity system and exports
Why: The CLAUDE.md documentation was missing key information about the entity system, export functionality, and infrastructure scripts that are critical to understanding the codebase architecture.
What: Added comprehensive documentation for the entity extraction system (entities and entity_mentions tables), backfill scripts, shared API client (claude.py), and updated the architecture section to include filter.py and export_site.py modules.
#44 feat: trends data export
Why: The oracle-pi application needed trend data to display to users, including topic volume, trending entities, and source quality metrics. This PR adds the capability to automatically generate and export this trends data.
What: A new `export_trends()` function was added to `export_site.py` that generates a `trends.ts` file containing daily topic volume, top 20 trending entities over a 7-day window, and a source quality leaderboard. The function is integrated into the main export pipeline alongside glossary and changelog exports.
#43 feat: entity backfill + description generation scripts
Why: The system lacked historical entity data and descriptions for previously scored stories. This PR adds automated backfill capabilities to populate missing entity information retroactively.
What: Added two scripts: one to extract entities from historical stories via Anthropic's Haiku API, and another to generate concise descriptions for entities lacking them, with optional re-generation at mention thresholds. Also updated Claude model IDs to latest versions.
#42 feat: glossary entity extraction + pipeline improvements
Why: The system needed a way to automatically extract and track entities (people, companies, products, etc.) during the scoring process without incurring additional API calls, and to provide structured glossary data to the oracle-pi frontend.
What: Added a glossary system with new `entities` and `entity_mentions` database tables that auto-migrate on connection, integrated entity extraction into the scoring pipeline via `store_entities()`, and built an export function that generates a `glossary.ts` file for the frontend, while also improving pipeline metrics and lowering relevance thresholds.
#38 feat: add site_sentence and site_tag columns to SCHEMA and migration
Why: Fresh database instances and CI runs were breaking because `export_site.py` and `predictions.py` expected `site_sentence` and `site_tag` columns that didn't exist. These columns only existed in production due to a prior successful run, making the codebase fragile for new installs.
What: Added `site_sentence TEXT` and `site_tag TEXT` columns to the schema DDL in `init_db.py` and implemented a migration function `ensure_site_classification_columns()` following the existing pattern to add missing columns to legacy databases.
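SQLite has no `ADD COLUMN IF NOT EXISTS`, so the usual migration pattern is to inspect `PRAGMA table_info` first. A sketch, assuming the columns live on a `stories` table (the function name is from the entry; the table name is an assumption):

```python
import sqlite3

def ensure_site_classification_columns(conn: sqlite3.Connection) -> None:
    """Add site_sentence/site_tag to legacy databases that predate them.

    Checking PRAGMA table_info first makes the migration idempotent.
    """
    existing = {row[1] for row in conn.execute("PRAGMA table_info(stories)")}
    for column in ("site_sentence", "site_tag"):
        if column not in existing:
            conn.execute(f"ALTER TABLE stories ADD COLUMN {column} TEXT")
    conn.commit()
```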
#34 feat: validate and normalise source config JSON at add time
Why: Source configuration errors were silently breaking the pipeline at fetch time instead of failing immediately. This PR adds early validation to catch malformed configs at source creation time.
What: Added a `_validate_config()` helper that parses JSON, rejects non-objects, and applies type-specific validation rules (e.g., for `hn_api`: checking min_points and max_stories constraints). Validated configs are normalized and stored in the DB before insertion, with 14 new tests covering all validation branches.
#33 feat: reject unsupported source types at add time (closes #17)
Why: The system was allowing invalid source types to be added to the database, creating invalid rows. This change ensures that only supported types are accepted by validating input at the point of addition.
What: Added validation in `add_source()` to check the `--type` parameter against `SOURCE_TYPES` before any database insert, with corresponding test coverage for both valid and invalid type scenarios.
#32 fix: FIFO queue ordering — flip ORDER BY fetched_at to ASC
Why: The queue was operating as LIFO instead of FIFO because queries ordered by fetched_at DESC, causing newer stories to always jump ahead. This meant older stories could get stuck indefinitely while new arrivals were continuously prioritized.
What: Reversed the ORDER BY clause from DESC to ASC in four database query functions (get_next_raw_story, get_next_unscored_story, and their bulk helpers) to ensure oldest stories are processed first, with added tests verifying correct ordering.
#28 fix: guarantee temp file cleanup on timeout and bad exit (closes #22)
Why: Temporary files created during story processing were leaking when subprocess timeouts or non-zero exit codes occurred, causing orphaned files to accumulate indefinitely on systems processing dozens of stories per hour.
What: Restructured cleanup logic by wrapping the entire function body in an outer try/finally block that unconditionally deletes the temp file, ensuring cleanup runs regardless of whether TimeoutExpired, bad exit codes, or normal completion occurs.
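The restructured shape, as a sketch (function name, arguments, and the payload format are illustrative, not the project's actual code):

```python
import os
import subprocess
import tempfile

def run_with_temp(payload: str, cmd: list, timeout: int = 60) -> int:
    """Write payload to a temp file, run cmd against it, return the exit
    code. The outer finally deletes the temp file on every path:
    TimeoutExpired, non-zero exit, or normal completion."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w") as f:
            f.write(payload)
        result = subprocess.run(cmd + [path], timeout=timeout)
        return result.returncode
    finally:
        os.unlink(path)  # runs on timeout, error, or success alike
```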
#21 feat: decouple digest delivery state from story status
Why: Stories marked as 'digested' disappeared from future digest queries, causing weekly digests to be incomplete by construction. The delivery state needed to be decoupled from story status to allow the same story to appear in multiple digest types.
What: Stories now remain permanently in 'scored' status, with delivery tracking moved exclusively to the `digest_stories` linking table. `get_digest_stories()` now accepts a `digest_type` parameter and uses a JOIN to exclude already-delivered stories, while `mark_stories_digested()` only inserts into the linking table without mutating story status.
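The exclusion JOIN can be sketched as an anti-join via `LEFT JOIN … IS NULL` (column names here are assumptions; the real query lives in the DB helpers):

```python
import sqlite3

def get_digest_stories(conn: sqlite3.Connection, digest_type: str) -> list:
    """Stories stay 'scored' forever; the linking table alone records
    delivery. The LEFT JOIN + IS NULL filter excludes stories already
    delivered for this digest_type, so the same story can still appear
    in a different digest type."""
    return conn.execute(
        """
        SELECT s.*
        FROM stories s
        LEFT JOIN digest_stories d
          ON d.story_id = s.id AND d.digest_type = ?
        WHERE s.status = 'scored' AND d.story_id IS NULL
        """,
        (digest_type,),
    ).fetchall()
```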
#20 feat: mark batch-scored failures as score_failed instead of retrying forever
Why: Stories that failed scoring (due to timeouts, malformed JSON, or invalid schemas) were never marked as failed in the database, causing them to be retried indefinitely on every future run. This PR fixes that by persisting failure states to SQLite.
What: Modified `score_all()` in `src/score.py` to call `update_story_status(..., 'score_failed')` for all three failure modes and commit per-story instead of batching commits. Added 6 new tests to cover failure paths and mixed batch scenarios.
#19 feat: replace full-queue scans in process_one with single-row selectors
Why: The `process_one` worker was scanning entire result sets from the database every minute just to pick one item, causing unnecessary O(n) queries that scaled with queue depth. This was inefficient for logging and work selection in the steady-state cron path.
What: Introduced four new database helpers (`get_next_raw_story`, `get_next_unscored_story`, `count_raw_stories`, `count_unscored_stories`) that use LIMIT 1 SELECTs and COUNT(*) queries instead of materializing full result sets, reducing the cron path to O(1) operations while preserving the old bulk helpers for other callers.
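Two of the four helpers, sketched against an assumed `stories` table (ordering shown ASC per the later FIFO fix in #32):

```python
import sqlite3

def get_next_raw_story(conn: sqlite3.Connection):
    """Pick exactly one item of work: LIMIT 1 instead of materializing
    the whole queue in Python."""
    return conn.execute(
        "SELECT * FROM stories WHERE status = 'raw' ORDER BY fetched_at ASC LIMIT 1"
    ).fetchone()

def count_raw_stories(conn: sqlite3.Connection) -> int:
    """Queue depth for logging via COUNT(*), again without pulling rows."""
    return conn.execute(
        "SELECT COUNT(*) FROM stories WHERE status = 'raw'"
    ).fetchone()[0]
```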
#15 feat: fail poisoned raw stories out of process_one queue
Why: Raw stories were getting stuck in the `raw` state when filtering operations failed, preventing the system from progressing. This change ensures failed stories are properly marked and logged so they don't block the queue.
What: Modified the `process_one` queue handler to mark raw stories as `filter_failed` instead of `raw` when filtering raises an exception, and added logging with story context for diagnostics. Added test coverage for both the failure path and recovery on subsequent runs.
#10 fix: normalize source timestamps to aware UTC
Why: Feed fetching was crashing due to timezone-aware `since` values from SQLite being compared against naive UTC datetimes from source adapters. This prevented many feeds from being processed during fetch operations.
What: RSS and Hacker News source adapters were updated to create timezone-aware UTC datetimes instead of naive ones, with new regression tests added to prevent recurrence.
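The failure class is easy to reproduce: Python refuses to order an aware datetime against a naive one. A sketch of the aware-UTC construction the adapters switched to:

```python
from datetime import datetime, timezone

def utc_now() -> datetime:
    """Aware 'now' in UTC. Comparing this against another aware datetime
    works; comparing it against a naive datetime (e.g. from the old
    datetime.utcnow()) raises TypeError, which is the crash this fix
    removed from the fetch path."""
    return datetime.now(timezone.utc)
```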
#9 feat: add single-writer queue for Tech Oracle DB writes
Why: The Tech Oracle DB was experiencing `database is locked` errors due to concurrent SQLite writes. This PR addresses that by introducing a serialization layer to ensure only one process writes to the database at a time.
What: A single-writer queue architecture was implemented using daily JSONL files with file-level locking, along with a dedicated drainer process that reads queue events in order and applies them idempotently to SQLite. Fetch operations now emit events to the queue instead of writing directly, while a new drain step in the processing pipeline ensures all queued mutations are applied before downstream operations.
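The write side of such a queue can be sketched as below (names and event shape are illustrative; `fcntl.flock` makes this POSIX-only, and the drainer is not shown):

```python
import fcntl
import json
from datetime import datetime, timezone
from pathlib import Path

def emit_event(queue_dir: Path, event: dict) -> None:
    """Append one JSON line to today's queue file under an exclusive
    flock. Concurrent fetchers serialize on the lock instead of touching
    SQLite, so 'database is locked' cannot occur on this path; the
    drainer later applies events in order."""
    day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    path = queue_dir / f"{day}.jsonl"
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # released automatically on close
        f.write(json.dumps(event) + "\n")
```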
#7 feat: track source health for broken feeds
Why: Broken or failing feeds need to be automatically managed to prevent wasted resources on persistently problematic sources. This change addresses the need to track and handle feeds that consistently fail to fetch.
What: Added health tracking to sources to monitor consecutive fetch failures, last errors, and empty-feed streaks, with automatic feed disabling after 5 consecutive failures and health status exposure via a --health flag. Includes schema migrations and comprehensive test coverage for the health tracking workflow.
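The bookkeeping can be sketched in memory as follows; the real code persists these fields to source rows via the schema migrations mentioned above, and the field names here are assumptions:

```python
FAILURE_DISABLE_THRESHOLD = 5  # from the entry: disable after 5 consecutive failures

def record_fetch_result(source: dict, error=None) -> dict:
    """Update a source's health fields after a fetch attempt. A success
    resets the streak; repeated failures accumulate until the source is
    auto-disabled at the threshold."""
    if error is None:
        source["consecutive_failures"] = 0
        source["last_error"] = None
    else:
        source["consecutive_failures"] = source.get("consecutive_failures", 0) + 1
        source["last_error"] = error
        if source["consecutive_failures"] >= FAILURE_DISABLE_THRESHOLD:
            source["enabled"] = False
    return source
```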
#6 test: add core pipeline coverage
Why: The repository lacked test coverage for core pipeline functionality, making it difficult to validate code quality and catch regressions. This PR addresses that gap by establishing a testing foundation.
What: Added pytest-based unit tests covering the dedup, db, fetch, filter, score, digest, and process_one modules with mocked external calls, plus shared temp DB fixtures and formatting tests. Integrated pytest-cov for coverage reporting with configuration to track coverage across key utility modules.
#5 feat: add semantic embeddings with sqlite-vec
Why: The system needed semantic search capabilities to find similar stories. This PR adds that functionality by implementing vector embeddings.
What: Integrated sqlite-vec with all-MiniLM-L6-v2 embeddings to store and query story vectors locally. Added backfill scripts, similarity search queries, and embedding generation during the story processing pipeline with comprehensive test coverage.