Petr Polezhaev fd32be729f Replace config-driven HtmlScraperPlugin with specific archive classes
Each archive scraper now has its own class with hardcoded URL and parsing
logic; config only carries auto_queue, timeout, and rate_limit_seconds.

- html_scraper: refactor to base class with public shared utilities
  (YEAR_RE, AUTHOR_PREFIX_PAT, cls_inner_texts, img_alts)
- rusneb.py (new): RusnebPlugin extracts year per list item rather than
  globally, eliminating wrong page-level dates
- alib.py (new): AlibPlugin extracts year from within each <p><b> entry
  rather than globally, fixing nonsensical year values
- shpl.py (new): ShplPlugin retains the dead ШПИЛ endpoint with hardcoded
  params; config type updated from html_scraper to shpl
- config: remove config: subsections from rusneb, alib_web, shpl entries;
  update type fields to rusneb, alib_web, shpl respectively
- plugins/__init__.py: register new specific types, remove html_scraper
- tests: use specific plugin classes; assert all CandidateRecord fields
  (source, title, author, year, isbn, publisher) with appropriate constraints

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 00:03:17 +03:00
2026-03-09 14:17:13 +03:00

Bookshelf

Photo-based book cataloger. It organizes books in a Room -> Cabinet -> Shelf -> Book hierarchy. You photograph shelf spines, and AI plugins identify the books and look up their metadata in library archives.

Requirements

  • Python 3.11+, Poetry
  • An OpenAI-compatible API endpoint (OpenRouter recommended)

Setup

poetry install

Create config/credentials.user.yaml with your API key:

credentials:
  openrouter:
    api_key: "sk-or-your-key-here"

Start the server:

poetry run serve

Open http://localhost:8000 in a browser.

Configuration

Config is loaded from config/*.default.yaml merged with config/*.user.yaml overrides. User files take precedence; dicts merge recursively, lists replace entirely. User files are gitignored.

File                      Purpose
credentials.default.yaml  API endpoints and keys
models.default.yaml       Model selection and prompts per AI function
functions.default.yaml    Plugin definitions (boundary detection, text recognition, identification, archive search)
ui.default.yaml           UI display settings
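
The merge rule stated above (user values take precedence, dicts merge recursively, lists replace entirely) can be illustrated with a minimal sketch. This is not the project's actual loader: the function name `merge_config`, the `plugins` key, and the `default-model` value are hypothetical and exist only for the example.

```python
from copy import deepcopy
from typing import Any


def merge_config(default: dict[str, Any], user: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with `user` layered over `default`.

    Hypothetical helper: dicts merge key by key, while any other
    value (including a list) is replaced wholesale by the user value.
    """
    merged = deepcopy(default)
    for key, user_value in user.items():
        base_value = merged.get(key)
        if isinstance(base_value, dict) and isinstance(user_value, dict):
            # Both sides are dicts: recurse so untouched keys survive.
            merged[key] = merge_config(base_value, user_value)
        else:
            # Scalars and lists: the user value replaces the default.
            merged[key] = deepcopy(user_value)
    return merged


# Illustrative data only; "plugins" and "default-model" are made up.
defaults = {
    "models": {
        "vl_recognize": {"credentials": "openrouter", "model": "default-model"}
    },
    "plugins": ["a", "b"],
}
overrides = {
    "models": {"vl_recognize": {"model": "google/gemini-2.0-flash"}},
    "plugins": ["c"],
}

merged = merge_config(defaults, overrides)
print(merged)
```

In this sketch the override changes only `model`, so the `credentials` key from the defaults is kept, while the `plugins` list is replaced rather than concatenated.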

To use a different model for a function, create config/models.user.yaml:

models:
  vl_recognize:
    credentials: openrouter
    model: "google/gemini-2.0-flash"

To add an alternative provider, add it to config/credentials.user.yaml and reference it in models.user.yaml.
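
How a model entry points at a credentials entry can be sketched as a simple lookup. This assumes only the key layout shown in the YAML snippets above (`credentials.<provider>.api_key` and `models.<function>.credentials` / `model`); the function `resolve_model`, the `local` provider, and the model name `some-local-model` are hypothetical, and the real application may resolve these differently.

```python
from typing import Any


def resolve_model(config: dict[str, Any], function_name: str) -> dict[str, Any]:
    """Hypothetical helper: join a model entry with the credentials it names."""
    model_entry = config["models"][function_name]
    provider = model_entry["credentials"]
    providers = config["credentials"]
    if provider not in providers:
        raise KeyError(
            f"models.{function_name} references unknown credentials {provider!r}"
        )
    # Combine the model name with the provider's settings (e.g. api_key).
    return {"provider": provider, "model": model_entry["model"], **providers[provider]}


# Illustrative merged config; "local" and "some-local-model" are made up.
config = {
    "credentials": {
        "openrouter": {"api_key": "sk-or-your-key-here"},
        "local": {"api_key": "local-key"},
    },
    "models": {
        "vl_recognize": {"credentials": "local", "model": "some-local-model"},
    },
}

resolved = resolve_model(config, "vl_recognize")
print(resolved)
```

A lookup like this fails early when `models.user.yaml` names a provider that was never added to the credentials file.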

Usage

  1. Add a room, then cabinets and shelves using the tree in the sidebar.
  2. Upload a photo of each cabinet or shelf.
  3. Drag boundary lines on the photo to segment shelves (or books within a shelf). The AI boundary detector can suggest splits automatically.
  4. Run the text recognizer on a book to extract spine text, then the book identifier to match it against library archives.
  5. Review and approve AI suggestions in the detail panel. Use the batch button to process all unidentified books at once.
  6. On mobile, use the photo queue button on a cabinet or shelf to photograph books one by one with automatic AI processing.

Development

poetry run presubmit   # black check + flake8 + pyright + pytest + JS tests
poetry run fmt         # auto-format Python with black
npm install            # install JS dev tools (ESLint, Prettier) — requires network
npm run lint           # ESLint
npm run fmt            # Prettier

Tests are in tests/ (Python) and tests/js/ (JavaScript).

Description
bookshelf scan and management program (ai-written)