fd32be729f531eba12fc36f3338aac1eddd22929
Each archive scraper now has its own class with hardcoded URL and parsing logic; config only carries auto_queue, timeout, and rate_limit_seconds.

- html_scraper: refactor to base class with public shared utilities (YEAR_RE, AUTHOR_PREFIX_PAT, cls_inner_texts, img_alts)
- rusneb.py (new): RusnebPlugin extracts year per list item rather than globally, eliminating wrong page-level dates
- alib.py (new): AlibPlugin extracts year from within each <p><b> entry rather than globally, fixing nonsensical year values
- shpl.py (new): ShplPlugin retains the dead ШПИЛ endpoint with hardcoded params; config type updated from html_scraper to shpl
- config: remove config: subsections from rusneb, alib_web, shpl entries; update type fields to rusneb, alib_web, shpl respectively
- plugins/__init__.py: register new specific types, remove html_scraper
- tests: use specific plugin classes; assert all CandidateRecord fields (source, title, author, year, isbn, publisher) with appropriate constraints

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bookshelf
Photo-based book cataloger. Organizes books in a Room -> Cabinet -> Shelf -> Book hierarchy. Photographs shelf spines; AI plugins identify books and look up metadata in library archives.
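As a rough mental model, the Room -> Cabinet -> Shelf -> Book hierarchy can be pictured as nested containers. The sketch below is an assumption for illustration only: the class fields and the count_books helper are hypothetical and are not taken from the project's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    # Hypothetical field: empty until a book has been identified.
    title: str = ""


@dataclass
class Shelf:
    name: str
    books: list[Book] = field(default_factory=list)


@dataclass
class Cabinet:
    name: str
    shelves: list[Shelf] = field(default_factory=list)


@dataclass
class Room:
    name: str
    cabinets: list[Cabinet] = field(default_factory=list)


def count_books(room: Room) -> int:
    """Walk Room -> Cabinet -> Shelf and count the books underneath."""
    return sum(
        len(shelf.books)
        for cabinet in room.cabinets
        for shelf in cabinet.shelves
    )
```

Each level owns a list of the level below it, which is what lets one photo of a cabinet or shelf be split into the items it contains.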
Requirements
- Python 3.11+, Poetry
- An OpenAI-compatible API endpoint (OpenRouter recommended)
Setup
poetry install
Create config/credentials.user.yaml with your API key:
credentials:
  openrouter:
    api_key: "sk-or-your-key-here"
Start the server:
poetry run serve
Open http://localhost:8000 in a browser.
Configuration
Config is loaded from config/*.default.yaml merged with config/*.user.yaml overrides. User files take precedence; dicts merge recursively, lists replace entirely. User files are gitignored.
| File | Purpose |
|---|---|
| credentials.default.yaml | API endpoints and keys |
| models.default.yaml | Model selection and prompts per AI function |
| functions.default.yaml | Plugin definitions (boundary detection, text recognition, identification, archive search) |
| ui.default.yaml | UI display settings |
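The merge rule stated above (dicts merge recursively, lists replace entirely, user values win) can be sketched in a few lines. This is a minimal illustration, assuming both files have already been parsed into plain dicts; merge_config is a hypothetical name and may not match the project's loader.

```python
from typing import Any


def merge_config(default: dict[str, Any], user: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict: the defaults overlaid with the user overrides."""
    merged = dict(default)
    for key, user_value in user.items():
        default_value = merged.get(key)
        if isinstance(default_value, dict) and isinstance(user_value, dict):
            # Both sides are dicts: merge key by key, recursively.
            merged[key] = merge_config(default_value, user_value)
        else:
            # Scalars and lists: the user value replaces the default entirely.
            merged[key] = user_value
    return merged
```

Under this rule, a user file that sets only models.vl_recognize.model keeps the default credentials entry for that function, whereas a user-supplied list discards the default list rather than extending it.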
To use a different model for a function, create config/models.user.yaml:
models:
  vl_recognize:
    credentials: openrouter
    model: "google/gemini-2.0-flash"
To add an alternative provider, add it to config/credentials.user.yaml and reference it in models.user.yaml.
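The link between a model entry and its provider is the name in the credentials field. A sketch of how such a reference might be resolved follows. It assumes the merged config is a dict with top-level credentials and models keys, as in the YAML snippets above; resolve_model is a hypothetical helper, not the project's API.

```python
from typing import Any


def resolve_model(config: dict[str, Any], function_name: str) -> dict[str, Any]:
    """Join a model entry with the credentials block it names."""
    model_entry = config["models"][function_name]
    cred_name = model_entry["credentials"]
    try:
        creds = config["credentials"][cred_name]
    except KeyError:
        # The model points at a provider that no credentials file defines.
        raise KeyError(
            f"model {function_name!r} references unknown credentials {cred_name!r}"
        ) from None
    return {"model": model_entry["model"], **creds}
```

Failing loudly on an unknown credentials name makes a typo in models.user.yaml visible at load time instead of at the first API call.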
Usage
- Add a room, then cabinets and shelves using the tree in the sidebar.
- Upload a photo of each cabinet or shelf.
- Drag boundary lines on the photo to segment shelves (or books within a shelf). The AI boundary detector can suggest splits automatically.
- Run the text recognizer on a book to extract spine text, then the book identifier to match it against library archives.
- Review and approve AI suggestions in the detail panel. Use the batch button to process all unidentified books at once.
- On mobile, use the photo queue button on a cabinet or shelf to photograph books one by one with automatic AI processing.
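The boundary-line step amounts to turning a set of cut positions into consecutive segments. The sketch below assumes boundaries are stored as fractions of the photo's length between 0 and 1; that representation and the segments_from_boundaries name are assumptions made for illustration, since the storage format is not described here.

```python
def segments_from_boundaries(boundaries: list[float]) -> list[tuple[float, float]]:
    """Turn interior boundary positions (fractions in 0..1) into (start, end) spans."""
    # Drop duplicates and anything on or outside the photo edges, then sort.
    cuts = sorted({b for b in boundaries if 0.0 < b < 1.0})
    edges = [0.0, *cuts, 1.0]
    # Pair each edge with the next one to form consecutive segments.
    return list(zip(edges, edges[1:]))
```

N boundary lines always yield N + 1 segments, so a photo with no lines is treated as a single shelf or book.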
Development
poetry run presubmit # black check + flake8 + pyright + pytest + JS tests
poetry run fmt # auto-format Python with black
npm install # install JS dev tools (ESLint, Prettier) — requires network
npm run lint # ESLint
npm run fmt # Prettier
Tests are in tests/ (Python) and tests/js/ (JavaScript).