# iTunes/Music → app DB date & stats backfill (one-time)

Date: 2026-05-30

## Problem

`ScannerService.extractMetadata` sets `dateAdded: Date()` at scan time (`Music/Services/ScannerService.swift:186`), so every track's "added date" in the app DB is really its *scan* date, not the date the user originally added it in Apple Music. The user wants the **true `Date Added`** (and, since Music.app tracks them too, `Play Count`, `Rating`, and last-played) copied from their Apple Music library into the app's SQLite database.

## Context (verified)

- The app is **sandboxed**. Current `PRODUCT_BUNDLE_IDENTIFIER` is `com.staxriver.mu` (HEAD and working tree; not part of the uncommitted diff). The live DB is therefore at:
  `~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite`
- **The real app and real library are on a different computer.** This machine only has a 3-track dev DB. The script must be portable and is intended to run on the other Mac.
- The user confirmed the audio files are the **same files** Apple Music references (they live inside Apple Music's media folder, e.g. `~/Music/Music/Media.localized/Music/...`). So the join key is the **file path**.
- `tracks.dateAdded` is a GRDB `.datetime` column, stored as the string `YYYY-MM-DD HH:MM:SS.SSS` in UTC (confirmed from existing rows, e.g. `2026-05-24 06:46:01.713`). GRDB is lenient on read, so `....000` round-trips.
- App rating scale is **0–5 stars** (`TrackTableView.swift:284` renders `String(repeating: "★", count: track.rating)`). Music.app stores 0–100, so map `// 20`.

## Approach

A single **stdlib-only Python 3 script**, run once. Source of truth is the Music.app **File ▸ Library ▸ Export Library…** XML plist (chosen over live AppleScript: no Automation prompt, no timeouts, trivially parseable with `plistlib`). On matched tracks it does a **blunt overwrite** of all four fields, with Music.app as the source of truth.
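To illustrate why the XML export is easy to consume, here is a minimal sketch of reading it with `plistlib`. It assumes the export has the usual top-level `Tracks` dictionary keyed by track ID; the function name `load_library_tracks` is hypothetical and is not taken from the actual script.

```python
import plistlib
from pathlib import Path


def load_library_tracks(xml_path):
    """Return the track dicts from a Music.app library export that have a Location.

    Hypothetical helper: assumes the export's top-level dict has a "Tracks"
    key mapping track IDs to per-track dicts.
    """
    with Path(xml_path).open("rb") as fh:
        library = plistlib.load(fh)
    tracks = library.get("Tracks", {})
    # Entries without a Location (streaming-only tracks) cannot be joined
    # on file path, so they are dropped here.
    return [t for t in tracks.values() if "Location" in t]
```

`plistlib` parses `<date>` elements into naive `datetime` objects and `<integer>` elements into `int`, so `Date Added`, `Play Date UTC`, `Play Count`, and `Rating` arrive already typed.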
## Matching (the bug-prone part)

Join Music.app `Location` to `tracks.fileURL` on a **normalized decoded POSIX path**, not raw URL strings. `norm_path()`:

1. strip leading `file://`, then optional `localhost` host segment,
2. percent-decode (`urllib.parse.unquote`),
3. `unicodedata.normalize("NFC", …)`, which neutralizes the accented-filename NFC/NFD mismatch between APFS storage and the two URL string sources,
4. strip a trailing slash.

Music tracks with no `Location` (Apple Music streaming entries) are skipped.

## Field mapping (matched rows only; blunt overwrite)

| Column         | XML key          | Rule                                                                     |
|----------------|------------------|--------------------------------------------------------------------------|
| `dateAdded`    | `Date Added`     | `%Y-%m-%d %H:%M:%S.000` UTC. If absent, keep existing (col is NOT NULL). |
| `playCount`    | `Play Count`     | integer, `0` if absent.                                                  |
| `rating`       | `Rating` (0–100) | `// 20` → 0–5, `0` if absent.                                            |
| `lastPlayedAt` | `Play Date UTC`  | same date format, or `NULL` if absent.                                   |

## Safety

- **Dry-run by default**: prints match rate, a sample of before→after changes, and the counts + samples of unmatched-in-DB and unmatched-in-XML. Writes nothing.
- `--apply`: first copies `db.sqlite` + `-wal` + `-shm` to a timestamped backup, then performs all writes in a single transaction, then `PRAGMA wal_checkpoint(TRUNCATE)`. Reversible by restoring the backup.
- The app must be **quit** before running so the sandbox DB isn't mid-write.

## CLI

```
python3 scripts/backfill_itunes_dates.py --xml [--db ] [--apply] [--self-test]
```

- Default `--db`: `~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite` computed from `$HOME`, so it resolves on the other Mac too.

## Testing

`scripts/test_backfill_itunes_dates.py` (stdlib `unittest`):

- `norm_path`: NFC/NFD equivalence, `file://localhost/` form, percent-encoding, filenames with spaces/`#`/parentheses/apostrophes.
- `build_updates`: date formatting, rating `// 20`, playCount & lastPlayed present/absent, unmatched-row handling.
- Integration: a temp SQLite DB with the real `tracks` schema seeded with the user's actual 3 track paths + scan dates and a synthetic Library.xml → `apply` → assert rows updated.

## Delivery

Script + test live in the repo under `scripts/`. The user commits (via `/commit`), pushes, pulls on the real machine, runs **File ▸ Library ▸ Export Library…** there, then runs the dry-run, eyeballs the match rate, and re-runs with `--apply`.

## Out of scope

Scanning the library into the app (the user does that in-app first), ongoing/automatic sync, and non-file (streaming) tracks.
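
## Appendix: matching and mapping sketch

The four `norm_path()` steps and the field-mapping rules described earlier could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: `map_fields` is a hypothetical name, and the sketch assumes `plistlib` hands back naive `datetime` values that are already in UTC.

```python
import unicodedata
from urllib.parse import unquote

DATE_FMT = "%Y-%m-%d %H:%M:%S.000"  # matches the stored GRDB string form


def norm_path(url):
    """Reduce a file URL string to a decoded, NFC-normalized POSIX path."""
    s = url
    if s.startswith("file://"):
        s = s[len("file://"):]
        # Optional host segment, as in file://localhost/Users/...
        if s.startswith("localhost/"):
            s = s[len("localhost"):]
    s = unquote(s)                       # percent-decode
    s = unicodedata.normalize("NFC", s)  # unify NFC/NFD spellings
    if len(s) > 1 and s.endswith("/"):
        s = s[:-1]                       # strip a trailing slash
    return s


def fmt_date(dt):
    """Format a datetime (assumed UTC) as the DB string, or None if absent."""
    return dt.strftime(DATE_FMT) if dt is not None else None


def map_fields(xml_track, existing_date_added):
    """Hypothetical helper: build the four column values for one matched row."""
    return {
        # dateAdded is NOT NULL, so fall back to the existing value.
        "dateAdded": fmt_date(xml_track.get("Date Added")) or existing_date_added,
        "playCount": int(xml_track.get("Play Count", 0)),
        "rating": int(xml_track.get("Rating", 0)) // 20,  # 0-100 -> 0-5
        "lastPlayedAt": fmt_date(xml_track.get("Play Date UTC")),
    }
```

With this sketch, `file://localhost/.../Caf%C3%A9.mp3` and `file:///.../Cafe%CC%81.mp3` normalize to the same path, which is the case the NFC step exists for.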