# iTunes/Music → app DB date & stats backfill (one-time)

Date: 2026-05-30

## Problem

`ScannerService.extractMetadata` sets `dateAdded: Date()` at scan time
(`Music/Services/ScannerService.swift:186`), so every track's "added date" in the
app DB is really its *scan* date, not the date the user originally added it in
Apple Music. The user wants the **true `Date Added`** (and, since Music.app tracks
them too, `Play Count`, `Rating`, and last-played) copied from their Apple Music
library into the app's SQLite database.

## Context (verified)

- The app is **sandboxed**. Current `PRODUCT_BUNDLE_IDENTIFIER` is `com.staxriver.mu`
  (HEAD and working tree; not part of the uncommitted diff). The live DB is therefore at:
  `~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite`
- **The real app and real library are on a different computer.** This machine only has a
  3-track dev DB. The script must be portable and is intended to run on the other Mac.
- The user confirmed the audio files are the **same files** Apple Music references (they
  live inside Apple Music's media folder, e.g.
  `~/Music/Music/Media.localized/Music/...`). So the join key is the **file path**.
- `tracks.dateAdded` is a GRDB `.datetime` column, stored as the string
  `YYYY-MM-DD HH:MM:SS.SSS` in UTC (confirmed from existing rows, e.g.
  `2026-05-24 06:46:01.713`). GRDB is lenient on read, so a value written with a
  literal `.000` fractional part round-trips.
- App rating scale is **0–5 stars** (`TrackTableView.swift:284` renders
  `String(repeating: "★", count: track.rating)`). Music.app stores 0–100, so map with
  integer division by 20 (`// 20`).

## Approach

A single **stdlib-only Python 3 script**, run once. Its input is the Music.app
**File ▸ Library ▸ Export Library…** XML plist (chosen over live AppleScript: no
Automation prompt, no timeouts, trivially parseable with `plistlib`). On matched tracks
it does a **blunt overwrite** of all four fields, treating Music.app as the source of truth.

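
As a rough illustration of the parsing step, the sketch below reads an exported library with `plistlib`. It assumes the usual export layout, a top-level `Tracks` dict keyed by track ID; the function name `load_local_tracks` is hypothetical and not taken from the script.

```python
import plistlib


def load_local_tracks(xml_path):
    """Return the per-track dicts from an Export Library XML plist.

    Assumption: tracks live in a top-level "Tracks" dict keyed by track ID.
    Entries without a "Location" (streaming-only tracks) are dropped here.
    """
    with open(xml_path, "rb") as f:
        library = plistlib.load(f)
    tracks = library.get("Tracks", {})
    return [t for t in tracks.values() if "Location" in t]
```

`plistlib` parses `<date>` elements into naive `datetime` objects by default, so `Date Added` and `Play Date UTC` arrive ready for `strftime`.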
## Matching (the bug-prone part)

Join Music.app `Location` to `tracks.fileURL` on a **normalized decoded POSIX path**,
not raw URL strings. `norm_path()`:

1. strips the leading `file://`, then an optional `localhost` host segment,
2. percent-decodes (`urllib.parse.unquote`),
3. applies `unicodedata.normalize("NFC", …)`, which neutralizes the accented-filename
   NFC/NFD mismatch between APFS storage and the two URL string sources,
4. strips a trailing slash.

Music tracks with no `Location` (Apple Music streaming entries) are skipped.

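
The four normalization steps can be sketched as follows. This is a minimal reading of the list above, not the script's actual code; in particular, keeping a bare `/` intact when stripping the trailing slash is an assumption.

```python
import unicodedata
from urllib.parse import unquote


def norm_path(url_or_path):
    """Normalize a file URL (or plain path) to a comparable POSIX path."""
    s = url_or_path
    # 1. Strip the scheme, then an optional "localhost" host segment.
    if s.startswith("file://"):
        s = s[len("file://"):]
        if s.startswith("localhost/"):
            s = s[len("localhost"):]
    # 2. Percent-decode (%20 -> space, %23 -> '#', UTF-8 sequences -> chars).
    s = unquote(s)
    # 3. Compose to NFC so "e" + combining acute equals precomposed "é".
    s = unicodedata.normalize("NFC", s)
    # 4. Strip one trailing slash (assumption: leave a bare "/" alone).
    if len(s) > 1 and s.endswith("/"):
        s = s[:-1]
    return s
```

With this shape, `file://localhost/Users/a/My%20Song.mp3` and `file:///Users/a/My%20Song.mp3` both normalize to `/Users/a/My Song.mp3`, so either URL form can be used as a dict key for the join.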
## Field mapping (matched rows only; blunt overwrite)

| Column         | XML key          | Rule                                                                     |
|----------------|------------------|--------------------------------------------------------------------------|
| `dateAdded`    | `Date Added`     | `%Y-%m-%d %H:%M:%S.000` UTC. If absent, keep existing (col is NOT NULL). |
| `playCount`    | `Play Count`     | integer, `0` if absent.                                                  |
| `rating`       | `Rating` (0–100) | `// 20` → 0–5, `0` if absent.                                            |
| `lastPlayedAt` | `Play Date UTC`  | same date format, or `NULL` if absent.                                   |

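
A minimal sketch of the table's rules applied to one matched track. The helper name `build_update` and its signature are hypothetical, and the code assumes `plistlib` has already parsed the XML dates into naive UTC `datetime` objects.

```python
from datetime import datetime

# Milliseconds are written as a literal ".000"; the export has second precision.
DATE_FMT = "%Y-%m-%d %H:%M:%S.000"


def build_update(track, existing_date_added):
    """Map one Music.app track dict to the four DB column values."""
    date_added = track.get("Date Added")
    last_played = track.get("Play Date UTC")
    return {
        # dateAdded is NOT NULL, so fall back to the value already in the DB.
        "dateAdded": date_added.strftime(DATE_FMT) if date_added else existing_date_added,
        "playCount": int(track.get("Play Count", 0)),
        # 0-100 in Music.app -> 0-5 stars in the app.
        "rating": int(track.get("Rating", 0)) // 20,
        "lastPlayedAt": last_played.strftime(DATE_FMT) if last_played else None,
    }


example = build_update(
    {"Date Added": datetime(2019, 3, 4, 5, 6, 7), "Play Count": 12, "Rating": 80},
    existing_date_added="2026-05-24 06:46:01.713",
)
# example["dateAdded"] == "2019-03-04 05:06:07.000", example["rating"] == 4
```

Note that integer division floors any rating that is not a multiple of 20 (for example, 90 becomes 4 stars).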
## Safety

- **Dry-run by default**: prints match rate, a sample of before→after changes, and the
  counts + samples of unmatched-in-DB and unmatched-in-XML. Writes nothing.
- `--apply`: first copies `db.sqlite` + `-wal` + `-shm` to a timestamped backup, then
  performs all writes in a single transaction, then `PRAGMA wal_checkpoint(TRUNCATE)`.
  Reversible by restoring the backup.
- The app must be **quit** before running so the sandbox DB isn't mid-write.

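
The `--apply` sequence (backup, single transaction, checkpoint) might look roughly like this. The function names, the backup directory layout, and keying the `UPDATE` on the row's stored `fileURL` value are all assumptions made for illustration; the column names come from the field mapping above.

```python
import shutil
import sqlite3
import time
from pathlib import Path


def backup_db(db_path):
    """Copy db.sqlite plus any -wal/-shm siblings into a timestamped folder."""
    db_path = Path(db_path)
    backup_dir = db_path.parent / ("backup-" + time.strftime("%Y%m%d-%H%M%S"))
    backup_dir.mkdir(parents=True, exist_ok=True)
    for suffix in ("", "-wal", "-shm"):
        src = db_path.with_name(db_path.name + suffix)
        if src.exists():
            shutil.copy2(src, backup_dir / src.name)
    return backup_dir


def apply_updates(db_path, rows):
    """rows: (dateAdded, playCount, rating, lastPlayedAt, fileURL) tuples,
    where fileURL is the raw value stored in the matched DB row."""
    backup_dir = backup_db(db_path)
    conn = sqlite3.connect(str(db_path))
    try:
        # "with conn" commits on success and rolls back on any exception,
        # so either every row is updated or none is.
        with conn:
            conn.executemany(
                "UPDATE tracks SET dateAdded = ?, playCount = ?, rating = ?, "
                "lastPlayedAt = ? WHERE fileURL = ?",
                rows,
            )
        # Fold the WAL back into the main file once the commit is done.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    finally:
        conn.close()
    return backup_dir
```

Restoring is then a matter of copying the files in the backup folder back over the originals while the app is quit.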
## CLI

```
python3 scripts/backfill_itunes_dates.py --xml <Library.xml> [--db <path>] [--apply] [--self-test]
```

- Default `--db`: `~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite`,
  computed from `$HOME`, so it resolves on the other Mac too.

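
One way to express this interface with `argparse` is sketched below. The flag names and the default path come from the usage line above; `build_parser`, `DEFAULT_DB`, and the help strings are hypothetical, and treating `--xml` as required follows the usage line (how it interacts with `--self-test` is not specified here).

```python
import argparse
from pathlib import Path

# Resolved from the current user's home directory, so it works on any Mac.
DEFAULT_DB = (
    Path.home() / "Library" / "Containers" / "com.staxriver.mu" / "Data"
    / "Library" / "Application Support" / "Music" / "db.sqlite"
)


def build_parser():
    p = argparse.ArgumentParser(
        description="Backfill dates and stats from a Music.app library export."
    )
    p.add_argument("--xml", type=Path, required=True,
                   help="path to the exported Library.xml")
    p.add_argument("--db", type=Path, default=DEFAULT_DB,
                   help="path to the app's db.sqlite")
    p.add_argument("--apply", action="store_true",
                   help="write changes (default is a dry run)")
    p.add_argument("--self-test", action="store_true",
                   help="run the built-in checks")
    return p
```

Because `--apply` is a `store_true` flag, omitting it leaves `args.apply` as `False`, which keeps the dry run as the default behavior.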
## Testing

`scripts/test_backfill_itunes_dates.py` (stdlib `unittest`):

- `norm_path`: NFC/NFD equivalence, `file://localhost/` form, percent-encoding,
  filenames with spaces/`#`/parentheses/apostrophes.
- `build_updates`: date formatting, rating `// 20`, playCount & lastPlayed present/absent,
  unmatched-row handling.
- Integration: a temp SQLite DB with the real `tracks` schema seeded with the user's actual
  3 track paths + scan dates and a synthetic Library.xml → `apply` → assert rows updated.

## Delivery

Script + test live in the repo under `scripts/`. The user commits (via `/commit`), pushes,
pulls on the real machine, runs **File ▸ Library ▸ Export Library…** there, then runs the
dry-run, checks the match rate and the unmatched samples, and re-runs with `--apply`.

## Out of scope

Scanning the library into the app (the user does that in-app first), ongoing/automatic sync,
and non-file (streaming) tracks.