iTunes/Music → app DB date & stats backfill (one-time)
Date: 2026-05-30
Problem
ScannerService.extractMetadata sets dateAdded: Date() at scan time
(Music/Services/ScannerService.swift:186), so every track's "added date" in the
app DB is really its scan date, not the date the user originally added it in
Apple Music. The user wants the true Date Added (and, since Music.app tracks
them too, Play Count, Rating, and last-played) copied from their Apple Music
library into the app's SQLite database.
Context (verified)
- The app is sandboxed. Current PRODUCT_BUNDLE_IDENTIFIER is com.staxriver.mu (HEAD and working tree; not part of the uncommitted diff). The live DB is therefore at: ~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite
- The real app and real library are on a different computer. This machine only has a 3-track dev DB. The script must be portable and is intended to run on the other Mac.
- The user confirmed the audio files are the same files Apple Music references (they live inside Apple Music's media folder, e.g. ~/Music/Music/Media.localized/Music/...). So the join key is the file path.
- tracks.dateAdded is a GRDB .datetime column, stored as the string YYYY-MM-DD HH:MM:SS.SSS in UTC (confirmed from existing rows, e.g. 2026-05-24 06:46:01.713). GRDB is lenient on read, so a value ending in .000 round-trips.
- App rating scale is 0–5 stars (TrackTableView.swift:284 renders String(repeating: "★", count: track.rating)). Music.app stores 0–100, so map with integer division: // 20.
Approach
A single stdlib-only Python 3 script, run once. Source of truth is the Music.app
File ▸ Library ▸ Export Library… XML plist (chosen over live AppleScript: no
Automation prompt, no timeouts, trivially parseable with plistlib). On matched tracks
it does a blunt overwrite of all four fields, with Music.app as the source of truth.
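As a rough illustration of why the XML export is easy to consume, the sketch below reads the track dicts with plistlib. It assumes the export keeps tracks in a top-level "Tracks" dict keyed by track ID, which is how Music.app exports usually look. The helper names parse_tracks and load_tracks and the SAMPLE data are made up for this example.

```python
import datetime
import plistlib


def parse_tracks(data):
    """Return the per-track dicts from exported Library.xml bytes."""
    library = plistlib.loads(data)
    # Assumption: tracks live in a dict keyed by track ID under "Tracks".
    return list(library.get("Tracks", {}).values())


def load_tracks(xml_path):
    """Read an exported Library.xml from disk (hypothetical helper)."""
    with open(xml_path, "rb") as fp:
        return parse_tracks(fp.read())


# Synthetic stand-in for a real export, for illustration only.
SAMPLE = plistlib.dumps({
    "Tracks": {
        "101": {
            "Name": "Local file",
            "Location": "file:///Users/me/Music/a.mp3",
            "Date Added": datetime.datetime(2019, 3, 1, 12, 0, 0),
            "Play Count": 7,
            "Rating": 80,
        },
        # Streaming entries carry no "Location" key.
        "102": {"Name": "Streaming entry"},
    }
})

tracks = parse_tracks(SAMPLE)
```

Date values come back from plistlib as datetime objects, so no manual date parsing is needed before formatting them for the DB.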
Matching (the bug-prone part)
Join Music.app Location to tracks.fileURL on a normalized decoded POSIX path,
not raw URL strings. norm_path():
- strip leading file://, then optional localhost host segment,
- percent-decode (urllib.parse.unquote),
- unicodedata.normalize("NFC", …), which neutralizes the accented-filename NFC/NFD mismatch between APFS storage and the two URL string sources,
- strip a trailing slash.
Music tracks with no Location (Apple Music streaming entries) are skipped.
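The normalization steps and the join can be sketched as follows. This is a minimal sketch, not the final script: it assumes tracks.fileURL holds file:// URL strings, and match_tracks is a hypothetical helper name.

```python
import unicodedata
from urllib.parse import unquote


def norm_path(url):
    """Reduce a file URL (or plain path) to a comparable POSIX path."""
    s = url
    if s.startswith("file://"):
        s = s[len("file://"):]
        # Optional host segment, as in file://localhost/Users/...
        if s.startswith("localhost/"):
            s = s[len("localhost"):]
    s = unquote(s)                        # percent-decode (UTF-8)
    s = unicodedata.normalize("NFC", s)   # unify NFC/NFD spellings
    if len(s) > 1 and s.endswith("/"):
        s = s[:-1]                        # strip a trailing slash
    return s


def match_tracks(db_urls, xml_tracks):
    """Map each DB fileURL to its Music.app track dict, joined on norm_path.

    XML tracks without a Location (streaming entries) are skipped.
    """
    by_path = {
        norm_path(t["Location"]): t
        for t in xml_tracks
        if t.get("Location")
    }
    matched = {}
    for url in db_urls:
        track = by_path.get(norm_path(url))
        if track is not None:
            matched[url] = track
    return matched
```

For example, file://localhost/Users/me/Music/Caf%C3%A9.mp3 (precomposed é) and file:///Users/me/Music/Cafe%CC%81.mp3 (e plus combining accent) normalize to the same path, so the two rows join.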
Field mapping (matched rows only; blunt overwrite)
| Column | XML key | Rule |
|---|---|---|
| dateAdded | Date Added | %Y-%m-%d %H:%M:%S.000 UTC. If absent, keep existing (col is NOT NULL). |
| playCount | Play Count | integer, 0 if absent. |
| rating | Rating (0–100) | // 20 → 0–5, 0 if absent. |
| lastPlayedAt | Play Date UTC | same date format, or NULL if absent. |
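The mapping rules in the table might look like this in code. This is a sketch under assumptions: fmt_dt and map_fields are illustrative names (the real logic may live in build_updates), and naive datetimes from plistlib are treated as already being in UTC.

```python
import datetime


def fmt_dt(dt):
    """Format a datetime as 'YYYY-MM-DD HH:MM:SS.000' in UTC."""
    if dt.tzinfo is not None:
        # Aware values are converted; naive ones are assumed to be UTC.
        dt = dt.astimezone(datetime.timezone.utc)
    return dt.strftime("%Y-%m-%d %H:%M:%S") + ".000"


def map_fields(xml_track, existing_date_added):
    """Build the new column values for one matched row."""
    added = xml_track.get("Date Added")
    played = xml_track.get("Play Date UTC")
    return {
        # dateAdded is NOT NULL, so fall back to the current value.
        "dateAdded": fmt_dt(added) if added else existing_date_added,
        "playCount": int(xml_track.get("Play Count", 0)),
        # Music.app stores 0-100; the app shows 0-5 stars.
        "rating": int(xml_track.get("Rating", 0)) // 20,
        "lastPlayedAt": fmt_dt(played) if played else None,
    }
```

A track with Rating 80 and no Play Date UTC maps to rating 4 and lastPlayedAt NULL. A track with no Date Added keeps whatever dateAdded the row already has.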
Safety
- Dry-run by default: prints match rate, a sample of before→after changes, and the counts + samples of unmatched-in-DB and unmatched-in-XML. Writes nothing.
- --apply: first copies db.sqlite + -wal + -shm to a timestamped backup, then performs all writes in a single transaction, then PRAGMA wal_checkpoint(TRUNCATE). Reversible by restoring the backup.
- The app must be quit before running so the sandbox DB isn't mid-write.
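A minimal sketch of the --apply path, assuming the tracks table has the fileURL, dateAdded, playCount, rating, and lastPlayedAt columns named above. The backup folder name and both function names are hypothetical. Here updates maps each stored fileURL to its new column values.

```python
import datetime
import shutil
import sqlite3
from pathlib import Path


def backup_db(db_path):
    """Copy db.sqlite and any -wal/-shm sidecars into a timestamped folder."""
    db_path = Path(db_path)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    backup_dir = db_path.parent / f"backfill-backup-{stamp}"  # hypothetical name
    backup_dir.mkdir(parents=True, exist_ok=True)
    for suffix in ("", "-wal", "-shm"):
        src = Path(str(db_path) + suffix)
        if src.exists():
            shutil.copy2(src, backup_dir / src.name)
    return backup_dir


def apply_updates(db_path, updates):
    """Back up, write every row in one transaction, then checkpoint the WAL."""
    backup_dir = backup_db(db_path)
    conn = sqlite3.connect(str(db_path))
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.executemany(
                "UPDATE tracks SET dateAdded = :dateAdded, playCount = :playCount, "
                "rating = :rating, lastPlayedAt = :lastPlayedAt "
                "WHERE fileURL = :fileURL",
                [dict(fields, fileURL=url) for url, fields in updates.items()],
            )
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    finally:
        conn.close()
    return backup_dir
```

Copying the files into a folder under their original names means a restore is a plain copy back over the originals while the app is quit.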
CLI
python3 scripts/backfill_itunes_dates.py --xml <Library.xml> [--db <path>] [--apply] [--self-test]
- Default --db: ~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite, computed from $HOME, so it resolves on the other Mac too.
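The flags above could be wired up with argparse roughly as follows. One assumption is labeled in the code: --xml is treated as optional only when --self-test is given, which the usage line does not spell out. build_parser and parse_args are illustrative names.

```python
import argparse
from pathlib import Path

# Resolved from the current user's home directory, so the same default
# works on whichever Mac runs the script.
DEFAULT_DB = (
    Path.home()
    / "Library/Containers/com.staxriver.mu/Data/Library"
    / "Application Support/Music/db.sqlite"
)


def build_parser():
    parser = argparse.ArgumentParser(
        prog="backfill_itunes_dates.py",
        description="One-time backfill of Date Added and play stats from a Music.app export.",
    )
    parser.add_argument("--xml", type=Path, help="path to the exported Library.xml")
    parser.add_argument("--db", type=Path, default=DEFAULT_DB, help="path to the app's db.sqlite")
    parser.add_argument("--apply", action="store_true", help="write changes (default is a dry run)")
    parser.add_argument("--self-test", action="store_true", help="run built-in checks and exit")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --xml may be omitted only when running --self-test.
    if not args.self_test and args.xml is None:
        parser.error("--xml is required unless --self-test is given")
    return args
```

Because --apply is a store_true flag, omitting it leaves args.apply False, which keeps the dry run as the default.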
Testing
scripts/test_backfill_itunes_dates.py (stdlib unittest):
- norm_path: NFC/NFD equivalence, file://localhost/ form, percent-encoding, filenames with spaces/#/parentheses/apostrophes.
- build_updates: date formatting, rating // 20, playCount & lastPlayed present/absent, unmatched-row handling.
- Integration: a temp SQLite DB with the real tracks schema seeded with the user's actual 3 track paths + scan dates and a synthetic Library.xml → apply → assert rows updated.
Delivery
Script + test live in the repo under scripts/. The user commits (via /commit), pushes,
pulls on the real machine, runs File ▸ Library ▸ Export Library… there, then runs the
dry-run, eyeballs the match rate, and re-runs with --apply.
Out of scope
Scanning the library into the app (the user does that in-app first), ongoing/automatic sync, and non-file (streaming) tracks.