iTunes/Music → app DB date & stats backfill (one-time)

Date: 2026-05-30

Problem

ScannerService.extractMetadata sets dateAdded: Date() at scan time (Music/Services/ScannerService.swift:186), so every track's "added date" in the app DB is really its scan date, not the date the user originally added it in Apple Music. The user wants the true Date Added (and, since Music.app tracks them too, Play Count, Rating, and last-played) copied from their Apple Music library into the app's SQLite database.

Context (verified)

  • The app is sandboxed. Current PRODUCT_BUNDLE_IDENTIFIER is com.staxriver.mu (HEAD and working tree; not part of the uncommitted diff). The live DB is therefore at: ~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite
  • The real app and real library are on a different computer. This machine only has a 3-track dev DB. The script must be portable and is intended to run on the other Mac.
  • The user confirmed the audio files are the same files Apple Music references (they live inside Apple Music's media folder, e.g. ~/Music/Music/Media.localized/Music/...). So the join key is the file path.
  • tracks.dateAdded is a GRDB .datetime column, stored as the string YYYY-MM-DD HH:MM:SS.SSS in UTC (confirmed from existing rows, e.g. 2026-05-24 06:46:01.713). GRDB is lenient on read, so a value written with a .000 millisecond suffix round-trips.
  • App rating scale is 0–5 stars (TrackTableView.swift:284 renders String(repeating: "★", count: track.rating)). Music.app stores ratings as 0–100, so map with integer division: rating // 20.

Approach

A single stdlib-only Python 3 script, run once. Source of truth is the Music.app File ▸ Library ▸ Export Library… XML plist (chosen over live AppleScript: no Automation prompt, no timeouts, trivially parseable with plistlib). On matched tracks it does a blunt overwrite of all four fields, with Music.app as the source of truth.
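Reading the export with plistlib could look roughly like the sketch below. The inline sample is synthetic, and the helper name load_tracks and the assumption that tracks sit under a top-level "Tracks" dict are illustrative, not confirmed against the user's actual export.

```python
import plistlib

# Tiny synthetic stand-in for an exported Library.xml (not real user data).
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Tracks</key>
  <dict>
    <key>101</key>
    <dict>
      <key>Track ID</key><integer>101</integer>
      <key>Date Added</key><date>2019-03-02T10:15:00Z</date>
      <key>Play Count</key><integer>7</integer>
      <key>Rating</key><integer>80</integer>
      <key>Location</key><string>file:///Users/me/Music/a%20song.mp3</string>
    </dict>
  </dict>
</dict>
</plist>
"""


def load_tracks(data: bytes) -> list[dict]:
    """Return the per-track dicts from an exported library plist.

    Assumes tracks live under a top-level "Tracks" dict keyed by track ID.
    """
    library = plistlib.loads(data)
    return list(library.get("Tracks", {}).values())


tracks = load_tracks(SAMPLE_XML)
```

By default plistlib returns <date> values as naive datetime objects expressed in UTC, which is convenient for the UTC string format the DB expects.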

Matching (the bug-prone part)

Join Music.app Location to tracks.fileURL on a normalized decoded POSIX path, not raw URL strings. norm_path():

  1. strip leading file://, then optional localhost host segment,
  2. percent-decode (urllib.parse.unquote),
  3. unicodedata.normalize("NFC", …) — neutralizes the accented-filename NFC/NFD mismatch between APFS storage and the two URL string sources,
  4. strip a trailing slash.
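A minimal sketch of the four steps above, assuming both sources hand over file:// URL strings. The body is illustrative and may differ from the final script.

```python
import unicodedata
from urllib.parse import unquote


def norm_path(url: str) -> str:
    """Reduce a file URL to a comparable, NFC-normalized POSIX path."""
    path = url
    # 1. Strip the scheme, then an optional "localhost" host segment.
    if path.startswith("file://"):
        path = path[len("file://"):]
        if path.startswith("localhost/"):
            path = path[len("localhost"):]
    # 2. Percent-decode (%20 -> space, %23 -> '#', UTF-8 byte sequences).
    path = unquote(path)
    # 3. Fold decomposed (NFD) and precomposed (NFC) accents together.
    path = unicodedata.normalize("NFC", path)
    # 4. Drop a trailing slash, but leave a bare "/" alone.
    if len(path) > 1 and path.endswith("/"):
        path = path[:-1]
    return path
```

With this, file://localhost/Users/me/a%20b.mp3 and file:///Users/me/a%20b.mp3 both normalize to /Users/me/a b.mp3, and a filename encoded as Cafe%CC%81 (NFD) compares equal to one encoded as Caf%C3%A9 (NFC).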

Music tracks with no Location (Apple Music streaming entries) are skipped.

Field mapping (matched rows only; blunt overwrite)

| Column | XML key | Rule |
|---|---|---|
| dateAdded | Date Added | %Y-%m-%d %H:%M:%S.000 UTC. If absent, keep existing (col is NOT NULL). |
| playCount | Play Count | integer, 0 if absent. |
| rating | Rating (0–100) | // 20 → 0–5, 0 if absent. |
| lastPlayedAt | Play Date UTC | same date format, or NULL if absent. |
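The mapping rules can be sketched as a per-track function. The names fmt_date and build_update are hypothetical, and the sketch assumes the XML dates arrive as naive UTC datetime objects, which is what plistlib produces by default.

```python
from datetime import datetime
from typing import Optional


def fmt_date(dt: datetime) -> str:
    """Format a UTC datetime as the DB's YYYY-MM-DD HH:MM:SS.000 string."""
    return dt.strftime("%Y-%m-%d %H:%M:%S") + ".000"


def build_update(xml_track: dict, existing_date_added: str) -> dict:
    """Map one Music.app track dict to the four DB column values."""
    date_added: Optional[datetime] = xml_track.get("Date Added")
    play_date: Optional[datetime] = xml_track.get("Play Date UTC")
    return {
        # dateAdded is NOT NULL, so fall back to the value already stored.
        "dateAdded": fmt_date(date_added) if date_added else existing_date_added,
        "playCount": int(xml_track.get("Play Count", 0)),
        # Music.app's 0-100 scale maps onto 0-5 stars by integer division.
        "rating": int(xml_track.get("Rating", 0)) // 20,
        "lastPlayedAt": fmt_date(play_date) if play_date else None,
    }
```

A track with Rating 80 and no Play Date UTC therefore becomes rating 4 with lastPlayedAt NULL, while a track missing Date Added keeps its current scan-time value.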

Safety

  • Dry-run by default: prints match rate, a sample of before→after changes, and the counts + samples of unmatched-in-DB and unmatched-in-XML. Writes nothing.
  • --apply: first copies db.sqlite + -wal + -shm to a timestamped backup, then performs all writes in a single transaction, then PRAGMA wal_checkpoint(TRUNCATE). Reversible by restoring the backup.
  • The app must be quit before running so the sandbox DB isn't mid-write.
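The --apply sequence (backup, single transaction, checkpoint) might look like the sketch below. The function names, the backup directory layout next to the DB, and the choice to address rows by SQLite rowid are assumptions for illustration; the real script may key its updates differently.

```python
import shutil
import sqlite3
import time
from pathlib import Path


def backup_db(db_path: Path) -> Path:
    """Copy db.sqlite plus any -wal/-shm sidecars into a timestamped folder."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_dir = db_path.parent / f"backup-{stamp}"
    backup_dir.mkdir()
    for suffix in ("", "-wal", "-shm"):
        src = Path(str(db_path) + suffix)
        if src.exists():
            shutil.copy2(src, backup_dir / src.name)
    return backup_dir


def apply_updates(db_path: Path, updates: list[tuple]) -> int:
    """Back up, then write all rows in one transaction and checkpoint the WAL.

    Each tuple is (dateAdded, playCount, rating, lastPlayedAt, rowid).
    """
    backup_db(db_path)
    conn = sqlite3.connect(str(db_path))
    try:
        with conn:  # commits on success, rolls back if anything raises
            cur = conn.executemany(
                "UPDATE tracks SET dateAdded = ?, playCount = ?, rating = ?, "
                "lastPlayedAt = ? WHERE rowid = ?",
                updates,
            )
            changed = cur.rowcount
        # Fold the WAL back into the main file so the app sees one clean DB.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    finally:
        conn.close()
    return changed
```

Restoring is then a matter of quitting the app and copying the files from the backup folder back over the live ones.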

CLI

python3 scripts/backfill_itunes_dates.py --xml <Library.xml> [--db <path>] [--apply] [--self-test]
  • Default --db: ~/Library/Containers/com.staxriver.mu/Data/Library/Application Support/Music/db.sqlite, computed from $HOME so that it resolves on the other Mac too.
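One way to wire up these flags with argparse is sketched here. Treating --xml as optional at the parser level (so that --self-test can run without it) is an assumption, as is the helper name parse_args.

```python
import argparse
from pathlib import Path

# Path.home() resolves per machine, so the default follows the user to the other Mac.
DEFAULT_DB = (
    Path.home()
    / "Library/Containers/com.staxriver.mu/Data/Library"
    / "Application Support/Music/db.sqlite"
)


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="One-time backfill of dates and stats from a Music.app XML export."
    )
    parser.add_argument("--xml", type=Path, help="path to the exported Library.xml")
    parser.add_argument("--db", type=Path, default=DEFAULT_DB, help="app SQLite DB")
    parser.add_argument("--apply", action="store_true",
                        help="write changes (default is a dry run)")
    parser.add_argument("--self-test", action="store_true",
                        help="run built-in checks and exit")
    args = parser.parse_args(argv)
    # Assumption: --xml is only mandatory when not running the self-test.
    if not args.self_test and args.xml is None:
        parser.error("--xml is required unless --self-test is given")
    return args
```

Because --apply is a store_true flag, omitting it leaves args.apply as False, which keeps the dry run as the default behavior.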

Testing

scripts/test_backfill_itunes_dates.py (stdlib unittest):

  • norm_path: NFC/NFD equivalence, file://localhost/ form, percent-encoding, filenames with spaces/#/parentheses/apostrophes.
  • build_updates: date formatting, rating // 20, playCount & lastPlayed present/absent, unmatched-row handling.
  • Integration: a temp SQLite DB with the real tracks schema seeded with the user's actual 3 track paths + scan dates and a synthetic Library.xml → apply → assert rows updated.

Delivery

Script + test live in the repo under scripts/. The user commits (via /commit), pushes, pulls on the real machine, runs File ▸ Library ▸ Export Library… there, then runs the dry-run, eyeballs the match rate, and re-runs with --apply.

Out of scope

Scanning the library into the app (the user does that in-app first), ongoing/automatic sync, and non-file (streaming) tracks.