You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
Music/docs/superpowers/specs/2026-05-30-fix-zero-bitrate...

166 lines
6.2 KiB

# Fix `bitrate = 0` Tracks — Design
**Date:** 2026-05-30
**Status:** Approved (design)
## Problem
Many tracks in the library have a stored `bitrate` of `0`. Bitrate of `0` is
displayed literally and is meaningless. Users want these tracks to show their
real bitrate, and want new imports to stop producing the problem.
## Root Cause
`ScannerService.extractMetadata()` extracts bitrate via AVFoundation:
```swift
// Music/Services/ScannerService.swift (~line 151)
let estimatedRate = try await audioTrack.load(.estimatedDataRate)
bitrate = Int(estimatedRate / 1000) // bits/sec -> kbps
```
For some files (observed: long / VBR MP3s such as 2-hour DJ "Essential Mix"
recordings), `AVAsset.estimatedDataRate` returns `0`. `Int(0 / 1000)` is `0`, so
a literal `0` is written to the DB. The `duration` field is extracted correctly
for the same files, which is what makes recovery reliable.
Tracks with `bitrate IS NULL` exist as well — the importer never produced any
value. These display as "—" and are treated as equally "missing" for this work.
### Evidence
Validated against a real `bitrate = 0` row (a ~120 min MP3):
| Method | Result |
|------------------------------------------|---------------|
| `fileSize × 8 ÷ duration ÷ 1000` | 256.0 kbps |
| `ffprobe -show_entries format=bit_rate` | 256005 → 256.0 kbps |
The two agree to the kbps. For VBR, both methods yield the true **average**
bitrate, which is the meaningful value.
## Scope
Both halves of the problem:
1. **Backfill** existing rows where `bitrate = 0 OR bitrate IS NULL`.
2. **Fix the importer** so future imports never store `0` again.
No mass rescan is required: the script repairs today's rows; the importer fix
only needs to guarantee correctness for new imports.
## Component 1 — Importer Fix (`ScannerService`)
Extract the bitrate decision into a pure, unit-testable function:
```swift
/// Resolve a track's bitrate (kbps) from the OS estimate, with a
/// file-size/duration fallback. Returns nil when no value can be derived —
/// never 0.
static func resolveBitrate(estimatedDataRate: Double,
fileSizeBytes: Int64?,
durationSeconds: Double?) -> Int?
```
Logic:
- `estimatedDataRate > 0`
`Int((estimatedDataRate / 1000).rounded())` (current behaviour, now rounded
rather than truncated).
- else if `fileSizeBytes != nil` **and** `durationSeconds != nil && > 0`
`Int((Double(fileSizeBytes) * 8 / durationSeconds / 1000).rounded())`.
- else → `nil`.
**Key invariant: the importer never stores `0`.** When nothing can be derived it
stores `nil`, which the UI already renders as "—".
`extractMetadata()` is updated to:
- read the file size it can already obtain via `FileManager.attributesOfItem(atPath:)`,
- pass `estimatedDataRate`, file size, and the loaded `duration` into
`resolveBitrate`,
- assign the result to `bitrate`.
This keeps the AVFoundation I/O in `extractMetadata` and the arithmetic in a pure
function that tests can drive directly.
## Component 2 — Backfill Script (`scripts/backfill_bitrate.py`)
Mirrors the conventions of the existing `scripts/backfill_itunes_dates.py`:
- Same `DEFAULT_DB` resolution (`~/Library/Containers/com.staxriver.mu/Data/
Library/Application Support/Music/db.sqlite`), with `--db` override.
- Reuses the `norm_path` / percent-decoding approach to turn a stored `fileURL`
into a POSIX path.
- **Dry-run by default**; `--apply` writes after a timestamped backup of the DB.
- `--self-test` runs offline unit checks and exits.
- Stdlib only (`sqlite3`, `subprocess`, `os`, `urllib`, …), plus `ffprobe` as an
optional external tool.
### Selection
```sql
SELECT id, fileURL, duration, bitrate
FROM tracks
WHERE bitrate = 0 OR bitrate IS NULL;
```
### Per-row bitrate determination
1. Resolve `fileURL` → POSIX path. If the file does not exist on disk →
report as **skipped (missing file)**, do not update.
2. Run `ffprobe -v error -show_entries format=bit_rate -of default=nw=1:nk=1 <path>`.
- Parse an integer bps → `round(bps / 1000)` kbps.
3. If ffprobe is **absent**, **errors**, or returns **`N/A`/empty**, fall back to
the formula: `round(fileSizeBytes * 8 / durationSeconds / 1000)`.
- If there is also no usable duration → report as
**skipped (undeterminable)**, do not update.
### Output
- **Dry-run:** a table of `path · old → new` plus a summary.
- **`--apply`:** create `db.sqlite.bak-<timestamp>`, then `UPDATE tracks SET
bitrate = ? WHERE id = ?` per resolved row, in a single transaction.
- **Summary (both modes):** counts for updated, skipped-missing-file,
skipped-undeterminable, and whether ffprobe was available.
## Testing (TDD — tests written before implementation)
### Swift (`MusicTests`)
Unit tests for `ScannerService.resolveBitrate`:
1. Positive `estimatedDataRate` → rounded kbps (e.g. `320450.0``320`).
2. `estimatedDataRate == 0` with valid size + duration → formula result
(e.g. 230_358_479 bytes, 7198.54 s → `256`).
3. `estimatedDataRate == 0`, valid size, **no/zero duration**`nil`.
4. `estimatedDataRate == 0`, **no file size**`nil`.
5. Confirms the function never returns `0`.
Each test carries a step-by-step comment describing what it exercises.
### Python (`--self-test`)
1. ffprobe-output parsing: `"256005\n"``256`.
2. `N/A`/empty ffprobe output → triggers formula fallback.
3. Formula math: `(230_358_479 * 8 / 7198.54 / 1000)` rounds to `256`.
4. `norm_path` edge cases (percent-encoding, `file://localhost/`, NFC, trailing
slash) — mirrored from the existing script's expectations.
### Manual verification
Run the script in dry-run against the real library DB and eyeball a sample
(ffprobe vs formula agreement) before `--apply`.
## Operational Notes
- `--apply` writes directly to the SQLite file. **Quit the app first** to avoid
WAL/lock contention — same caveat as `backfill_itunes_dates.py`.
- A timestamped backup is created before any write; restore by copying it back.
## Out of Scope
- No new UI (no in-app "repair bitrates" command); the script covers existing
rows and the importer covers future ones.
- No change to how bitrate is displayed.
- No re-encoding or modification of audio files — read-only analysis only.