Béranger Thomas

FLAC Toolkit

A command-line utility for low-level validation, automated repair, audio duplicate detection, and ReplayGain normalization of FLAC files.

Technical sheet
Status Stable Release date 2025-10-15 Category Audio & speech processing Language Python 3.12+ Stack mutagen · pyloudnorm · numpy · pandas · rich · Tabulator.js License MIT

Context

Free Lossless Audio Codec (FLAC) is the standard of choice for lossless digital audio archiving. Managing a high-fidelity music library, however, involves more than just storing files; it requires maintaining consistent loudness levels, keeping metadata organized, identifying duplicate tracks, and ensuring underlying file structures remain intact over time. While modern audio players can play most files seamlessly, they do not provide the tools needed to detect audio duplicates, apply precise volume normalization, or verify low-level structural integrity across large collections.

Challenge

The goal was to build a reliable command-line utility in Python to automate the auditing and maintenance of FLAC music archives. The utility needed to inspect the internal binary structures of FLAC files according to the official RFC 9639 specification. Additionally, the tool had to perform safe re-encoding for corrupted files, calculate acoustic loudness normalization (ReplayGain), and generate readable, high-performance reports that scale to thousands of audio tracks.

Approach

To achieve reliable validation and repair, the implementation was structured around low-level file structure inspection and parallel execution:

Features

The FLAC Toolkit provides several modular features designed for automation and system integration:

Outcome