draft-ietf-cellar-flac-01.txt   draft-ietf-cellar-flac-02.txt 
cellar M. Sandelman cellar M. Richardson
Internet-Draft Internet-Draft
Intended status: Informational A. Weaver Intended status: Informational A. Weaver
Expires: 29 October 2021 27 April 2021 Expires: 2 May 2022 29 October 2021
Free Lossless Audio Codec Free Lossless Audio Codec
draft-ietf-cellar-flac-01 draft-ietf-cellar-flac-02
Abstract Abstract
This document defines FLAC, which stands for Free Lossless Audio This document defines FLAC, which stands for Free Lossless Audio
Codec, a free, open source codec for lossless audio compression and Codec, a free, open source codec for lossless audio compression and
decompression. decompression.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
skipping to change at page 1, line 32 skipping to change at page 1, line 32
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 29 October 2021. This Internet-Draft will expire on 2 May 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Revised BSD License text as
as described in Section 4.e of the Trust Legal Provisions and are described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Notation and Conventions . . . . . . . . . . . . . . . . . . 3 2. Notation and Conventions . . . . . . . . . . . . . . . . . . 3
3. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 3 3. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 3
4. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 4. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
5. Architecture . . . . . . . . . . . . . . . . . . . . . . . . 4 5. Architecture . . . . . . . . . . . . . . . . . . . . . . . . 4
6. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5 6. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5
7. Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . 6 7. Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . 7
8. Interchannel Decorrelation . . . . . . . . . . . . . . . . . 6 8. Interchannel Decorrelation . . . . . . . . . . . . . . . . . 7
9. Prediction . . . . . . . . . . . . . . . . . . . . . . . . . 7 9. Prediction . . . . . . . . . . . . . . . . . . . . . . . . . 8
10. Residual Coding . . . . . . . . . . . . . . . . . . . . . . . 8 10. Residual Coding . . . . . . . . . . . . . . . . . . . . . . . 9
11. Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 11. Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
11.1. Conventions . . . . . . . . . . . . . . . . . . . . . . 12 11.1. Principles . . . . . . . . . . . . . . . . . . . . . . . 10
11.2. STREAM . . . . . . . . . . . . . . . . . . . . . . . . . 13 11.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . 11
11.3. METADATA_BLOCK . . . . . . . . . . . . . . . . . . . . . 13 11.3. Subset . . . . . . . . . . . . . . . . . . . . . . . . . 14
11.4. METADATA_BLOCK_HEADER . . . . . . . . . . . . . . . . . 13 11.4. Conventions . . . . . . . . . . . . . . . . . . . . . . 15
11.5. BLOCK_TYPE . . . . . . . . . . . . . . . . . . . . . . . 14 11.5. STREAM . . . . . . . . . . . . . . . . . . . . . . . . . 15
11.6. METADATA_BLOCK_DATA . . . . . . . . . . . . . . . . . . 14 11.6. METADATA_BLOCK . . . . . . . . . . . . . . . . . . . . . 15
11.7. METADATA_BLOCK_STREAMINFO . . . . . . . . . . . . . . . 15 11.7. METADATA_BLOCK_HEADER . . . . . . . . . . . . . . . . . 16
11.8. METADATA_BLOCK_PADDING . . . . . . . . . . . . . . . . . 16 11.8. BLOCK_TYPE . . . . . . . . . . . . . . . . . . . . . . . 16
11.9. METADATA_BLOCK_APPLICATION . . . . . . . . . . . . . . . 16 11.9. METADATA_BLOCK_DATA . . . . . . . . . . . . . . . . . . 17
11.10. METADATA_BLOCK_SEEKTABLE . . . . . . . . . . . . . . . . 16 11.10. METADATA_BLOCK_STREAMINFO . . . . . . . . . . . . . . . 17
11.11. SEEKPOINT . . . . . . . . . . . . . . . . . . . . . . . 17 11.11. METADATA_BLOCK_PADDING . . . . . . . . . . . . . . . . . 18
11.12. METADATA_BLOCK_VORBIS_COMMENT . . . . . . . . . . . . . 17 11.12. METADATA_BLOCK_APPLICATION . . . . . . . . . . . . . . . 18
11.13. METADATA_BLOCK_CUESHEET . . . . . . . . . . . . . . . . 18 11.13. METADATA_BLOCK_SEEKTABLE . . . . . . . . . . . . . . . . 19
11.14. CUESHEET_TRACK . . . . . . . . . . . . . . . . . . . . . 19 11.14. SEEKPOINT . . . . . . . . . . . . . . . . . . . . . . . 19
11.15. CUESHEET_TRACK_INDEX . . . . . . . . . . . . . . . . . . 20 11.15. METADATA_BLOCK_VORBIS_COMMENT . . . . . . . . . . . . . 20
11.16. METADATA_BLOCK_PICTURE . . . . . . . . . . . . . . . . . 20 11.16. METADATA_BLOCK_CUESHEET . . . . . . . . . . . . . . . . 20
11.17. PICTURE_TYPE . . . . . . . . . . . . . . . . . . . . . . 21 11.17. CUESHEET_TRACK . . . . . . . . . . . . . . . . . . . . . 21
11.18. FRAME . . . . . . . . . . . . . . . . . . . . . . . . . 22 11.18. CUESHEET_TRACK_INDEX . . . . . . . . . . . . . . . . . . 22
11.19. FRAME_HEADER . . . . . . . . . . . . . . . . . . . . . . 23 11.19. METADATA_BLOCK_PICTURE . . . . . . . . . . . . . . . . . 23
11.19.1. FRAME HEADER RESERVED . . . . . . . . . . . . . . . 23 11.20. PICTURE_TYPE . . . . . . . . . . . . . . . . . . . . . . 24
11.19.2. BLOCKING STRATEGY . . . . . . . . . . . . . . . . . 24 11.21. FRAME . . . . . . . . . . . . . . . . . . . . . . . . . 25
11.19.3. INTERCHANNEL SAMPLE BLOCK SIZE . . . . . . . . . . 24 11.22. FRAME_HEADER . . . . . . . . . . . . . . . . . . . . . . 25
11.19.4. SAMPLE RATE . . . . . . . . . . . . . . . . . . . . 25 11.22.1. FRAME HEADER RESERVED . . . . . . . . . . . . . . . 26
11.19.5. CHANNEL ASSIGNMENT . . . . . . . . . . . . . . . . 26 11.22.2. BLOCKING STRATEGY . . . . . . . . . . . . . . . . . 26
11.19.6. SAMPLE SIZE . . . . . . . . . . . . . . . . . . . . 27 11.22.3. INTERCHANNEL SAMPLE BLOCK SIZE . . . . . . . . . . 27
11.19.7. FRAME HEADER RESERVED2 . . . . . . . . . . . . . . 27 11.22.4. SAMPLE RATE . . . . . . . . . . . . . . . . . . . . 27
11.19.8. CODED NUMBER . . . . . . . . . . . . . . . . . . . 27 11.22.5. CHANNEL ASSIGNMENT . . . . . . . . . . . . . . . . 28
11.19.9. BLOCK SIZE INT . . . . . . . . . . . . . . . . . . 28 11.22.6. SAMPLE SIZE . . . . . . . . . . . . . . . . . . . . 30
11.19.10. SAMPLE RATE INT . . . . . . . . . . . . . . . . . . 28 11.22.7. FRAME HEADER RESERVED2 . . . . . . . . . . . . . . 30
11.19.11. FRAME CRC . . . . . . . . . . . . . . . . . . . . . 28 11.22.8. CODED NUMBER . . . . . . . . . . . . . . . . . . . 30
11.20. FRAME_FOOTER . . . . . . . . . . . . . . . . . . . . . . 28 11.22.9. BLOCK SIZE INT . . . . . . . . . . . . . . . . . . 31
11.21. SUBFRAME . . . . . . . . . . . . . . . . . . . . . . . . 29 11.22.10. SAMPLE RATE INT . . . . . . . . . . . . . . . . . . 31
11.22. SUBFRAME_HEADER . . . . . . . . . . . . . . . . . . . . 29 11.22.11. FRAME CRC . . . . . . . . . . . . . . . . . . . . . 31
11.22.1. SUBFRAME TYPE . . . . . . . . . . . . . . . . . . . 29 11.23. FRAME_FOOTER . . . . . . . . . . . . . . . . . . . . . . 31
11.22.2. WASTED BITS PER SAMPLE FLAG . . . . . . . . . . . . 30 11.24. SUBFRAME . . . . . . . . . . . . . . . . . . . . . . . . 32
11.25. SUBFRAME_HEADER . . . . . . . . . . . . . . . . . . . . 32
11.23. SUBFRAME_CONSTANT . . . . . . . . . . . . . . . . . . . 30 11.25.1. SUBFRAME TYPE . . . . . . . . . . . . . . . . . . . 32
11.24. SUBFRAME_FIXED . . . . . . . . . . . . . . . . . . . . . 30 11.25.2. WASTED BITS PER SAMPLE FLAG . . . . . . . . . . . . 33
11.25. SUBFRAME_LPC . . . . . . . . . . . . . . . . . . . . . . 31 11.26. SUBFRAME_CONSTANT . . . . . . . . . . . . . . . . . . . 33
11.26. SUBFRAME_VERBATIM . . . . . . . . . . . . . . . . . . . 31 11.27. SUBFRAME_FIXED . . . . . . . . . . . . . . . . . . . . . 34
11.27. RESIDUAL . . . . . . . . . . . . . . . . . . . . . . . . 31 11.28. SUBFRAME_LPC . . . . . . . . . . . . . . . . . . . . . . 34
11.27.1. RESIDUAL_CODING_METHOD . . . . . . . . . . . . . . 32 11.29. SUBFRAME_VERBATIM . . . . . . . . . . . . . . . . . . . 34
11.27.2. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB . . . 32 11.30. RESIDUAL . . . . . . . . . . . . . . . . . . . . . . . . 35
11.27.3. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 . . 33 11.30.1. RESIDUAL_CODING_METHOD . . . . . . . . . . . . . . 35
11.27.4. ENCODED RESIDUAL . . . . . . . . . . . . . . . . . 34 11.30.2. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB . . . 35
12. Normative References . . . . . . . . . . . . . . . . . . . . 34 11.30.3. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 . . 36
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 34 11.30.4. ENCODED RESIDUAL . . . . . . . . . . . . . . . . . 37
12. Security Considerations . . . . . . . . . . . . . . . . . . . 38
13. Normative References . . . . . . . . . . . . . . . . . . . . 38
14. Informative References . . . . . . . . . . . . . . . . . . . 38
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 39
1. Introduction 1. Introduction
This is a detailed description of the FLAC format. There is also a This is a detailed description of the FLAC format. There is also a
companion document that describes FLAC-to-Ogg mapping companion document that describes FLAC-to-Ogg mapping
(https://xiph.org/flac/ogg_mapping.html). (https://xiph.org/flac/ogg_mapping.html).
For a user-oriented overview, see About the FLAC Format For a user-oriented overview, see About the FLAC Format
(https://xiph.org/flac/documentation_format_overview.html). (https://xiph.org/flac/documentation_format_overview.html).
skipping to change at page 4, line 7 skipping to change at page 4, line 10
(https://web.archive.org/web/20040215005354/http://csi.usc.edu/ (https://web.archive.org/web/20040215005354/http://csi.usc.edu/
faculty/golomb.html) and Robert F. Rice; their universal codes are faculty/golomb.html) and Robert F. Rice; their universal codes are
used by FLAC's entropy coder. - N. Levinson and J. Durbin; the used by FLAC's entropy coder. - N. Levinson and J. Durbin; the
reference encoder uses an algorithm developed and refined by them for reference encoder uses an algorithm developed and refined by them for
determining the LPC coefficients from the autocorrelation determining the LPC coefficients from the autocorrelation
coefficients. - And of course, Claude Shannon coefficients. - And of course, Claude Shannon
(http://en.wikipedia.org/wiki/Claude_Shannon) (http://en.wikipedia.org/wiki/Claude_Shannon)
4. Scope 4. Scope
It is a known fact that no algorithm can losslessly compress all FLAC stands for Free Lossless Audio Codec: it is designed to reduce
possible input, so most compressors restrict themselves to a useful the amount of computer storage space needed to store digital audio
domain and try to work as well as possible within that domain. signals without needing to remove information in doing so (i.e.
FLAC's domain is audio data. Though it can losslessly code any lossless). FLAC is free in the sense that its specification is open,
input, only certain kinds of input will get smaller. FLAC exploits its reference implementation is open-source and it is not encumbered
the fact that audio data typically has a high degree of sample-to- by any known patent.
sample correlation.
Within the audio domain, there are many possible subdomains. For FLAC is able to achieve lossless compression because samples in audio
example: low bitrate speech, high-bitrate multi-channel music, etc. signals tend to be highly correlated with their close neighbors. In
FLAC itself does not target a specific subdomain, but many of the contrast with general purpose compressors, which often use
default parameters of the reference encoder are tuned to CD-quality dictionaries, do run-length coding or exploit long-term repetition,
music data (i.e. 44.1 kHz, 2 channel, 16 bits per sample). The FLAC removes redundancy solely in the very short term, looking back
effect of the encoding parameters on different kinds of audio data at most 32 samples.
will be examined later.
The FLAC format is suited for pulse-code modulated (PCM) audio with 1
to 8 channels, sample rates from 1 to 1048576 Hertz and bit depths
between 4 and 32 bits. Most tools for reading and writing the FLAC
format have been optimized for CD-audio, which is PCM audio with 2
channels, a sample rate of 44.1 kHz and a bit depth of 16 bits.
Compared to other lossless (audio) coding formats, FLAC is a format
with low complexity and can be encoded and decoded with few
computing resources. Decoding of FLAC has seen many independent
implementations on many different platforms, and both encoding and
decoding can be implemented without needing floating-point
arithmetic.
The coding methods provided by the FLAC format work best on PCM
audio signals of which the samples have a signed representation and
are centered around zero. Audio signals in which samples have an
unsigned representation must be transformed to a signed
representation as described in this document in order to achieve
reasonable compression. The FLAC format is not suited to compress
audio that is not PCM. Pulse-density modulated audio, e.g. DSD,
cannot be compressed by FLAC.
5. Architecture 5. Architecture
Similar to many audio coders, a FLAC encoder has the following Similar to many audio coders, a FLAC encoder has the following
stages: stages:
* "Blocking" (see section on Blocking (#blocking)). The input is * Blocking (see section on Blocking (#blocking)). The input is
broken up into many contiguous blocks. With FLAC, the blocks MAY broken up into many contiguous blocks. With FLAC, the blocks MAY
vary in size. The optimal size of the block is usually affected vary in size. The optimal size of the block is usually affected
by many factors, including the sample rate, spectral by many factors, including the sample rate, spectral
characteristics over time, etc. Though FLAC allows the block size characteristics over time, etc. Though FLAC allows the block size
to vary within a stream, the reference encoder uses a fixed block to vary within a stream, the reference encoder uses a fixed block
size. size.
* "Interchannel Decorrelation" (see section on Interchannel * Interchannel Decorrelation (see section on Interchannel
Decorrelation (#interchannel-decorrelation)). In the case of Decorrelation (#interchannel-decorrelation)). In the case of
stereo streams, the encoder will create mid and side signals based stereo streams, the encoder will create mid and side signals based
on the average and difference (respectively) of the left and right on the average and difference (respectively) of the left and right
channels. The encoder will then pass the best form of the signal channels. The encoder will then pass the best form of the signal
to the next stage. to the next stage.
* "Prediction" (see section on Prediction (#prediction)). The block * Prediction (see section on Prediction (#prediction)). The block
is passed through a prediction stage where the encoder tries to is passed through a prediction stage where the encoder tries to
find a mathematical description (usually an approximate one) of find a mathematical description (usually an approximate one) of
the signal. This description is typically much smaller than the the signal. This description is typically much smaller than the
raw signal itself. Since the methods of prediction are known to raw signal itself. Since the methods of prediction are known to
both the encoder and decoder, only the parameters of the predictor both the encoder and decoder, only the parameters of the predictor
need be included in the compressed stream. FLAC currently uses need be included in the compressed stream. FLAC currently uses
four different classes of predictors, but the format has reserved four different classes of predictors, but the format has reserved
space for additional methods. FLAC allows the class of predictor space for additional methods. FLAC allows the class of predictor
to change from block to block, or even within the channels of a to change from block to block, or even within the channels of a
block. block.
* "Residual Coding" (See section on Residual Coding (#residual- * Residual Coding (See section on Residual Coding (#residual-
coding)). If the predictor does not describe the signal exactly, coding)). If the predictor does not describe the signal exactly,
the difference between the original signal and the predicted the difference between the original signal and the predicted
signal (called the error or residual signal) MUST be coded signal (called the error or residual signal) MUST be coded
losslessly. If the predictor is effective, the residual signal losslessly. If the predictor is effective, the residual signal
will require fewer bits per sample than the original signal. FLAC will require fewer bits per sample than the original signal. FLAC
currently uses only one method for encoding the residual, but the currently uses only one method for encoding the residual, but the
format has reserved space for additional methods. FLAC allows the format has reserved space for additional methods. FLAC allows the
residual coding method to change from block to block, or even residual coding method to change from block to block, or even
within the channels of a block. within the channels of a block.
In addition, FLAC specifies a metadata system, which allows arbitrary In addition, FLAC specifies a metadata system, which allows arbitrary
information about the stream to be included at the beginning of the information about the stream to be included at the beginning of the
stream. stream.
6. Definitions 6. Definitions
Many terms like "block" and "frame" are used to mean different things * *Block*: A (short) section of linear pulse-code modulated audio,
in different encoding schemes. For example, a frame in MP3 with one or more channels.
corresponds to many samples across several channels, whereas an S/
PDIF frame represents just one sample for each channel. The
definitions we use for FLAC follow. Note that when we talk about
blocks and subblocks we are referring to the raw unencoded audio data
that is the input to the encoder, and when we talk about frames and
subframes, we are referring to the FLAC-encoded data.
* *Block*: One or more audio samples that span several channels. * *Subblock*: All samples within a corresponding block for 1
channel. One or more subblocks form a block, and all subblocks in
a certain block contain the same number of samples.
* *Subblock*: One or more audio samples within a channel. A block * *Frame*: A frame header plus one or more subframes. It encodes
contains one subblock for each channel, and all subblocks contain the contents of a corresponding block.
the same number of samples.
* *Blocksize*: The number of samples in any of a block's subblocks. * *Subframe*: An encoded subblock. All subframes within a frame
For example, a one second block sampled at 44.1 kHz has a code for the same number of samples. A subframe MAY correspond to
blocksize of 44100, regardless of the number of channels. a subblock, else it corresponds to either the addition or
subtraction of two subblocks, see section on interchannel
decorrelation (#interchannel-decorrelation).
* *Frame*: A frame header plus one or more subframes. * *Blocksize*: The total number of samples contained in a block or
coded in a frame, divided by the number of channels. In other
words, the number of samples in any subblock of a block, or any
subframe of a frame. This is also called *interchannel samples*.
* *Subframe*: A subframe header plus one or more encoded samples * *Bit depth* or *bits per sample*: the number of bits used to
from a given channel. All subframes within a frame will contain contain each sample. This MUST be the same for all subblocks in a
the same number of samples. block but MAY be different for different subframes in a frame
because of interchannel decorrelation (#interchannel-
decorrelation).
* *Exponential-Golomb coding*: One of Robert Rice's universal coding * *Predictor*: a model used to predict samples in an audio signal
schemes, FLAC's residual coder, compresses data by writing the based on past samples. FLAC uses such predictors to remove
number of bits to be read minus 1, before writing the actual redundancy in a signal in order to be able to compress it.
value.
* *LPC*: Linear predictive coding (https://en.wikipedia.org/wiki/ * *Linear predictor*: a predictor using linear prediction
Linear_predictive_coding). (https://en.wikipedia.org/wiki/Linear_prediction). This is also
called *linear predictive coding (LPC)*. With a linear predictor
each prediction is a linear combination of past samples, hence the
name. A linear predictor has a causal discrete-time finite
impulse response (https://en.wikipedia.org/wiki/
Finite_impulse_response).
* *Fixed predictor*: a linear predictor in which the model
parameters are the same across all FLAC files, and thus need not
be stored.
* *Predictor order*: the number of past samples that a predictor
uses. For example, a 4th order predictor uses the 4 samples
directly preceding a certain sample to predict it. In FLAC,
samples used in a predictor are always consecutive, and are always
the samples directly before the sample that is being predicted.
* *Residual*: The audio signal that remains after a predictor has
been subtracted from a subblock. If the predictor has been able
to remove redundancy from the signal, the samples of the remaining
signal (the *residual samples*) will have, on average, a smaller
numerical value than the original signal.
* *Rice code*: A variable-length code
(https://en.wikipedia.org/wiki/Variable-length_code) which
compresses data by making use of the observation that, after using
an effective predictor, most residual samples are closer to zero
than the original samples, while still allowing for a small part
of the samples to be much larger.
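The predictor, predictor order and residual definitions above can be illustrated with a minimal sketch. The function names and the coefficient list `[2, -1]` (a second-order predictor that extrapolates a straight line through the two preceding samples) are assumptions chosen for illustration; the sketch ignores the prediction shift and is not the normative decoding procedure.

```python
def compute_residual(samples, coeffs):
    """Subtract a linear prediction from every sample after the warm-up.

    coeffs[0] weights the sample directly before the predicted one,
    coeffs[1] the one before that, and so on. The predictor order is
    len(coeffs); the first `order` samples are left unpredicted.
    """
    order = len(coeffs)
    residual = []
    for n in range(order, len(samples)):
        prediction = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        residual.append(samples[n] - prediction)
    return residual


def restore_samples(warmup, residual, coeffs):
    """Rebuild the samples from the warm-up samples and the residual."""
    samples = list(warmup)
    for r in residual:
        prediction = sum(c * samples[-1 - i] for i, c in enumerate(coeffs))
        samples.append(prediction + r)
    return samples


signal = [10, 12, 14, 16, 19, 21]
coeffs = [2, -1]  # hypothetical second-order predictor
res = compute_residual(signal, coeffs)
print(res)  # [0, 0, 1, -1]
print(restore_samples(signal[:2], res, coeffs) == signal)  # True
```

The residual values are much closer to zero than the samples themselves, which is the property a Rice code exploits.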
7. Blocking 7. Blocking
The size used for blocking the audio data has a direct effect on the The size used for blocking the audio data has a direct effect on the
compression ratio. If the block size is too small, the resulting compression ratio. If the block size is too small, the resulting
large number of frames means that excess bits will be wasted on frame large number of frames means that excess bits will be wasted on frame
headers. If the block size is too large, the characteristics of the headers. If the block size is too large, the characteristics of the
signal MAY vary so much that the encoder will be unable to find a signal MAY vary so much that the encoder will be unable to find a
good predictor. In order to simplify encoder/decoder design, FLAC good predictor. In order to simplify encoder/decoder design, FLAC
imposes a minimum block size of 16 samples, and a maximum block size imposes a minimum block size of 16 samples, and a maximum block size
skipping to change at page 6, line 43 skipping to change at page 7, line 42
size depending on the characteristics of the signal. size depending on the characteristics of the signal.
Blocked data is passed to the predictor stage one subblock (channel) Blocked data is passed to the predictor stage one subblock (channel)
at a time. Each subblock is independently coded into a subframe, and at a time. Each subblock is independently coded into a subframe, and
the subframes are concatenated into a frame. Because each channel is the subframes are concatenated into a frame. Because each channel is
coded separately, one channel of a stereo frame MAY be encoded as a coded separately, one channel of a stereo frame MAY be encoded as a
constant subframe, and the other an LPC subframe. constant subframe, and the other an LPC subframe.
8. Interchannel Decorrelation 8. Interchannel Decorrelation
In stereo streams, many times there is an exploitable amount of In many audio files, channels are correlated. The FLAC format can
correlation between the left and right channels. FLAC allows the exploit this correlation in stereo files by not directly coding
frames of stereo streams to have different channel assignments, and subblocks into subframes, but instead coding an average of all
an encoder MAY choose to use the best representation on a frame-by- samples in both subblocks (a mid channel) or the difference between
frame basis. all samples in both subblocks (a side channel). The following
combinations are possible:
* *Independent*. The left and right channels are coded * *Independent*. All channels are coded independently. All non-
independently. stereo files MUST be encoded this way.
* *Mid-side*. The left and right channels are transformed into mid * *Mid-side*. A left and right subblock are converted to mid and
and side channels. The mid channel is the midpoint (average) of side subframes. To calculate a sample for a mid subframe, the
the left and right signals, and the side is the difference signal corresponding left and right samples are summed and the result is
(left minus right). shifted right by 1 bit. To calculate a sample for a side
subframe, the corresponding right sample is subtracted from the
corresponding left sample. On decoding, the mid channel has to be
shifted left by 1 bit. Also, if the side sample is odd, 1 has
to be added to the mid channel after the left shift. To
reconstruct the left channel, the corresponding samples in the mid
and side subframes are added and the result shifted right by 1
bit, while for the right channel the side channel has to be
subtracted from the mid channel and the result shifted right by 1
bit.
* *Left-side*. The left channel and side channel are coded. * *Left-side*. The left subblock is coded and the left and right
subblocks are used to code a side subframe. The side subframe is
constructed in the same way as for mid-side. To decode, the right
subblock is restored by subtracting the samples in the side
subframe from the corresponding samples in the left subframe.
* *Right-side*. The right channel and side channel are coded. * *Right-side*. The right subblock is coded and the left and right
subblocks are used to code a side subframe. Note that the actual
coded subframe order is side-right. The side subframe is
constructed in the same way as for mid-side. To decode, the left
subblock is restored by adding the samples in the side subframe to
the corresponding samples in the right subframe.
Surprisingly, the left-side and right-side forms can be the most The side channel needs one extra bit of bit depth as the subtraction
efficient in many frames, even though the raw number of bits per can produce sample values twice as large as the maximum possible in
sample needed for the original signal is slightly more than that any given bit depth. The mid channel in mid-side stereo does not
needed for independent or mid-side coding. need one extra bit, as it is shifted right by one bit. The right shift of
the mid channel does not lead to non-lossless behavior, because an
odd sum of the left and right samples is always accompanied by an
odd sample in the side subframe, which means the
lost least significant bit can be restored by taking it from the
sample in the side subframe.
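The mid-side arithmetic described in this section can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, samples are plain Python integers, and Python's `>>` is assumed to behave as an arithmetic (flooring) right shift, which it does for negative integers.

```python
def encode_mid_side(left, right):
    """Turn a left and right subblock into mid and side sample lists."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # sum, shifted right by 1 bit
    side = [l - r for l, r in zip(left, right)]         # left minus right
    return mid, side


def decode_mid_side(mid, side):
    """Restore the left and right subblocks from mid and side samples."""
    left, right = [], []
    for m, s in zip(mid, side):
        # Shift left by 1 bit and restore the lost least significant bit,
        # which equals the least significant bit of the side sample.
        m = (m << 1) | (s & 1)
        left.append((m + s) >> 1)
        right.append((m - s) >> 1)
    return left, right


def decode_left_side(left, side):
    """Left-side: right = left - side."""
    return left, [l - s for l, s in zip(left, side)]


def decode_side_right(side, right):
    """Right-side (coded as side, right): left = side + right."""
    return [s + r for s, r in zip(side, right)], right


left = [3, -32768, 100, 0]
right = [-4, 32767, 99, 0]
mid, side = encode_mid_side(left, right)
print(decode_mid_side(mid, side) == (left, right))  # True
```

The second sample pair shows why the side channel needs one extra bit: with 16-bit input, `-32768 - 32767` is `-65535`, which does not fit in 16 bits.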
9. Prediction 9. Prediction
FLAC uses four methods for modeling the input signal: FLAC uses four methods for modeling the input signal:
1. *Verbatim*. This is essentially a zero-order predictor of the 1. *Verbatim*. This is essentially a zero-order predictor of the
signal. The predicted signal is zero, meaning the residual is signal. The predicted signal is zero, meaning the residual is
the signal itself, and the compression is zero. This is the the signal itself, and the compression is zero. This is the
baseline against which the other predictors are measured. If you baseline against which the other predictors are measured. If you
feed random data to the encoder, the verbatim predictor will feed random data to the encoder, the verbatim predictor will
skipping to change at page 9, line 7 skipping to change at page 10, line 31
The FLAC format has reserved space for other coding methods. Some The FLAC format has reserved space for other coding methods. Some
possibilities for volunteers would be to explore better context- possibilities for volunteers would be to explore better context-
modeling of the exp-golomb parameter, or Huffman coding. See LOCO-I modeling of the exp-golomb parameter, or Huffman coding. See LOCO-I
(http://www.hpl.hp.com/techreports/98/HPL-98-193.html) and pucrunch ( (http://www.hpl.hp.com/techreports/98/HPL-98-193.html) and pucrunch (
http://web.archive.org/web/20140827133312/http://www.cs.tut.fi/~alber http://web.archive.org/web/20140827133312/http://www.cs.tut.fi/~alber
t/Dev/pucrunch/packing.html) for descriptions of several universal t/Dev/pucrunch/packing.html) for descriptions of several universal
codes. codes.
11. Format 11. Format
This section specifies the FLAC bitstream format. FLAC has no format This section specifies the FLAC bitstream format.
version information, but it does contain reserved space in several
places. Future versions of the format MAY use this reserved space 11.1. Principles
safely without breaking the format of older streams. Older decoders
MAY choose to abort decoding or skip data encoded with newer methods. FLAC has no format version information, but it does contain reserved
Apart from reserved patterns, in places the format specifies invalid space in several places. Future versions of the format MAY use this
patterns, meaning that the patterns MAY never appear in any valid reserved space safely without breaking the format of older streams.
bitstream, in any prior, present, or future versions of the format. Older decoders MAY choose to abort decoding or skip data encoded with
These invalid patterns are usually used to make the synchronization newer methods. Apart from reserved patterns, in places the format
mechanism more robust. specifies invalid patterns, meaning that the patterns MUST NOT
appear in any valid bitstream, in any prior, present, or future
versions of the format. These invalid patterns are usually used to
make the synchronization mechanism more robust.
All numbers used in a FLAC bitstream MUST be integers; there are no All numbers used in a FLAC bitstream MUST be integers; there are no
floating-point representations. All numbers MUST be big-endian floating-point representations. All numbers MUST be big-endian
coded. All numbers MUST be unsigned unless otherwise specified. coded, except the length field used in Vorbis comments, which MUST be
little-endian coded. All numbers MUST be unsigned except linear
predictor coefficients, the linear prediction shift and numbers which
directly represent samples, which MUST be signed. None of these
restrictions apply to application metadata blocks.
All samples encoded to and decoded from the FLAC format MUST be in a
signed representation.
There are several ways to convert unsigned sample representations to
signed sample representations, but the coding methods provided by the
FLAC format work best on audio signals of which the numerical values
of the samples are centered around zero, i.e. have no DC offset. In
most unsigned audio formats, signals are centered around the midpoint
of the range of the unsigned integer type used. If that is the case, all
sample representations SHOULD be converted by first copying the
number to a signed integer with sufficient range and then subtracting
half of the range of the unsigned integer type, which should result
in a signal with samples centered around 0.
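The conversion described above can be sketched as follows. This is a minimal illustration, not part of the format; the function name unsigned_to_signed is hypothetical, and Python integers already have sufficient range, so no explicit widening step is needed.

```python
def unsigned_to_signed(samples, bits):
    """Re-center unsigned PCM samples around zero.

    samples: iterable of unsigned integers in the range 0 .. 2**bits - 1
    bits:    width of the unsigned integer type (e.g. 8)
    """
    half_range = 1 << (bits - 1)  # half of the unsigned range, e.g. 128 for 8 bits
    return [s - half_range for s in samples]


# 8-bit unsigned audio is centered around 128; after conversion it is centered around 0.
print(unsigned_to_signed([0, 128, 255], 8))  # [-128, 0, 127]
```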
11.2. Overview
Before the formal description of the stream, an overview might be Before the formal description of the stream, an overview might be
helpful. helpful.
* A FLAC bitstream consists of the "fLaC" (i.e. 0x664C6143) marker * A FLAC bitstream consists of the "fLaC" (i.e. 0x664C6143) marker
at the beginning of the stream, followed by a mandatory metadata at the beginning of the stream, followed by a mandatory metadata
block (called the STREAMINFO block), any number of other metadata block (called the STREAMINFO block), any number of other metadata
blocks, then the audio frames. blocks, then the audio frames.
* FLAC supports up to 128 kinds of metadata blocks; currently the * FLAC supports up to 128 kinds of metadata blocks; currently the
following are defined: following are defined:
- "STREAMINFO": This block has information about the whole - STREAMINFO: This block has information about the whole stream,
stream, like sample rate, number of channels, total number of like sample rate, number of channels, total number of samples,
samples, etc. It MUST be present as the first metadata block etc. It MUST be present as the first metadata block in the
in the stream. Other metadata blocks MAY follow; a decoder stream. Other metadata blocks MAY follow; a decoder skips
skips any block it does not understand. any block it does not understand.
- "PADDING": This block allows for an arbitrary amount of - PADDING: This block allows for an arbitrary amount of padding.
padding. The contents of a PADDING block have no meaning. The contents of a PADDING block have no meaning. This block is
This block is useful when it is known that metadata will be useful when it is known that metadata will be edited after
edited after encoding; the user can instruct the encoder to encoding; the user can instruct the encoder to reserve a
reserve a PADDING block of sufficient size so that when PADDING block of sufficient size so that when metadata is
metadata is added, it will simply overwrite the padding (which added, it will simply overwrite the padding (which is
is relatively quick) instead of having to insert it into the relatively quick) instead of having to insert it into the right
right place in the existing file (which would normally require place in the existing file (which would normally require
rewriting the entire file). rewriting the entire file).
- "APPLICATION": This block is for use by third-party - APPLICATION: This block is for use by third-party applications.
applications. The only mandatory field is a 32-bit identifier. The only mandatory field is a 32-bit identifier. This ID is
This ID is granted upon request to an application by the FLAC granted upon request to an application by the FLAC maintainers.
maintainers. The remainder of the block is defined by the The remainder of the block is defined by the registered
registered application. Visit the registration page application. Visit the registration page
(https://xiph.org/flac/id.html) if you would like to register (https://xiph.org/flac/id.html) if you would like to register
an ID for your application with FLAC. an ID for your application with FLAC.
- "SEEKTABLE": This is an OPTIONAL block for storing seek points. - SEEKTABLE: This is an OPTIONAL block for storing seek points.
It is possible to seek to any given sample in a FLAC stream It is possible to seek to any given sample in a FLAC stream
without a seek table, but the delay can be unpredictable since without a seek table, but the delay can be unpredictable since
the bitrate MAY vary widely within a stream. By adding seek the bitrate MAY vary widely within a stream. By adding seek
points to a stream, this delay can be significantly reduced. points to a stream, this delay can be significantly reduced.
Each seek point takes 18 bytes, so 1% resolution within a Each seek point takes 18 bytes, so 1% resolution within a
stream adds less than 2K. There can be only one SEEKTABLE in a stream adds less than 2K. There can be only one SEEKTABLE in a
stream, but the table can have any number of seek points. stream, but the table can have any number of seek points.
There is also a special 'placeholder' seekpoint which will be There is also a special 'placeholder' seekpoint which will be
ignored by decoders but which can be used to reserve space for ignored by decoders but which can be used to reserve space for
future seek point insertion. future seek point insertion.
- "VORBIS_COMMENT": This block is for storing a list of human- - VORBIS_COMMENT: This block is for storing a list of human-
readable name/value pairs. Values are encoded using UTF-8. It readable name/value pairs. Values are encoded using UTF-8. It
is an implementation of the Vorbis comment specification is an implementation of the Vorbis comment specification
(http://xiph.org/vorbis/doc/v-comment.html) (without the (http://xiph.org/vorbis/doc/v-comment.html) (without the
framing bit). This is the only officially supported tagging framing bit). This is the only officially supported tagging
mechanism in FLAC. There MUST be only zero or one mechanism in FLAC. There MUST be only zero or one
VORBIS_COMMENT blocks in a stream. In some external VORBIS_COMMENT blocks in a stream. In some external
documentation, Vorbis comments are called FLAC tags to lessen documentation, Vorbis comments are called FLAC tags to lessen
confusion. confusion.
- "CUESHEET": This block is for storing various information that - CUESHEET: This block is for storing various information that
can be used in a cue sheet. It supports track and index can be used in a cue sheet. It supports track and index
points, compatible with Red Book CD digital audio discs, as points, compatible with Red Book CD digital audio discs, as
well as other CD-DA metadata such as media catalog number and well as other CD-DA metadata such as media catalog number and
track ISRCs. The CUESHEET block is especially useful for track ISRCs. The CUESHEET block is especially useful for
backing up CD-DA discs, but it can be used as a general purpose backing up CD-DA discs, but it can be used as a general purpose
cueing mechanism for playback. cueing mechanism for playback.
- "PICTURE": This block is for storing pictures associated with - PICTURE: This block is for storing pictures associated with the
the file, most commonly cover art from CDs. There MAY be more file, most commonly cover art from CDs. There MAY be more than
than one PICTURE block in a file. The picture format is one PICTURE block in a file. The picture format is similar to
similar to the APIC frame in ID3v2 (http://www.id3.org/ the APIC frame in ID3v2 (http://www.id3.org/id3v2.4.0-frames).
id3v2.4.0-frames). The PICTURE block has a type, MIME type, The PICTURE block has a type, MIME type, and UTF-8 description
and UTF-8 description like ID3v2, and supports external linking like ID3v2, and supports external linking via URL (though this
via URL (though this is discouraged). The differences are that is discouraged). The differences are that there is no
there is no uniqueness constraint on the description field, and uniqueness constraint on the description field, and the MIME
the MIME type is mandatory. The FLAC PICTURE block also type is mandatory. The FLAC PICTURE block also includes the
includes the resolution, color depth, and palette size so that resolution, color depth, and palette size so that the client
the client can search for a suitable picture without having to can search for a suitable picture without having to scan them
scan them all. all.
* The audio data is composed of one or more audio frames. Each * The audio data is composed of one or more audio frames. Each
frame consists of a frame header, which contains a sync code, frame consists of a frame header, which contains a sync code,
information about the frame like the block size, sample rate, information about the frame like the block size, sample rate,
number of channels, et cetera, and an 8-bit CRC. The frame header number of channels, et cetera, and an 8-bit CRC. The frame header
also contains either the sample number of the first sample in the also contains either the sample number of the first sample in the
frame (for variable-blocksize streams), or the frame number (for frame (for variable-blocksize streams), or the frame number (for
fixed-blocksize streams). This allows for fast, sample-accurate fixed-blocksize streams). This allows for fast, sample-accurate
seeking to be performed. Following the frame header are encoded seeking to be performed. Following the frame header are encoded
subframes, one for each channel, and finally, the frame is zero- subframes, one for each channel, and finally, the frame is zero-
skipping to change at page 12, line 8 skipping to change at page 14, line 21
* Individual subframes (one for each channel) are coded separately * Individual subframes (one for each channel) are coded separately
within a frame, and appear serially in the stream. In other within a frame, and appear serially in the stream. In other
words, the encoded audio data is NOT channel-interleaved. This words, the encoded audio data is NOT channel-interleaved. This
reduces decoder complexity at the cost of requiring larger decode reduces decoder complexity at the cost of requiring larger decode
buffers. Each subframe has its own header specifying the buffers. Each subframe has its own header specifying the
attributes of the subframe, like prediction method and order, attributes of the subframe, like prediction method and order,
residual coding parameters, etc. The header is followed by the residual coding parameters, etc. The header is followed by the
encoded audio data for that channel. encoded audio data for that channel.
* "FLAC" specifies a subset of itself as the Subset format. The 11.3. Subset
purpose of this is to ensure that any streams encoded according to
the Subset are truly "streamable", meaning that a decoder that
cannot seek within the stream can still pick up in the middle of
the stream and start decoding. It also makes hardware decoder
implementations more practical by limiting the encoding parameters
such that decoder buffer sizes and other resource requirements can
be easily determined. *flac* generates Subset streams by default
unless the "--lax" command-line option is used. The Subset makes
the following limitations on what MAY be used in the stream:
* The blocksize bits in the "FRAME_HEADER" (see FRAME_HEADER section FLAC specifies a subset of itself as the Subset format. The purpose
of this is to ensure that any streams encoded according to the Subset
are truly "streamable", meaning that a decoder that cannot seek
within the stream can still pick up in the middle of the stream and
start decoding. It also makes hardware decoder implementations more
practical by limiting the encoding parameters such that decoder
buffer sizes and other resource requirements can be easily
determined. *flac* generates Subset streams by default unless the "--
lax" command-line option is used. The Subset makes the following
limitations on what MAY be used in the stream:
* The blocksize bits in the FRAME_HEADER (see FRAME_HEADER section
(#frameheader)) MUST be 0b0001-0b1110. The blocksize MUST be <= (#frameheader)) MUST be 0b0001-0b1110. The blocksize MUST be <=
16384; if the sample rate is <= 48000 Hz, the blocksize MUST be <= 16384; if the sample rate is <= 48000 Hz, the blocksize MUST be <=
4608 = 2^9 * 3^2. 4608 = 2^9 * 3^2.
* The sample rate bits in the "FRAME_HEADER" MUST be 0b0001-0b1110. * The sample rate bits in the FRAME_HEADER MUST be 0b0001-0b1110.
* The bits-per-sample bits in the "FRAME_HEADER" MUST be * The bits-per-sample bits in the FRAME_HEADER MUST be 0b001-0b111.
0b001-0b111.
* If the sample rate is <= 48000 Hz, the filter order in "LPC * If the sample rate is <= 48000 Hz, the filter order in LPC
subframes" (see SUBFRAME_LPC section (#subframelpc)) MUST be less subframes (see SUBFRAME_LPC section (#subframelpc)) MUST be less
than or equal to 12, i.e. the subframe type bits in the than or equal to 12, i.e. the subframe type bits in the
"SUBFRAME_HEADER" (see SUBFRAME_HEADER section (#subframeheader)) SUBFRAME_HEADER (see SUBFRAME_HEADER section (#subframeheader))
SHOULD NOT be 0b101100-0b111111. SHOULD NOT be 0b101100-0b111111.
* The Rice partition order in an "exp-golomb coded residual section" * The Rice partition order (see Coded residual section (#coded-
(see RESIDUAL_CODING_METHOD_PARTITIONE_EXP_GOLOMB section residual)) MUST be less than or equal to 8.
(#residualcodingmethodpartitionedexpgolomb)) MUST be less than or
equal to 8.
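As an illustration only, the numeric Subset limits listed above could be checked along these lines. The function name check_subset and its parameters are hypothetical, and the sketch covers only the blocksize, filter order and Rice partition order limits, not the header bit-pattern restrictions.

```python
def check_subset(sample_rate, blocksize, lpc_order=0, rice_partition_order=0):
    """Return a list of Subset limits that the given parameters exceed."""
    problems = []
    if blocksize > 16384:
        problems.append("blocksize exceeds 16384")
    if sample_rate <= 48000:
        # Tighter limits apply at sample rates of 48000 Hz and below.
        if blocksize > 4608:
            problems.append("blocksize exceeds 4608 at <= 48000 Hz")
        if lpc_order > 12:
            problems.append("LPC filter order exceeds 12 at <= 48000 Hz")
    if rice_partition_order > 8:
        problems.append("Rice partition order exceeds 8")
    return problems


print(check_subset(44100, 4096, lpc_order=8, rice_partition_order=6))  # []
print(check_subset(44100, 8192))
```

An empty list means none of the checked limits is exceeded; it does not by itself prove that a stream conforms to the Subset.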
11.1. Conventions 11.4. Conventions
The following tables constitute a formal description of the FLAC The following tables constitute a formal description of the FLAC
format. Values expressed as "u(n)" represent unsigned big-endian format. Values expressed as u(n) represent unsigned big-endian
integer using "n" bits. "n" may be expressed as an equation using "*" integer using n bits. n may be expressed as an equation using *
(multiplication), "/" (division), "+" (addition), or "-" (multiplication), / (division), + (addition), or - (subtraction). An
(subtraction). An inclusive range of the number of bits expressed inclusive range of the number of bits expressed may be represented
may be represented with an ellipsis, such as "u(m...n)". The name of with an ellipsis, such as u(m...n). The name of a value followed by
a value followed by an asterisk "*" indicates zero or more an asterisk * indicates zero or more occurrences of the value. The
occurrences of the value. The name of a value followed by a plus name of a value followed by a plus sign + indicates one or more
sign "+" indicates one or more occurrences of the value. occurrences of the value.
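To make the u(n) notation concrete, a reader for unsigned big-endian fields of arbitrary bit width might look like the sketch below. The class name BitReader and its method names are hypothetical; the reader works one bit at a time for clarity rather than speed.

```python
class BitReader:
    """Reads unsigned big-endian integers of arbitrary bit width from bytes."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def u(self, n: int) -> int:
        """Read the next n bits as an unsigned integer, most significant bit first."""
        value = 0
        for _ in range(n):
            if self.pos >= len(self.data) * 8:
                raise EOFError("read past end of data")
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


reader = BitReader(b"fLaC")
print(hex(reader.u(32)))  # 0x664c6143
```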
11.2. STREAM 11.5. STREAM
+===========================+=====================================+ +===========================+=====================================+
| Data | Description | | Data | Description |
+===========================+=====================================+ +===========================+=====================================+
| u(32) | "fLaC", the FLAC stream marker in | | u(32) | "fLaC", the FLAC stream marker in |
| | ASCII, meaning byte 0 of the stream | | | ASCII, meaning byte 0 of the stream |
| | is 0x66, followed by 0x4C 0x61 0x43 | | | is 0x66, followed by 0x4C 0x61 0x43 |
+---------------------------+-------------------------------------+ +---------------------------+-------------------------------------+
| METADATA_BLOCK_STREAMINFO | This is the mandatory STREAMINFO | | METADATA_BLOCK_STREAMINFO | This is the mandatory STREAMINFO |
| | metadata block that has the basic | | | metadata block that has the basic |
| | properties of the stream. | | | properties of the stream. |
+---------------------------+-------------------------------------+ +---------------------------+-------------------------------------+
| "METADATA_BLOCK"* | Zero or more metadata blocks | | METADATA_BLOCK* | Zero or more metadata blocks |
+---------------------------+-------------------------------------+ +---------------------------+-------------------------------------+
| "FRAME"+ | One or more audio frames | | FRAME+ | One or more audio frames |
+---------------------------+-------------------------------------+ +---------------------------+-------------------------------------+
Table 1 Table 1
11.3. METADATA_BLOCK 11.6. METADATA_BLOCK
+=======================+========================================+ +=======================+========================================+
| Data | Description | | Data | Description |
+=======================+========================================+ +=======================+========================================+
| METADATA_BLOCK_HEADER | A block header that specifies the type | | METADATA_BLOCK_HEADER | A block header that specifies the type |
| | and size of the metadata block data. | | | and size of the metadata block data. |
+-----------------------+----------------------------------------+ +-----------------------+----------------------------------------+
| METADATA_BLOCK_DATA | | | METADATA_BLOCK_DATA | |
+-----------------------+----------------------------------------+ +-----------------------+----------------------------------------+
Table 2 Table 2
11.4. METADATA_BLOCK_HEADER 11.7. METADATA_BLOCK_HEADER
+=======+=========================================================+ +=======+=========================================================+
| Data | Description | | Data | Description |
+=======+=========================================================+ +=======+=========================================================+
| u(1) | Last-metadata-block flag: '1' if this block is the last | | u(1) | Last-metadata-block flag: '1' if this block is the last |
| | metadata block before the audio blocks, '0' otherwise. | | | metadata block before the audio blocks, '0' otherwise. |
+-------+---------------------------------------------------------+ +-------+---------------------------------------------------------+
| u(7) | BLOCK_TYPE | | u(7) | BLOCK_TYPE |
+-------+---------------------------------------------------------+ +-------+---------------------------------------------------------+
| u(24) | Length (in bytes) of metadata to follow (does not | | u(24) | Length (in bytes) of metadata to follow (does not |
| | include the size of the "METADATA_BLOCK_HEADER") | | | include the size of the METADATA_BLOCK_HEADER) |
+-------+---------------------------------------------------------+ +-------+---------------------------------------------------------+
Table 3 Table 3
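The STREAM, METADATA_BLOCK and METADATA_BLOCK_HEADER tables above suggest a simple way to walk the metadata section of a stream. The following is a sketch under the assumption that the whole stream is available in memory; the function name iter_metadata_blocks is hypothetical.

```python
import struct


def iter_metadata_blocks(data: bytes):
    """Yield (is_last, block_type, length, data_offset) for each metadata block."""
    if data[:4] != b"fLaC":
        raise ValueError("missing fLaC stream marker")
    offset = 4
    while True:
        if offset + 4 > len(data):
            raise ValueError("truncated metadata block header")
        # The header is 32 bits: u(1) last flag, u(7) block type, u(24) length.
        (word,) = struct.unpack_from(">I", data, offset)
        is_last = bool(word >> 31)
        block_type = (word >> 24) & 0x7F
        length = word & 0xFFFFFF  # does not include the 4 header bytes
        yield is_last, block_type, length, offset + 4
        offset += 4 + length
        if is_last:
            break


# A synthetic stream: a 34-byte block of type 0, then a final 2-byte block of type 1.
stream = b"fLaC" + bytes([0x00, 0, 0, 34]) + bytes(34) + bytes([0x81, 0, 0, 2]) + bytes(2)
for block in iter_metadata_blocks(stream):
    print(block)
```

Because each header carries the length of its block, a decoder can step over block types it does not recognise without parsing their contents.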
11.5. BLOCK_TYPE 11.8. BLOCK_TYPE
+=========+====================================================+ +=========+====================================================+
| Value | Description | | Value | Description |
+=========+====================================================+ +=========+====================================================+
| 0 | STREAMINFO | | 0 | STREAMINFO |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
| 1 | PADDING | | 1 | PADDING |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
| 2 | APPLICATION | | 2 | APPLICATION |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
skipping to change at page 14, line 31 skipping to change at page 17, line 5
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
| 6 | PICTURE | | 6 | PICTURE |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
| 7 - 126 | reserved | | 7 - 126 | reserved |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
| 127 | invalid, to avoid confusion with a frame sync code | | 127 | invalid, to avoid confusion with a frame sync code |
+---------+----------------------------------------------------+ +---------+----------------------------------------------------+
Table 4 Table 4
11.6. METADATA_BLOCK_DATA 11.9. METADATA_BLOCK_DATA
+====================================+=============+ +===================================================+==============+
| Data | Description | | Data | Description |
+====================================+=============+ +===================================================+==============+
| "METADATA_BLOCK_STREAMINFO" || | The block | | METADATA_BLOCK_STREAMINFO || | The block |
| "METADATA_BLOCK_PADDING" || | data MUST | | METADATA_BLOCK_PADDING || | data MUST |
| "METADATA_BLOCK_APPLICATION" || | match the | | METADATA_BLOCK_APPLICATION || | match the |
| "METADATA_BLOCK_SEEKTABLE" || | block type | | METADATA_BLOCK_SEEKTABLE || | block type |
| "METADATA_BLOCK_VORBIS_COMMENT" || | in the | | METADATA_BLOCK_VORBIS_COMMENT || | in the block |
| "METADATA_BLOCK_CUESHEET" || | block | | METADATA_BLOCK_CUESHEET || METADATA_BLOCK_PICTURE | header. |
| "METADATA_BLOCK_PICTURE" | header. | +---------------------------------------------------+--------------+
+------------------------------------+-------------+
Table 5 Table 5
11.7. METADATA_BLOCK_STREAMINFO 11.10. METADATA_BLOCK_STREAMINFO
+========+=================================================+ +========+=================================================+
| Data | Description | | Data | Description |
+========+=================================================+ +========+=================================================+
| u(16) | The minimum block size (in samples) used in the | | u(16) | The minimum block size (in samples) used in the |
| | stream. | | | stream. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(16) | The maximum block size (in samples) used in the | | u(16) | The maximum block size (in samples) used in the |
| | stream. (Minimum blocksize == maximum | | | stream. (Minimum blocksize == maximum |
| | blocksize) implies a fixed-blocksize stream. | | | blocksize) implies a fixed-blocksize stream. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(24) | The minimum frame size (in bytes) used in the | | u(24) | The minimum frame size (in bytes) used in the |
| | stream. A value of "0" signifies that the | | | stream. A value of 0 signifies that the value |
| | value is not known. | | | is not known. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(24) | The maximum frame size (in bytes) used in the | | u(24) | The maximum frame size (in bytes) used in the |
| | stream. A value of "0" signifies that the | | | stream. A value of 0 signifies that the value |
| | value is not known. | | | is not known. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(20) | Sample rate in Hz. Though 20 bits are | | u(20) | Sample rate in Hz. Though 20 bits are |
| | available, the maximum sample rate is limited | | | available, the maximum sample rate is limited |
| | by the structure of frame headers to 655350 Hz. | | | by the structure of frame headers to 655350 Hz. |
| | Also, a value of 0 is invalid. | | | Also, a value of 0 is invalid. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(3) | (number of channels)-1. FLAC supports from 1 | | u(3) | (number of channels)-1. FLAC supports from 1 |
| | to 8 channels | | | to 8 channels |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(5) | (bits per sample)-1. FLAC supports from 4 to | | u(5) | (bits per sample)-1. FLAC supports from 4 to |
skipping to change at page 15, line 52 skipping to change at page 18, line 18
| | means the number of total samples is unknown. | | | means the number of total samples is unknown. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
| u(128) | MD5 signature of the unencoded audio data. | | u(128) | MD5 signature of the unencoded audio data. |
| | This allows the decoder to determine if an | | | This allows the decoder to determine if an |
| | error exists in the audio data even when the | | | error exists in the audio data even when the |
| | error does not result in an invalid bitstream. | | | error does not result in an invalid bitstream. |
+--------+-------------------------------------------------+ +--------+-------------------------------------------------+
Table 6 Table 6
NOTE FLAC specifies a minimum block size of 16 and a maximum block size of
* FLAC specifies a minimum block size of 16 and a maximum block size 65535, meaning the bit patterns corresponding to the numbers 0-15 in
of 65535, meaning the bit patterns corresponding to the numbers the minimum blocksize and maximum blocksize fields are invalid.
0-15 in the minimum blocksize and maximum blocksize fields are
invalid.
11.8. METADATA_BLOCK_PADDING The MD5 signature is made by performing an MD5 transformation on the
samples of all channels interleaved, represented in signed, little-
endian form. This interleaving is on a per-sample basis, so for a
stereo file this means first the first sample of the first channel,
then the first sample of the second channel, then the second sample
of the first channel etc. Before performing the MD5 transformation,
all samples must be byte-aligned. So, in case the bit depth is not a
whole number of bytes, additional zero bits are inserted at the most-
significant position until each sample representation is a whole
number of bytes.
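The interleaving order can be illustrated with a short sketch. It is deliberately restricted to 16-bit samples, an assumption made so that the byte-alignment padding rule does not come into play; the function name md5_signature_16bit is hypothetical.

```python
import hashlib
import struct


def md5_signature_16bit(channels):
    """Compute the MD5 of interleaved, signed, little-endian 16-bit samples.

    channels: list of equal-length lists of signed 16-bit sample values,
              one list per channel.
    """
    md5 = hashlib.md5()
    # zip(*channels) yields one sample from each channel at a time, in channel
    # order, which is the per-sample interleaving described above.
    for samples_at_same_time in zip(*channels):
        for sample in samples_at_same_time:
            md5.update(struct.pack("<h", sample))
    return md5.digest()


left = [1, -1]
right = [2, -2]
print(md5_signature_16bit([left, right]).hex())
```

For the stereo input shown, the bytes hashed are 01 00 02 00 FF FF FE FF: first sample of each channel, then second sample of each channel.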
11.11. METADATA_BLOCK_PADDING
+======+========================================+ +======+========================================+
| Data | Description | | Data | Description |
+======+========================================+ +======+========================================+
| u(n) | n '0' bits (n MUST be a multiple of 8) | | u(n) | n '0' bits (n MUST be a multiple of 8) |
+------+----------------------------------------+ +------+----------------------------------------+
Table 7 Table 7
11.9. METADATA_BLOCK_APPLICATION 11.12. METADATA_BLOCK_APPLICATION
+=======+===========================================+ +=======+===========================================+
| Data | Description | | Data | Description |
+=======+===========================================+ +=======+===========================================+
| u(32) | Registered application ID. (Visit the | | u(32) | Registered application ID. (Visit the |
| | registration page (https://xiph.org/flac/ | | | registration page (https://xiph.org/flac/ |
| | id.html) to register an ID with FLAC.) | | | id.html) to register an ID with FLAC.) |
+-------+-------------------------------------------+ +-------+-------------------------------------------+
| u(n) | Application data (n MUST be a multiple of | | u(n) | Application data (n MUST be a multiple of |
| | 8) | | | 8) |
+-------+-------------------------------------------+ +-------+-------------------------------------------+
Table 8 Table 8
11.10. METADATA_BLOCK_SEEKTABLE 11.13. METADATA_BLOCK_SEEKTABLE
+==============+==========================+ +============+==========================+
| Data | Description | | Data | Description |
+==============+==========================+ +============+==========================+
| "SEEKPOINT"+ | One or more seek points. | | SEEKPOINT+ | One or more seek points. |
+--------------+--------------------------+ +------------+--------------------------+
Table 9 Table 9
NOTE - The number of seek points is implied by the metadata header NOTE - The number of seek points is implied by the metadata header
'length' field, i.e. equal to length / 18. 'length' field, i.e. equal to length / 18.
11.11. SEEKPOINT 11.14. SEEKPOINT
+=======+==========================================================+ +=======+==========================================================+
| Data | Description | | Data | Description |
+=======+==========================================================+ +=======+==========================================================+
| u(64) | Sample number of first sample in the target frame, or | | u(64) | Sample number of first sample in the target frame, or |
| | "0xFFFFFFFFFFFFFFFF" for a placeholder point. | | | 0xFFFFFFFFFFFFFFFF for a placeholder point. |
+-------+----------------------------------------------------------+ +-------+----------------------------------------------------------+
| u(64) | Offset (in bytes) from the first byte of the first frame | | u(64) | Offset (in bytes) from the first byte of the first frame |
| | header to the first byte of the target frame's header. | | | header to the first byte of the target frame's header. |
+-------+----------------------------------------------------------+ +-------+----------------------------------------------------------+
| u(16) | Number of samples in the target frame. | | u(16) | Number of samples in the target frame. |
+-------+----------------------------------------------------------+ +-------+----------------------------------------------------------+
Table 10 Table 10
NOTES NOTES
skipping to change at page 17, line 36 skipping to change at page 20, line 9
* Seek points within a table MUST be sorted in ascending order by * Seek points within a table MUST be sorted in ascending order by
sample number. sample number.
* Seek points within a table MUST be unique by sample number, with * Seek points within a table MUST be unique by sample number, with
the exception of placeholder points. the exception of placeholder points.
* The previous two notes imply that there MAY be any number of * The previous two notes imply that there MAY be any number of
placeholder points, but they MUST all occur at the end of the placeholder points, but they MUST all occur at the end of the
table. table.
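A seek table parser that applies the length / 18 rule and the ordering notes above might be sketched as follows. The function name parse_seektable is hypothetical, and the input is assumed to be the block data without its METADATA_BLOCK_HEADER.

```python
import struct

PLACEHOLDER = 0xFFFFFFFFFFFFFFFF  # sample number that marks a placeholder point


def parse_seektable(block_data: bytes):
    """Return a list of (sample_number, byte_offset, frame_samples) tuples."""
    if len(block_data) % 18 != 0:
        raise ValueError("seek table length is not a multiple of 18")
    # Each SEEKPOINT is u(64), u(64), u(16), all big-endian: 18 bytes in total.
    points = [
        struct.unpack_from(">QQH", block_data, i)
        for i in range(0, len(block_data), 18)
    ]
    real = [p for p in points if p[0] != PLACEHOLDER]
    if points[: len(real)] != real:
        raise ValueError("placeholder point found before the end of the table")
    for prev, cur in zip(real, real[1:]):
        if cur[0] <= prev[0]:
            raise ValueError("seek points are not sorted and unique")
    return points


table = (
    struct.pack(">QQH", 0, 0, 4096)
    + struct.pack(">QQH", 4096, 1000, 4096)
    + struct.pack(">QQH", PLACEHOLDER, 0, 0)
)
print(parse_seektable(table))
```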
11.12. METADATA_BLOCK_VORBIS_COMMENT 11.15. METADATA_BLOCK_VORBIS_COMMENT
+======+===========================================================+ +======+===========================================================+
| Data | Description | | Data | Description |
+======+===========================================================+ +======+===========================================================+
| u(n) | Also known as FLAC tags, the contents of a vorbis comment | | u(n) | Also known as FLAC tags, the contents of a vorbis comment |
| | packet as specified here (http://www.xiph.org/vorbis/doc/ | | | packet as specified here (http://www.xiph.org/vorbis/doc/ |
| | v-comment.html) (without the framing bit). Note that the | | | v-comment.html) (without the framing bit). Note that the |
| | vorbis comment spec allows for on the order of 2^64 bytes | | | vorbis comment spec allows for on the order of 2^64 bytes |
| | of data whereas the FLAC metadata block is limited to | | | of data whereas the FLAC metadata block is limited to |
| | 2^24 bytes. Given the stated purpose of vorbis comments, | | | 2^24 bytes. Given the stated purpose of vorbis comments, |
| | i.e. human-readable textual information, this limit is | | | i.e. human-readable textual information, this limit is |
| | unlikely to be restrictive. Also note that the 32-bit | | | unlikely to be restrictive. Also note that the 32-bit |
| | field lengths are little-endian coded according to the | | | field lengths are little-endian coded according to the |
| | vorbis spec, as opposed to the usual big-endian coding of | | | vorbis spec, as opposed to the usual big-endian coding of |
| | fixed-length integers in the rest of FLAC. | | | fixed-length integers in the rest of FLAC. |
+------+-----------------------------------------------------------+ +------+-----------------------------------------------------------+
Table 11 Table 11
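The little-endian length fields are easy to get wrong in a format that is otherwise big-endian. The sketch below shows one way to read the block body in Python. It assumes the layout given in the linked vorbis comment specification (vendor string, comment count, then the comments, each string preceded by a 32-bit length); the function name is hypothetical.

```python
import struct


def parse_vorbis_comment(data):
    """Parse a vorbis comment packet body (without the framing bit)."""
    offset = 0

    def read_string():
        nonlocal offset
        # "<I" is a little-endian unsigned 32-bit integer.
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError("string runs past the end of the block")
        text = data[offset:offset + length].decode("utf-8")
        offset += length
        return text

    vendor = read_string()
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    comments = [read_string() for _ in range(count)]
    return vendor, comments
```

Reading the same fields with ">I" (big-endian) would typically yield very large lengths and fail the bounds check.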
11.13. METADATA_BLOCK_CUESHEET 11.16. METADATA_BLOCK_CUESHEET
+===================+==============================================+ +=================+================================================+
| Data | Description | | Data | Description |
+===================+==============================================+ +=================+================================================+
| u(128*8) | Media catalog number, in ASCII printable | | u(128*8) | Media catalog number, in ASCII printable |
| | characters 0x20-0x7E. In general, the media | | | characters 0x20-0x7E. In general, the media |
| | catalog number SHOULD be 0 to 128 bytes | | | catalog number SHOULD be 0 to 128 bytes long; |
| | long; any unused characters SHOULD be right- | | | any unused characters SHOULD be right-padded |
| | padded with NUL characters. For CD-DA, this | | | with NUL characters. For CD-DA, this is a |
| | is a thirteen digit number, followed by 115 | | | thirteen digit number, followed by 115 NUL |
| | NUL bytes. | | | bytes. |
+-------------------+----------------------------------------------+ +-----------------+------------------------------------------------+
| u(64) | The number of lead-in samples. This field | | u(64) | The number of lead-in samples. This field has |
| | has meaning only for CD-DA cuesheets; for | | | meaning only for CD-DA cuesheets; for other |
| | other uses it SHOULD be 0. For CD-DA, the | | | uses it SHOULD be 0. For CD-DA, the lead-in |
| | lead-in is the TRACK 00 area where the table | | | is the TRACK 00 area where the table of |
| | of contents is stored; more precisely, it is | | | contents is stored; more precisely, it is the |
| | the number of samples from the first sample | | | number of samples from the first sample of the |
| | of the media to the first sample of the | | | media to the first sample of the first index |
| | first index point of the first track. | | | point of the first track. According to the |
| | According to the Red Book, the lead-in MUST | | | Red Book, the lead-in MUST be silence and CD |
| | be silence and CD grabbing software does not | | | grabbing software does not usually store it; |
| | usually store it; additionally, the lead-in | | | additionally, the lead-in MUST be at least two |
| | MUST be at least two seconds but MAY be | | | seconds but MAY be longer. For these reasons |
| | longer. For these reasons the lead-in | | | the lead-in length is stored here so that the |
| | length is stored here so that the absolute | | | absolute position of the first track can be |
| | position of the first track can be computed. | | | computed. Note that the lead-in stored here |
| | Note that the lead-in stored here is the | | | is the number of samples up to the first index |
| | number of samples up to the first index | | | point of the first track, not necessarily to |
| | point of the first track, not necessarily to | | | INDEX 01 of the first track; even the first |
| | INDEX 01 of the first track; even the first | | | track MAY have INDEX 00 data. |
| | track MAY have INDEX 00 data. | +-----------------+------------------------------------------------+
+-------------------+----------------------------------------------+ | u(1) | 1 if the CUESHEET corresponds to a Compact |
| u(1) | "1" if the CUESHEET corresponds to a Compact | | | Disc, else 0. |
| | Disc, else "0". | +-----------------+------------------------------------------------+
+-------------------+----------------------------------------------+ | u(7+258*8) | Reserved. All bits MUST be set to zero. |
| u(7+258*8) | Reserved. All bits MUST be set to zero. | +-----------------+------------------------------------------------+
+-------------------+----------------------------------------------+ | u(8) | The number of tracks. MUST be at least 1 |
| u(8) | The number of tracks. MUST be at least 1 | | | (because of the requisite lead-out track). |
| | (because of the requisite lead-out track). | | | For CD-DA, this number MUST be no more than |
| | For CD-DA, this number MUST be no more than | | | 100 (99 regular tracks and one lead-out |
| | 100 (99 regular tracks and one lead-out | | | track). |
| | track). | +-----------------+------------------------------------------------+
+-------------------+----------------------------------------------+ | CUESHEET_TRACK+ | One or more tracks. A CUESHEET block is |
| "CUESHEET_TRACK"+ | One or more tracks. A CUESHEET block is | | | REQUIRED to have a lead-out track; it is |
| | REQUIRED to have a lead-out track; it is | | | always the last track in the CUESHEET. For |
| | always the last track in the CUESHEET. For | | | CD-DA, the lead-out track number MUST be 170 |
| | CD-DA, the lead-out track number MUST be 170 | | | as specified by the Red Book, otherwise it |
| | as specified by the Red Book, otherwise it | | | MUST be 255. |
| | MUST be 255. | +-----------------+------------------------------------------------+
+-------------------+----------------------------------------------+
Table 12 Table 12
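As an illustration of the field layout in the table above, the following Python sketch reads the fixed-size part of a CUESHEET block. The byte offsets are derived from the field widths in the table (128 + 8 + 1 + 258 + 1 bytes); the function name is hypothetical and the CUESHEET_TRACK entries that follow are not parsed.

```python
import struct


def parse_cuesheet_header(block):
    """Read the fixed fields that precede the CUESHEET_TRACK entries."""
    # u(128*8): media catalog number, right-padded with NUL characters.
    catalog = block[:128].rstrip(b"\x00").decode("ascii")
    # u(64): number of lead-in samples, big-endian.
    (lead_in,) = struct.unpack_from(">Q", block, 128)
    # u(1): Compact Disc flag, the top bit of byte 136.
    is_cd = bool(block[136] & 0x80)
    # The remaining 7 bits of byte 136 and the next 258 bytes are reserved.
    num_tracks = block[136 + 1 + 258]
    if num_tracks < 1:
        raise ValueError("a CUESHEET needs at least the lead-out track")
    return catalog, lead_in, is_cd, num_tracks
```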
11.14. CUESHEET_TRACK 11.17. CUESHEET_TRACK
+=======================+=================================================+ +=====================+=================================================+
|Data |Description | |Data |Description |
+=======================+=================================================+ +=====================+=================================================+
|u(64) |Track offset in samples, relative to the | |u(64) |Track offset in samples, relative to the |
| |beginning of the FLAC audio stream. It is the | | |beginning of the FLAC audio stream. It is the |
| |offset to the first index point of the track. | | |offset to the first index point of the track. |
| |(Note how this differs from CD-DA, where the | | |(Note how this differs from CD-DA, where the |
| |track's offset in the TOC is that of the track's | | |track's offset in the TOC is that of the track's |
| |INDEX 01 even if there is an INDEX 00.) For CD- | | |INDEX 01 even if there is an INDEX 00.) For CD- |
| |DA, the offset MUST be evenly divisible by 588 | | |DA, the offset MUST be evenly divisible by 588 |
| |samples (588 samples = 44100 samples/s * 1/75 s).| | |samples (588 samples = 44100 samples/s * 1/75 s).|
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(8) |Track number. A track number of 0 is not allowed| |u(8) |Track number. A track number of 0 is not allowed|
| |to avoid conflicting with the CD-DA spec, which | | |to avoid conflicting with the CD-DA spec, which |
| |reserves this for the lead-in. For CD-DA the | | |reserves this for the lead-in. For CD-DA the |
| |number MUST be 1-99, or 170 for the lead-out; for| | |number MUST be 1-99, or 170 for the lead-out; for|
| |non-CD-DA, the track number MUST be 255 for the  | | |non-CD-DA, the track number MUST be 255 for the  |
| |lead-out. It is not REQUIRED but encouraged to | | |lead-out. It is not REQUIRED but encouraged to |
| |start with track 1 and increase sequentially. | | |start with track 1 and increase sequentially. |
| |Track numbers MUST be unique within a CUESHEET. | | |Track numbers MUST be unique within a CUESHEET. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(12*8) |Track ISRC. This is a 12-digit alphanumeric | |u(12*8) |Track ISRC. This is a 12-digit alphanumeric |
| |code; see here (http://isrc.ifpi.org/) and here | | |code; see here (http://isrc.ifpi.org/) and here |
| |(http://www.disctronics.co.uk/technology/cdaudio/| | |(http://www.disctronics.co.uk/technology/cdaudio/|
| |cdaud_isrc.htm). A value of 12 ASCII NUL | | |cdaud_isrc.htm). A value of 12 ASCII NUL |
| |characters MAY be used to denote absence of an | | |characters MAY be used to denote absence of an |
| |ISRC. | | |ISRC. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(1) |The track type: 0 for audio, 1 for non-audio. | |u(1) |The track type: 0 for audio, 1 for non-audio. |
| |This corresponds to the CD-DA Q-channel control | | |This corresponds to the CD-DA Q-channel control |
| |bit 3. | | |bit 3. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(1) |The pre-emphasis flag: 0 for no pre-emphasis, 1 | |u(1) |The pre-emphasis flag: 0 for no pre-emphasis, 1 |
| |for pre-emphasis. This corresponds to the CD-DA | | |for pre-emphasis. This corresponds to the CD-DA |
| |Q-channel control bit 5; see here | | |Q-channel control bit 5; see here |
| |(http://www.chipchapin.com/CDMedia/cdda9.php3). | | |(http://www.chipchapin.com/CDMedia/cdda9.php3). |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(6+13*8) |Reserved. All bits MUST be set to zero. | |u(6+13*8) |Reserved. All bits MUST be set to zero. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|u(8) |The number of track index points. There MUST be | |u(8) |The number of track index points. There MUST be |
| |at least one index in every track in a CUESHEET | | |at least one index in every track in a CUESHEET |
| |except for the lead-out track, which MUST have | | |except for the lead-out track, which MUST have |
| |zero. For CD-DA, this number SHOULD NOT be more | | |zero. For CD-DA, this number SHOULD NOT be more |
| |than 100. | | |than 100. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
|"CUESHEET_TRACK_INDEX"+|For all tracks except the lead-out track, one or | |CUESHEET_TRACK_INDEX+|For all tracks except the lead-out track, one or |
| |more track index points. | | |more track index points. |
+-----------------------+-------------------------------------------------+ +---------------------+-------------------------------------------------+
Table 13 Table 13
11.15. CUESHEET_TRACK_INDEX 11.18. CUESHEET_TRACK_INDEX
+========+=========================================================+ +========+=========================================================+
| Data | Description | | Data | Description |
+========+=========================================================+ +========+=========================================================+
| u(64) | Offset in samples, relative to the track offset, of the | | u(64) | Offset in samples, relative to the track offset, of the |
| | index point. For CD-DA, the offset MUST be evenly | | | index point. For CD-DA, the offset MUST be evenly |
| | divisible by 588 samples (588 samples = 44100 samples/s | | | divisible by 588 samples (588 samples = 44100 samples/s |
| | * 1/75 s). Note that the offset is from the beginning | | | * 1/75 s). Note that the offset is from the beginning |
| | of the track, not the beginning of the audio data. | | | of the track, not the beginning of the audio data. |
+--------+---------------------------------------------------------+ +--------+---------------------------------------------------------+
skipping to change at page 20, line 40 skipping to change at page 23, line 13
| | 0 corresponds to the track pre-gap. The first index in | | | 0 corresponds to the track pre-gap. The first index in |
| | a track MUST have a number of 0 or 1, and subsequently, | | | a track MUST have a number of 0 or 1, and subsequently, |
| | index numbers MUST increase by 1. Index numbers MUST | | | index numbers MUST increase by 1. Index numbers MUST |
| | be unique within a track. | | | be unique within a track. |
+--------+---------------------------------------------------------+ +--------+---------------------------------------------------------+
| u(3*8) | Reserved. All bits MUST be set to zero. | | u(3*8) | Reserved. All bits MUST be set to zero. |
+--------+---------------------------------------------------------+ +--------+---------------------------------------------------------+
Table 14 Table 14
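Because index offsets are relative to the track and track offsets are relative to the start of the audio stream, the absolute position of an index point is the sum of the two. A minimal Python sketch, with hypothetical names, that also applies the CD-DA 588-sample alignment rule:

```python
CD_FRAME_SAMPLES = 588  # 44100 samples/s * 1/75 s


def absolute_index_offset(track_offset, index_offset, is_cd=False):
    """Return the index point position relative to the start of the audio."""
    if is_cd:
        for offset in (track_offset, index_offset):
            if offset % CD_FRAME_SAMPLES != 0:
                raise ValueError("CD-DA offsets must be multiples of 588")
    return track_offset + index_offset
```

For example, a track starting 150 CD frames into the stream with an index point 2 CD frames into the track places that index point 152 CD frames (152 * 588 samples) into the stream.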
11.16. METADATA_BLOCK_PICTURE 11.19. METADATA_BLOCK_PICTURE
+========+==================================================+ +========+==================================================+
| Data | Description | | Data | Description |
+========+==================================================+ +========+==================================================+
| u(32) | The PICTURE_TYPE according to the ID3v2 APIC | | u(32) | The PICTURE_TYPE according to the ID3v2 APIC |
| | frame. | | | frame. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The length of the MIME type string in bytes. | | u(32) | The length of the MIME type string in bytes. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(n*8) | The MIME type string, in printable ASCII | | u(n*8) | The MIME type string, in printable ASCII |
| | characters 0x20-0x7E. The MIME type MAY also be | | | characters 0x20-0x7E. The MIME type MAY also be |
| | "-->" to signify that the data part is a URL of | | | --> to signify that the data part is a URL of |
| | the picture instead of the picture data itself. | | | the picture instead of the picture data itself. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The length of the description string in bytes. | | u(32) | The length of the description string in bytes. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(n*8) | The description of the picture, in UTF-8. | | u(n*8) | The description of the picture, in UTF-8. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The width of the picture in pixels. | | u(32) | The width of the picture in pixels. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The height of the picture in pixels. | | u(32) | The height of the picture in pixels. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The color depth of the picture in bits-per- | | u(32) | The color depth of the picture in bits-per- |
| | pixel. | | | pixel. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | For indexed-color pictures (e.g. GIF), the | | u(32) | For indexed-color pictures (e.g. GIF), the |
| | number of colors used, or "0" for non-indexed | | | number of colors used, or 0 for non-indexed |
| | pictures. | | | pictures. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(32) | The length of the picture data in bytes. | | u(32) | The length of the picture data in bytes. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
| u(n*8) | The binary picture data. | | u(n*8) | The binary picture data. |
+--------+--------------------------------------------------+ +--------+--------------------------------------------------+
Table 15 Table 15
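The PICTURE block interleaves fixed 32-bit fields with variable-length strings, so a reader has to walk it sequentially. The following Python sketch is one possible reader, not a reference implementation; the function name and the dictionary keys are hypothetical.

```python
import struct


def parse_picture_block(data):
    """Walk the PICTURE block fields in the order given in the table."""
    offset = 0

    def u32():
        nonlocal offset
        (value,) = struct.unpack_from(">I", data, offset)  # big-endian
        offset += 4
        return value

    def raw(length):
        nonlocal offset
        chunk = data[offset:offset + length]
        if len(chunk) != length:
            raise ValueError("field runs past the end of the block")
        offset += length
        return chunk

    picture_type = u32()
    mime_type = raw(u32()).decode("ascii")
    description = raw(u32()).decode("utf-8")
    width, height, depth, colors = u32(), u32(), u32(), u32()
    picture = raw(u32())
    return {
        "type": picture_type,
        "mime": mime_type,
        "description": description,
        "width": width,
        "height": height,
        "depth": depth,
        "colors": colors,
        # When the MIME type is "-->", these bytes are a URL, not image data.
        "data": picture,
    }
```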
11.17. PICTURE_TYPE 11.20. PICTURE_TYPE
+=======+=====================================+ +=======+=====================================+
| Value | Description | | Value | Description |
+=======+=====================================+ +=======+=====================================+
| 0 | Other | | 0 | Other |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 1 | 32x32 pixels 'file icon' (PNG only) | | 1 | 32x32 pixels 'file icon' (PNG only) |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 2 | Other file icon | | 2 | Other file icon |
+-------+-------------------------------------+ +-------+-------------------------------------+
skipping to change at page 22, line 28 skipping to change at page 25, line 4
| 16 | Movie/video screen capture | | 16 | Movie/video screen capture |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 17 | A bright colored fish | | 17 | A bright colored fish |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 18 | Illustration | | 18 | Illustration |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 19 | Band/artist logotype | | 19 | Band/artist logotype |
+-------+-------------------------------------+ +-------+-------------------------------------+
| 20 | Publisher/Studio logotype | | 20 | Publisher/Studio logotype |
+-------+-------------------------------------+ +-------+-------------------------------------+
Table 16 Table 16
Other values are reserved and SHOULD NOT be used. There MAY only be Other values are reserved and SHOULD NOT be used. There MAY only be
one each of picture type 1 and 2 in a file. one each of picture type 1 and 2 in a file.
11.18. FRAME 11.21. FRAME
+==============+=================================+ +==============+=================================+
| Data | Description | | Data | Description |
+==============+=================================+ +==============+=================================+
| FRAME_HEADER | | | FRAME_HEADER | |
+--------------+---------------------------------+ +--------------+---------------------------------+
| "SUBFRAME"+ | One SUBFRAME per channel. | | SUBFRAME+ | One SUBFRAME per channel. |
+--------------+---------------------------------+ +--------------+---------------------------------+
| u(?) | Zero-padding to byte alignment. | | u(?) | Zero-padding to byte alignment. |
+--------------+---------------------------------+ +--------------+---------------------------------+
| FRAME_FOOTER | | | FRAME_FOOTER | |
+--------------+---------------------------------+ +--------------+---------------------------------+
Table 17 Table 17
11.19. FRAME_HEADER 11.22. FRAME_HEADER
+=======+================================+ +=======+================================+
| Data | Description | | Data | Description |
+=======+================================+ +=======+================================+
| u(14) | Sync code '0b11111111111110' | | u(14) | Sync code '0b11111111111110' |
+-------+--------------------------------+ +-------+--------------------------------+
| u(1) | FRAME HEADER RESERVED | | u(1) | FRAME HEADER RESERVED |
+-------+--------------------------------+ +-------+--------------------------------+
| u(1) | BLOCKING STRATEGY | | u(1) | BLOCKING STRATEGY |
+-------+--------------------------------+ +-------+--------------------------------+
skipping to change at page 23, line 37 skipping to change at page 26, line 9
+-------+--------------------------------+ +-------+--------------------------------+
| u(?) | BLOCK SIZE INT | | u(?) | BLOCK SIZE INT |
+-------+--------------------------------+ +-------+--------------------------------+
| u(?) | SAMPLE RATE INT | | u(?) | SAMPLE RATE INT |
+-------+--------------------------------+ +-------+--------------------------------+
| u(8) | FRAME CRC | | u(8) | FRAME CRC |
+-------+--------------------------------+ +-------+--------------------------------+
Table 18 Table 18
11.19.1. FRAME HEADER RESERVED 11.22.1. FRAME HEADER RESERVED
+=======+=========================+ +=======+=========================+
| Value | Description | | Value | Description |
+=======+=========================+ +=======+=========================+
| 0 | mandatory value | | 0 | mandatory value |
+-------+-------------------------+ +-------+-------------------------+
| 1 | reserved for future use | | 1 | reserved for future use |
+-------+-------------------------+ +-------+-------------------------+
Table 19 Table 19
FRAME HEADER RESERVED MUST remain reserved for "0" in order for a FRAME HEADER RESERVED MUST remain reserved for 0 in order for a FLAC
FLAC frame's initial 15 bits to be distinguishable from the start of frame's initial 15 bits to be distinguishable from the start of an
an MPEG audio frame (see also (http://lists.xiph.org/pipermail/flac- MPEG audio frame (see also (http://lists.xiph.org/pipermail/flac-
dev/2008-December/002607.html)). dev/2008-December/002607.html)).
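The 14-bit sync code, the reserved bit and the BLOCKING STRATEGY bit together fill the first two bytes of a frame header. The following Python sketch, with a hypothetical function name, shows how those 16 bits can be split and checked:

```python
SYNC_CODE = 0b11111111111110  # 14 bits


def parse_frame_sync(header):
    """Check the first two header bytes; return the blocking strategy."""
    word = int.from_bytes(header[:2], "big")
    sync = word >> 2             # top 14 bits
    reserved = (word >> 1) & 1   # FRAME HEADER RESERVED
    blocking_strategy = word & 1
    if sync != SYNC_CODE:
        raise ValueError("sync code not found")
    if reserved != 0:
        raise ValueError("FRAME HEADER RESERVED must be 0")
    return "variable" if blocking_strategy else "fixed"
```

Under these rules a valid frame header begins with the bytes 0xFF 0xF8 (fixed-blocksize) or 0xFF 0xF9 (variable-blocksize).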
11.19.2. BLOCKING STRATEGY 11.22.2. BLOCKING STRATEGY
+=======+==================================+ +=======+==================================+
| Value | Description | | Value | Description |
+=======+==================================+ +=======+==================================+
| 0 | fixed-blocksize stream; frame | | 0 | fixed-blocksize stream; frame |
| | header encodes the frame number | | | header encodes the frame number |
+-------+----------------------------------+ +-------+----------------------------------+
| 1 | variable-blocksize stream; frame | | 1 | variable-blocksize stream; frame |
| | header encodes the sample number | | | header encodes the sample number |
+-------+----------------------------------+ +-------+----------------------------------+
Table 20 Table 20
The "BLOCKING STRATEGY" bit MUST be the same throughout the entire The BLOCKING STRATEGY bit MUST be the same throughout the entire
stream. stream.
The "BLOCKING STRATEGY" bit determines how to calculate the sample The BLOCKING STRATEGY bit determines how to calculate the sample
number of the first sample in the frame. If the bit is "0" (fixed- number of the first sample in the frame. If the bit is 0 (fixed-
blocksize), the frame header encodes the frame number as above, and blocksize), the frame header encodes the frame number as above, and
the frame's starting sample number will be the frame number times the the frame's starting sample number will be the frame number times the
blocksize. If it is "1" (variable-blocksize), the frame header blocksize. If it is 1 (variable-blocksize), the frame header encodes
encodes the frame's starting sample number itself. (In the case of a the frame's starting sample number itself. (In the case of a fixed-
fixed-blocksize stream, only the last block MAY be shorter than the blocksize stream, only the last block MAY be shorter than the stream
stream blocksize; its starting sample number will be calculated as blocksize; its starting sample number will be calculated as the frame
the frame number times the previous frame's blocksize, or zero if it number times the previous frame's blocksize, or zero if it is the
is the first frame). first frame).
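The calculation described above can be sketched in Python as follows. The function and parameter names are hypothetical; for the final, possibly shorter, frame of a fixed-blocksize stream the caller is assumed to pass the blocksize of the preceding frames rather than the short frame's own blocksize.

```python
def first_sample_number(blocking_strategy, coded_number, stream_blocksize):
    """Return the sample number of the first sample in a frame."""
    if blocking_strategy == 0:
        # Fixed-blocksize: the header holds the frame number.
        return coded_number * stream_blocksize
    # Variable-blocksize: the header holds the sample number itself.
    return coded_number
```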
11.19.3. INTERCHANNEL SAMPLE BLOCK SIZE 11.22.3. INTERCHANNEL SAMPLE BLOCK SIZE
+=================+=========================================+ +=================+=========================================+
| Value | Description | | Value | Description |
+=================+=========================================+ +=================+=========================================+
| 0b0000 | reserved | | 0b0000 | reserved |
+-----------------+-----------------------------------------+ +-----------------+-----------------------------------------+
| 0b0001 | 192 samples | | 0b0001 | 192 samples |
+-----------------+-----------------------------------------+ +-----------------+-----------------------------------------+
| 0b0010 - 0b0101 | 576 * (2^(n-2)) samples, i.e. 576, | | 0b0010 - 0b0101 | 576 * (2^(n-2)) samples, i.e. 576, |
| | 1152, 2304 or 4608 | | | 1152, 2304 or 4608 |
skipping to change at page 25, line 8 skipping to change at page 27, line 40
+-----------------+-----------------------------------------+ +-----------------+-----------------------------------------+
| 0b0111 | get 16 bit (blocksize-1) from end of | | 0b0111 | get 16 bit (blocksize-1) from end of |
| | header | | | header |
+-----------------+-----------------------------------------+ +-----------------+-----------------------------------------+
| 0b1000 - 0b1111 | 256 * (2^(n-8)) samples, i.e. 256, 512, | | 0b1000 - 0b1111 | 256 * (2^(n-8)) samples, i.e. 256, 512, |
| | 1024, 2048, 4096, 8192, 16384 or 32768 | | | 1024, 2048, 4096, 8192, 16384 or 32768 |
+-----------------+-----------------------------------------+ +-----------------+-----------------------------------------+
Table 21 Table 21
11.19.4. SAMPLE RATE 11.22.4. SAMPLE RATE
+========+=====================================================+ +========+=====================================================+
| Value | Description | | Value | Description |
+========+=====================================================+ +========+=====================================================+
| 0b0000 | get from STREAMINFO metadata block | | 0b0000 | get from STREAMINFO metadata block |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
| 0b0001 | 88.2 kHz | | 0b0001 | 88.2 kHz |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
| 0b0010 | 176.4 kHz | | 0b0010 | 176.4 kHz |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
skipping to change at page 26, line 5 skipping to change at page 28, line 31
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
| 0b1101 | get 16 bit sample rate (in Hz) from end of header | | 0b1101 | get 16 bit sample rate (in Hz) from end of header |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
| 0b1110 | get 16 bit sample rate (in daHz) from end of header | | 0b1110 | get 16 bit sample rate (in daHz) from end of header |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
| 0b1111 | invalid, to prevent sync-fooling string of 1s | | 0b1111 | invalid, to prevent sync-fooling string of 1s |
+--------+-----------------------------------------------------+ +--------+-----------------------------------------------------+
Table 22 Table 22
11.19.5. CHANNEL ASSIGNMENT 11.22.5. CHANNEL ASSIGNMENT
For values 0b0000-0b0111, the value represents the (number of Values 0b0000-0b0111 represent the (number of independent channels)-1
independent channels)-1. Where defined, the channel order follows coded independently, channel order follows SMPTE/ITU-R
SMPTE/ITU-R recommendations. recommendations. Values 0b1000-0b1010 represent 2 channel (stereo)
audio where the signal has been mapped to a different representation,
see section on Interchannel Decorrelation (#interchannel-
decorrelation).
+==========+======================================================+ +==========+======================================================+
| Value | Description | | Value | Description |
+==========+======================================================+ +==========+======================================================+
| 0b0000 | 1 channel: mono | | 0b0000 | 1 channel: mono |
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
| 0b0001 | 2 channels: left, right | | 0b0001 | 2 channels: left, right |
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
| 0b0010 | 3 channels: left, right, center | | 0b0010 | 3 channels: left, right, center |
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
skipping to change at page 27, line 5 skipping to change at page 29, line 44
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
| 0b1010 | mid/side stereo: channel 0 is the mid(average) | | 0b1010 | mid/side stereo: channel 0 is the mid(average) |
| | channel, channel 1 is the side(difference) channel | | | channel, channel 1 is the side(difference) channel |
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
| 0b1011 - | reserved | | 0b1011 - | reserved |
| 0b1111 | | | 0b1111 | |
+----------+------------------------------------------------------+ +----------+------------------------------------------------------+
Table 23 Table 23
11.19.6. SAMPLE SIZE Please note that the actual coded subframe order for right/side
stereo is side-right.
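As a small illustration of the value ranges described in this section, the following Python sketch (hypothetical function name) maps a CHANNEL ASSIGNMENT value to the number of subframes a decoder should expect:

```python
def channel_count(assignment):
    """Return the number of channels for a 4-bit CHANNEL ASSIGNMENT value."""
    if 0b0000 <= assignment <= 0b0111:
        return assignment + 1  # (number of independent channels) - 1
    if 0b1000 <= assignment <= 0b1010:
        return 2  # stereo with interchannel decorrelation
    raise ValueError("reserved channel assignment")
```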
11.22.6. SAMPLE SIZE
+=======+====================================+ +=======+====================================+
| Value | Description | | Value | Description |
+=======+====================================+ +=======+====================================+
| 0b000 | get from STREAMINFO metadata block | | 0b000 | get from STREAMINFO metadata block |
+-------+------------------------------------+ +-------+------------------------------------+
| 0b001 | 8 bits per sample | | 0b001 | 8 bits per sample |
+-------+------------------------------------+ +-------+------------------------------------+
| 0b010 | 12 bits per sample | | 0b010 | 12 bits per sample |
+-------+------------------------------------+ +-------+------------------------------------+
skipping to change at page 27, line 33 skipping to change at page 30, line 33
+-------+------------------------------------+ +-------+------------------------------------+
| 0b111 | reserved | | 0b111 | reserved |
+-------+------------------------------------+ +-------+------------------------------------+
Table 24 Table 24
For subframes that encode a difference channel, the sample size is For subframes that encode a difference channel, the sample size is
one bit larger than the sample size of the frame, in order to be able one bit larger than the sample size of the frame, in order to be able
to encode the difference between extreme values. to encode the difference between extreme values.
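The extra bit can be verified with a short calculation. The Python sketch below, with a hypothetical function name, computes the signed width needed to hold the difference between the two extreme sample values of a given sample size:

```python
def difference_bits(sample_bits):
    """Return the signed bit width needed for a difference of two samples."""
    lowest = -(1 << (sample_bits - 1))
    highest = (1 << (sample_bits - 1)) - 1
    widest = highest - lowest  # largest magnitude a difference can reach
    # One sign bit plus enough bits to hold the magnitude.
    return widest.bit_length() + 1
```

For 16-bit audio the extremes are -32768 and 32767, so a difference can reach 65535 in magnitude, which needs 17 bits as a signed value.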
11.19.7. FRAME HEADER RESERVED2 11.22.7. FRAME HEADER RESERVED2
+=======+=========================+ +=======+=========================+
| Value | Description | | Value | Description |
+=======+=========================+ +=======+=========================+
| 0 | mandatory value | | 0 | mandatory value |
+-------+-------------------------+ +-------+-------------------------+
| 1 | reserved for future use | | 1 | reserved for future use |
+-------+-------------------------+ +-------+-------------------------+
Table 25 Table 25
11.19.8. CODED NUMBER 11.22.8. CODED NUMBER
Frame/Sample numbers are encoded using the UTF-8 format as it was Frame/Sample numbers are encoded using the UTF-8 format as it was
before RFC3629 limited it to 4 bytes; this variant supports the before RFC3629 limited it to 4 bytes; this variant supports the
original 7 byte maximum. original 7 byte maximum.
Note to implementors: All Unicode compliant UTF-8 decoders and Note to implementors: All Unicode compliant UTF-8 decoders and
encoders are limited to 4 bytes, so it is best to write your own encoders are limited to 4 bytes, so it is best to write your own
one-off solution. one-off solution.
if(variable blocksize) if(variable blocksize)
`u(8...56)`: "UTF-8" coded sample number (decoded number is 36 bits) `u(8...56)`: "UTF-8" coded sample number (decoded number is 36 bits)
else else
`u(8...48)`: "UTF-8" coded frame number (decoded number is 31 bits) `u(8...48)`: "UTF-8" coded frame number (decoded number is 31 bits)
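The coded number scheme described above can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name decode_coded_number and its return convention are assumptions made for this example. It counts the leading 1 bits of the first byte to find the total length (up to 7 bytes) and then collects 6 payload bits from each continuation byte.

```python
def decode_coded_number(data):
    """Decode one extended "UTF-8" coded number.

    Returns (value, bytes_used). Hypothetical helper for illustration.
    """
    first = data[0]
    if first < 0x80:
        # 0xxxxxxx: single byte carrying 7 payload bits
        return first, 1

    # The number of leading 1 bits in the first byte is the total length.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length == 1 or length > 7:
        raise ValueError("invalid leading byte")
    if len(data) < length:
        raise ValueError("truncated coded number")

    # Bits below the terminating 0 bit of the first byte are payload.
    value = first & (mask - 1)
    for byte in data[1:length]:
        if byte & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value, length
```

A 7-byte sequence carries 6 * 6 = 36 payload bits (sample number) and a 6-byte sequence carries 1 + 5 * 6 = 31 payload bits (frame number), matching the sizes given above.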
11.19.9. BLOCK SIZE INT 11.22.9. BLOCK SIZE INT
if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0110) if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0110)
8 bit (blocksize-1) 8 bit (blocksize-1)
else if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0111) else if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0111)
16 bit (blocksize-1) 16 bit (blocksize-1)
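As a rough sketch of the rule above, the following hypothetical helper read_block_size_int takes the 4-bit block size code from the frame header and the bytes that follow the coded number. It assumes the 16-bit field is stored big-endian, which is not stated in this section.

```python
def read_block_size_int(code, data):
    """Return (blocksize, bytes_used); (None, 0) if no field is present.

    Assumption: multi-byte values are big-endian.
    """
    if code == 0b0110:
        return data[0] + 1, 1                       # 8 bit (blocksize-1)
    if code == 0b0111:
        return int.from_bytes(data[:2], "big") + 1, 2  # 16 bit (blocksize-1)
    return None, 0
```

Because the stored value is blocksize-1, the 8-bit form covers block sizes 1 to 256 and the 16-bit form covers 1 to 65536.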
11.19.10. SAMPLE RATE INT 11.22.10. SAMPLE RATE INT
if(`SAMPLE RATE` == 0b1100) if(`SAMPLE RATE` == 0b1100)
8 bit sample rate (in kHz) 8 bit sample rate (in kHz)
else if(`SAMPLE RATE` == 0b1101) else if(`SAMPLE RATE` == 0b1101)
16 bit sample rate (in Hz) 16 bit sample rate (in Hz)
else if(`SAMPLE RATE` == 0b1110) else if(`SAMPLE RATE` == 0b1110)
16 bit sample rate (in daHz) 16 bit sample rate (in daHz)
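The three cases above differ only in field width and unit. A minimal sketch, assuming big-endian storage for the 16-bit fields (not stated in this section) and using the hypothetical name read_sample_rate_int, normalizes all of them to Hz:

```python
def read_sample_rate_int(code, data):
    """Return (sample_rate_hz, bytes_used); (None, 0) if no field is present.

    Assumption: multi-byte values are big-endian.
    """
    if code == 0b1100:
        return data[0] * 1000, 1                        # 8 bit, kHz
    if code == 0b1101:
        return int.from_bytes(data[:2], "big"), 2       # 16 bit, Hz
    if code == 0b1110:
        return int.from_bytes(data[:2], "big") * 10, 2  # 16 bit, daHz
    return None, 0
```

For example, 44100 Hz can be stored either as 0xAC44 with code 0b1101 or as 0x113A (4410 daHz) with code 0b1110.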
11.19.11. FRAME CRC 11.22.11. FRAME CRC
CRC-8 (polynomial = x^8 + x^2 + x^1 + x^0, initialized with 0) of CRC-8 (polynomial = x^8 + x^2 + x^1 + x^0, initialized with 0) of
everything before the CRC, including the sync code everything before the CRC, including the sync code
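A bitwise computation of this CRC-8 might look like the sketch below. The polynomial x^8 + x^2 + x^1 + x^0 corresponds to 0x07 once the implicit x^8 term is dropped. Processing bits most-significant first, without reflection and without a final XOR, is an assumption of this example; the function name crc8 is likewise illustrative.

```python
def crc8(data):
    """CRC-8, polynomial 0x07, initial value 0, MSB-first (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

A decoder would run this over all frame header bytes preceding the CRC byte, starting at the sync code, and compare the result with the stored value.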
11.20. FRAME_FOOTER 11.23. FRAME_FOOTER
+=======+===================================================+ +=======+===================================================+
| Data | Description | | Data | Description |
+=======+===================================================+ +=======+===================================================+
| u(16) | CRC-16 (polynomial = x^16 + x^15 + x^2 + x^0, | | u(16) | CRC-16 (polynomial = x^16 + x^15 + x^2 + x^0, |
| | initialized with 0) of everything before the CRC, | | | initialized with 0) of everything before the CRC, |
| | back to and including the frame header sync code | | | back to and including the frame header sync code |
+-------+---------------------------------------------------+ +-------+---------------------------------------------------+
Table 26 Table 26
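The footer CRC-16 can be sketched the same way. The polynomial x^16 + x^15 + x^2 + x^0 corresponds to 0x8005; MSB-first processing without reflection or final XOR is an assumption of this example, as is the name crc16.

```python
def crc16(data):
    """CRC-16, polynomial 0x8005, initial value 0, MSB-first (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

The input is every byte of the frame from the sync code up to, but not including, the two CRC bytes.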
11.21. SUBFRAME 11.24. SUBFRAME
+============================================+======================+ +========================================+======================+
| Data | Description | | Data | Description |
+============================================+======================+ +========================================+======================+
| SUBFRAME_HEADER | | | SUBFRAME_HEADER | |
+--------------------------------------------+----------------------+ +----------------------------------------+----------------------+
| "SUBFRAME_CONSTANT" || "SUBFRAME_FIXED" || | The SUBFRAME_HEADER | | SUBFRAME_CONSTANT || SUBFRAME_FIXED || | The SUBFRAME_HEADER |
| "SUBFRAME_LPC" || "SUBFRAME_VERBATIM" | specifies which | | SUBFRAME_LPC || SUBFRAME_VERBATIM | specifies which one. |
| | one. | +----------------------------------------+----------------------+
+--------------------------------------------+----------------------+
Table 27 Table 27
11.22. SUBFRAME_HEADER 11.25. SUBFRAME_HEADER
+========+========================================================+ +========+========================================================+
| Data | Description | | Data | Description |
+========+========================================================+ +========+========================================================+
| u(1) | Zero bit padding, to prevent sync-fooling string of 1s | | u(1) | Zero bit padding, to prevent sync-fooling string of 1s |
+--------+--------------------------------------------------------+ +--------+--------------------------------------------------------+
| u(6) | "SUBFRAME TYPE" (see section on SUBFRAME TYPE | | u(6) | SUBFRAME TYPE (see section on SUBFRAME TYPE |
| | (#subframe-type)) | | | (#subframe-type)) |
+--------+--------------------------------------------------------+ +--------+--------------------------------------------------------+
| u(1+k) | "WASTED BITS PER SAMPLE FLAG" (see section on WASTED | | u(1+k) | WASTED BITS PER SAMPLE FLAG (see section on WASTED |
| | BITS PER SAMPLE FLAG (#wasted-bits-per-sample-flag)) | | | BITS PER SAMPLE FLAG (#wasted-bits-per-sample-flag)) |
+--------+--------------------------------------------------------+ +--------+--------------------------------------------------------+
Table 28 Table 28
11.22.1. SUBFRAME TYPE 11.25.1. SUBFRAME TYPE
+==========+================================+ +==========+=======================================================+
| Value | Description | | Value | Description |
+==========+================================+ +==========+=======================================================+
| 0b000000 | SUBFRAME_CONSTANT | | 0b000000 | SUBFRAME_CONSTANT |
+----------+--------------------------------+ +----------+-------------------------------------------------------+
| 0b000001 | SUBFRAME_VERBATIM | | 0b000001 | SUBFRAME_VERBATIM |
+----------+--------------------------------+ +----------+-------------------------------------------------------+
| 0b00001x | reserved | | 0b00001x | reserved |
+----------+--------------------------------+ +----------+-------------------------------------------------------+
| 0b0001xx | reserved | | 0b0001xx | reserved |
+----------+--------------------------------+ +----------+-------------------------------------------------------+
| 0b001xxx | if(xxx <= 4) "SUBFRAME_FIXED", | | 0b001xxx | if(xxx <= 4) SUBFRAME_FIXED, xxx=order; else reserved |
| | xxx=order; else reserved | +----------+-------------------------------------------------------+
+----------+--------------------------------+ | 0b01xxxx | reserved |
| 0b01xxxx | reserved | +----------+-------------------------------------------------------+
+----------+--------------------------------+ | 0b1xxxxx | SUBFRAME_LPC, xxxxx=order-1 |
| 0b1xxxxx | "SUBFRAME_LPC", xxxxx=order-1 | +----------+-------------------------------------------------------+
+----------+--------------------------------+ Table 29
Table 29 11.25.2. WASTED BITS PER SAMPLE FLAG
11.22.2. WASTED BITS PER SAMPLE FLAG Certain file formats, like AIFF, can store audio samples with a bit
depth that is not an integer number of bytes by padding them with
least significant zero bits to a bit depth that is an integer number
of bytes. For example, shifting a 14-bit sample left by 2 pads it
to a 16-bit sample, which then has two zero least-significant bits.
In this specification, these least-significant zero bits are referred
to as wasted bits-per-sample or simply wasted bits. They are wasted
in the sense that they contain no information, but are stored anyway.
+=======+==============================================+ The wasted bits-per-sample flag in a subframe header is set to 1 if a
| Value | Description | certain number of least-significant bits of all samples in the
+=======+==============================================+ current subframe are zero. If this is the case, the number of wasted
| 0 | no wasted bits-per-sample in source | bits-per-sample (k) minus 1 follows the flag in a unary encoding.
| | subblock, k=0 | For example, if k is 3, 0b001 follows. If k = 0, the wasted bits-
+-------+----------------------------------------------+ per-sample flag is 0 and no unary coded k follows.
| 1 | k wasted bits-per-sample in source subblock, |
| | k-1 follows, unary coded; e.g. k=3 => 0b001 |
| | follows, k=7 => 0b0000001 follows. |
+-------+----------------------------------------------+
Table 30 In case k is not equal to 0, samples are coded ignoring k least-
significant bits. For example, if the preceding frame header
specified a sample size of 16 bits per sample and k is 3, samples in
the subframe are coded as 13 bits per sample. A decoder MUST add k
least-significant zero bits by shifting left (padding) after decoding
a subframe sample. In case the frame has left/side, right/side or
mid/side stereo, padding MUST happen to a sample before it is used to
reconstruct a left or right sample.
The size of the samples stored in the subframe is the subframe sample Besides audio files that have a certain number of wasted bits for the
size reduced by k bits. Decoded samples must be shifted left by k whole file, there exist audio files in which the number of wasted
bits. bits varies. There are DVD-Audio discs in which blocks of samples
have had their least-significant bits selectively zeroed so as to
slightly improve the compression of their otherwise lossless Meridian
Lossless Packing codec. There are also audio processors like
lossyWAV that enable users to improve compression of their files by a
lossless audio codec in a non-lossless way. Because of this the
number of wasted bits k MAY change between frames and MAY differ
between subframes.
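The flag and the unary coded k-1 described above can be read as in the following sketch. It assumes the bitstream is available as an iterator yielding 0 or 1, and that the unary code is a run of zero bits terminated by a one bit, as in the 0b001 example. The names read_wasted_bits and restore_wasted_bits are hypothetical.

```python
def read_wasted_bits(bits):
    """Read the wasted bits-per-sample flag and, if set, the unary k-1.

    bits: iterator yielding 0 or 1 (assumed representation).
    Returns k, the number of wasted bits per sample.
    """
    if next(bits) == 0:
        return 0            # flag is 0: no unary code follows
    k = 1
    while next(bits) == 0:  # each zero bit adds one to k-1
        k += 1
    return k


def restore_wasted_bits(samples, k):
    """Pad decoded subframe samples with k least-significant zero bits."""
    return [sample << k for sample in samples]
```

With a 16-bit frame sample size and k = 3, the subframe samples are decoded as 13-bit values and then passed through restore_wasted_bits before any left/side, right/side or mid/side reconstruction.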
11.23. SUBFRAME_CONSTANT 11.26. SUBFRAME_CONSTANT
+======+========================================+ +======+========================================+
| Data | Description | | Data | Description |
+======+========================================+ +======+========================================+
| u(n) | Unencoded constant value of the | | u(n) | Unencoded constant value of the |
| | subblock, n = frame's bits-per-sample. | | | subblock, n = frame's bits-per-sample. |
+------+----------------------------------------+ +------+----------------------------------------+
Table 30
Table 31 11.27. SUBFRAME_FIXED
11.24. SUBFRAME_FIXED
+==========+========================================+ +==========+========================================+
| Data | Description | | Data | Description |
+==========+========================================+ +==========+========================================+
| u(n) | Unencoded warm-up samples (n = frame's | | u(n) | Unencoded warm-up samples (n = frame's |
| | bits-per-sample * predictor order). | | | bits-per-sample * predictor order). |
+----------+----------------------------------------+ +----------+----------------------------------------+
| RESIDUAL | Encoded residual | | RESIDUAL | Encoded residual |
+----------+----------------------------------------+ +----------+----------------------------------------+
Table 32 Table 31
11.25. SUBFRAME_LPC 11.28. SUBFRAME_LPC
+==========+========================================================+ +==========+========================================================+
| Data | Description | | Data | Description |
+==========+========================================================+ +==========+========================================================+
| u(n) | Unencoded warm-up samples (n = frame's bits- | | u(n) | Unencoded warm-up samples (n = frame's bits- |
| | per-sample * lpc order). | | | per-sample * lpc order). |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
| u(4) | (quantized linear predictor coefficients' | | u(4) | (quantized linear predictor coefficients' |
| | precision in bits)-1 (NOTE: 0b1111 is invalid). | | | precision in bits)-1 (NOTE: 0b1111 is invalid). |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
skipping to change at page 31, line 27 skipping to change at page 34, line 41
| | needed in bits (NOTE: this number is signed | | | needed in bits (NOTE: this number is signed |
| | two's-complement). | | | two's-complement). |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
| u(n) | Unencoded predictor coefficients (n = qlp coeff | | u(n) | Unencoded predictor coefficients (n = qlp coeff |
| | precision * lpc order) (NOTE: the coefficients | | | precision * lpc order) (NOTE: the coefficients |
| | are signed two's-complement). | | | are signed two's-complement). |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
| RESIDUAL | Encoded residual | | RESIDUAL | Encoded residual |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
Table 33 Table 32
11.26. SUBFRAME_VERBATIM
+=========+===============================================+ 11.29. SUBFRAME_VERBATIM
| Data | Description |
+=========+===============================================+
| u(n\*i) | Unencoded subblock, where "n" is frame's |
| | bits-per-sample and "i" is frame's blocksize. |
+---------+-----------------------------------------------+
Table 34 +=========+=============================================+
| Data | Description |
+=========+=============================================+
| u(n\*i) | Unencoded subblock, where n is frame's |
| | bits-per-sample and i is frame's blocksize. |
+---------+---------------------------------------------+
Table 33
11.27. RESIDUAL 11.30. RESIDUAL
+==================================================+======================+ +================================================+======================+
|Data |Description | |Data |Description |
+==================================================+======================+ +================================================+======================+
|u(2) |RESIDUAL_CODING_METHOD| |u(2) |RESIDUAL_CODING_METHOD|
+--------------------------------------------------+----------------------+ +------------------------------------------------+----------------------+
|"RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB" ||| | |RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB ||| |
|"RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2" | | |RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 | |
+--------------------------------------------------+----------------------+ +------------------------------------------------+----------------------+
Table 35 Table 34
11.27.1. RESIDUAL_CODING_METHOD 11.30.1. RESIDUAL_CODING_METHOD
+=======+========================================================+ +=======+========================================================+
| Value | Description | | Value | Description |
+=======+========================================================+ +=======+========================================================+
| 0b00 | partitioned Exp-Golomb coding with 4-bit Exp-Golomb | | 0b00 | partitioned Exp-Golomb coding with 4-bit Exp-Golomb |
| | parameter; | | | parameter; |
| | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB follows | | | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB follows |
+-------+--------------------------------------------------------+ +-------+--------------------------------------------------------+
| 0b01 | partitioned Exp-Golomb coding with 5-bit Exp-Golomb | | 0b01 | partitioned Exp-Golomb coding with 5-bit Exp-Golomb |
| | parameter; | | | parameter; |
| | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 follows | | | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 follows |
+-------+--------------------------------------------------------+ +-------+--------------------------------------------------------+
| 0b10 | reserved | | 0b10 | reserved |
| - | | | - | |
| 0b11 | | | 0b11 | |
+-------+--------------------------------------------------------+ +-------+--------------------------------------------------------+
Table 36 Table 35
11.27.2. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB 11.30.2. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB
+=========================+===================================+ +=======================+===================================+
| Data | Description | | Data | Description |
+=========================+===================================+ +=======================+===================================+
| u(4) | Partition order. | | u(4) | Partition order. |
+-------------------------+-----------------------------------+ +-----------------------+-----------------------------------+
| "EXP_GOLOMB_PARTITION"+ | There will be 2^order partitions. | | EXP_GOLOMB_PARTITION+ | There will be 2^order partitions. |
+-------------------------+-----------------------------------+ +-----------------------+-----------------------------------+
Table 37 Table 36
11.27.2.1. EXP_GOLOMB_PARTITION 11.30.2.1. EXP_GOLOMB_PARTITION
+==========+====================================================+ +==========+====================================================+
| Data | Description | | Data | Description |
+==========+====================================================+ +==========+====================================================+
| u(4(+5)) | "EXP-GOLOMB PARTITION ENCODING PARAMETER" (see | | u(4(+5)) | EXP-GOLOMB PARTITION ENCODING PARAMETER (see |
| | section on EXP-GOLOMB PARTITION ENCODING PARAMETER | | | section on EXP-GOLOMB PARTITION ENCODING PARAMETER |
| | (#exp-golomb-partition-encoding-parameter)) | | | (#exp-golomb-partition-encoding-parameter)) |
+----------+----------------------------------------------------+ +----------+----------------------------------------------------+
| u(?) | "ENCODED RESIDUAL" (see section on ENCODED | | u(?) | ENCODED RESIDUAL (see section on ENCODED RESIDUAL |
| | RESIDUAL (#encoded-residual)) | | | (#encoded-residual)) |
+----------+----------------------------------------------------+ +----------+----------------------------------------------------+
Table 38 Table 37
11.27.2.2. EXP GOLOMB PARTITION ENCODING PARAMETER 11.30.2.2. EXP GOLOMB PARTITION ENCODING PARAMETER
+==========+==========================================+ +==========+==========================================+
| Value | Description | | Value | Description |
+==========+==========================================+ +==========+==========================================+
| 0b0000 - | Exp-golomb parameter. | | 0b0000 - | Exp-golomb parameter. |
| 0b1110 | | | 0b1110 | |
+----------+------------------------------------------+ +----------+------------------------------------------+
| 0b1111 | Escape code, meaning the partition is in | | 0b1111 | Escape code, meaning the partition is in |
| | unencoded binary form using n bits per | | | unencoded binary form using n bits per |
| | sample; n follows as a 5-bit number. | | | sample; n follows as a 5-bit number. |
+----------+------------------------------------------+ +----------+------------------------------------------+
Table 39 Table 38
11.27.3. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 11.30.3. RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2
+==========================+===================================+ +========================+===================================+
| Data | Description | | Data | Description |
+==========================+===================================+ +========================+===================================+
| u(4) | Partition order. | | u(4) | Partition order. |
+--------------------------+-----------------------------------+ +------------------------+-----------------------------------+
| "EXP-GOLOMB2_PARTITION"+ | There will be 2^order partitions. | | EXP-GOLOMB2_PARTITION+ | There will be 2^order partitions. |
+--------------------------+-----------------------------------+ +------------------------+-----------------------------------+
Table 40 Table 39
11.27.3.1. EXP_GOLOMB2_PARTITION 11.30.3.1. EXP_GOLOMB2_PARTITION
+==========+=====================================================+ +==========+=====================================================+
| Data | Description | | Data | Description |
+==========+=====================================================+ +==========+=====================================================+
| u(5(+5)) | "EXP-GOLOMB2 PARTITION ENCODING PARAMETER" (see | | u(5(+5)) | EXP-GOLOMB2 PARTITION ENCODING PARAMETER (see |
| | section on EXP-GOLOMB2 PARTITION ENCODING PARAMETER | | | section on EXP-GOLOMB2 PARTITION ENCODING PARAMETER |
| | (#expgolomb2-partition-encoding-parameter)) | | | (#expgolomb2-partition-encoding-parameter)) |
+----------+-----------------------------------------------------+ +----------+-----------------------------------------------------+
| u(?) | "ENCODED RESIDUAL" (see section on ENCODED RESIDUAL | | u(?) | ENCODED RESIDUAL (see section on ENCODED RESIDUAL |
| | (#encoded-residual)) | | | (#encoded-residual)) |
+----------+-----------------------------------------------------+ +----------+-----------------------------------------------------+
Table 41 Table 40
11.27.3.2. EXP-GOLOMB2 PARTITION ENCODING PARAMETER 11.30.3.2. EXP-GOLOMB2 PARTITION ENCODING PARAMETER
+===========+==========================================+ +===========+==========================================+
| Value | Description | | Value | Description |
+===========+==========================================+ +===========+==========================================+
| 0b00000 - | Exp-golomb parameter. | | 0b00000 - | Exp-golomb parameter. |
| 0b11110 | | | 0b11110 | |
+-----------+------------------------------------------+ +-----------+------------------------------------------+
| 0b11111 | Escape code, meaning the partition is in | | 0b11111 | Escape code, meaning the partition is in |
| | unencoded binary form using n bits per | | | unencoded binary form using n bits per |
| | sample; n follows as a 5-bit number. | | | sample; n follows as a 5-bit number. |
+-----------+------------------------------------------+ +-----------+------------------------------------------+
Table 42 Table 41
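Both partition types start with an encoding parameter whose all-ones value acts as an escape code. The sketch below reads that parameter for either coding method. It assumes a bit iterator yielding 0 or 1 with the most-significant bit first; the helper names read_uint and read_partition_parameter, and the tuple return values, are illustrative only.

```python
def read_uint(bits, width):
    """Read an unsigned integer of `width` bits, MSB first (assumed)."""
    value = 0
    for _ in range(width):
        value = (value << 1) | next(bits)
    return value


def read_partition_parameter(bits, coding_method):
    """Read a partition's encoding parameter.

    coding_method: the 2-bit RESIDUAL_CODING_METHOD value.
    Returns ("parameter", p) or ("escape", n), where n is the number
    of bits per unencoded sample.
    """
    if coding_method == 0b00:
        width = 4
    elif coding_method == 0b01:
        width = 5
    else:
        raise ValueError("reserved residual coding method")
    parameter = read_uint(bits, width)
    if parameter == (1 << width) - 1:      # 0b1111 or 0b11111
        return "escape", read_uint(bits, 5)
    return "parameter", parameter
```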
11.27.4. ENCODED RESIDUAL 11.30.4. ENCODED RESIDUAL
The number of samples (n) in the partition is determined as follows: The number of samples (n) in the partition is determined as follows:
* if the partition order is zero, n = frame's blocksize - predictor * if the partition order is zero, n = frame's blocksize - predictor
order order
* else if this is not the first partition of the subframe, n = * else if this is not the first partition of the subframe, n =
(frame's blocksize / (2^partition order)) (frame's blocksize / (2^partition order))
* else n = (frame's blocksize / (2^partition order)) - predictor * else n = (frame's blocksize / (2^partition order)) - predictor
order order
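The three cases above can be written as a small function. This is a sketch under the assumption that the blocksize is evenly divisible by 2^partition order; the name partition_sample_count and the zero-based partition_index argument are introduced here for illustration.

```python
def partition_sample_count(blocksize, predictor_order,
                           partition_order, partition_index):
    """Number of residual samples (n) in one partition.

    partition_index is zero-based (assumed convention).
    """
    if partition_order == 0:
        return blocksize - predictor_order
    per_partition = blocksize >> partition_order  # blocksize / 2^order
    if partition_index > 0:
        return per_partition
    # The first partition is shortened by the warm-up samples.
    return per_partition - predictor_order
```

For a blocksize of 4096, predictor order 8 and partition order 3, the first partition holds 504 samples and the remaining seven hold 512 each, for a total of 4088 = 4096 - 8 residual samples.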
12. Normative References 12. Security Considerations
Like any other codec (such as [RFC6716]), FLAC should not be used
with insecure ciphers or cipher modes that are vulnerable to known
plaintext attacks. Some of the header bits as well as the padding
are easily predictable.
Implementations of the FLAC codec need to take appropriate security
considerations into account. Those related to denial of service are
outlined in Section 2.1 of [RFC4732]. It is extremely important for
the decoder to be robust against malicious payloads. Malicious
payloads MUST NOT cause the decoder to overrun its allocated memory
or to take an excessive amount of resources to decode. An overrun in
allocated memory could lead to arbitrary code execution by an
attacker. The same applies to the encoder, even though problems in
encoders are typically rarer. Malicious audio streams MUST NOT cause
the encoder to misbehave because this would allow an attacker to
attack transcoding gateways. An example is allocating more memory
than available especially with blocksizes of more than 10000 or with
big metadata blocks, or not allocating enough memory before copying
data, which lead to execution of malicious code, crashes, freezes or
reboots on some known implementations. See the FLAC decoder
testbench (https://wiki.hydrogenaud.io/
index.php?title=FLAC_decoder_testbench) for a non-exhaustive list of
FLAC files with extreme configurations which lead to crashes or
reboots on some known implementations.
None of the content carried in FLAC is intended to be executable.
13. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
Denial-of-Service Considerations", RFC 4732,
DOI 10.17487/RFC4732, December 2006,
<https://www.rfc-editor.org/info/rfc4732>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
14. Informative References
[RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
September 2012, <https://www.rfc-editor.org/info/rfc6716>.
Authors' Addresses Authors' Addresses
Michael Sandelman Michael Richardson
Email: mcr@sandelman.ca Email: mcr@sandelman.ca
Andrew Weaver Andrew Weaver
Email: theandrewjw@gmail.com Email: theandrewjw@gmail.com
 End of changes. 144 change blocks. 
479 lines changed or deleted 646 lines changed or added
