[SciPy-User] Wave files / PCM question

David david@silveregg.co...
Sun Nov 7 18:44:03 CST 2010


On 11/08/2010 07:51 AM, Dan Goodman wrote:
> Hi all,
>
> In a linear PCM encoded wave file, the samples are typically stored
> either as unsigned bytes or signed 16 bit integers. Does anyone know
> (and preferably have a solid reference for) the correct conversion for
> both of these types to floats between -1 and 1?
>
> My assumption would be that no possible values should be wasted, so that
> -1 should correspond to 0 (or -2**15) and +1 should correspond to 255
> (or 2**15-1) for 8 (or 16) bit samples. But this has the odd feature
> that 0 is not represented, as it would have to correspond to 127.5 (or
> -0.5). That doesn't bother me too much, at least in the case of the
> unsigned bytes, but in the case of the signed 16 bit ints, it means that
> the zero of the signed 16 bit int doesn't correspond to the zero of the
> float, and that essentially the signedness of the 16 bit int is more or
> less ignored.

I think the convention in audio is to keep the asymmetry - after all, 
this asymmetry already exists in the two's complement representation: the 
minimal representable value of a signed int is not the opposite of the 
maximal representable value on most architectures (e.g. for a signed 
8-bit char, CHAR_MIN = -128 and CHAR_MAX = 127).
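
A minimal sketch of what keeping the asymmetry could look like in numpy, 
assuming the scale factor is 2**(N-1), so that 0 stays at 0.0, the most 
negative sample maps to exactly -1.0, and the most positive sample lands 
just below +1.0. The function name pcm_to_float is hypothetical, and a given 
library may well scale differently:

```python
import numpy as np

def pcm_to_float(samples):
    """Map 8-bit unsigned or 16-bit signed PCM samples to floats in [-1, 1).

    Hypothetical helper: assumes scaling by 2**(N-1), which keeps the
    asymmetry of the integer range instead of stretching it to reach +1.0.
    """
    samples = np.asarray(samples)
    if samples.dtype == np.uint8:
        # 8-bit WAV samples are unsigned; 128 is treated as the zero level.
        return (samples.astype(np.float64) - 128.0) / 128.0
    if samples.dtype == np.int16:
        # 16-bit WAV samples are signed; 0 is already the zero level.
        return samples.astype(np.float64) / 32768.0
    raise TypeError("expected uint8 or int16 samples, got %s" % samples.dtype)

# Extremes and midpoint of each integer range.
x16 = pcm_to_float(np.array([-32768, 0, 32767], dtype=np.int16))
x8 = pcm_to_float(np.array([0, 128, 255], dtype=np.uint8))
print(x16)  # -1.0, 0.0, and a value just under 1.0
print(x8)   # -1.0, 0.0, and a value just under 1.0
```

With this choice the float value +1.0 is never produced, which mirrors the 
fact that +2**15 is not representable in a signed 16 bit int.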

> The alternative is that the signedness is used and +/- 1 corresponds to
> +/- 2**15-1, which would mean that the value -2**15 is never used for 16
> bit LPCM, which seems to violate my intuition about how people used to
> design file formats back in the good old days when everything was very
> efficient.

On most non-ancient architectures (i.e. the ones using two's complement 
representation), the range of representable values for an integer 
with N bits is between -2**(N-1) and 2**(N-1)-1, so that negating a 
number can be done by taking its two's complement. That's the origin of 
your confusion, I think.
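
To make the range argument concrete, here is a small sketch using numpy's 
int16 (the helper name negate_twos_complement is made up for illustration). 
It checks the limits reported by np.iinfo and shows that the "invert the 
bits and add one" negation works for every value except the minimum, which 
has no positive counterpart and wraps back to itself:

```python
import numpy as np

info = np.iinfo(np.int16)
# For N = 16 bits the limits are -2**(N-1) and 2**(N-1) - 1.
assert info.min == -2**15
assert info.max == 2**15 - 1

def negate_twos_complement(x):
    """Negate int16 values by inverting all bits and adding one.

    Hypothetical helper for illustration; integer array arithmetic in
    numpy wraps around on overflow, like the underlying hardware.
    """
    x = np.asarray(x, dtype=np.int16)
    return ~x + np.int16(1)

print(negate_twos_complement([5, -5, 0, 32767]))
# The most negative value cannot be negated within 16 bits.
print(negate_twos_complement([-32768]))
```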

If you are interested in audio coding and have access to a library, 
"Introduction to Digital Audio Coding and Standards" by Bosi and 
Goldberg is a good, short (but expensive!) introduction.

> Apologies for slightly offtopic question, although I am using numpy and
> scipy. :)

You may want to use scikits.audiolab, which uses libsndfile for the 
internal int<->float conversion, and is widely used for audio file 
import/export: http://pypi.python.org/pypi/scikits.audiolab/0.11.0

cheers,

David

