[SciPy-Dev] scipy.io.wavfile

Jean-Louis Durrieu jean-louis@durrieu...
Mon Jan 16 14:18:09 CST 2012

Hi everyone,

I have been using scipy.io.wavfile for some time now. I am quite thankful to the person(s) who contributed it, as it makes it easy for me to do research, develop, and have other people use my programs. It is also easier to install than audiolab (sorry, David), though far less powerful (is that better? :) ).

I just found a strange behaviour and wanted to know what could be done about it: for a few wav files, scipy.io.wavfile.read returns the right sampling rate but bad data (actually strings instead of ints).

As it turns out, these files (from the MIREX tempo tracking challenge http://www.music-ir.org/mirex/wiki/2011:Audio_Beat_Tracking) contain unusual chunks, appended after the data chunk, which hold things like annotations or labels. While Audacity and other programs simply ignore these, scipy.io.wavfile.read still reads them and, worst of all, replaces the correct data chunk with these labels (we get a warning saying it does not understand the data type, and it reads rubbish instead).

Well, I was wondering whether this behaviour is intended, or whether the reader should instead do something like:
* If there is a data chunk, read it.
* If there are several data chunks, keep the first and issue a warning.
* If there is a data chunk plus other "funny" ones, keep the data chunk and issue a warning about the others (perhaps listing their chunk_ids?).
* If there is no data chunk, raise an error (with a list of the chunks found?).

That would at least keep it from breaking at the first difficulty, right? Of course, I might be wrong in assuming the above cases are the only ones that could occur, but one has to start somewhere, eh?
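
As a rough illustration of the rules listed above, a chunk walker might look something like the following. This is only a sketch under assumptions: `find_data_chunk` is a made-up name and not part of scipy.io.wavfile, it works on the whole file already read into memory as bytes, and it only locates the raw payload of the data chunk without decoding any samples.

```python
import struct
import warnings


def find_data_chunk(raw):
    """Return the payload of the first 'data' chunk of a RIFF/WAVE byte string.

    Hypothetical helper, for illustration only.
    """
    if raw[:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    pos = 12          # first chunk starts right after the 12-byte RIFF header
    data = None       # payload of the first data chunk, once found
    seen = []         # ids of every chunk encountered, in file order

    while pos + 8 <= len(raw):
        chunk_id = raw[pos:pos + 4]
        (size,) = struct.unpack("<I", raw[pos + 4:pos + 8])
        seen.append(chunk_id)
        if chunk_id == b"data":
            if data is None:
                data = raw[pos + 8:pos + 8 + size]
            else:
                warnings.warn("more than one data chunk; keeping the first")
        # Chunks are padded to an even number of bytes.
        pos += 8 + size + (size & 1)

    if data is None:
        raise ValueError("no data chunk found; chunks seen: %r" % (seen,))

    others = [c for c in seen if c not in (b"fmt ", b"data")]
    if others:
        warnings.warn("ignoring unrecognized chunks: %r" % (others,))
    return data
```

With something like this, a file carrying label or annotation chunks after the audio would still yield its samples, and the warning would name the chunks that were skipped.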

Additionally, I was wondering why numpy does not support 24-bit integers. Quite a few people seem to work with 24-bit audio, so maybe some conversion should be allowed there too, although using numpy.fromfile may not work anymore (unless 24-bit integers are added to the allowed data types...). On this point I'm more curious than pushy, so no need to stress about it :-)
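
For what it is worth, one possible workaround for the missing 24-bit type is to read the samples as raw bytes and widen them to int32 by hand. The sketch below assumes little-endian signed PCM (the usual layout in wav files); `int24_to_int32` and its `n_channels` parameter are invented for this example and are not an existing numpy or scipy function.

```python
import numpy as np


def int24_to_int32(raw, n_channels=1):
    """Decode little-endian 24-bit signed PCM bytes into an int32 array.

    Hypothetical helper, for illustration only.
    """
    if len(raw) % 3:
        raise ValueError("byte count is not a multiple of 3")

    # One row of three bytes per sample.
    b = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 3)

    # Place the three bytes in the upper three bytes of a 4-byte word...
    widened = np.zeros((b.shape[0], 4), dtype=np.uint8)
    widened[:, 1:] = b

    # ...then shift right by 8 bits; on a signed type the shift keeps the sign.
    samples = widened.view("<i4").ravel() >> 8
    return samples.reshape(-1, n_channels)
```

The result has one row per frame and one column per channel, with values in the range -8388608 to 8388607.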

Best regards!

