[Scipy-tickets] [SciPy] #1686: wavfile.read returns string instead of numpy.array in data

SciPy Trac scipy-tickets@scipy....
Mon Aug 13 04:15:47 CDT 2012


#1686: wavfile.read returns string instead of numpy.array in data
----------------------+-----------------------------------------------------
 Reporter:  jiaquan   |       Owner:  somebody   
     Type:  defect    |      Status:  new        
 Priority:  normal    |   Milestone:  Unscheduled
Component:  scipy.io  |     Version:  0.10.0     
 Keywords:  wavfile   |  
----------------------+-----------------------------------------------------
Changes (by durrieu):

 * cc: jean-louis@… (added)


Comment:

 Hi there,

 I have run into this issue as well. In my experience, some WAV files
 contain chunks written by particular applications (often annotation
 layers, additional metadata, etc.) that `wavfile.read` does not
 recognize.

 The problem is that, most of the time, we are only interested in the
 audio in the `data` chunk (at least, that is my case). In `wavfile.read`,
 however, as soon as the program reaches a new chunk, it discards the data
 it has already read.
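
 To check which chunks a given file actually contains, something along
 these lines can help. It is only a minimal sketch: `list_wav_chunks` and
 `_chunk` are hypothetical helpers (not part of `scipy`), and the `note`
 chunk in the demo file is made up to stand in for an application-specific
 chunk.

```python
import io
import struct

def list_wav_chunks(fid):
    """List (chunk_id, size) pairs of a RIFF/WAVE file opened in binary mode.

    Hypothetical diagnostic helper; it is not part of scipy.
    """
    riff, _, wave = struct.unpack('<4sI4s', fid.read(12))
    if riff != b'RIFF' or wave != b'WAVE':
        raise ValueError('not a RIFF/WAVE file')
    chunks = []
    while True:
        header = fid.read(8)            # 4-byte id + 4-byte little-endian size
        if len(header) < 8:
            break                       # end of file
        chunk_id, size = struct.unpack('<4sI', header)
        chunks.append((chunk_id, size))
        fid.seek(size + (size & 1), 1)  # skip payload, plus pad byte if size is odd
    return chunks

def _chunk(chunk_id, payload):
    """Build one RIFF chunk, padded to an even length (demo helper)."""
    pad = b'\x00' if len(payload) & 1 else b''
    return chunk_id + struct.pack('<I', len(payload)) + payload + pad

# Tiny in-memory file: `fmt `, `data`, then a made-up `note` chunk.
body = (b'WAVE'
        + _chunk(b'fmt ', struct.pack('<HHIIHH', 1, 1, 8000, 16000, 2, 16))
        + _chunk(b'data', struct.pack('<4h', 0, 1000, -1000, 0))
        + _chunk(b'note', b'abc'))
demo = b'RIFF' + struct.pack('<I', len(body)) + body

for chunk_id, size in list_wav_chunks(io.BytesIO(demo)):
    print(chunk_id, size)
```

 For a file on disk, the same function can be called on
 `open(filename, 'rb')`.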

 I would propose that the returned `data` array hold the audio content of
 the `data` chunk of the WAV file, or at least of one of them (I am not
 sure whether it is possible, or "legal", to have several per file). The
 catch is that "non data" chunks would then need another kind of output.
 Personally, I would say that the most important part of a WAV file is
 still the `data` chunk, and people interested in the other chunks (which
 are probably in a proprietary, or at least not very standard, format
 anyway) can still adapt `wavfile.py` to their purpose. My point: it would
 be much more useful to have `wavfile.read` return the audio data (or some
 of it) by default. That would serve most people's needs, I suppose.

 What I would therefore propose is simply to read the `data` chunk and to
 issue a warning whenever an "exotic" chunk is encountered. Even simpler is
 what I proposed
 [https://github.com/wslihgt/separateLeadStereo/issues/1#issuecomment-7679072
 here], namely to rename `data` to `data_` in the `else` branch of the `if`
 statement (or to comment out the corresponding lines). That way, these
 lines are still around in case someone wants to solve this in a better
 way.
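
 The "read the `data` chunk, warn on anything else" idea could look
 roughly like the sketch below. This is not the code in `wavfile.py`:
 `read_wav_data` and `_chunk` are hypothetical names, the sketch assumes
 plain PCM with a `fmt ` chunk of at least 16 bytes, it keeps only the
 first `data` chunk, and it returns the raw bytes rather than an array so
 that it depends on the standard library only.

```python
import io
import struct
import warnings

def read_wav_data(fid):
    """Return (rate, channels, bits, raw_bytes) from a PCM WAV file object.

    Hypothetical sketch, not the scipy implementation.
    """
    riff, _, wave = struct.unpack('<4sI4s', fid.read(12))
    if riff != b'RIFF' or wave != b'WAVE':
        raise ValueError('not a RIFF/WAVE file')
    fmt = None
    raw = None
    while True:
        header = fid.read(8)
        if len(header) < 8:
            break                              # end of file
        chunk_id, size = struct.unpack('<4sI', header)
        if chunk_id == b'fmt ':
            fields = struct.unpack('<HHIIHH', fid.read(size)[:16])
            fmt = (fields[2], fields[1], fields[5])  # rate, channels, bits
        elif chunk_id == b'data' and raw is None:
            raw = fid.read(size)               # keep the first data chunk
        else:
            # Unknown (or extra) chunk: warn and skip it, keeping `raw` intact.
            warnings.warn('skipping chunk %r (%d bytes)' % (chunk_id, size))
            fid.seek(size, 1)
        if size & 1:
            fid.seek(1, 1)                     # chunks are padded to even length
    if fmt is None or raw is None:
        raise ValueError('missing fmt or data chunk')
    return fmt + (raw,)

def _chunk(chunk_id, payload):
    """Build one RIFF chunk, padded to an even length (demo helper)."""
    pad = b'\x00' if len(payload) & 1 else b''
    return chunk_id + struct.pack('<I', len(payload)) + payload + pad

# Demo file with a made-up `note` chunk placed after the audio data.
body = (b'WAVE'
        + _chunk(b'fmt ', struct.pack('<HHIIHH', 1, 1, 8000, 16000, 2, 16))
        + _chunk(b'data', struct.pack('<4h', 0, 1000, -1000, 0))
        + _chunk(b'note', b'abc'))
demo = b'RIFF' + struct.pack('<I', len(body)) + body

rate, channels, bits, raw = read_wav_data(io.BytesIO(demo))
```

 For 16-bit PCM, the raw bytes can then be turned into an array with
 `numpy.frombuffer(raw, dtype='<i2')`.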

 Finally, I guess that for people who really need audio file input/output,
 [http://pypi.python.org/pypi/scikits.audiolab/0.10.2 `audiolab`] is a
 better long-term option, but it requires some extra installation, which
 is not, as far as I know, as easy and automatic as installing `scipy`.
 I therefore very much like the fact that `scipy` offers this
 functionality, which makes migrating easier for people coming from
 Matlab! ;-)

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/1686#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

