[SciPy-User] unpacking binary data from a C structure

Tom Kuiper kuiper@jpl.nasa....
Tue Apr 13 08:20:58 CDT 2010


Dear list,

here's something I find very strange.  I have a C structure defined as:

typedef struct
{
  unsigned short spcid;         /* station id - 10, 40, 60, 21 */
  unsigned short vsrid;         /* vsr1a, vsr1b ... from enum */
  unsigned short chanid;        /* subchannel id 0,1,2,3 */
  unsigned short bps;           /* number of bits per sample - 1, 2, 4, 8, or 16 */
  unsigned long  srate;         /* number of samples per second in
                                   kilo-samples per second */
  unsigned short error;         /* hw err flag, dma error or num_samples error,
                                   0 ==> no errors */
  unsigned short year;          /* time tag - year */
  unsigned short doy;           /* time tag - day of year */
  unsigned long  sec;           /* time tag - second of day */
  double         freq;          /* in Hz */
  unsigned long  orate;         /* number of statistics samples per second */
  unsigned short nsubchan;      /* number of output sub chans */
}
stats_hdr_t;

Given this definition, the format I expected to pass to unpack in Python's
struct module is 'HHHH L HHH L d L H'.
Here's a real header structure as it appears at the head of a file:

  0000000  000d  0001  0006  0008
  0000010  4240  000f  0000  0000
  0000020  0000  07da  0064  4730
  0000030  0001  0000  0000  0000
  0000040  d800  d31d  421d  03e8
  0000050  0000  0000  0000  0002
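
As a quick size check (a sketch added for illustration, not part of the measured data): the dump above is six rows of four 16-bit words, while the expected format, evaluated with standard sizes, is shorter. The '=' prefix is used here only so that struct.calcsize gives the same answer on every platform.

```python
import struct

# Format derived from the struct definition; '=' means standard sizes
# (H = 2 bytes, L = 4 bytes, d = 8 bytes) and no alignment padding.
EXPECTED = "=HHHH L HHH L d L H"

expected_size = struct.calcsize(EXPECTED)
dump_size = 6 * 4 * 2  # six dump rows, four 2-byte words per row

print("expected format:", expected_size, "bytes")
print("header in file: ", dump_size, "bytes")
print("unaccounted for:", dump_size - expected_size, "bytes")
```

The difference is the space that the unexplained gaps in the header have to fill.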

Decoded as unsigned shorts:

  0000000    13     1     6     8
  0000010 16960    15     0     0
  0000020     0  2010   100 18224
  0000030     1     0     0     0
  0000040 55296 54045 16925  1000
  0000050     0     0     0     2
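
The decoded table can be reproduced with a short script. This is a sketch that retypes the hex words from the dump; it assumes the dump tool displayed little-endian 16-bit words, which is an assumption about the tool rather than something the file itself states.

```python
import struct

# 16-bit words copied from the hex dump, in file order.
HEX_WORDS = """
000d 0001 0006 0008
4240 000f 0000 0000
0000 07da 0064 4730
0001 0000 0000 0000
d800 d31d 421d 03e8
0000 0000 0000 0002
""".split()

# Rebuild the byte stream by packing each word as a little-endian short.
header_bytes = b"".join(struct.pack("<H", int(w, 16)) for w in HEX_WORDS)

# Decode the whole header as 24 unsigned shorts and print four per row,
# with the byte offset of each row in octal, as in the dump.
shorts = struct.unpack("<24H", header_bytes)
for row in range(0, len(shorts), 4):
    values = " ".join("{:5d}".format(v) for v in shorts[row:row + 4])
    print("{:07o} {}".format(row * 2, values))
```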

Matching these to the stats_hdr_t with 'unpack' notation:

  0000000     H     H     H     H
  0000010    L1    L2     H     ?
  0000020     ?     H     H    L1
  0000030    L2     ?     ?    D1
  0000040    D2    D3    D4    L1
  0000050    L2     ?     ?     H

So the actual format is 'HHHH L H xxxx HH L xxxx d L xxxx H'
What are all the mystery 4-byte blanks?  This works:

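from struct import unpack_from
buf = fd.read(50)
# '=' selects standard sizes with no alignment; each 'x' skips one pad byte
header = unpack_from('=4H LH2x 2x2HL4xdL4xH', buf)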
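
One possible reading of the gaps, which nothing in the header itself confirms: if the file was written on a platform where unsigned long is 8 bytes, and the structure was stored without alignment padding, then each 4-byte blank would simply be the upper half of a 64-bit value. The sketch below compares the working format with that hypothetical format on the header bytes shown earlier. The name WIDE_LONG and the '<' byte order are assumptions made for the example.

```python
import struct

# Header words retyped from the hex dump, rebuilt as little-endian bytes.
HEX_WORDS = (
    "000d 0001 0006 0008 4240 000f 0000 0000 "
    "0000 07da 0064 4730 0001 0000 0000 0000 "
    "d800 d31d 421d 03e8 0000 0000 0000 0002"
).split()
buf = b"".join(struct.pack("<H", int(w, 16)) for w in HEX_WORDS)

# The working format, with '<' in place of '=' so that the result does not
# depend on the byte order of the machine running this example.
WORKING = "<4H LH2x 2x2HL4xdL4xH"

# Hypothetical alternative: every unsigned long is 8 bytes ('Q'), no padding.
WIDE_LONG = "<4H Q 3H Q d Q H"

fields_working = struct.unpack_from(WORKING, buf)
fields_wide = struct.unpack_from(WIDE_LONG, buf)

print(struct.calcsize(WORKING), struct.calcsize(WIDE_LONG))
print(fields_working)
print(fields_wide)
```

Both formats cover 48 bytes and return the same twelve values for this header, so this one sample cannot tell them apart; a header with a nonzero error flag or a value above 2**32 would.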

Since unpacking binary data must be a fairly common activity in
scientific circles, I hope you will have some suggestions.

Tom


