[SciPy-user] NumPy Newcomer Questions: Reading Data Stream & Building Array

Rich Shepard rshepard at appl-ecosys.com
Fri Sep 8 12:18:39 CDT 2006

   After decades of Fortran and C, I'm quite new to Python and all the great
tools it provides. So, I'd like mentoring in how to most elegantly and
effectively parse data read over a serial port from a scanner, and how to
build a 2D array from it. After this, I'll probably need help in other NumPy
array (matrix) manipulations. But, first things first:

   Data are entered on an optical mark readable (OMR) form, which is scanned
and the data sent over a serial line. Each form read transmits 69 bytes;
there are 2 bytes for each "column" (where there is a timing mark on the
edge of the form) and their position within that column determines their
value. The form has two blank lines (with timing marks) and three lines with
labels; these transmit the equivalent of '0' and are not recorded. The final
byte is a carriage return, '\r', which is translated into a newline ('\n') by
the method.

   For each form read I want to record the form number (programmatically
sequentially assigned, starting with '1') and the data from the form as a
row. The total forms read then fill a two-dimensional array.

   The two issues with which I would like help are:

   How to slice the list of data from the scanner stream into two-byte chunks
(leaving the end-of-record byte to be dealt with separately),
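
   A minimal sketch of one way to do the slicing, assuming the record arrives
as a single bytes object (as a modern Python 3 serial read would return it)
and ends with one terminator byte. The function name split_record, its
parameters, and the sample record are made up for illustration; they are not
taken from OnScan.py.

```python
def split_record(record, chunk_size=2, terminator=b"\r"):
    """Return one scanner record as a list of chunk_size-byte pieces."""
    # Remove exactly one end-of-record byte, if present, so the caller
    # can deal with it separately.
    if record.endswith(terminator):
        record = record[:-len(terminator)]
    if len(record) % chunk_size:
        raise ValueError("record length %d is not a multiple of %d"
                         % (len(record), chunk_size))
    # Step through the record chunk_size bytes at a time.
    return [record[i:i + chunk_size]
            for i in range(0, len(record), chunk_size)]


# Made-up 69-byte record: 34 two-byte columns plus the trailing '\r'.
sample = b"  " * 33 + b" \x10" + b"\r"
chunks = split_record(sample)
```

   If the terminator has already been translated to a newline, the same
function could be called with terminator=b"\n". Each two-byte chunk can then
be used as a key into a lookup table.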

   How to build the array so that each form read creates a new row in the
array. While the 31 columns are fixed, the number of rows is indeterminate
until all submitted forms have been read.
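
   A hedged sketch of one common way to handle an unknown number of rows:
collect each form as a plain Python list, and convert the list of rows to a
NumPy array once all forms are read. It assumes every value in a row is
numeric and that the 31 columns are the form number followed by 30 values;
both are assumptions made for the example, and fake_forms stands in for the
decoded scanner data.

```python
import numpy as np

N_VALUES = 30  # assumed: 31 columns = form number + 30 decoded values

# Stand-in for decoded forms; real data would come from the scanner.
fake_forms = [[1.0] * N_VALUES, [0.5] * N_VALUES, [3.0] * N_VALUES]

rows = []
for form_number, values in enumerate(fake_forms, start=1):
    # Each form becomes one row, headed by its sequential form number.
    rows.append([form_number] + values)

# Build the 2D array once, after the last form has been read.
votes = np.array(rows, dtype=float)
```

   Appending to a list and converting at the end avoids resizing the array
on every form read, since NumPy arrays have a fixed size once created.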

   I've attached the 95-line 'OnScan.py' to this message; it's a single
function within the application.



Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.(TM)    |            Accelerator
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863
-------------- next part --------------
  def OnScan(self, event):
    """ OMR form timing marks and their meaning:
    Col. #      Value
      1        Category labels
      2        Category mark
      3        Blank line
      4        Position labels
      5        Position mark
      6        Row labels
      7-34     Pairwise comparison votes
         A total of 69 bytes are sent in a stream; the last is '\r' for end of record
    # scanner row translations
    DATA_MAP_BLANK = {
      chr(32)+chr(32): 0
    }
    DATA_MAP_2 = {
      chr(32)+chr(36): 'nat',
      chr(96)+chr(32): 'eco',
      chr(36)+chr(32): 'soc'
    }
    DATA_MAP_5 = {
      chr(32)+chr(36): 'pro',
      chr(96)+chr(32): 'neu',
      chr(36)+chr(32): 'con'
    }
    DATA_MAP_7 = {
      chr(32)+chr(16): 1.000,    
      chr(32)+chr(8): 2.000,
      chr(32)+chr(4): 3.000,
      chr(32)+chr(2): 4.000,
      chr(32)+chr(1): 5.000,
      chr(64)+chr(32): 6.000,
      chr(16)+chr(32): 7.000,
      chr(8)+chr(32): 8.000,
      chr(4)+chr(32): 9.000,
      chr(34)+chr(8): 0.500,
      chr(34)+chr(4): 0.333,
      chr(34)+chr(2): 0.250,
      chr(34)+chr(1): 0.200,
      chr(66)+chr(32): 0.167,     
      chr(18)+chr(32): 0.143,
      chr(10)+chr(32): 0.125,
      chr(6)+chr(32): 0.111
    }

    # Open serial port
    ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
    if ser.isOpen() != True:
    searchString = '\r'   # end of card read
    vote_id = 0           # card number incremented at start of read loop
    newStart = 0          # only > 0 if the run is aborted by the operator

    progressMax = 500
    omrDialog = wx.ProgressDialog("Scoping Form Input",
           "Processing ...", progressMax, style=wx.PD_CAN_ABORT |
           wx.PD_APP_MODAL | wx.PD_ELAPSED_TIME)
    keepGoing = True
    while True:
      vote_id += 1        # vote record number; heads list row & displayed for entry on form
      if newStart > 0:    # read was aborted (card mis-read? jammed?) and restarted
        vote_id = newStart

      # read the line sent by the OMR reader; record end is '\r'.
      line = ser.readline()  # get all 69 bytes in one chunk


      if value == DATA_MAP_BLANK:   # skip line if blank or labels
      if value == searchString:     # we're done with that form
        vrecord = vrecord[:-1]     # slice off the terminal comma
        vrecord.append('\n')       # add a newline

      vrecord.extend([line, ", "]) # add the pairwise comparison value, a comma, and a space

      Now I need to add that vote record to an array, and start a new row """

      while keepGoing and vote_id < progressMax:    # increment progress dialog
          if keepGoing == False:
            newStart = vote_id
          keepGoing = omrDialog.Update(vote_id)
    # close the dialog box and serial port
