[IPython-user] [0:execute]: IOError: [Errno 4] Interrupted system call

Brian Granger ellisonbg.net@gmail....
Mon Dec 1 16:09:11 CST 2008


> Sorry I don't have any specific information concerning your situation,
> but as it's been a few days with no response, I figured I'd chime in. My
> understanding is that an interrupted system call (EINTR) happens when a
> system call (e.g. select(), fread(), fwrite(), and so on) is interrupted
> by a signal that the kernel decides your thread is going to handle. The
> correct behavior is to deal appropriately with the signal (possibly
> ignoring it) and then repeat your system call.

Yep, I think that is what is going on.  Is there a chance that the
error is anything other than EINTR?  I ask because knowing that will
help us track this down.
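
The "repeat your system call" advice quoted above can be sketched as a
small wrapper. This is only an illustration under stated assumptions, not
code from IPython or Twisted: retry_on_eintr and load_pickle are
hypothetical names, and the "except ... as" syntax needs Python 2.6 or
later (Python 3.5 and later already retry most interrupted calls on their
own). A retry is only safe when the call can be repeated from scratch,
which is why the second helper reopens the file on every attempt instead
of reusing a half-read file object.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), repeating it for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (IOError, OSError) as exc:
            # Any error other than "Interrupted system call" is re-raised unchanged.
            if exc.errno != errno.EINTR:
                raise


def load_pickle(path, loader):
    """Hypothetical helper: reopen the file on each attempt so a retry starts at byte 0."""
    f = open(path, 'rb')
    try:
        return loader(f)
    finally:
        f.close()
```

On an engine this might be called as
retry_on_eintr(load_pickle, '/tmp/temp.file', cPickle.load), so that an
interrupted load is restarted rather than surfacing as an IOError.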

> I've found lots of code in the wild that is not robust to being
> interrupted this way (including in core Python), but luckily very little
> code that sends signals and thus interrupts code that way. So, I think
> the best solution will be to find what is sending the signal and
> eliminate it. Hopefully this will ring some bells with someone more
> knowledgeable in the ipython internals than I. Also, are you running any
> third party code that could be sending signals?

There are a few possibilities for where the signal comes from:

1.  Something deep in the internals of Python itself.
2.  Something deep in Twisted.
3.  IPython itself, although that is unlikely since (as far as I know)
we are not sending any signals.
4.  Somewhere deep in user code, without the user being aware of it.

My best guesses are Twisted or user code.  I will look at Twisted
to see if it sends signals anywhere.  Is it also possible that the
kernel itself sends the signal?
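
One way to narrow down the candidates listed above is to record which
signals actually reach the engine process. The following is a rough
sketch under assumptions, not something IPython or Twisted provides:
install_signal_logger is a hypothetical name, the choice of signals to
watch is a guess, and signal.signal() can only be called from the main
thread. Each recorded signal is passed on to whatever Python-level
handler was installed before, so existing behaviour should be preserved.

```python
import signal


def install_signal_logger(signums, log):
    """Append each delivered signal number to `log`, then call the handler it replaced."""
    previous = {}

    def handler(signum, frame):
        log.append(signum)
        old = previous.get(signum)
        # SIG_DFL and SIG_IGN are not callable; only chain to real Python handlers.
        if callable(old):
            old(signum, frame)

    for signum in signums:
        # signal.signal() returns the handler that was previously installed.
        previous[signum] = signal.signal(signum, handler)
    return handler
```

Running something like install_signal_logger([signal.SIGCHLD,
signal.SIGALRM], seen) on an engine before a long job, and inspecting
seen after a failure, would show whether one of those signals arrived.
Note that replacing a default disposition with a Python handler changes
what the signal does (for example, a signal that would normally terminate
the process is only logged), so the list of signals should be chosen with
care.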

Thanks

Brian


> -Andrew
>
> mark starnes wrote:
>> Hi again,
>>
>> I cannot replicate this behaviour from the IPython console.  I attempted it, using:
>>
>>
>> from IPython.kernel import client
>> mec = client.MultiEngineClient()
>>
>> # Create file.
>> import cPickle
>> a = file('/tmp/temp.file', 'wb')
>> cPickle.dump('0' * int(1E5), a)
>> a.close()
>>
>> # Execute repeated loads on engines.
>> mec.execute('import cPickle')
>>
>> for j in xrange(100):
>>     for i in xrange(3):
>>         command = "a = file('/tmp/temp.file')"
>>         mec.execute(command, targets = [i])
>>         command = "b = cPickle.load(a)"
>>         mec.execute(command, targets = [i])
>>         command = "a.close()"
>>         mec.execute(command, targets = [i])
>>
>>
>> This performed 100 reads on each engine, with no errors.
>>
>> BR,
>>
>> Mark.
>>
>>
>>
>>
>> mark starnes wrote:
>>> Hi everyone,
>>>
>>> I'm performing blocking file read commands on remote engines, one at a time, but regularly get the
>>> error:
>>>
>>> ********************************************************************************************************
>>>
>>> CompositeError                            Traceback (most recent call last)
>>>
>>> <ipython console> in <module>()
>>>
>>> /fe2.pyc in setup(self, append)
>>>     365             diskpush({'a':self}, picklemode = 0)  # use mode 0 (others fail)
>>>     366          else:
>>> --> 367             diskpush({'a':self})
>>>     368
>>>     369
>>>
>>> /reference.pyc in diskpush(a, targets, block, isanobject, picklemode)
>>>    3359       time.sleep(5.0); mprint('Done. ')
>>>    3360       #mec.execute('tmpdata = dpfile.read()', targets = [i], block = True)
>>> -> 3361       mec.execute('dpvar = cPickle.load(dpfile)', targets = [i], block = True)
>>>    3362       mec.execute('dpfile.close()', targets = [i], block = True)
>>>    3363       #mec.barrier(a)
>>>
>>> /usr/local/lib64/python2.5/site-packages/IPython/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
>>>     520         targets, block = self._findTargetsAndBlock(targets, block)
>>>     521         result = blockingCallFromThread(self.smultiengine.execute, lines,
>>> --> 522             targets=targets, block=block)
>>>     523         if block:
>>>     524             result = ResultList(result)
>>>
>>> /usr/local/lib64/python2.5/site-packages/IPython/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
>>>      67         @raise: any error raised during the callback chain.
>>>      68         """
>>> ---> 69         return twisted.internet.threads.blockingCallFromThread(reactor, f, *a, **kw)
>>>      70
>>>      71 else:
>>>
>>> /usr/local/lib64/python2.5/site-packages/Twisted-8.1.0-py2.5-linux-x86_64.egg/twisted/internet/threads.pyc in blockingCallFromThread(reactor, f, *a, **kw)
>>>      81     result = queue.get()
>>>      82     if isinstance(result, failure.Failure):
>>> ---> 83         result.raiseException()
>>>      84     return result
>>>      85
>>>
>>> /usr/local/lib64/python2.5/site-packages/Twisted-8.1.0-py2.5-linux-x86_64.egg/twisted/python/failure.pyc in raiseException(self)
>>>     317         information if available.
>>>     318         """
>>> --> 319         raise self.type, self.value, self.tb
>>>     320
>>>     321
>>>
>>> CompositeError: one or more exceptions from call to method: execute
>>> [0:execute]: IOError: [Errno 4] Interrupted system call
>>>
>>> *******************************************************************************************************************
>>>
>>> It doesn't always happen; sometimes it will occur after a few hours of execution (ruining overnight runs).
>>> The active parts of the routine falling over are:
>>>
>>> =============================================
>>> for i in targets:  # Sequential engine reads.
>>>    mec.execute('dpfile = bz2.BZ2File(tmpfn, "rb")', targets = [i], block = True)
>>>    mec.execute('dpvar = cPickle.load(dpfile)', targets = [i], block = True)
>>>    mec.execute('dpfile.close()', targets = [i], block = True)
>>>
>>> =============================================
>>>
>>> with targets = mec.get_ids()
>>>
>>>
>>>
>>> Following the error, subsequent requests to the engines get results like:
>>>
>>>
>>>
>>>
>>> /<ipython console> in <module>()
>>>
>>> /fe2.pyc in setup(self, append)
>>>     365             diskpush({'a':self}, picklemode = 0)  # use mode 0 (others fail)
>>>     366          else:
>>> --> 367             diskpush({'a':self})
>>>     368
>>>     369
>>>
>>> /reference.py in diskpush(a, targets, block, isanobject, picklemode)
>>>    3359       #time.sleep(5.0); mprint('Done. ')
>>>    3360       #mec.execute('tmpdata = dpfile.read()', targets = [i], block = True)
>>> -> 3361       mec.execute('dpvar = cPickle.load(dpfile)', targets = [i], block = True)
>>>    3362       mec.execute('dpfile.close()', targets = [i], block = True)
>>>    3363       #mec.barrier(a)
>>>
>>> /usr/local/lib64/python2.5/site-packages/IPython/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
>>>     520         targets, block = self._findTargetsAndBlock(targets, block)
>>>     521         result = blockingCallFromThread(self.smultiengine.execute, lines,
>>> --> 522             targets=targets, block=block)
>>>     523         if block:
>>>     524             result = ResultList(result)
>>>
>>> /usr/local/lib64/python2.5/site-packages/IPython/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
>>>      67         @raise: any error raised during the callback chain.
>>>      68         """
>>> ---> 69         return twisted.internet.threads.blockingCallFromThread(reactor, f, *a, **kw)
>>>      70
>>>      71 else:
>>>
>>> /usr/local/lib64/python2.5/site-packages/Twisted-8.1.0-py2.5-linux-x86_64.egg/twisted/internet/threads.pyc in blockingCallFromThread(reactor, f, *a, **kw)
>>>      81     result = queue.get()
>>>      82     if isinstance(result, failure.Failure):
>>> ---> 83         result.raiseException()
>>>      84     return result
>>>      85
>>>
>>> /usr/local/lib64/python2.5/site-packages/Twisted-8.1.0-py2.5-linux-x86_64.egg/twisted/python/failure.pyc in raiseException(self)
>>>     317         information if available.
>>>     318         """
>>> --> 319         raise self.type, self.value, self.tb
>>>     320
>>>     321
>>>
>>> CompositeError: one or more exceptions from call to method: execute
>>> [0:execute]: AttributeError: 'module' object has no attribute 'cbook'
>>>
>>>
>>>
>>> and then, when I quit IPython, I get,
>>>
>>> Closing threads... Done.
>>> Exception exceptions.TypeError: "'NoneType' object is not callable" in <bound method RemoteReferenceTracker._refLost of <RemoteReferenceTracker(clid=1,url=pbu://127.0.0.1:24879/uwtv36uev6e7emd45xfz77tt75krrduj)>> ignored
>>>
>>>
>>>
>>>
>>> The engines need a restart after this error.  Any ideas on how to fix this would be appreciated.  It's
>>> hobbling my parallel processing attempts!
>>>
>>> Thanks in advance,
>>>
>>> Mark.
>>>
>>>
>>>
>>>
>>>
>>>
>>> matplotlib version 0.98.3
>>> verbose.level helpful
>>> interactive is False
>>> units is False
>>> platform is linux2
>>> Could not load matplotlib icon: 'module' object has no attribute 'window_set_default_icon_from_file'
>>> backend QtAgg version 0.9.1
>>> Activating auto-logging. Current session state plus future input saved.
>>> Filename       : ipython_log.py
>>> Mode           : rotate
>>> Output logging : False
>>> Raw input log  : False
>>> Timestamping   : False
>>> State          : active
>>> Python 2.5.1 (r251:54863, Jan 10 2008, 18:00:49)
>>> Type "copyright", "credits" or "license" for more information.
>>>
>>> IPython 0.9.rc1 -- An enhanced Interactive Python.
>>>
>>>
>>> _______________________________________________
>>> IPython-user mailing list
>>> IPython-user@scipy.org
>>> http://lists.ipython.scipy.org/mailman/listinfo/ipython-user
>>>

