[Scipy-tickets] [SciPy] #967: Segfault in scipy.cluster.hierarchy.linkage() due to memcpy

SciPy Trac scipy-tickets@scipy....
Sun Feb 5 13:35:10 CST 2012


#967: Segfault in scipy.cluster.hierarchy.linkage() due to memcpy
---------------------------+------------------------------------------------
 Reporter:  uri.laserson   |       Owner:  somebody   
     Type:  defect         |      Status:  needs_info 
 Priority:  normal         |   Milestone:  Unscheduled
Component:  scipy.cluster  |     Version:  0.7.0      
 Keywords:                 |  
---------------------------+------------------------------------------------
Changes (by rgommers):

  * status:  new => needs_info


New description:

 I am trying to perform hierarchical clustering on a dataset that has about
 60000 objects.  This means that the distance matrix has almost 2 billion
 entries.  Every time I run scipy.cluster.hierarchy.linkage, there is a
 reproducible segfault.  I ran the script under GDB and took a backtrace,
 which gave me:
 {{{
 (gdb) backtrace
 #0  0x00002aed9066efa0 in memcpy () from /lib/libc.so.6
 #1  0x00002aaaad357600 in linkage (dm=0x2aaab1bb7010, Z=0x2aadd8e93010, X=0x0,
     m=0, n=58186, ml=0, kc=0, dfunc=0x2aaaad354b30 <dist_average>, method=2)
     at scipy/cluster/src/hierarchy.c:411
 #2  0x00002aaaad354777 in linkage_wrap (self=<value optimized out>,
     args=<value optimized out>) at scipy/cluster/src/hierarchy_wrap.c:72
 #3  0x0000000000483b77 in PyEval_EvalFrameEx (f=0x6d73990,
     throwflag=<value optimized out>) at Python/ceval.c:3573
 #4  0x0000000000485ae2 in PyEval_EvalCodeEx (co=0x2aaaad0215d0,
     globals=<value optimized out>, locals=<value optimized out>, args=0x3,
     argcount=1, kws=0x953f48, kwcount=1, defs=0x2aaaacaa2c38, defcount=2,
     closure=0x0) at Python/ceval.c:2836
 #5  0x0000000000483ca7 in PyEval_EvalFrameEx (f=0x953dc0,
     throwflag=<value optimized out>) at Python/ceval.c:3669
 #6  0x0000000000485ae2 in PyEval_EvalCodeEx (co=0x2aed900c9cd8,
     globals=<value optimized out>, locals=<value optimized out>, args=0x0,
     argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
     at Python/ceval.c:2836
 #7  0x0000000000485d82 in PyEval_EvalCode (co=0x0, globals=0x2aaab1bb7010,
     locals=0xffffffff272db868) at Python/ceval.c:494
 #8  0x00000000004a717e in PyRun_FileExFlags (fp=0x935010,
     filename=0x7fff1aa635c2 "clusterscriptSegFaultDebugGDB.py",
     start=<value optimized out>, globals=0x958550, locals=0x958550, closeit=1,
     flags=0x7fff1aa61ba0) at Python/pythonrun.c:1273
 #9  0x00000000004a7410 in PyRun_SimpleFileExFlags (fp=0x935010,
     filename=0x7fff1aa635c2 "clusterscriptSegFaultDebugGDB.py", closeit=1,
     flags=0x7fff1aa61ba0) at Python/pythonrun.c:879
 #10 0x0000000000412160 in Py_Main (argc=<value optimized out>,
     argv=0x7fff1aa61cb8) at Modules/main.c:523
 #11 0x00002aed906174ca in __libc_start_main () from /lib/libc.so.6
 #12 0x000000000041169a in _start () at ../sysdeps/x86_64/elf/start.S:113
 }}}
 Does anyone have any suggestions on how to identify/fix this problem?

 Thanks!
 Uri
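
 For scale, the sizes implied by the n=58186 shown in frame #1 can be
 checked with a few lines of plain Python. This is only a sketch: it
 assumes the distances are stored as 8-byte C doubles in condensed form
 (n*(n-1)/2 entries), and the names used here (condensed_entries,
 INT32_MAX, DOUBLE_BYTES) are illustrative, not taken from hierarchy.c.
 It does not establish the cause of the crash; it only shows which
 quantities would exceed a 32-bit signed integer if one were used for
 sizes or offsets anywhere along the way.

```python
# Back-of-the-envelope size check for the n reported in frame #1 (n=58186).
INT32_MAX = 2**31 - 1  # largest value of a 32-bit signed C int
DOUBLE_BYTES = 8       # assumption: distances are stored as C doubles


def condensed_entries(n):
    """Number of pairwise distances among n observations: n*(n-1)/2."""
    return n * (n - 1) // 2


n = 58186
entries = condensed_entries(n)
nbytes = entries * DOUBLE_BYTES

print("condensed entries:   ", entries)
print("bytes as doubles:    ", nbytes)
print("entries fit in int32:", entries <= INT32_MAX)
print("bytes fit in int32:  ", nbytes <= INT32_MAX)
print("n*n fits in int32:   ", n * n <= INT32_MAX)
```

 Under these assumptions the entry count itself still fits in a 32-bit
 signed int, but the byte count and n*n do not, so any size or offset
 computed in bytes or over the full square matrix would need a wider
 integer type.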

--

Comment:

 <reformat description>

 A segfault in linkage was recently fixed, so this problem may be gone.
 Setting the ticket status to needs_info; the ticket will be closed in
 ~6 months unless the requested info is provided.

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/967#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

