[SciPy-user] SciPy-user Digest, Vol 37, Issue 6

Brendan Simons brendansimons at yahoo.ca
Fri Sep 8 09:27:22 CDT 2006


That's OK William.

I gave it some more thought, and I don't think there's any sparse storage scheme which would allow me to do constant or log time slicing over two axes.  The results from slicing the first axis would always have to be sorted, or individually tested, giving on average a complexity of O(n*log(n)) for the one axis sort and O(k*(log(n) + n/2)) for the lookups, where n is the number of intervals, and k is the number of stab inquiries.   Since both k and n are high for my case, I've decided to go the traditional route and use an interval tree, which has a complexity of O(n*log(n)) for pre-processing, and only O(k * log(n)) for the lookups.

Thanks anyway,
    Brendan

------------------------------

Message: 5
Date: Fri, 8 Sep 2006 07:52:14 +0200
From: "William Hunter" <willemjagter at gmail.com>
Subject: Re: [SciPy-user] Efficient submatrix of a sparse matrix
To: "SciPy Users List" <scipy-user at scipy.org>
Message-ID:
    <8b3894bc0609072252k6f5cdee5p367695c3dd96371f at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

Brendan;

It occured to me after I sent the mail that you're probably not into
solving sparse systems, but actually just slicing sparse matrices...
If so, my previous post is obviously not of much meaning to you.
--
WH

On 07/09/06, Brendan Simons <brendansimons at yahoo.ca> wrote:
> Hi all,
>
> I have a neat little problem for which I think sparse matrices are
> the answer.   I need to store a bunch of overlapping "intervals", (a,
> b pairs for which any a <= val < b crosses that interval) in a manner
> which makes "stabbing inquiries" (which intervals cross a specified
> value) easy.  The canonical way to to this is with interval trees (a
> good summary here: http://w3.jouy.inra.fr/unites/miaj/public/vigneron/
> cs4235/l5cs4235.pdf), however I think I can simplify things as follows:
>
> 1)  perform an argsort on the interval a and b endpoints.  This gives
> the position of each interval in an "order" matrix M, where each row
> and each column contains only one interval, and intervals are sorted
> in rows by start point, and in columns by endpoint
>
> 2) for a "stab" point x, do a binary search to find the row i and
> column j where x could be inserted into M and maintain its ordering.
> The submatrix of M[:i, j:]  would contains all those intervals which
> start before x, and end after x.
>
> Since the order matrix M is mostly empty it would make sense to use a
> sparse matrix storage scheme, however the only one in scipy.sparse
> that allows 2d slicing is the dok_matrix, and the slicing algorithm
> is O(n), which is much too expensive for my purposes (I have to do
> thousands of stabbing inquiries on a space containing thousands of
> intervals).
>
> Is there no more efficient algorithm for 2d slicing of a sparse matrix?
>
>   -Brendan
> --
> Brendan Simons
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>







More information about the SciPy-user mailing list