[Numpy-discussion] convert strides/shape/offset into nd index?
Dag Sverre Seljebotn
dagss@student.matnat.uio...
Tue Dec 1 04:34:39 CST 2009
Dag Sverre Seljebotn wrote:
> Anne Archibald wrote:
>
>> 2009/11/30 James Bergstra <bergstrj@iro.umontreal.ca>:
>>
>>
>>> Your question involves a few concepts:
>>>
>>> - an integer vector describing the position of an element
>>>
>>> - the logical shape (another int vector)
>>>
>>> - the physical strides (another int vector)
>>>
>>> Ignoring the case of negative offsets, a physical offset is the inner
>>> product of the physical strides with the position vector.
>>>
>>> In these terms, you are asking how to solve the inner-product equation
>>> for the position vector. There can be many possible solutions (like,
>>> if there is a stride of 1, then you can make that dimension account
>>> for the entire offset. This is often not the solution you want.).
>>> For valid ndarrays though, there is at most one solution though with
>>> the property that every position element is less than the shape.
>>>
>>> You will also need to take into account that for certain stride
>>> vectors, there is no way to get certain offsets. Imagine all the
>>> strides were even, and you needed to get at an odd offset... it would
>>> be impossible. It would even be impossible if there were a dimension
>>> with stride 1 but it had shape of 1 too.
>>>
>>> I can't think of an algorithm off the top of my head that would do
>>> this in a quick and elegant way.
>>>
>>>
>> Not to be a downer, but this problem is technically NP-complete. The
>> so-called "knapsack problem" is to find a subset of a collection of
>> numbers that adds up to the specified number, and it is NP-complete.
>> Unfortunately, it is exactly what you need to do to find the indices
>> to a particular memory location in an array of shape (2,2,...,2).
>>
>> What that means in practice is that either you have to allow
>> potentially very slow algorithms (though you know that there will
>> never be more than 32 different values in the knapsack, which might or
>> might not be enough to keep things tractable) or use heuristic
>> algorithms that don't always work. There are probably fairly good
>> heuristics, particularly if the array elements are all at distinct
>> memory locations (arrays with overlapping elements can arise from
>> broadcasting and other slightly more arcane operations).
>>
>>
> Not that this should be done, but getting a chance to discuss NP is
> always fun:
>
> I think this particular problem can be solved in O(d*n^2) or better,
>
Hmm, I guess that should be O(d*n).
http://en.wikipedia.org/wiki/Knapsack_problem has the exact algorithm
(though it needs some customization).
Dag Sverre
> where n is the offset in question and d the number of dimensions of the
> array, by using dynamic programming on the buffer offset in question (so
> first try for offset 1, then 2, and so on up to n).
>
> Which doesn't contradict the fact that the problem is exponential (n is
> exponential in terms of the length of the input to the problem), but it
> is still not *too* bad in many cases, because the exponential term is
> always smaller than the size of the array.
>
> Dag Sverre
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
More information about the NumPy-Discussion
mailing list