[SciPy-Dev] splitting an ordered list as evenly as possible

Jeff Whitaker jswhit@fastmail...
Wed Aug 25 09:44:51 CDT 2010

  On 8/25/10 8:00 AM, John Hunter wrote:
> Suppose I have an ordered list/array of numbers, and I want to split
> them into N chunks, such that the intersection of any chunk with each
> other is empty and the data is split as evenly as possible (e.g. the std
> dev of the lengths of the chunks is minimized or some other such
> criterion).  Context: I am trying to do a quintile analysis on some
> data, and np.percentile doesn't behave like I want because more than
> 20% of my data equals 1, so 1 is in the first and second quintiles.
> I want to avoid this -- I'd rather have uneven counts in my quintiles
> than have the same value show up in multiple quintiles, but I'd like
> the counts to be as even as possible.
> Here is some sample code that illustrates my problem:
> ....

John:  This is a problem we run into quite often when analyzing
precipitation data in arid regions. Most of the time it just doesn't
rain, so the distribution has a delta-function peak at zero.  There is
no good way around it: when the tied value makes up more of the sample
than one bin can hold, equal-count bins have to share it.  Sometimes
people split the sample into rain and no-rain cases and treat the two
distributions separately.
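
A rough sketch of the rain/no-rain split, in case it helps. The helper
name split_point_mass and the sample values are made up for
illustration, and taking quintiles of the wet days only is just one way
to "treat the two distributions separately"; it is not a general fix
for ties elsewhere in the data.

```python
import numpy as np


def split_point_mass(values, mass_value=0.0):
    """Separate observations equal to mass_value from all the others.

    Hypothetical helper: mass_value is where the distribution has its
    spike (0 for no-rain days in this example).
    """
    values = np.asarray(values, dtype=float)
    at_mass = values == mass_value
    return values[at_mass], values[~at_mass]


# Made-up sample: 6 dry days (zeros) and 10 wet days.
precip = np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                  dtype=float)

dry, wet = split_point_mass(precip)

# The spike is summarized by a single number: the fraction of dry days.
dry_fraction = dry.size / precip.size

# Quintile edges are computed from the wet days only.
wet_edges = np.percentile(wet, [20, 40, 60, 80])

# Bin index (0..4) for each wet day.
wet_bins = np.searchsorted(wet_edges, wet, side="right")
wet_counts = np.bincount(wet_bins, minlength=5)
```

The dry days are then reported as their own category (here via
dry_fraction) and the quintile analysis is done on wet_bins alone.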


Jeffrey S. Whitaker         Phone  : (303)497-6313
Meteorologist               FAX    : (303)497-6449
NOAA/OAR/PSD  R/PSD1        Email  : Jeffrey.S.Whitaker@noaa.gov
325 Broadway                Office : Skaggs Research Cntr 1D-113
Boulder, CO, USA 80303-3328 Web    : http://tinyurl.com/5telg
