[SciPy-Dev] Electronics student with programming background - Like to participate in SciPy - Write some code and learn!
Tue Jan 22 20:48:29 CST 2013
On Tue, Jan 22, 2013 at 4:16 PM, Daniel Smith <email@example.com> wrote:
> <josef.pktd <at> gmail.com> writes:
>> On Tue, Jan 22, 2013 at 2:26 PM, Daniel Smith <smith.daniel.br <at> gmail.com> wrote:
>> > Hello,
>> > I am also looking for ways to contribute to Scipy. I have experience
>> > with Python, C/C++ and limited experience with the Numpy C API. In
>> > particular, I have some code implementing the kernel density estimator
>> > bandwidth selection algorithm from the following paper:
>> > Z. I. Botev, J. F. Grotowski, and D. P. Kroese. Kernel density
>> > estimation via diffusion. The Annals of Statistics, 38(5):2916–2957,
>> > 2010.
>> > That method is more resilient to multi-modal data than the standard
>> > plug-in estimators. I would love to add that method to the current
>> > SciPy stats package if there is interest.
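To illustrate the multi-modal point above, here is a minimal sketch. It assumes Silverman's normal-reference rule of thumb as a stand-in for the simpler bandwidth selectors; it is not the diffusion method from the paper, and it is not Daniel's code. The helper name `silverman_bandwidth` and the synthetic samples are made up for illustration. On well-separated bimodal data the rule keys off the overall spread, so it picks a bandwidth several times larger than the one it picks for a single mode of the same width.

```python
import numpy as np

def silverman_bandwidth(x):
    """Normal-reference rule-of-thumb bandwidth for a Gaussian kernel (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    q75, q25 = np.percentile(x, [75, 25])
    # Use the smaller of the two spread estimates: std and scaled interquartile range.
    spread = min(x.std(ddof=1), (q75 - q25) / 1.349)
    return 0.9 * spread * x.size ** (-0.2)

rng = np.random.default_rng(0)
# One unit-variance mode versus two unit-variance modes centred at -5 and +5.
unimodal = rng.normal(0.0, 1.0, size=1000)
bimodal = np.concatenate([rng.normal(-5.0, 1.0, size=500),
                          rng.normal(5.0, 1.0, size=500)])

h_unimodal = silverman_bandwidth(unimodal)
h_bimodal = silverman_bandwidth(bimodal)
print(h_unimodal, h_bimodal)
```

Each mode of the bimodal sample is as narrow as the unimodal one, so the much larger `h_bimodal` over-smooths both peaks.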
>> Looks interesting, either for scipy.stats or statsmodels.
>> statsmodels now has KDE with least-squares cross-validation among
>> other bandwidth choices.
>> However, there is nothing yet that corrects boundary effects or offers
>> adaptive bandwidth choice.
> Boundary effects are another issue. I don't have working code, but I have seen
> a few algorithms and could certainly add those corrections to existing code.
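One of the simpler boundary corrections is reflection: mirror the sample about the boundary and add the mirrored kernel mass back into the support. The sketch below assumes a known lower bound at 0, a Gaussian kernel and a fixed bandwidth. The helper names are hypothetical, and this is not necessarily one of the algorithms Daniel has in mind.

```python
import numpy as np

def gaussian_kde_direct(grid, x, h):
    """Plain Gaussian KDE evaluated on `grid` by direct summation."""
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (x.size * h * np.sqrt(2.0 * np.pi))

def reflected_kde(grid, x, h, lower=0.0):
    """KDE with reflection about `lower`; the estimate is zero below the boundary."""
    x = np.asarray(x, dtype=float)
    mirrored = 2.0 * lower - x  # mirror image of the sample about the boundary
    density = gaussian_kde_direct(grid, x, h) + gaussian_kde_direct(grid, mirrored, h)
    return np.where(grid >= lower, density, 0.0)

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=5000)  # true density at x = 0 is 1
grid = np.linspace(0.0, 10.0, 501)

plain = gaussian_kde_direct(grid, sample, h=0.1)
reflected = reflected_kde(grid, sample, h=0.1)
print(plain[0], reflected[0])
```

At the boundary itself the reflected estimate is exactly twice the plain one, which recovers most of the mass that the plain estimator leaks below zero.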
>> Which programming language did you write it in?
> Everything is in Python/SciPy/Numpy. The most computationally expensive parts
> are the FFT, iFFT and a fixed point calculation, which are all implemented in
> SciPy/Numpy. The code is reasonably fast as it stands. I could make it faster
> by using Cython or C for calculating the derivatives of the estimated
> probability distribution function (pdf).
Sounds good about the implementation.
I didn't see that it uses FFT (when I looked at the 42-page paper for
5 to 10 minutes :)
It could also be interesting to tie it in with the FFT-based KDE in statsmodels.
Ralph also worked on KDE in scipy.stats and in statsmodels and will
also have an idea of which might be a better fit.
Do you have the code somewhere publicly available?
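As a rough illustration of the FFT-based approach discussed above: bin the data on a regular grid, transform, multiply by the Gaussian kernel's Fourier transform, and transform back. This is a sketch under stated assumptions, with a fixed, user-supplied bandwidth. It does not implement the fixed-point bandwidth selection from the paper, and it is neither Daniel's code nor the statsmodels implementation. The function name `fft_gaussian_kde` and its parameters are hypothetical.

```python
import numpy as np

def fft_gaussian_kde(x, bandwidth, n_grid=1024, pad=3.0):
    """Gaussian KDE on a regular grid via FFT convolution (fixed bandwidth)."""
    x = np.asarray(x, dtype=float)
    lo = x.min() - pad * bandwidth
    hi = x.max() + pad * bandwidth
    counts, edges = np.histogram(x, bins=n_grid, range=(lo, hi))
    dx = edges[1] - edges[0]
    centers = edges[:-1] + 0.5 * dx

    # Zero-pad to twice the grid length so the circular convolution
    # does not wrap kernel mass around the ends of the grid.
    n_fft = 2 * n_grid
    freqs = np.fft.rfftfreq(n_fft, d=dx)  # cycles per unit of x
    data_hat = np.fft.rfft(counts / (x.size * dx), n=n_fft)

    # Fourier transform of a Gaussian density with standard deviation `bandwidth`.
    kernel_hat = np.exp(-2.0 * (np.pi * bandwidth * freqs) ** 2)

    density = np.fft.irfft(data_hat * kernel_hat, n=n_fft)[:n_grid]
    return centers, density

rng = np.random.default_rng(0)
sample = rng.normal(size=2000)
grid, density = fft_gaussian_kde(sample, bandwidth=0.3)
print(density.sum() * (grid[1] - grid[0]))  # should be close to 1
```

The cost is dominated by the two transforms, so it scales with the grid size rather than with the product of grid size and sample size, as direct summation does.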
>> And out of curiosity: do you know how well the estimator behaves in
>> smaller samples, say 200 or 500? The paper seems to consider a sample size
>> of 1000 as small (from a very fast skim of the article).
> Personally, I've had pretty good luck going down to 50-100 samples. The exact
> sample size needed largely depends on how ragged the pdf you are estimating is.
>> > Thanks,
>> > Daniel
>> >> Hi,
>> >> I am Surya, a junior-year Electronics & Communication Engineering student
>> >> with a Computer Science/programming background. I have looked into SciPy and
>> >> it's really amazing!
>> >> In this regard, I would like to explore the possibility of contributing to
>> >> this project by writing code while simultaneously learning the real engineering
>> >> stuff. My skills lie in Python, Django, and C, plus a little Facebook API, cloud
>> >> platforms (OpenShift), and Git.
>> >> Also, I wrote some fun projects during weekends that you might like
>> >> to take a look at.
>> >> 1. Https://apps.facebook.com/pingmee -- Lets people ping their friends
>> >> using cartoons (Python, Django -- PIL)
>> >> 2. Https://apps.facebook.com/suryaphotography -- social reader framework
>> >> for my photography blog; Not yet finished (Python, Django -- Google Feed
>> >> API) - Got to finish if time permits
>> >> 3. Https://github.com/ksurya -- Github handle
>> >> So, I am ready to take up any work that involves Python and get on
>> >> with it!
>> >> Regarding my scientific skills, I studied Engineering Mathematics, Digital
>> >> Signal Processing (now studying), Signals & Systems etc. [ More on signals ]
>> >> Thanks for reading! Waiting for your reply.
>> >> -- Surya
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev <at> scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-dev