[SciPy-Dev] Electronics student with programming background - Like to participate in SciPy - Write some code and learn!

josef.pktd@gmai... josef.pktd@gmai...
Tue Jan 22 20:48:29 CST 2013


On Tue, Jan 22, 2013 at 4:16 PM, Daniel Smith <smith.daniel.br@gmail.com> wrote:
>  <josef.pktd <at> gmail.com> writes:
>
>>
>> On Tue, Jan 22, 2013 at 2:26 PM, Daniel Smith <smith.daniel.br <at> gmail.com> wrote:
>> > Hello,
>> >
>> > I am also looking for ways to contribute to Scipy. I have experience
>> > with Python, C/C++ and limited experience with the Numpy C API. In
>> > particular, I have some code implementing the kernel density estimator
>> > bandwidth selection algorithm from the following paper:
>> >
>> > Z. I. Botev, J. F. Grotowski, and D. P. Kroese. Kernel density
>> > estimation via diffusion. The Annals of Statistics, 38(5):2916–2957,
>> > 2010.
>> >
>> > That method is more resilient to multi-modal data than the standard
>> > plug-in estimators. I would love to add that method to the current
>> > SciPy stats package if there is interest.
>>
>> Looks interesting, either for scipy.stats or statsmodels.
>> statsmodels now has KDE with least-squares cross-validation among
>> other bandwidth choices.
>>
>> However, there is nothing to improve boundary effects or that has
>> adaptive bandwidth choice.
>
> Boundary effects are another issue. I don't have working code, but I have seen
> a few algorithms and could certainly add those corrections to existing code.
>
>>
>> Which programming language did you write it in?
>
> Everything is in Python/SciPy/Numpy. The most computationally expensive parts
> are the FFT, iFFT and a fixed point calculation, which are all implemented in
> SciPy/Numpy. The code is reasonably fast as it stands. I could make it faster
> by using Cython or C for calculating the derivatives of the estimated
> probability distribution function (pdf).

The implementation sounds good.
I didn't notice that it uses FFT when I looked at the 42-page paper for
5 to 10 minutes :)

It could also be interesting to tie it in with the FFT-based KDE in statsmodels:
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/nonparametric/kde.py#L377
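
For readers unfamiliar with the idea, here is a minimal sketch of an
FFT-based Gaussian KDE: bin the data on a regular grid, then smooth the
bin counts by multiplying in the Fourier domain. It is neither the
statsmodels implementation linked above nor the Botev et al. algorithm.
The function name, the grid size and the padding are assumptions made
for illustration, and the bandwidth has to be supplied by the caller.

```python
import numpy as np


def fft_gaussian_kde(data, bandwidth, n_grid=1024, pad=3.0):
    """Gaussian KDE on a regular grid, smoothed in the Fourier domain.

    Hypothetical helper for illustration only. `bandwidth` is the standard
    deviation of the Gaussian kernel; `pad` widens the grid by that many
    bandwidths on each side to limit circular wrap-around.
    """
    data = np.asarray(data, dtype=float)
    lo = data.min() - pad * bandwidth
    hi = data.max() + pad * bandwidth

    # Simple binning: count the observations falling in each grid cell.
    counts, edges = np.histogram(data, bins=n_grid, range=(lo, hi))
    dx = edges[1] - edges[0]
    grid = edges[:-1] + 0.5 * dx  # cell centres

    # The Fourier transform of a Gaussian with std h is exp(-0.5 * (h * w)**2).
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_grid, d=dx)
    kernel_ft = np.exp(-0.5 * (bandwidth * omega) ** 2)

    # Circular convolution of the counts with the kernel via the FFT.
    smoothed = np.fft.irfft(np.fft.rfft(counts) * kernel_ft, n=n_grid)

    # Normalise so that the density integrates to one over the grid.
    density = smoothed / (data.size * dx)
    return grid, density


rng = np.random.default_rng(0)
sample = rng.normal(size=500)
grid, density = fft_gaussian_kde(sample, bandwidth=0.3)
```

The cost is dominated by two FFTs of length n_grid, so it does not grow
with the sample size once the data are binned. The price is a small
binning error and some wrap-around at the grid edges.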

Ralph also worked on KDE in scipy.stats and in statsmodels, and will
also have an idea of which might be the better fit.

Do you have the code somewhere publicly available?

Thanks,

Josef

>
>>
>> And out of curiosity: do you know how well the estimator behaves in
>> smaller samples, 200 or 500? The paper seems to consider a sample size
>> of 1000 as small. (very fast skimming of the article)
>
> Personally, I've had pretty good luck going down to 50-100 samples. The exact
> sample size needed largely depends on how ragged the pdf you are estimating is.
>
>>
>> Josef
>>
>> >
>> > Thanks,
>> > Daniel
>> >
>> >> Hi,
>> >>
>> >> I am Surya, studying Junior Year - Electronics & Communication Engineering
>> >> with Computer Science/ Programming background. I have looked into SciPy and
>> >> it's really amazing!
>> >>
>> >> In this regard, I would like to explore the possibility of contributing to
>> >> this project by writing code and simultaneously learn the real engineering
>> >> stuff. My skills lie in Python, Django, C, and a little Facebook API, cloud
>> >> platforms (Openshift), Git.
>> >>
>> >> Also, I wrote some fun projects on weekends that you might like
>> >> to take a look at.
>> >>
>> >> 1. Https://apps.facebook.com/pingmee -- Lets people ping their friends
>> >> using cartoons (Python, Django -- PIL)
>> >> 2. Https://apps.facebook.com/suryaphotography -- social reader framework
>> >> for my photography blog; Not yet finished (Python, Django -- Google Feed
>> >> API) - Got to finish if time permits
>> >> 3. Https://github.com/ksurya -- Github handle
>> >>
>> >> So, I am ready to take up any work and get along with it that involves
>> >> Python!
>> >>
>> >> Regarding my scientific skills, I studied Engineering Mathematics, Digital
>> >> Signal Processing (now studying), Signals & Systems etc. [ More on signals ]
>> >>
>> >>
>> >> Thanks for reading! Waiting for your reply.
>> >>
>> >> -- Surya
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev <at> scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-dev
>>

