[SciPy-dev] stats.distributions.poisson loc parameter : is it wise ?
josef.pktd@gmai...
Thu Aug 6 17:16:13 CDT 2009
On Thu, Aug 6, 2009 at 6:02 PM, Pierre GM<pgmdevlist@gmail.com> wrote:
>
> On Aug 6, 2009, at 5:49 PM, Robert Kern wrote:
>
>> On Thu, Aug 6, 2009 at 16:43, Pierre GM<pgmdevlist@gmail.com> wrote:
>>> Even if
>>> the scale is simply discarded already, using a location will probably
>>> NOT give the expected result
>>
>> It depends on what your expectations are. For the discrete
>> distributions, all the loc parameter means is this, as documented:
>>
>> pmf(x; loc) -> pmf(x-loc)
>>
>> That's it. I don't know why you would expect anything else.
>
> Because using a location parameter, you change the support domain.
> Back to the example of a Poisson distribution with loc=1, the support
> domain is now x>=1, which amounts to truncating the zeroes. The mean
> of a zero-truncated Poisson with parameter pr should be
> pr/(1-exp(-pr)), but we end up with pr+1. Not the expected result.
> I think it's a source of confusion to keep a location parameter for
> discrete distributions. It'd be worth implementing a method to allow
> truncation, but just a loc parameter doesn't do it.
>
loc just shifts the distribution along the real/integer line.
Except for the fit method (which doesn't exist for discrete
distributions), I don't see any real disadvantage to having loc in
there as an option, but I guess in many cases it won't be very useful
either. I think there are also discrete distributions with unbounded
support (-inf to +inf) for which a loc shift would make sense.
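
To make the shift-versus-truncation distinction concrete, here is a small sketch. The rate mu = 2.0 is an arbitrary value picked for illustration (it plays the role of "pr" in the quoted message); the calls are the ordinary scipy.stats.poisson methods.

```python
import numpy as np
from scipy import stats

mu = 2.0  # arbitrary rate, chosen only for illustration
k = np.arange(0, 10)

# loc only shifts the support: pmf(k; mu, loc) equals pmf(k - loc; mu).
shifted_pmf = stats.poisson.pmf(k, mu, loc=1)
plain_pmf = stats.poisson.pmf(k - 1, mu)

# Mean of the shifted distribution: mu + loc.
shifted_mean = stats.poisson.mean(mu, loc=1)

# Mean of a zero-truncated Poisson, computed from the closed form
# quoted above; this is NOT what loc=1 gives you.
truncated_mean = mu / (1.0 - np.exp(-mu))

print(np.allclose(shifted_pmf, plain_pmf))
print(shifted_mean, truncated_mean)
```

With loc=1 the probabilities are unchanged and only moved one step to the right, so the mean is mu + 1 rather than the zero-truncated mean.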
The big advantage of the current setup, as Robert said, is
consistency, both in the implementation and in code that loops over
all (or a large set of) distributions.
But for a long time, I have been all in favor of "fixing" the fit
method, and possibly introducing a semi-frozen distribution class, but
for this I don't see why we should special-case location. Fixing loc
is the main use case, but estimation with the scale parameter fixed,
for example, is also a common use case.
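
As an illustration of estimating with one parameter held fixed, here is a sketch using the floc and fscale keywords that the fit method of continuous distributions accepts in later SciPy releases. This is an assumption relative to this thread: it is not the semi-frozen class proposed here, and the sample data and true parameter values are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up sample: gamma with shape 3 and scale 2, no shift.
data = rng.gamma(shape=3.0, scale=2.0, size=500)

# Hold loc fixed at 0; only shape and scale are estimated.
a_hat, loc_hat, scale_hat = stats.gamma.fit(data, floc=0)

# Made-up sample: normal with mean 5 and unit standard deviation.
norm_data = rng.normal(loc=5.0, scale=1.0, size=500)

# Hold scale fixed at 1; only loc is estimated.
loc_n, scale_n = stats.norm.fit(norm_data, fscale=1.0)

print(a_hat, loc_hat, scale_hat)
print(loc_n, scale_n)
```

Both keywords go through the same mechanism, so location gets no special treatment compared with scale.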
Josef