[SciPy-User] specifying range in scipy.stats.truncnorm

josef.pktd@gmai... josef.pktd@gmai...
Fri Jul 27 22:05:02 CDT 2012

On Fri, Jul 27, 2012 at 8:49 PM, Joon Ro <joonpyro@gmail.com> wrote:
> On Fri 27 Jul 2012 01:39:50 PM CDT, josef.pktd@gmail.com wrote:
>> On Fri, Jul 27, 2012 at 1:58 PM, Joon Ro <joonpyro@gmail.com> wrote:
>>> On Fri 27 Jul 2012 12:07:15 PM CDT, josef.pktd@gmail.com wrote:
>>>> On Fri, Jul 27, 2012 at 12:30 PM, Joon Ro <joonpyro@gmail.com> wrote:
>>>>> Hi,
>>>>> I tried to use scipy.stats.truncnorm and found the way of specifying the
>>>>> parameters of the truncated normal very confusing.
>>>>> I expected the a, b parameters to specify the interval on which I
>>>>> want to truncate the distribution, but that is not the case when the normal
>>>>> I want to use is not standard.
>>>>> According to the documentation, I need to standardize my values - for
>>>>> example, if I want to have a truncated normal with mean 0.5, variance 1, on
>>>>> [0, 1] interval, I need to do:
>>>>> from scipy.stats import truncnorm
>>>>> myclip_a = 0
>>>>> myclip_b = 1
>>>>> my_mean = 0.5
>>>>> my_std = 1
>>>>> a, b = (myclip_a - my_mean) / my_std, (myclip_b - my_mean) / my_std
>>>>> rv = truncnorm(a, b, loc=my_mean, scale=my_std)
>>>>> This is unnecessarily complicated in my opinion. Since we have to provide
>>>>> the location and scale parameters anyway, why not make truncnorm accept the
>>>>> actual interval values (in this case, a, b = 0, 1) instead and do the
>>>>> standardization internally? I think it would be more intuitive that way.
>>>> I agree there are several cases of distributions where the
>>>> parameterization is not very intuitive or common. The problem is that loc
>>>> and scale, and the corresponding transformation of the support, are handled
>>>> generically.
>>>> So, I don't think it's possible to change this without a change in the
>>>> generic setup for the distributions or writing a specific dispatch
>>>> function or class that does the conversion.
>>>> I think, changing the generic setup would break the standard behavior
>>>> of distributions that have a predefined finite support limit, like
>>>> those that are defined for positive real numbers, a=0, or rdist with
>>>> a=-1, b=1.
>>>> Josef
>>> I just took a look at the code, and I agree.
>>> I wonder if it would be possible to add a couple more parameters (in
>>> this case, representing the non-standardized interval) with default
>>> None to the generic rv_continuous class and, when they are passed
>>> instead of a and b, let a distribution-specific function do the
>>> standardization and calculate a and b.
>> a,b are set and would have to be adjusted in _argcheck.
>> _argcheck is currently called only with the shape parameters, but not
>> with loc and scale as argument. It would be possible to adjust this.
>> My guess is that having _argcheck compensate for loc and scale should
>> work. Having a possible change in behavior and extra parameters might
>> get confusing. (distributions are instances and not classes, so care
>> needs to be taken that there are no unwanted spillovers from one use
>> to the next.)
>> If you use frozen distributions, as in your initial example, then
>> doing the reparameterization in the frozen class might be easier than
>> in the original classes.
> I also think changing what a and b represent is the best way but I
> wonder if it is okay (for compatibility reasons)

Not for the distribution classes; we need a and b for the
standard(ized) distributions.

Take lognorm, for example: a=0 in the standard case (loc=0, scale=1),
and the lower bound of the shifted distribution is loc.
Or pareto: a=1, so the lower bound of the shifted distribution is loc+1.

There is no separate way to choose a different lower bound; loc is the
relevant parameter, not a.
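
A quick way to see this is to read off the lower end of the support of a
shifted distribution as its 0-quantile. This is only a sketch: the helper
name lower_bound and the shape values (s=1.0 for lognorm, b=2.0 for pareto)
are made up for illustration, and scale is left at its default of 1.

```python
from scipy import stats


def lower_bound(frozen):
    """Lower end of the support of a frozen distribution (its 0-quantile)."""
    return float(frozen.ppf(0.0))


loc = 2.0

# lognorm has a=0 in the standard case, so the shifted lower bound should be loc.
lognorm_lower = lower_bound(stats.lognorm(1.0, loc=loc))

# pareto has a=1 in the standard case, so the shifted lower bound should be loc + 1.
pareto_lower = lower_bound(stats.pareto(2.0, loc=loc))

print(lognorm_lower, pareto_lower)
```

In both cases the only handle on the lower bound is loc; a itself is fixed by
the distribution.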

Very few distributions, like the truncated distributions, have a and b
explicitly as parameters.
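
For truncnorm specifically, the standardization from the original question
can be wrapped in a small user-side helper that returns a frozen
distribution, without touching the distribution classes. This is a minimal
sketch, not part of scipy: the name truncnorm_from_bounds and its argument
names are hypothetical.

```python
from scipy.stats import truncnorm


def truncnorm_from_bounds(lower, upper, mean=0.0, std=1.0):
    """Frozen truncated normal on [lower, upper] (hypothetical helper).

    mean and std describe the underlying, untruncated normal; they are
    passed on as loc and scale. The bounds are standardized here, because
    truncnorm expects a and b on the standard-normal scale.
    """
    a = (lower - mean) / std
    b = (upper - mean) / std
    return truncnorm(a, b, loc=mean, scale=std)


# The example from the thread: underlying normal with mean 0.5 and std 1,
# truncated to [0, 1].
rv = truncnorm_from_bounds(0.0, 1.0, mean=0.5, std=1.0)
samples = rv.rvs(size=5, random_state=0)
print(samples)
```

The helper only hides the conversion; a and b of the underlying truncnorm
remain in standardized units.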


> -Joon
