[Numpy-discussion] add xirr to numpy financial functions?

josef.pktd@gmai... josef.pktd@gmai...
Mon May 25 17:29:55 CDT 2009


On Mon, May 25, 2009 at 4:27 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
> On Mon, May 25, 2009 at 3:40 PM, Joe Harrington <jh@physics.ucf.edu> wrote:
>> On Mon, 25 May 2009 13:51:38 -0400, josef.pktd@gmail.com wrote:
>>> On Mon, May 25, 2009 at 11:50 AM, Joe Harrington <jh@physics.ucf.edu> wrote:
>>> > On Sun, 24 May 2009 18:14:42 -0400 josef.pktd@gmail.com wrote:
>>> >> On Sun, May 24, 2009 at 4:33 PM, Joe Harrington <jh@physics.ucf.edu> wrote:
>>> >> > I hate to ask for another function in numpy, but there's an obvious
>>> >> > one missing in the financial group: xirr. ?It could be done as a new
>>> >> > function or as an extension to the existing np.irr.
>>> >> >
>>> >> > The internal rate of return (np.irr) is defined as the growth rate
>>> >> > that would give you a zero balance at the end of a period of
>>> >> > investment given a series of cash flows into or out of the investment
>>> >> > at regular intervals (the first and last cash flows are usually an
>>> >> > initial deposit and a withdrawal of the current balance).
>>> >> >
>>> >> > This is useful in academics, but if you're tracking a real investment,
>>> >> > you don't just withdraw or add money on a perfectly annual basis, nor
>>> >> > do you want a calc with thousands of days of zero entries just so you
>>> >> > can handle the uneven intervals by evening them out. ?Both excel and
>>> >> > openoffice define a "xirr" function that pairs each cash flow with a
>>> >> > date. ?Would there be an objection to either a xirr or adding an
>>> >> > optional second arg (or a keyword arg) to np.irr in numpy? ?Who writes
>>> >> > the code is a different question, but that part isn't hard.
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> 3 comments:
>>> >>
>>> >> * open office has also the other function in an x??? version, so it
>>> >> might be good to add it consistently to all functions
>>> >>
>>> >> * date type: scikits.timeseries and the gsoc for implementing a date
>>> >> type would be useful to have a clear date type, or would you want to
>>> >> base it only on python standard library
>>> >>
>>> >> * real life accuracy: given that there are large differences in the
>>> >> definition of a year for financial calculations, any simple
>>> >> implementation would be only approximately accurate. for example in
>>> >> the open office help, oddlyield list the following option
>>> >>
>>> >> Basis is chosen from a list of options and indicates how the year is
>>> >> to be calculated.
>>> >> Basis Calculation
>>> >> 0 or missing US method (NASD), 12 months of 30 days each
>>> >> 1 Exact number of days in months, exact number of days in year
>>> >> 2 Exact number of days in month, year has 360 days
>>> >> 3 Exact number of days in month, year has 365 days
>>> >> 4 European method, 12 months of 30 days each
>>> >>
>>> >> So, my question: what's the purpose of the financial function in numpy?
>>> >> Currently it provides convenient functions for (approximate) interest
>>> >> calculations.
>>> >> If they get expanded to a "serious" implementation of, for example,
>>> >> the main financial functions listed in the open office help (just for
>>> >> reference) then maybe numpy is not the right location for it.
>>> >>
>>> >> I started to do something similar in matlab, and once I tried to use
>>> >> real dates instead of just counting months, the accounting rules get
>>> >> quickly very messy.
>>> >>
>>> >> Using dates as you propose would be very convenient, but the users
>>> >> shouldn't be surprised that their actual payments at the end of the
>>> >> year don't fully match up with what numpy told them.
>>> >>
>>> >> my 3cents
>>> >>
>>> >> Josef
>>> >
>>> > First point: agreed. ?I wish this community had a design review
>>> > process for numpy and scipy, so that these things could get properly
>>> > hashed out, and not just one person (even Travis) suggesting something
>>> > and everyone else saying yeah-sure-whatever.
>>> >
>>> > Does anyone on the list have the financial background to suggest what
>>> > functions "should" be included in a basic set of financial routines?
>>> > xirr is the only one I've ever used in a spreadsheet, myself.
>>> >
>>> > Other points: Yuk. ?You're right.
>>> >
>>> > When these first came up for discussion, I had a Han Solo moment
>>> > ("I've got a baaad feeling about this...") but I couldn't put my
>>> > finger on why. ?They seemed like simple and limited functions with
>>> > high utility. ?Certainly anything as open-ended as financial-industry
>>> > rules should go elsewhere (scikits, scipy, monpy, whatever).
>>> >
>>> > But, that doesn't prevent a user-supplied, floating-point time array
>>> > from going into a function in numpy. ?The rate of return would be in
>>> > units of that array. ?Functions that convert date/time in some format
>>> > (or many) and following some rule (or one of many) to such a floating
>>> > array can still go elsewhere, maintained by people who know the
>>> > definitions, if they have interest (pun intended). ?That would make
>>> > the functions in numpy much more useful without bloating them or
>>> > making them a maintenance nightmare.
>>> >
>>>
>>> If you think of time just as a regularly spaced, e.g. days, but with
>>> sparse points on it, or as a continuous variable, then extending the
>>> current functions should be relatively easy. I guess the only
>>> questions are compounding, annual, quarterly or at each payment, and
>>> whether the annual rate is calculated as real compounded annualized
>>> rate or as accounting annual rate, e.g. quarterlyrate*4.
>>>
>>> This leaves "What is the present value, if you get 100 Dollars at the
>>> 10th day of each month (or at the next working day if the 10th day is
>>> a holiday or a weekend) for the next 5 years and the monthly interest
>>> rate is 5/12%?"   for another day.
>>>
>>> Initially I understood you wanted the date as a string or date type as
>>> in e.g open office. What would be the units of the user-supplied,
>>> floating-point time array?
>>> It is still necessary to know the time units to provide an annualized
>>> rate, unless the rate is in continuous time, exp(r*t). I don't know
>>> whether this would apply to all functions in numpy.finance, it's a
>>> while since I looked at the code. Maybe there are some standard
>>> simplifications in open office or excel.
>>>
>>> I briefly skimmed the list of function in the open office help, and it
>>> would be useful to have them available, e.g. as a package in scipy.
>>> But my google searches in the past for applications in finance with a
>>> compatible license didn't provide much useful code that could form the
>>> basis of a finance package.
>>>
>>> Adding more convenience and functionality to numpy.finance is useful,
>>> but if they get extended with slow feature creep, then another
>>> location (scipy) might be more appropriate and would be more
>>> expandable, even if it happens only slowly.
>>>
>>> That's just my opinion (obviously), I'm a relative newbie to
>>> numpy/scipy and still working my way through all the different
>>> subpackages.
>>
>> np.irr is defined on (anonymous) constant time intervals and gives you
>> the growth per time interval.  The code is very short, basically a
>> call to np.roots(values):
>>
>> def irr(values):
>>    """
>>    Return the Internal Rate of Return (IRR).
>>
>>    This is the rate of return that gives a net present value of 0.0.
>>
>>    Parameters
>>    ----------
>>    values : array_like, shape(N,)
>>        Input cash flows per time period.  At least the first value would be
>>        negative to represent the investment in the project.
>>
>>    Returns
>>    -------
>>    out : float
>>        Internal Rate of Return for periodic input values.
>>
>>    Examples
>>    --------
>>    >>> np.irr([-100, 39, 59, 55, 20])
>>    0.2809484211599611
>>
>>    """
>>    res = np.roots(values[::-1])
>>    # Find the root(s) between 0 and 1
>>    mask = (res.imag == 0) & (res.real > 0) & (res.real <= 1)
>>    res = res[mask].real
>>    if res.size == 0:
>>        return np.nan
>>    rate = 1.0/res - 1
>>    if rate.size == 1:
>>        rate = rate.item()
>>    return rate
>>
>> So, I think this is a continuous definition of growth, not some
>> periodic compounding.
>>
>> I'd propose the time array would be in anonymous units, and the result
>> would be in terms of those units.  For example, if an interval of 1.0
>> in the time array were one fortnight, it would give interest in units
>> of continuous growth per fortnight, etc.  Anything with many more
>> options than that does not belong in numpy (but it would be
>> interesting to have elsewhere).
>>
>
> Here is my stab at xirr.  It depends on the python datetime module and
> the Newton - Raphson algorithm in scipy.optimize, but it could be
> taken as a starting point if someone wants to get rid of the
> dependencies (I haven't worked too much with dates or NR before).  The
> reference for the open office version is here
> <http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_XIRR_function>,
> and it performs in exactly the same way (assumes 365 days a year).  It
> also doesn't take a 'begin' or 'end' argument for when the payments
> are made. but this is already in the numpy.financial and could be
> added easily.
>
> def _discf(rate, pmts, dates):
>    import numpy as np
>    dcf=[]
>    for i,cf in enumerate(pmts):
>        d=dates[i]-dates[0]
>        dcf.append(cf*(1+rate)**(-d.days/365.))
>    return np.add.reduce(dcf)
>
> def xirr(pmts, dates, guess=.10):
>    '''
>    IRR function that accepts irregularly spaced cash flows
>
>    Parameters
>    ----------
>    values: array_like
>          Contains the cash flows including the initial investment
>    dates: array_like
>          Contains the dates of payments as in the form (year, month, day)
>
>    Returns: Float
>          Internal Rate of Return
>
>    Notes
>    ----------
>    In general the xirr is the solution to
>
>    .. math:: \sum_{t=0}^M{\frac{v_t}{(1+xirr)^{(date_t-date_0)/365}}} = 0
>
>
>    Examples
>    --------------
>    dates=[[2008,2,5],[2008,7,5],[2009,1,5]]
>    pmts=[-2750,1000,2000]
>    print xirr(pmts,dates)
>    '''
>    from datetime import date
>    from scipy.optimize import newton
>
>    for i,dt in enumerate(dates):
>        dates[i]=date(*dt)
>
>    f = lambda x: _discf(x, pmts, dates)
>
>    return newton(f, guess)
>
> if __name__=="__main__":
>    dates=[[2008,2,5],[2008,7,5],[2009,1,5]]
>    pmts=[-2750,1000,2000]
>    print xirr(pmts,dates)


While I was still trying to think about the general problem, Skipper
already implemented a solution.

The advantage of Skippers implementation using actual dates instead of
just an array of numbers is that it is possible to directly calculate
the annual irr, since the time units are well specified. The only
problem is the need for an equation solver in numpy. Just using a date
tuple would remove the problem of string parsing, and it might be
possible to extend it later to a date array.

So, I think it would be possible to include Skippers solution, with
some cleanup and testing, if an equation solver can be found or if
np.roots can handle high order (sparse) polynomials.

Below is my original message, which is based on the assumption of a
date array that is just an array of numbers without any time units
associated with it.

Josef

"""
>From my reading of the current irr, you have compounding of the
interest rate at the given time interval. So if your data is daily
data, you would get a daily interest rate with daily compounded rates,
which might not be the most interesting number that the user wants.
For the iir function it would still be very easy for the user to
convert the daily or monthly rate to the annualized rate, (1+r_d)**365
-1  (1+r_m)**12 -1 (?).

For the implementation, would np.roots still work if you have 1000
days for example, or 360 months, or a few hundred fortnights? What
would be the alternative in numpy for finding the root? equation
solvers are in scipy.

For arbitrary time units with possible large numbers, working with exp
should be easier . In this case the exponent would be floats and not
integers, so not a polynomial.

I think in the continuous time version, we need to solve for r in
sum(values*exp(-r*dates)) = 0

Can this be done in numpy?
If dates are floats where the unit is one year, then this would give
the continuously compounded annual rate, I think.

Another property of the current function, that I just realized, is,
that it doesn't allow for negative interest rates. This might not be a
problem for the intended use, but if you look at real, i.e. inflation
adjusted, interest rates then it happens often enough.

Other options that might work if np.roots can handle it, would be to
use integer time internally but fractional time from the user where
the integer unit would be the reference period and the fractions would
be for example 2/12 for the second month. I never tried this but using
fractional units has a long enough tradition in finance.
Or that the user optionally specifies the time units ("y" or "m" or
"d") or number of periods per year (365, 12, 52, 26)
"""


More information about the Numpy-discussion mailing list