PercentRank reloaded

The community user @randyt has been able to stretch backtrader to its limits. Finding some of the obscure corners, even adding pdb statements here and there, and has been the driving force behind getting a much more refined synchronization of resampled streams.

Lately, @randyt added a pull request to integrate a new indicator named PercentRank. Here is the original code

class PercentRank(bt.Indicator):
    lines = ('pctrank',)
    params = (('period', 50),)

    def __init__(self):
        self.addminperiod(self.p.period)

    def next(self):
        self.lines.pctrank[0] = \
            (math.fsum([x < self.data[0]
                       for x in self.data.get(size=self.p.period)])
            / self.p.period)
        super(PercentRank, self).__init__()

It really shows how someone has got into the source code of backtrader, fired some questions, and grasped some concepts. This is really good:

self.addminperiod(self.p.period)

Unexpected, because end users wouldn’t be expected to even know that someone can use that API call in a lines objects. This call tells the machinery to make sure that the indicator will have at least self.p.period samples of the data feeds available, because they are needed for the calculation.

In the original code a self.data.get(size=self.p.period) can be seen, which will only work if the background engine has made sure that those many samples are available before making the 1^st ever calculation (and if exactbars is used to reduce memory usage, that those many samples are always there)

Initial Reload

The code can be reworked to take advantage of pre-existing utilities that are intended to alleviate the development. Nothing end users have to be aware of, but good to know if one is constantly developing or prototyping indicators.

class PercentRank_PeriodN1(bt.ind.PeriodN):
    lines = ('pctrank',)
    params = (('period', 50),)

    def next(self):
        d0 = self.data[0]  # avoid dict/array lookups each time
        dx = self.data.get(size=self.p.period)
        self.l.pctrank[0] = math.fsum((x < d0 for x in dx)) / self.p.period

Reusing PeriodN is key to remove the self.addminperiod magic and make the indicator somehow more amenable. PeriodN already features a period params and will make the call for the user (remember to call super(cls, self).__init__() if __init__ is overridden.

The calculation has been broken into 3 lines to cache dictionary and array lookups in the first place and make it more readable (although the latter is just a matter of taste)

The code has also gone down from 13 to 8 lines. This usually helps when reading.

Reloading via OperationN

Existing indicators like SumN, which sums the values of a data source over a period, do not build directly upon PeriodN like above, but on a subclass of it named OperationN. Like its parent class it still defines no lines and has a class attribute named func.

func will be called with an array which contains the data of the period the host function hast to operate on. The signature basically is: func(data[0:period]) and returns something suitable to be stored in a line, i.e.: a float value.

Knowing that, we can try the obvious

class PercentRank_OperationN1(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = (lambda d: math.fsum((x < d[-1] for x in d)) / self.p.period)

Down to 4 lines. But this will fail with (only last line needed):

TypeError: <lambda>() takes 1 positional argument but 2 were given

(Use --strat n1=True to make the sample fail)

By putting our unnamed function inside func it seems to have been turned into a method, because it’s taking two parameters. This can be quickly cured.

class PercentRank_OperationN2(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = (lambda self, d: math.fsum((x < d[-1] for x in d)) / self.p.period)

And it works. But there is something ugly: this is not how one would be expecting most of the time to pass a function, i.e.: taking self as a parameter. In this case, we have control of the function, but this may not always be the case (a wrapper would be needed to work around it)

Syntactic sugar in Python comes to the rescue with staticmethod, but before we even do that, we know the reference to self.p.period will no longer be possible in a staticmethod, losing the ability to make the average calculation as before.

But since func receives an iterable with a fixed length, len can be used.

And now the new code.

class PercentRank_OperationN3(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = staticmethod(lambda d: math.fsum((x < d[-1] for x in d)) / len(d))

All good and fine, but this has put some thinking into why it wasn’t thought before to give users the chance to pass their own function. Subclassing OperationN is a good option, but something better can be around the corner avoiding the need to use staticmethod or take self as a parameter and building upon the machinery in backtrader.

Let’s define a handy subclass of OperationN.

class ApplyN(bt.ind.OperationN):
    lines = ('apply',)
    params = (('func', None),)

    def __init__(self):
        self.func = self.p.func
        super(ApplyN, self).__init__()

This should probably have been in the platform already a long time ago. The only real discern here would be if the lines = ('apply',) has to be there or users should be free to define that line and some others. Something to consider before integration.

With ApplyN in hand the final versions of PercentRank fully comply with all things we expected. First, the version with manual average calculation.

class PercentRank_ApplyN(ApplyN):
    params = (
        ('period', 50),
        ('func', lambda d: math.fsum((x < d[-1] for x in d)) / len(d)),
    )

Without breaking PEP-8 we could still reformat both to fit in 3 lines … Good!

Let’s run the sample

The sample which can be seen below has the usual skeleton boilerplate, but is intended to show a visual comparison of the different PercentRank implementations.

Note

Execute it with --strat n1=True to try the PercentRank_OperationN1 version which doesn’t work

Graphic output.

Sample Usage

$ ./percentrank.py --help
usage: percentrank.py [-h] [--data0 DATA0] [--fromdate FROMDATE]
                      [--todate TODATE] [--cerebro kwargs] [--broker kwargs]
                      [--sizer kwargs] [--strat kwargs] [--plot [kwargs]]

Sample Skeleton

optional arguments:
  -h, --help           show this help message and exit
  --data0 DATA0        Data to read in (default:
                       ../../datas/2005-2006-day-001.txt)
  --fromdate FROMDATE  Date[time] in YYYY-MM-DD[THH:MM:SS] format (default: )
  --todate TODATE      Date[time] in YYYY-MM-DD[THH:MM:SS] format (default: )
  --cerebro kwargs     kwargs in key=value format (default: )
  --broker kwargs      kwargs in key=value format (default: )
  --sizer kwargs       kwargs in key=value format (default: )
  --strat kwargs       kwargs in key=value format (default: )
  --plot [kwargs]      kwargs in key=value format (default: )

Sample Code

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)


import argparse
import datetime
import math

import backtrader as bt



class PercentRank(bt.Indicator):
    lines = ('pctrank',)
    params = (('period', 50),)

    def __init__(self):
        self.addminperiod(self.p.period)

    def next(self):
        self.lines.pctrank[0] = \
            (math.fsum([x < self.data[0]
                       for x in self.data.get(size=self.p.period)])
            / self.p.period)
        super(PercentRank, self).__init__()


class PercentRank_PeriodN1(bt.ind.PeriodN):
    lines = ('pctrank',)
    params = (('period', 50),)

    def next(self):
        d0 = self.data[0]  # avoid dict/array lookups each time
        dx = self.data.get(size=self.p.period)
        self.l.pctrank[0] = math.fsum((x < d0 for x in dx)) / self.p.period


class PercentRank_OperationN1(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = (lambda d: math.fsum((x < d[-1] for x in d)) / self.p.period)


class PercentRank_OperationN2(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = (lambda self, d: math.fsum((x < d[-1] for x in d)) / self.p.period)


class PercentRank_OperationN3(bt.ind.OperationN):
    lines = ('pctrank',)
    params = (('period', 50),)
    func = staticmethod(lambda d: math.fsum((x < d[-1] for x in d)) / len(d))


class ApplyN(bt.ind.OperationN):
    lines = ('apply',)
    params = (('func', None),)

    def __init__(self):
        self.func = self.p.func
        super(ApplyN, self).__init__()


class PercentRank_ApplyN(ApplyN):
    params = (
        ('period', 50),
        ('func', lambda d: math.fsum((x < d[-1] for x in d)) / len(d)),
    )


class St(bt.Strategy):
    params = (
        ('n1', False),
    )

    def __init__(self):
        PercentRank()
        PercentRank_PeriodN1()
        if self.p.n1:
            PercentRank_OperationN1()
        PercentRank_OperationN2()
        PercentRank_OperationN3()
        PercentRank_ApplyN()

    def next(self):
        pass


def runstrat(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    # Data feed kwargs
    kwargs = dict()

    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    for a, d in ((getattr(args, x), x) for x in ['fromdate', 'todate']):
        if a:
            strpfmt = dtfmt + tmfmt * ('T' in a)
            kwargs[d] = datetime.datetime.strptime(a, strpfmt)

    # Data feed
    data0 = bt.feeds.BacktraderCSVData(dataname=args.data0, **kwargs)
    cerebro.adddata(data0)

    # Broker
    cerebro.broker = bt.brokers.BackBroker(**eval('dict(' + args.broker + ')'))

    # Sizer
    cerebro.addsizer(bt.sizers.FixedSize, **eval('dict(' + args.sizer + ')'))

    # Strategy
    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))

    # Execute
    cerebro.run(**eval('dict(' + args.cerebro + ')'))

    if args.plot:  # Plot if requested to
        cerebro.plot(**eval('dict(' + args.plot + ')'))


def parse_args(pargs=None):
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description=(
            'Sample Skeleton'
        )
    )

    parser.add_argument('--data0', default='../../datas/2005-2006-day-001.txt',
                        required=False, help='Data to read in')

    # Defaults for dates
    parser.add_argument('--fromdate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--todate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--cerebro', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--broker', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--sizer', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--strat', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--plot', required=False, default='',
                        nargs='?', const='{}',
                        metavar='kwargs', help='kwargs in key=value format')

    return parser.parse_args(pargs)


if __name__ == '__main__':
    runstrat()