Rebalancing with the Conservative Formula

The Conservative Formula approach is presented in this paper: The Conservative Formula in Python: Quantitative Investing made Easy

It is one of many possible rebalancing approaches, but one that is easy to grasp. A summary of the approach:

• `x` stocks are selected from a universe of `Y` (100 of 1000)

• The selection criteria are

• Low volatility
• High Net Payout Yield
• High Momentum
• Rebalancing every month

With this in mind, let's present a possible implementation in backtrader.

The data

Even if one has a winning strategy, nothing will actually be won if no data is available for it. It therefore has to be considered what the data looks like and how to load it.

A set of CSV ("comma-separated values") files is assumed to be available, with the following features

• `ohlcv` monthly data

• With an extra field after the `v` containing the Net Payout Yield (`npy`), to have an `ohlcvn` data set.

The format of the CSV data will therefore look like this

```
date, open, high, low, close, volume, npy
2001-12-31, 1.0, 1.0, 1.0, 1.0, 0.5, 3.0
2002-01-31, 2.0, 2.5, 1.1, 1.2, 3.0, 5.0
...
```

I.e.: one row per month. The data loader can now be prepared, for which a simple extension of the generic built-in CSV loader delivered with backtrader will be created.

```python
class NetPayOutData(bt.feeds.GenericCSVData):
    lines = ('npy',)  # add a line containing the net payout yield
    params = dict(
        npy=6,  # npy field is at column index 6 (0-based), i.e. the 7th column
        dtformat='%Y-%m-%d',  # fix the date format to yyyy-mm-dd
        timeframe=bt.TimeFrame.Months,  # fix the timeframe to monthly
        openinterest=-1,  # -1 indicates there is no openinterest field
    )
```

And that is it. Notice how easy it has been to add a point of fundamental data to the `ohlcv` data stream.

1. By using the expression `lines=('npy',)`. The other usual fields (`open`, `high`, ...) are already part of `GenericCSVData`

2. By indicating the loading position with the `params = dict(npy=6)`. The other fields have a predefined position.

The timeframe has also been updated in the parameters to reflect the monthly nature of the data.
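To see why index `6` is the right loading position, the sample rows shown earlier can be split with the standard library alone. This is only an illustrative sketch: the `SAMPLE` text and the `positions` mapping are made up for the check and are not part of backtrader.

```python
import csv
import io

# Sample rows in the ohlcvn layout described above (illustrative values only)
SAMPLE = """date, open, high, low, close, volume, npy
2001-12-31, 1.0, 1.0, 1.0, 1.0, 0.5, 3.0
2002-01-31, 2.0, 2.5, 1.1, 1.2, 3.0, 5.0
"""

# skipinitialspace drops the blank that follows each comma
reader = csv.reader(io.StringIO(SAMPLE), skipinitialspace=True)
header = next(reader)

# Map each field name to its 0-based column index
positions = {name: idx for idx, name in enumerate(header)}
print(positions['npy'])  # 6, i.e. the seventh column

# Read the npy value of every data row using that index
npy_values = [float(row[positions['npy']]) for row in reader]
print(npy_values)  # [3.0, 5.0]
```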

Note

See Docs - Data Feeds Reference - GenericCSVData for the actual fields and loading positions (which can all be customized)

The data loader will have to be properly instantiated with a file name, but that's something for later, when a standard boilerplate is presented below to have a complete script.

The Strategy

Let's put the logic into a standard backtrader strategy. To make it as generic and customizable as possible, the same `params` approach will be used as before with the data.

Before delving into the strategy, let's consider one of the points from the quick summary

• `x` stocks are selected from a universe of `Y`

The strategy itself is not in charge of adding stocks to the universe, but it is in charge of the selection. One could be in a situation in which only 50 stocks have been added and still try to select 100 if `x` and `Y` are fixed in the code. To cope with such situations, the following will be done:

• Have a `selcperc` parameter with a value of `0.10` (i.e.: `10%`), to indicate the fraction of the universe to be selected.

This means that if 1000 stocks are present, only 100 will be selected, and if the universe consists of 50 stocks, only 5 will be selected.
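The arithmetic can be checked in isolation. The helper name `selection_count` is hypothetical; it simply mirrors the `int(len(self.datas) * self.p.selcperc)` expression used later in the strategy.

```python
def selection_count(universe_size, selcperc=0.10):
    """Number of stocks to select; int() truncates towards zero."""
    return int(universe_size * selcperc)


print(selection_count(1000))  # 100
print(selection_count(50))    # 5
print(selection_count(9))     # 0: fewer than 10 stocks selects nothing
```

The last case is worth keeping in mind: with a universe smaller than 10 stocks and the default `0.10`, nothing is selected, and a later division by the number of selected stocks would fail.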

As for the formula ranking the stock, it looks like this:

• `(momentum * net payout) / volatility`

Which means that those with higher momentum, higher payout and lower volatility will have a higher score.
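A quick worked example with made-up numbers shows the effect of each factor. The function `rank_score` and the three tickers are hypothetical and only serve to illustrate the formula.

```python
def rank_score(momentum, net_payout, volatility):
    """Score from the ranking formula: (momentum * net payout) / volatility."""
    return (momentum * net_payout) / volatility


# Hypothetical stocks: BBB is twice as volatile as AAA,
# CCC has a quarter of the momentum of AAA
scores = {
    'AAA': rank_score(momentum=0.20, net_payout=4.0, volatility=0.05),
    'BBB': rank_score(momentum=0.20, net_payout=4.0, volatility=0.10),
    'CCC': rank_score(momentum=0.05, net_payout=4.0, volatility=0.05),
}

best = max(scores, key=scores.get)
print(best)  # AAA
```

Doubling the volatility halves the score of `BBB`, and the lower momentum of `CCC` pushes it to the bottom.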

For `momentum` the `RateOfChange` indicator (aka `ROC`) will be used, which measures the ratio of change in prices over a period.

The `net payout` is already part of the data feed.

To calculate the `volatility`, the `StandardDeviation` of the `n-periods` return of the stock (`n-periods`, because things will be kept as parameters) will be used.
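To make the two calculations concrete, here is a plain Python sketch on a short list of hypothetical monthly closes. It is not the backtrader implementation: the helper names are made up, and the use of the population standard deviation (`statistics.pstdev`) is an assumption made for this illustration.

```python
from statistics import pstdev


def pct_change(prices, period=1):
    """Return of each bar relative to the bar `period` steps earlier."""
    return [prices[i] / prices[i - period] - 1.0
            for i in range(period, len(prices))]


def momentum(prices, mperiod):
    """Rate of change between the last price and the one mperiod bars back."""
    past = prices[-1 - mperiod]
    return (prices[-1] - past) / past


def volatility(prices, rperiod, vperiod):
    """Standard deviation of the last vperiod returns of length rperiod."""
    returns = pct_change(prices, rperiod)
    return pstdev(returns[-vperiod:])


prices = [100.0, 110.0, 99.0, 108.9, 119.79]  # hypothetical monthly closes

print(round(momentum(prices, 4), 4))       # 0.1979
print(round(volatility(prices, 1, 4), 4))  # 0.0866
```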

With this information, the strategy can already be initialized with the right parameters and the setup of the indicators and calculations which will later be used in each monthly iteration.

First the declaration and the parameters

```python
class St(bt.Strategy):
    params = dict(
        selcperc=0.10,  # percentage of stocks to select from the universe
        rperiod=1,  # period for the returns calculation, default 1 period
        vperiod=36,  # lookback period for volatility - default 36 periods
        mperiod=12,  # lookback period for momentum - default 12 periods
        reserve=0.05  # 5% reserve capital
    )
```

Notice that something not mentioned above has been added: a parameter `reserve=0.05` (i.e. 5%), which is used to calculate the percentage allocation per stock while keeping a reserve capital in the bank. Although for a simulation one could conceivably want to use 100% of the capital, doing so can hit the usual problems, such as price gaps and floating point precision, and end up missing some of the market entries.

Before anything else, a small logging method is created, which makes it possible to log how the portfolio is rebalanced.

```python
    def log(self, arg):
        print('{} {}'.format(self.datetime.date(), arg))
```

At the beginning of the `__init__` method, the number of stocks to select is calculated and the reserve capital parameter is applied to determine the per-stock percentage of the bank.

```python
    def __init__(self):
        # calculate 1st the amount of stocks that will be selected
        self.selnum = int(len(self.datas) * self.p.selcperc)

        # allocation perc per stock
        # reserve kept to make sure orders are not rejected due to
        # margin. Prices are calculated when known (close), but orders can only
        # be executed next day (opening price). Price can gap upwards
        self.perctarget = (1.0 - self.p.reserve) / self.selnum
```
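The per-stock allocation divides the non-reserved capital equally among the selected stocks. A small standalone check, with a hypothetical helper `per_stock_target` that also guards against an empty selection (a guard the strategy code itself does not have):

```python
def per_stock_target(reserve, selnum):
    """Fraction of the portfolio value allocated to each selected stock."""
    if selnum <= 0:
        # int(universe * selcperc) can be 0 for a very small universe
        raise ValueError('no stocks selected, cannot compute an allocation')
    return (1.0 - reserve) / selnum


target = per_stock_target(reserve=0.05, selnum=100)
print(round(target, 4))  # 0.0095, i.e. 0.95% of the portfolio per stock
```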

And finally the initialization is over with the calculation of the per stock indicators for volatility and momentum, which are then applied in the per stock ranking formula calculation.

```python
        # returns, volatilities and momentums
        rs = [bt.ind.PctChange(d, period=self.p.rperiod) for d in self.datas]
        vs = [bt.ind.StdDev(ret, period=self.p.vperiod) for ret in rs]
        ms = [bt.ind.ROC(d, period=self.p.mperiod) for d in self.datas]

        # simple rank formula: (momentum * net payout) / volatility
        # the highest ranked: low vol, large momentum, large payout
        self.ranks = {d: d.npy * m / v for d, v, m in zip(self.datas, vs, ms)}
```

It's now time to iterate each month. The ranking is available in the `self.ranks` dictionary. The key/value pairs have to be sorted in each iteration, to find out which items have to go and which ones have to be part of the portfolio (remain or be added).

```python
    def next(self):
        # sort data and current rank
        ranks = sorted(
            self.ranks.items(),  # get the (d, rank), pair
            key=lambda x: x[1][0],  # use rank (elem 1) and current time "0"
            reverse=True,  # highest ranked 1st ... please
        )
```

The iterable is sorted in reverse order, because the ranking formula delivers higher scores for the highest ranked stocks.

Rebalancing is now due.

Rebalancing 1: Get Top Ranked and the stocks with open positions

```python
        # put top ranked in dict with data as key to test for presence
        rtop = dict(ranks[:self.selnum])

        # For logging purposes of stocks leaving the portfolio
        rbot = dict(ranks[self.selnum:])
```

A bit of Python trickery is happening here, because a `dict` is being used. The reason is that if the top ranked stocks were put in a `list` the operator `==` would be used internally by Python to check for presence with the operator `in`. And although improbable it would be possible for two stocks to have the same value on the same day. When using a `dict` a hash value is used when checking for presence of an item as part of the keys.

Note: For logging purposes `rbot` (ranked bottom) is also created with the stocks not present in `rtop`.
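The sort-and-split step can be tried outside of backtrader with plain values. In this sketch the keys are hypothetical ticker strings and the values are plain floats, whereas in the strategy the keys are data feeds and the values are line objects indexed with `[0]`.

```python
# Hypothetical current rank values keyed by ticker
current = {'CCC': 4.0, 'AAA': 16.0, 'DDD': 1.5, 'BBB': 8.0}
selnum = 2  # number of stocks to keep

# Highest score first
ranks = sorted(current.items(), key=lambda kv: kv[1], reverse=True)

rtop = dict(ranks[:selnum])  # top ranked, keyed for fast presence checks
rbot = dict(ranks[selnum:])  # everything else, kept for logging

print(list(rtop))  # ['AAA', 'BBB']
print(list(rbot))  # ['CCC', 'DDD']
print('CCC' in rtop)  # False
```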

To later discriminate between stocks that have to leave the portfolio, those which simply have to be rebalanced and the newly top ranked, a current list of stocks in the portfolio is prepared.

```python
        # prepare quick lookup list of stocks currently holding a position
        posdata = [d for d, pos in self.getpositions().items() if pos]
```

Rebalancing 2: Sell those no longer top ranked

Just like in the real world, in the backtrader ecosystem selling before buying is a must to ensure enough cash is there.

```python
        # remove those no longer top ranked
        # do this first to issue sell orders and free cash
        for d in (d for d in posdata if d not in rtop):
            self.log('Leave {} - Rank {:.2f}'.format(d._name, rbot[d][0]))
            self.order_target_percent(d, target=0.0)
```

Stocks currently with an open position and no longer top ranked are sold (i.e. `target=0.0`).

Note

A simple `self.close(data)` would have sufficed here, rather than explicitly stating the target percentage.

Rebalancing 3: Issue a target order for all top ranked stocks

The total portfolio value changes over time and those stocks already in the portfolio may have to slightly increase/reduce the current position to match the expected percentage. `order_target_percent` is an ideal method to enter the market, because it does automatically calculate whether a `buy` or a `sell` order is needed.

```python
        # rebalance those already top ranked and still there
        for d in (d for d in posdata if d in rtop):
            self.log('Rebal {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
            del rtop[d]  # remove it, to simplify next iteration
```
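The idea behind a target order can be sketched as follows. This is a simplified illustration and not backtrader's actual implementation: the helper `target_delta` is hypothetical, it ignores commissions and truncates to whole shares.

```python
def target_delta(portfolio_value, target_pct, position_size, price):
    """Shares to buy (positive) or sell (negative) to reach the target."""
    target_value = portfolio_value * target_pct
    current_value = position_size * price
    return int((target_value - current_value) / price)


# Illustrative numbers: 100,000 portfolio, 25% target, price of 50
print(target_delta(100000.0, 0.25, 400, 50.0))  # 100: under target, buy
print(target_delta(100000.0, 0.25, 600, 50.0))  # -100: over target, sell
print(target_delta(100000.0, 0.25, 0, 50.0))    # 500: new position
```

The sign of the result is what decides between a `buy` and a `sell`, which is why the same call can be used for stocks already held and for new entries.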

Rebalancing the stocks that already have a position is done before adding the new ones to the portfolio, as the new ones will only issue `buy` orders and consume cash. Having removed the existing stocks from `rtop` with `del rtop[d]` after rebalancing them, the remaining stocks in `rtop` are those which will be newly added to the portfolio.

```python
        # issue a target order for the newly top ranked stocks
        # do this last, as this will generate buy orders consuming cash
        for d in rtop:
            self.log('Enter {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
```
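The three passes amount to splitting the stocks into three groups. A compact sketch with hypothetical tickers in place of data feeds (the function `classify` is made up for the illustration):

```python
def classify(posdata, rtop):
    """Split stocks into those leaving, being rebalanced and entering."""
    exits = [d for d in posdata if d not in rtop]      # sold first
    rebalance = [d for d in posdata if d in rtop]      # adjusted second
    enters = [d for d in rtop if d not in posdata]     # bought last
    return exits, rebalance, enters


posdata = ['AAA', 'CCC']            # currently holding a position
rtop = {'AAA': 16.0, 'BBB': 8.0}    # currently top ranked

exits, rebalance, enters = classify(posdata, rtop)
print(exits)      # ['CCC']
print(rebalance)  # ['AAA']
print(enters)     # ['BBB']
```

The order matters for cash: exits free cash, rebalancing may free or consume a little, and entries only consume it.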

Running it all and Evaluating it!

Having a data loader class and the strategy is not enough. Just like with any other framework, some boilerplate is needed. The following code makes it possible.

```python
def run(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    # Data feed kwargs
    dkwargs = dict(**eval('dict(' + args.dargs + ')'))

    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    if args.fromdate:
        fmt = dtfmt + tmfmt * ('T' in args.fromdate)
        dkwargs['fromdate'] = datetime.datetime.strptime(args.fromdate, fmt)

    if args.todate:
        fmt = dtfmt + tmfmt * ('T' in args.todate)
        dkwargs['todate'] = datetime.datetime.strptime(args.todate, fmt)

    # add all the data files available in the directory datadir
    for fname in glob.glob(os.path.join(args.datadir, '*')):
        data = NetPayOutData(dataname=fname, **dkwargs)
        cerebro.adddata(data)  # without this the feed never reaches cerebro

    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))

    # set the cash
    cerebro.broker.setcash(args.cash)

    cerebro.run()  # execute it all

    # Basic performance evaluation ... final value ... minus starting cash
    pnl = cerebro.broker.get_value() - args.cash
    print('Profit ... or Loss: {:.2f}'.format(pnl))
```

Where the following is done:

• Parsing the arguments and making them available (this is obviously optional, as everything can be hardcoded, but good practices are good practices)

• Creating a `cerebro` engine instance. Yes, this is Spanish for "brain" and is the part of the framework in charge of coordinating the orchestral maneuvers in the dark. Although it can accept several options, the defaults should suffice for most use cases.

• Loading the data files, which is done with a simple directory scan of `args.datadir`: all files are loaded with `NetPayOutData` and added to the `cerebro` instance

• Adding the strategy

• Setting the cash, which defaults to `1,000,000`. Given that the use case is for `100` stocks in a universe of `1000`, it seems fair to have some cash to spare. It is also an argument which can be changed.

• And calling `cerebro.run()`

• Finally the performance is evaluated
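Two small idioms in the argument handling deserve a closer look: the optional time part of the dates and the conversion of a `k1=v1,k2=v2` string into keyword arguments. The sketch below isolates both; the helper names `parse_dt` and `parse_kwargs` are hypothetical.

```python
import datetime

DTFMT, TMFMT = '%Y-%m-%d', 'T%H:%M:%S'


def parse_dt(text):
    """Parse YYYY-MM-DD with an optional THH:MM:SS suffix."""
    # bool is an int subclass: TMFMT * True is TMFMT, TMFMT * False is ''
    fmt = DTFMT + TMFMT * ('T' in text)
    return datetime.datetime.strptime(text, fmt)


def parse_kwargs(text):
    """Turn 'k1=v1,k2=v2' into a dict by evaluating it as a dict() call."""
    # eval executes arbitrary code: only use it with trusted input
    return eval('dict(' + text + ')')


print(parse_dt('2005-01-31'))           # 2005-01-31 00:00:00
print(parse_dt('2005-01-31T09:30:00'))  # 2005-01-31 09:30:00
print(parse_kwargs('selcperc=0.20,vperiod=24'))  # {'selcperc': 0.2, 'vperiod': 24}
print(parse_kwargs(''))                 # {}
```

An empty string yields an empty `dict`, which is why an empty default works for the kwargs style arguments.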

To make it possible to run things with different parameters straight from the command line, an `argparse` enabled boilerplate is presented below, with the entire code

Performance Evaluation

A naive performance evaluation is added in the form of the final resulting value, i.e.: the final net asset value minus the starting cash.

The backtrader ecosystem offers a set of built-in performance analyzers which could also be used, like: `SharpeRatio`, `Variability-Weighted Return`, `SQN` and others. See Docs - Analyzers Reference

The complete script

And finally the bulk of the work presented as a whole. Enjoy!

```python
import argparse
import datetime
import glob
import os.path

import backtrader as bt

class NetPayOutData(bt.feeds.GenericCSVData):
    lines = ('npy',)  # add a line containing the net payout yield
    params = dict(
        npy=6,  # npy field is at column index 6 (0-based), i.e. the 7th column
        dtformat='%Y-%m-%d',  # fix the date format to yyyy-mm-dd
        timeframe=bt.TimeFrame.Months,  # fix the timeframe to monthly
        openinterest=-1,  # -1 indicates there is no openinterest field
    )

class St(bt.Strategy):
    params = dict(
        selcperc=0.10,  # percentage of stocks to select from the universe
        rperiod=1,  # period for the returns calculation, default 1 period
        vperiod=36,  # lookback period for volatility - default 36 periods
        mperiod=12,  # lookback period for momentum - default 12 periods
        reserve=0.05  # 5% reserve capital
    )

    def log(self, arg):
        print('{} {}'.format(self.datetime.date(), arg))

    def __init__(self):
        # calculate 1st the amount of stocks that will be selected
        self.selnum = int(len(self.datas) * self.p.selcperc)

        # allocation perc per stock
        # reserve kept to make sure orders are not rejected due to
        # margin. Prices are calculated when known (close), but orders can only
        # be executed next day (opening price). Price can gap upwards
        self.perctarget = (1.0 - self.p.reserve) / self.selnum

        # returns, volatilities and momentums
        rs = [bt.ind.PctChange(d, period=self.p.rperiod) for d in self.datas]
        vs = [bt.ind.StdDev(ret, period=self.p.vperiod) for ret in rs]
        ms = [bt.ind.ROC(d, period=self.p.mperiod) for d in self.datas]

        # simple rank formula: (momentum * net payout) / volatility
        # the highest ranked: low vol, large momentum, large payout
        self.ranks = {d: d.npy * m / v for d, v, m in zip(self.datas, vs, ms)}

    def next(self):
        # sort data and current rank
        ranks = sorted(
            self.ranks.items(),  # get the (d, rank), pair
            key=lambda x: x[1][0],  # use rank (elem 1) and current time "0"
            reverse=True,  # highest ranked 1st ... please
        )

        # put top ranked in dict with data as key to test for presence
        rtop = dict(ranks[:self.selnum])

        # For logging purposes of stocks leaving the portfolio
        rbot = dict(ranks[self.selnum:])

        # prepare quick lookup list of stocks currently holding a position
        posdata = [d for d, pos in self.getpositions().items() if pos]

        # remove those no longer top ranked
        # do this first to issue sell orders and free cash
        for d in (d for d in posdata if d not in rtop):
            self.log('Leave {} - Rank {:.2f}'.format(d._name, rbot[d][0]))
            self.order_target_percent(d, target=0.0)

        # rebalance those already top ranked and still there
        for d in (d for d in posdata if d in rtop):
            self.log('Rebal {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
            del rtop[d]  # remove it, to simplify next iteration

        # issue a target order for the newly top ranked stocks
        # do this last, as this will generate buy orders consuming cash
        for d in rtop:
            self.log('Enter {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)

def run(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    # Data feed kwargs
    dkwargs = dict(**eval('dict(' + args.dargs + ')'))

    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    if args.fromdate:
        fmt = dtfmt + tmfmt * ('T' in args.fromdate)
        dkwargs['fromdate'] = datetime.datetime.strptime(args.fromdate, fmt)

    if args.todate:
        fmt = dtfmt + tmfmt * ('T' in args.todate)
        dkwargs['todate'] = datetime.datetime.strptime(args.todate, fmt)

    # add all the data files available in the directory datadir
    for fname in glob.glob(os.path.join(args.datadir, '*')):
        data = NetPayOutData(dataname=fname, **dkwargs)
        cerebro.adddata(data)  # without this the feed never reaches cerebro

    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))

    # set the cash
    cerebro.broker.setcash(args.cash)

    cerebro.run()  # execute it all

    # Basic performance evaluation ... final value ... minus starting cash
    pnl = cerebro.broker.get_value() - args.cash
    print('Profit ... or Loss: {:.2f}'.format(pnl))

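def parse_args(pargs=None):
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description=('Rebalancing with the Conservative Formula'),
    )

    # NOTE: the option names below are inferred from how run() reads `args`;
    # the `required` and `default` values are assumptions
    parser.add_argument('--datadir', required=True,
                        help='Directory with data files')

    parser.add_argument('--dargs', default='',
                        metavar='kwargs', help='kwargs in k1=v1,k2=v2 format')

    # Defaults for dates
    parser.add_argument('--fromdate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--todate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')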

metavar='kwargs', help='kwargs in k1=v1,k2=v2 format')