On Backtesting Performance and Out of Core Memory Execution
There have been two recent threads on https://reddit.com/r/algotrading which are the inspiration for this article.
- A thread with a bogus claim that backtrader cannot cope with 1.6M candles: reddit/r/algotrading - A performant backtesting system?
- And another one asking for something which can backtest a universe of 8000 stocks: reddit/r/algotrading - Backtesting libs that supports 1000+ stocks?
The author of the latter asks for a framework that can backtest "out-of-core/memory", "because obviously it cannot load all this data into memory".
We will, of course, be addressing these concepts with backtrader.
The 2M Candles
In order to do this, the first thing is to generate that amount of candles. Given that the first poster talks about 77 stocks and 1.6M candles, that would amount to 20,779 candles per stock, so we'll do the following to have nice numbers:
- Generate candles for 100 stocks
- Generate 20,000 candles per stock
I.e.: 100 files totaling 2M candles.
The script
```python
import numpy as np
import pandas as pd

COLUMNS = ['open', 'high', 'low', 'close', 'volume', 'openinterest']
CANDLES = 20000
STOCKS = 100

dateindex = pd.date_range(start='2010-01-01', periods=CANDLES, freq='15min')

for i in range(STOCKS):
    data = np.random.randint(10, 20, size=(CANDLES, len(COLUMNS)))
    df = pd.DataFrame(data * 1.01, dateindex, columns=COLUMNS)
    df = df.rename_axis('datetime')
    df.to_csv('candles{:02d}.csv'.format(i))
```
This generates 100 files, starting with candles00.csv and going all the way up to candles99.csv. The actual values are not important. Having the standard datetime, OHLCV components (and OpenInterest) is what matters.
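As a quick sanity check, the on-disk schema of these files can be round-tripped through pandas. This is a small illustrative sketch (20 rows and an in-memory buffer instead of a real file), not part of the original scripts:

```python
import io

import numpy as np
import pandas as pd

COLUMNS = ['open', 'high', 'low', 'close', 'volume', 'openinterest']

# A tiny frame with the same layout as the generated candle files
dateindex = pd.date_range(start='2010-01-01', periods=20, freq='15min')
data = np.random.randint(10, 20, size=(20, len(COLUMNS)))
df = pd.DataFrame(data * 1.01, dateindex, columns=COLUMNS).rename_axis('datetime')

# Round-trip through CSV to confirm what backtrader will read back
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
back = pd.read_csv(buf, index_col='datetime', parse_dates=True)

print(list(back.columns))  # ['open', 'high', 'low', 'close', 'volume', 'openinterest']
```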
The test system
- Hardware/OS: a Windows 10 laptop (15.6") with an Intel i7 and 32 GBytes of memory
- Python: CPython 3.6.1 and pypy3 6.0.0
- Misc: an application running constantly and taking around 20% of the CPU. The usual suspects such as Chrome (102 processes), Edge, Word, PowerPoint, Excel and some minor applications are also running
backtrader default configuration
Let's recall what the default run-time configuration for backtrader is:
- Preload all data feeds if possible
- If all data feeds can be preloaded, run in batch mode (named runonce)
- Precalculate all indicators first
- Go through the strategy logic and broker step-by-step
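Spelled out as explicit arguments to cerebro.run, those defaults correspond to the following. This is only a configuration sketch: preload and runonce are the documented default values, so passing them explicitly is redundant.

```python
import backtrader as bt

cerebro = bt.Cerebro()
# ... data feeds and strategies would be added here ...

cerebro.run(
    preload=True,  # preload all data feeds if possible
    runonce=True,  # batch mode: indicators precalculated in vectorized fashion
)
```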
Executing the challenge in the default batch runonce mode
Our test script (see at the bottom for the full source code) will open those 100 files and process them with the default backtrader configuration.
```
$ ./two-million-candles.py
Cerebro Start Time: 2019-10-26 08:33:15.563088
Strat Init Time: 2019-10-26 08:34:31.845349
Time Loading Data Feeds: 76.28
Number of data feeds: 100
Strat Start Time: 2019-10-26 08:34:31.864349
Pre-Next Start Time: 2019-10-26 08:34:32.670352
Time Calculating Indicators: 0.81
Next Start Time: 2019-10-26 08:34:32.671351
Strat warm-up period Time: 0.00
Time to Strat Next Logic: 77.11
End Time: 2019-10-26 08:35:31.493349
Time in Strategy Next Logic: 58.82
Total Time in Strategy: 58.82
Total Time: 135.93
Length of data feeds: 20000
```
Memory Usage: A peak of 348 Mbytes was observed
Most of the time is actually spent preloading the data (76.28 seconds), with the rest spent in the strategy, which includes going through the broker in each iteration (58.82 seconds). The total time is 135.93 seconds.

Depending on how you want to calculate it, the performance is:

- 14,713 candles/second considering the entire run time
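The arithmetic behind that figure is simply the total number of candles over the total wall-clock time:

```python
candles = 100 * 20_000  # 100 stocks x 20,000 candles each
total_time = 135.93     # seconds, from the run above

print(round(candles / total_time))  # 14713
```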
Bottom line: the claim in the first of the two reddit threads above, that backtrader cannot handle 1.6M candles, is FALSE.
Doing it with pypy
Since the thread claims that using pypy didn't help, let's see what happens when using it.
```
$ ./two-million-candles.py
Cerebro Start Time: 2019-10-26 08:39:42.958689
Strat Init Time: 2019-10-26 08:40:31.260691
Time Loading Data Feeds: 48.30
Number of data feeds: 100
Strat Start Time: 2019-10-26 08:40:31.338692
Pre-Next Start Time: 2019-10-26 08:40:31.612688
Time Calculating Indicators: 0.27
Next Start Time: 2019-10-26 08:40:31.612688
Strat warm-up period Time: 0.00
Time to Strat Next Logic: 48.65
End Time: 2019-10-26 08:40:40.150689
Time in Strategy Next Logic: 8.54
Total Time in Strategy: 8.54
Total Time: 57.19
Length of data feeds: 20000
```
Holy cow! The total time has gone down from 135.93 seconds to 57.19 seconds. The performance has more than doubled.

The performance: 34,971 candles/second
Memory Usage: a peak of 269 Mbytes was seen.
This is also an important improvement over the standard CPython interpreter.
Handling the 2M candles out of core memory
All of this can be improved if one considers that backtrader has several configuration options for the execution of a backtesting session, including optimizing the buffers and working only with the minimum needed set of data (ideally with buffers of size 1, which would only happen in ideal scenarios).
The option to be used is exactbars=True. From the documentation for exactbars (a parameter which can be given to Cerebro either during instantiation or when invoking run):
`True` or `1`: all "lines" objects reduce memory usage to the automatically calculated minimum period. If a Simple Moving Average has a period of 30, the underlying data will always have a running buffer of 30 bars to allow the calculation of the Simple Moving Average

- This setting will deactivate `preload` and `runonce`
- Using this setting also deactivates **plotting**
For the sake of maximum optimization, and because plotting will be disabled anyway, stdstats=False will also be used, which disables the standard Observers for cash, value and trades (useful for plotting, which is no longer in scope).
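In cerebro terms, the combination amounts to the following configuration sketch (as the docs quote above notes, exactbars can be passed either to Cerebro or to run):

```python
import backtrader as bt

cerebro = bt.Cerebro(stdstats=False)  # no cash/value/trade observers
# ... the 100 data feeds and the strategy would be added here ...

cerebro.run(exactbars=True)  # minimal buffers: no preload, no runonce, no plotting
```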
```
$ ./two-million-candles.py --cerebro exactbars=True,stdstats=False
Cerebro Start Time: 2019-10-26 08:37:08.014348
Strat Init Time: 2019-10-26 08:38:21.850392
Time Loading Data Feeds: 73.84
Number of data feeds: 100
Strat Start Time: 2019-10-26 08:38:21.851394
Pre-Next Start Time: 2019-10-26 08:38:21.857393
Time Calculating Indicators: 0.01
Next Start Time: 2019-10-26 08:38:21.857393
Strat warm-up period Time: 0.00
Time to Strat Next Logic: 73.84
End Time: 2019-10-26 08:39:02.334936
Time in Strategy Next Logic: 40.48
Total Time in Strategy: 40.48
Total Time: 114.32
Length of data feeds: 20000
```
The performance: 17,494 candles/second
Memory Usage: 75 Mbytes (stable from the beginning to the end of the backtesting session)
Let's compare it to the previous non-optimized run:

- Instead of spending over 76 seconds preloading data, backtesting starts immediately, because the data is not preloaded
- The total time is 114.32 seconds vs 135.93, an improvement of 15.90%
- An improvement in memory usage of 68.5%
Note
We could actually have thrown 100M candles at the script and the amount of memory consumed would have remained fixed at 75 Mbytes.
Doing it again with pypy
Now that we know how to optimize, let's do it the pypy way.
```
$ ./two-million-candles.py --cerebro exactbars=True,stdstats=False
Cerebro Start Time: 2019-10-26 08:44:32.309689
Strat Init Time: 2019-10-26 08:44:32.406689
Time Loading Data Feeds: 0.10
Number of data feeds: 100
Strat Start Time: 2019-10-26 08:44:32.409689
Pre-Next Start Time: 2019-10-26 08:44:32.451689
Time Calculating Indicators: 0.04
Next Start Time: 2019-10-26 08:44:32.451689
Strat warm-up period Time: 0.00
Time to Strat Next Logic: 0.14
End Time: 2019-10-26 08:45:38.918693
Time in Strategy Next Logic: 66.47
Total Time in Strategy: 66.47
Total Time: 66.61
Length of data feeds: 20000
```
The performance: 30,025 candles/second
Memory Usage: constant at 49 Mbytes
Comparing it to the previous equivalent run:

- 66.61 seconds vs 114.32, a 41.73% improvement in run time
- 49 Mbytes vs 75 Mbytes, a 34.6% improvement in memory usage
Note
In this case pypy has not been able to beat its own time in batch (runonce) mode, which was 57.19 seconds. This is to be expected, because when preloading, the indicator calculations are done in vectorized mode, and that is where the JIT of pypy excels.

It has, in any case, still done a very good job, and there is an important improvement in memory consumption.
A complete run with trading
The script can create indicators (moving averages) and execute a short/long strategy on the 100 data feeds using the crossover of the moving averages. Let's do it with pypy which, as seen above, performs best in batch mode.
```
$ ./two-million-candles.py --strat indicators=True,trade=True
Cerebro Start Time: 2019-10-26 08:57:36.114415
Strat Init Time: 2019-10-26 08:58:25.569448
Time Loading Data Feeds: 49.46
Number of data feeds: 100
Total indicators: 300
Moving Average to be used: SMA
Indicators period 1: 10
Indicators period 2: 50
Strat Start Time: 2019-10-26 08:58:26.230445
Pre-Next Start Time: 2019-10-26 08:58:40.850447
Time Calculating Indicators: 14.62
Next Start Time: 2019-10-26 08:58:41.005446
Strat warm-up period Time: 0.15
Time to Strat Next Logic: 64.89
End Time: 2019-10-26 09:00:13.057955
Time in Strategy Next Logic: 92.05
Total Time in Strategy: 92.21
Total Time: 156.94
Length of data feeds: 20000
```
The performance: 12,743 candles/second

Memory Usage: a peak of 1300 Mbytes was observed.
The execution time has obviously increased (indicators + trading), but why the increase in memory usage? Before reaching any conclusions, let's run it again, creating the indicators but without trading.
```
$ ./two-million-candles.py --strat indicators=True
Cerebro Start Time: 2019-10-26 09:05:55.967969
Strat Init Time: 2019-10-26 09:06:44.072969
Time Loading Data Feeds: 48.10
Number of data feeds: 100
Total indicators: 300
Moving Average to be used: SMA
Indicators period 1: 10
Indicators period 2: 50
Strat Start Time: 2019-10-26 09:06:44.779971
Pre-Next Start Time: 2019-10-26 09:06:59.208969
Time Calculating Indicators: 14.43
Next Start Time: 2019-10-26 09:06:59.360969
Strat warm-up period Time: 0.15
Time to Strat Next Logic: 63.39
End Time: 2019-10-26 09:07:09.151838
Time in Strategy Next Logic: 9.79
Total Time in Strategy: 9.94
Total Time: 73.18
Length of data feeds: 20000
```
The performance: 27,329 candles/second

Memory Usage: 600 Mbytes (doing the same in the optimized exactbars mode consumes only 60 Mbytes, but with an increase in execution time, as pypy itself cannot optimize as much)
With that in hand: memory usage really increases when trading. The reason is that Order and Trade objects are created, passed around and kept by the broker.
Note
Take into account that the data set contains random values, which generates a huge number of crossovers and hence an enormous amount of orders and trades. A similar behavior should not be expected with a regular data set.
Conclusions
The bogus claim
Already proven above to be bogus: backtrader CAN handle 1.6 million candles and more.
General
- backtrader can easily handle 2M candles using the default configuration (with in-memory data preloading)
- backtrader can operate in a non-preloading optimized mode, reducing buffers to the minimum for out-of-core-memory backtesting
- When backtesting in the optimized non-preloading mode, the increase in memory consumption comes from the administrative overhead generated by the broker
- Even when trading, with indicators in use and the broker constantly getting in the way, the performance is 12,743 candles/second
- Use pypy where possible (for example, if you don't need to plot)
Using Python and/or backtrader for these cases
With pypy, trading enabled and the random data set (a higher than usual number of trades), the entire 2M bars were processed in a total of:

- 156.94 seconds, i.e. almost 2 minutes and 37 seconds

Taking into account that this was done on a laptop running multiple other things simultaneously, it can be concluded that 2M bars can be done.
What about the 8000 stocks scenario?

The execution time would have to be scaled by 80, hence: 12,560 seconds (or almost 210 minutes, i.e. 3 hours and 30 minutes) would be needed to run this random-set scenario.
Even assuming a standard data set which would generate far fewer operations, one would still be talking of backtesting in hours (3 or 4).

Memory usage would also increase when trading, due to the broker actions, and would probably require some Gigabytes.

Note

One cannot simply multiply memory usage by 80 here, because the sample script trades with random data and as often as possible. In any case, the amount of RAM needed would be substantial.
As such, a workflow with backtrader alone as the research and backtesting tool would seem far-fetched.
A Discussion about Workflows
There are two standard workflows to consider when using backtrader:

- Do everything with backtrader, i.e. research and backtesting all in one
- Research with pandas, get a notion of whether the ideas are good, and then backtest with backtrader to verify with as much accuracy as possible, having possibly reduced huge data sets to something more palatable for usual RAM scenarios
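For the second workflow, the reduction step can be as simple as resampling the 15-minute candles to a coarser timeframe with pandas before handing the result to backtrader. A sketch with synthetic data (the column names match the generated files; the aggregation rules are the usual OHLCV ones):

```python
import numpy as np
import pandas as pd

# Stand-in for one 15-minute candle file: 4 days x 96 bars/day
idx = pd.date_range(start='2010-01-01', periods=4 * 96, freq='15min')
rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(10, 20, size=len(idx)), index=idx)
df = pd.DataFrame({
    'open': close.shift(1).fillna(close.iloc[0]),
    'high': close + 0.5,
    'low': close - 0.5,
    'close': close,
    'volume': 100,
})

# Reduce 15-minute bars to daily bars before backtesting
daily = df.resample('1D').agg({
    'open': 'first', 'high': 'max', 'low': 'min',
    'close': 'last', 'volume': 'sum',
})

print(len(df), '->', len(daily))  # 384 -> 4
```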
Tip
One can imagine replacing pandas with something like dask for out-of-core-memory execution.
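A flavor of out-of-core processing can be had even with plain pandas via chunked CSV reading (dask wraps the same idea behind a DataFrame API). A small sketch, simulating the large file with an in-memory buffer and computing only a running mean of a close column (both the column name and the aggregation are purely illustrative):

```python
import io

import numpy as np
import pandas as pd

# Simulate a CSV too large to hold in memory at once
buf = io.StringIO()
pd.DataFrame({'close': np.arange(1000, dtype=float)}).to_csv(buf, index=False)
buf.seek(0)

# Stream the file in chunks, keeping only a running aggregate
total, count = 0.0, 0
for chunk in pd.read_csv(buf, chunksize=100):
    total += chunk['close'].sum()
    count += len(chunk)

print(total / count)  # 499.5: the mean of 0..999
```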
The Test Script
Here is the source code:
```python
#!/usr/bin/env python
# -*- coding: utf-8; py-indent-offset:4 -*-
###############################################################################
import argparse
import datetime

import backtrader as bt


class St(bt.Strategy):
    params = dict(
        indicators=False,
        indperiod1=10,
        indperiod2=50,
        indicator=bt.ind.SMA,
        trade=False,
    )

    def __init__(self):
        self.dtinit = datetime.datetime.now()
        print('Strat Init Time: {}'.format(self.dtinit))
        loaddata = (self.dtinit - self.env.dtcerebro).total_seconds()
        print('Time Loading Data Feeds: {:.2f}'.format(loaddata))
        print('Number of data feeds: {}'.format(len(self.datas)))

        if self.p.indicators:
            total_ind = self.p.indicators * 3 * len(self.datas)
            print('Total indicators: {}'.format(total_ind))
            indname = self.p.indicator.__name__
            print('Moving Average to be used: {}'.format(indname))
            print('Indicators period 1: {}'.format(self.p.indperiod1))
            print('Indicators period 2: {}'.format(self.p.indperiod2))

            self.macross = {}
            for d in self.datas:
                ma1 = self.p.indicator(d, period=self.p.indperiod1)
                ma2 = self.p.indicator(d, period=self.p.indperiod2)
                self.macross[d] = bt.ind.CrossOver(ma1, ma2)

    def start(self):
        self.dtstart = datetime.datetime.now()
        print('Strat Start Time: {}'.format(self.dtstart))

    def prenext(self):
        if len(self.data0) == 1:  # only 1st time
            self.dtprenext = datetime.datetime.now()
            print('Pre-Next Start Time: {}'.format(self.dtprenext))
            indcalc = (self.dtprenext - self.dtstart).total_seconds()
            print('Time Calculating Indicators: {:.2f}'.format(indcalc))

    def nextstart(self):
        if len(self.data0) == 1:  # there was no prenext
            self.dtprenext = datetime.datetime.now()
            print('Pre-Next Start Time: {}'.format(self.dtprenext))
            indcalc = (self.dtprenext - self.dtstart).total_seconds()
            print('Time Calculating Indicators: {:.2f}'.format(indcalc))

        self.dtnextstart = datetime.datetime.now()
        print('Next Start Time: {}'.format(self.dtnextstart))
        warmup = (self.dtnextstart - self.dtprenext).total_seconds()
        print('Strat warm-up period Time: {:.2f}'.format(warmup))
        nextstart = (self.dtnextstart - self.env.dtcerebro).total_seconds()
        print('Time to Strat Next Logic: {:.2f}'.format(nextstart))
        self.next()

    def next(self):
        if not self.p.trade:
            return

        for d, macross in self.macross.items():
            if macross > 0:
                self.order_target_size(data=d, target=1)
            elif macross < 0:
                self.order_target_size(data=d, target=-1)

    def stop(self):
        dtstop = datetime.datetime.now()
        print('End Time: {}'.format(dtstop))
        nexttime = (dtstop - self.dtnextstart).total_seconds()
        print('Time in Strategy Next Logic: {:.2f}'.format(nexttime))
        strattime = (dtstop - self.dtprenext).total_seconds()
        print('Total Time in Strategy: {:.2f}'.format(strattime))
        totaltime = (dtstop - self.env.dtcerebro).total_seconds()
        print('Total Time: {:.2f}'.format(totaltime))
        print('Length of data feeds: {}'.format(len(self.data)))


def run(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    datakwargs = dict(timeframe=bt.TimeFrame.Minutes, compression=15)
    for i in range(args.numfiles):
        dataname = 'candles{:02d}.csv'.format(i)
        data = bt.feeds.GenericCSVData(dataname=dataname, **datakwargs)
        cerebro.adddata(data)

    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))
    cerebro.dtcerebro = dt0 = datetime.datetime.now()
    print('Cerebro Start Time: {}'.format(dt0))
    cerebro.run(**eval('dict(' + args.cerebro + ')'))


def parse_args(pargs=None):
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description='Backtrader Basic Script',
    )

    parser.add_argument('--numfiles', required=False, default=100, type=int,
                        help='Number of files to read')

    parser.add_argument('--cerebro', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--strat', '--strategy', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    return parser.parse_args(pargs)


if __name__ == '__main__':
    run()
```