Escape from OHLC Land
One of the key concepts applied during the conception and development of backtrader was flexibility. The metaprogramming and introspection capabilities of Python were (and still are) the basis to keep many things flexible whilst still being able to deliver.
An old post shows the extension concept.
The basics:
from backtrader.feeds import GenericCSVData

class GenericCSV_PE(GenericCSVData):
    lines = ('pe',)  # Add 'pe' to already defined lines
Done. backtrader defines in the background the most usual lines: OHLC.
If we dug into the final aspect of GenericCSV_PE, the sum of inherited plus newly defined lines would yield the following lines:
('close', 'open', 'high', 'low', 'volume', 'openinterest', 'datetime', 'pe',)
This can be checked at any time with the method getlinealiases (applicable to DataFeeds, Indicators, Strategies and Observers).
The mechanism is flexible and by poking a bit into the internals you could actually get anything, but it has been proven not to be enough.
Ticket #60 asks about supporting High Frequency Data, i.e. Bid/Ask data, which implies that the predefined lines hierarchy in the form of OHLC is not enough. The Bid and Ask prices, volumes and number of trades can be made to fit into the existing OHLC fields, but it wouldn’t feel natural. And if one is only concerned with the Bid and Ask prices, there would be too many fields left untouched.
This called for a solution which has been implemented with Release 1.2.1.88. The idea can be summarized as:
- Now it’s not only possible to extend the existing hierarchy, but also to replace the hierarchy with a new one
Only one constraint in place:
- There must be a datetime field present (which will hopefully contain meaningful datetime information)

  This is so because backtrader needs something for synchronization (multiple datas, multiple timeframes, resampling, replaying) just like Archimedes needed a lever.
Here is how it works:
from backtrader.feeds import GenericCSVData

class GenericCSV_BidAsk(GenericCSVData):
    linesoverride = True
    lines = ('bid', 'ask', 'datetime')  # Replace hierarchy with this one
Done.
OK, not fully, but only because we are looking at loading the lines from a csv source. The hierarchy has actually already been replaced with the bid, ask, datetime definition thanks to the linesoverride=True setting.
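The difference between extending and replacing the hierarchy can be pictured with a toy model. The sketch below is not backtrader's implementation: LinesMeta, ToyFeed and its subclasses are hypothetical names, and the merging rules are an assumption based only on the behaviour described in this post (inherited lines plus new ones by default, a clean replacement with linesoverride, and a mandatory datetime line).

```python
# Toy model only: NOT backtrader's actual metaclass machinery.
class LinesMeta(type):
    """Resolve `lines` across a class hierarchy, honoring `linesoverride`."""

    def __new__(mcs, name, bases, namespace):
        new_lines = tuple(namespace.get('lines', ()))
        override = namespace.get('linesoverride', False)

        # Lines already resolved by the first base class that defines any
        inherited = ()
        for base in bases:
            inherited = getattr(base, 'lines', ())
            if inherited:
                break

        if override:
            resolved = new_lines  # replace the hierarchy
        else:
            # extend the hierarchy, skipping aliases already present
            resolved = inherited + tuple(
                alias for alias in new_lines if alias not in inherited)

        # The single constraint: a datetime line must survive
        if resolved and 'datetime' not in resolved:
            raise TypeError('%s: a datetime line must be present' % name)

        namespace['lines'] = resolved
        return super().__new__(mcs, name, bases, namespace)


class ToyFeed(metaclass=LinesMeta):
    lines = ('close', 'open', 'high', 'low',
             'volume', 'openinterest', 'datetime')

    @classmethod
    def getlinealiases(cls):
        return cls.lines


class ToyFeedPE(ToyFeed):
    lines = ('pe',)  # extend: inherited lines + 'pe'


class ToyFeedBidAsk(ToyFeed):
    linesoverride = True
    lines = ('bid', 'ask', 'datetime')  # replace: only these lines remain


print(ToyFeedPE.getlinealiases())
print(ToyFeedBidAsk.getlinealiases())
```

In this toy, declaring a subclass with linesoverride = True and no datetime alias fails at class creation time, which is one possible way to enforce the constraint early.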
The original GenericCSVData class parses a csv file and needs a hint as to where the fields corresponding to the lines are located. The original definition is:
class GenericCSVData(feed.CSVDataBase):
    params = (
        ('nullvalue', float('NaN')),
        ('dtformat', '%Y-%m-%d %H:%M:%S'),
        ('tmformat', '%H:%M:%S'),

        ('datetime', 0),
        ('time', -1),  # -1 means not present
        ('open', 1),
        ('high', 2),
        ('low', 3),
        ('close', 4),
        ('volume', 5),
        ('openinterest', 6),
    )
The new hierarchy-redefining-class can be completed with a light touch:
from backtrader.feeds import GenericCSVData

class GenericCSV_BidAsk(GenericCSVData):
    linesoverride = True
    lines = ('bid', 'ask', 'datetime')  # Replace hierarchy with this one

    params = (('bid', 1), ('ask', 2))
Indicating that Bid prices are field #1 in the csv stream and Ask prices are field #2. We have left the datetime #0 definition untouched from the base class.
Crafting a small data file for the occasion helps:
TIMESTAMP,BID,ASK
02/03/2010 16:53:50,0.5346,0.5347
02/03/2010 16:53:51,0.5343,0.5347
02/03/2010 16:53:52,0.5543,0.5545
02/03/2010 16:53:53,0.5342,0.5344
02/03/2010 16:53:54,0.5245,0.5464
02/03/2010 16:53:54,0.5460,0.5470
02/03/2010 16:53:56,0.5824,0.5826
02/03/2010 16:53:57,0.5371,0.5374
02/03/2010 16:53:58,0.5793,0.5794
02/03/2010 16:53:59,0.5684,0.5688
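To make the role of the field positions concrete, here is a minimal stand-alone sketch that uses only the standard library rather than backtrader. FIELD_POS and parse_rows are hypothetical names; the mapping mirrors the datetime/bid/ask positions of the feed definition, and the datetime format is the default used by the test script (%m/%d/%Y %H:%M:%S). Only the first three rows of the file are embedded.

```python
import csv
import io
from datetime import datetime

# First rows of the sample file, embedded to keep the sketch self-contained
SAMPLE = """TIMESTAMP,BID,ASK
02/03/2010 16:53:50,0.5346,0.5347
02/03/2010 16:53:51,0.5343,0.5347
02/03/2010 16:53:52,0.5543,0.5545
"""

# Line name -> column index, mirroring the params of the feed class
FIELD_POS = {'datetime': 0, 'bid': 1, 'ask': 2}
DTFORMAT = '%m/%d/%Y %H:%M:%S'


def parse_rows(text, field_pos=FIELD_POS, dtformat=DTFORMAT):
    """Return one dict per csv row, keyed by line name."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    rows = []
    for tokens in reader:
        row = {}
        for name, pos in field_pos.items():
            if name == 'datetime':
                row[name] = datetime.strptime(tokens[pos], dtformat)
            else:
                row[name] = float(tokens[pos])
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE)
for i, row in enumerate(rows, start=1):
    print('%4d: %s - Bid %.4f - %.4f Ask' % (
        i, row['datetime'].isoformat(), row['bid'], row['ask']))
```

Swapping the indices in FIELD_POS is all it takes to read a file with a different column order, which is the same idea the params of the feed class express.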
Add a small test script to the equation (with some more content for those who just go directly to the samples in the sources) (see full code at the end):
$ ./bidask.py
And the output speaks for itself:
   1: 2010-02-03T16:53:50 - Bid 0.5346 - 0.5347 Ask
   2: 2010-02-03T16:53:51 - Bid 0.5343 - 0.5347 Ask
   3: 2010-02-03T16:53:52 - Bid 0.5543 - 0.5545 Ask
   4: 2010-02-03T16:53:53 - Bid 0.5342 - 0.5344 Ask
   5: 2010-02-03T16:53:54 - Bid 0.5245 - 0.5464 Ask
   6: 2010-02-03T16:53:54 - Bid 0.5460 - 0.5470 Ask
   7: 2010-02-03T16:53:56 - Bid 0.5824 - 0.5826 Ask
   8: 2010-02-03T16:53:57 - Bid 0.5371 - 0.5374 Ask
   9: 2010-02-03T16:53:58 - Bid 0.5793 - 0.5794 Ask
  10: 2010-02-03T16:53:59 - Bid 0.5684 - 0.5688 Ask
Et voilà! The Bid/Ask prices have been properly read, parsed and interpreted, and the strategy has been able to access the .bid and .ask lines in the data feed through self.data.
Redefining the lines hierarchy opens a broad question though and that is the usage of the already predefined Indicators.
- Example: the Stochastic is an indicator which relies on close, high and low prices to calculate its output

  Even if we thought about Bid as the close (because it is the first) there is only one other price element (Ask) and not two more. And conceptually Ask has nothing to do with high and low

  It is probable that someone working with these fields and operating (or researching) in the High Frequency Trading domain is not concerned with Stochastic as an indicator of choice

- Other indicators like moving average are perfectly fine. They assume nothing about what the fields mean or imply and will happily take anything. As such one can do:

  mysma = backtrader.indicators.SMA(self.data.bid, period=5)

  And a moving average of the last 5 bid prices will be delivered
The test script already supports adding an SMA. Let’s execute:
$ ./bidask.py --sma --period=3
The output:
   3: 2010-02-03T16:53:52 - Bid 0.5543 - 0.5545 Ask - SMA: 0.5411
   4: 2010-02-03T16:53:53 - Bid 0.5342 - 0.5344 Ask - SMA: 0.5409
   5: 2010-02-03T16:53:54 - Bid 0.5245 - 0.5464 Ask - SMA: 0.5377
   6: 2010-02-03T16:53:54 - Bid 0.5460 - 0.5470 Ask - SMA: 0.5349
   7: 2010-02-03T16:53:56 - Bid 0.5824 - 0.5826 Ask - SMA: 0.5510
   8: 2010-02-03T16:53:57 - Bid 0.5371 - 0.5374 Ask - SMA: 0.5552
   9: 2010-02-03T16:53:58 - Bid 0.5793 - 0.5794 Ask - SMA: 0.5663
  10: 2010-02-03T16:53:59 - Bid 0.5684 - 0.5688 Ask - SMA: 0.5616
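The SMA column can be cross-checked by hand. The following is a plain-Python sketch, independent of backtrader; simple_moving_average is a hypothetical helper, and the bid prices are copied from the sample data file. With a period of 3 the first value appears at row 3, which is why the output above starts there.

```python
from collections import deque

# Bid prices copied from the sample data file
BIDS = [0.5346, 0.5343, 0.5543, 0.5342, 0.5245,
        0.5460, 0.5824, 0.5371, 0.5793, 0.5684]


def simple_moving_average(values, period):
    """Arithmetic mean of the last `period` values, once enough are seen."""
    window = deque(maxlen=period)  # old values fall off automatically
    averages = []
    for value in values:
        window.append(value)
        if len(window) == period:
            averages.append(sum(window) / period)
    return averages


smas = simple_moving_average(BIDS, period=3)
for i, sma in enumerate(smas, start=3):
    print('%4d: SMA: %.4f' % (i, sma))
```

The printed values should line up with the SMA column of the run above, confirming that the average is taken over the bid line.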
Note
Plotting still relies on open, high, low, close and volume being present in the data feed.
Some cases can be directly covered by simply plotting with a Line on Close and taking just the 1st defined line in the object. But a sound model has to be developed for an upcoming version of backtrader.
The test script usage:
$ ./bidask.py --help
usage: bidask.py [-h] [--data DATA] [--dtformat DTFORMAT] [--sma]
                 [--period PERIOD]

Bid/Ask Line Hierarchy

optional arguments:
  -h, --help            show this help message and exit
  --data DATA, -d DATA  data to add to the system (default:
                        ../../datas/bidask.csv)
  --dtformat DTFORMAT, -dt DTFORMAT
                        Format of datetime in input (default: %m/%d/%Y
                        %H:%M:%S)
  --sma, -s             Add an SMA to the mix (default: False)
  --period PERIOD, -p PERIOD
                        Period for the sma (default: 5)
And the test script itself (included in the backtrader sources):
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import argparse

import backtrader as bt
import backtrader.feeds as btfeeds
import backtrader.indicators as btind


class BidAskCSV(btfeeds.GenericCSVData):
    linesoverride = True  # discard usual OHLC structure
    # datetime must be present and last
    lines = ('bid', 'ask', 'datetime')
    # datetime (always 1st) and then the desired order for
    params = (
        # (datetime, 0), # inherited from parent class
        ('bid', 1),  # default field pos 1
        ('ask', 2),  # default field pos 2
    )


class St(bt.Strategy):
    params = (('sma', False), ('period', 3))

    def __init__(self):
        if self.p.sma:
            self.sma = btind.SMA(self.data, period=self.p.period)

    def next(self):
        dtstr = self.data.datetime.datetime().isoformat()
        txt = '%4d: %s - Bid %.4f - %.4f Ask' % (
            (len(self), dtstr, self.data.bid[0], self.data.ask[0]))

        if self.p.sma:
            txt += ' - SMA: %.4f' % self.sma[0]
        print(txt)


def parse_args():
    parser = argparse.ArgumentParser(
        description='Bid/Ask Line Hierarchy',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )

    parser.add_argument('--data', '-d', action='store',
                        required=False, default='../../datas/bidask.csv',
                        help='data to add to the system')

    parser.add_argument('--dtformat', '-dt',
                        required=False, default='%m/%d/%Y %H:%M:%S',
                        help='Format of datetime in input')

    parser.add_argument('--sma', '-s', action='store_true',
                        required=False,
                        help='Add an SMA to the mix')

    parser.add_argument('--period', '-p', action='store',
                        required=False, default=5, type=int,
                        help='Period for the sma')

    return parser.parse_args()


def runstrategy():
    args = parse_args()

    cerebro = bt.Cerebro()  # Create a cerebro

    data = BidAskCSV(dataname=args.data, dtformat=args.dtformat)
    cerebro.adddata(data)  # Add the 1st data to cerebro
    # Add the strategy to cerebro
    cerebro.addstrategy(St, sma=args.sma, period=args.period)
    cerebro.run()


if __name__ == '__main__':
    runstrategy()