Datafeed Development

Adding a new CSV based data feed is easy. The existing base class CSVDataBase provides the framework taking most of the work off the subclasses which in most cases can simply do:

def _loadline(self, linetokens):

  # parse the linetokens here and put them in self.lines.close,
  # self.lines.high, etc

  return True # if data was parsed, else ... return False

This was shown in CSV Data Feed Development.

The base class takes care of the parameters, initialization, opening of files, reading lines, splitting the lines in tokens and additional things like skipping lines which don’t fit into the date range (fromdate, todate) which the en user may have defined.

Developing a non-CSV datafeed follows the same pattern without going down to the already splitted line tokens.

Things to do:

Derive from backtrader.feed.DataBase
Add any parameters you may need
Should initialization be needed, override __init__(self) and/or start(self)
Should any clean-up code be needed, override stop(self)
The work happens inside the method which MUST always be overriden: _load(self)

Let’s the parameters already provided by backtrader.feed.DataBase:

class DataBase(six.with_metaclass(MetaDataBase, dataseries.OHLCDateTime)):

    params = (('dataname', None),
        ('fromdate', datetime.datetime.min),
        ('todate', datetime.datetime.max),
        ('name', ''),
        ('compression', 1),
        ('timeframe', TimeFrame.Days),
        ('sessionend', None))

Having the following meanings:

dataname is what allows the data feed to identify how to fetch the data. In the case of the CSVDataBase this parameter is meant to be a path to a file or already a file-like object.
fromdate and todate define the date range which will be passed to strategies. Any value provided by the feed outside of this range will be ignored
name is cosmetic for plotting purposes
timeframe and compression are cosmetic and informative. They really play a role in Data Resampling and Data Replaying.
sessionend if passed (a datetime.time object) will be added to the datafeed datetime line which allows identifying the end of the session

Sample binary datafeed

backtrader already defines a CSV datafeed (VChartCSVData) for the exports of VisualChart, but it is also possible to directly read the binary data files.

Let’s do it (full data feed code can be found at the bottom)

Initialization

The binary VisualChart data files can contain either daily (.fd extension) or intraday data (.min extension). Here the informative parameter timeframe will be used to distinguish which type of file is being read.

During __init__ constants which differ for each type are set up.

    def __init__(self):
        super(VChartData, self).__init__()

        # Use the informative "timeframe" parameter to understand if the
        # code passed as "dataname" refers to an intraday or daily feed
        if self.p.timeframe >= TimeFrame.Days:
            self.barsize = 28
            self.dtsize = 1
            self.barfmt = 'IffffII'
        else:
            self.dtsize = 2
            self.barsize = 32
            self.barfmt = 'IIffffII'

Start

The Datafeed will be started when backtesting commences (it can actually be started several times during optimizations)

In the start method the binary file is open unless a file-like object has been passed.

    def start(self):
        # the feed must start ... get the file open (or see if it was open)
        self.f = None
        if hasattr(self.p.dataname, 'read'):
            # A file has been passed in (ex: from a GUI)
            self.f = self.p.dataname
        else:
            # Let an exception propagate
            self.f = open(self.p.dataname, 'rb')

Stop

Called when backtesting is finished.

If a file was open, it will be closed

    def stop(self):
        # Close the file if any
        if self.f is not None:
            self.f.close()
            self.f = None

Actual Loading

The actual work is done in _load. Called to load the next set of data, in this case the next : datetime, open, high, low, close, volume, openinterest. In backtrader the “actual” moment corresponds to index 0.

A number of bytes will be read from the open file (determined by the constants set up during __init__), parsed with the struct module, further processed if needed (like with divmod operations for date and time) and stored in the lines of the data feed: datetime, open, high, low, close, volume, openinterest.

If no data can be read from the file it is assumed that the End Of File (EOF) has been reached

False is returned to indicate the fact no more data is available

Else if data has been loaded and parsed:

True is returned to indicate the loading of the data set was a success

    def _load(self):
        if self.f is None:
            # if no file ... no parsing
            return False

        # Read the needed amount of binary data
        bardata = self.f.read(self.barsize)
        if not bardata:
            # if no data was read ... game over say "False"
            return False

        # use struct to unpack the data
        bdata = struct.unpack(self.barfmt, bardata)

        # Years are stored as if they had 500 days
        y, md = divmod(bdata[0], 500)
        # Months are stored as if they had 32 days
        m, d = divmod(md, 32)
        # put y, m, d in a datetime
        dt = datetime.datetime(y, m, d)

        if self.dtsize > 1:  # Minute Bars
            # Daily Time is stored in seconds
            hhmm, ss = divmod(bdata[1], 60)
            hh, mm = divmod(hhmm, 60)
            # add the time to the existing atetime
            dt = dt.replace(hour=hh, minute=mm, second=ss)

        self.lines.datetime[0] = date2num(dt)

        # Get the rest of the unpacked data
        o, h, l, c, v, oi = bdata[self.dtsize:]
        self.lines.open[0] = o
        self.lines.high[0] = h
        self.lines.low[0] = l
        self.lines.close[0] = c
        self.lines.volume[0] = v
        self.lines.openinterest[0] = oi

        # Say success
        return True

Other Binary Formats

The same model can be applied to any other binary source:

Database
Hierarchical data storage
Online source

The steps again:

__init__ -> Any init code for the instance, only once
start -> start of backtesting (one or more times if optimization will be run)

This would for example open the connection to the database or a socket to an online service
stop -> clean-up like closing the database connection or open sockets
_load -> query the database or online source for the next set of data and load it into the lines of the object. The standard fields being: datetime, open, high, low, close, volume, openinterest

VChartData Test

The VCharData loading data from a local “.fd” file for Google for the year 2006.

It’s only about loading the data, so not even a subclass of Strategy is needed.

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime

import backtrader as bt
from vchart import VChartData


if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro(stdstats=False)

    # Add a strategy
    cerebro.addstrategy(bt.Strategy)

    # Create a Data Feed
    datapath = '../datas/goog.fd'
    data = VChartData(
        dataname=datapath,
        fromdate=datetime.datetime(2006, 1, 1),
        todate=datetime.datetime(2006, 12, 31),
        timeframe=bt.TimeFrame.Days
    )

    # Add the Data Feed to Cerebro
    cerebro.adddata(data)

    # Run over everything
    cerebro.run()

    # Plot the result
    cerebro.plot(style='bar')

VChartData Full Code

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime
import struct

from backtrader.feed import DataBase
from backtrader import date2num
from backtrader import TimeFrame


class VChartData(DataBase):
    def __init__(self):
        super(VChartData, self).__init__()

        # Use the informative "timeframe" parameter to understand if the
        # code passed as "dataname" refers to an intraday or daily feed
        if self.p.timeframe >= TimeFrame.Days:
            self.barsize = 28
            self.dtsize = 1
            self.barfmt = 'IffffII'
        else:
            self.dtsize = 2
            self.barsize = 32
            self.barfmt = 'IIffffII'

    def start(self):
        # the feed must start ... get the file open (or see if it was open)
        self.f = None
        if hasattr(self.p.dataname, 'read'):
            # A file has been passed in (ex: from a GUI)
            self.f = self.p.dataname
        else:
            # Let an exception propagate
            self.f = open(self.p.dataname, 'rb')

    def stop(self):
        # Close the file if any
        if self.f is not None:
            self.f.close()
            self.f = None

    def _load(self):
        if self.f is None:
            # if no file ... no parsing
            return False

        # Read the needed amount of binary data
        bardata = self.f.read(self.barsize)
        if not bardata:
            # if no data was read ... game over say "False"
            return False

        # use struct to unpack the data
        bdata = struct.unpack(self.barfmt, bardata)

        # Years are stored as if they had 500 days
        y, md = divmod(bdata[0], 500)
        # Months are stored as if they had 32 days
        m, d = divmod(md, 32)
        # put y, m, d in a datetime
        dt = datetime.datetime(y, m, d)

        if self.dtsize > 1:  # Minute Bars
            # Daily Time is stored in seconds
            hhmm, ss = divmod(bdata[1], 60)
            hh, mm = divmod(hhmm, 60)
            # add the time to the existing atetime
            dt = dt.replace(hour=hh, minute=mm, second=ss)

        self.lines.datetime[0] = date2num(dt)

        # Get the rest of the unpacked data
        o, h, l, c, v, oi = bdata[self.dtsize:]
        self.lines.open[0] = o
        self.lines.high[0] = h
        self.lines.low[0] = l
        self.lines.close[0] = c
        self.lines.volume[0] = v
        self.lines.openinterest[0] = oi

        # Say success
        return True