Cryptocurrencies algorithmic trading with Python (2/4)

Romain Barrot
May 8, 2021

Part II: Creating a backtesting environment

In our previous post, we reviewed several options for obtaining cryptocurrency market data and chose the Binance exchange as the main source for our algorithm, for the reasons detailed there. We wrote a function to quickly retrieve a large amount of data from the Binance exchange, clean it, and store it in a friendly format.

Now that we have the data, and before diving into the design of our investment strategies, we need to (i) design a backtest environment, (ii) determine a fair model to forecast trading costs, and (iii) select and code a set of objective performance indicators to properly assess and compare the performance of our strategies.

The Backtest environment

Backtesting is the simulation of an investment strategy's execution over a given past period of time. Before putting our investment strategies into production, backtesting is essential to assess their chances of performing well once live. Of course, past performance is not an indicator of future performance; however, it is generally a good indicator of the kind of results we can expect.

The backtest function

We create a BACKTEST() function which takes as input the start and end dates of the period we want to backtest our investment strategy on, the investment universe of the strategy, the type of investment strategy we want to test, and the initial balance of our portfolio (arbitrarily fixed at 10,000 USDT for all our backtests).

The BACKTEST() function will:

  1. fetch the data for the complete backtest period using the GET_DATA() function we discussed in our previous post ✅
  2. compute some technical indicators (moving averages, MACD, RSI…) for the full backtest period when necessary (depending on whether those indicators are needed by the strategy being backtested). Note that we could have computed those indicators at every date of the backtest period; however, it is more efficient, from an algorithmic complexity perspective, to compute them only once for the full period (while ensuring that on every iteration we don’t feed the strategy function with future data, which would obviously bias our results) ✅
  3. iterate through each date of the backtest period and do the following actions:
  • compute today’s NAV (Net Asset Value) by summing today’s portfolio positions (in USDT), minus any costs such as transaction costs or management fees, the latter being zero in our case; ✅
  • execute the RUN_STRATEGY() function: this is the core of the investment strategy we want to backtest. It contains the rules deciding which cryptocurrencies to go LONG, NEUTRAL or SHORT on, and returns a Series of views for each asset (‘0’ for neutral, ‘1’ for a long position and ‘-1’ for a short position, if shorting is allowed); ✅
  • run the PORTFOLIO_CONSTRUCTION() function which, based on the VIEWS returned in the previous step and a given portfolio construction model, computes and returns TARGET_WEIGHTS as output. TARGET_WEIGHTS is a Series of weights for each asset of the investment universe (in %, the sum of all weights being equal to 100%). There are various portfolio models we can use, such as the equal-weight, minimum-variance (also called minimum-volatility) or maximum-Sharpe models. We will try several of them across the backtests, because the way we build our portfolio is almost as important as the pure investment decisions and should therefore be treated and backtested together with the investment strategy itself ✅
  • compare the TARGET_WEIGHTS generated in the previous step with the previous day’s portfolio weights to determine the trades to place (although the algorithm doesn’t trade by itself at this stage, since we are still in the backtest environment, this step lets us integrate into our model the trading costs we would face in real trading, such as transaction costs and slippage effects, see below). ✅

The BACKTEST() function then computes and returns a set of performance indicators that we will discuss below, together with the total estimated transaction costs, the history of the weights and, of course, the NAV and GAV histories (GAV meaning Gross Asset Value: the gross value of the portfolio’s assets, without taking trading costs into account).
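To make the loop concrete, here is a minimal sketch of what BACKTEST() could look like. GET_DATA() is the function from our previous post; COMPUTE_INDICATORS() and the flat COST_RATE placeholder are hypothetical helpers standing in for steps 2 and 4 (the full cost model is described below), and the strategy argument plays the role of RUN_STRATEGY():

```python
import pandas as pd

COST_RATE = 0.003  # flat placeholder, replaced by the trading cost model below

def BACKTEST(start, end, universe, strategy, initial_balance=10_000):
    """Minimal sketch of the backtest loop described above."""
    prices = GET_DATA(universe, start, end)        # step 1: close prices (dates x assets)
    indicators = COMPUTE_INDICATORS(prices)        # step 2: computed once for the full period
    nav = pd.Series(index=prices.index, dtype=float)
    gav = pd.Series(index=prices.index, dtype=float)
    nav.iloc[0] = gav.iloc[0] = initial_balance
    weights = pd.Series(0.0, index=universe)       # start fully in USDT
    total_costs = 0.0
    for i in range(1, len(prices)):                # step 3: iterate through each date
        t = prices.index[i]
        day_returns = prices.iloc[i] / prices.iloc[i - 1] - 1
        gross = (weights * day_returns).sum()
        gav.iloc[i] = gav.iloc[i - 1] * (1 + gross)
        # only data up to t is passed to the strategy: no look-ahead bias
        views = strategy(indicators.loc[:t])                    # -1 / 0 / +1 per asset
        target = PORTFOLIO_CONSTRUCTION(views, prices.loc[:t])  # weights summing to 100%
        turnover = (target - weights).abs().sum()               # fraction of NAV traded
        costs = nav.iloc[i - 1] * turnover * COST_RATE
        total_costs += costs
        nav.iloc[i] = nav.iloc[i - 1] * (1 + gross) - costs
        weights = target
    return nav, gav, total_costs
```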

The trading costs

Trading costs are the costs associated with implementing the target portfolio determined by our investment strategy. The total trading cost depends on how frequently our algorithm trades: naturally, the more we rebalance our portfolio, the higher the transaction costs will be. The trading costs associated with a given trade have, broadly, two components:

  1. The explicit costs, which are the easily identified costs, mainly the transaction fees charged by the exchange (we won’t cover capital gains taxes in this article, but they could also be considered part of the explicit costs). Crypto exchanges usually apply a percentage-based transaction fee to any executed trade. Generally, exchanges offer degressive fee grids based on the volume traded over a certain period. On Binance, transaction fees start at 10bps (0.1%) and can decrease significantly once trading volume increases. In our backtest environment, we will stay conservative and consider a fixed 10bps transaction cost.
  2. The implicit costs, also called slippage costs, which are associated with trading but may be harder to identify because they depend on the execution conditions of each trade and on the type of orders used. Slippage usually refers to the difference between the price at which our algorithm tells us to trade and the actual execution price. In practice, those two prices can differ significantly. The effects of slippage are particularly important when trading a high volume relative to the market cap of the asset, when trading particularly volatile assets, when the latency between our computer and the exchange server is high, and when deploying trend-following strategies (as we trade in the same direction as the market). Slippage costs can be divided into two main components: (i) the spread cost, and (ii) the market impact cost.

Although the explicit costs are known ahead of the trade (as we said, we will consider a constant 10bps transaction fee), the implicit costs are not known until the trade is executed. They ultimately depend on the bid-ask spread and the market depth at the time of trading, which is not something for which we can easily retrieve historical data.

So, how can we factor in the implicit costs in our backtest?

The flat model: the flat model is the simplest of all and consists of using a fixed proxy for the implicit costs, working against us on every transaction. For example, if our algorithm tells us to go long 1 BTC/USDT at 66,000 USDT, we will consider that it actually bought at 66,000 × (1 + IC), IC being a fixed percentage (the sign flips for a sell, where we receive less than quoted). A conservative approach would be to choose IC = 2%. The limit of this model is that it accounts neither for the size of our trade relative to the total volume traded on the asset, nor for the asset’s volatility.
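As a sketch, the flat model reduces to a one-line adjustment of the quoted price (the function name and sign convention are ours):

```python
def flat_execution_price(quoted_price: float, side: int, ic: float = 0.02) -> float:
    """Flat implicit-cost model: execution is always IC percent against us.
    side = +1 for a buy (we pay more), -1 for a sell (we receive less)."""
    return quoted_price * (1 + side * ic)
```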

Another approach is to estimate the two components of the slippage cost, namely the bid-ask spread and the market impact, using mathematical models.

Estimating the bid-ask spread

The Roll model (“A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market”, Richard Roll, 1984): the Roll model is a simple model that seeks to estimate the bid-ask spread when quote data is not available, using just the historical close prices of the asset. The spread estimator, expressed as a percentage, is defined as a function of the serial covariance of successive price changes through the following equation:
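The estimator (our reconstruction of the referenced equation, following Roll, 1984) is:

$$S = 2\sqrt{-\operatorname{Cov}(\Delta p_t,\ \Delta p_{t-1})}$$

where Δp_t is the change in (log) close price from t−1 to t. A possible pandas implementation on a rolling window (the 30-day window is our choice, and the forward-fill handles the positive-covariance case discussed just below):

```python
import numpy as np
import pandas as pd

def roll_spread(close: pd.Series, window: int = 30) -> pd.Series:
    """Rolling Roll (1984) spread estimate computed from close prices only."""
    dp = np.log(close).diff()                   # (log) close-to-close price changes
    cov = dp.rolling(window).cov(dp.shift(1))   # serial covariance of the changes
    spread = 2 * np.sqrt(-cov.where(cov < 0))   # undefined (NaN) where cov >= 0
    return spread.ffill()                       # carry the last valid spread forward
```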

The Roll model has some flaws, especially when the serial covariance is positive (in which case the estimator is undefined, so we ignore the computed value and keep the last available spread instead). More developed models use the full OHLC data instead of just the close price. Let’s test one of them: the Corwin and Schultz model, which uses the highs and lows instead of the close price.

The Corwin and Schultz model (hereafter the “C&S model”), as published in “A Simple Way to Estimate Bid-Ask Spreads from Daily High and Low Prices” by Corwin and Schultz in 2012. The C&S estimator is based on two assumptions. First, high prices are typically buyer-initiated and low prices seller-initiated, so the ratio of high to low prices for a day reflects both the fundamental volatility of the asset and its bid-ask spread. Second, the volatility component of the high-to-low price ratio increases proportionately with the length of the trading interval, whereas the component due to the bid-ask spread does not. The Corwin-Schultz bid-ask spread estimator is given by the equations below, where S is the spread and H and L denote the observed high and low prices, respectively.
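Reconstructed from the 2012 paper, the equations are:

$$\beta = \left[\ln\frac{H_t}{L_t}\right]^2 + \left[\ln\frac{H_{t+1}}{L_{t+1}}\right]^2 \qquad \gamma = \left[\ln\frac{H_{t,t+1}}{L_{t,t+1}}\right]^2$$

$$\alpha = \frac{\sqrt{2\beta} - \sqrt{\beta}}{3 - 2\sqrt{2}} - \sqrt{\frac{\gamma}{3 - 2\sqrt{2}}} \qquad S = \frac{2(e^{\alpha} - 1)}{1 + e^{\alpha}}$$

where H_{t,t+1} and L_{t,t+1} denote the high and low observed over the 2-day window [t, t+1].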

α represents the difference between the price-range adjustments for a single day and for a 2-day period, β is the sum of the squared daily high-to-low (log) price ratios over two consecutive days, and γ is the squared high-to-low (log) price ratio over the 2-day period.
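A compact pandas implementation of these equations (a sketch; following the paper, negative α values are floored at zero):

```python
import numpy as np
import pandas as pd

def corwin_schultz_spread(high: pd.Series, low: pd.Series) -> pd.Series:
    """Corwin-Schultz (2012) bid-ask spread estimate from daily highs and lows."""
    hl = np.log(high / low) ** 2                                   # squared daily log high-low ratio
    beta = hl + hl.shift(-1)                                       # summed over days t and t+1
    high2 = pd.concat([high, high.shift(-1)], axis=1).max(axis=1)  # 2-day high
    low2 = pd.concat([low, low.shift(-1)], axis=1).min(axis=1)     # 2-day low
    gamma = np.log(high2 / low2) ** 2
    denom = 3 - 2 * np.sqrt(2)
    alpha = ((np.sqrt(2 * beta) - np.sqrt(beta)) / denom
             - np.sqrt(gamma / denom)).clip(lower=0)               # floor negative alphas at zero
    return 2 * (np.exp(alpha) - 1) / (1 + np.exp(alpha))
```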

Comparison of both models’ results: the BTC/USD historical spread can be retrieved from open-source data providers (such as https://data.bitcoinity.org/) and will serve as a benchmark to assess the accuracy of our models. The graph below shows both estimated BTC/USD spread time series together with the observed historical spread of BTC/USD, averaged across multiple exchange platforms, as obtained from https://data.bitcoinity.org/ (both the C&S estimate and the observed real spread have been smoothed with an EMA7 filter for better readability).

Simulated spread using the Roll and C&S models vs. historical observations from https://data.bitcoinity.org/ (average spread across several exchanges)

On average, both models seem to estimate the real BTC/USD spread observed since January 2020 fairly well; however, the C&S model appears more reactive and reflects jumps in the spread more accurately, such as the one that occurred in early March 2020.

Estimating the market impact

The sigma-root-liquidity model: described for example by Grinold and Kahn in 1994, the sigma-root-liquidity model is a widely accepted and empirically verified model that estimates the market impact of a trade from volume and volatility measurements. Its advantage is that it is simple to use and applies to a wide range of asset classes. It defines the market impact as:
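The equation, reconstructed from the symbol definitions below, is the familiar square-root impact law:

$$MI = \alpha\,\sigma\,\sqrt{\frac{Q}{V}}$$

where: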

  • α is a constant of order 1. To estimate this constant we could run regressions over historical trades, but as we don’t have any available, we will simply use widely accepted constants empirically validated by hedge funds and trading desks, such as CFM (a Parisian hedge fund), as described in “Anomalous price impact and the critical nature of liquidity in financial markets”, published by CFM’s teams in 2018. Based on those analyses, we will take α = 0.75.
  • σ is the annualized daily volatility of the asset to trade
  • Q is the volume we trade over a given period (say, a day)
  • V is the total volume traded by the market on the platform, over the same period as Q

Note: the sigma-root-liquidity model has some limitations; in particular, it looks at neither the capitalisation of the asset nor the type of orders. Other models exist that try to account for such parameters (such as the continuous-time propagator model, the Alfonsi and Schied order book model, or the locally linear order book model). However, some of those models are very intensive to implement in terms of algorithmic complexity, so we will stick with the sigma-root-liquidity model for now.

Model for the total trading costs:

By combining the estimators of the various trading cost components above, the total estimated trading cost at a given time t can be expressed as:
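A natural way to combine them (our reconstruction, consistent with the components above), per unit of notional traded, is:

$$c_t = f + \frac{S_t}{2} + \alpha\,\sigma_t\,\sqrt{\frac{Q_t}{V_t}}$$

with f the fixed 10bps exchange fee, S_t the estimated bid-ask spread (of which we cross half on each trade) and the last term the estimated market impact.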

Note: one solution to reduce negative slippage is to trade only through limit orders, but this carries the inherent risk of the trade not being fully executed, so we won’t retain that solution.
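Putting the pieces together, a sketch of such a total cost function (the names, and the convention that spread and volatility are passed in decimal form, are ours):

```python
import numpy as np

def total_trading_cost(notional, est_spread, sigma, market_volume,
                       fee=0.001, alpha=0.75):
    """Estimated total cost (in USDT) of trading `notional` USDT of an asset:
    fixed exchange fee + half the estimated bid-ask spread
    + sigma-root-liquidity market impact."""
    explicit = fee * notional                    # 10bps Binance fee
    spread_cost = 0.5 * est_spread * notional    # we cross half the spread per trade
    impact = alpha * sigma * np.sqrt(notional / market_volume) * notional
    return explicit + spread_cost + impact
```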

IS vs OOS

We remember the golden rule of backtesting: train your algorithm on in-sample (IS) data and test it on out-of-sample (OOS) data. In other words, make sure not to backtest your strategy on the same dataset as the one used to train it.

Indeed, using the same dataset to train and test our algorithm might result in an overfitted model: an algorithm perfectly optimised for a very specific historical period, but completely lost when tested over another one. We need an algorithm that can adapt to various market regimes.

In our backtest, we will train our algorithm over the IS period [01 July 2017–31 December 2019] and backtest our strategy over the OOS period [01 July 2020–now]. Both periods experience a good variety of market regimes: a bull run, a massive drawdown and periods of low volatility. The limit of such a split is that the investment universe is obviously larger over the OOS period than over the IS period, which might be problematic for certain strategies we want to backtest.
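With a price DataFrame indexed by date (the variable names are hypothetical), the split is a simple slice:

```python
# in-sample data, used to design and tune the strategies
is_prices = prices.loc["2017-07-01":"2019-12-31"]
# out-of-sample data, reserved for the final backtests
oos_prices = prices.loc["2020-07-01":]
```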

Measuring the performance of a strategy

In order to assess the performance of our strategies, we will implement a set of objective performance indicators:

  • The GAV (Gross Asset Value) is the sum of the USDT value of the assets held in the portfolio, without taking trading costs into account. In other words, this is the theoretical value of our portfolio when implementation costs are ignored. Although this may seem an unrealistic and useless indicator, it will help us assess the impact of the trading costs of each investment strategy.
  • The NAV (Net Asset Value) is the sum of the USDT value of the assets held in the portfolio, this time taking into account the estimated trading costs (transaction and slippage costs).
  • The Total net P&L (profit and loss), defined as the difference between the final and the initial value of the portfolio (see the formulas after this list).
  • The Total Returns “R” of the portfolio, being the Total net P&L divided by the initial balance of the portfolio.
  • The CAGR (Compound Annual Growth Rate) is the measure of the annual growth rate of the portfolio over time, taking into account the effect of compounding. With N being the full period expressed in years, it is defined in the formulas after this list.
  • The Volatility “σ” of the portfolio, defined as the standard deviation of the log-returns of the NAV
  • The risk-adjusted returns of the portfolio. We will use the Sortino Ratio (“SR”) instead of the more common Sharpe Ratio. The Sortino Ratio differs from the Sharpe Ratio in that it only considers the standard deviation of the downside returns, rather than that of all returns (upside + downside), because after all, upside volatility is our friend. If Rp is the annualized arithmetic average of the log-returns, rf the risk-free rate (0.02% in our case, roughly the rate of return of a US Treasury bill, considered the least risky asset, see https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=billrates) and σd the standard deviation of the negative log-returns, then the Sortino Ratio is defined as shown after this list.
  • The market exposure of the portfolio, which is the percentage of the portfolio that is invested in cryptocurrencies vs. the portion in USDT (considered as non-exposed/non-risky although USDT is ultimately a cryptocurrency).
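For reference, here are the formulas referred to in the list above, reconstructed from the definitions given (NAV_0 and NAV_T being the initial and final NAV):

$$\mathrm{P\&L} = NAV_T - NAV_0 \qquad R = \frac{NAV_T - NAV_0}{NAV_0}$$

$$\mathrm{CAGR} = \left(\frac{NAV_T}{NAV_0}\right)^{1/N} - 1 \qquad SR = \frac{R_p - r_f}{\sigma_d}$$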

Now that we have our backtest environment and tools to measure the performance of the strategies we want to test, let’s play…✅

See you “au prochain épisode” → Next episode

