Saturday, February 28, 2015

Commitments of Traders (COT) strategy on soybean futures

In our drive to extract alphas from a variety of non-price data, we came across this old-fashioned source: Commitments of Traders (COT) on futures. This indicator has been well known to futures traders since 1923, but there are often persistent patterns (risk factors?) in the markets that refuse to be arbitraged away. It is worth another look, especially since the data has become richer over the years.

First, some facts about COT:
1) CFTC collects the reports of the number of long and short futures and options contracts ("open interest") held by different types of firms by Tuesdays, and reports them every Friday by 4:30 CT.
2) Options positions are added to COT as if they were futures but adjusted by their deltas.
3) COT are then broken down into contracts held by different types of firms. The most familiar types are "Commercial" (e.g. an ethanol plant) and "Non-Commercial" (i.e. speculators).
4) Other types are "Spreaders" who hold calendar spreads, "Index traders", "Money Managers", etc. There are 9 mutually exclusive types in total.

Since we only have historical COT data from, and they do not collect data on all these types, we have to restrict our present analysis only to Commercial and Non-Commercial. Also, beware that csidata tags a COT report by its Tuesday data collection date. As noted above, that information is unactionable until the following Sunday evening when the market re-opens.

A simple strategy would be to compute the ratio of long vs short COT for Non-Commercial traders. We buy the front contract when this ratio is equal to or greater than 3, exiting when the ratio drops to or below 1. We short the front contract when this ratio is equal to or less than 1/3, exiting when the ratio rises to or above 1. Hence this is a momentum strategy: we trade in the same direction as the speculators did. As most profitable futures traders are momentum traders, it would not be surprising this strategy could be profitable.
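The entry and exit rules above can be sketched as a small state machine. This is only an illustration, not the code used for the backtest: the function name, the sample ratios, and the choice to check exits before entries are assumptions. Each ratio must also be lagged so that a Tuesday-dated report is only acted on once the market re-opens after the Friday release.

```python
def cot_positions(ratios, enter=3.0, exit_level=1.0):
    """Return one position per COT report: +1 long, -1 short, 0 flat.

    ratios: long/short open-interest ratio of Non-Commercial traders,
            one value per weekly report, in chronological order.
    """
    positions = []
    pos = 0
    for r in ratios:
        # Exit rules: leave a long at or below 1, leave a short at or above 1.
        if pos == 1 and r <= exit_level:
            pos = 0
        elif pos == -1 and r >= exit_level:
            pos = 0
        # Entry rules: trade in the same direction as the speculators
        # when their long/short ratio is extreme.
        if pos == 0:
            if r >= enter:
                pos = 1
            elif r <= 1.0 / enter:
                pos = -1
        positions.append(pos)
    return positions


# Hypothetical weekly ratios, for illustration only.
ratios = [1.5, 3.2, 2.0, 1.1, 0.9, 0.5, 0.3, 0.8, 1.0, 2.5]
positions = cot_positions(ratios)
print(positions)
```

The position stays on between the entry and exit thresholds, which is what makes the rule trend-following rather than a simple threshold indicator.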

Over the period from 1999 to 2014, applying this strategy on CME soybean futures returns about 9% per annum, though its best period seems to be behind us already. I have plotted the cumulative returns below.

I have applied this strategy to a few other agricultural commodities, but it doesn't seem to work on them. It is therefore quite possible that the positive result on soybeans is a fluke. Also, it is very unsatisfactory that we do not have data on the Money Managers (which include the all-important CPOs and CTAs), since they would likely be an important source of alpha. Of course, we can go directly to the, download all the historical reports in .xls format, and compile the data ourselves. But that is a project for another day.

My Upcoming  Talks and Workshops

3/14: "Beware of Low Frequency Data" at QuantCon 2015, New York.
3/22-: "Algorithmic Trading of Bitcoins" pre-recorded online workshop.
3/24-25: "Millisecond Frequency Trading" live online workshop.
5/13-14: "Mean Reversion Strategies", "AI techniques in Trading" and "Portfolio Optimization" at Q-Trade Bootcamp 2015, Milan, Italy. 

Managed Account Program Update

Our FX Managed Account program has a net return of +7.68% in February (YTD: +8.06%).

Follow me on Twitter: @chanep

Thursday, January 08, 2015

Trading with Estimize and I/B/E/S earnings estimates data

By Yang Gao

Estimize is an online community that uses the 'wisdom of crowds' to offer intelligence about the market. It contains a wide range of crowd-sourced estimates from over 4,500 buy-side, sell-side and individual analysts. Studies (from Deutsche Bank and Rice University among others) show that estimates from Estimize are more accurate than estimates from traditional sell-side analysts.

The first strategy we tested is a mean reversion strategy developed by the quantitative research team from Deltix using Estimize’s data. This strategy is based on the idea that post-earning-announcement prices typically revert from the short-term trend driven by the more recent Estimize estimates just before the announcement. We backtested this strategy with S&P100 over the period between 2012/01/01 and 2013/12/31. (Even though Estimize has 2014 data, we do not have the corresponding survivorship-bias-free price data from the Center for Research in Securities Prices that includes the closing bid and ask prices.) With 5bp one-way transaction cost, we found that the backtest shows a Sharpe ratio of 0.8 and an average annual return of 6%.  The following figure is the cumulative P&L of the strategy based on $1 per stock position.

Cumulative P&L of Deltix Mean Reversion Strategy with Estimize 
It surprised us that a mean-reverting instead of a momentum strategy was used in conjunction with Estimize data, since earnings estimates and announcements typically generate price momentum. In order to show that this return is really driven by the information in Estimize and not simply due to price reversal, we provide a benchmark mean-reverting strategy that uses prices alone to generate signal:

1. Find long period T and short period T_s, where T is average period of the reporting of all the quarterly estimates and T_s is average period of the reporting of the latest 20% of all estimates.
2. Calculate stock return R over T and Rs over T_s, and let delta = R - Rs
3. Buy stocks with delta > 0 at close before an earnings announcement and exit the positions next morning at the open after the announcement.
4. Sell stocks with delta < 0 at close before an earnings announcement and exit the positions next morning at the open after the announcement.
5. Hedge net exposure with SPY during the entire holding period.
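Steps 2 to 4 of this benchmark can be sketched as follows. This is a rough illustration under stated assumptions: simple (not log) returns, lookbacks T and T_s expressed in trading days, and a hypothetical function name. The SPY hedge of step 5 is left out.

```python
def benchmark_signal(prices, T, T_s):
    """Return +1 (buy), -1 (sell) or 0 for one stock.

    prices: daily closes, the last element being the close just before
            the earnings announcement.
    T, T_s: long and short lookbacks in trading days (T > T_s).
    """
    R = prices[-1] / prices[-1 - T] - 1.0      # return over the long period
    Rs = prices[-1] / prices[-1 - T_s] - 1.0   # return over the short period
    delta = R - Rs
    if delta > 0:
        return 1     # short-term return lags the long-term one: buy
    if delta < 0:
        return -1    # short-term return exceeds the long-term one: sell
    return 0


# Hypothetical closes, for illustration only.
falling_lately = [100.0, 102.0, 104.0, 103.0, 101.0]
rising_lately = [100.0, 99.0, 98.0, 99.0, 101.0]
print(benchmark_signal(falling_lately, T=4, T_s=1))
print(benchmark_signal(rising_lately, T=4, T_s=1))
```

Positions would be entered at the close before the announcement and exited at the next open, as described in steps 3 and 4.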

This benchmark shows no significant positive return and so it does seem that there is useful information in the Estimize data captured by Deltix’s mean-reversion strategy.

Next, we compare the traditional earnings estimates from I/B/E/S gathered from sell-side Wall Street analysts to the crowd-sourced Estimize estimates. Backtest showed the same Deltix mean reversion strategy described above but using I/B/E/S estimates gave negative return over the same S&P100 universe and over the same 2012-2013 period, again supporting the thesis that Estimize estimates may be superior.

Since Deltix's mean reversion strategy gives negative returns on I/B/E/S data, it is natural to see if a momentum strategy would work instead: if the short-term average estimate is higher than the long-term average estimate (i.e. analogous to delta < 0 above), we expect the price to move up, and vice versa.

The backtest result of this momentum strategy over the same universe and time period is quite promising: with 5bp transaction cost, the Sharpe ratio = 1.5 and average annual return = 11%. The following figure is the daily P&L of the strategy based on $1 per stock position.

 Cumulative P&L of momentum Strategy with I/B/E/S

We tried the same momentum strategy using Estimize data over 2012-2013, and it generated negative returns this time. This is not surprising since we found earlier that the mean reversion strategy using Estimize data generated positive returns.

We proceeded to backtest this momentum strategy over the S&P100 using out-of-sample I/B/E/S data between 2010 and 2012, and unfortunately the strategy failed there too. The following figure is the daily P&L of the strategy from 2010-2014.

Cumulative P&L of momentum Strategy with I/B/E/S 

So how would Deltix’s mean-reversion strategy with Estimize data work over this out-of-sample period? Unfortunately, we won’t know because Estimize didn't start collecting data until the end of 2011. The following table is a summary on the annual returns comparing different strategies using different data sets and periods.


As a result, we cannot conclude that Estimize data is consistently better than I/B/E/S data in terms of generating alpha: it depends on the strategy deployed. We also cannot decide which strategy – mean-reversion or momentum – is consistently better: it depends on the time period and the data used. The only conclusion we can reach is that the short duration of the Estimize data coupled with our lack of proper price data in 2014 means that we cannot have a statistically significant backtest. This state of inconclusiveness will of course be cured in time.

Yang Gao, Ph.D., is a research intern at QTS Capital Management, LLC.

Industry Update
(No endorsement of companies or products is implied by our mention.)
  • There is a good discussion comparing Quantconnect to Quantopian here.
  • For FX traders, Rizm offers a service comparable to Quantconnect and Quantopian, and is directly connected to FXCM.
  • Quantopian now offers free fundamental data from MorningStar. Also, check out their Quantopian Managers Program where you can compete to manage real money.
Workshop Update

Our next online workshop will be Millisecond Frequency Trading on March 25-26. It is for traders who are interested in intraday trading (even if not at millisecond frequency) and who want to defend against certain HFT tactics.

Managed Account Program Update

Our FX Managed Account program had a strong finish in 2014, with annual net return of 69.86%.

Friday, November 14, 2014

Rent, don’t buy, data: our experience with QuantGo (Guest Post)

By Roger Hunter

I am a quant researcher and developer for QTS Partners, a commodity pool Ernie (author of this blog) founded in 2011. I help Ernie develop and implement several strategies in the pool and various separate accounts.  I wrote this article to give insights into a very important part of our strategy development process: the selection of data sources.

Our main research focus is on strategies that monitor execution in milliseconds and that hold for seconds through several days. For example, a strategy that trades more than one currency pair simultaneously must ensure that several executions take place at the right price and within a very short time. Backtesting such strategies requires high-quality historical intraday quote and trade data, preferably tick data.  Our initial focus was futures, and after looking at various vendors for the tick data quality and quantity we needed, we chose Nanex data, which is aggregated at 25ms. This means, for example, that aggressor flags are not available. We purchased several years of futures data and set to work.

Earlier this year we needed to update our data and discovered that Nanex prices had increased significantly. We also needed quotes and trades, and data for more asset classes including US equities and options.

We looked at which has good data but is very expensive and you pay up-front per symbol.  There are other services like and where you pay based on your monthly usage (number of data requests made) which is a model we do not like.  We ended up choosing, where you have unlimited access to years of global tick or bar data for a fixed monthly subscription fee per data service.

On QuantGo, you get computer instances in your own secure and private cloud built on Amazon AWS with on-demand access to a wide range of global intraday tick or bar data from multiple data vendors.  Since you own and manage the computer instances you can choose any operating system, install any software, access the internet or import your own data.  With QuantGo the original vendor data must remain in the cloud, but you can download your results. This allows QuantGo to rent access to years of data at affordable monthly prices.

All of the data we have used so far is from AlgoSeek (one of QuantGo’s data vendors). This data is survivorship bias-free and is exactly as provided by the exchanges at the time. Futures quotes and trades download very quickly on the system. I am testing options strategies, which is challenging due to the size of the data. The data is downloaded in highly compressed form which is then expanded (by QuantGo) to a somewhat verbose text form.  Before the price split, a day of option quotes and trades for AAPL was typically 100GB in this form. Here is a data sample from the full Options (OPRA) data:

Timestamp, EventType, Ticker, OptionDetail, Price, Quantity, Exchange, Conditions
08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
08:30:02.493, NO_QUOTE ASK, LLEN, CALL at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R
09:30:00.500, ROTATION BID, LLEN, PUT at 2.0000 on 2013-07-20, 0.0000, 0, ARCA, R
09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A
09:30:00.508, FIRM_QUOTE BID NB, LLEN, PUT at 6.0000 on 2013-08-17, 0.2000, 7, BATS, A

These I convert to a more compact format, and filter out lines we don't need (e.g. NO_QUOTE, non-firm, etc.)
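As a rough idea of what such a conversion can look like, here is a minimal Python sketch that parses the sample above and keeps only firm quotes. It is not the converter we actually use: the field layout is inferred from the six sample lines, the output tuple format is made up for illustration, and a real filter would also need to keep trade records, which do not appear in this sample.

```python
import csv
import io

SAMPLE = """\
Timestamp, EventType, Ticker, OptionDetail, Price, Quantity, Exchange, Conditions
08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
08:30:02.493, NO_QUOTE ASK, LLEN, CALL at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R
09:30:00.500, ROTATION BID, LLEN, PUT at 2.0000 on 2013-07-20, 0.0000, 0, ARCA, R
09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A
09:30:00.508, FIRM_QUOTE BID NB, LLEN, PUT at 6.0000 on 2013-08-17, 0.2000, 7, BATS, A
"""


def compact_quotes(text):
    """Keep FIRM_QUOTE lines only and return compact tuples:
    (time, side, ticker, right, strike, expiry, price, quantity, exchange)."""
    rows = []
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    for rec in reader:
        event = rec["EventType"].split()   # e.g. ["FIRM_QUOTE", "ASK", "NB"]
        if event[0] != "FIRM_QUOTE":
            continue                       # drop NO_QUOTE, ROTATION, etc.
        # OptionDetail looks like "PUT at 5.0000 on 2013-08-17".
        right, _, strike, _, expiry = rec["OptionDetail"].split()
        rows.append((rec["Timestamp"], event[1], rec["Ticker"], right[0],
                     float(strike), expiry, float(rec["Price"]),
                     int(rec["Quantity"]), rec["Exchange"]))
    return rows


rows = compact_quotes(SAMPLE)
for row in rows:
    print(row)
```

For a full day of option data, the same loop would be run over a file handle rather than an in-memory string, so that only one line is held in memory at a time.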

The quality of the AlgoSeek data seems to be high. One test I have performed is to record live data and compare it with AlgoSeek. This is possible because the AlgoSeek historical data is now updated daily, and is one day behind for all except options, which varies from two days to five (they are striving for two, but the process involves uploading all options data to special servers --- a significant task). Another test is done using OptionNET Explorer (ONE). ONE data is at 5-minute intervals and the software displays midpoints only. However, by executing historical trades, you can see the bid and ask values for options at these 5-minute boundaries. I have checked 20 of these against the AlgoSeek data and found exact agreement in every case. In any event, you are free to contact the data vendors directly to learn more about their products. The final test of data quality (and of our market model) is the comparison of live trading results (at one contract/spread level) with backtests over the same period.

The data offerings have recently expanded dramatically with more data partners and now include historical data from (QuantGo claims) "every exchange in the world". I haven't verified this, but the addition of elementized, tagged and scored news from Acquire Media, for example, will allow us to backtest strategies of the type discussed in Ernie's latest book.

So far, we like the system. For us, the positives are:

1. Affordable Prices.  The reason that the price has been kept relatively low is that original vendor data must be kept and used in the QuantGo cloud. For example, to access years of US data we have been paying:
  • Five years of US Equities Trades and Quotes (“TAQ”): $250 per month
  • Five years of US Equities 5-minute bars: $75 per month
  • Three years of US Options 1-minute bars: $100 per month
  • Three years of CME, CBOT, NYMEX Futures Trades and Quotes: $250 per month

2.  Free Sample Data.  Each data service has free demo data which is actual real historical data where I can select data from the demo date range.  This allowed me to view and work with the data before subscribing.

3. One API.  I have one API to access different data vendors.  QuantGo gives me a Java GUI, a Python CLI and various libraries (R, Matlab, Java).

4. On-Demand.  The ability to select the data we want "on demand" via a subscription from a website console at any time. You can select data for any symbol and for just a day or for several years.

5. Platform not proprietary.  We can use any operating system or software with the data as it is being downloaded to virtual computers we fully control and manage.

Because all this is done in the cloud, we have to pay for our cloud computer usage as well.  While cloud usage is continuing to drop rapidly in price, it is still a variable cost and it needs to be monitored.  QuantGo does provide close to real-time billing estimates and alarms you can preset at dollar values.

I was at first skeptical of the restriction of not being able to download the data vendor’s tick or bar data, but so far this hasn't been an issue as in practice we only need the results and our derived data sets. I'm told that if you want to buy the data for your own computers, you can negotiate directly with the individual data vendor and will get a discount if you have been using it for a while on QuantGo.

As we use the Windows operating system, we access our cloud computers with Remote Desktop and there have been some latency issues, but these are tolerable. On the other hand, it is a big advantage to be able to start with a relatively small virtual machine for initial coding and debugging, then "dial up" a much larger machine (or group of machines) when you want to run many compute- and data-intensive backtests. While QuantGo is recently launched and is not perfect, it does open up the world of the highest institutional quality data to those of us who do not have the data budget of a Renaissance Technologies or D.E. Shaw.

Industry Update
(No endorsement of companies or products is implied by our mention.)
  • A new site for jobs in finance was recently launched:
  • A new software package Geode by Georgica Software can backtest tick data, and comes with a fairly rudimentary fill simulator.
  • now incorporates a new IPython based research environment that allows interactive data analysis using minute level pricing data in Python.
Workshops Update

My next online Quantitative Momentum Strategies workshop will be held on December 2-4. Any reader interested in futures trading  in general would benefit from this course.

Managed Account Program Update

Our FX Managed Account program had an unusually profitable month in October.

Friday, September 05, 2014

Moving Average Crossover = Triangle Filter on 1-Period Returns

Many traders who use technical analysis favor the Moving Average Crossover as a momentum indicator. They compute the short-term minus the long-term moving averages of prices, and go long if this indicator just turns positive, or go short if it turns negative. This seems intuitive enough. What isn't obvious, however, is that MA Crossover is nothing more than an estimate of the recent average compound return.

But just when you might be tempted to ditch this indicator in favor of the average compound return, it can be shown that the MA Crossover is also a triangle filter on the 1-period returns. (A triangle filter in signal processing is a set of weights imposed on a time series that increases linearly with time up to some point, and then decreases linearly with time up to the present time. See the diagram at the end of this article.) Why is this interpretation interesting? That's because it leads us to consider other, more sophisticated filters (such as the least squares, Kalman, or wavelet filters) as possible momentum indicators. In collaboration with my former workshop participant Alex W. who was inspired by this paper by Bruder et al., we present the derivations below.


First, note that we will compute the moving average of log prices y, not raw prices. There is of course no loss or gain in information going from prices to log prices, but it will make our analysis possible. (The exact time of the crossover, though, will depend on whether we use prices or log prices.) If we write MA(t, n1) to denote the moving average of n1 log prices ending at time t, then the moving average crossover is MA(t, n1)-MA(t, n2), assuming n1< n2.  By definition,

MA(t, n1)=(y(t)+y(t-1)+...+y(t-n1+1))/n1
MA(t, n2)=(y(t)+y(t-1)+...+y(t-n1+1)+y(t-n1)+...+y(t-n2+1))/n2

MA(t, n1)-MA(t, n2)
=[(n2-n1)/(n1*n2)] *[y(t)+y(t-1)+...+y(t-n1+1)] - (1/n2)*[y(t-n1)+...+y(t-n2+1)]    
=[(n2-n1)/n2] *MA(t, n1)-[(n2-n1)/n2]*MA(t-n1, n2-n1)
=[(n2-n1)/n2]*[MA(t, n1)-MA(t-n1, n2-n1)]

If we interpret MA(t, n1) as an approximation of the log price at the midpoint of the time interval [t-n1+1, t], i.e. (n1-1)/2 periods before t, and MA(t-n1, n2-n1) as an approximation of the log price at the midpoint of the time interval [t-n2+1, t-n1], i.e. (n2-n1-1)/2 periods before t-n1, then [MA(t, n1)-MA(t-n1, n2-n1)] is an approximation of the total return over a time period of n2/2. If we write this total return as an average compound growth rate r multiplied by the period n2/2, we get

MA(t, n1)-MA(t, n2)  ≈ [(n2-n1)/n2]*(n2/2)*r

r ≈ [2/(n2-n1)]*[MA(t, n1)-MA(t, n2)]

as shown in Equation 4 of the paper cited above. (Note the roles of n1 and n2 are reversed in that paper.)
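Both the exact identity and the approximation for r are easy to check numerically. The Python sketch below uses made-up log prices (a constant drift plus a sine wiggle, purely an assumption for illustration). The identity should hold for any series, while the estimate of r is exact only when the log price is a pure linear trend.

```python
import math


def ma(y, t, n):
    """Moving average of the n values of y ending at index t."""
    return sum(y[t - n + 1 : t + 1]) / n


n1, n2 = 7, 10
r_true = 0.001

# Hypothetical log prices: constant drift plus a deterministic wiggle.
y = [r_true * k + 0.01 * math.sin(0.3 * k) for k in range(50)]
t = len(y) - 1

# Exact identity: MA(t,n1)-MA(t,n2) = [(n2-n1)/n2]*[MA(t,n1)-MA(t-n1,n2-n1)]
crossover = ma(y, t, n1) - ma(y, t, n2)
rhs = (n2 - n1) / n2 * (ma(y, t, n1) - ma(y, t - n1, n2 - n1))
print(abs(crossover - rhs) < 1e-12)

# With pure drift, r = [2/(n2-n1)]*[MA(t,n1)-MA(t,n2)] recovers the drift.
y_lin = [r_true * k for k in range(50)]
r_est = 2.0 / (n2 - n1) * (ma(y_lin, t, n1) - ma(y_lin, t, n2))
print(r_est)
```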


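Next, we will show why the MA crossover is also a triangle filter on 1-period returns. Simplifying notation by fixing t to be 0,

MA(t=0, n1)
=(1/n1)*[y(0)+y(-1)+...+y(-n1+1)]
=(1/n1)*[(y(0)-y(-1))+2*(y(-1)-y(-2))+...+n1*(y(-n1+1)-y(-n1))]+y(-n1)

Writing the returns from t-1 to t as R(t), this becomes

MA(t=0, n1)=(1/n1)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]+y(-n1)

Similarly,

MA(t=0, n2)=(1/n2)*[R(0)+2*R(-1)+...+n2*R(-n2+1)]+y(-n2)

So MA(0, n1)-MA(0, n2)
=(1/n1-1/n2)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]
 -(1/n2)*[(n1+1)*R(-n1)+(n1+2)*R(-n1-1)+...+n2*R(-n2+1)]
 +y(-n1)-y(-n2)

Note that the last line above is just the total cumulative return from -n2 to -n1, which can be written as

y(-n1)-y(-n2)=R(-n1)+R(-n1-1)+...+R(-n2+1)

Hence we can absorb that into the expression prior to that

MA(0, n1)-MA(0, n2)
=(1/n1-1/n2)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]
 +(1/n2)*[(n2-n1-1)*R(-n1)+(n2-n1-2)*R(-n1-1)+...+R(-n2+2)]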

We can see the coefficients of R's from t=-n2+2 to -n1 form the left side of a triangle with positive slope, and those from t=-n1+1 to 0 form the right side of the triangle with negative slope. The plot below shows the coefficients as a function of time, with n2=10, n1=7, and current time as t=0. The right-most point is the weight for R(0): the return from t=-1 to 0.

Q.E.D. Now I hope you are ready to move on to a wavelet filter!

P.S. It is wonderful to be able to check the correctness of messy algebra like those above with a simple Matlab program!
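In the same spirit, here is a short Python version of such a check (a sketch, not the original Matlab program). Working through the algebra, the weight on R(-k) comes out as (k+1)*(1/n1-1/n2) for k from 0 to n1-1, and (n2-k-1)/n2 for k from n1 to n2-1. The random log prices below are made up for illustration.

```python
import random


def triangle_weights(n1, n2):
    """Weights w[k] on R(-k), k = 0..n2-1, such that
    MA(0, n1) - MA(0, n2) = sum over k of w[k] * R(-k)."""
    w = []
    for k in range(n2):
        if k < n1:
            w.append((k + 1) * (1.0 / n1 - 1.0 / n2))   # right side of the triangle
        else:
            w.append((n2 - k - 1) / n2)                 # left side of the triangle
    return w


n1, n2 = 7, 10
random.seed(0)

# Hypothetical log prices y(-n2), ..., y(0); list index i holds time i - n2.
y = [0.0]
for _ in range(n2):
    y.append(y[-1] + random.gauss(0.0, 0.01))


def ma_at_zero(n):
    """Moving average of the n log prices ending at time 0."""
    return sum(y[n2 - n + 1 : n2 + 1]) / n


returns = [y[n2 - k] - y[n2 - k - 1] for k in range(n2)]   # returns[k] = R(-k)
w = triangle_weights(n1, n2)

crossover = ma_at_zero(n1) - ma_at_zero(n2)
filtered = sum(wk * rk for wk, rk in zip(w, returns))
print(abs(crossover - filtered) < 1e-12)
print(w)
```

With n1=7 and n2=10 the weights rise from zero at R(-9) to their peak at R(-6), then fall linearly towards R(0), which is the triangle described above.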

New Service Announcement

Our firm QTS Capital Management has recently launched a FX Managed Accounts program. It uses one of the mean-reverting strategies we have been trading successfully in our fund for the last three years, and is still going strong despite the low volatility in the markets. The benefits of a managed account are that clients retain full ownership and control of their funds at all times, and they can decide what level of leverage they are comfortable with. Unlike certain offshore FX operators, QTS is a CPO/CTA regulated by the National Futures Association and the Commodity Futures Trading Commission.

Workshops Update

Readers may be interested in my next workshop series to be held in London, November 3-7. Please follow the link at the bottom of this page for information.

Monday, August 18, 2014

Kelly vs. Markowitz Portfolio Optimization

In my book, I described a very simple and elegant formula for determining the optimal asset allocation among N assets:

F=C^(-1)*M   (1)

where F is a Nx1 vector indicating the fraction of the equity to be allocated to each asset, C is the covariance matrix, and M is the mean vector for the excess returns of these assets. Note that these "assets" can in fact be "trading strategies" or "portfolios" themselves. If these are in fact real assets that incur a carry (financing) cost, then excess returns are returns minus the risk-free rate.

Notice that these fractions, or weights as they are usually called, are not normalized - they don't necessarily add up to 1. This means that F not only determines the allocation of the total equity among N assets, but it also determines the overall optimal leverage to be used. Since F is expressed in fractions of the equity, the sum of the absolute values of its components is in fact the overall leverage. Such is the beauty of the Kelly formula: optimal allocation and optimal leverage in one simple formula, which is supposed to maximize the compounded growth rate of one's equity (or equivalently the equity at the end of many periods).
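For a concrete feel, here is a two-asset sketch in Python. The mean excess returns and covariances are hypothetical numbers chosen only for illustration, and the 2x2 system is solved by Cramer's rule so that no external library is needed. The leverage is computed as the sum of the absolute values of the components of F, since F is already expressed in fractions of the equity.

```python
def solve2(C, M):
    """Solve the 2x2 linear system C * F = M by Cramer's rule."""
    (a, b), (c, d) = C
    det = a * d - b * c
    return [(d * M[0] - b * M[1]) / det, (a * M[1] - c * M[0]) / det]


# Hypothetical annualized inputs, for illustration only.
M = [0.06, 0.03]           # mean excess returns of the two assets
C = [[0.04, 0.006],        # covariance matrix of their returns
     [0.006, 0.01]]

F = solve2(C, M)                       # Kelly allocation, F = C^(-1) * M
leverage = sum(abs(f) for f in F)      # overall leverage implied by F

print(F)
print(leverage)
```

Note that nothing forces the components of F to add up to 1: with these inputs the formula asks for a leveraged position in both assets.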

However, most students of finance are not taught Kelly portfolio optimization. They are taught Markowitz mean-variance portfolio optimization. In particular, they are taught that there is a portfolio called the tangency portfolio which lies on the efficient frontier (the set of portfolios with minimum variance consistent with a certain expected return) and which maximizes the Sharpe ratio. Left unsaid are

  • What's so good about this tangency portfolio?
  • What's the real benefit of maximizing the Sharpe ratio?
  • Is this tangency portfolio the same as the one recommended by Kelly optimal allocation?
I want to answer these questions here, and provide a connection between Kelly and Markowitz portfolio optimization.

According to Kelly and Ed Thorp (and explained in my book), F above not only maximizes the compounded growth rate, but it also maximizes the Sharpe ratio. Put another way: the maximum growth rate is achieved when the Sharpe ratio is maximized. Hence we see why the tangency portfolio is so important. And in fact, the tangency portfolio is the same as the Kelly optimal portfolio F, except for the fact that the tangency portfolio is assumed to be normalized and has a leverage of 1 whereas F goes one step further and determines the optimal leverage for us. Otherwise, the percent allocation of an asset is the same in both (assuming that we haven't imposed additional constraints in the optimization problem). How do we prove this?

The usual way Markowitz portfolio optimization is taught is by setting up a constrained quadratic optimization problem - quadratic because we want to optimize the portfolio variance which is a quadratic function of the weights of the underlying assets - and proceed to use a numerical quadratic programming (QP) program to solve this and then further maximize the Sharpe ratio to find the tangency portfolio. But this is unnecessarily tedious and actually obscures the elegant formula for F shown above. Instead, we can proceed by applying Lagrange multipliers to the following optimization problem (see for a similar treatment):

Maximize Sharpe ratio = F^T*M/(F^T*C*F)^(1/2)    (2)

subject to constraint F^T*1=1   (3)

(Here the 1 on the left hand side denotes a column vector of ones.)

So we should maximize the following unconstrained quantity with respect to the weights F_i of each asset i and the Lagrange multiplier λ:

F^T*M/(F^T*C*F)^(1/2)  - λ(F^T*1-1)  (4)

But taking the partial derivatives of this fraction with a square root in the denominator is unwieldy. So equivalently, we can maximize the logarithm of the Sharpe ratio subject to the same constraint. Thus we can take the partial derivatives of 

log(F^T*M)-(1/2)*log(F^T*C*F)  - λ(F^T*1-1)   (5)

with respect to F_i. Setting each of these partial derivatives to zero gives the matrix equation

(1/(F^T*M))*M-(1/(F^T*C*F))*C*F=λ1   (6)

Multiplying the whole equation by F^T on the left gives

(1/(F^T*M))*F^T*M-(1/(F^T*C*F))*F^T*C*F=λF^T*1   (7)

Remembering the constraint, we recognize the right hand side as just λ. The left hand side comes out to be exactly zero, which means that λ is zero. A Lagrange multiplier that turns out to be zero means that the constraint won't affect the solution of the optimization problem up to a proportionality constant. This is satisfying since we know that if we apply an equal leverage on all the assets, the maximum Sharpe ratio should be unaffected. So we are left with the matrix equation for the solution of the optimal F:

C*F=((F^T*C*F)/(F^T*M))*M    (8)

If you know how to solve this for F using matrix algebra, I would like to hear from you. But let's try an ansatz F=C^(-1)*M as in (1). The left hand side of (8) becomes M, and the right hand side becomes ((F^T*M)/(F^T*M))*M = M as well. So the ansatz works, and the solution is in fact (1), up to a proportionality constant. To satisfy the normalization constraint (3), we can write

F=C^(-1)*M / (1^T*C^(-1)*M)  (9)

So there, the tangency portfolio is the same as the Kelly optimal portfolio, up to a normalization constant, and without telling us what the optimal leverage is.
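The claim can be checked numerically on a small example. The Python sketch below uses hypothetical two-asset inputs (the same kind of made-up numbers as one would use to illustrate the Kelly formula) and a brute-force grid over fully invested portfolios instead of a QP solver. It is meant as a sanity check under those assumptions, not a proof.

```python
import math


def solve2(C, M):
    """Solve the 2x2 linear system C * F = M by Cramer's rule."""
    (a, b), (c, d) = C
    det = a * d - b * c
    return [(d * M[0] - b * M[1]) / det, (a * M[1] - c * M[0]) / det]


def sharpe(F, M, C):
    """Sharpe ratio of weights F: the mean return over the standard deviation."""
    mean = F[0] * M[0] + F[1] * M[1]
    var = (F[0] * (C[0][0] * F[0] + C[0][1] * F[1])
           + F[1] * (C[1][0] * F[0] + C[1][1] * F[1]))
    return mean / math.sqrt(var)


# Hypothetical inputs, for illustration only.
M = [0.06, 0.03]
C = [[0.04, 0.006],
     [0.006, 0.01]]

kelly = solve2(C, M)                          # F = C^(-1) * M
tangency = [f / sum(kelly) for f in kelly]    # normalized so the weights sum to 1

# Brute-force search over portfolios (x, 1 - x) with x between -1 and 2.
grid = [(-1.0 + 0.01 * i, 2.0 - 0.01 * i) for i in range(301)]
best_on_grid = max(sharpe(F, M, C) for F in grid)

print(tangency)
print(sharpe(tangency, M, C), sharpe(kelly, M, C), best_on_grid)
```

The Kelly and tangency portfolios have the same Sharpe ratio because scaling all weights by a common leverage leaves the ratio unchanged, and no portfolio on the grid does better.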

Workshop Update:

Based on popular demand, I have revised the dates for my online Mean Reversion Strategies workshop to be August 27-29. 

Follow me @chanep on Twitter.

Wednesday, July 02, 2014

Another "universal" capital allocation algorithm

Financial engineers are accustomed to borrowing techniques from scientists in other fields (e.g. genetic algorithms), but rarely does the borrowing go the other way. It is therefore surprising to hear about this paper on a possible mechanism for evolution due to natural selection which is inspired by universal capital allocation algorithms.

A capital allocation algorithm attempts to optimize the allocation of capital to stocks in a portfolio. An allocation algorithm is called universal if it results in a net worth that is "similar" to that generated by the best constant-rebalanced portfolio with fixed weightings over time (denoted CBAL* below), chosen in hindsight. "Similar" here means that the net worth does not diverge exponentially. (For a precise definition, see this very readable paper by Borodin, et al. H/t: Vladimir P.)

Previously, I knew of only one such universal trading algorithm - the Universal Portfolio invented by Thomas Cover, which I have described before. But here is another one that has proven to be universal: the exceedingly simple EG algorithm.

The EG ("Exponentiated Gradient") algorithm is an example of a capital allocation rule using "multiplicative updates": the new capital allocated to a stock is proportional to its current capital multiplied by a factor. This factor is an exponential function of the return of the stock in the last period. This algorithm is both greedy and conservative: greedy because it always allocates more capital to the stock that did well most recently; conservative because there is a penalty for changing the allocation too drastically from one period to the next. This multiplicative update rule is the one proposed as a model for evolution by natural selection.

The computational advantage of EG over the Universal Portfolio is obvious: the latter requires a weighted average over all possible allocations at every step, while the former needs to know only the allocation and returns for the most recent period. But does this EG algorithm actually generate good returns in practice? I tested it two ways:

1) Allocate between cash (with 2% per annum interest) and SPY.
2) Allocate among SP500 stocks.

In both cases, the only free parameter of the model is a number called the "learning rate" η, which determines how fast the allocation can change from one period to the next. It is generally found that η=0.01 is optimal, which we adopted. Also, we disallow short positions in this study.
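To make the multiplicative update concrete, here is a minimal Python sketch of one EG step. The exact formula is not spelled out in this post, so the form used below is an assumption: each weight is multiplied by exp(η times the asset's gross return divided by the portfolio's gross return) and the weights are then renormalized to sum to 1, which is the form usually quoted for EG. The function and variable names are made up for illustration.

```python
import math


def eg_update(weights, price_relatives, eta=0.01):
    """One Exponentiated Gradient step.

    weights:         current allocation fractions (non-negative, summing to 1)
    price_relatives: gross returns over the last period (1.02 means +2%)
    eta:             learning rate; larger values let the allocation move faster
    """
    # Gross return of the current portfolio over the last period.
    port = sum(w * x for w, x in zip(weights, price_relatives))
    # Multiplicative update: assets that did relatively well gain weight.
    unnorm = [w * math.exp(eta * x / port)
              for w, x in zip(weights, price_relatives)]
    total = sum(unnorm)
    return [u / total for u in unnorm]   # renormalize, so no shorts or leverage


# Hypothetical example: cash earned 0%, the stock earned 10% last period.
new_weights = eg_update([0.5, 0.5], [1.0, 1.10], eta=0.01)
print(new_weights)
```

With a small learning rate such as 0.01 the allocation moves only slightly after each period, which is the "conservative" part of the algorithm.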

The benchmarks for comparison for 1) are, using the notations of the Borodin paper,

a)  the buy-and-hold SPY portfolio BAH, and
b) the best constant-rebalanced portfolio with fixed allocations in hindsight CBAL*.

The benchmarks for comparison for 2)  are

a) a constant rebalanced portfolio of SP500 stocks with equal allocations U-CBAL,
b) a portfolio with 100% allocation to the best stock chosen in hindsight BEST1, and
c) CBAL*.

To find CBAL* for a SP500 portfolio, I used Matlab Optimization Toolbox's constrained optimization function fmincon.
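For the two-asset case, CBAL* can be found without an optimizer. The Python sketch below uses a plain grid search in place of fmincon (a deliberate simplification that would not scale to hundreds of stocks) and a made-up return sequence in which one asset is flat while the other alternately doubles and halves.

```python
def cbal_wealth(weights, price_relatives_by_period):
    """Final wealth of $1 that is rebalanced to fixed `weights` every period."""
    wealth = 1.0
    for x in price_relatives_by_period:
        wealth *= sum(w * xi for w, xi in zip(weights, x))
    return wealth


def best_cbal_two_assets(price_relatives_by_period, steps=100):
    """Grid-search the best fixed split between two assets, with no shorting."""
    best = max(range(steps + 1),
               key=lambda i: cbal_wealth((i / steps, 1 - i / steps),
                                         price_relatives_by_period))
    b = best / steps
    return (b, 1 - b)


# Hypothetical gross returns per period for (asset A, asset B).
periods = [(1.0, 2.0), (1.0, 0.5), (1.0, 2.0), (1.0, 0.5)]

best = best_cbal_two_assets(periods)
print(best, cbal_wealth(best, periods))
print(cbal_wealth((1.0, 0.0), periods), cbal_wealth((0.0, 1.0), periods))
```

In this toy example buying and holding either asset ends with the starting wealth, while the best constant-rebalanced portfolio grows, which is why CBAL* is a demanding benchmark.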

There is also the issue of SP500 index reconstitution. It is complicated to handle the addition and deletion of stocks in the index within a constrained optimization function. So I opted for the shortcut of using a subset of stocks that were in SP500 from 2007 to 2013, tolerating the presence of survivorship bias. There are only 346 such stocks.

The result for 1) (cash vs SPY) is that the CAGR (compound annualized growth rate) of EG is slightly lower than BAH (4% vs 5%). It turns out that BAH and CBAL* are the same: it was best to allocate 100% to SPY during 2007-2013, an unsurprising recommendation in hindsight.

The result for 2) is that the CAGR of EG is higher than the equal-weight portfolio (0.5% vs 0.2%). But both these numbers are much lower than that of BEST1 (39.58%), which is almost the same as that of CBAL* (39.92%). (Can you guess which stock in the current SP500 generated the highest CAGR? The answer, to be revealed below*, will surprise you!)

We were promised that the EG algorithm will perform "similarly" to CBAL*, so why does it underperform so miserably? Remember that similarity here just means that the divergence is sub-exponential: but even a polynomial divergence can in practice be substantial! This seems to be a universal problem with universal algorithms of asset allocation: I have never found any that actually achieves significant returns in the short span of a few years. Maybe we will find more interesting results with higher frequency data.

So given the underwhelming performance of EG, why am I writing about this algorithm, aside from its interesting connection with biological evolution? That's because it serves as a setup for another, non-universal, portfolio allocation scheme, as well as a way to optimize parameters for trading strategies in general: both topics for another time.

Workshops Update:

My next online workshop will be on Mean Reversion Strategies, August 26-28. This and the Quantitative Momentum workshops will also be conducted live at Nanyang Technological University in Singapore, September 18-21.

Do follow me @chanep on Twitter, as I often post links to interesting articles there.

*The SP500 stock that generated the highest return from 2007-2013 is AMZN.

Friday, May 09, 2014

Short Interest as a Factor

Readers of will no doubt be impressed by this chart and the accompanying article:

Cumulative Returns of Most Shorted Stocks in 2013

Indeed, short interest (expressed as the number of shares shorted divided by the total number of shares outstanding) has long been thought to be a useful factor. To me, the counter-intuitive wisdom is that the more a stock is shorted, the better its performance. You might explain that by saying this is a result of the "short squeeze", when there is a jump in price, perhaps due to news, and stock lenders are eager to sell the stock they own. If you have borrowed this stock to short, your borrowed stock may be recalled and you will be forced to buy to cover at this most inopportune time. But this is an unsatisfactory explanation, as this will result only in a short-term (upward) momentum in price, not the sustained out-performance of the most shorted stocks. This long-term out-performance seems to suggest that short sellers are less informed than the average trader, which is odd.

Whatever the explanation, I am intrigued to find out if short interest really is a good factor to incorporate into a comprehensive factor model over the long term.

The result? Not particularly impressive. It turns out that 2013 was one of the best years for this factor (hence the impressive chart above). For that year, a daily-rebalanced long-short portfolio (long 50 most shorted stocks and short 50 least shorted stocks in the SPX) returned 6.9%, with a Sharpe ratio of 2 and a Calmar ratio of 2.9. However, if we extend our backtest to 2007, the APR is only 2.8%, with a Sharpe ratio of 0.5 and a Calmar ratio of 0.3. This backtest was done using survivorship-bias-free data from CRSP, with short interest data provided by Compustat.
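The portfolio construction and the two performance ratios can be sketched as follows. This is not the backtest code behind the numbers above. It assumes equal dollar weights within each leg, a zero risk-free rate for the Sharpe ratio (reasonable for a dollar-neutral portfolio), and Calmar defined as compound annual return divided by maximum drawdown. All function names and sample inputs are illustrative.

```python
import math
import statistics

def long_short_weights(short_interest, n):
    """Long the n most shorted names, short the n least shorted, equal weights."""
    if 2 * n > len(short_interest):
        raise ValueError("need at least 2*n stocks")
    ranked = sorted(short_interest, key=short_interest.get)   # least shorted first
    weights = {t: -0.5 / n for t in ranked[:n]}               # short leg
    weights.update({t: 0.5 / n for t in ranked[-n:]})         # long leg
    return weights

def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def sharpe_ratio(daily_returns, periods_per_year=252):
    mean = statistics.mean(daily_returns)
    return mean / statistics.stdev(daily_returns) * math.sqrt(periods_per_year)

def calmar_ratio(daily_returns, periods_per_year=252):
    years = len(daily_returns) / periods_per_year
    total = math.prod(1.0 + r for r in daily_returns)
    return (total ** (1.0 / years) - 1.0) / max_drawdown(daily_returns)

# Short interest as a fraction of shares outstanding (made-up values).
si = {"A": 0.10, "B": 0.01, "C": 0.05, "D": 0.20}
w = long_short_weights(si, 1)          # long D, short B
rets = [0.01, -0.02, 0.03, -0.01]      # made-up daily portfolio returns
mdd = max_drawdown(rets)               # about 0.02
```

For the SPX backtest described above, n would be 50 and the weights would be recomputed every day from the latest available short interest.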

Here is the cumulative returns chart from 2007-2013:

Cumulative Returns of LS Portfolio based on Short Interest: 2007-2013

Interestingly, trying this on the SP600 small-cap universe yielded negative returns, possibly meaning that short-sellers of small caps do have superior information.

I promise, this will be the last time I talk about factors in a while!

Tech Update:

I was shocked to learn that Matlab now offers licenses for just $149: the so-called Matlab Home (h/t: Ken H.). In addition, its Trading Toolbox now offers API connection to Interactive Brokers, as well as to a few other brokerages. I am familiar with both Matlab and R, and while I am impressed by the large number of free, sophisticated statistical packages in R, I still stand by Matlab as the most productive platform for developing our own strategies. The Matlab development (debugging) environment is just that much more polished and easy-to-use. The difference is bigger than Microsoft Word vs. Google Docs.
A reader Ravi B. told me that there is a website called if you want to try out different seasonal futures strategies.
Finally, a startup at offers machine learning algorithms to help you find the best combination of technical indicators for trading FX.

Workshops Update:

I am now offering the Millisecond Frequency Trading (MFT) Workshop as an online course on June 26-27. Previously, I have only offered it live in London and to a few institutional investors. It has two main parts:

Part 1: introducing techniques for traders who want to avoid HFT predators.

Part 2: how to backtest a strategy that requires tick data with millisecond resolution using Matlab.

The example strategy used is based on order flow. For more details, please visit

Additionally, I will be teaching the Mean Reversion and Momentum (but not MFT) workshops in Hong Kong on June 17-20.

Thursday, March 27, 2014

Update on the fundamentals factors: their effect on small cap stocks

In my last post, I reported that the fundamental factors used by Lyle and Wang seem to generate no returns on SP500 large cap stocks. These fundamental factors are the growth factor return-on-equity (ROE), and the value factor book-to-market ratio (BM).

I have since studied the effect of these factors on SP600 small cap stocks since 2004, using a survivorship-bias-free database combining information from both Compustat and CRSP. This time, the factors do produce an annualized average return of 4.7% and a Sharpe ratio of 0.8. Though these numbers are nowhere near the 26% return that Lyle and Wang found, they are still statistically significant. I have plotted the equity curve below.

Equity curve of long-short small-cap portfolio based on regression on ROE and BM factors (2004-2013)

One may wonder whether ROE or BM is the more important factor. So I ran a simpler model which uses one factor at a time to rank stocks every day. We buy stocks in the top decile of ROE, and short the ones in the bottom decile. Ditto for BM. I found an annualized average return of 5% with a Sharpe ratio of 0.8 using ROE only, and only 0.8% with a Sharpe ratio of 0.09 using BM only. The value factor BM is almost completely useless! Indeed, if we were to first sort on ROE, pick the top and bottom deciles, and then sort on BM, and pick the top and bottom halves, the resulting average return is almost the same as sorting on ROE alone. I plotted the equity curve for sorting on ROE below.
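A minimal sketch of the single sort and the double sort for one day's cross-section follows. It reads the double sort as: within the top ROE decile keep the half with the higher BM as longs, and within the bottom ROE decile keep the half with the lower BM as shorts. That reading is an assumption, as are the function names and the synthetic factor values.

```python
def decile_long_short(factor, frac=0.1):
    """Return (longs, shorts): the top and bottom `frac` of stocks by factor value."""
    ranked = sorted(factor, key=factor.get)      # lowest factor value first
    k = max(1, int(len(ranked) * frac))
    return ranked[-k:], ranked[:k]

def double_sort(roe, bm, frac=0.1):
    """Sort on ROE first, then split each extreme decile in half on BM."""
    longs, shorts = decile_long_short(roe, frac)
    longs_by_bm = sorted(longs, key=bm.get)
    shorts_by_bm = sorted(shorts, key=bm.get)
    half_long = max(1, len(longs) // 2)
    half_short = max(1, len(shorts) // 2)
    # keep high-BM names among the longs, low-BM names among the shorts
    return longs_by_bm[-half_long:], shorts_by_bm[:half_short]

# Synthetic cross-section of 20 stocks: ROE increases with the index,
# BM is scrambled so the two rankings differ.
tickers = [f"S{i}" for i in range(20)]
roe = {t: i / 100 for i, t in enumerate(tickers)}
bm = {t: ((i * 7) % 20) / 20 for i, t in enumerate(tickers)}

longs, shorts = decile_long_short(roe)       # S18, S19 vs S0, S1
dl, ds = double_sort(roe, bm)                # ['S19'] vs ['S0']
```

Running either function once per day on that day's factor values gives the daily rebalanced long and short lists.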

Equity curve of long-short small-cap portfolio based on top and bottom deciles of ROE (2004-2013)

Notice the sharp drawdown from 2008-05-30 to 2008-11-04, and the almost perfect recovery since then. This mirrors the behavior of the equity market itself, which raises the question of why we bother to construct a long-short portfolio at all as it provides no hedge against the downturn. It is also interesting to note that this factor does not exhibit "momentum crash" as explained in a previous article: it does not suffer at all during the market recovery. This means we should not automatically think of a fundamental growth factor as similar to price momentum.

My conclusion was partly corroborated by I. Kaplan, who has written a preprint on a similar topic. He found that a long-short portfolio created using the ratio EBITA/Enterprise Value on large caps generates a Sharpe ratio of about 0.6 but with very little drawdown, unlike the ROE factor applied to small caps that I studied above.

As Mr. Kaplan noted, these results are in some contradiction not only with Lyle and Wang's paper, but also with the widely circulated paper by Cliff Asness et al. These authors found that the BM factor works in practically every asset class. Of course, the timeframe of their research is much longer than my focus above. Furthermore, they have excluded financial and penny stocks, though I did not find such restrictions to have great impact in my study of large cap portfolios. In place of a fundamental growth factor, these authors simply used price momentum over an 11-month period (skipping the most recent month), and found that this is also predictive of future quarterly returns.
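The 11-month momentum signal that skips the most recent month can be sketched as below. It assumes month-end prices ordered oldest first and takes the return from 12 months ago to 1 month ago. The exact conventions in the Asness et al. paper may differ, and the function name is illustrative.

```python
def momentum_skip_month(monthly_prices):
    """Return over the 11 months ending one month ago.

    monthly_prices: month-end prices, oldest first, with the current
    month last. At least 13 values are required.
    """
    if len(monthly_prices) < 13:
        raise ValueError("need at least 13 month-end prices")
    # [-13] is the price 12 months ago, [-2] is the price 1 month ago
    return monthly_prices[-2] / monthly_prices[-13] - 1.0

prices = [100 + i for i in range(13)]      # 100, 101, ..., 112
signal = momentum_skip_month(prices)       # 111 / 100 - 1, about 0.11
```

The last month's move (111 to 112 here) is deliberately ignored, since very recent returns tend to reverse rather than persist.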

Finally, we should note that the ROE and BM factors here are quite similar to the Return-on-Capital and Earnings Yield factors used by Joel Greenblatt in his famous "Little Book That Still Beats The Market". One wonders if those factors suffer a similar drawdown during the financial crisis.


My online Momentum Workshop will be offered on May 5-7. Please visit for registration details. Furthermore, I will be teaching my Mean Reversion, Momentum, and Millisecond Frequency Trading workshops in Hong Kong on June 17-20.