Sunday, December 20, 2009

Pairs Trading--Cointegration Testing

There are several research papers on this topic, and a quick Google search turns up a list of them. Cointegration is sometimes used for pairs trading: if a pair of stocks is cointegrated, one can go long one stock and short the other (scaled by the hedge ratio). We are thus trying to be market neutral. Carol Alexander's book "Market Models" gives a very good explanation of the theory behind it.

A set of I(1) series is termed "cointegrated" if there is a linear combination of these series that is stationary. Stock A and Stock B are cointegrated if A and B are each approximately I(1), but there is a hedge ratio such that

Spread = StockA - Hedge_Ratio * StockB is approximately I(0), i.e. the spread is stationary (mean reverting).
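In symbols, the same definition can be written as follows. The notation is introduced here only for illustration: A_t and B_t are the two price series, \beta is the hedge ratio and S_t is the spread.

```latex
A_t \sim I(1), \quad B_t \sim I(1), \qquad S_t = A_t - \beta B_t \sim I(0)
```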

So we perform the following steps to check whether two stocks are cointegrated:

Step 1: Check that each of the two stocks is at least integrated of order 1, I(1).
This is done with the Augmented Dickey-Fuller (ADF) test. The null hypothesis of the ADF test is a unit root, so a price series is consistent with I(1) when the test fails to reject the null on the price levels.

Step 2: If both stocks pass the above test, perform a cointegrating Augmented Dickey-Fuller (CADF) test on the pair.

Step 3: If the pair passes the CADF test, run an ordinary least squares (OLS) regression to get the hedge ratio (the beta of the regression).

Step 4: The spread = StockA - Hedge_Ratio * StockB should now be stationary (mean reverting).

Step 5: Figure out when to exit: calculate the half-life.

The half-life tells you how long the spread is expected to take to revert half of its distance from the mean.

Step 6: Calculate the spread TODAY. Calculate the mean and standard deviation of the spread up to the day before. Check how far the current spread is from the historical average. If it is more than 1.5 standard deviations (or any other threshold) above the average, go short the spread, i.e. go short Stock A and long Hedge_Ratio * Stock B. If it is that far below the average, go long the spread (the opposite positions). Stay in the trade for the half-life calculated for the pair. Once the half-life period has passed, get out of the trade.
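Steps 3, 4 and 6 can be sketched in a few lines of MATLAB. This is a minimal illustration, not the code from the download. The variable names (stockA, stockB, hedgeRatio, zEntry) are made up for the example, and the two series are deterministic toy data chosen only to exercise the arithmetic, not real or genuinely cointegrated prices. The regression includes an intercept, which is an assumption, and the symmetric long/short rule is one common reading of Step 6.

```matlab
% Toy data (assumption): stockA is tied to stockB with a true ratio of 2.
n = 250;
t = (1:n)';
stockB = 50 + cumsum(sin(0.7 * t));            % oscillating stand-in for a price series
stockA = 5 + 2 * stockB + 0.5 * cos(1.3 * t);  % 2 * stockB plus a bounded disturbance

% Step 3: OLS of A on B (with intercept), using data up to the day before.
past = 1:n-1;
X = [ones(numel(past), 1), stockB(past)];
coef = X \ stockA(past);       % coef(1) = intercept, coef(2) = hedge ratio
hedgeRatio = coef(2);

% Step 4: the spread implied by the hedge ratio.
spread = stockA - hedgeRatio * stockB;

% Step 6: how far is today's spread from its historical average?
zEntry = 1.5;                  % entry threshold in standard deviations
z = (spread(end) - mean(spread(past))) / std(spread(past));
if z > zEntry
    signal = -1;               % short the spread: short A, long hedgeRatio * B
elseif z < -zEntry
    signal = 1;                % long the spread: long A, short hedgeRatio * B
else
    signal = 0;                % inside the band: no trade
end
fprintf('hedge ratio = %.4f, z = %.2f, signal = %d\n', hedgeRatio, z, signal);
```

Using only the history up to the previous day for both the regression and the mean and standard deviation keeps today's observation out of the statistics it is compared against.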

These are simple steps. One should put more work and research into them to develop a practical trading strategy.
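The post does not spell out how the half-life in Step 5 is computed. One common estimator, assumed here, treats the spread as a discrete Ornstein-Uhlenbeck (AR(1)) process: regress the daily change of the spread on its lagged level and convert the slope lambda into a half-life of -log(2)/lambda. The sketch below uses a noise-free toy spread so the arithmetic is easy to check. All names are hypothetical.

```matlab
% Toy spread (assumption): decays toward mu, closing 10% of the gap each step.
n = 60; mu = 10; phi = 0.9;
spread = zeros(n, 1);
spread(1) = mu + 4;
for k = 2:n
    spread(k) = mu + phi * (spread(k-1) - mu);
end

% Regress the change in the spread on its lagged level (with intercept).
dS = diff(spread);                 % spread(t) - spread(t-1)
lagS = spread(1:end-1);            % spread(t-1)
X = [ones(n-1, 1), lagS];
coef = X \ dS;
lambda = coef(2);                  % mean-reversion speed; negative if mean reverting

if lambda < 0
    halfLife = -log(2) / lambda;   % in units of the sampling interval
else
    halfLife = Inf;                % no mean reversion detected
end
fprintf('lambda = %.4f, half-life = %.2f periods\n', lambda, halfLife);
```

The half-life comes out in the same time unit as the data, so daily prices give a half-life in trading days.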

Some interesting papers are here:

Cointegration Paper I

Cointegration Carol Alexander


MATLAB Code Here:

MATLAB COINTEGRATION CODE

CointPairsTrade.m is the main function and calls all other functions
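Going by the input description quoted in the comments below (PriceMat is an Ndays x NumStocks matrix of prices, Symbols is a NumStocks x 1 cell array of symbols), the two inputs could be built along these lines. The tickers and prices are made up for illustration. The exact call signature of CointPairsTrade.m is not shown in this post, so the sketch stops at constructing the variables.

```matlab
% Hypothetical tickers and prices, for illustration only.
Symbols = {'AAA'; 'BBB'};                  % NumStocks x 1 cell array of ticker strings
pricesA = [10.0; 10.2; 10.1; 10.4; 10.3];  % one price per day for the first stock
pricesB = [20.1; 20.3; 20.2; 20.9; 20.5];  % same dates, second stock

PriceMat = [pricesA, pricesB];             % Ndays x NumStocks, one column per symbol
[Ndays, NumStocks] = size(PriceMat);

% The columns of PriceMat must line up with the entries of Symbols.
assert(NumStocks == numel(Symbols), 'PriceMat needs one column per symbol');
fprintf('%d days of prices for %d stocks\n', Ndays, NumStocks);
```

Prices kept in a spreadsheet would first need to be read into column vectors like pricesA and pricesB, with the rows aligned on the same dates.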

Please let me know if there are any bugs

43 comments:

sjev said...

Excellent post! I've been looking into pair trading for the past couple of months, during which I've dug through lots of reading material. Your post sums it up very nicely and the papers you've provided are much better than everything I've seen so far.
I'll try to take a look at your code and benchmark it against my own.

Stochastic Universe said...

Could you please post your PriceMat and Symbols files too, so I can back-out your output file. Thanks for posting this!

Eric D said...

Very good post....I'm an independent stock & futures trader & been doing a fair bit of work on pairs trading with Matlab. I use it mainly as a research tool for a lot of the strategies I run. I had a couple of comments/questions

1) You can cut the # of calls to the TestForCoint function if you only call it when jdx > idx around line 34. That cuts the runtime down a lot.

2) Using xlswrite fails (on my machine) on large datasets. I used the fprintf function without any problems instead. I have a function I wrote to create flat files with headers using fprintf I can send you if you want.

3) In the TestForCoint code it looks like you're comparing the absolute values of the adf test to the critical value for Hx & Hy. I think the less than sign should be a greater than sign since you're taking the absolute value of the test statistic, or the absolute value can be eliminated. Am I understanding things right or did I miss something?

I would be open to collaborating on some ideas/concepts etc if you're interested. If so, email me and we can work on some things together.

Regards,
Eric

Amit R said...

Great Post. This was really helpful. Can you please post your PriceMat and Symbols files as well.

gamma_sf said...

I am curious to see if you have the PriceMat and Symbol data template for back testing. Thank you for the great work.

Anonymous said...

Could you please add the TestForCoint.m in the file? It's missing.

Many thanks!

Eric D said...

The Matlab File Exchange has code to pull data from Yahoo so you can create any price file you wish.

Link to code:
http://www.mathworks.com/matlabcentral/fileexchange/23569

Also, on my earlier comment (point #3), I re-read the hypothesis testing methods for cointegration testing and the author's methods appear correct. I stand corrected.

Regards,
Eric

Human Rhythm said...

Looks cool. I am only thinking about the robustness though. You are using

- - -
Step 1: Check if the two stocks are at least integrated of order 1, I(1). This is done with the Augmented Dickey-Fuller test
- - -

ADF does not do it for the time series with structural breaks. After subprime, e.g. Zivot & Andrews might be more reliable as it would accept one NON-specified break...

What do you think, how important a topic?

HR

Eric D said...

@ Human Rhythm:
Thanks for that link. I'll dive into that a lot more.

Most of the pairs trading models I've worked on have died due to a structural break, so I think this concept is very important.

Most of my strategies have been net profitable until the break came & then I gave back most of what I had earned.

Thanks again for the link.

Eric

Kourosh said...

The concepts presented here are covered (almost verbatim) in E.P.Chan's quantitative trading book, along with similar Matlab examples. What would be really valuable is to see some extension of this work for determining cointegration in a multivariate system, perhaps using the Johansen methodology - which is also a part of the Econometrics Toolbox.

Anonymous said...

The 'TestForCoint' file is not present in the winzip

Rohish said...

Hey Eric,

Can you post the section of code you are talking about? I found the model works with the current sign but not after changing < to >. I could be wrong.

Human Rhythm said...

Eric:

I had some extra time, so here are some references for unit root testing with structural breaks.

http://rwalks.wordpress.com/

HR

Aris said...

Hi just a suggestion since you have included some of the spatial econometrics codes in your cointegration file, why not make full use of the library. For example in line 69 of the CointPairsTrade.m you could use the ols regress function to find beta and the residual instead of beta = X\Y. and res = Y-X*beta.

Alexandre Rubesam said...

First, I'd like to say this is a really nice blog. One comment: in the code you apparently include an option for estimating the regression with a constant (which is commented out). To me it seems that the correct regression would not include a constant. Could you comment on that pls?

Also Aris, ols or regress is unbelievably slower than what is done in the code.

sean said...

Hello,
Great website… I am having trouble implementing the pairs trading cointegration code. I am using your files along with files I got from the spatial econometrics website. (MATLAB told me I needed to add more supporting functions to the path in order to execute this code.) I am really close, but an error I can't get past is showing up in the support file mprint.m
Here is the MATLAB output:
Augmented DF test for co-integration variables: variable 1,variable 2
??? Undefined function or variable "version".

Error in ==> mprint at 270
[version,junk] = version; vers = str2num(version);

Error in ==> prt_coint at 173
mprint(tmp,in);

Error in ==> demonstratecadf at 17
prt_coint(res);


the Error in ==> mprint at 270 is the problem, [version, junk] in mprint is causing me a lot of problems and time. Could you please push me in the right direction? Sincerely Many thanks.

Anonymous said...

I'm having a problem with large datasets. I'm running into the typical "Warning: Matrix is singular to working precision" error in the call to adf (the provided augmented dickey-fuller), which usually means a large matrix is being inverted. I think these two lines need to be re-written somehow. I'm a little sketchy about what they're trying to do though, it doesn't look like a standard case of "replace with a \ and everything's good" that I've seen. Any advice?

line 74:
b = inv(z'*z)*(z'*dep);

and line 82:
var_cov = so*inv(z'*z);

sjev said...

Coming back almost a year after my first comment, I must say that I've abandoned classical pair trading altogether. The DF test does not seem to perform well and the OU half-life estimator seems totally useless. I'm not here to complain about your code (which is very good btw), it is just that I was not able to get any decent results with DF or OU in the past year. Some other types of pair trading seem to work fairly well though. I'm very curious if somebody here managed to find a strategy using the DF test with a Sharpe above 2. Anybody?

Anonymous said...

The johansen.m file is not in the zip file posted on your blog

Anonymous said...

Can someone please post the code or teach us how in the world you find PriceMat,Symbols,Ndays,NumStocks.....It's such a waste to have this package put together yet nobody can figure it out without an MBA in Matlab. Thank you so much for your time!

Anonymous said...

In your Matlab code your spread is calculated in the following manner (different from the post):
Spread = Ticker2 - beta * Ticker1

Anonymous said...

Hi,
The Half Life you get from Matlab - is this in days or some other time unit?

Thanks a lot

Anonymous said...

Hello, thank you for your post. I have a project i would like to discuss with you. Can you please contact me at rbertematti@gmail.com

Anonymous said...

Sir, I am working in the Indian market. Do you have any pair trading software using statistical tools like correlation, cointegration etc., so that I can easily work in all Indian markets like NSE, NCDEX, MCX, MCX-SX etc.? Waiting for a favourable answer at AMITGOYAL4U@YAHOO.COM

Anonymous said...

Thanks for sharing. Have a couple of questions:
1. what is the size of sample for the cointegration estimation? Different sizes may give different results.
2. how does the model perform historically (back testing?)

Louis said...

Thanks a lot for this post. I am going through the matlab code and I think I found a bug (not sure though):

In the function adf.m, if the correct regression for ADF test is: diff(x(t)) = beta*x(t-1) + theta*t + error, then the variable 'dep' and 'z' are defined in the wrong way. Shouldn't the correct way be:

dep = tdiff(x,1);
z = lag(x,k);

Anonymous said...

hello,
i have a very basic question:
how can I build the PriceMat matrix and the Symbols matrix?

Anonymous said...

PriceMat & NumStocks please

Anonymous said...

Hi there... am I correct that due to the hedge ratio most of the pairs will not be dollar neutral? With some of the pairs having quite a large net cash bias? Thanks.

Anonymous said...

Good Morning
I am a university student.
I saw a video on the subject on the internet and I have become fond of pairs trading.
I have tried to understand the code you wrote, but I have not succeeded.
I wanted to ask if you could explain some parts of the code.
To start with, I did not understand what these variables refer to:

1 PriceMat- A Ndays X NumStocks matrix of Prices
2 Symbols-- A Cell NumStocks X 1 matrix of Symbols

I have two sets of prices for the two stocks in Excel; in which variable should I put these two sets of prices?

this code will help me for educational purposes only.

thank you very much for your cooperation
regards

Anonymous said...

How do I pass in the price time series of 2 stocks?
And what is Symbols?
Please give me an example of how to use these functions.

Marisa Mazo said...

could you send the PriceMat and Symbols files to make a test.Thanks.
