here<\/a>.<\/p>\n\n\n\nThe CSV doesn’t have a header for the first column, so we edit the header and name the first column ‘Date’. Then we can load and format the CSV like this:<\/p>\n\n\n\n
factors = pd.read_csv(\"famafactors.csv\")\nfactors['Date'] = pd.to_datetime(factors['Date'],format='%Y%m%d')\nfactors['Mkt-RF'] = factors['Mkt-RF'] \/ 100\nfactors = factors.set_index('Date')<\/pre>\n\n\n\nNow, we combine the factors and the returns tables:<\/p>\n\n\n\n
factors = factors['2010-01-04':'2020-08-18']\ntable = factors.merge(table, left_index=True, right_index=True, how='inner')\n<\/pre>\n\n\n\n<\/p>\n\n\n\n
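As an alternative to editing the CSV by hand, the unnamed first column can be renamed after loading, since pandas labels a blank header cell `Unnamed: 0`. The sketch below uses a small in-memory CSV as a stand-in for the real factor file and then repeats the inner merge on the date index. The factor values and the NL returns are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for famafactors.csv: the first header cell is blank.
raw = io.StringIO(
    ",Mkt-RF,SMB,HML,RF\n"
    "20100104,1.69,0.79,1.13,0.000\n"
    "20100105,0.31,-0.41,1.24,0.000\n"
    "20100106,0.13,-0.13,0.57,0.000\n"
)

factors = pd.read_csv(raw)
# pandas names a blank header cell 'Unnamed: 0'; rename it instead of editing the file.
factors = factors.rename(columns={"Unnamed: 0": "Date"})
factors["Date"] = pd.to_datetime(factors["Date"], format="%Y%m%d")
factors["Mkt-RF"] = factors["Mkt-RF"] / 100
factors = factors.set_index("Date")

# Made-up daily returns for one ticker; 2010-01-05 is deliberately missing.
table = pd.DataFrame(
    {"NL": [0.012, -0.004]},
    index=pd.to_datetime(["2010-01-04", "2010-01-06"]),
)

# how='inner' keeps only the dates present in both tables.
merged = factors.merge(table, left_index=True, right_index=True, how="inner")
print(merged)
```

Because the join is `inner`, any trading day missing from either table silently disappears from the result, so it is worth comparing `len(merged)` with the lengths of the inputs.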
Step 4: Running regressions to find alpha<\/h2>\n\n\n\n At this point, it’s easy to plot regression lines for individual stocks. If we want to see a plot for NL, we can do this:<\/p>\n\n\n\n
sns.regplot(x='Mkt-RF', y='NL', data=table)<\/pre>\n\n\n\nNow, we can start running regressions. For the first test, we regress on the daily data. Daily returns are noisy, so we should expect a low r-squared for most tickers. However, this gives us a baseline for further analysis.<\/p>\n\n\n\n
To do this, we use the following code:<\/p>\n\n\n\n
regressions = pd.DataFrame(columns=['ticker', 'alpha', 'beta', 'rsquared'])\n\n# start at 6 to skip date and fama factors\nfor i in range(6, len(table.columns)):\n\n    # the dependent variable is the stock's excess return,\n    # so we must subtract the risk free rate\n    y = table[table.columns[i]] - table['_RF']\n    X = table['Mkt-RF']\n\n    # keep only the observations between the 1st and 99th percentile of y\n    ro = y.between(y.quantile(.01), y.quantile(.99))\n    y = y[ro]\n    X = X[ro]\n\n    # add an intercept column so that the model estimates alpha\n    X = sm.add_constant(X)\n\n    model = sm.OLS(y, X).fit()\n\n    regressions.loc[i-6] = [\n        table.columns[i],\n        model.params['const'],\n        model.params['Mkt-RF'],\n        model.rsquared\n    ]\n<\/pre>\n\n\n\nNote that we filtered out outliers here: any observation in the top or bottom 1% of the stock’s excess returns is dropped. It’s up to you whether you think that makes sense in this particular context.<\/p>\n\n\n\n
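To get a feel for what the 1%/99% filter does, here is a minimal sketch on synthetic data. The return series, the injected outliers, and the `fit_capm` helper are all hypothetical, and `np.polyfit` stands in for `sm.OLS` only to keep the example dependency-light.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic excess returns (assumption: true alpha 0, true beta 1.2).
mkt = pd.Series(rng.normal(0.0004, 0.01, 2500), name="Mkt-RF")
y = 1.2 * mkt + rng.normal(0, 0.015, 2500)
y.iloc[:5] += 0.5  # inject a few extreme one-day moves


def fit_capm(y, x):
    """Return (alpha, beta, r-squared) from a one-factor least-squares fit."""
    beta, alpha = np.polyfit(x, y, 1)
    resid = y - (alpha + beta * x)
    r2 = 1 - resid.var(ddof=0) / y.var(ddof=0)
    return alpha, beta, r2


# Same filter as in the loop: keep the middle 98% of y.
ro = y.between(y.quantile(.01), y.quantile(.99))

print("all rows:", fit_capm(y, mkt))
print("trimmed: ", fit_capm(y[ro], mkt[ro]))
```

Comparing the two printed rows shows how much a handful of extreme days can move the fit, which is one way to decide whether trimming is appropriate for a given dataset.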
For a sanity check, we can first run the regressions against a list of ETFs. Broad-market ETFs (S&P 500 ETFs, for example) should come out with the highest r-squared, since they should be nearly 100% correlated with Mkt-RF. Here is the output, sorted by r-squared from lowest to highest:<\/p>\n\n\n\n
 ticker alpha beta rsquared\n124 CORP -0.002448 0.000005 1.215519e-10\n1008 SHY -0.002567 -0.000072 5.407849e-08\n505 IBND -0.002573 0.000222 1.326580e-07\n758 NOM -0.002519 0.001175 9.270459e-07\n653 LQD -0.002432 0.000688 2.020755e-06\n ... ... ... ...\n1071 SSO -0.002655 1.920029 9.420202e-01\n1060 SPXS -0.002690 -2.859140 9.637690e-01\n1061 SPXU -0.002657 -2.854800 9.639438e-01\n1059 SPXL -0.002696 2.873916 9.641320e-01\n1134 UPRO -0.002690 2.881056 9.644571e-01<\/code><\/pre>\n\n\n\nAs you can see, the best correlated are UPRO and SPXL, which are both triple-long S&P 500 ETFs. The beta for both of these tickers is nearly 3, which makes sense. SPXU and SPXS, the triple-short S&P 500 ETFs, come next, with betas of nearly -3. At the other end of the list, with an r-squared of essentially zero, is CORP, a corporate bond ETF.<\/p>\n\n\n\n
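The ordering of that listing comes from sorting the regressions table by its rsquared column. A minimal sketch, using three rows copied from the ETF output:

```python
import pandas as pd

# Three rows taken from the ETF output.
regressions = pd.DataFrame(
    {
        "ticker": ["UPRO", "CORP", "SPXS"],
        "alpha": [-0.002690, -0.002448, -0.002690],
        "beta": [2.881056, 0.000005, -2.859140],
        "rsquared": [9.644571e-01, 1.215519e-10, 9.637690e-01],
    }
)

# Ascending sort: least correlated first, most correlated last.
ranked = regressions.sort_values("rsquared")
print(ranked)

# r-squared ignores the sign of beta, so inverse ETFs rank next to the long ones.
print(ranked.tail(2)["ticker"].tolist())
```

Sorting by `rsquared` rather than `beta` is what puts the triple-short funds beside the triple-long ones: both track the index closely, just in opposite directions.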
Now, we can return to using stock tickers. After running the code again, regressions is a table of all tickers with their daily alpha, beta, and r-squared values. We can sort it by r-squared to get the least and most correlated tickers:<\/p>\n\n\n\n
ticker alpha beta rsquared\n1015 GFI -0.002056 0.003114 8.205919e-07\n2578 WHLM -0.002060 -0.009481 6.090840e-06\n650 CVR -0.002334 0.006751 1.254616e-05\n1948 PRPH -0.001908 0.010896 1.521615e-05\n2355 THM -0.002350 -0.019028 1.782250e-05\n ... ... ... ...\n32 ACN -0.002495 0.974155 5.241445e-01\n843 EV -0.002993 1.272874 5.278900e-01\n2401 TROW -0.002700 1.121679 5.570539e-01\n132 AMP -0.002748 1.392772 5.767456e-01\n337 BLK -0.002516 1.229884 5.860815e-01<\/code><\/pre>\n\n\n\nYou can see that the least correlated ticker is GFI, which is Gold Fields Limited, one of the largest gold mining firms. Obviously, gold related instruments are going to move very differently from most equities. The most closely correlated is BlackRock.<\/p>\n\n\n\n
We can look at these on a chart (the S&P 500 is in orange, BlackRock is in blue, and GFI is in green):<\/p>\n\n\n\nChart of alpha, beta correlations for BlackRock, GFI, and the S&P 500. From Yahoo Charts.<\/figcaption><\/figure>\n\n\n\nResampling to Monthly<\/h3>\n\n\n\n Running these regressions on daily data may be interesting, but not particularly useful. Due to the error inherent in CAPM, we need to look at longer time periods. So we now resample the data to monthly.<\/p>\n\n\n\n
First, we load the monthly factor data, which is in a slightly different format.<\/p>\n\n\n\n
factors = pd.read_csv(\"d:\\\\famafactorsmonthly.csv\")\nfactors['Date'] = pd.to_datetime(factors['Date'],format='%Y%m')\nfactors['Mkt-RF'] = factors['Mkt-RF'] \/ 100\nfactors = factors.set_index('Date')<\/pre>\n\n\n\nEach row of the monthly factor data describes a full month, so it should be dated at the end of that month. Unfortunately, parsing the date with '%Y%m' places each row on the first day of the month. So we need to change the dates to end of month:<\/p>\n\n\n\n
factors.index = factors.index.to_period('M').to_timestamp('M')\n<\/pre>\n\n\n\nNow, we can run the ETF test again on the monthly time scale. And here are our results:<\/p>\n\n\n\n
ticker alpha beta rsquared\n1012 SJB -0.059726 0.005377 0.000010\n96 CCZ -0.049075 -0.020540 0.000076\n426 GDXJ -0.066987 0.096735 0.000570\n753 NMI -0.053687 0.055460 0.000775\n1266 YCL -0.065084 0.057360 0.000893\n ... ... ... ...\n1105 TQQQ -0.051655 3.366393 0.625342\n1131 UMDD -0.068596 3.494975 0.648961\n684 MIDU -0.069756 3.542325 0.658577\n1059 SPXL -0.061149 3.023759 0.666595\n1134 UPRO -0.060995 3.033314 0.667091<\/code><\/pre>\n\n\n\nSJB is a short bond ETF, so it makes sense that it would be uncorrelated to equities. Meanwhile, we see the same ETFs at the top of the list.<\/p>\n\n\n\n
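The resampling of the returns table itself is not shown in this section. One common way to do it, assuming the table holds simple daily returns, is to compound them within each calendar month and label each row with the month-end date so that it lines up with the factor index. The `daily` frame below is synthetic, and this is a sketch of one reasonable approach rather than the exact code used for the results here.

```python
import numpy as np
import pandas as pd

# Synthetic daily simple returns for one hypothetical ticker (business days only).
days = pd.bdate_range("2020-01-01", "2020-03-31")
rng = np.random.default_rng(1)
daily = pd.DataFrame({"NL": rng.normal(0.0005, 0.02, len(days))}, index=days)

# Compound within each calendar month: (1 + r1)(1 + r2)... - 1
monthly = (1 + daily).groupby(daily.index.to_period("M")).prod() - 1

# Label each month with its last calendar day, matching the factor index.
monthly.index = monthly.index.to_timestamp(how="end").normalize()
print(monthly)
```

Grouping by `to_period("M")` avoids depending on a particular resample frequency alias, and compounding (rather than summing) keeps the monthly figure consistent with simple returns.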
Now, let’s look at the list of equities, this time sorting by alpha:<\/p>\n\n\n\n
ticker alpha beta rsquared\n2380 TOPS -0.209049 1.177885 0.028528\n448 CEI -0.189496 1.496256 0.035676\n684 DCTH -0.184497 0.085347 0.000182\n1012 GEVO -0.184126 2.639587 0.261636\n1709 NSPR -0.181267 2.010658 0.112558\n ... ... ... ...\n784 EHTH -0.008732 -0.119886 0.000704\n2306 TAL -0.006959 0.486709 0.023229\n847 EVI -0.006877 0.974036 0.030031\n1253 INSG -0.004456 1.306819 0.047048\n728 DQ -0.003193 1.698251 0.076006<\/code><\/pre>\n\n\n\nSo here we go, the alphas of all stocks — and there’s one thing we notice immediately: they’re all less than 0. <\/p>\n\n\n\n
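A uniformly negative alpha is worth checking against the units of the inputs before reading it as a finding. The loading code divides only `Mkt-RF` by 100. If `_RF` was taken from the same factor file without rescaling (an assumption, since its construction is not shown in this section), it would still be in percent while the stock returns are decimals, and the regression intercept would shift down by the size of that mismatch. The synthetic example below illustrates the effect; every number in it is made up, and `np.polyfit` stands in for `sm.OLS`.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 120  # ten years of hypothetical monthly observations

rf_pct = np.full(n, 0.05)   # risk-free rate in percent (0.05% per month)
rf_dec = rf_pct / 100       # the same rate as a decimal
mkt_excess = rng.normal(0.008, 0.04, n)
# Stock built with true alpha = 0 and beta = 1.1 in decimal units.
stock = rf_dec + 1.1 * mkt_excess + rng.normal(0, 0.05, n)


def alpha_beta(y, x):
    """One-factor least-squares fit; returns (intercept, slope)."""
    beta, alpha = np.polyfit(x, y, 1)
    return alpha, beta


a_ok, b_ok = alpha_beta(stock - rf_dec, mkt_excess)
a_bad, b_bad = alpha_beta(stock - rf_pct, mkt_excess)

print(f"RF as decimal: alpha={a_ok:.4f}, beta={b_ok:.3f}")
print(f"RF in percent: alpha={a_bad:.4f}, beta={b_bad:.3f}")
print(f"alpha shift:   {a_bad - a_ok:.4f}")
```

With a constant risk-free rate the slope is unaffected and only the intercept moves, so a quick look at `table['_RF'].describe()` is enough to rule this out.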
Testing the Fama-French Model<\/h2>\n\n\n\n One thing we learn from this is that CAPM is not a very good fit for current stock prices. When we run a regression, our r-squared value maxes out at less than 0.5 for non-ETFs, meaning the market factor alone explains less than half of the variance in those returns.<\/p>\n\n\n\n
There is another popular and more recent model called the Fama-French<\/a> model. This model adds additional variables to the CAPM equation. The Fama-French equation is:<\/p>\n\n\n\n
<em>R<sub>i<\/sub> – R<sub>f<\/sub> = α<sub>i<\/sub> + β<sub>i<\/sub>(R<sub>m<\/sub> – R<sub>f<\/sub>) + s<sub>p<\/sub>SMB + h<sub>p<\/sub>HML + ε<sub>i<\/sub><\/em><\/p>\n\n\n\nWhere the alpha, beta, and epsilon terms remain the same, but two new terms are added:<\/p>\n\n\n\n
<em>s<sub>p<\/sub>SMB<\/em>, which is a coefficient <em>s<sub>p<\/sub><\/em> multiplied by SMB (“small minus big”), a precalculated value of the difference in returns between portfolios of small and big companies<\/p>\n\n\n\n<em>h<sub>p<\/sub>HML<\/em>, which is a coefficient <em>h<sub>p<\/sub><\/em> multiplied by HML (“high minus low”), a precalculated value of the difference in returns between portfolios with the highest and the lowest book-to-market ratios.<\/p>\n\n\n\nThe creators of this model publish the values of HML and SMB at the link above.<\/p>\n\n\n\n
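Extending the earlier regression to three factors only means adding the SMB and HML columns to the design matrix. The sketch below does so on synthetic data. The factor values, the loadings, and the use of `np.linalg.lstsq` in place of `sm.OLS` are all choices made for illustration. With the real factor file, the `SMB` and `HML` columns would presumably need the same division by 100 that is applied to `Mkt-RF`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 240

# Synthetic monthly factor returns in decimal units (hypothetical values).
factors = pd.DataFrame(
    {
        "Mkt-RF": rng.normal(0.006, 0.045, n),
        "SMB": rng.normal(0.001, 0.03, n),
        "HML": rng.normal(0.002, 0.03, n),
    }
)

# Hypothetical stock: alpha = 0.002, beta = 1.1, s = 0.5, h = -0.3.
excess = (
    0.002
    + 1.1 * factors["Mkt-RF"]
    + 0.5 * factors["SMB"]
    - 0.3 * factors["HML"]
    + rng.normal(0, 0.02, n)
)

# Design matrix: a column of ones (for alpha) followed by the three factors.
X = np.column_stack([np.ones(n), factors[["Mkt-RF", "SMB", "HML"]].to_numpy()])
coef, *_ = np.linalg.lstsq(X, excess.to_numpy(), rcond=None)

# r-squared = 1 - (residual sum of squares / total sum of squares)
fitted = X @ coef
ss_res = ((excess - fitted) ** 2).sum()
ss_tot = ((excess - excess.mean()) ** 2).sum()

result = pd.Series(coef, index=["alpha", "beta", "s", "h"])
result["rsquared"] = 1 - ss_res / ss_tot
print(result)
```

In the statsmodels version of the loop, the equivalent change would be to pass `table[['Mkt-RF', 'SMB', 'HML']]` as `X` before calling `sm.add_constant`, assuming those column names survive the merge.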
<\/p>\n\n\n\n
Conclusion<\/h2>\n\n\n\n There we have it, we have used Python and Pandas to find alphas for each stock in our dataset. From here, we can start looking into using these values for strategies, such as Mean-Variance Optimization, and basic statistical arbitrage.<\/p>\n