I am a Quantitative Analyst/Developer and Data Scientist with backgroud of Finance, Education, and IT industry. This site contains some exercises, projects, and studies that I have worked on. If you have any questions, feel free to contact me at ih138 at columbia dot edu.
###Assumption
The dependent variable –> A linear relationship with just one independent variable.
Ex) CAPM or single factor model.
The excel implementation is introduced in the book. The result will be similar to the following:

The book contains only excel implementation. If it is implemented with matlab, it will be similar to the following:
clear;
clc;
[num, txt] = xlsread('file', 'SheetName');
%% clean data
txtday = txt(2:end, 1);
datenumday = datenum(txtday, 'mm/dd/yyyy');
datestrday = datestr(datenumday, 'yyyymmdd');
tday = str2double(cellstr(datestrday));
idx = num(:,end);
%% Since the data is price, change it to return
r = diff(log(num));
%% Separate data
x = r(:,2); %SP
y = r(:,1); %Amex
list = {'R','beta', 'adjrsquare','rsquare', 'tstat'};
stats = regstats(y, x, 'linear', list);
%% Organize the result
cfs = stats.beta;
arsq = stats.adjrsquare;
rsq = stats.rsquare;
t = stats.tstat.t;
pv = stats.tstat.pval;
%% graph
p = polyfit(x,y,1);
f = polyval(p,x);
figure;
plot(x,y,'.');
hold on
plot(x,f,'-r');
xlabel('SP500');
ylabel('AMEX');
text(-0.02,0.6, ['y= ' , num2str(cfs(2)), 'x + ', num2str(cfs(1)), ', R^2 = ', num2str(rsq)]);


# Simple linear regression
# Data read from excel file
library(gdata)
data = read.xls("Filename", sheet = 2, header = TRUE) 
# See how it looks
names(data)
attach(data)
layout(1)
plot(SP500, Amex)
# Subsetting data
tickers = c("Amex", "SP500")
prices = data[tickers] 
returns = data.frame(diff(as.matrix(log(prices))))
# Regression
L1 = lm(returns$Amex ~ returns$SP500)
summary(L1)
# graph
plot(returns$Amex ~ returns$SP500)
abline(L1, col="red")
 

import numpy as np
from scipy import stats
import pandas as pd
dts = pd.read_excel('filep path', 'Sheet2', index_col=0)
log_dts = np.log(dts)
log_rts = log_dts[1:].values-log_dts[:-1]
slope, intercept, r_value, p_value, std_err = stats.linregress( log_rts["SP500"], log_rts["Amex"])
print slope
print intercept
print "r-squared:", r_value**2
print "std error:", std_err
degrees_of_freedom = len(log_rts)-2
predict_y = intercept + slope*log_rts["Amex"]
pred_error = log_rts["SP500"] - predict_y
residual_std_error = np.sqrt(np.sum(pred_error**2)/degrees_of_freedom)
import numpy as np import pandas as pd import statsmodels.api as sm y = log_rts.Amex # response X = log_rts.SP500 # predictor X = sm.add_constant(X) # Adds a constant term to the predictor est = sm.OLS(y, X) est = est.fit() est.summary()
The summary result shows :

[References] [1] Alexander, Carol. Market Risk Analysis. Vol. I. Chichester, England: Wiley, 2008. Print.