I am a Quantitative Analyst/Developer and Data Scientist with a background in the finance, education, and IT industries. This site contains some exercises, projects, and studies that I have worked on. If you have any questions, feel free to contact me at ih138 at columbia dot edu.
This post implements the LDA example from "An Introduction to Statistical Learning: With Applications in R," Chapter 4 Lab.
After obtaining results from logistic regression, LDA is tried to see whether it gives a better result.
The response is "Direction", and the predictors are "Lag1" and "Lag2".
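The training and test arrays used below are assumed to come from the Smarket data set, split by year as in the ISLR lab. A minimal sketch of that preparation (the file name Smarket.csv and the exact split are assumptions, not part of the original code):
import pandas as pd
# Assumed preparation: train on years 2001-2004, test on 2005, as in the ISLR lab.
smarket = pd.read_csv('Smarket.csv')
train = smarket['Year'] < 2005
X_train = smarket.loc[train, ['Lag1', 'Lag2']].values
X_test = smarket.loc[~train, ['Lag1', 'Lag2']].values
y_resp_train = smarket.loc[train, 'Direction'].values
y_resp_test = smarket.loc[~train, 'Direction'].values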
from sklearn.lda import LDA
# Fit LDA with Direction as the response and Lag1, Lag2 as predictors,
# then score it on the held-out test set.
lda = LDA()
lda.fit(X_train, y_resp_train)
lda.score(X_test, y_resp_test)
# 0.55952380952380953 correct rate
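Beyond the overall accuracy, the individual predictions can be checked against the actual directions, for example with a confusion matrix (an illustrative sketch using sklearn.metrics; not part of the original code):
from sklearn.metrics import confusion_matrix
# Rows: actual Direction on the test set; columns: LDA predictions.
lda_pred = lda.predict(X_test)
print(confusion_matrix(y_resp_test, lda_pred))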
from sklearn.qda import QDA
# Fit QDA on the same training data and compare the test accuracy.
qda = QDA()
qda.fit(X_train, y_resp_train)
qda.score(X_test, y_resp_test)
# 0.59920634920634919 correct rate
# Group means, prior probabilities, per-class rotations, and class labels
# estimated by QDA.
qda.means_
qda.priors_
qda.rotations_
qda.classes_
# Prior probabilities and coefficients of the linear discriminants from LDA.
lda.priors_
lda.coef_
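As in the R lab, the posterior probabilities behind these predictions can also be inspected (a minimal sketch; predict_proba is the scikit-learn counterpart of the posterior output in R):
# Posterior probability of each class for the first few test observations;
# predict() assigns the class whose posterior probability is largest.
print(lda.classes_)
print(lda.predict_proba(X_test)[:5])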
from sklearn.neighbors import KNeighborsClassifier
# K-nearest neighbors with K = 3 on the same predictors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_resp_train)
knn.predict(X_test)
knn.score(X_test, y_resp_test)
# 0.53174603174603174 correct rate
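K = 3 is just one setting; a quick loop over a few values of K shows how sensitive the test accuracy is to the neighborhood size (an illustrative sketch, not part of the original code):
# Try several neighborhood sizes and report the test accuracy for each.
for k in (1, 3, 5, 10):
    knn_k = KNeighborsClassifier(n_neighbors=k)
    knn_k.fit(X_train, y_resp_train)
    print(k, knn_k.score(X_test, y_resp_test))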
Among these classification methods, QDA achieves the best correct rate on this dataset.
[References] [1] James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning: With Applications in R. Springer, 2013. Print.
[2] Hauck, Trent. Scikit-learn Cookbook. Packt, 2014. Print.