Chris IJ Hwang

I am a Quantitative Analyst/Developer and Data Scientist with a background in finance, education, and the IT industry. This site contains exercises, projects, and studies that I have worked on. If you have any questions, feel free to contact me at ih138 at columbia dot edu.




Contents

LDA

Implementation of LDA as introduced in the Chapter 4 lab of An Introduction to Statistical Learning: With Applications in R [1].

After obtaining the results from logistic regression, LDA is tried to see whether it gives a better result.

Using scikit-learn

Direction is used as the response, with Lag1 and Lag2 as the predictors.
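
The code below assumes X_train, y_resp_train, X_test, and y_resp_test have already been built from the Smarket data. A minimal data-preparation sketch, assuming the data has been exported to a local "Smarket.csv" file (the file name is an assumption) and split by year as in the ISLR lab (observations before 2005 for training, 2005 for testing):

import pandas as pd

# Load the Smarket data (hypothetical local CSV export of the ISLR dataset)
smarket = pd.read_csv("Smarket.csv")

# Train on years before 2005, test on 2005, as in the ISLR Chapter 4 lab
train_mask = smarket["Year"] < 2005

X_train = smarket.loc[train_mask, ["Lag1", "Lag2"]]
X_test = smarket.loc[~train_mask, ["Lag1", "Lag2"]]
y_resp_train = smarket.loc[train_mask, "Direction"]
y_resp_test = smarket.loc[~train_mask, "Direction"]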


from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
# (older scikit-learn versions used `from sklearn.lda import LDA`)

lda = LDA()
lda.fit(X_train, y_resp_train)
lda.score(X_test, y_resp_test)
# 0.55952380952380953 correct rate
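
Beyond the overall correct rate, a confusion matrix shows how the predictions split between the "Down" and "Up" classes. A short sketch using scikit-learn's metrics module (this step is not part of the original code):

from sklearn.metrics import confusion_matrix

# Rows are the true classes, columns the predicted classes
y_pred = lda.predict(X_test)
print(confusion_matrix(y_resp_test, y_pred, labels=lda.classes_))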

QDA

Using scikit-learn


from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA
# (older scikit-learn versions used `from sklearn.qda import QDA`)

qda = QDA()
qda.fit(X_train, y_resp_train)
qda.score(X_test, y_resp_test)
# 0.59920634920634919 correct rate

qda.means_      # class-wise means of Lag1 and Lag2
qda.priors_     # class prior probabilities
qda.rotations_  # per-class rotations of the fitted Gaussians
qda.classes_    # class labels ("Down", "Up")
lda.priors_     # class prior probabilities from the LDA fit
lda.coef_       # coefficients of the linear discriminant

KNN

Using scikit-learn


from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_resp_train)
knn.predict(X_test)
knn.score(X_test, y_resp_test)
# 0.53174603174603174 correct rate
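
The test accuracy of KNN depends on the number of neighbors. As a quick check (not part of the original lab code), one can sweep a few values of k and compare the scores:

# Hypothetical sweep over a few neighborhood sizes
for k in (1, 3, 5, 10):
    knn_k = KNeighborsClassifier(n_neighbors=k)
    knn_k.fit(X_train, y_resp_train)
    print(k, knn_k.score(X_test, y_resp_test))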

Among the classification methods tried on this dataset, QDA gives the best correct rate (about 0.599), compared with roughly 0.560 for LDA and 0.532 for KNN.



References

[1] James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning: With Applications in R. Springer, 2013. Print.

[2] Hauck, Trent. Scikit-learn Cookbook. Packt Publishing, 2014. Print.