2015 Fall Data Science Term Project
Team: Metalicus
Input Files:
FLAT_RCL_Out_14.txt: training data file
FLAT_RCL_Out_15.txt: prediction data file
stopwords_long.txt: list of stop words to filter out as unnecessary
2014RecallNo_NoClassification.csv: classification data for 2014 (recalls with no classification)
2014RecallNo_Software.csv: classification data for 2014 (software recalls)
2014RecallNo_nonSoftware.csv: classification data for 2014 (non-software recalls)
Main File:
bayes.py: main Python file; its skeleton code comes from "Machine Learning in Action" by Peter Harrington, Chapter 4
Output File:
predictionResult.txt
How to Run:
Start Python in your console from the directory containing the files above, then enter:
import bayes
bayes.run()
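
For readers unfamiliar with the approach, the following is a minimal sketch of naive Bayes text classification with stop-word filtering and add-one smoothing, in the spirit of the chapter that bayes.py is based on. It is not the code in bayes.py: the function names (tokenize, train, classify), the toy recall descriptions, and the small stop-word set are all hypothetical and stand in for the real input files.

```python
import math
import re
from collections import Counter


def tokenize(text, stop_words):
    """Lowercase, split on non-letters, drop stop words and tokens of 2 letters or fewer."""
    return [t for t in re.split(r"[^a-z]+", text.lower())
            if len(t) > 2 and t not in stop_words]


def train(docs, labels, stop_words):
    """Count words per class and return (log priors, word counts, totals, vocabulary)."""
    counts = {}
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text, stop_words)
        counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    n_docs = len(labels)
    log_prior = {c: math.log(k / n_docs) for c, k in Counter(labels).items()}
    totals = {c: sum(counts[c].values()) for c in counts}
    return log_prior, counts, totals, vocab


def classify(text, model, stop_words):
    """Return the class with the highest log posterior (add-one smoothing)."""
    log_prior, counts, totals, vocab = model
    # Words never seen in training are ignored.
    tokens = [t for t in tokenize(text, stop_words) if t in vocab]
    best_label, best_score = None, -math.inf
    for c in log_prior:
        score = log_prior[c]
        for t in tokens:
            score += math.log((counts[c][t] + 1) / (totals[c] + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Hypothetical toy data; the real project reads its data from the input files.
stop_words = {"the", "and", "may", "for"}
docs = [
    "Software defect causes incorrect dose display",
    "Firmware update error in software module",
    "Cracked plastic housing on pump",
    "Sterile packaging seal may be broken",
]
labels = ["software", "software", "nonSoftware", "nonSoftware"]

model = train(docs, labels, stop_words)
print(classify("Software error in display", model, stop_words))
print(classify("Broken seal on housing", model, stop_words))
```

Summing log probabilities instead of multiplying raw probabilities avoids numeric underflow on long texts, and add-one smoothing keeps a word that is absent from one class from forcing that class's probability to zero.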