Data Mining
jdwang@asia.edu.tw, Room:I517, ext:1847
Time: Wednesday, 9:10am~12:00pm, Room: A206
Textbook
Data Mining: Practical Machine Learning Tools and Techniques (2nd Edition),
Ian H. Witten & Eibe Frank. Morgan Kaufmann, June 2005, ISBN 0-12-088407-0
http://www.cs.waikato.ac.nz/ml/weka/book.html
Reference Book
Data Mining: Concepts and Techniques (2nd Edition)
Han & Kamber, 2007, Morgan Kaufmann, ISBN 978-0-12-373905-6 (local distributor: 東華 / Tung Hua)
Web Data Mining: Exploring Hyperlinks, Contents and Usage Data
Bing Liu, Springer, December 2006
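The Development and Challenges of Data Mining (author: 翁慈宗 (Tzu-Tsung Wong), Institute of Information Management, National Cheng Kung University)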
- Install Perl
- Example. 2009_12_8_jdwangForSVMDataProcess
- #Usage: perl ConvertToSVMDataFormat.pl <SrcFile> <NewFile> <Separator>
perl ConvertToSVMDataFormat.pl ClassLabelData.txt ClassLabelData.SVM ","
- #Usage: perl ComputeResultStatistics.pl <TestFile> <PredictFile>
perl ComputeResultStatistics.pl iris.scale iris.scale.predict
- Confusion Matrix
- malibu: an open source, portable machine learning workbench written in C++. Its collection of learning algorithms focuses on supervised learning problems and includes both third-party and native implementations of a number of classification algorithms and wrapper methods. It also provides an extensive set of validation algorithms, metrics, tests and graphs.
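The statistics step above compares a test file with a prediction file. The following is only a rough Python sketch of that kind of computation, not the course's Perl script: it assumes the test file is in LIBSVM format (class label as the first token of each line) and the prediction file holds one predicted label per line. The function names (first_token_labels, confusion_matrix, class_metrics) and the sample lines are made up for illustration.

```python
from collections import Counter


def first_token_labels(lines):
    """Read the class label (first whitespace-separated token) of each non-empty line."""
    return [float(line.split()[0]) for line in lines if line.strip()]


def confusion_matrix(y_true, y_pred):
    """Return (sorted labels, Counter of (true, predicted) pairs)."""
    labels = sorted(set(y_true) | set(y_pred))
    return labels, Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 when one class is treated as the positive class."""
    _, counts = confusion_matrix(y_true, y_pred)
    tp = counts[(positive, positive)]
    fp = sum(n for (t, p), n in counts.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in counts.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical file contents, inlined so the sketch runs without any files.
test_lines = ["1 1:0.2 2:0.5", "1 1:0.1 2:0.4", "1 1:0.6 2:0.9",
              "2 1:0.7 2:0.8", "2 1:0.9 2:0.7", "3 1:-0.5 2:0.1"]
predict_lines = ["1", "1", "2", "2", "2", "3"]

y_true = first_token_labels(test_lines)
y_pred = first_token_labels(predict_lines)
labels, counts = confusion_matrix(y_true, y_pred)
for t in labels:  # one row per true class, one column per predicted class
    print(t, [counts[(t, p)] for p in labels])
print("accuracy:", accuracy(y_true, y_pred))
print("class 2 precision/recall/F1:", class_metrics(y_true, y_pred, positive=2.0))
```

Labels are parsed as floats so that "1" and "1.0" compare equal; if the prediction file carries extra header or probability columns, the parsing would need to be adapted.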
- Reflection report (due 2010.1.15, before 4:00pm)
- Data Preprocessing (Categorical <=> Numerical)
- How can you avoid the over-fitting problem?
- Training Set, Validation Set, Testing Set, k-Fold cross validation
- How can you improve the classification accuracy?
- Attribute Normalization, Classifier Combination
- Accuracy, Precision, Recall, F1, ...
- Classifier Comparison (DT, SVM, Linear SVM, kNN, Neural Network, ...)
- Data Transformation (PCA, LDA, ...)
- Clustering
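As a starting point for the k-fold cross validation topic in the list above, here is a minimal Python sketch. It is an illustration under assumptions, not a prescribed method: the classifier is a plain 1-nearest-neighbour rule chosen only because it needs no library, the toy data are invented, and the helper names (k_fold_indices, predict_1nn, cross_validate) are hypothetical. It assumes there are at least k samples.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to query (squared Euclidean distance)."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return one accuracy per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(predict_1nn(train_x, train_y, xs[i]) == ys[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Toy data (invented): two well-separated groups of 2-D points.
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4),
      (5.0, 5.1), (5.2, 4.9), (4.8, 5.3), (5.1, 5.0), (4.9, 4.7)]
ys = [0] * 5 + [1] * 5

scores = cross_validate(xs, ys, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Because every sample is used for testing exactly once and never appears in its own training set, the mean of the per-fold accuracies is a less optimistic estimate than accuracy measured on the training data, which is one way to notice over-fitting.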