Machine Learning (機器學習)
Time:
Class period: Mon. 567, I533
jdwang@asia.edu.tw
Score
Final Project (Deadline: 2023/6/12)
AWS Academy Machine Learning for Natural Language Processing [36099]
AWS Academy Learner Lab [36100]
References
(English Version)
Python Machine Learning, Sebastian Raschka, Vahid Mirjalili. Packt (3rd Edition), ISBN: 9781789955750
Code Example(183MB)
Text Analytics with Python: A Practitioner's Guide to Natural Language Processing (2nd Edition), Dipanjan Sarkar, 2019-05-22, ISBN-13: 9781484243534
code-downloads
Book_TextAnalyticswithPython_Notes.html
AWS Academy Machine Learning Foundations [35678]
Chapter 1: Giving Computers the Ability to Learn from Data
Chapter 2: Training Simple Machine Learning Algorithms for Classification
Naive Bayes Classifier (Probability model)
Training: Estimate the conditional probability of each variable (feature) given the class, assuming the variables are conditionally independent.
Bayesian Decision Theory
Naive Bayes Classifier (From Tom M. Mitchell)
Naive_Bayes_training_example_Tennis.htm
Naive_Bayes_training_example_Tennis_ans.htm
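The training step described above (counting how often each feature value occurs within each class) can be sketched in a few lines. This is a minimal illustration, not the course's reference solution: the 14-row weather table is written in the style of the play-tennis example, and whether it matches the linked files row for row is an assumption. The function names and the add-one (Laplace) smoothing parameter `alpha` are choices made for this sketch.

```python
from collections import Counter, defaultdict

# Toy weather table in the style of the play-tennis example (assumed rows).
# Columns: outlook, temperature, humidity, wind, play tennis?
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def train(rows):
    """Count class frequencies and feature-value frequencies within each class."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value -> count
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for *features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def posterior_scores(model, features, alpha=1.0):
    """Return the unnormalized score P(class) * prod_i P(feature_i | class)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(features):
            count = value_counts[(i, label)][value]
            # alpha > 0 applies Laplace smoothing; alpha = 0 gives raw frequencies
            score *= (count + alpha) / (n_label + alpha * len(feature_values[i]))
        scores[label] = score
    return scores


def predict(model, features, alpha=1.0):
    """Pick the class with the largest score."""
    scores = posterior_scores(model, features, alpha)
    return max(scores, key=scores.get)


model = train(ROWS)
print(predict(model, ("Sunny", "Cool", "High", "Strong")))
```

Smoothing matters on small tables: without it, a single feature value that never occurs with a class forces that class's score to zero.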
Linear Classifier (Vector Space Model)
Training : Find the hyperplanes that can separate the instances of one class from the other classes
Rocchio Algorithm (Linear Classifier): LinearClassifier_jdwang.xls
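As a rough sketch of the Rocchio idea in the vector space model: each class is represented by the centroid of its training vectors, and a new vector is assigned to the class with the nearest centroid. For two classes, the set of points equidistant from both centroids is a hyperplane, which is why this counts as a linear classifier. The two-dimensional vectors and class names below are made up for illustration and are not taken from the linked spreadsheet.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_rocchio(samples):
    """samples: list of (vector, label). Returns {label: centroid}."""
    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict_rocchio(centroids, vector):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


def separating_hyperplane(c_a, c_b):
    """Return (w, b) such that x is closer to c_a exactly when w . x > b."""
    w = [a - b for a, b in zip(c_a, c_b)]
    b = (sum(a * a for a in c_a) - sum(v * v for v in c_b)) / 2
    return w, b


# Hypothetical term-count vectors: (count of "goal", count of "vote")
SAMPLES = [
    ([3, 0], "sports"),
    ([4, 1], "sports"),
    ([0, 3], "politics"),
    ([1, 4], "politics"),
]

centroids = train_rocchio(SAMPLES)
print(centroids)
print(predict_rocchio(centroids, [3, 1]))
print(separating_hyperplane(centroids["sports"], centroids["politics"]))
```

Text classifiers often compare vectors by cosine similarity rather than Euclidean distance; the distance version is used here only because it makes the hyperplane easy to write down.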
Chapter 3: A Tour of Machine Learning Classifiers Using scikit-learn
Chapter 4: Building Good Training Datasets – Data Preprocessing
Homework 1:
(0) How do you collect your own data from friends (FB, Google, IG, Line)?
=> Collect at least 100 records for 5-fold cross-validation, and make the dataset as robust as possible.
=> Is your dataset robust enough?
Which characteristics (features) do you record? For example:
Nation (country)
Sex (M or F)
Zodiac sign (star sign)
Blood type (A, B, AB, O)
Personal preferences (spicy food? colors? ...)
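To make the 5-fold requirement concrete, here is a minimal sketch of how 100 records could be split into five train/test partitions. The function name `k_fold_indices` and the fixed `seed` are assumptions of this sketch; scikit-learn's `KFold` or `StratifiedKFold` would normally do this job.

```python
import random


def k_fold_indices(n_records, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)        # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]   # k nearly equal slices
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for fold_number, (train, test) in enumerate(k_fold_indices(100, k=5), start=1):
    print(fold_number, len(train), len(test))
```

With 100 records each test fold holds only 20 of them, so a class such as one zodiac sign may appear once or twice per fold; that is one reason to check whether the dataset is large and balanced enough.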
Chapter 5: Compressing Data via Dimensionality Reduction
Chapter 6: Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Middle Project (40%)
(1) Which features do you choose for predicting "Zodiac Signs (Star Signs)" and "Blood Type"?
(2) Classify with your own data (K-fold cross-validation?) and report the confusion matrix, accuracy, and F-measure.
(3) From your own experimental results, can we predict the "Zodiac Signs (Star Signs)" or "Blood Type" of a person?
(4) What would you change to improve your experimental results, if possible?
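The three measures named in item (2) can be computed directly from the true and predicted labels. The sketch below is illustrative only: the label lists are invented, and the helper names are not from any course material. In practice `sklearn.metrics` offers `confusion_matrix`, `accuracy_score`, and `f1_score`.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Map (true label, predicted label) -> number of records."""
    return Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred):
    """F-measure (F1) for each label: 2*TP / (2*TP + FP + FN)."""
    cm = confusion_matrix(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = cm[(label, label)]
        fp = sum(cm[(other, label)] for other in labels if other != label)
        fn = sum(cm[(label, other)] for other in labels if other != label)
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Invented blood-type labels, for illustration only
y_true = ["A", "A", "B", "O", "O", "AB"]
y_pred = ["A", "B", "B", "O", "A", "AB"]

print(confusion_matrix(y_true, y_pred))
print(accuracy(y_true, y_pred))
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

For item (3), it helps to compare these numbers against chance: roughly 1/12 accuracy for twelve equally frequent zodiac signs and 1/4 for four equally frequent blood types.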
Final Project (50%)
Top 10 Cancer Types in PubMed
PubMed_Cancer_jdwang2021_9_28.7z
Topic: Text Classification of Cancers via Medical Articles in the PubMed Corpus
Cancer types (choose at least five)
Resource: PubMed
(Download PubMed articles via the PubMed API)
(Download PubMed articles via Download MEDLINE/PubMed Data)
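One possible way to use the PubMed API is through the NCBI E-utilities endpoints `esearch` (find article IDs for a query) and `efetch` (download the records). The sketch below only builds the request URLs and wraps them in two small helpers; the function names, the search term, and the `retmax` value are assumptions of this sketch, and the network calls are kept behind a main guard.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """URL that searches PubMed and returns matching PMIDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(pmids):
    """URL that fetches the abstracts of the given PMIDs as plain text."""
    params = {
        "db": "pubmed",
        "id": ",".join(pmids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def search_pmids(term, retmax=20):
    """Run the search and return the list of PMIDs (requires network access)."""
    with urlopen(esearch_url(term, retmax)) as response:
        return json.load(response)["esearchresult"]["idlist"]


def fetch_abstracts(pmids):
    """Download the abstracts for the given PMIDs (requires network access)."""
    with urlopen(efetch_url(pmids)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    ids = search_pmids("lung cancer", retmax=5)
    print(fetch_abstracts(ids)[:500])
```

NCBI limits the request rate for clients without an API key, so a bulk download for five cancer types should pause between requests or use the MEDLINE/PubMed bulk files instead.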
Try to train and construct a classifier for determining the class ("cancer type") of PubMed articles
(Python Package : NLTK Natural Language ToolKit)
(bag-of-words)
Term-Frequency (TF)
Inverse-Document-Frequency (IDF)
Term Weighting : TF*IDF
stop-word removal
Word stemming : Porter stemming algorithm
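The bag-of-words steps listed above can be sketched without any external package. This is a simplified illustration: the stop-word list is a tiny hand-picked set, IDF is taken as log(N / df) with no smoothing, and the three sample titles are invented. Stemming is left out so the block runs without NLTK; with NLTK installed, `nltk.stem.PorterStemmer().stem(token)` could be applied to each token inside `tokenize`.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; NLTK ships a much fuller one.
STOP_WORDS = {"the", "of", "in", "a", "and", "is", "for"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: TF*IDF weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # document frequency: in how many documents each term occurs
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


DOCS = [
    "Treatment of lung cancer in the elderly",
    "Breast cancer screening and breast imaging",
    "Lung function is reduced in smokers",
]

for doc_weights in tf_idf(DOCS):
    print(doc_weights)
```

A term that occurs in every document gets weight log(1) = 0 under this formula, which is the intended effect: such a term does not help to tell classes apart.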
Binary classification (sentiment analysis: positive or negative) vs. multi-class classification (which type of cancer is it?)
Limited computing power: out-of-core learning
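Out-of-core learning means the model is updated from a stream of small batches, so the whole corpus never has to fit in memory. The sketch below shows the idea with a hand-written binary perceptron and hashed word features; it is a stand-in, not the course's method. With scikit-learn, the usual route is `HashingVectorizer` together with `SGDClassifier.partial_fit`. The class name `OnlinePerceptron`, the bucket count, and the four-review stream are assumptions of this sketch.

```python
import zlib
from itertools import islice

N_BUCKETS = 2 ** 18  # fixed feature space, so no vocabulary is kept in memory


def hashed_features(text):
    """Map each token to a bucket index (hashing trick) and count occurrences."""
    counts = {}
    for token in text.lower().split():
        bucket = zlib.crc32(token.encode("utf-8")) % N_BUCKETS
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts


def minibatches(stream, size):
    """Yield lists of at most `size` items without loading the whole stream."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class OnlinePerceptron:
    """Binary linear classifier with labels +1 / -1, updated batch by batch."""

    def __init__(self):
        self.weights = {}

    def score(self, features):
        return sum(self.weights.get(b, 0.0) * c for b, c in features.items())

    def predict(self, text):
        return 1 if self.score(hashed_features(text)) > 0 else -1

    def partial_fit(self, batch):
        for text, label in batch:
            features = hashed_features(text)
            if label * self.score(features) <= 0:  # misclassified: adjust weights
                for b, c in features.items():
                    self.weights[b] = self.weights.get(b, 0.0) + label * c


def review_stream():
    """Stand-in for reading a large file line by line."""
    yield "good great", 1
    yield "bad awful", -1
    yield "great fun", 1
    yield "awful boring", -1


classifier = OnlinePerceptron()
for batch in minibatches(review_stream(), size=2):
    classifier.partial_fit(batch)

print(classifier.predict("good great"), classifier.predict("bad awful"))
```

For the multi-class cancer task, one such binary model per cancer type (one-vs-rest) is one option.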
As in Chapter 9, construct a Web interface that lets a user upload one PubMed article and then determines the cancer type of that article with the classifier trained above.
Example: Please enter your movie review
Web Service with machine learning (Text Classification) (Sentiment Analysis)
Package "pickle" (model persistence)
Package "flask" (microframework) : pip install flask
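A minimal sketch of how `pickle` and `flask` could fit together: the trained model is saved to disk once, and the web app loads it at start-up and classifies each uploaded file. Everything model-related here is hypothetical: `TOY_MODEL` is a keyword lookup standing in for the real trained classifier, and the form field name `article` and the file name `classifier.pkl` are arbitrary choices.

```python
import pickle
from collections import Counter

# Hypothetical stand-in for the trained classifier: keyword -> cancer type.
TOY_MODEL = {
    "lung": "lung cancer",
    "breast": "breast cancer",
    "colon": "colorectal cancer",
}

FORM = """
<form method="post" enctype="multipart/form-data">
  <input type="file" name="article">
  <input type="submit" value="Classify">
</form>
"""


def save_model(model, path):
    """Persist the model with pickle."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a pickled model; only unpickle files you created yourself."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def classify(model, text):
    """Vote by keyword occurrences; returns "unknown" when nothing matches."""
    votes = Counter(model[word] for word in text.lower().split() if word in model)
    return votes.most_common(1)[0][0] if votes else "unknown"


def create_app(model_path):
    # Flask is imported here so the helpers above also work without it installed.
    from flask import Flask, request
    from markupsafe import escape

    app = Flask(__name__)
    model = load_model(model_path)  # load once at start-up, not per request

    @app.route("/", methods=["GET", "POST"])
    def index():
        if request.method == "POST":
            uploaded = request.files.get("article")
            if uploaded is None:
                return "No file uploaded.", 400
            text = uploaded.read().decode("utf-8", errors="replace")
            return f"Predicted cancer type: {escape(classify(model, text))}"
        return FORM

    return app


if __name__ == "__main__":
    save_model(TOY_MODEL, "classifier.pkl")
    create_app("classifier.pkl").run(debug=True)
```

Any preprocessing fitted during training (stop-word list, vectorizer, term weights) has to be pickled along with the classifier so that uploaded articles are transformed the same way as the training data.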