Machine Learning (機器學習)
Outline
Time:
Class period: Mon. 234, I627
jdwang@asia.edu.tw
Score
Class rescheduling: 4/29, 9:10-12:00 => 4/22, 18:00-21:00 (watch the online YouTube videos on your own)
2019 International Conference on Soft Computing & Machine Learning (SCML2019), April 26th-29th, Wuhan, China
Invited speaker: Jing-Doo Wang. Title: Applications using the Class Frequency Distribution of Maximal Repeats from Tagged Sequential Data.
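(Ministry of Education Talent Cultivation Program for Intelligent IoT Technology and Applications, 2018 (ROC year 107) MOOCs Course Development Project; Prof. Shuenn-Yuh Lee, Department of Electrical Engineering, National Cheng Kung University)
ECG MOOCs (weeks 1~3)
(Provided by Prof. Shuenn-Yuh Lee, NCKU) (Language: Mandarin Chinese, Subtitles: English)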
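Week 1
16'45" Unit 1: Healthcare Sensor Circuits and Systems, Part 1: Healthcare Sensors
11'12" Unit 2: Healthcare Sensor Circuits and Systems, Part 2: Healthcare System Modules
18'15" Unit 3: Healthcare Sensor Circuits and Systems, Part 3: Healthcare System Chips
Week 2
16'54" Unit 1: Introduction to the Wearable Module (Trianswer), Part 1: Motivation for Building the Module
16'44" Unit 2: Introduction to the Wearable Module (Trianswer), Part 2: Physiological Signal Detection
10'45" Unit 3: Introduction to the Wearable Module (Trianswer), Part 3: Operating the Module
Week 3
08'46" Unit 1: Clinical Medical Knowledge, Part 1: Introduction to Clinical Medical Knowledge
13'54" Unit 2: Clinical Medical Knowledge, Part 2: Understanding the Rhythm of the Heart
15'01" Unit 3: Clinical Medical Knowledge, Part 3: How to Detect the Rhythm of the Heart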
Watch these MOOC videos and find the answers to the following questions:
What is ECG? EMG? PPG?
What are the stages of ""?
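Textbook
(Chinese Version)
Python 機器學習 (第二版) (Python Machine Learning, 2nd Edition), MP11804 (Sebastian Raschka, Vahid Mirjalili; translated by 劉立民 and 吳建華; published by DrMaster Press (博碩))
DrMaster Press (博碩文化), Central Region Sales Manager: 林世昌 (Rick Lin), cell phone: 0925-275-775, LINE ID: 0925275775, e-mail: rick@drmaster.com.tw
Code examples (183 MB)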
(English Version)
Python Machine Learning, Sebastian Raschka, Vahid Mirjalili. Packt (2nd Edition), ISBN: 9789864343324
Code examples (183 MB)
Reference Book
Python Deep Learning
code(Packt)
Python程式設計學習經典:工程分析x資料處理x專案開發 (GOTOP (碁峰); authors: 吳翌禎, 黃立政; ISBN: 9789864768837; published 2018/08/23)
林政益 (scott_lin@gotop.com.tw), phone: 04-2452-7051 ext. 11
Code Examples
Chapter 11: Pandas Package
How to read data (files: Txt, Excel, HTML) into a "Series" or a "DataFrame"?
CH_11_3_1_EX_1_Pandas_Reading_CSV_File.py
CH_11_3_2_EX_1_Pandas_Reading_Excel_File.py
CH_11_3_3_EX_1_Pandas_Reading_HTML_File.py
Chapter 12: Matplotlib (2D or 3D Data Visualization)
CH_12_2_EX_1_Basic_Plot_Using_Pyplot.py
CH_12_3_EX_1_Pyplot_Figure.py
CH_12_4_EX_1_Pyplot_Subplot_Common_Usages.py
The IPython notebook
Data Mining:Practical Machine Learning Tools and Techniques (4th Edition), 2017,Morgan Kaufmann.(ISBN-13: 978-0128042915)
Content
- Supervised Learning
Chapter 2 - Training Machine Learning Algorithms for Classification
Data Sets for Data Mining
Decision Tree (J48, the Java version of C4.5) with "iris.arff"
iris.data
How to construct a Decision Tree (DT)?
(Shannon Entropy) (Information Gain)
Weka with weather(YouTube)
WEKA Data Sets (weather)
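The Shannon entropy and information gain named above can be sketched in a few lines of plain Python. This is only an illustration, not the J48 implementation: the `outlook` and `play` columns below are assumed to match the commonly distributed nominal weather data set (14 records), and the function names are made up for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting `labels` on `feature_values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Assumed values: "outlook" attribute and "play" class of the weather data.
outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
           "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]

print(round(entropy(play), 3))                    # entropy of the class column
print(round(information_gain(outlook, play), 3))  # gain of splitting on outlook
```

A decision tree learner of the ID3/C4.5 family would compute such a gain for every candidate attribute and split on the best one, then repeat the procedure on each subset.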
How to avoid overfitting when training a DT?
(Pre-Pruning and Post-Pruning)
How to evaluate the performance of a classifier?
How to compare the performance of several classifiers?
K-Nearest Neighbor (kNN)
Training: determine the value of k that achieves the best performance.
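One possible way to pick k is to score several candidate values with leave-one-out evaluation and keep the best. The sketch below assumes Euclidean distance and a tiny made-up two-class data set; `knn_predict` and `loo_accuracy` are illustrative names, not part of Weka or any library.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each point using all the others."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(data)


# Hypothetical toy data: two well-separated clusters of four points each.
data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
        ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"), ((6, 6), "b")]

candidates = [1, 3, 5, 7]
scores = {k: loo_accuracy(data, k) for k in candidates}
best_k = max(candidates, key=scores.get)  # first candidate with the top score
print(scores, best_k)
```

With only four points per class, k = 7 forces every vote to include four points of the other class, so a value of k that is too large can be as harmful as one that is too small.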
Naive Bayes Classifier (Probability model)
Training: estimate the conditional probability of each variable given the class, assuming the variables are conditionally independent.
Bayesian Decision Theory
Naive Bayes Classifier (From Tom M. Mitchell)
Naive_Bayes_training_example_Tennis.htm
Linear Classifier (Vector Space Model)
Training : Find the hyperplanes that can separate the instances of one class from the other classes
Rocchio Algorithm (Linear Classifier): LinearClassifier_jdwang.xls
Support Vector Machine (Vector Space Model)
Training: find the support vectors that maximize the margin between two classes
Chapter 3 - A Tour of Machine Learning Classifiers Using Scikit-Learn
wine.data
Chapter 4 - Building Good Training Sets – Data Preprocessing
Missing Values (NaN, Not a Number)
What do you do about "Missing Values"? (impute?)
Categorical data: (1) nominal features (e.g., color) (2) ordinal features (e.g., size)
Feature Scaling: Normalization vs. standardization
How to choose meaningful features? (L1 and L2 regularization)
Dimension Reduction:Feature selection vs. Feature extraction
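The difference between the two feature-scaling options listed above (normalization vs. standardization) can be illustrated with the standard library alone. This is a minimal sketch on a made-up feature column; the function names are hypothetical, and both functions assume the column is not constant (otherwise they would divide by zero).

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Center values at mean 0 and scale them to unit (population) std. dev."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


feature = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical feature column

normalized = min_max_normalize(feature)
standardized = standardize(feature)
print(normalized)
print([round(v, 3) for v in standardized])
```

Normalization bounds the values, which suits algorithms that expect a fixed range, while standardization keeps information about outliers and is often preferred for gradient-based learners.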
Chapter 5 - Dimension Reduction
Principal component analysis (PCA) (unsupervised)
Linear discriminant analysis (LDA) (Fisher's LDA) (supervised)
Chapter 8 - Applying Machine Learning To Sentiment Analysis
Opinion mining practice: Internet Movie Database (IMDb)
(0) Download aclImdb_v1.tar.gz
(1) Uncompress "aclImdb_v1.tar.gz" into the working directory
(2) download Chapter08.7z
(3) Python packages: PyPrind, NLTK
conda prompt> pip install pyprind
conda prompt> pip install nltk
Vectorization: (n-gram model) (Bag-of-words) (Word2Vec)
Word stemming: Porter stemming algorithm
Pattern Weighting: (TF) (DF) (IDF)
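The bag-of-words and TF/DF/IDF items above can be made concrete with a small pure-Python sketch. It assumes the plain textbook weighting tf * log(N / df) on three toy sentences; scikit-learn's TfidfTransformer uses a smoothed variant by default, so its numbers would differ from the ones computed here.

```python
from collections import Counter
from math import log

# Toy corpus (whitespace tokenization, already lowercased).
docs = [
    "the sun is shining",
    "the weather is sweet",
    "the sun is shining the weather is sweet and one and one is two",
]
tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# DF: number of documents that contain each term at least once.
df = Counter(term for tokens in tokenized for term in set(tokens))

# IDF: terms that occur in every document get weight log(1) = 0.
idf = {term: log(n_docs / count) for term, count in df.items()}


def tf_idf(tokens):
    """Map each term of one document to its raw count times its IDF."""
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf}


weights = [tf_idf(tokens) for tokens in tokenized]
for doc_weights in weights:
    print({term: round(w, 3) for term, w in doc_weights.items()})
```

Terms such as "the" and "is" appear in all three documents and therefore receive zero weight, which is the effect IDF is meant to have on uninformative words.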
Chapter 9 - Web Application Embedded with a Machine Learning Model
http://raschkas.pythonanywhere.com/
download Chapter09.7z
Model persistence (package: pickle)
Anaconda Prompt > pip install flask
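Model persistence with pickle amounts to writing the trained object to disk once and loading it again when the web application starts. The sketch below uses a plain dictionary as a hypothetical stand-in for a trained classifier and a temporary directory instead of the course's actual file layout.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model (any picklable object works).
model = {"vocabulary": ["good", "bad"], "weights": [1.5, -2.0], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "classifier.pkl"  # assumed file name

    # Serialize the object to a binary file.
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Later (for example when the web app starts): load it back.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)
```

Unpickling can execute arbitrary code, so only load pickle files that come from a trusted source.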
C:\Users\jdwan\Chapter09\1st_flask_app_1>python app.py
* Restarting with stat
* Debugger is active!
* Debugger PIN: 559-490-217
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
WTForms
Anaconda Prompt> pip install WTForms
SQLite Manager
PythonAnywhere
Chapter 10 - Predicting Continuous Target Variables with Regression Analysis
How to predict the price of a house?
housing.data.txt
(506 samples, each with 14 attributes)
housing.names
Chapter.7z
conda prompt> pip install seaborn
Why do we need the pairplot? (Exploratory Data Analysis, EDA)
Pearson product-moment correlation coefficient (Pearson's r)
RANdom SAmple Consensus (RANSAC)
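Pearson's r, listed above, is the covariance of two variables divided by the product of their standard deviations. A minimal sketch, assuming two equally long numeric lists with non-zero variance; the `rooms` and `price` values are invented for illustration and are not taken from housing.data.txt.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of covariance and variances cancel, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values: average number of rooms vs. house price.
rooms = [4.0, 5.0, 6.0, 7.0, 8.0]
price = [15.0, 20.0, 22.0, 30.0, 35.0]
print(round(pearson_r(rooms, price), 3))
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship, which is what a correlation heat map summarizes for every pair of columns.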
Chapter 12 - Implementing a Multi-layer Artificial Neural Network from Scratch
MNIST dataset
Chapter 13 - Parallelizing Neural Network Training with TensorFlow
Chapter 14 - Going Deeper: The Mechanics of TensorFlow
Chapter 15 - Classifying Images with Deep Convolutional Neural Networks
Chapter 16 - Modeling Sequential Data Using Recurrent Neural Networks
- Unsupervised Learning
Chapter 11 - Working with Unlabeled Data – Clustering Analysis
k-means
prototype-based
hierarchical-based
density-based
How many clusters ?
elbow method
hard clustering
soft clustering (or Fuzzy C-means, FCM)
- Reinforcement Learning
- Machine Learning Practice with Python
References
Difference Between Data mining and Machine learning
AI, Machine Learning and Deep Learning
Machine Learning vs. Deep Learning
Understanding of Convolutional Neural Network (CNN) — Deep Learning(Prabhu,2018)
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN) Tutorial
Deep Learning (DL) In Healthcare
MLCC(Machine Learning Crash Course)
Course Overview -- Microsoft AI Workshop (24 Hours)
Course Overview -- Microsoft Professional Program for Artificial Intelligence (96 Hours)
Homework 1 (15%): Weka Practice (2019/3/25, report to Moodle with a 1~3 minute YouTube presentation (URL sharing))
(1) Decision Tree (DT) with WEKA Data Sets (labor.arff)
(2) kNN with WEKA Data Sets (glass.arff)
(3) Can you justify which classifier, kNN or DT, achieves better performance with (labor.arff) and (glass.arff)?
(Which classifier (its parameters? with 5-fold CV)? "Accuracy"? "Receiver Operating Characteristic (ROC) Curve"? "Confusion Matrix"? "F-measure"? "Training Time"? "Testing Time"?)
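Several of the measures named in item (3) follow directly from the confusion matrix. The sketch below computes them for a two-class problem from invented true and predicted labels (not actual Weka output); the function names are illustrative, and the metrics assume at least one positive prediction and one positive example.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, FN, TN) for a two-class problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure (F1) from the four counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical labels for ten test instances.
y_true = ["good"] * 4 + ["bad"] * 6
y_pred = ["good", "good", "good", "bad", "good", "bad", "bad", "bad", "bad", "bad"]

counts = confusion_counts(y_true, y_pred, positive="good")
accuracy, precision, recall, f1 = metrics(*counts)
print(counts, accuracy, precision, recall, f1)
```

Comparing kNN and DT then means computing the same measures for both classifiers on the same cross-validation folds.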
Midterm Project (30%): Classification (or Prediction) of the "Zodiac Signs (Star Signs)" and "Blood Type" (A, B, AB, O) of Your Friends (2019/4/8 presentation; 2019/4/15 report to Moodle with a 3~5 minute YouTube presentation (URL sharing))
(0) How do you collect your own data (Friends)? (FB, Google, IG, Line?)
=> (At least 50 records => 5-fold; make your dataset as robust as possible)
=> Is your dataset robust enough?
(1) What characteristics (features), e.g. personality type, religion, sex (M or F), blood type, do you choose for predicting "Zodiac Signs (Star Signs)" and "Blood Type"?
(2) Classify with your own data (Weka? Python scikit-learn?) (k-fold CV?) (Confusion Matrix, Accuracy, F-Measure)
(3) From your own experimental results, can we predict the "Zodiac Signs (Star Signs)" or "Blood Type" of someone?
(4) How would you improve your experimental results, if possible?
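The 5-fold cross-validation mentioned in item (0) splits the 50 records into five test folds of 10 records, each paired with the remaining 40 records for training. A minimal sketch of such a split, assuming records are addressed by index; `k_fold_indices` is a made-up helper, not a Weka or scikit-learn function.

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # fixed seed for repeatable folds
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_records=50, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(fold_number, len(train), len(test))
```

Every record is used for testing exactly once, so the reported accuracy is the average over the five folds.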
Homework 2 (15%) (Presentation: 2019/5/6 (10-15 minutes); Report: 2019/5/13 (Moodle), with a 1~3 minute YouTube presentation (URL sharing))
MOOCs (Provided By Prof. Shuenn-Yuh Lee, NCKU)
ECG MOOCs (weeks 2 and 3)
(Choose at least one of the following three databases.)
PhysioBank ATM(On_line)
MIT-BIH Arrhythmia Database (raw data and annotation)
MIT-BIH Arrhythmia Database Directory(raw data and annotation(age, sex, drug usage))
(1) Where were the resources derived from?
(2) What is the format of these ECG data?
(3) What can you extract from these data?
(4) Try to load these data using Python "pandas", view them with Python "Matplotlib", and verify them (learn for classification) with Python "scikit-learn".
(5) Find Related Potential Applications : MIT ECG signal processing, data mining, classification and clustering
Final Project (40%): (2019/6/10 presentation; 2019/6/17 report to Moodle with a 3~5 minute YouTube presentation (URL sharing))
Topic: Text Classification and Clustering of Cancers via Medical Articles in the PubMed Corpus
Types of cancer (choose at least five types)
Resource : PubMed
(Download PubMed articles via the PubMed API)
(Download PubMed articles via Download MEDLINE/PubMed Data)
As in Chapter 8, try to train and construct a classifier for determining the class ("cancer type") of PubMed articles
(Python Package : NLTK Natural Language ToolKit)
(bag-of-words)
Term-Frequency (TF)
Inverse-Document-Frequency (IDF)
Term Weighting : TF*IDF
stop-word removal
Word stemming : Porter stemming algorithm
Binary classification (sentiment analysis: positive or negative) vs. multi-class classification (which type of cancer is it?)
Computation Power limitation : Out-of-core learning
As in Chapter 9, construct a Web interface that lets a user upload one PubMed article and then determines the cancer type of that article with the classifier trained above.
Example: Please enter your movie review
Web Service with machine learning (Text Classification)(Sentimental Analysis)
Package "pickle" (model persistence)
Package "flask" (microframework) : pip install flask
AWS Educate Program
AWS Educate
AWS Certification Preparation
AWS Services
Complete the AWS Educate Program student account application: Apply for an AWS Educate (provided by classmate 蘇棻翎)
Chapter 01: Setting Up a Python Development Environment
Microsoft Docs
Microsoft Azure Portal
Windows Azure Machine Learning Studio
Linear Regression for Predicting the Price of a Car
Example: jdwang
Name:
ID:
What do you know about "Machine Learning"?
Why do you need to learn "Machine Learning"?
What do you expect from this course?
The others: