Text Mining

MS Teams: 101_1_TextMining_AU_jdwang. Time: Wed, 1:10 pm~4:00 pm, GMT+8 (Taipei)

AWS Academy Learner Lab - Foundation Services [7319] (for creating VMs via AWS EC2)


Time: Wed, 1:10pm~4:00pm, Room: I627
jdwang@asia.edu.tw, Room:I517, ext:1847

Prerequisites
Python programming or any other programming language
Databases

Score


Middle Project Score

Final Project Score


Textbooks
Text Analytics with Python: A Practitioner's Guide to Natural Language Processing (2nd edition), Dipanjan Sarkar, 2019-05-22, ISBN-13: 9781484243534
code-downloads
Book_TextAnalyticswithPython_Notes.html


Reference books

  • Hands-On Python Natural Language Processing, Aman Kedia, Mayank Rasu, 2020-06-26, ISBN-13: 9781838989590
    code-downloads
    code-downloads (github)
  • Python Data Science Handbook (2015) Jake VanderPlas
  • Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow, 2nd Edition (2017) Sebastian Raschka, Vahid Mirjalili

    Week  Date  Course content


    1 Introduction
    2 Python for NLP
    3 Text Preprocessing
    PubMed_Cancer_jdwang2021_9_28.7z
    4 Traditional Feature Engineering Models
    5 Advanced Feature Engineering Models
    6 Text Classification-Vectorization
    7 Text Classification-Classifiers
    8 Middle Project Presentation
    9 Middle Project Report

    10 Text Similarity
    11 Text Clustering
    12 Text Clustering - Visualization
    13 Text Mining vs. Deep Learning(1)
    14 Text Mining vs. Deep Learning(2)
    15 AWS Educate program and AWS Academy program
    16 Text Mining vs. AWS
    17 Final Project presentation: Text Mining + AWS
    18 Final Project (Report): Text Mining + AWS

    Grade
    Attendance (10%): 10 points if you attend every week on time; -1 per week absent, -0.5 per week late
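
The attendance rule above can be sketched as a small calculation. This is only an illustration: the function name `attendance_score` is hypothetical, and flooring the result at 0 is an assumption, since the syllabus does not say whether the attendance score can go negative.

```python
def attendance_score(weeks_absent: int, weeks_late: int, full_marks: float = 10.0) -> float:
    """Illustrative attendance score: start at 10, -1 per absence, -0.5 per late arrival."""
    if weeks_absent < 0 or weeks_late < 0:
        raise ValueError("week counts must be non-negative")
    score = full_marks - 1.0 * weeks_absent - 0.5 * weeks_late
    # Assumption (not stated in the syllabus): the score never drops below 0.
    return max(score, 0.0)


# Perfect attendance keeps the full 10 points.
print(attendance_score(weeks_absent=0, weeks_late=0))  # 10.0
# Two absences and three late arrivals: 10 - 2 - 1.5
print(attendance_score(weeks_absent=2, weeks_late=3))  # 6.5
```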

    Homework: Text Preprocessing: Top 10 cancer types in PubMed

    PubMed_Cancer_jdwang2021_9_28.7z


    Middle Project (30%): presentation (ppt) on 2021/11/3; report (Word or PDF, plus a shared YouTube URL) submitted to Moodle by 2021/11/10

    2~4 students per group. Text Classification: classify PubMed articles with the labels "cancer", "virus", "vaccine", "COVID-19" and "genome"
    Middle Project Score


    Homework 2: Text Clustering: document clustering with PubMed articles derived from the Top 10 cancer types



    Final Project (40%): 2~4 students per group (AWS + Text Mining); presentation (ppt) on 2022/1/5; report (Word or PDF, plus a shared YouTube URL) submitted to Moodle by 2022/1/12

    Multi-language Communication Online via AWS services
    Integrate AWS services to enable communication without the bottleneck of people using different languages.
    Online presentation (1/7); YouTube URL (1/8); group scoring (1/9-1/10); report (1/12)


    References
    Computational platform in AWS (for those without sufficient computing power when handling big data)
    AWS Educate Program (apply for an account)
    AWS Educate Login

    Apply for an AWS Educate account (by Suca)
    AWS Educate Program Information
    AWS Educate
    How to reply to the AWS Educate registration confirmation email (thanks to Sharon, 張倖瑜, for providing this, 2020/3/13)
    How students log in to the AWS Educate Program classroom (thanks to Sharon, 張倖瑜, for providing this, 2020/12/5)