NLTK Corpora
Learn and Master Python in a Month
Python from Beginner to Mastery: One Month Is Enough!
Text Mining in Python: Steps and Examples (Dhilip Subramanian, Aug 22, 2019)
(Tutorial) Text ANALYTICS for Beginners using NLTK (Avinash Navlani, December 14th, 2019)
Text Mining & Web Crawler | Google News and Article Word Clouds | Python (2019-03-03, JAMLEECUTE)
What are the differences between Occidental Languages and Oriental Languages?
Applied Natural Language Processing | Using AI to Build Real Products(Arushi Raghuvanshi)
Related Topics of Text Mining
Tokenization
Frequency of distinct words (Bag of Words)
Weighting of distinct words (TF, IDF, TF*IDF)
Similarity (Cosine Similarity)
#from sklearn.feature_extraction.text import CountVectorizer
TextMining_Python_DocumentVectorization_CosineSimilarity_jdwang2020_8_26.7z
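The items above (bag of words, TF/IDF weighting, cosine similarity) can be sketched in plain Python as follows. This is only an illustration and not the CountVectorizer route hinted at by the commented import: the function names, the toy documents, and the unsmoothed IDF formula log(N / df) are assumptions made for this example, and libraries such as scikit-learn use smoothed variants that give different numbers.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Count how often each distinct word occurs in the text."""
    return Counter(tokenize(text))


def tf_idf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of document strings."""
    bags = [bag_of_words(d) for d in docs]
    vocab = sorted({w for bag in bags for w in bag})
    n_docs = len(docs)
    # Document frequency: in how many documents does each word appear?
    df = Counter(w for bag in bags for w in bag)
    # Unsmoothed IDF (an assumption of this sketch): log(N / df).
    idf = {w: math.log(n_docs / df[w]) for w in vocab}
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        # TF is the relative frequency of the word within the document.
        vectors.append(
            [(bag[w] / total) * idf[w] if total else 0.0 for w in vocab]
        )
    return vocab, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, used only for illustration.
docs = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock prices fell sharply today",
]
vocab, vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # documents share words
sim_02 = cosine_similarity(vectors[0], vectors[2])  # documents share no words
print(sim_01, sim_02)
```

The first two documents share "the" and "cat", so their similarity is positive; the first and third share no words, so their similarity is zero.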
Stemming (Porter stemmer, Lancaster stemmer)
Lemmatization
Stop Words
How to handle missing words? (LSI: latent semantic indexing)
One-hot Encode/Decode vs. Word2vec
Word2Vec Tutorial - The Skip-Gram Model
Word2Vec Tutorial Part 2 - Negative Sampling
Making Computers Understand Human Language: An Intuitive Understanding of the Word2Vec Model
Text Classification
TF-IDF: Term Frequency (TF), Inverse Document Frequency (IDF)
Label encoding, one-hot encoding
How to One Hot Encode Sequence Data in Python(by Jason Brownlee on July 12, 2017 in Long Short-Term Memory Networks)
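To make the difference between label encoding and one-hot encoding concrete, here is a minimal plain-Python sketch. The helper names (fit_labels, one_hot_encode, one_hot_decode) and the sample words are hypothetical, chosen only for this illustration; they are not taken from the linked tutorial.

```python
def fit_labels(values):
    """Label encoding: map each distinct value to an integer index."""
    classes = sorted(set(values))
    return {value: index for index, value in enumerate(classes)}


def one_hot_encode(values, mapping):
    """Turn each value into a 0/1 vector with a single 1 at its index."""
    vectors = []
    for value in values:
        row = [0] * len(mapping)
        row[mapping[value]] = 1
        vectors.append(row)
    return vectors


def one_hot_decode(vectors, mapping):
    """Invert one_hot_encode by locating the position of the 1."""
    inverse = {index: value for value, index in mapping.items()}
    return [inverse[row.index(1)] for row in vectors]


# Hypothetical sample sequence.
words = ["cat", "dog", "cat", "bird"]
mapping = fit_labels(words)              # integer index per distinct word
labels = [mapping[w] for w in words]     # label-encoded sequence
encoded = one_hot_encode(words, mapping)
decoded = one_hot_decode(encoded, mapping)
print(mapping, labels, encoded, decoded)
```

Label encoding gives each word a single integer, which implies an ordering that usually has no meaning; one-hot encoding avoids that at the cost of one vector dimension per distinct word.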
Split train and test set
Model Building and Evaluation
Split train and test set (Cross-Validation: 5-fold, 10-fold)
Model Building and Evaluation
Machine learning: Supervised Learning
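The split and evaluation steps listed above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the function names, the toy spam/ham labels, and the majority-class baseline (standing in for a real supervised model) are all hypothetical; in practice a library such as scikit-learn would usually handle both the splitting and the model.

```python
import random
from collections import Counter


def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle the items and hold out a fraction of them as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_ratio)))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Predict the most common training label and score it on the test fold."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    correct = sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(test_idx)


# Hypothetical labels for 20 messages: 6 spam, 14 ham.
labels = ["spam"] * 6 + ["ham"] * 14
scores = [
    majority_baseline_accuracy(labels, train_idx, test_idx)
    for train_idx, test_idx in k_fold_indices(len(labels), k=5)
]
mean_accuracy = sum(scores) / len(scores)
print(scores, mean_accuracy)
```

Every sample lands in exactly one test fold, so the mean over the folds uses all of the data for evaluation; a real classifier would be compared against a baseline score like this one.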
Text Mining Applications:
Information Retrieval:
Part of speech tagging (POS)
Named entity recognition
Chunking
Sentiment Analysis
Keyword extraction (feature weighting (TF-IDF), feature clustering)
Text Categorization (classification): spam mail,
Text Similarity (Clustering),
Anomaly and Trend Detection,
Social Network, ...
Text Mining with cybercrime,
Text Streams: events and trends in text streams,
Related Learning Programs or Tools for Text Mining:
On-Line Learning Program(巨匠) (You need to register in advance in order to take the following courses)