(*) Running a custom java jar on an AWS EMR cluster (Part 3/3) (From: Prof. Patterson, 2017)

Amazon EMR (Articles & Tutorials)

How to quickly launch a Hadoop cluster in five minutes with the AWS EMR service


AWS EMR Cluster + direct execution (Assignment 1) "word_count_YourID.jar", where YourID is your student ID (Deadline: 2019/12/6)

(0) Test on Windoop: WordCount_jdwang2019_12_11.zip (Eclipse Java source code)
(For AWS, modify (1) the hdfs paths and (2) the Conf.set calls; see WordCount_ForAWS_NoDFS_NoConfSet_jdwang2019_12_13.zip)
(1) Export jar "wordcount_ForAWS_YourID.jar"
(2) Log in to AWS Educate (you must apply for an account first)
(3) Create an AWS S3 bucket "S3_YourID"
(4) Upload spider.txt to the AWS S3 bucket "S3_YourID"
(5) Upload "wordcount_ForAWS_YourID.jar" to the AWS S3 bucket "S3_YourID"
(e.g. WordCount_ForAWS_NoDFS_NoConfset_jdwang.jar)
(6) In the AWS EC2 console, generate a key pair "EMR_key_YourID" (for the connection to AWS "EMR_YourID")
(7) Create the AWS EMR cluster "EMR_YourID"
(a) enter the key pair ("EMR_key_YourID")
(b) enter the jar and the input/output locations
(c) start it
(9) Run "word_count_YourID.jar" on "EMR_YourID"
(set the input and output to the AWS S3 bucket "S3_YourID")
(10) (After finishing your EMR job)
(a) shut down AWS "EMR_YourID"
(b) "Terminate" running AWS EC2 instances
(c) remove the AWS S3 bucket "S3_YourID"
=> Otherwise it will keep costing you money $$$$....
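
The input/output settings in steps (7)(b) and (9) come down to two strings that EMR hands to the jar as command-line arguments. Below is a minimal plain-Java sketch of how those two S3 locations could be assembled. It assumes the modified driver reads its input and output paths from args[0] and args[1] instead of hard-coded hdfs paths, which is what the file name NoDFS_NoConfSet suggests but the notes do not spell out. The class EmrStepArgs, the bucket name "s3-yourid", the "output/" prefix, and the run label "run1" are all hypothetical. Note that S3 bucket names allow only lowercase letters, digits, dots, and hyphens, so a placeholder such as "S3_YourID" would need a lowercase, underscore-free form when the bucket is actually created.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: builds the two arguments a custom-jar step would pass to the driver. */
final class EmrStepArgs {
    // Simplified S3 bucket-name check: 3-63 characters, lowercase letters,
    // digits, dots and hyphens, starting and ending with a letter or digit.
    // (The full S3 naming rules contain a few more restrictions not covered here.)
    private static final Pattern BUCKET =
            Pattern.compile("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");

    static boolean isValidBucketName(String name) {
        return BUCKET.matcher(name).matches();
    }

    // Join a bucket name and an object key into an s3:// URI.
    static String s3Uri(String bucket, String key) {
        if (!isValidBucketName(bucket)) {
            throw new IllegalArgumentException("invalid bucket name: " + bucket);
        }
        String cleanKey = key.startsWith("/") ? key.substring(1) : key;
        return "s3://" + bucket + "/" + cleanKey;
    }

    // Element 0 is the input file, element 1 is the output directory.
    // The run label keeps each run's output directory distinct, because a
    // Hadoop job fails if its output directory already exists.
    static List<String> stepArgs(String bucket, String inputKey, String runLabel) {
        return Arrays.asList(
                s3Uri(bucket, inputKey),
                s3Uri(bucket, "output/" + runLabel));
    }

    public static void main(String[] args) {
        // "s3-yourid" and "run1" are placeholder values for illustration only.
        System.out.println(String.join(" ", stepArgs("s3-yourid", "spider.txt", "run1")));
    }
}
```

Running main prints the two arguments separated by a space: s3://s3-yourid/spider.txt s3://s3-yourid/output/run1. In a driver that follows the standard WordCount pattern, these strings would be passed on to FileInputFormat.addInputPath and FileOutputFormat.setOutputPath.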