This is my first Hadoop MapReduce program, which analyzes air pollution data from Seoul, South Korea.
Step1) Install hadoop.
- Install Homebrew.
- Run 'brew install hadoop'.
- After installation, you will find the Hadoop directory at /usr/local/Cellar/hadoop/3.1.0 (the version number may differ).
- To access the Hadoop binaries from anywhere, set HADOOP_HOME in ~/.bash_profile.
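For example, the ~/.bash_profile entries could look like this (a sketch only; the libexec path is an assumption based on a typical Homebrew layout, so adjust it to your install):

```shell
# ~/.bash_profile -- paths are assumptions; adjust to your Hadoop version/location
export HADOOP_HOME=/usr/local/Cellar/hadoop/3.1.0/libexec
# put the hadoop client and daemon scripts on PATH
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

Run `source ~/.bash_profile` (or open a new terminal) for the change to take effect.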
- Open $HADOOP_HOME/etc/hadoop/hadoop-env.sh and set JAVA_HOME to your JDK's absolute path.
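On macOS, one way to avoid hard-coding a JDK path is the system's java_home helper (a sketch; you can also paste the absolute path printed by `/usr/libexec/java_home` directly):

```shell
# in hadoop-env.sh -- resolves the default JDK's absolute path on macOS
export JAVA_HOME=$(/usr/libexec/java_home)
```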
- Go to $HADOOP_HOME/etc/hadoop/core-site.xml and add the configuration below.

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/data/</value>
  </property>
</configuration>

- Go to $HADOOP_HOME/etc/hadoop/hdfs-site.xml and add the configuration below.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>localhost:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>localhost:50090</value>
  </property>
</configuration>

- Go to $HADOOP_HOME and format the NameNode:
./bin/hdfs namenode -format
- Then, in the same directory, start HDFS:
./sbin/start-dfs.sh
- Then, you can browse the NameNode web interface at http://localhost:50070/.
- Run cmd to create your HDFS home directory (replace user_name with your username)
hadoop fs -mkdir -p /user/user_name
If you want more detail, please take a look at the official documentation.
Step2) Run the application
- Run cmd
cd airpollution-analyzer/airpollution-analyzer-mapreduce
- Run cmd
gradle clean build
- Run cmd
hadoop jar build/libs/airpollution-analyzer-mapreduce-1.0-SNAPSHOT.jar output
- After the job runs, you should see lines like the ones below.
INFO mapreduce.Job: map 100% reduce 100%
INFO mapreduce.Job: Job job_local1168524879_0001 completed successfully
- To see the MapReduce result, run cmd
hadoop fs -cat output/*
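Conceptually, the job's map phase emits (key, value) pairs from each input line and the reduce phase aggregates per key. A self-contained Java sketch of that flow, without the Hadoop API (the CSV layout, field names, and averaging are hypothetical, not this project's actual schema):

```java
import java.util.*;

// Sketch of map/reduce grouping: "map" parses (station, pm10) pairs from
// CSV-like lines; "reduce" averages the readings per station.
// Field layout and names are hypothetical, not the project's actual schema.
public class AirPollutionSketch {
    public static Map<String, Double> run(List<String> lines) {
        // map phase: emit (station, reading) pairs, grouped by key
        Map<String, List<Double>> grouped = new HashMap<>();
        for (String line : lines) {
            String[] f = line.split(",");
            String station = f[0];                  // hypothetical key field
            double pm10 = Double.parseDouble(f[1]); // hypothetical value field
            grouped.computeIfAbsent(station, k -> new ArrayList<>()).add(pm10);
        }
        // reduce phase: average the values collected for each key
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            double sum = 0;
            for (double v : e.getValue()) sum += v;
            result.put(e.getKey(), sum / e.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> avg = run(Arrays.asList(
            "Jongno-gu,40", "Jongno-gu,60", "Gangnam-gu,30"));
        System.out.println(avg.get("Jongno-gu"));  // 50.0
        System.out.println(avg.get("Gangnam-gu")); // 30.0
    }
}
```

In the real job, Hadoop performs the grouping between the map and reduce phases (the shuffle), so the mapper and reducer each only see one record or one key's values at a time.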