In this post I will show you how to configure an Oozie workflow. Let's develop a simple MapReduce program in Java; if you run into any difficulty doing so, you can download the code from my git location.
Please follow my earlier post to install and run the Oozie server. Then create a job directory, say SimpleOozieMR, with the following directory structure:
---SimpleOozieMR
----workflow
-----lib
-----workflow.xml
Copy your Hadoop job jar and its related jars into the lib folder.
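The layout above can be prepared locally with a few commands. This is only a sketch: JOB_HOME and JOB_JAR are hypothetical variables introduced for illustration, and the jar path is whatever your own build produces.

```shell
#!/bin/sh
# Sketch: build the local job directory layout.
# JOB_HOME and JOB_JAR are assumptions, not names used by Oozie itself.
JOB_HOME="${JOB_HOME:-./SimpleOozieMR}"
JOB_JAR="${JOB_JAR:-}"   # path to your MapReduce job jar, if already built

# Create SimpleOozieMR/workflow/lib in one step.
mkdir -p "$JOB_HOME/workflow/lib"

# Copy the job jar into lib only when one was supplied and exists.
if [ -n "$JOB_JAR" ] && [ -f "$JOB_JAR" ]; then
    cp "$JOB_JAR" "$JOB_HOME/workflow/lib/"
fi

# List what was created.
find "$JOB_HOME" | sort
```

Run it with, for example, JOB_JAR set to the jar your build produced; without it the script only creates the empty directories.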
Let's configure our workflow.xml and keep it in the workflow directory as shown.
<workflow-app name="WorkFlowPatentCitation" xmlns="uri:oozie:workflow:0.1">
    <start to="JavaMR-Job"/>
    <action name="JavaMR-Job">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <!-- remove any previous output; assumes citationOut is relative to the submitting user's HDFS home directory -->
                <delete path="${nameNode}/user/${wf:user()}/${citationOut}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.queue.name</name>
                    <value>default</value>
                </property>
            </configuration>
            <main-class>com.rjkrsinghhadoop.App</main-class>
            <arg>${citationIn}</arg>
            <arg>${citationOut}</arg>
        </java>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>"Killed job due to error: ${wf:errorMessage(wf:lastErrorNode())}"</message>
    </kill>
    <end name="end"/>
</workflow-app>
Now configure your properties file, PatentCitation.properties, as follows:
nameNode=hdfs://master:8020
jobTracker=master:8021
queueName=default
citationIn=citationIn-hdfs
citationOut=citationOut-hdfs
oozie.wf.application.path=${nameNode}/user/rks/oozieworkdir/SimpleOozieMR/workflow
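Oozie fills the ${...} placeholders in workflow.xml from the job properties, so a placeholder with no matching key is an easy mistake to make. As a rough sanity check before submitting, something like the following could list them. The function name check_params is hypothetical, and the check only covers plain ${name} placeholders; EL function calls such as ${wf:errorMessage(...)} are skipped.

```shell
#!/bin/sh
# check_params (hypothetical helper): print every ${name} placeholder found in
# a workflow file that has no "name=" line in a properties file.
# Placeholders containing a colon (EL function calls) do not match the pattern.
check_params() {
    workflow="$1"
    props="$2"
    missing=0
    for name in $(grep -o '\${[A-Za-z_][A-Za-z0-9_.]*}' "$workflow" | tr -d '${}' | sort -u); do
        if ! grep -q "^${name}=" "$props"; then
            echo "undefined: $name"
            missing=1
        fi
    done
    # Exit status 0 when everything is defined, 1 otherwise.
    return "$missing"
}

# Example (not executed here):
#   check_params workflow/workflow.xml PatentCitation.properties
```

A non-zero exit status means at least one placeholder would be unresolved when the job is submitted with that properties file alone.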
Let's create a shell script that will run your first Oozie job:
#!/bin/sh
# point the oozie client at the Oozie server
export OOZIE_URL="http://localhost:11000/oozie"
# copy your input data to HDFS
hadoop fs -copyFromLocal /home/rks/CitationInput.txt citationIn-hdfs
# copy SimpleOozieMR to HDFS, under the oozieworkdir directory that oozie.wf.application.path points to
hadoop fs -mkdir oozieworkdir
hadoop fs -put /home/rks/SimpleOozieMR oozieworkdir/SimpleOozieMR
# run the oozie job
cd /usr/lib/oozie/bin/
oozie job -config /home/rks/SimpleOozieMR/PatentCitation.properties -run
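Once the job is submitted, you will usually want to see whether it succeeded. The sketch below captures the job ID printed on submission and passes it to oozie job -info. It assumes the client prints a line of the form "job: <id>" and that OOZIE_URL is set; the helper names extract_job_id and submit_and_report are hypothetical.

```shell
#!/bin/sh
# extract_job_id: read the output of "oozie job ... -run" on stdin and print
# only the job ID. Assumes the client prints a line of the form "job: <id>".
extract_job_id() {
    sed -n 's/^job: *//p'
}

# submit_and_report (hypothetical helper): submit the workflow, then ask the
# Oozie server for the job's status. Requires OOZIE_URL to be set.
submit_and_report() {
    props="$1"
    job_id=$(oozie job -config "$props" -run | extract_job_id)
    if [ -z "$job_id" ]; then
        echo "submission failed: no job ID returned" >&2
        return 1
    fi
    echo "submitted $job_id"
    oozie job -info "$job_id"
}

# Example (not executed here):
#   submit_and_report /home/rks/SimpleOozieMR/PatentCitation.properties
```

The same job ID can also be looked up in the Oozie web console at the OOZIE_URL address.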