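Oozie is an open-source workflow scheduling system that manages multiple Hadoop jobs with complex dependencies and runs them together in a specified order. For example, suppose a business system produces 20 GB of raw data every day, and we need to process it daily through the following steps:
- First, sync the raw data to HDFS using Hadoop;
- Transform the raw data with the MapReduce framework, storing the output as partitioned tables across multiple Hive tables;
- JOIN the data from several Hive tables to produce one large Hive table of detail records;
- Run complex statistical analysis on the detail data to produce sorted report data;
- Sync the analysis results back to the business system for its use.
The process above can be orchestrated with a workflow system: define the tasks as a single workflow, then trigger a run of it on a schedule every day. In scenarios like this, which depend on Hadoop's storage and processing capabilities, Oozie simplifies task scheduling and execution.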
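The steps listed above map naturally onto an Oozie workflow definition. The sketch below writes a skeleton workflow.xml covering only steps 2 and 3 (a MapReduce transform followed by a Hive JOIN). The action names, the mapper class, the Hive script name, and the WF_DIR scratch directory are hypothetical placeholders, and the schema versions are assumed to be ones that Oozie 3.3.2 accepts; treat it as an illustration rather than a tested production workflow.

```shell
#!/bin/sh
# Sketch only: action names, class names and script names are hypothetical.
# WF_DIR is a scratch directory; in practice the file is uploaded to HDFS.
WF_DIR="${WF_DIR:-$(mktemp -d)}"

# The quoted heredoc delimiter keeps ${jobTracker} etc. literal,
# so that Oozie (not the shell) resolves them from job.properties.
cat > "$WF_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.2" name="daily-etl-wf">
    <start to="transform-raw"/>

    <!-- Step 2: MapReduce job that converts the raw data -->
    <action name="transform-raw">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.mapper.class</name>
                    <value>com.example.etl.RawDataMapper</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="join-tables"/>
        <error to="fail"/>
    </action>

    <!-- Step 3: Hive script that JOINs the partitioned tables -->
    <action name="join-tables">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>join_detail.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed at [${wf:lastErrorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

echo "wrote $WF_DIR/workflow.xml"
```

Each action names the node to go to on success (ok) and on failure (error), which is how the ordering between jobs is expressed; the remaining steps would be added as further actions in the same chain.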
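Here we install Oozie-3.3.2 on CentOS 6.2, together with the software packages it depends on. We go through the installation step by step, including installing and configuring those dependencies. We use a MySQL database to store Oozie's data, and the Hadoop version is 1.2.1.
Installing Oozie Server
Oozie Server provides many convenient features for managing jobs. For example, you can manage job status through a web UI, and you can build flows made up of multiple complex Hadoop jobs: the dependencies between the jobs are all assembled in a single workflow configuration file, which Oozie Server then manages and executes.
- Install the Maven build tool
Download and unpack it with the following commands:
wget http://mirrors.hust.edu.cn/apache/maven/maven-3/3.2.1/binaries/apache-maven-3.2.1-bin.tar.gz
tar xvzf apache-maven-3.2.1-bin.tar.gz
Add the environment variables and make them take effect:
export MAVEN_HOME=/home/shirdrn/cloud/programs/apache-maven-3.2.1
export PATH=$PATH:$MAVEN_HOME/bin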
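- Install the MySQL database
Install MySQL with the following commands:
sudo rpm -e --nodeps mysql
yum list | grep mysql
sudo yum install -y mysql-server mysql mysql-devel
Set a password for the root user:
mysqladmin -u root password '8YOhyo988_Kjo0'
You can then log in to MySQL with the root account to administer it:
mysql -u root -p
Enter the password, and the login succeeds.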
- Install and configure Tomcat
Download and unpack the Tomcat web server:
wget http://apache.dataguru.cn/tomcat/tomcat-7/v7.0.52/bin/apache-tomcat-7.0.52.tar.gz
tar xvzf apache-tomcat-7.0.52.tar.gz
Set the environment variables:
export CATALINA_HOME=/home/shirdrn/cloud/programs/apache-tomcat-7.0.52
export PATH=$PATH:$CATALINA_HOME/bin
If MySQL is used to store Oozie's data, the MySQL driver must be copied into the Tomcat installation, namely under $CATALINA_HOME/lib.
- Prepare the ExtJS package
Download the ExtJS archive:
wget http://extjs.com/deploy/ext-2.2.zip
- Install Oozie
Download and build it with the following commands:
wget http://mirror.bit.edu.cn/apache/oozie/3.3.2/oozie-3.3.2.tar.gz
tar xvzf oozie-3.3.2.tar.gz
cd oozie-3.3.2
bin/mkdistro.sh -DskipTests
Once the build completes, the built files are under oozie-3.3.2/distro/target. In my case the path is /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2, with the following contents:
[shirdrn@oozie-server oozie-3.3.2]$ pwd
/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2
[shirdrn@oozie-server oozie-3.3.2]$ ls
bin       lib                        oozie-core              oozie-sharelib-3.3.2.tar.gz
conf      libtools                   oozie-examples.tar.gz   oozie.war
docs.zip  oozie-client-3.3.2.tar.gz  oozie-server            release-log.txt
Point the OOZIE_HOME variable at this directory by editing ~/.bashrc:
export OOZIE_HOME=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2
export PATH=$PATH:$OOZIE_HOME/bin
Copy the ExtJS package into $OOZIE_HOME:
cp ~/cloud/programs/oozie-3.3.2/ext-2.2.zip $OOZIE_HOME/
Create a libext directory under that directory and copy the Hadoop jar files into it. I am using Hadoop 1.2.1:
[shirdrn@oozie-server oozie-3.3.2]$ mkdir libext
[shirdrn@oozie-server oozie-3.3.2]$ cp ~/cloud/programs/hadoop-1.2.1/hadoop-*.jar libext/
[shirdrn@oozie-server oozie-3.3.2]$ cp ~/cloud/programs/hadoop-1.2.1/lib/*.jar ./libext/
We also use MySQL to store Oozie's metadata, so the MySQL driver needs to be added to the libext directory as well:
cp ~/packages/mysql-connector-java-5.1.29/mysql-connector-java-5.1.29/mysql-connector-java-5.1.29-bin.jar libext/
Run the following command to start the installation:
bin/oozie-setup.sh prepare-war
Sample output:
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/asm-3.2.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/aspectjrt-1.6.11.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/aspectjtools-1.6.11.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-beanutils-1.7.0.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-beanutils-core-1.8.0.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-cli-1.2.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-codec-1.4.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-collections-3.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-configuration-1.6.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-daemon-1.0.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-digester-1.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-el-1.0.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-httpclient-3.0.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-io-2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-lang-2.4.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-logging-1.1.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-logging-api-1.0.4.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-math-2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/commons-net-3.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/core-3.1.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-ant-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-capacity-scheduler-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-client-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-core-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-examples-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-fairscheduler-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-minicluster-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-test-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-thriftfs-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hadoop-tools-1.2.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/hsqldb-1.8.0.10.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jackson-core-asl-1.8.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jackson-mapper-asl-1.8.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jasper-compiler-5.5.12.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jasper-runtime-5.5.12.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jdeb-0.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jersey-core-1.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jersey-json-1.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jersey-server-1.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jets3t-0.6.1.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jetty-6.1.26.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jetty-util-6.1.26.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/jsch-0.1.42.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/junit-4.5.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/kfs-0.2.2.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/log4j-1.2.15.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/mockito-all-1.8.5.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/mysql-connector-java-5.1.29-bin.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/oro-2.0.8.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/servlet-api-2.5-20081211.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/slf4j-api-1.4.3.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/slf4j-log4j12-1.4.3.jar
INFO: Adding extension: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/libext/xmlenc-0.52.jar
New Oozie WAR file with added 'ExtJS library, JARs' at /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/webapps/oozie.war
INFO: Oozie is ready to be started
At this point, the file /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/webapps/oozie.war has been generated.
- Configure Oozie
Edit the conf/oozie-site.xml configuration file as follows:
<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>
        JDBC driver class.
    </description>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://mysql-server:3306/oozie</value>
    <description>
        JDBC URL.
    </description>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>shirdrn</value>
    <description>
        DB user name.
    </description>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>0o21e</value>
    <description>
        DB user password.
        IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
        if empty Configuration assumes it is NULL.
    </description>
</property>
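By default, Oozie's configuration has oozie.service.JPAService.create.db.schema set to false, which disables automatic creation of the database schema. We keep the default, so that the Oozie database can be created manually and access to it controlled. Next, create a database named oozie in MySQL and grant access to it:
CREATE DATABASE oozie;
GRANT ALL ON oozie.* TO 'shirdrn'@'oozie-server' IDENTIFIED BY '0o21e';
FLUSH PRIVILEGES;
Then run the following command to create the tables Oozie needs:
bin/ooziedb.sh create -sqlfile oozie.sql -run
The console output shows no errors, and an oozie.sql script file has been generated in the current directory. The generated tables are visible in the MySQL database, which confirms that the operation succeeded.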
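One way to confirm that the tables exist is to list them from the command line. The sketch below wraps the mysql client in a small function; the function name and the MYSQL_BIN override are hypothetical helpers introduced for illustration, while the host, user, and database names are the ones used in oozie-site.xml above. Because access was granted to 'shirdrn'@'oozie-server', it is assumed to be run on the oozie-server host.

```shell
#!/bin/sh
# Sketch: list the tables in the Oozie database to confirm the schema exists.
# MYSQL_BIN is a hypothetical override so the client binary can be swapped out.
list_oozie_tables() {
    # $1 = MySQL host, $2 = user, $3 = database; -p prompts for the password
    "${MYSQL_BIN:-mysql}" -h "$1" -u "$2" -p -e 'SHOW TABLES' "$3"
}

# Usage (values taken from the configuration above):
#   list_oozie_tables mysql-server shirdrn oozie
```

If the schema was created, the listing should include Oozie's job tables such as WF_JOBS and WF_ACTIONS.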
Oozie can now be started with the following command:
bin/oozied.sh start
Sample startup output:
Setting OOZIE_HOME: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2
Setting OOZIE_CONFIG: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/conf
Sourcing: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/data
Setting OOZIE_LOG: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD: 10
Setting OOZIE_HTTP_HOSTNAME: oozie-server
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Setting OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://oozie-server:11000/oozie
Setting CATALINA_BASE: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/shirdrn/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting CATALINA_OUT: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/logs/catalina.out
Setting CATALINA_PID: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2 -Doozie.config.dir=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/conf -Doozie.log.dir=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/logs -Doozie.data.dir=/home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/data -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=m1 -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://m1:11000/oozie -Doozie.https.keystore.file=/home/shirdrn/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Using CATALINA_BASE: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server
Using CATALINA_HOME: /home/shirdrn/cloud/programs/apache-tomcat-7.0.52
Using CATALINA_TMPDIR: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/temp
Using JRE_HOME: /usr/java/jdk1.7.0_25/
Using CLASSPATH: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/bin/tomcat-juli.jar:/home/shirdrn/cloud/programs/apache-tomcat-7.0.52/bin/bootstrap.jar
Using CATALINA_PID: /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/temp/oozie.pid
As the log above shows, the Oozie web console is at http://oozie-server:11000/oozie, where you can see the graphical interface.
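Besides opening the web console in a browser, the server can be checked from the command line with the oozie client's admin sub-command. The sketch below wraps that call in a function; the function name and the OOZIE_BIN override are hypothetical, and the URL is the base URL printed in the startup log.

```shell
#!/bin/sh
# Sketch: ask a running Oozie server for its status.
# OOZIE_BIN is a hypothetical override for the path to the oozie client script.
check_oozie_status() {
    # $1 = Oozie base URL, as printed in "Setting OOZIE_BASE_URL" at startup
    "${OOZIE_BIN:-oozie}" admin -oozie "$1" -status
}

# Usage:
#   check_oozie_status http://oozie-server:11000/oozie
```

A server that started correctly is expected to report "System mode: NORMAL"; a connection error at this point usually means the server did not come up, in which case logs/catalina.out and logs/oozie.log are the places to look.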
- Integrate Oozie with Hadoop
Our Hadoop platform runs as user shirdrn in group shirdrn, and we configure the Hadoop proxy user to be that same user. The host where Oozie is deployed is named oozie-server. Edit Hadoop's core-site.xml and add the following:
<!-- OOZIE -->
<property>
    <name>hadoop.proxyuser.shirdrn.hosts</name>
    <value>oozie-server</value>
</property>
<property>
    <name>hadoop.proxyuser.shirdrn.groups</name>
    <value>shirdrn</value>
</property>
After making this change, the Hadoop cluster must be restarted for it to take effect.
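The two property names follow the pattern hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups, where <user> is the account the Oozie server runs as. For a different user, host, or group, the fragment can be generated rather than typed by hand. The helper below is a hypothetical convenience function, not part of Hadoop or Oozie; it only prints the XML to paste into core-site.xml.

```shell
#!/bin/sh
# Sketch: print the core-site.xml proxy-user properties for a given account.
# proxyuser_properties is a hypothetical helper, not a Hadoop or Oozie tool.
proxyuser_properties() {
    # $1 = user the Oozie server runs as
    # $2 = host(s) Oozie runs on, $3 = group(s) it may impersonate
    cat <<EOF
<property>
    <name>hadoop.proxyuser.$1.hosts</name>
    <value>$2</value>
</property>
<property>
    <name>hadoop.proxyuser.$1.groups</name>
    <value>$3</value>
</property>
EOF
}

# Usage, reproducing the values used in this article:
#   proxyuser_properties shirdrn oozie-server shirdrn
```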
Installing Oozie Client
Workflow jobs can be submitted from an external Oozie client. This is simply a client program that talks to Oozie Server: it submits the job, and Oozie Server invokes and runs it.
Go back to the directory where the Oozie release package oozie-3.3.2.tar.gz was unpacked. After the earlier build, there is now a client directory there, which contains the Oozie client files. The path containing the Oozie client scripts is, in my case, /home/shirdrn/cloud/programs/oozie-3.3.2/client/target/oozie-client-3.3.2-client/oozie-client-3.3.2.
To see the help for the Oozie client's job commands, run the following:
cd /home/shirdrn/cloud/programs/oozie-3.3.2/client/target/oozie-client-3.3.2-client/oozie-client-3.3.2
bin/oozie help
bin/oozie help job
The examples that ship with the Oozie release can be found, in my case, under /home/shirdrn/cloud/programs/oozie-3.3.2/examples/target/oozie-examples-3.3.2-examples/examples/apps. Running them is a way to verify that the installation works.
First, upload the bundled examples to HDFS:
bin/hadoop fs -mkdir /oozie
bin/hadoop fs -copyFromLocal /home/shirdrn/cloud/programs/oozie-3.3.2/examples/target/oozie-examples-3.3.2-examples/examples /user/shirdrn/examples
We use the map-reduce example for verification. Edit its job.properties file so that it reads as follows:
nameNode=hdfs://m1:9000
jobTracker=m1:19830
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
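Only the nameNode and jobTracker values are specific to a cluster, so the file can also be produced by a small script when the same example is run against several environments. This is a minimal sketch: write_job_properties is a hypothetical helper, and the remaining keys are copied from the file shown above.

```shell
#!/bin/sh
# Sketch: generate job.properties for the map-reduce example.
# write_job_properties is a hypothetical helper introduced for illustration.
write_job_properties() {
    # $1 = NameNode URI, $2 = JobTracker host:port, $3 = output file
    # \${...} is escaped so the placeholders reach Oozie unexpanded.
    cat > "$3" <<EOF
nameNode=$1
jobTracker=$2
queueName=default
examplesRoot=examples
oozie.wf.application.path=\${nameNode}/user/\${user.name}/\${examplesRoot}/apps/map-reduce
outputDir=map-reduce
EOF
}

# Usage, with the addresses from this article:
#   write_job_properties hdfs://m1:9000 m1:19830 job.properties
```

The ${nameNode}, ${user.name}, and ${examplesRoot} references are resolved by Oozie at submission time, which is why they are written to the file literally.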
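In my environment, the NameNode address is hdfs://m1:9000 and the JobTracker is m1:19830. Run the job with the following commands:
cd /home/shirdrn/cloud/programs/oozie-3.3.2/client/target/oozie-client-3.3.2-client/oozie-client-3.3.2
bin/oozie job -oozie http://oozie-server:11000/oozie -config /home/shirdrn/cloud/programs/oozie-3.3.2/examples/target/oozie-examples-3.3.2-examples/examples/apps/map-reduce/job.properties -run
The submitted job can be viewed in the Oozie web console.
[Figure: the submitted job in the Oozie web console]
The job's configuration, run status, and other details are available there as well.
[Figure: job configuration and run status in the Oozie web console]
The -run option in the command above runs a job directly. Other options are available too, such as -submit to submit a job, -rerun to rerun a job, and -suspend to suspend a job; see the command help or the documentation for details.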
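Options that act on an existing job take the job ID printed by -run or -submit. The sketch below wraps the common ones in a function; oozie_job, OOZIE_BIN, and the job ID in the usage comments are hypothetical, and OOZIE_URL defaults to the server address used in this article.

```shell
#!/bin/sh
# Sketch: act on an existing Oozie job by ID.
# OOZIE_BIN is a hypothetical override for the path to the oozie client script.
OOZIE_URL="${OOZIE_URL:-http://oozie-server:11000/oozie}"

oozie_job() {
    # $1 = action flag (-info, -start, -suspend, -resume, -kill), $2 = job ID
    "${OOZIE_BIN:-oozie}" job -oozie "$OOZIE_URL" "$1" "$2"
}

# Usage, with a made-up job ID:
#   oozie_job -info    0000000-140101000000000-oozie-shir-W
#   oozie_job -suspend 0000000-140101000000000-oozie-shir-W
#   oozie_job -resume  0000000-140101000000000-oozie-shir-W
#   oozie_job -kill    0000000-140101000000000-oozie-shir-W
```

A job created with -submit stays in the PREP state until it is started with -start, whereas -run submits and starts it in one step.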
References
- http://oozie.apache.org/docs/3.3.2/DG_QuickStart.html
- http://practicalcloudcomputing.com/post/26337621577/installing-and-running-apache-oozie-3-2-x-and-possibly?543b50f0
- http://www.cnblogs.com/cenyuhai/p/3263756.html
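This article is published under the Attribution-NonCommercial-ShareAlike 4.0 license. You are welcome to repost, use, and republish it, but you must keep the attribution to the author, Shi Yanjun (时延军), including the link http://shiyanjun.cn. It may not be used for commercial purposes, and works derived from this article must be released under the same license. If you have any questions, please contact me.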
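<groupId>org.apache.oozie</groupId>
<artifactId>oozie-hadoop-distcp</artifactId>
<version>${distcp.version}</version>
Hello. When I compile, this jar cannot be found. What is going on?
I searched the whole Maven repository and this jar is not in it.
Which version are you using? oozie-hadoop-distcp should ship with Oozie itself; the jar is generated while Oozie is being built.
Hello, I followed your tutorial to install Oozie and had no problems up to starting Oozie, but when I open http://oozie-server:11000/oozie it tells me to "install the Ext JS library". Is it enough to put ext-2.2.zip directly into the final $OOZIE_HOME, without unzipping it or doing any other Ext installation?
Solved, thanks.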
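Hello, I put ext-2.2.zip under $OOZIE_HOME, but at the step bin/oozie-setup.sh prepare-war Oozie reports "webconsole disabled, ExtJS library not specified". How can I fix this?
Are you sure OOZIE_HOME is set correctly? Perhaps you added the OOZIE_HOME variable to ~/.bashrc but did not run source ~/.bashrc afterwards. Most likely it is a path problem.
The original problem is solved, thanks! One more question: I am using Hadoop 2.6, which has no JobTracker configuration. What should "jobTracker=m1:19830" be set to?
Solved, thanks.
You need to configure the ResourceManager (RM) address. In 1.x, job submission talks to the JobTracker (JT); in 2.x it talks to the RM. The default RM port should be 8032, so try that.
I ran a job and its status stays at "SUSPENDED"; I never see it succeed. Is something configured incorrectly?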
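After startup I can log in, but clicking Documentation on the page gives: HTTP Status 404 – /oozie/docs/index.html. Does the author know why?
Hello, when you start Oozie this way, are you using the Tomcat you installed yourself rather than the bundled one? My understanding is that with your own Tomcat you would have to access Oozie through that Tomcat, so shouldn't the WAR be copied into Tomcat's webapps directory? I am a beginner and do not understand this; any pointers would be appreciated.
I verified this. If you do not use the Tomcat bundled with Oozie, you need to build the WAR with bin/addtowar.sh, copy it into Tomcat's webapps directory, and access it at http://<local IP>:8080/oozie. Following the author's steps, the Tomcat version printed when Oozie starts is still that of the bundled Tomcat.
Hello, a question: when using Oozie in production, do I need to install it on every node, or only on a designated node? Thanks.
No, installing it on the NameNode host is enough. As for whether it can be installed on a DataNode host, I have not tried.
Thank you very much. One more question: can Oozie send an email or an SMS to a specified user when a job fails or reports an error?
Sorry, I have only just started with Oozie because my company needs it, so I am not sure. I did notice a mail.jar among the jars bundled with Oozie, so sending email should be possible. I do not know about SMS.
Thank you very much. While installing today I ran into HTTP Status 500 – java.lang.NoSuchMethodError: org.eclipse.jdt.internal.compiler.CompilationResult.getProblems()[Lorg/eclipse/jdt/core/compiler/IProblem; Have you seen this? Could we add each other as contacts and learn from each other? Thanks. QQ: 294386287
Hello, I installed Oozie on hadoop-2.3.0. Could you tell me how you carried out the step "copy the Hadoop jar files into libext"? My installation keeps failing because of jar problems. If yours succeeded, could you share what made the difference? Many thanks.
I ran into the same problem. Did you solve it?
Solved; it was a jar conflict. Delete the following files under /usr/lib/oozie/libexec/:
(1) jasper*.jar
(2) jsp-api*.jar
(3) servlet-api*.jar
The problem is solved, thanks. One more question: when integrating Oozie with Hadoop, does the change have to be made on every machine?
I got an error installing with CDH and do not know what went wrong. Could you help me find the cause?
Failed to create Oozie database tables.
Program: oozie/oozie.sh ["db-command","create"]
Installing Oozie requires creating database tables, and the most common cause is missing permissions on the database. Check the permissions of the database account, or look at what the logs say.
Hello! In your Oozie startup output, the Tomcat version matches the one you installed, but when I follow your steps the startup output shows the bundled Tomcat's version. Did you change anything else?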
When I run a job I get: 2015-08-21 05:03:19,709 INFO BaseJobServlet:539 – USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] AuthorizationException
org.apache.oozie.service.AuthorizationException: E0902: Exception occured: [Call to master/192.168.52.129:9000 failed on local exception: java.io.IOException: Broken pipe]
at org.apache.oozie.service.AuthorizationService.authorizeForApp(AuthorizationService.java:401)
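What should I do? I have been searching for a long time without finding a solution.
Your Hadoop cluster probably has authorization enabled. Check the Hadoop configuration files you modified when setting up Oozie; the problem is most likely there.
<property>
<name>oozie.service.AuthorizationService.security.enabled</name>
<value>false</value>
<description>
Specifies whether security (user name/admin role) is enabled or not.
If disabled any user can manage Oozie system and manage any job.
</description>
</property>
org.apache.oozie.service.AuthorizationService is indeed enabled,
but the value is false.
org.apache.oozie.servlet.XServletException: E0902: Exception occured: [Call to master/192.168.52.129:9000 failed on local exception: java.io.IOException: Broken pipe]
This is the overall error description in the log.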
at org.apache.oozie.servlet.BaseJobServlet.checkAuthorizationForApp(BaseJobServlet.java:201)
at org.apache.oozie.servlet.BaseJobsServlet.doPost(BaseJobsServlet.java:97)
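These are the first two lines of the detailed error in the log.
I spent the whole day on this and still have not solved it, but I noticed one thing: right after Oozie starts, submitting a job fails with java.io.EOFException, and only the second submission fails with java.io.IOException: Broken pipe.
The log for the java.io.EOFException is as follows: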
org.apache.oozie.servlet.XServletException: E0902
at org.apache.oozie.servlet.BaseJobServlet.checkAuthorizationForApp(BaseJobServlet.java:201)
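During installation, at the step bin/oozie-setup.sh prepare-war: I also downloaded oozie-3.3.2, but there is no oozie-setup.sh in its bin directory.
This job clearly did not run successfully. Its status is SUSPENDED and mr_node shows an exception. Was that never resolved?
Hello, I get an error when running a job: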
2015-12-29 10:47:06,058 WARN ActionStartXCommand:523 – SERVER[myserver] USER[hadoop] GROUP[-] TOKEN[] APP[hive-wf] JOB[0000005-151224163948213-oozie-hado-W] ACTION[0000005-151224163948213-oozie-hado-W@hive-node] Error starting action [hive-node]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Unknown protocol: org.apache.hadoop.yarn.api.ApplicationClientProtocolPB
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.getProtocolImpl(ProtobufRpcEngine.java:527)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:566)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
]
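In job.properties I have
jobTracker= myserver:8031
which matches the RM setting:
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>myserver:8031</value>
Is something wrong in the configuration?
I found the problem.
It should be:
<name>yarn.resourcemanager.address</name>
<value>myserver:8032</value>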
[root@localhost oozie-4.2.0]# bin/oozied.sh start
Setting OOZIE_HOME: /oozie/oozie4.2.0/oozie-4.2.0
Setting OOZIE_CONFIG: /oozie/oozie4.2.0/oozie-4.2.0/conf
Sourcing: /oozie/oozie4.2.0/oozie-4.2.0/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /oozie/oozie4.2.0/oozie-4.2.0/data
Setting OOZIE_LOG: /oozie/oozie4.2.0/oozie-4.2.0/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD: 10
Setting OOZIE_HTTP_HOSTNAME: localhost
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Setting OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://localhost:11000/oozie
Setting CATALINA_BASE: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /root/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID: localhost
Setting CATALINA_OUT: /oozie/oozie4.2.0/oozie-4.2.0/logs/catalina.out
Setting CATALINA_PID: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/oozie/oozie4.2.0/oozie-4.2.0/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/oozie/oozie4.2.0/oozie-4.2.0 -Doozie.config.dir=/oozie/oozie4.2.0/oozie-4.2.0/conf -Doozie.log.dir=/oozie/oozie4.2.0/oozie-4.2.0/logs -Doozie.data.dir=/oozie/oozie4.2.0/oozie-4.2.0/data -Doozie.instance.id=localhost -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=localhost -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://localhost:11000/oozie -Doozie.https.keystore.file=/root/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Setting up oozie DB
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
DB schema exists
The SQL commands have been written to: /tmp/ooziedb-1785982097148799406.sql
Using CATALINA_BASE: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server
Using CATALINA_HOME: /oozie/tomcat7/apache-tomcat-7.0.52
Using CATALINA_TMPDIR: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server/temp
Using JRE_HOME: /usr/java/jdk1.7.0_79
Using CLASSPATH: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server/bin/tomcat-juli.jar:/oozie/tomcat7/apache-tomcat-7.0.52/bin/bootstrap.jar
Using CATALINA_PID: /oozie/oozie4.2.0/oozie-4.2.0/oozie-server/temp/oozie.pid
Existing PID file found during start.
Removing/clearing stale PID file.
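I open http://localhost:11000/oozie but cannot see the graphical interface.
Could you take a look and tell me what is wrong? I would appreciate a reply.
Is that localhost at the end the machine where Oozie is installed?
Solved. Running jps showed that one of the services had not started.
There is a sentence in your article I do not quite understand: "At this point, the file /home/shirdrn/cloud/programs/oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/oozie-server/webapps/oozie.war has been generated." What is oozie.war for? It does not seem to be used anywhere in the article. Also, can oozie.war be placed under Tomcat on Windows and run directly?
Oozie accepts job submissions over HTTP and also provides a web UI; oozie.war is the web application behind both.
When I start Oozie, it also reports that ExtJS is not installed. What causes this?
Why use something this complicated? Azkaban is much easier to use.
Oozie needs both a server and a client installed, which is too much trouble.
Oozie is generally chosen when a company's business is fairly complex and large enough in scale.
For smaller companies, Azkaban is more flexible. Each of the two frameworks has its own strengths and weaknesses, though, so weigh the choice against your actual situation.
As a beginner, I ran bin/oozie-setup.sh sharelib create -fs hdfs://z01:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz and hit the following problem:
unknown protocol org.apache.hadoop.hdfs.protocol.clientprotocol
I would be grateful if someone with time could help explain this.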