Using Oozie

Problem description

I have an Oozie workflow containing a Pig action that generates a part file as output:

/user/wf_user/app_dir/output/part-v003-o000-r-00100

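After the Pig action there is an fs action that creates a done-flag file, moves part-v003-o000-r-00100 to alert_message (to rename it), and then changes the permissions on the file path /user/wf_user/app_dir/output/alert_message so that the file is accessible to all subsequent workflow actions.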

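After that, a decision control node checks whether the file /user/wf_user/app_dir/output/alert_message exists and has a size greater than zero. The alert message is emailed only if the size is non-zero.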

However, even when the file exists and its size is non-zero, the decision condition always evaluates to false, so the alert message is never emailed to the notify users:

<switch xmlns="uri:oozie:workflow:0.4">
  <case to="message_pref_alert">false</case>
  <default to="success_email" />
</switch>

Below is an excerpt of the relevant workflow actions:

<action name='generate_preftable_pref_count_report' cred="hcatauth,athensauth">
    <pig>
        <prepare>
            <delete path="${flag_dir}"></delete>
        </prepare>
        <script>generate_diffcount_w_perc_mktg_prefs.pig</script>
        <param>today=${today}</param>
        <param>prev_date=${prev_date}</param>
        <param>lake_tahoe_dump=${lake_tahoe_dump}</param>
        <param>current_pref_snapshot=${current_pref_snapshot}</param>
        <param>preference_user=${preference_user}</param>
        <param>pref_count_report=${pref_count_report}</param>
        <param>flag_dir=${flag_dir}</param>
        <file>${common_lib}/elephant-bird-pig.jar#elephant-bird-pig.jar</file>
        <file>${common_lib}/elephant-bird-core.jar#elephant-bird-core.jar</file>
        <file>${common_lib}/elephant-bird-hadoop-compat.jar#elephant-bird-hadoop-compat.jar</file>
    </pig>
    <ok to="fs-create-report-flag-and-alert" />
    <error to="failure_email" />
</action>

<action name="fs-create-report-flag-and-alert">
    <fs>
        <chmod path='${flag_dir}' permissions='-rwxrwxrwx' dir-files='true'/>
        <delete path='${flag_dir}/report_generated_${prev_date}'/>
        <touchz path='${flag_dir}/report_generated_${today}'/>
        <move source='${flag_dir}/part*' target='${alert_message_file}' />
        <chmod path='${alert_message_file}' permissions='-rwxrwxrwx' dir-files='true'/>
    </fs>
    <ok to="if_alert_prefs_present"/>
    <error to="failure_email"/>
</action>

<decision name="if_alert_prefs_present">
    <switch>
        <case to="message_pref_alert">${(fs:exists('${alert_message_file}')) and (fs:fileSize('${alert_message_file}') gt 0 )}</case>
        <default to="success_email"/>
    </switch>
</decision>

<action name="message_pref_alert">
    <email xmlns="uri:oozie:email-action:0.1">
        <to>${notify_to}</to>
        <subject>PrefTable-PrefCount-Update-${today} : Pref Count Alert</subject>
        <body>In today's pref counts,the pref count difference >= 5% for some sub/unsub preferences. Please check the below file for sub/unsub details. ${alert_message_file} For further details on count and percentage difference,please check the hive table ${pref_count_report} .
        </body>
    </email>
    <ok to="success_email"/>
    <error to="failure_email"/>
</action>

Note: I have not set ${alert_message_file} as a workflow configuration property; it is set only in the coordinator properties file.

I also looked at other similar SO discussions on the same topic: How to check whether the file exist in HDFS location,using oozie?

Solution

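The error was resolved by passing ${alert_message_file} as a workflow configuration property. The variable alert_message_file (defined in the properties file) is not accessible to the workflow unless it is passed in through a configuration property; otherwise a "variable cannot be resolved" error occurs.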

In workflow.xml:

<configuration>
    <property>
        ...
    </property>  
    <property>
        <name>alert_message_file</name>
        <value>${alert_message_file}</value>
    </property>
</configuration>
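
The question says the property is defined only in the coordinator properties file, so one place a block like this can sit is inside the coordinator's <workflow> element, which hands properties down to the workflow job. The following is a minimal sketch under that assumption, not the author's actual coordinator: the app name, schedule attributes, and the workflow_app_path variable are hypothetical placeholders.

```xml
<!-- Sketch only: name, frequency, start/end and workflow_app_path are hypothetical. -->
<coordinator-app name="pref-count-report-coord"
                 frequency="${coord:days(1)}"
                 start="${start_time}" end="${end_time}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <app-path>${workflow_app_path}</app-path>
            <configuration>
                <!-- Passes the coordinator-level value down to the workflow job -->
                <property>
                    <name>alert_message_file</name>
                    <value>${alert_message_file}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
```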

Then change the decision node as follows:

<decision name="if_alert_prefs_present">
    <switch>
        <case to="message_pref_alert">${(fs:exists(wf:conf('alert_message_file'))) and (fs:fileSize(wf:conf('alert_message_file')) gt 0 )}</case>
        <default to="success_email"/>
    </switch>
</decision>
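
A likely reason the original predicate was always false: inside an EL expression, the quoted '${alert_message_file}' is a plain string literal, so fs:exists is asked about a path literally named ${alert_message_file} rather than the variable's value. As an alternative to wf:conf(), a property whose name is a valid Java identifier can usually be referenced as a bare EL variable. The sketch below assumes alert_message_file is already present in the workflow job configuration.

```xml
<!-- Sketch only: assumes alert_message_file is available in the workflow job
     configuration and is a valid Java identifier, so EL resolves it unquoted. -->
<decision name="if_alert_prefs_present">
    <switch>
        <case to="message_pref_alert">${fs:exists(alert_message_file) and fs:fileSize(alert_message_file) gt 0}</case>
        <default to="success_email"/>
    </switch>
</decision>
```

The wf:conf() form shown above remains the safer choice when a property name contains characters such as dots or hyphens, since those cannot be used as bare EL variable names.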