docker – Sqoop – 导入作业失败

我试图通过Sqoop将一个包含3200万条记录的表从sql Server导入Hive.连接是sql Server成功的.但Map / Reduce作业无法成功执行.它给出以下错误

18/07/19 04:00:11 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
18/07/19 04:00:27 DEBUG db.DBConfiguration: Fetching password from job credentials store
18/07/19 04:00:27 INFO db.DBInputFormat: Using read commited transaction isolation
18/07/19 04:00:27 DEBUG db.DataDrivendBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1'
18/07/19 04:00:28 INFO mapreduce.JobSubmitter: number of splits:1
18/07/19 04:00:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1531917395459_0002
18/07/19 04:00:30 INFO impl.YarnClientImpl: Submitted application application_1531917395459_0002
18/07/19 04:00:30 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1531917395459_0002/
18/07/19 04:00:30 INFO mapreduce.Job: Running job: job_1531917395459_0002
    18/07/19 04:43:02 INFO mapreduce.Job: Job job_1531917395459_0002 running in uber mode : false
18/07/19 04:43:03 INFO mapreduce.Job:  map 0% reduce 0%
18/07/19 04:43:04 INFO mapreduce.Job: Job job_1531917395459_0002 Failed with state Failed due to: Application application_1531917395459_0002 Failed 2 times due to ApplicationMaster for attempt appattempt_1531917395459_0002_000002 timed out. Failing the application.
18/07/19 04:43:08 INFO mapreduce.Job: Counters: 0
18/07/19 04:43:08 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/07/19 04:43:09 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 2,576.6368 seconds (0 bytes/sec)
18/07/19 04:43:10 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/07/19 04:43:10 INFO mapreduce.ImportJobBase: Retrieved 0 records.
18/07/19 04:43:10 ERROR tool.ImportTool: Error during import: Import job Failed!

以下是yarn-site.xml文件的配置

figuration.xsl"?>

figuration>
  dispatcher.exit-on-errorirsirssspath for typical applications.sspathfiguration>

首先,当通过0.0.0.0:8032连接资源管理器时,工作被卡住了.所以我将主机更改为127.0.0.1.然后执行继续进行.但后来发生了上述错误.即使我尝试只用1000行执行此作业,但同样的错误.此外,有时候工作会被杀死.

这是我的sqoop命令

sqoop import --connect "jdbc:sqlserver://system-ip;databaseName=TEST" --driver com.microsoft.sqlserver.jdbc.sqlServerDriver --username user1 --password password --hive-import --create-hive-table --hive-table "customer_data_1000" --table "customer_data_1000" --split-by Account_Branch_Converted -m 1 --verbose

这是我的docker命令,以防万一:

  docker run --hostname=quickstart.cloudera --privileged=true -t -p 127.0.0.1:8888:8888 -p 127.0.

0.1:7180:7180 -p 127.0.0.1:50070:50070 -i 7c41929668d8 /usr/bin/docker-quickstart

这是资源管理器日志:

          2018-07-26 07:18:26,439 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:appattempt_1532588462827_0001_000001 Timed out after 600 secs
        2018-07-26 07:24:03,059 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1532588462827_0001_000001 with final state: Failed,and exit status: -1000
        2018-07-26 07:35:46,609 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000001 State change from LAUNCHED to FINAL_SAVING
        2018-07-26 07:35:49,502 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:quickstart.cloudera:36003 Timed out after 600 secs
        2018-07-26 07:39:44,485 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node quickstart.cloudera:36003 as it is Now LOST
        2018-07-26 07:44:39,238 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: quickstart.cloudera:36003 Node Transitioned from RUNNING to LOST
        2018-07-26 07:45:09,895 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1532588462827_0001_000001
        2018-07-26 07:49:43,848 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Node not found resyncing quickstart.cloudera:36003
        2018-07-26 07:49:43,916 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished,removing password for appattempt_1532588462827_0001_000001
        2018-07-26 07:49:45,738 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000001 State change from FINAL_SAVING to Failed
        2018-07-26 07:49:47,095 WARN org.apache.hadoop.ipc.Server: IPC Server handler 12 on 8032,call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 127.0.0.1:45162 Call#608 Retry#0: output error
        2018-07-26 07:49:47,100 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of Failed attempts is 1. The max attempts is 2
        2018-07-26 07:49:47,887 INFO org.apache.hadoop.ipc.Server: IPC Server handler 12 on 8032 caught an exception
        java.nio.channels.ClosedChannelException
                at sun.nio.ch.socketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:265)
                at sun.nio.ch.socketChannelImpl.write(SocketChannelImpl.java:474)
                at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2621)
                at org.apache.hadoop.ipc.Server.access$1900(Server.java:134)
                at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:989)
                at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1054)
                at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2141)
        2018-07-26 07:49:49,127 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1532588462827_0001_000002
        2018-07-26 07:49:49,127 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000002 State change from NEW to SUBMITTED
        2018-07-26 07:49:49,127 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1532588462827_0001_000001
        2018-07-26 07:49:50,458 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_01_000001 Container Transitioned from RUNNING to KILLED
        2018-07-26 07:49:50,459 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1532588462827_0001_01_000001 in state: KILLED event:KILL
        2018-07-26 07:49:50,460 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1532588462827_0001    CONTAINERID=container_1532588462827_0001_01_000001
        2018-07-26 07:49:50,550 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1532588462827_0001_01_000001 of capacity irscheduler: Application attempt appattempt_1532588462827_0001_000001 released container container_1532588462827_0001_01_000001 on node: host: quickstart.cloudera:36003 #containers=0 available=8192 used=0 with event: KILL
        2018-07-26 07:49:50,580 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Removed node quickstart.cloudera:36003 cluster capacity: irscheduler: Application appattempt_1532588462827_0001_000001 is done. finalState=Failed
        2018-07-26 07:49:50,581 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1532588462827_0001 requests cleared
        2018-07-26 07:49:51,860 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Added Application Attempt appattempt_1532588462827_0001_000002 to scheduler from user: root
        2018-07-26 07:49:52,125 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000002 State change from SUBMITTED to SCHEDULED
        2018-07-26 07:50:04,533 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: quickstart.cloudera:36003 Node Transitioned from NEW to RUNNING
        2018-07-26 07:50:04,534 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Added node quickstart.cloudera:36003 cluster capacity: irscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:06,022 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000001 Container Transitioned from NEW to ALLOCATED
        2018-07-26 07:50:06,023 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1532588462827_0001    CONTAINERID=container_1532588462827_0001_02_000001
        2018-07-26 07:50:06,025 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1532588462827_0001_02_000001 of capacity ecurity.NMTokenSecretManagerInRM: Sending NMToken for nodeId : quickstart.cloudera:36003 for container : container_1532588462827_0001_02_000001
        2018-07-26 07:50:06,026 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
        2018-07-26 07:50:06,026 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1532588462827_0001_000002
        2018-07-26 07:50:06,026 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1532588462827_0001 AttemptId: appattempt_1532588462827_0001_000002 MasterContainer: Container: [ContainerId: container_1532588462827_0001_02_000001,NodeId: quickstart.cloudera:36003,NodeHttpAddress: quickstart.cloudera:8042,Resource: terappattempt_1532588462827_0001_000002
        2018-07-26 07:50:06,027 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1532588462827_0001_02_000001,] for AM appattempt_1532588462827_0001_000002
        2018-07-26 07:50:06,027 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1532588462827_0001_02_000001 : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=ecurity.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1532588462827_0001_000002
        2018-07-26 07:50:06,027 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1532588462827_0001_000002
        2018-07-26 07:50:06,128 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1532588462827_0001_02_000001,129 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000002 State change from ALLOCATED to LAUNCHED
        2018-07-26 07:50:06,953 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000001 Container Transitioned from ACQUIRED to RUNNING
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:32,951 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1532588462827_0001_000002 (auth:SIMPLE)
        2018-07-26 07:50:33,014 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM registration appattempt_1532588462827_0001_000002
emanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:34,887 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:34,893 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000002 Container Transitioned from NEW to ALLOCATED
        2018-07-26 07:50:34,893 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1532588462827_0001    CONTAINERID=container_1532588462827_0001_02_000002
        2018-07-26 07:50:34,894 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1532588462827_0001_02_000002 of capacity irscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:36,478 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : quickstart.cloudera:36003 for container : container_1532588462827_0001_02_000002
        2018-07-26 07:50:36,479 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000002 Container Transitioned from ALLOCATED to ACQUIRED
        2018-07-26 07:50:36,898 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_01_000001 completed with event FINISHED
        2018-07-26 07:50:38,113 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000002 Container Transitioned from ACQUIRED to RUNNING
        2018-07-26 07:50:38,113 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate...
        2018-07-26 07:50:54,379 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1532588462827_0001_02_000002 Container Transitioned from RUNNING to COMPLETED
        2018-07-26 07:50:54,525 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1532588462827_0001_02_000002 in state: COMPLETED event:FINISHED
        2018-07-26 07:50:54,553 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1532588462827_0001    CONTAINERID=container_1532588462827_0001_02_000002
        2018-07-26 07:50:54,555 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1532588462827_0001_02_000002 of capacity irscheduler: Application attempt appattempt_1532588462827_0001_000002 released container container_1532588462827_0001_02_000002 on node: host: quickstart.cloudera:36003 #containers=1 available=6144 used=2048 with event: FINISHED
        2018-07-26 07:50:55,386 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_02_000002 completed with event FINISHED
    438 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1532588462827_0001_02_000001 in state: COMPLETED event:FINISHED
        2018-07-26 07:51:00,438 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1532588462827_0001    CONTAINERID=container_1532588462827_0001_02_000001
        2018-07-26 07:51:00,438 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1532588462827_0001_02_000001 of capacity Failed,and exit status: 0
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000002 State change from RUNNING to FINAL_SAVING
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1532588462827_0001_000002
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished,removing password for appattempt_1532588462827_0001_000002
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1532588462827_0001_000002 State change from FINAL_SAVING to Failed
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of Failed attempts is 2. The max attempts is 2
        2018-07-26 07:51:00,439 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1532588462827_0001 with final state: Failed
        2018-07-26 07:51:00,457 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1532588462827_0001 State change from RUNNING to FINAL_SAVING
        2018-07-26 07:51:00,458 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1532588462827_0001
        2018-07-26 07:51:00,458 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Application appattempt_1532588462827_0001_000002 is done. finalState=Failed
        2018-07-26 07:51:00,458 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1532588462827_0001 requests cleared
        2018-07-26 07:51:00,458 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1532588462827_0001_000002
        2018-07-26 07:51:05,760 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1532588462827_0001 Failed 2 times due to AM Container for appattempt_1532588462827_0001_000002 exited with  exitCode: 0
        For more detailed output,check application tracking page:http://quickstart.cloudera:8088/proxy/application_1532588462827_0001/Then,click on links to logs of each attempt.
        Diagnostics: Failing this attempt. Failing the application.
        2018-07-26 07:51:05,781 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1532588462827_0001 State change from FINAL_SAVING to Failed
        2018-07-26 07:51:05,785 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root     OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App Failed with state: Failed       PERMISSIONS=Application application_1532588462827_0001 Failed 2 times due to AM Container for appattempt_1532588462827_0001_000002 exited with  exitCode: 0
        For more detailed output,click on links to logs of each attempt.
        Diagnostics: Failing this attempt. Failing the application.     APPID=application_1532588462827_0001
        2018-07-26 07:51:05,819 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1532588462827_0001,name=customer_data_1000.jar,user=root,queue=root.root,state=Failed,trackingUrl=http://quickstart.cloudera:8088/cluster/app/application_1532588462827_0001,appMasterHost=N/A,startTime=1532588804719,finishTime=1532591460451,finalStatus=Failed
        2018-07-26 07:51:05,821 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_02_000001 completed with event FINISHED
        2018-07-26 07:51:05,822 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Fairscheduler: Container container_1532588462827_0001_02_000002 completed with event FINISHED

我哪里错了?

最佳答案
我不能给你一个精确的解决方案,但我能做的是告诉你根本原因是什么:

>尝试以非root用户身份运行Sqoop作业.
>检查主机上是否正确安装了JDK,并正确设置了JAVA_HOME.
>检查您是否已授予您正在使用的数据库的正确权限.

由于上述原因之一,您的工作失败了.因为你有足够的v-cores和内存可用,并且还在创建容器.因此,处理端的所有内容都很好,但必须存在配置错误.

相关文章

Docker是什么Docker是 Docker.Inc 公司开源的一个基于 LXC技...
本文为原创,原始地址为:http://www.cnblogs.com/fengzheng...
镜像操作列出镜像:$ sudo docker imagesREPOSITORY TAG IMA...
本文原创,原文地址为:http://www.cnblogs.com/fengzheng/p...
在 Docker 中,如果你修改了一个容器的内容并希望将这些更改...
在Docker中,--privileged 参数给予容器内的进程几乎相同的权...