Sqoop job gets killed automatically

Problem description

I am trying to use Apache Sqoop to load data from MySQL on a remote machine into HDFS with the following command:

sqoop import --connect jdbc:mysql://<IP>/<DB> --table dashboard_data --username <user> --password <pass> --fields-terminated-by ',' --target-dir '/user/mohit/dashboard_data'
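Side note, unrelated to the kill: the output below warns that passing --password on the command line is insecure. For reference, a sketch of the same import using Sqoop's -P prompt or --password-file instead (the path /user/mohit/.mysql.pwd is just a placeholder):

sqoop import --connect jdbc:mysql://<IP>/<DB> --table dashboard_data --username <user> -P --fields-terminated-by ',' --target-dir '/user/mohit/dashboard_data'
sqoop import --connect jdbc:mysql://<IP>/<DB> --table dashboard_data --username <user> --password-file /user/mohit/.mysql.pwd --fields-terminated-by ',' --target-dir '/user/mohit/dashboard_data'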

Initially, the MapReduce job starts and begins importing data. However, after a while the Sqoop job is killed automatically, having imported only part of the data. Here is the output:

sqoop import --connect jdbc:mysql://<IP>/<DB> --table dashboard_data --username <username> --password <password> --fields-terminated-by ',' --target-dir '/user/mohit/dashboard_data'
Warning: /home/hadoop/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2021-03-18 12:50:43,868 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2021-03-18 12:50:44,033 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2021-03-18 12:50:44,227 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2021-03-18 12:50:44,227 INFO tool.CodeGenTool: Beginning code generation
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2021-03-18 12:50:44,877 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dashboard_data` AS t LIMIT 1
2021-03-18 12:50:44,937 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dashboard_data` AS t LIMIT 1
2021-03-18 12:50:44,952 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop
Note: /tmp/sqoop-hadoop/compile/78a1e7883adda2c73bc011562c3849da/dashboard_data.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2021-03-18 12:50:48,571 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/78a1e7883adda2c73bc011562c3849da/dashboard_data.jar
2021-03-18 12:50:48,601 WARN manager.MySQLManager: It looks like you are importing from mysql.
2021-03-18 12:50:48,602 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
2021-03-18 12:50:48,602 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
2021-03-18 12:50:48,602 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2021-03-18 12:50:48,610 INFO mapreduce.ImportJobBase: Beginning import of dashboard_data
2021-03-18 12:50:48,611 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2021-03-18 12:50:48,780 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-18 12:50:48,815 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2021-03-18 12:50:49,653 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2021-03-18 12:50:49,802 INFO client.RMProxy: Connecting to ResourceManager at /192.168.175.122:5555
2021-03-18 12:50:50,689 INFO mapreduce.JobResourceUploader: disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1616071829488_0001
2021-03-18 12:50:50,902 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:51,500 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:51,550 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:51,606 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:52,087 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:52,547 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:53,002 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:53,448 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:53,914 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:54,374 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:54,419 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:54,857 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:54,908 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:55,352 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:55,841 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:56,311 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:56,759 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:57,211 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:57,664 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:58,109 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:58,993 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:59,081 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:59,531 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:50:59,566 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:00,015 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:00,455 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:00,543 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:00,978 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:01,017 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:01,059 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:01,498 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:01,947 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:01,984 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:02,430 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:02,460 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:02,490 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:02,926 INFO db.DBInputFormat: Using read commited transaction isolation
2021-03-18 12:51:02,927 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `dashboard_data`
2021-03-18 12:51:02,930 INFO db.IntegerSplitter: Split size: 3247737; Num splits: 4 from: 47453902 to: 60444853
2021-03-18 12:51:02,963 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:03,435 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:03,849 INFO mapreduce.JobSubmitter: number of splits:4
2021-03-18 12:51:04,022 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-03-18 12:51:04,452 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1616071829488_0001
2021-03-18 12:51:04,452 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-18 12:51:04,825 INFO conf.Configuration: resource-types.xml not found
2021-03-18 12:51:04,825 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-18 12:51:05,374 INFO impl.YarnClientImpl: Submitted application application_1616071829488_0001
2021-03-18 12:51:05,552 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1616071829488_0001/
2021-03-18 12:51:05,552 INFO mapreduce.Job: Running job: job_1616071829488_0001
2021-03-18 12:51:17,885 INFO mapreduce.Job: Job job_1616071829488_0001 running in uber mode : false
2021-03-18 12:51:17,887 INFO mapreduce.Job:  map 0% reduce 0%
Killed
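
To get more detail on why the job was killed, these are the commands I would try next (my own diagnostic sketch, not part of the original output; the application id is the one printed above):

yarn logs -applicationId application_1616071829488_0001
yarn application -status application_1616071829488_0001
dmesg | grep -i -E 'killed process|out of memory'

The dmesg check is only relevant if the Sqoop client process itself was killed by the OS rather than by YARN.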

Partial data:

hadoop fs -ls /user/mohit/dashboard_data
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2021-03-18 13:03:53,603 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 5 items
-rw-r--r--   1 hadoop supergroup          0 2021-03-18 12:55 /user/mohit/dashboard_data/_SUCCESS
-rw-r--r--   1 hadoop supergroup  497481093 2021-03-18 12:54 /user/mohit/dashboard_data/part-m-00000
-rw-r--r--   1 hadoop supergroup  551443674 2021-03-18 12:55 /user/mohit/dashboard_data/part-m-00001
-rw-r--r--   1 hadoop supergroup  551978352 2021-03-18 12:55 /user/mohit/dashboard_data/part-m-00002
-rw-r--r--   1 hadoop supergroup  558321276 2021-03-18 12:55 /user/mohit/dashboard_data/part-m-00003
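
To double-check how much actually landed in HDFS against the source table, a rough check (my own sketch, using the same paths and table as above) would be:

hadoop fs -du -s -h /user/mohit/dashboard_data
hadoop fs -cat /user/mohit/dashboard_data/part-m-* | wc -l

and on the MySQL side: SELECT COUNT(*) FROM dashboard_data;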

Here is a snippet from the ResourceManager log:

2021-03-18 12:51:21,641 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Allocation proposal accepted
2021-03-18 12:51:22,164 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate of application :application_1616071829488_0001
2021-03-18 12:51:22,182 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1616071829488_0001_01_000006 Container Transitioned from ALLOCATED to ACQUIRED
2021-03-18 12:51:23,203 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1616071829488_0001_01_000006 Container Transitioned from ACQUIRED to RELEASED
2021-03-18 12:51:23,203 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   IP=192.168.175.122      OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1616071829488_0001    CONTAINERID=container_1616071829488_0001_01_000006      RESOURCE=<memory:1024, vCores:1>        QUEUENAME=default
2021-03-18 12:51:51,284 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1297ms
No GCs detected
2021-03-18 12:53:03,537 WARN org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 10448ms
No GCs detected
2021-03-18 12:53:06,891 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2016ms
No GCs detected
2021-03-18 12:53:07,490 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8031: readAndProcess from client 192.168.175.122:41224 threw exception [java.io.IOException: Connection reset by peer]
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.hadoop.ipc.Server.channelRead(Server.java:3570)
        at org.apache.hadoop.ipc.Server.access$2600(Server.java:139)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:2204)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:1394)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1250)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1221)
2021-03-18 12:53:08,490 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1616071829488_0001_000001 (auth:SIMPLE)
2021-03-18 12:55:17,697 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate of application :application_1616071829488_0001
2021-03-18 13:00:29,952 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Release request cache is cleaned up

Solution

No effective solution to this problem has been found yet.
