How to kill a Spark application using the YARN ResourceManager REST API

Question

I am trying to kill a Spark application running on YARN using the YARN ResourceManager REST API. Below are the two PUT commands I tried:

  1. First command
curl -X PUT 'http://<HOSTNAME>:8088/ws/v1/cluster/apps/<APPLICATION_ID>/state' -d '{"state": "KILLED"}'

Result:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><remoteexception><exception>WebApplicationException</exception><javaClassName>javax.ws.rs.WebApplicationException</javaClassName></remoteexception>
  2. Second command
curl -v -X PUT -H "Content-Type: application/json" -d '{"state": "KILLED"}' 'http://<HOSTNAME>:8088/ws/v1/cluster/apps/<APPLICATION_ID>/state'

Result:

* About to connect() to <HOSTNAME> port 8088 (#0)
*   Trying <IP>...
* Connected to <HOSTNAME> (<IP>) port 8088 (#0)
> PUT /ws/v1/cluster/apps/<APPLICATION_ID>/state HTTP/1.1
> User-Agent: curl/<SOME IP>
> Host: <HOSTNAME>:8088
> Accept: */*
> Content-Type: application/json
> Content-Length: 19
>
* upload completely sent off: 19 out of 19 bytes
< HTTP/1.1 403 Forbidden
< Cache-Control: no-cache
< Expires: Mon,07 Sep 2020 18:26:46 GMT
< Date: Mon,07 Sep 2020 18:26:46 GMT
< Pragma: no-cache
< Expires: Mon,07 Sep 18:26:46 GMT
< Pragma: no-cache
< Content-Type: application/json
< x-frame-options: SAMEORIGIN
< transfer-encoding: chunked
< Server: Jetty(<SOME IP>.hwx)
<
* Connection #0 to host <HOSTNAME> left intact
{"remoteexception":{"exception":"ForbiddenException","message":"java.lang.Exception: The default static user cannot carry out this operation.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"}}

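Am I missing something here, or do I need to supply a user ID? What is the correct command to kill the application? Any suggestions would be appreciated.

Thanks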

Answer

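According to the ResourceManager REST API documentation, PUT requests require authentication. An unauthenticated request is treated as coming from the default static user, which is why the second command above is rejected with 403 Forbidden.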

Generally speaking, when authentication comes up in Hadoop, the most basic form is Kerberos authentication.

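So first confirm that Kerberos authentication for the web consoles is enabled for both HDFS and YARN.
If you manage a CDH/CDP cluster with Cloudera Manager, refer to the relevant Cloudera documentation. If you are using vanilla Hadoop or another Hadoop distribution, follow the corresponding documentation for that product.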
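
As a rough way to check the setting described above, the sketch below reads the `hadoop.http.authentication.type` property with `hdfs getconf`. It assumes a Hadoop client is installed on the machine, and it only reports what the local client configuration says, which may differ from what the ResourceManager actually runs with. The function name `is_kerberos_http_auth` is made up for this illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) only when the given value is "kerberos".
is_kerberos_http_auth() {
  [ "$1" = "kerberos" ]
}

# Guarded so the snippet does nothing on machines without a Hadoop client.
if command -v hdfs >/dev/null 2>&1; then
  # Reads the value from the local client configuration (assumption: it matches the cluster).
  auth_type="$(hdfs getconf -confKey hadoop.http.authentication.type 2>/dev/null)"
  if is_kerberos_http_auth "$auth_type"; then
    echo "HTTP authentication: kerberos"
  else
    echo "HTTP authentication: ${auth_type:-unknown} (not kerberos)"
  fi
fi
```

If this reports `simple`, the web consoles are not using Kerberos, and the `curl --negotiate` approach shown further down would not authenticate as the ticket's principal.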

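Once Kerberos authentication is enabled for both the cluster and the web consoles, you can issue HTTP API requests with any client that can integrate with Kerberos.
Here is an example: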

  1. Submit a MapReduce job on c4669-node2:
[root@c4669-node2 63-hdfs-DATANODE]# yarn jar /opt/cloudera/parcels/CDH-6.3.4-1.cdh6.3.4.p0.6626826/jars/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.4-tests.jar sleep -Dmapred.job.queue.name=a1 -m 1 -r 1 -rt 1200000 -mt 20
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
...
21/01/08 07:06:08 INFO client.RMProxy: Connecting to ResourceManager at c4669-node4.coelab.cloudera.com/172.25.39.199:8032
21/01/08 07:06:08 INFO hdfs.DFSClient: Created token for cloudera: HDFS_DELEGATION_TOKEN owner=cloudera@COELAB.CLOUDERA.COM,renewer=yarn,realUser=,issueDate=1610089568852,maxDate=1610694368852,sequenceNumber=4,masterKeyId=4 on 172.25.34.78:8020
21/01/08 07:06:08 INFO security.TokenCache: Got dt for hdfs://c4669-node2.coelab.cloudera.com:8020; Kind: HDFS_DELEGATION_TOKEN,Service: 172.25.34.78:8020,Ident: (token for cloudera: HDFS_DELEGATION_TOKEN owner=cloudera@COELAB.CLOUDERA.COM,masterKeyId=4)
21/01/08 07:06:08 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/cloudera/.staging/job_1610089441463_0001
...
21/01/08 07:06:10 INFO impl.YarnClientImpl: Submitted application application_1610089441463_0001
21/01/08 07:06:10 INFO mapreduce.Job: The url to track the job: http://c4669-node4.coelab.cloudera.com:8088/proxy/application_1610089441463_0001/
21/01/08 07:06:10 INFO mapreduce.Job: Running job: job_1610089441463_0001
21/01/08 07:06:20 INFO mapreduce.Job: Job job_1610089441463_0001 running in uber mode : false
21/01/08 07:06:20 INFO mapreduce.Job:  map 0% reduce 0%
21/01/08 07:06:25 INFO mapreduce.Job:  map 100% reduce 0%
21/01/08 07:06:42 INFO mapreduce.Job:  map 100% reduce 67%
21/01/08 07:07:06 INFO mapreduce.Job:  map 100% reduce 68%
21/01/08 07:07:11 INFO mapreduce.Job:  map 0% reduce 0%
21/01/08 07:07:11 INFO mapreduce.Job: Job job_1610089441463_0001 failed with state KILLED due to: Application application_1610089441463_0001 was killed by user cloudera
21/01/08 07:07:11 INFO mapreduce.Job: Counters: 0
[root@c4669-node2 63-hdfs-DATANODE]#

Note:

The message "Application application_1610089441463_0001 was killed by user cloudera" in the output above is the result of the PUT request shown below.

  2. On c4669-node3, send the PUT request with curl:
[root@c4669-node3 yum.repos.d]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: cloudera@COELAB.CLOUDERA.COM

Valid starting       Expires              Service principal
01/08/2021 06:39:56  01/09/2021 06:39:56  krbtgt/COELAB.CLOUDERA.COM@COELAB.CLOUDERA.COM
01/08/2021 06:56:13  01/09/2021 06:39:56  HTTP/c4669-node4.coelab.cloudera.com@
01/08/2021 06:56:13  01/09/2021 06:39:56  HTTP/c4669-node4.coelab.cloudera.com@COELAB.CLOUDERA.COM
[root@c4669-node3 yum.repos.d]# yarn application -list -appStates 'NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING'
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
21/01/08 07:00:36 INFO client.RMProxy: Connecting to ResourceManager at c4669-node4.coelab.cloudera.com/172.25.39.199:8032
Total number of applications (application-types: [],states: [NEW,RUNNING] and tags: []):1
                Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1610088875054_0001             Sleep job               MAPREDUCE      cloudera         root.a1                 RUNNING               UNDEFINED               83.52% http://c4669-node4.coelab.cloudera.com:44759
[root@c4669-node3 yum.repos.d]# clear
[root@c4669-node3 yum.repos.d]# yarn application -list -appStates 'NEW,RUNNING'
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
21/01/08 07:06:32 INFO client.RMProxy: Connecting to ResourceManager at c4669-node4.coelab.cloudera.com/172.25.39.199:8032
Total number of applications (application-types: [],states: [NEW,RUNNING] and tags: []):1
                Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1610089441463_0001             Sleep job               MAPREDUCE      cloudera         root.a1                 RUNNING               UNDEFINED                  50% http://c4669-node3.coelab.cloudera.com:35559
[root@c4669-node3 yum.repos.d]# curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt http://c4669-node4.coelab.cloudera.com:8088/ws/v1/cluster/apps/application_1610089441463_0001/state
{"state":"RUNNING"}[root@c4669-node3 yum.repos.d]# curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt -XPUT -H "Content-type: application/json" -d '{
>   "state":"KILLED"
> }' 'http://c4669-node4.coelab.cloudera.com:8088/ws/v1/cluster/apps/application_1610089441463_0001/state'
{"state":"FINAL_SAVING"}[root@c4669-node3 yum.repos.d]#
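
The session above boils down to three steps: confirm a valid Kerberos ticket, read the application state, then PUT the `KILLED` state. A minimal sketch of that sequence as a script is given below. The function names (`rm_state_url`, `get_app_state`, `kill_app`) are invented for this example, the ResourceManager address and application ID are passed in as arguments, and the cookie jar options used in the original session are left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of any Hadoop tooling.

# Build the application-state endpoint URL from "<rm-host>:<port>" and an application ID.
rm_state_url() {
  printf 'http://%s/ws/v1/cluster/apps/%s/state' "$1" "$2"
}

# Read the current application state using SPNEGO (needs a valid Kerberos ticket).
get_app_state() {
  curl -s --negotiate -u : "$(rm_state_url "$1" "$2")"
}

# Ask the ResourceManager to kill the application.
kill_app() {
  curl -s --negotiate -u : -X PUT \
    -H 'Content-Type: application/json' \
    -d '{"state":"KILLED"}' \
    "$(rm_state_url "$1" "$2")"
}

# Only run when exactly two arguments are given: <rm-host:port> <application-id>.
if [ "$#" -eq 2 ]; then
  # klist -s exits non-zero when there is no valid ticket in the cache.
  klist -s || { echo "no valid Kerberos ticket; run kinit first" >&2; exit 1; }
  get_app_state "$1" "$2"; echo
  kill_app "$1" "$2"; echo
fi
```

Saved under a name of your choosing, it would be invoked with the ResourceManager address and the application ID, for example `c4669-node4.coelab.cloudera.com:8088 application_1610089441463_0001`. As in the session above, the PUT may return a transitional state such as `FINAL_SAVING` rather than `KILLED`.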
