Problem description
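We have a big-data Hadoop cluster based on Hortonworks HDP 2.6.4 and Ambari 2.6.1.
All machines run RHEL 7.2.
The cluster has more than 540 machines. Every machine runs an ambari-agent that communicates with the Ambari server (the Ambari server itself is installed on only one machine).
Before we used Ansible, everything worked fine whenever we upgraded or restarted ambari-agent.
Recently we started using Ansible (ansible-playbook) to automate the installation,
and Ansible runs against all machines.
Now, shortly after a task runs ambari-agent restart, we see the Ansible run stop and get killed.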
After some investigation, we found that the Ambari agent uses the following ports:
url_port = 8440
secured_url_port = 8441
ping_port = 8670
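However, we did not see any Ansible process using the ports above, so we do not think the ports are related.
Still, the basic problem is clear:
when the Ansible task ambari-agent restart runs on a remote machine, the Ansible run is interrupted and Ansible is killed.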
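We have not confirmed why Ansible dies. One experiment that might narrow it down is to start the restart in its own session, so that whatever the restart does to its own session or process group cannot reach the process that launched it. The following is a minimal sketch of that idea, assuming a POSIX host; run_detached is a hypothetical helper name, not part of Ansible or Ambari.

```python
import subprocess


def run_detached(command):
    """Start `command` in a new session and return its exit code."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        # Ask for setsid() in the child, so it is no longer in the
        # caller's session or process group (POSIX only).
        start_new_session=True,
    )
    return proc.wait()
```

On a single test host this could be called as run_detached(["ambari-agent", "restart"]). If the caller survives this way but not with a plain invocation, that would point at signal or session handling rather than at the network ports.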
The ambari-agent configuration is as follows:
[server]
hostname = datanode02.gtfactory.com
url_port = 8440
secured_url_port = 8441
connect_retry_delay = 10
max_reconnect_retry_delay = 30
[agent]
logdir = /var/log/ambari-agent
piddir = /var/run/ambari-agent
prefix = /var/lib/ambari-agent/data
loglevel = INFO
data_cleanup_interval = 86400
data_cleanup_max_age = 2592000
data_cleanup_max_size_mb = 100
ping_port = 8670
cache_dir = /var/lib/ambari-agent/cache
tolerate_download_failures = true
run_as_user = root
parallel_execution = 0
alert_grace_period = 5
status_command_timeout = 5
alert_kinit_timeout = 14400000
system_resource_overrides = /etc/resource_overrides
[security]
keysdir = /var/lib/ambari-agent/keys
server_crt = ca.crt
passphrase_env_var_name = AMBARI_PASSPHRASE
ssl_verify_cert = 0
credential_lib_dir = /var/lib/ambari-agent/cred/lib
credential_conf_dir = /var/lib/ambari-agent/cred/conf
credential_shell_cmd = org.apache.hadoop.security.alias.CredentialShell
[network]
use_system_proxy_settings = true
[services]
pidlookuppath = /var/run/
[heartbeat]
state_interval_seconds = 60
dirs = /etc/hadoop,/etc/hadoop/conf,/etc/hbase,/etc/hcatalog,/etc/hive,/etc/oozie,/etc/sqoop,/var/run/hadoop,/var/run/zookeeper,/var/run/hbase,/var/run/templeton,/var/run/oozie,/var/log/hadoop,/var/log/zookeeper,/var/log/hbase,/var/log/hive
log_lines_count = 300
idle_interval_min = 1
idle_interval_max = 10
[logging]
syslog_enabled = 0
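With this many hosts, it can help to read the port settings out of the file programmatically instead of by eye. A minimal sketch, assuming the file is plain INI that Python's configparser can read; agent_ports and SAMPLE_INI are illustrative names, and the sample is a trimmed copy of the listing above rather than the real file.

```python
import configparser

# Trimmed copy of the agent configuration shown above (port-related keys only).
SAMPLE_INI = """\
[server]
hostname = datanode02.gtfactory.com
url_port = 8440
secured_url_port = 8441

[agent]
ping_port = 8670
"""


def agent_ports(ini_text):
    """Return the three ports named in the agent configuration, as integers."""
    # interpolation=None so a literal '%' in any value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        "url_port": parser.getint("server", "url_port"),
        "secured_url_port": parser.getint("server", "secured_url_port"),
        "ping_port": parser.getint("agent", "ping_port"),
    }


if __name__ == "__main__":
    print(agent_ports(SAMPLE_INI))
```

To use it against a real host, read the contents of that host's ambari-agent.ini and pass the text to agent_ports.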
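We are currently considering the following:
Ansible may be crashing because TLSv1 (Transport Layer Security) is restricted; by default, ambari-agent connects using TLSv1.
So we are thinking of setting force_https_protocol=PROTOCOL_TLSv1_2 in the Ambari agent configuration,
but this is only an assumption.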
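The TLS assumption could be checked before changing any configuration. The sketch below attempts a handshake pinned to a single protocol version and reports whether it succeeds. It assumes Python 3.7 or newer; tls_handshake_ok is a hypothetical helper name. Certificate verification is turned off to mirror ssl_verify_cert = 0 in the agent configuration. A failure does not prove the server is at fault: the local OpenSSL build may itself refuse TLSv1, and a port that requires a client certificate can reject the handshake for that reason.

```python
import socket
import ssl


def tls_handshake_ok(host, port, version, timeout=5.0):
    """Return True if a TLS handshake pinned to exactly `version` succeeds."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Mirror ssl_verify_cert = 0 from the agent config: no certificate checks.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    # Pin both ends of the allowed range so only this one version is offered.
    context.minimum_version = version
    context.maximum_version = version
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers ssl.SSLError, refused connections, timeouts and DNS failures.
        return False
```

For example, calling it with the server hostname from the configuration, one of the server ports (8440 or 8441), and first ssl.TLSVersion.TLSv1, then ssl.TLSVersion.TLSv1_2 would show which of the two versions can complete a handshake from that machine.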
Is the following new configuration a reasonable suggestion, and could it help?
[security]
force_https_protocol=PROTOCOL_TLSv1_2 <------ the new update
keysdir = /var/lib/ambari-agent/keys
server_crt = ca.crt
passphrase_env_var_name = AMBARI_PASSPHRASE
ssl_verify_cert = 0
credential_lib_dir = /var/lib/ambari-agent/cred/lib
credential_conf_dir = /var/lib/ambari-agent/cred/conf
credential_shell_cmd = org.apache.hadoop.security.alias.CredentialShell
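If we do try this setting, it would have to be applied the same way on every host. A minimal sketch of the edit, again assuming the file is plain INI; with_forced_tls12 is a hypothetical helper name. Note that configparser drops comments and normalizes spacing when it writes the file back, and that the arrow in the listing above is only an annotation, not part of the value.

```python
import configparser
import io


def with_forced_tls12(ini_text):
    """Return ini_text with force_https_protocol set in the [security] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("security"):
        parser.add_section("security")
    # Option values keep their case; only option names are lowercased.
    parser.set("security", "force_https_protocol", "PROTOCOL_TLSv1_2")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

The function works on text only, so it can be tried on a copy of the file first and the result compared with the original before anything is written to a host.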