Problem description
When we run INSERT OVERWRITE on a Hive table, Hive creates a subfolder named -ext-10000, and the data in these tables is not visible to Spark. Only tables with a small number of rows are affected.
Spark version: 3.1.1. Hive version: 3.1.0.3.1.4.0-315.
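For context, the symptom can be reproduced roughly as follows (a minimal sketch; the table name comes from the query below, the app name and everything else are assumptions):

import org.apache.spark.sql.SparkSession

// Hypothetical session, just to illustrate the symptom described above.
val spark = SparkSession.builder()
  .appName("HiveSubdirSymptom")
  .enableHiveSupport()
  .getOrCreate()

// After Hive has run INSERT OVERWRITE, Spark reports no rows...
spark.sql("select count(*) from categories").show()

// ...even though the table location reported here contains a -ext-10000
// subdirectory holding the actual ORC files.
spark.sql("describe formatted categories").show(false)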
We tried setting the following properties:
"hive.input.dir.recursive" = "TRUE"
"hive.mapred.supports.subdirectories" = "TRUE"
"hive.supports.subdirectories" = "TRUE"
"mapred.input.dir.recursive" = "TRUE"
None of them had any effect.
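For what it's worth, these flags are usually applied on the Spark side via the spark.hadoop. prefix, and Spark's built-in ORC reader ignores them unless Spark falls back to the Hive SerDe path. A hedged sketch of that variant (untested against this setup; the app name is an assumption):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("RecursiveHiveRead")
  // Force Spark to read the table through the Hive SerDe instead of its
  // native ORC reader; the native reader does not list subdirectories.
  .config("spark.sql.hive.convertMetastoreOrc", "false")
  // Pass the recursive-listing flags into the Hadoop configuration Spark uses.
  .config("spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive", "true")
  .config("spark.hadoop.mapred.input.dir.recursive", "true")
  .config("spark.hadoop.hive.mapred.supports.subdirectories", "true")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("select count(*) from categories").show()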
Example query:
insert overwrite table categories
select
n2.id as category1_ccode, n2.name as category1_name,
n3.id as category2_ccode, n3.name as category2_name
from nomenclature as n1
left join nomenclature as n2
on n1.id = n2.parent_id
left join nomenclature as n3
on n2.id = n3.parent_id
where
n1.name = 'Goods'
and n1.delete_mark = '00'
and n2.delete_mark = '00'
and n3.delete_mark = '00'
and n1.is_group = '00'
and n2.is_group = '00';
The files are stored in ORC format.
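Since the files are plain ORC, one way to confirm the data is physically present under the -ext-10000 subdirectory is to bypass the metastore and read the table location directly with Spark 3's recursiveFileLookup option (a sketch; the warehouse path is an assumption, substitute the location reported by DESCRIBE FORMATTED):

// `spark` is the SparkSession from the sketch above.
val df = spark.read
  .format("orc")
  .option("recursiveFileLookup", "true")   // descend into -ext-10000
  .load("/warehouse/tablespace/managed/hive/categories")   // hypothetical path

df.count()   // should match the row count Hive itself reports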
Solution
No effective solution to this problem has been found yet.