Is there a workaround for the Hive insert I am attempting?

Question

I copied the structure of schema2.card_master to schema1.card_master with:

hive> create table schema1.card_master like schema2.card_master;

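This works, and the new table is partitioned the same way as the original. The table has hundreds of columns, so listing them is impractical, but I want to populate all of them from the original table, filtered through a JOIN: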

hive> insert overwrite table schema1.card_master (select * from schema2.card_master ccm INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13);

Failed: SemanticException 1:23 Need to specify partition columns because the destination table is partitioned. Error encountered near token 'cmdl_card_master'

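I checked how the table I am copying is partitioned: it uses a single partition column, mkt_cd, which can take two values, US or PR.

So I tried: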

hive> insert overwrite table schema1.card_master PARTITION (mkt_cd='US')  (select * from schema2.card_master ccm INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13);
Failed: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different ''US'': Table insclause-0 has 255 columns,but query has 257 columns.
hive>

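What is going on? Is there a workaround that lets me load my data without explicitly listing every column of schema2.card_master in the SELECT statement?

Answer

select * selects the columns of every table in the join, so the query returns the columns of schema1.accounts in addition to those of schema2.card_master. Use select ccm.* instead of select * to select columns from the ccm table only. Also remove the static partition specification (mkt_cd='US') and use dynamic partitioning instead: ccm.* includes the partition column, and when loading a static partition the partition column must not appear in the select list. This explains the mismatch in the error: the target partition expects the 255 non-partition columns, while the query also returns mkt_cd and the joined accounts column.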

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert overwrite table schema1.card_master partition(mkt_cd) --dynamic partition
select ccm.* --use alias 
  from schema2.card_master ccm 
       INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13
;
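
If only one partition needs to be loaded, a static-partition insert can also be written without listing the columns. The following is a minimal sketch, not a tested solution: it assumes Hive's regular-expression column specification, which is only honored when hive.support.quoted.identifiers is set to none, and it uses the pattern `(mkt_cd)?+.+` to select every ccm column except the partition column. The WHERE filter on 'US' is an assumption added here so that only rows belonging to that partition are written.

```sql
-- Regex column names in SELECT are only interpreted with this setting
set hive.support.quoted.identifiers=none;

-- Static partition: the partition column must NOT be in the select list,
-- so the regex selects all ccm columns except mkt_cd
insert overwrite table schema1.card_master partition(mkt_cd='US')
select ccm.`(mkt_cd)?+.+`
  from schema2.card_master ccm
       INNER JOIN schema1.accounts da on ccm.cm13 = da.cm13
 where ccm.mkt_cd = 'US' -- assumed filter: load only rows of this partition
;

-- Sanity check after either variant: row counts per partition
select mkt_cd, count(*) from schema1.card_master group by mkt_cd;
```

The dynamic-partition version above is simpler when both US and PR need to be loaded, since a single statement fills every partition.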
