Rewriting a MySQL query for Hive

Problem description

I am trying to join two tables:

Table X

PlayerID   | Name      | Team
007        | Sancho    | Dortmund
010        | Messi     | Barcelona
011        | Werner    | Chelsea
001        | De Gea    | Man Utd
009        | Lewan..ki | Bayern Mun
006        | Pogba     | Man Utd
017        | De Bruyne | Man City
029        | Harvertz  | Chelsea
005        | Upamecano | Leipzig

Table Y

PlayerID   | Name      | Team
010        | Messi     | Man City
007        | Sancho    | Man Utd
006        | Pogba     | Man Utd
017        | De Bruyne | Man City
011        | Werner    | Liverpool
006        | Pogba     | Real Madrid

Using this query:

select avg(y.playerID is not null) as accuracy_ratio
from x
left join y 
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team

However, when I run the query, I get the error "Only numeric or string type arguments are accepted but boolean is passed". I assume the query above only works in MySQL. How can I rewrite it for Hive?

Solution

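I realize this is related to your previous post, where GMB provided a solution for MySQL. As the error message says, Hive's avg() does not accept a boolean, so convert the condition to 1 or 0 with a CASE expression. This is what you need to do: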

select avg(case when y.playerID is not null then 1 else 0 end) as accuracy_ratio
from x
left join y 
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team
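
The CASE expression is only there to turn the boolean into a number that avg() can take. As a minimal sketch of the same idea, the query below uses Hive's built-in IF() conditional function instead. It assumes the same tables x and y and the same column names as in the question, and it is meant as an equivalent variant rather than a different method.

```sql
-- Sketch: same left join as the query above, with IF() producing the 0/1 flag.
-- Assumes tables x and y with columns playerID, name and team, as in the question.
select avg(if(y.playerID is not null, 1, 0)) as accuracy_ratio
from x
left join y
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team;
```

On the sample data, only the rows for Pogba (Man Utd) and De Bruyne (Man City) match on all three columns, so the ratio should be 2 out of 9 rows of x, roughly 0.22.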

@learning_2_code I tried the following code in Hive on your dataset. It gives me 0.22. Please let me know whether this works for you in Hive.

select count(y_pid) / count(*)
from (
    select x.pid, y.pid as y_pid
    from tablex x
    left join tabley y
        on  y.pid      = x.pid
        and y.ply_name = x.ply_name
        and y.team     = x.team
) A