Problem description
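I am using ODBC and dbplyr to join two relatively simple tables. However, the join fails on my join key with an "ambiguous column name" error. This does not normally happen with a dplyr join, and I don't know how to express something like a.key = b.key with dbplyr.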
Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC SQL Server Driver][SQL Server]Ambiguous column name 'Calendar_key'. [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) Could not be prepared.
<sql> 'SELECT "Calendar_key","Organization_key","Product_Key","Promotion_Key","Shift_Key","ETL_source_system_key","Pack_Size","Qty_Sold","Inv_Unit_Qty","Extended_Cost","Extended_Purchase_Rebate","Extended_Sales_Rebate","Extended_Sales","Ent_Source_Hdr_Key","Ent_Source_Dtl_Key","Day_Date","Day_Of_Week_ID","Day_Of_Week","Holiday","Type_Of_Day","Calendar_Month_No","Calendar_Month_Name","Calendar_Qtr_No","Calendar_Qtr_Desc","Calendar_Year","Fiscal_Week","Fiscal_Period_No","Fiscal_Period_Desc","Fiscal_Year"
FROM "Item_Sales_Fact" AS "LHS"
LEFT JOIN "calendar" AS "RHS"
ON ("LHS"."Calendar_key" = "RHS"."calendar_key")
Here is the code. My connection is called con.
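library(DBI)
library(odbc)
library(dplyr)
# Connect to SQL Server over ODBC (credentials redacted)
con <- dbConnect(odbc(), Driver = "SQL Server", Server = "192.168.139.1", Database = "pdi_warehouse_2304_01", UID = XXXX, PWD = XXXX, Port = 1433)
item.sales <- tbl(con, "Item_Sales_Fact")
calendar <- tbl(con, "calendar")
organization <- tbl(con, "Organization")
# The key is Calendar_key on the left and calendar_key on the right
test.df <- item.sales %>%
  left_join(calendar, by = c("Calendar_key" = "calendar_key")) %>%
  collect()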
Solution
The SQL generated by dbplyr is incorrect: Calendar_key could come from either RHS or LHS, because SQL is not case-sensitive and, unlike R, does not distinguish between Calendar_key and calendar_key:
SELECT "Calendar_key",...
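The problem seems to come from this mismatch: SQL Server compares column names case-insensitively (under its default collation), while dbplyr, like R, treats Calendar_key and calendar_key as two different columns. As a result it neither merges the two key columns nor qualifies the name in the SELECT list.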
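The mismatch can be illustrated in plain R, without a database. The sketch below uses a hypothetical helper, case_collisions(), and two short made-up name vectors standing in for the columns of the two tables. It lists the pairs of names that R considers different but that are identical once case is ignored.

```r
# Hypothetical helper (not part of dplyr or dbplyr): return pairs of names that
# differ in R but are equal when compared case-insensitively.
case_collisions <- function(x_names, y_names) {
  pairs <- expand.grid(lhs = x_names, rhs = y_names, stringsAsFactors = FALSE)
  same_ignoring_case <- tolower(pairs$lhs) == tolower(pairs$rhs)
  different_in_r <- pairs$lhs != pairs$rhs
  pairs[same_ignoring_case & different_in_r, , drop = FALSE]
}

# Made-up subsets of the column names from the question (assumption: shortened).
lhs_cols <- c("Calendar_key", "Organization_key", "Qty_Sold")
rhs_cols <- c("calendar_key", "Day_Date", "Holiday")

# R sees no shared column name between the two tables...
shared_in_r <- intersect(lhs_cols, rhs_cols)

# ...but a case-insensitive comparison finds the clashing key pair.
collisions <- case_collisions(lhs_cols, rhs_cols)
print(collisions)
```

On the real tables, the name vectors could presumably be taken from colnames(item.sales) and colnames(calendar). Any pair the helper reports is a candidate for renaming before the join.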
One workaround is to rename one of the two keys so that both have exactly the same name, including case:
item.sales <- tbl(con, "Item_Sales_Fact")
calendar <- tbl(con, "calendar") %>% rename(Calendar_key = calendar_key)
test.df <- item.sales %>%
  left_join(calendar, by = c("Calendar_key" = "Calendar_key")) %>%
  collect()
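To inspect the SQL that the renamed join produces without touching the server, the join can be reproduced on dbplyr's simulated SQL Server backend. This is only a sketch: the two lazy_frame() tables are made-up stand-ins with a couple of columns each, not the real Item_Sales_Fact and calendar tables, and the exact SQL text may differ between dbplyr versions.

```r
library(dplyr)
library(dbplyr)

# Made-up stand-ins for the two remote tables (assumption: only two columns each).
item_sales_sim <- lazy_frame(Calendar_key = 1L, Qty_Sold = 2, con = simulate_mssql())
calendar_sim <- lazy_frame(calendar_key = 1L, Holiday = "N", con = simulate_mssql())

# Same workaround as above: align the key name (including case) before joining.
joined <- item_sales_sim %>%
  left_join(rename(calendar_sim, Calendar_key = calendar_key), by = "Calendar_key")

# Print the translated query, and keep a copy as a plain string for inspection.
show_query(joined)
query_text <- as.character(sql_render(joined))
```

Calling show_query() on the real pipeline in place of collect() gives the same kind of preview against the actual connection.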