Copy Snowflake task results to a stage and download them to CSV

Problem description

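Basically, I need to automate all of the following in a Snowflake task:

  • Create/replace a CSV file format and a stage in Snowflake
  • Run the task query (it runs every few days to collect some statistics)
  • Unload the query results to a CSV file in the stage on every run
  • Download the contents of the staged CSV to a local file on my machine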

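The part I think I have right is the COPY INTO the stage, but how do I unload the task's results to the stage each time it runs?

I don't know what to put in the FROM clause. TITANLOADSUCCESSVSFAIL is not recognized, even though that is the name of the task.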

COPY INTO @TitanLoadStage/unload/ FROM TITANLOADSUCCESSVSFAIL FILE_FORMAT = TitanLoadSevendays

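This is my first time using a stage and downloading files locally with Snowflake, so any advice on getting this up and running is appreciated!

Thanks, Nick

Full code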


-- create a csv file format 
CREATE OR REPLACE FILE FORMAT TitanLoadSevendays
    type = 'CSV'
    field_delimiter = '|';

-- create a Snowflake stage that uses the csv file format
CREATE OR REPLACE STAGE TitanLoadStage
    file_format = TitanLoadSevendays;
    
    
CREATE TASK IF NOT EXISTS TitanLoadSuccessVsFail
    WAREHOUSE = ITSM_LWH
     SCHEDULE = 'USING CRON * * * * * Australia/Canberra' --every minute for testing purposes ('1 * * * *' would run only at minute 1 of each hour)
     COMMENT = 'Last 7 days of Titan game success vs fail load %'
AS
    WITH    SUCCESSCTE AS (
SELECT  CLIENTNAME,COUNT(EVENTTYPE) AS SuccessLoad --count success load events for that game 
FROM    vw_fact_gameload60
WHERE   EVENTTYPE = 103 --success load events
    AND     USERTYPE = 1 --real users
    AND     APPID = 2 --titan games
    AND     EVENTARRIVALDATE >= DATEADD(DAY,-7,CAST(GETDATE() AS DATE)) --only looking at the last week  
GROUP BY CLIENTNAME
),FAILCTE AS ( --same as above but for Failed loads
SELECT  CLIENTNAME,COUNT(EVENTTYPE) AS FailedLoads -- count Failed load events for that game
FROM    vw_fact_gameload60
WHERE   EVENTTYPE = 106 -- Failed load events 
    AND     USERTYPE = 1 -- real users 
    AND     APPID = 2 -- Titan games
    AND     EVENTARRIVALDATE >= DATEADD(DAY,-7,CAST(GETDATE() AS DATE)) -- last 7 days 
  --AND     FACTEVENTARRIVALDATE BETWEEN DATEADD(DAY,-7,GETDATE()) AND GETDATE() -- last 7 days 
GROUP BY CLIENTNAME
)
SELECT  COALESCE(s.CLIENTNAME,f.CLIENTNAME) AS ClientName
        ,ZEROIFNULL(s.SuccessLoad) + ZEROIFNULL(f.FailedLoads) AS TotalLoads --sum the success and failed loads found for 103,106 events only, calculated in CTEs
        ,ZEROIFNULL(s.SuccessLoad) AS Cnt_SuccessLoad --count from success cte
        ,ZEROIFNULL(f.FailedLoads) AS Cnt_FailedLoads --count from fail cte
        ,CONCAT(ZEROIFNULL(ROUND(s.SuccessLoad * 100.0 / TotalLoads,2)),'%') AS Pct_Success --percentage of SuccessLoads against total
        ,CONCAT(ZEROIFNULL(ROUND(f.FailedLoads * 100.0 / TotalLoads,2)),'%') AS Pct_Fail --percentage of FailedLoads against total
FROM    SUCCESSCTE s 
FULL OUTER JOIN FAILCTE f -- outer join to the fail CTE by game name; outer is required because some Titan games have no success or no fail events (NULL)
            ON  s.CLIENTNAME = f.Clientname
ORDER BY CLIENTNAME ASC



--copy the results from the query to the Snowflake stage created above 
COPY INTO @TitanLoadStage/unload/ FROM TITANLOADSUCCESSVSFAIL FILE_FORMAT = TitanLoadSevendays


-- export the stage data to csv located in common folder 
GET @TitanLoadStage/unload/data_0_0_0.csv.gz file:\\itsm\group\ITS%20Management\Common\All%20Staff\SMD\Games\Snowflake%20and%20GamesDNA\Snowflake\SnowflakeCSV\TitanLoad.csv 


-- start the task 
ALTER TASK IF EXISTS TitanLoadSuccessVsFail RESUME

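Solution

A task only executes its SQL statement; it does not store the result set anywhere you can read from later. If you want to get at the results of a query run by a task, you need to materialize the results of that query into a table.

What you have now: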

CREATE TASK mytask_minute
  WAREHOUSE = mywh
  SCHEDULE = '5 MINUTE'
AS
SELECT 1 x;

COPY INTO @TitanLoadStage/unload/
FROM mytask_minute;

(mytask_minute is not a table, so you cannot select from it.)

What you should do instead:

CREATE TASK mytask_minute
  WAREHOUSE = mywh
  SCHEDULE = '5 MINUTE'
AS
CREATE OR REPLACE TABLE task_results_table
AS
SELECT 1 x;

COPY INTO @TitanLoadStage/unload/
FROM (SELECT * FROM task_results_table);
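
To have the unload happen on every run as well, one option is to chain a second task to the first. The following is only a sketch under some assumptions: mytask_unload, the file name titanload.csv.gz and the local folder C:\temp\titanload are hypothetical names chosen for illustration, mytask_minute is assumed to exist and still be suspended, and the stage's own file format is assumed to apply because none is given in the COPY statement.

```sql
-- Child task (hypothetical name): runs after mytask_minute finishes
-- and unloads the freshly rebuilt table to the stage.
CREATE TASK mytask_unload
  WAREHOUSE = mywh
  AFTER mytask_minute
AS
COPY INTO @TitanLoadStage/unload/titanload.csv.gz  -- hypothetical file name
FROM task_results_table
SINGLE = TRUE      -- write one file with exactly the name given above
OVERWRITE = TRUE;  -- replace the file left by the previous run

-- Tasks are created suspended; resume the child first, then the root.
ALTER TASK mytask_unload RESUME;
ALTER TASK mytask_minute RESUME;

-- GET is a client-side command, so as far as I know it cannot run inside
-- a task. Run it from SnowSQL on the local machine instead
-- (the target folder is a hypothetical example and must already exist):
GET @TitanLoadStage/unload/titanload.csv.gz file://C:\temp\titanload;
```

With SINGLE and OVERWRITE set, the staged file keeps a stable name between runs, so the GET command can itself be scheduled locally (for example through a SnowSQL script) without having to discover a new file name each time.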