Problem description
I'm trying to work out how to express the following kind of Postgres SQL query in a Cube.js schema:
SELECT
  CASE
    WHEN COUNT(tpp.net_total_amount) > 0 THEN
      SUM(tpp.net_total_amount) / COUNT(tpp.net_total_amount)
    ELSE
      NULL
  END AS average_spend_per_customer
FROM
  (
    SELECT
      SUM(ts.total_amount) AS net_total_amount
    FROM
      postgres.transactions AS ts
    WHERE
      ts.transaction_date >= '2020-11-01' AND
      ts.transaction_date < '2020-12-01'
    GROUP BY
      ts.customer_id, ts.event_id
  ) AS tpp
;
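To make the two-level aggregation concrete, here is a minimal sketch of the same arithmetic in plain JavaScript. The sample rows and both function names are made up for illustration, the date filter is left out, and NULL amounts are not handled.

```javascript
// Hypothetical sample rows standing in for postgres.transactions.
const transactions = [
  { customer_id: 1, event_id: 'A', total_amount: 10 },
  { customer_id: 1, event_id: 'A', total_amount: 20 },
  { customer_id: 1, event_id: 'B', total_amount: 30 },
  { customer_id: 2, event_id: 'A', total_amount: 60 },
];

// Inner query: SUM(total_amount) grouped by (customer_id, event_id).
function netTotalsByCustomerAndEvent(rows) {
  const totals = new Map();
  for (const row of rows) {
    const key = `${row.customer_id}|${row.event_id}`;
    totals.set(key, (totals.get(key) || 0) + row.total_amount);
  }
  return [...totals.values()];
}

// Outer query: SUM / COUNT over the grouped totals, or null when there are none.
function averageSpendPerCustomer(rows) {
  const totals = netTotalsByCustomerAndEvent(rows);
  if (totals.length === 0) return null;
  return totals.reduce((a, b) => a + b, 0) / totals.length;
}

const average = averageSpendPerCustomer(transactions);
console.log(average); // 40: the group totals are 30, 30 and 60
```

The point is that the average is taken over the grouped totals, not over the raw transaction rows, which is why a single flat measure on the transactions table does not produce it.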
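I had a feeling that pre-aggregations might be what I was after, but after looking into them that doesn't seem to be the case. With the following schema I can get a list of the total amount spent per customer per event: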
cube(`TransactionTotalAmountByCustomerAndEvent`, {
  sql: `SELECT * FROM postgres.transactions`,
  joins: {},
  measures: {
    sum: {
      sql: `SUM(total_amount)`,
      type: `number`
    }
  },
  dimensions: {
    eventId: {
      sql: `event_id`,
      type: `string`
    },
    customerId: {
      sql: `customer_id`,
      type: `string`
    },
    transactionDate: {
      sql: `transaction_date`,
      type: `time`
    }
  },
  preAggregations: {
    customerAndEvent: {
      type: `rollup`,
      measureReferences: [sum],
      dimensionReferences: [customerId, eventId]
    }
  }
});
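But this really just gives me the output of the inner SELECT statement, grouped by customer and event. How do I query the cube to get the average customer spend per event that I'm after?
Solution
You may find it easier to model the dataset as two separate cubes, Customers and Transactions. You then need to set up a join between the cubes and create a special dimension with the subQuery property set to true. An example is included below to help illustrate: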
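cube('Transactions', {
  sql: `SELECT * FROM postgres.transactions`,
  measures: {
    // Aggregating measure; the subquery dimension below evaluates it per customer.
    spend: { sql: `total_amount`, type: `sum` },
  },
  dimensions: {
    eventId: { sql: `event_id`, type: `string` },
    customerId: { sql: `customer_id`, type: `string` },
    transactionDate: { sql: `transaction_date`, type: `time` },
  },
});

cube('Customers', {
  // One row per customer; DISTINCT avoids duplicate ids.
  sql: `SELECT DISTINCT customer_id FROM postgres.transactions`,
  joins: {
    Transactions: {
      relationship: `hasMany`,
      // Join on the underlying column names of both tables.
      sql: `${Customers}.customer_id = ${Transactions}.customer_id`,
    },
  },
  measures: {
    // Average of the per-customer totals produced by the subquery dimension.
    averageSpend: { sql: `${spendAmount}`, type: `avg` },
  },
  dimensions: {
    id: { sql: `customer_id`, type: `string`, primaryKey: true },
    // Total spend per customer, computed as a subquery over Transactions.
    spendAmount: {
      sql: `${Transactions.spend}`,
      type: `number`,
      subQuery: true,
    },
  },
});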
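With two cubes like these in place, the average could be requested with a query along the following lines. This is only a sketch: the variable name is made up, the member name `Customers.averageSpend` assumes the cube and measure names used in the example, and the date range from the original SQL is not included.

```javascript
// Hypothetical Cube.js query object asking for the average spend measure.
// Member names follow the `CubeName.memberName` convention and assume the
// `Customers` cube and `averageSpend` measure from the example schema.
const averageSpendQuery = {
  measures: ['Customers.averageSpend'],
};

// The object is plain JSON, so it can be sent to the Cube.js REST API or
// passed to a client's load() call.
console.log(JSON.stringify(averageSpendQuery));
```

Note that this averages one total per customer. To average per customer per event, as in the original SQL, the subquery would need to be grouped by both keys.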
More information on subquery dimensions can be found in the Cube.js documentation.