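How can I query nested relationships in SQL simply and efficiently?

I'm looking for the simplest, most efficient SQL query to retrieve all of the events related to a given user.

Setup

Here's a simple representation of my schema:

A few things to note:

> Users belong to teams through memberships.
> Teams can have many collections, apps, and webhooks.
> Collections can also have many webhooks.
> Webhooks can belong to either a team or a collection, but only one of the two.
> Events can belong to any object, but only to one object.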
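
The notes above can be sketched as table definitions. This is only an illustration: the column names are taken from the queries later in this question, while the column types, the constraints, and the users and teams tables are assumptions.

-- Sketch only: column names come from the queries in the question;
-- types, constraints, and the users/teams tables are assumptions.
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE teams (id INTEGER PRIMARY KEY);

-- Users belong to teams through memberships.
CREATE TABLE memberships (
  id      INTEGER PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users (id),
  team_id INTEGER NOT NULL REFERENCES teams (id)
);

-- Apps and collections are owned by a team.
CREATE TABLE apps (
  id      INTEGER PRIMARY KEY,
  team_id INTEGER NOT NULL REFERENCES teams (id)
);

CREATE TABLE collections (
  id      INTEGER PRIMARY KEY,
  team_id INTEGER NOT NULL REFERENCES teams (id)
);

-- A webhook belongs to a team or to a collection, never both and never neither.
CREATE TABLE webhooks (
  id            INTEGER PRIMARY KEY,
  team_id       INTEGER REFERENCES teams (id),
  collection_id INTEGER REFERENCES collections (id),
  CHECK ((team_id IS NULL) <> (collection_id IS NULL))
);

-- An event points at exactly one object, so exactly one of these columns is set.
CREATE TABLE events (
  id            INTEGER PRIMARY KEY,
  user_id       INTEGER REFERENCES users (id),
  team_id       INTEGER REFERENCES teams (id),
  membership_id INTEGER REFERENCES memberships (id),
  app_id        INTEGER REFERENCES apps (id),
  collection_id INTEGER REFERENCES collections (id),
  webhook_id    INTEGER REFERENCES webhooks (id),
  CHECK (
      (CASE WHEN user_id       IS NULL THEN 0 ELSE 1 END)
    + (CASE WHEN team_id       IS NULL THEN 0 ELSE 1 END)
    + (CASE WHEN membership_id IS NULL THEN 0 ELSE 1 END)
    + (CASE WHEN app_id        IS NULL THEN 0 ELSE 1 END)
    + (CASE WHEN collection_id IS NULL THEN 0 ELSE 1 END)
    + (CASE WHEN webhook_id    IS NULL THEN 0 ELSE 1 END) = 1
  )
);

The two CHECK constraints are one possible way to express the "only one" rules from the notes; the real schema may enforce them differently or not at all.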

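This seems like a fairly basic setup for most SaaS-type companies (for example Slack or Stripe). Everything is "owned" by a team, but users belong to teams and are the ones interacting with the interface.

Problem

Given that setup, I want to create a SQL query that solves…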

Find all of the events that are related (directly or indirectly) to a given user by id.

I can easily write queries that find events either directly or indirectly through one specific path. For example…

Find all of the events that are directly related to a user by id.

SELECT *
FROM events
WHERE user_id = ${id}

Or…

Find all of the events that are indirectly related to a user via their teams.

SELECT events.*
FROM events
JOIN memberships ON memberships.team_id = events.team_id
WHERE memberships.user_id = ${id}

Or even…

Find all of the events that are indirectly related to a user via any collections of their teams.

SELECT events.*
FROM events
JOIN collections ON collections.id = events.collection_id
JOIN memberships ON memberships.team_id = collections.team_id
WHERE memberships.user_id = ${id}

Webhooks get more complicated, because they can be related in two different ways…

Find all of the events that are indirectly related to a user via any webhooks of their teams or collections.

SELECT *
FROM events
WHERE webhook_id IN (
  SELECT webhooks.id
  FROM webhooks
  JOIN memberships ON memberships.team_id = webhooks.team_id
  WHERE memberships.user_id = ${id}
)
OR webhook_id IN (
  SELECT webhooks.id
  FROM webhooks
  JOIN collections ON collections.id = webhooks.collection_id
  JOIN memberships ON memberships.team_id = collections.team_id
  WHERE memberships.user_id = ${id}
)

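But as you can see, with all of these paths there are many different ways for a user to be related to an event. So when I try to write a query that successfully gets all of the related events, it ends up looking like…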

SELECT * 
FROM events
WHERE user_id = ${id}
OR app_id IN (
  SELECT apps.id
  FROM apps
  JOIN memberships ON memberships.team_id = apps.team_id
  WHERE memberships.user_id = ${id}
)
OR collection_id IN (
  SELECT collections.id
  FROM collections
  JOIN memberships ON memberships.team_id = collections.team_id
  WHERE memberships.user_id = ${id}
)
OR membership_id IN (
  SELECT id
  FROM memberships
  WHERE user_id = ${id}
)
OR team_id IN (
  SELECT team_id
  FROM memberships
  WHERE user_id = ${id}
)
OR webhook_id IN (
  SELECT webhooks.id
  FROM webhooks
  JOIN memberships ON memberships.team_id = webhooks.team_id
  WHERE memberships.user_id = ${id}
)
OR webhook_id IN (
  SELECT webhooks.id
  FROM webhooks
  JOIN collections ON collections.id = webhooks.collection_id
  JOIN memberships ON memberships.team_id = collections.team_id
  WHERE memberships.user_id = ${id}
)

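Questions

> Is the final "all-inclusive" query very inefficient?
> Is there a more efficient way to write it?
> Is there a simpler, easier-to-read way to write it?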

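Solution

As with any query, the answer to "what is most efficient" is "it depends". There are many variables at play: the number of rows in the tables, the row length, whether indexes exist, the RAM on the server, and so on.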
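
Of those variables, indexes are the one most directly under your control. As a rough sketch, these are the indexes the lookups in the question would most plausibly use. The index names are hypothetical, the table and column names are taken from the question's queries, and the literal 1 stands in for ${id}. Whether each index pays off depends on the data, so check the plan rather than assume.

-- Hypothetical index names; tables and columns are those used in the question.
-- Starting point of every indirect path: the memberships of one user.
CREATE INDEX idx_memberships_user_id ON memberships (user_id);

-- Team-owned and collection-owned objects are looked up by their owner.
CREATE INDEX idx_apps_team_id ON apps (team_id);
CREATE INDEX idx_collections_team_id ON collections (team_id);
CREATE INDEX idx_webhooks_team_id ON webhooks (team_id);
CREATE INDEX idx_webhooks_collection_id ON webhooks (collection_id);

-- Each OR branch of the combined query filters events on one of these columns.
CREATE INDEX idx_events_user_id ON events (user_id);
CREATE INDEX idx_events_team_id ON events (team_id);
CREATE INDEX idx_events_membership_id ON events (membership_id);
CREATE INDEX idx_events_app_id ON events (app_id);
CREATE INDEX idx_events_collection_id ON events (collection_id);
CREATE INDEX idx_events_webhook_id ON events (webhook_id);

-- Ask the database how it would run one of the queries (output format varies by engine).
EXPLAIN
SELECT events.*
FROM events
JOIN memberships ON memberships.team_id = events.team_id
WHERE memberships.user_id = 1;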

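The best way I can think of to handle this kind of problem (with both maintainability and efficiency in mind) is to use CTEs (common table expressions), which let you create a temporary result and reuse it throughout the query. A CTE is introduced with the WITH keyword and essentially aliases a result as a table, so that you can JOIN against it multiple times: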

WITH user_memberships AS (
    -- Every membership row of the given user.
    SELECT id, team_id
    FROM memberships
    WHERE user_id = ${id}
), user_apps AS (
    -- Explicit columns: SELECT * over a join would expose two "id" columns.
    SELECT apps.id
    FROM apps
    INNER JOIN user_memberships
        ON user_memberships.team_id = apps.team_id
), user_collections AS (
    SELECT collections.id, collections.team_id
    FROM collections
    INNER JOIN user_memberships
        ON user_memberships.team_id = collections.team_id
), user_webhooks AS (
    -- Webhooks owned by one of the user's teams or by one of those teams' collections.
    SELECT webhooks.id
    FROM webhooks
    LEFT OUTER JOIN user_collections
        ON user_collections.id = webhooks.collection_id
    INNER JOIN user_memberships
        ON user_memberships.team_id = webhooks.team_id
        OR user_memberships.team_id = user_collections.team_id
)

SELECT events.* 
FROM events
WHERE app_id IN (SELECT id FROM user_apps)
OR collection_id IN (SELECT id FROM user_collections)
OR membership_id IN (SELECT id FROM user_memberships)
OR team_id IN (SELECT team_id FROM user_memberships)
OR user_id = ${id}
OR webhook_id IN (SELECT id FROM user_webhooks)
;

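The benefits of this approach are:

> Each CTE can take advantage of the indexes on its own JOIN predicates and return the results for that subset faster, rather than making the execution planner try to resolve one long series of complex predicates.
> Each CTE can be maintained individually, which makes it easier to troubleshoot problems with a particular subset.
> You aren't violating the DRY principle.
> If a CTE has value outside this query, you can move it into a stored procedure and reference it from there.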
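
To check the idea on a small scale, here is a self-contained sketch meant for a scratch database. It is trimmed to the user, team, collection, and webhook paths. The sample rows are made up, the literal 1 stands in for ${id}, and the CTEs use IN subqueries instead of joins, which is a variation on the query above rather than the same query.

-- Trimmed-down tables with made-up data; run in an empty scratch database.
CREATE TABLE memberships (id INTEGER PRIMARY KEY, user_id INTEGER, team_id INTEGER);
CREATE TABLE collections (id INTEGER PRIMARY KEY, team_id INTEGER);
CREATE TABLE webhooks (id INTEGER PRIMARY KEY, team_id INTEGER, collection_id INTEGER);
CREATE TABLE events (
  id INTEGER PRIMARY KEY,
  user_id INTEGER, team_id INTEGER, collection_id INTEGER, webhook_id INTEGER
);

-- User 1 is in team 10; user 2 is in team 20.
INSERT INTO memberships VALUES (1, 1, 10), (2, 2, 20);
INSERT INTO collections VALUES (100, 10), (200, 20);
-- Webhook 1000 belongs to team 10, 1001 to collection 100, 2000 to team 20.
INSERT INTO webhooks VALUES (1000, 10, NULL), (1001, NULL, 100), (2000, 20, NULL);

INSERT INTO events VALUES
  (1, 1,    NULL, NULL, NULL),  -- directly on user 1
  (2, NULL, 10,   NULL, NULL),  -- on team 10
  (3, NULL, NULL, 100,  NULL),  -- on collection 100 (team 10)
  (4, NULL, NULL, NULL, 1000),  -- on a webhook of team 10
  (5, NULL, NULL, NULL, 1001),  -- on a webhook of collection 100
  (6, NULL, 20,   NULL, NULL),  -- on team 20 (unrelated to user 1)
  (7, NULL, NULL, NULL, 2000);  -- on a webhook of team 20 (unrelated to user 1)

WITH user_teams AS (
  SELECT team_id FROM memberships WHERE user_id = 1
), user_collections AS (
  SELECT id FROM collections
  WHERE team_id IN (SELECT team_id FROM user_teams)
), user_webhooks AS (
  -- Both ownership paths for webhooks are covered in one place.
  SELECT id FROM webhooks
  WHERE team_id IN (SELECT team_id FROM user_teams)
     OR collection_id IN (SELECT id FROM user_collections)
)
SELECT id
FROM events
WHERE user_id = 1
   OR team_id IN (SELECT team_id FROM user_teams)
   OR collection_id IN (SELECT id FROM user_collections)
   OR webhook_id IN (SELECT id FROM user_webhooks)
ORDER BY id;
-- returns event ids 1, 2, 3, 4, 5

Events 6 and 7 are left out because they hang off team 20, which user 1 is not a member of.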
