Problem description
I have three tables: country, city, and customer. They look like this:
country table:
id country_name
1 UK
2 US
3 Brazil
:
n Canada
city table:
id city_name postal_code country_id
1 London 30090 1
2 Dallas 20909 2
3 Rio 29090 3
4 Atlanta 30318 2
:
n Vancouver 32230 n
customer table:
id customer_name city_id
1 John 1
2 Pete 3
3 Dave 2
4 May 2
5 Chuck 4
6 Sam 3
7 Henry 3
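*** city.country_id references country.id, and customer.city_id references city.id
I want to write a query that returns the country name, the city name, and the number of customers in each city. There is one condition: the query should return only the cities whose customer count is higher than the average customer count across all cities.
The correct output should look like this: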
UK London 2
Brazil Rio 3
But I keep getting this output, which is not correct:
UK London 2
US Dallas 2
US Atlanta 1
Brazil Rio 3
SELECT country.country_name,city.city_name,COUNT(customer.city_id) from country
JOIN city on country.id = city.country_id
JOIN customer on city.id = customer.city_id
Group by city_name,country.country_name;
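How can I do this, and how do I fix my code?
Solution
You need to nest your query in a subquery so that you can take the average of the per-city counts and compare it with each city's count. If your SQL dialect supports CTEs, you can use one, for example: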
WITH cnts AS (
SELECT country.country_name,city.city_name,COUNT(customer.city_id) AS cnt
FROM country
JOIN city on country.id = city.country_id
JOIN customer on city.id = customer.city_id
GROUP BY city_name,country.country_name
)
SELECT *
FROM cnts
WHERE cnt > (SELECT AVG(cnt) FROM cnts)
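To try the CTE query against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. SQLite, the column types, and the helper name `build_sample_db` are assumptions made for this sketch and do not come from the question; only the three countries, four cities, and seven customers listed above are loaded, and the rows are sorted in Python so the result order is deterministic.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory SQLite database with the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE country (id INTEGER PRIMARY KEY, country_name TEXT);
        CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, postal_code TEXT,
                           country_id INTEGER REFERENCES country(id));
        CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT,
                               city_id INTEGER REFERENCES city(id));
        INSERT INTO country VALUES (1,'UK'),(2,'US'),(3,'Brazil');
        INSERT INTO city VALUES (1,'London','30090',1),(2,'Dallas','20909',2),
                                (3,'Rio','29090',3),(4,'Atlanta','30318',2);
        INSERT INTO customer VALUES (1,'John',1),(2,'Pete',3),(3,'Dave',2),(4,'May',2),
                                    (5,'Chuck',4),(6,'Sam',3),(7,'Henry',3);
    """)
    return conn


# The CTE computes one count per city; the outer query keeps the cities
# whose count is above the average of those counts.
CTE_QUERY = """
WITH cnts AS (
    SELECT country.country_name, city.city_name, COUNT(customer.city_id) AS cnt
    FROM country
    JOIN city ON country.id = city.country_id
    JOIN customer ON city.id = customer.city_id
    GROUP BY city_name, country.country_name
)
SELECT *
FROM cnts
WHERE cnt > (SELECT AVG(cnt) FROM cnts)
"""

# Sort in Python because the query itself has no ORDER BY.
rows = sorted(build_sample_db().execute(CTE_QUERY).fetchall())
print(rows)
```

Because the joins are inner joins, a city with no customers produces no row in `cnts` and so does not pull the average down.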
Otherwise the query gets more complicated, because the main query has to be repeated as a subquery inside the HAVING clause:
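SELECT country.country_name,city.city_name,COUNT(customer.city_id) AS cnt
FROM country
JOIN city on country.id = city.country_id
JOIN customer on city.id = customer.city_id
GROUP BY city_name,country.country_name
HAVING COUNT(customer.city_id) > (SELECT AVG(cnt) FROM (
SELECT country.country_name,city.city_name,COUNT(customer.city_id) AS cnt
FROM country
JOIN city on country.id = city.country_id
JOIN customer on city.id = customer.city_id
GROUP BY city_name,country.country_name
) cnts2)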
In both cases, the output for the sample data is:
country_name city_name cnt
Brazil Rio 3
US Dallas 2
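As a cross-check of that result, the arithmetic can be redone by hand from the rows as posted in the question. The short sketch below uses plain Python rather than SQL; the variable names are hypothetical, and the `city_id` values are copied from the customer table above. It counts customers per city, averages those counts, and keeps the cities above the average.

```python
from collections import Counter

# customer.city_id values copied from the question's customer table
customer_city_ids = [1, 3, 2, 2, 4, 3, 3]
# id -> city_name, copied from the question's city table
city_names = {1: "London", 2: "Dallas", 3: "Rio", 4: "Atlanta"}

# Customers per city (only cities that have at least one customer appear,
# which mirrors the inner joins in the SQL)
counts = Counter(city_names[city_id] for city_id in customer_city_ids)

# Average of the per-city counts: 7 customers over 4 cities
average = sum(counts.values()) / len(counts)

# Cities whose count is strictly greater than the average
above_average = sorted(name for name, n in counts.items() if n > average)
print(dict(counts), average, above_average)
```

With the posted rows, London and Atlanta each have one customer, Dallas has two, and Rio has three, so the average is 1.75 and only Dallas and Rio exceed it.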
Alternatively, you can use a window function:
SELECT cc.*
FROM (SELECT co.country_name, ci.city_name, COUNT(*) AS cnt,
             AVG(COUNT(*)) OVER () AS avg_count
      FROM country co
      JOIN city ci ON co.id = ci.country_id
      JOIN customer cu ON ci.id = cu.city_id
      GROUP BY ci.city_name, co.country_name
     ) cc
WHERE cnt > avg_count;
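The window-function version can be tried the same way. This is a minimal sketch that assumes Python's built-in sqlite3 module linked against SQLite 3.25 or newer (the first release with window functions); the in-memory tables, column types, and variable names are assumptions made for illustration, and the rows are the sample data from the question.

```python
import sqlite3

# Build the sample tables from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, postal_code TEXT,
                       country_id INTEGER REFERENCES country(id));
    CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT,
                           city_id INTEGER REFERENCES city(id));
    INSERT INTO country VALUES (1,'UK'),(2,'US'),(3,'Brazil');
    INSERT INTO city VALUES (1,'London','30090',1),(2,'Dallas','20909',2),
                            (3,'Rio','29090',3),(4,'Atlanta','30318',2);
    INSERT INTO customer VALUES (1,'John',1),(2,'Pete',3),(3,'Dave',2),(4,'May',2),
                                (5,'Chuck',4),(6,'Sam',3),(7,'Henry',3);
""")

# AVG(COUNT(*)) OVER () averages the per-group counts across all groups,
# so every row of the derived table carries the overall average.
WINDOW_QUERY = """
SELECT cc.*
FROM (SELECT co.country_name, ci.city_name, COUNT(*) AS cnt,
             AVG(COUNT(*)) OVER () AS avg_count
      FROM country co
      JOIN city ci ON co.id = ci.country_id
      JOIN customer cu ON ci.id = cu.city_id
      GROUP BY ci.city_name, co.country_name
     ) cc
WHERE cnt > avg_count
"""

# Sort in Python because the query itself has no ORDER BY.
window_rows = sorted(conn.execute(WINDOW_QUERY).fetchall())
print(window_rows)
```

Compared with the CTE and HAVING versions, this form computes the counts only once, at the cost of carrying an extra `avg_count` column in the result because of `cc.*`.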