How to optimize a slow SELECT DISTINCT query across three tables (40k rows) that returns only 22 results

Problem description

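I have this query, written by someone else, that I'm trying to refactor. It pulls certain features/materials for an item (usually shoes). There are a lot of products, and therefore a lot of join-table entries, but only a few features are available. I'm thinking there has to be a way to cut down on the need to touch the "big" list of items to get these features. I've also heard that DISTINCT should be avoided, but I don't have a statement that can replace the DISTINCT here.

According to my logs, I'm getting slow result times:

Query_time: 7 Lock_time: 0 Rows_sent: 32 Rows_examined: 5362862
Query_time: 8 Lock_time: 0 Rows_sent: 22 Rows_examined: 6581994

As the log says, it sometimes takes 7 or 8 seconds and examines more than 5 million rows each time. The timing may be affected by other load happening at the same time, because here are the selects run against the database directly from the mysql command line: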
mysql> SELECT DISTINCT features.FeatureId,features.Name
       FROM features,itemsfeatures,items
       WHERE items.FlagStatus != 'U'
         AND items.TypeId = '13'
         AND features.Type = 'Material'
         AND features.FeatureId = itemsfeatures.FeatureId
       ORDER BY features.Name;
+-----------+--------------------+
| FeatureId | Name               |
+-----------+--------------------+
|        40 | Alligator          |
|        41 | Burnished Calfskin |
|        42 | Calfskin           |
|        59 | Canvas             |
|        43 | Chromexcel         |
|        44 | Cordovan           |
|        57 | Cotton             |
|        45 | Crocodile          |
|        58 | Deerskin           |
|        61 | Eel                |
|        46 | Italian Leather    |
|        47 | Lizard             |
|        48 | Nappa              |
|        49 | NuBuck             |
|        50 | Ostrich            |
|        51 | Patent Leather     |
|        60 | Rubber             |
|        52 | Sharkskin          |
|        53 | Silk               |
|        54 | Suede              |
|        56 | Veal               |
|        55 | Woven              |
+-----------+--------------------+
22 rows in set (0.00 sec)

mysql> select count(*) from features;
+----------+
| count(*) |
+----------+
|      122 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from itemsfeatures;
+----------+
| count(*) |
+----------+
|    38569 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from items;
+----------+
| count(*) |
+----------+
|     8656 |
+----------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT DISTINCT features.FeatureId,features.Name
       FROM features,itemsfeatures,items
       WHERE items.FlagStatus != 'U'
         AND items.TypeId = '13'
         AND features.Type = 'Material'
         AND features.FeatureId = itemsfeatures.FeatureId
       ORDER BY features.Name;
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
| id | select_type | table         | type | possible_keys     | key       | key_len | ref                             | rows | Extra                                        |
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | features      | ref  | PRIMARY,Type      | Type      | 33      | const                           |   21 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | itemsfeatures | ref  | FeatureId         | FeatureId | 4       | sherman_live.features.FeatureId |  324 | Using index; Distinct                        |
|  1 | SIMPLE      | items         | ALL  | TypeId,FlagStatus | NULL      | NULL    | NULL                            | 8656 | Using where; Distinct; Using join buffer     |
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
3 rows in set (0.04 sec)

Edit: For comparison, here are sample results without DISTINCT (but with a LIMIT, since otherwise it just hangs):

mysql> SELECT features.FeatureId,features.Name
       FROM features,itemsfeatures,items
       WHERE items.FlagStatus != 'U'
         AND items.TypeId = '13'
         AND features.Type = 'Material'
         AND features.FeatureId = itemsfeatures.FeatureId
       ORDER BY features.Name LIMIT 10;
+-----------+-----------+
| FeatureId | Name      |
+-----------+-----------+
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
+-----------+-----------+
10 rows in set (23.30 sec)

Here it is using GROUP BY instead of SELECT DISTINCT:

mysql> SELECT features.FeatureId,features.Name
       FROM features,itemsfeatures,items
       WHERE items.FlagStatus != 'U'
         AND items.TypeId = '13'
         AND features.Type = 'Material'
         AND features.FeatureId = itemsfeatures.FeatureId
       GROUP BY features.Name
       ORDER BY features.Name;
+-----------+--------------------+
| FeatureId | Name               |
+-----------+--------------------+
|        40 | Alligator          |
|        41 | Burnished Calfskin |
|        42 | Calfskin           |
|        59 | Canvas             |
|        43 | Chromexcel         |
|        44 | Cordovan           |
|        57 | Cotton             |
|        45 | Crocodile          |
|        58 | Deerskin           |
|        61 | Eel                |
|        46 | Italian Leather    |
|        47 | Lizard             |
|        48 | Nappa              |
|        49 | NuBuck             |
|        50 | Ostrich            |
|        51 | Patent Leather     |
|        60 | Rubber             |
|        52 | Sharkskin          |
|        53 | Silk               |
|        54 | Suede              |
|        56 | Veal               |
|        55 | Woven              |
+-----------+--------------------+
22 rows in set (13.28 sec)
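
Edit: Added a bounty, because I'm trying to understand the general problem: beyond the slowness of this particular query, how do you replace a bad SELECT DISTINCT query in general? Is GROUP BY usually the replacement for SELECT DISTINCT (even though in this case it isn't a complete solution, since it's still slow)?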

Solutions

As Joe stated, there does seem to be a join condition missing. Here is your current query:
SELECT DISTINCT 
        features.FeatureId,features.Name
FROM    features,itemsfeatures,items
WHERE   items.FlagStatus != 'U'
        AND items.TypeId = '13'
        AND features.Type = 'Material'
        AND features.FeatureId = itemsfeatures.FeatureId
ORDER BY features.Name
Here is the same query written with explicit joins:
SELECT DISTINCT 
        features.FeatureId,features.Name
FROM    features INNER JOIN
        itemsfeatures on features.FeatureId = itemsfeatures.FeatureId CROSS JOIN
        items
WHERE   items.FlagStatus != 'U'
        AND items.TypeId = '13'
        AND features.Type = 'Material'
ORDER BY features.Name
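
To get a feel for why the CROSS JOIN form is expensive, the two sides of the cross join can be counted separately. This is only a rough sketch using the table and column names from the queries above; the aliases feature_link_rows and matching_items are made up for illustration. Multiplying the two counts gives the number of row combinations the cross join can produce before DISTINCT collapses them.

```sql
-- Count each side of the CROSS JOIN on its own.
-- Their product is an upper bound on the combinations the join can generate.
SELECT
    (SELECT COUNT(*)
       FROM features
            INNER JOIN itemsfeatures
               ON features.FeatureId = itemsfeatures.FeatureId
      WHERE features.Type = 'Material') AS feature_link_rows,  -- hypothetical alias
    (SELECT COUNT(*)
       FROM items
      WHERE items.FlagStatus != 'U'
        AND items.TypeId = '13') AS matching_items;            -- hypothetical alias
```

In the EXPLAIN output from the question, the optimizer estimates 21 × 324 rows for the features/itemsfeatures part and a full scan of 8,656 rows for items, which is consistent with millions of rows being examined.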
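I can't be 100% sure, but it looks like removing every reference to the items table should give you exactly the same results:
SELECT DISTINCT 
        features.FeatureId,features.Name
FROM    features,itemsfeatures
WHERE   features.Type = 'Material'
        AND features.FeatureId = itemsfeatures.FeatureId
ORDER BY features.Name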
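From the way the query is written, it looks like you want the list of materials for items with TypeId 13 and FlagStatus <> 'U'. If that is the case, the results returned by the original query are wrong: it simply returns all materials used by any item. So, as Joe stated, add the inner join to items, and use explicit joins, since they make the intent clearer. I prefer GROUP BY, but DISTINCT will do the same thing.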
SELECT  features.FeatureId,features.Name
FROM    features INNER JOIN
        itemsfeatures on features.FeatureId = itemsfeatures.FeatureId INNER JOIN
        items on itemsfeatures.ItemID = items.ItemID
WHERE   items.FlagStatus != 'U'
        AND items.TypeId = '13'
        AND features.Type = 'Material'
GROUP BY features.FeatureId,features.Name
ORDER BY features.Name
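
On the more general question of what can stand in for DISTINCT: besides GROUP BY, a semi-join written with EXISTS is a common option when the join tables are only used as a filter. The following is a sketch, not something proposed in the thread. It assumes FeatureId uniquely identifies a row in features (the EXPLAIN output lists a PRIMARY key on that table, but the schema is not shown), and whether it is actually faster depends on the MySQL version and the available indexes.

```sql
-- Each feature is emitted at most once, so no DISTINCT or GROUP BY is needed.
-- Assumes features.FeatureId is unique (not confirmed by the question).
SELECT  f.FeatureId, f.Name
FROM    features f
WHERE   f.Type = 'Material'
        AND EXISTS (
            SELECT 1
            FROM   itemsfeatures ifx
                   INNER JOIN items i ON ifx.ItemID = i.ItemID
            WHERE  ifx.FeatureId = f.FeatureId
                   AND i.FlagStatus != 'U'
                   AND i.TypeId = '13'
        )
ORDER BY f.Name;
```

The subquery can stop at the first matching item for each feature, instead of building every feature/item pair and removing duplicates afterwards.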
Now that that's sorted out, on to speed. Create the following three indexes:
FeaturesIndex(Type,FeatureID,Name)
ItemsFeaturesIndex(FeatureId)
ItemsIndex(TypeId,FlagStatus,ItemID)
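
The three entries above are shorthand for index name and column list. Written out as MySQL statements they might look like the sketch below; the index names are the ones used above, and the column names are assumed to match the queries in this thread.

```sql
-- Covers the filter on Type and the selected columns of features.
CREATE INDEX FeaturesIndex      ON features      (Type, FeatureId, Name);

-- The EXPLAIN output in the question already shows a key named FeatureId
-- on itemsfeatures, so this one may duplicate an existing index.
CREATE INDEX ItemsFeaturesIndex ON itemsfeatures (FeatureId);

-- Covers both filters on items plus the join column.
CREATE INDEX ItemsIndex         ON items         (TypeId, FlagStatus, ItemID);
```

Re-running EXPLAIN on the query afterwards is the simplest way to check whether the optimizer actually picks these indexes up.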
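This should speed up both your current query and the ones I have listed.

It looks like you are missing the JOIN condition that links itemsfeatures to items. This is more obvious if you write the query with explicit JOIN operations.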
SELECT DISTINCT f.FeatureId,f.Name  
    FROM features f
        INNER JOIN itemsfeatures ifx
            ON f.FeatureID = ifx.FeatureID
        INNER JOIN items i
            ON ifx.ItemID = i.ItemID /* This is the part you're missing */
    WHERE i.FlagStatus != 'U'
        AND i.TypeId = '13'
        AND f.Type = 'Material'
    ORDER BY f.Name;

I'm almost convinced that Joe's answer is correct. But if you think Joe is wrong and you want the same results as your original query, only faster, use this query:
SELECT DISTINCT features.FeatureId,features.Name
    FROM features,itemsfeatures
    WHERE features.Type = 'Material'
        AND features.FeatureId = itemsfeatures.FeatureId
    ORDER BY features.Name;
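
One caveat on "the same results": with the original cross join, nothing comes back at all if no item has TypeId '13' and a FlagStatus other than 'U', whereas the query above ignores items entirely. If that edge case matters, a guard like the following sketch keeps the behaviour without joining against every item. It is an illustration only, using the column names from the question.

```sql
SELECT DISTINCT features.FeatureId, features.Name
    FROM features, itemsfeatures
    WHERE features.Type = 'Material'
        AND features.FeatureId = itemsfeatures.FeatureId
        -- Mirrors the original cross join: no rows unless some item qualifies.
        AND EXISTS (SELECT 1
                      FROM items
                     WHERE items.FlagStatus != 'U'
                       AND items.TypeId = '13')
    ORDER BY features.Name;
```

The EXISTS subquery does not reference the outer tables, so it only answers the yes/no question of whether any qualifying item exists.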