Problem description
Imagine this situation. I have a database table containing all car makes and models, similar to this:
+---------------------------+-------------------+
| name                      | alternative_names |
+---------------------------+-------------------+
| Peugeot 207               |                   |
| Peugeot 208               |                   |
| Peugeot 308               |                   |
| Peugeot 308 Station Wagon | estate sw         |
| Peugeot 307               |                   |
+---------------------------+-------------------+
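I need to identify a car stored in the database, starting from a string that is dirty in most cases.
"Peugeot 308" should return Peugeot 308
"Peugeot 308 Station Wagon" should return Peugeot 308 Station Wagon
"auto Peugeot 308" should return Peugeot 308
"sw Peugeot 308" should return Peugeot 308 Station Wagon
Any ideas on how I should solve this problem?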
Solution
It is hard to foresee every possible dirty string, so it is better to create a helper in PHP that capitalizes the first letter of each word and replaces the word sw with Station Wagon:
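function modelNormalizer($model)
{
    // Map of lowercase abbreviations to their canonical form.
    $possible_replace = [
        'sw' => 'Station Wagon',
    ];
    // Split on whitespace, dropping empty pieces caused by leading or repeated spaces.
    $byWord = preg_split('/\s+/', trim($model), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($byWord as $i => $word) {
        $lower = strtolower($word);
        // Replace whole-word matches only; otherwise capitalize the first letter.
        $byWord[$i] = $possible_replace[$lower] ?? ucfirst($lower);
    }
    return implode(' ', $byWord);
}
You can also normalize the data in the DB, for example by moving the words that come before peugeot to the end.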
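The idea of moving the words that precede the make to the end could be sketched roughly as follows. This is only an illustration: the function name moveLeadingWordsToEnd and its $make parameter are hypothetical and not part of the original answer, and the input is assumed to be already capitalized and expanded.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer):
// moves any words that appear before the make to the end of the string.
function moveLeadingWordsToEnd(string $model, string $make): string
{
    // Split on whitespace, ignoring leading, trailing, and repeated spaces.
    $words = preg_split('/\s+/', trim($model), -1, PREG_SPLIT_NO_EMPTY);

    // Locate the make case-insensitively by comparing lowercase copies.
    $pos = array_search(strtolower($make), array_map('strtolower', $words), true);

    if ($pos === false || $pos === 0) {
        // Make not found, or already first: only the whitespace is tidied.
        return implode(' ', $words);
    }

    $leading = array_slice($words, 0, $pos); // words before the make
    $rest = array_slice($words, $pos);       // the make and everything after it

    return implode(' ', array_merge($rest, $leading));
}

echo moveLeadingWordsToEnd('Station Wagon Peugeot 308', 'peugeot'), "\n";
// prints "Peugeot 308 Station Wagon"
```

With this ordering, an input such as "sw Peugeot 308" would end up in the same form as the name column once sw has been expanded. Noise words such as "auto" would still be carried to the end of the string, so they would probably need to be stripped separately before the lookup.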