php – 合并3个具有相似数据但不同ID的MySQL数据库?

我们正在合并一些在不同阶段设置的独立公司数据库,它们都具有大致相同的数据,但不是在相同的订单/ ID中.

这是3个数据库的伪覆盖,每个数据库有2个示例表:

 ———— ———– —————
|数据库1 | | |
 ———— ———– —————
| fruit_id |名字| |
| 1 |橙色| |
| 2 |苹果| |
| 3 |香蕉| |
| | | |
| sales_id | fruit_ids | |
| 1924年| 2,3 |苹果,香蕉|
| 1925年| 1,3 |橙色,苹果|
| | | |
|数据库2 | | |
| fruit_id |名字| |
| 1 |苹果| |
| 2 |橙色| |
| 3 |香蕉| |
| | | |
| sales_id | fruit_ids | |
| 1924年| 2,3 |橙色,香蕉|
| 1925年| 1,3 |苹果,香蕉|
| | | |
|数据库3 | | |
| fruit_id |名字| |
| 1 |香蕉| |
| 2 |苹果| |
| 3 |橙色| |
| | | |
| sales_id | fruit_ids | |
| 1950年| 2,3 |苹果,橙色|
| 1951年| 1,3 |香蕉,橙色|
 ———— ———– —————

您将看到某些数据库sales_id实际上是重复的,甚至fruit_ids也与每个表中的不同项目相关.

我们正在使用的真实数据库将在许多地方的数据库周围点缀fruit_ids,而一些数据库包含1m行,因此手动MySQL查询有点不合适.

我们需要组合的实际数据库比上面的伪表具有更高的复杂性,但是我想看看是否有任何逻辑/工具/软件可以帮助将MysqL数据库与这种类似的数据结合起来?

解决方法:

到目前为止,答案并不多.这是一些初步的努力,以显示在具有相同格式的一些数据库之间生成公共密钥:

// Source databases or companies
// -----------------------------
$corp = array();
$corp[0] = 'XYZ Corp';
$corp[1] = 'ABC Corp';
$corp[2] = '123 Corp';

// Source fruit lists
// ------------------
$fruit = array();

$fruit[0]['orange'] = 1;
$fruit[0]['apple']  = 2;
$fruit[0]['banana'] = 3;
$fruit[0]['kiwi']   = 4;

$fruit[1]['apple']  = 1;
$fruit[1]['orange'] = 2;
$fruit[1]['banana'] = 3;
$fruit[1]['pear']   = 4;

$fruit[2]['banana'] = 1;
$fruit[2]['apple']  = 2;
$fruit[2]['orange'] = 3;
$fruit[2]['grape']  = 4;

// Master fruit list
// -----------------
echo "Generating common fruit key list...<br>\n";

// Generate common keys in sorted item order
// -----------------------------------------
$common = array();
foreach ($corp as $corpid => $name)
{
  echo "<br>\n";
  echo "... examining fruit keys for " . $corp[$corpid] . "<br>\n";
  foreach ($fruit[$corpid] as $name => $id)
  {
    $common[$name] = 0;
    echo "... ... fruit $name has key $id<br>\n";
  }
}

ksort($common);

$i = 0;
foreach ($common as $name => $dummy )
   $common[$name] = ++$i;

echo "<br>\n";   
print_r($common);
echo "<br><br>\n";

// Demonstrate index conversions
// -----------------------------
echo "Demonstrating conversions to common indexes per company...<br>\n";

foreach ($corp as $corpid => $name)
{
  echo "<br>\n";
  echo "... examining key conversions for " . $corp[$corpid] . "<br>\n";
  foreach ($fruit[$corpid] as $name => $id)
  {
    $new = $common[$name];
    $old = $id;
    echo "... ... fruit $name key changes from $old to $new<br>\n";
  }
}

您将获得类似于以下内容的结果:

Generating common fruit key list...

... examining fruit keys for XYZ Corp
... ... fruit orange has key 1
... ... fruit apple has key 2
... ... fruit banana has key 3
... ... fruit kiwi has key 4

... examining fruit keys for ABC Corp
... ... fruit apple has key 1
... ... fruit orange has key 2
... ... fruit banana has key 3
... ... fruit pear has key 4

... examining fruit keys for 123 Corp
... ... fruit banana has key 1
... ... fruit apple has key 2
... ... fruit orange has key 3
... ... fruit grape has key 4

Array ( [apple] => 1 [banana] => 2 [grape] => 3 [kiwi] => 4 [orange] => 5 [pear] => 6 ) 

Demonstrating conversions to common indexes per company...

... examining key conversions for XYZ Corp
... ... fruit orange key changes from 1 to 5
... ... fruit apple key changes from 2 to 1
... ... fruit banana key changes from 3 to 2
... ... fruit kiwi key changes from 4 to 4

... examining key conversions for ABC Corp
... ... fruit apple key changes from 1 to 1
... ... fruit orange key changes from 2 to 5
... ... fruit banana key changes from 3 to 2
... ... fruit pear key changes from 4 to 6

... examining key conversions for 123 Corp
... ... fruit banana key changes from 1 to 2
... ... fruit apple key changes from 2 to 1
... ... fruit orange key changes from 3 to 5
... ... fruit grape key changes from 4 to 3

需要考虑的一些问题:

>我不认为您可以安全地更新表,因为密钥将如上所述发生冲突.如果您愿意,可以通过调整新键值以避免与任何现有键值重叠来避免这种情况.例如,在这种情况下,您可以向所有新公共密钥添加一个整数,例如100000.这可能允许对先前的键值进行“就地”更新 – 可能对早期测试或备用转换路径有用.
>您需要一个确定的表列表才能应用转换.如果ID始终使用相同的名称,则可以通过扫描数据库表格式(可能通过describe tablename)自动执行.
>如果你确实在数据库中有一个逗号分隔的ID值列表,如图所示,你将不得不做额外的工作来识别它们,解析它们并适当地更新所有这些.
>您会注意到我为每家公司(即奇异果,梨,葡萄)添加一个独特的库存物品,因为它们可能并非全部都带有完全相同的产品组.

这应该为思考这个提供一个起点.将所有源数据库处理到新数据库然后验证合并结果可能需要多长时间.

相关文章

统一支付是JSAPI/NATIVE/APP各种支付场景下生成支付订单,返...
统一支付是JSAPI/NATIVE/APP各种支付场景下生成支付订单,返...
前言 之前做了微信登录,所以总结一下微信授权登录并获取用户...
FastAdmin是我第一个接触的后台管理系统框架。FastAdmin是一...
之前公司需要一个内部的通讯软件,就叫我做一个。通讯软件嘛...
统一支付是JSAPI/NATIVE/APP各种支付场景下生成支付订单,返...