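Problem description
I am migrating a number of databases, and for databases larger than 50GB the task fails during CDC after a while because the replication instance runs out of storage.
I am using the replication instance class dms.r5.large, and everything runs smoothly until the full load completes.
When CDC starts, I get log messages like this: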
D: There are 188 swap files of total size 93156 Mb. Left to process 188 of size 93156 Mb
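However, the swap files are never discarded: the instance keeps accumulating them until it eventually runs out of storage.
Note that the swap usage shown in the monitoring metrics is close to zero.
I have already tried dms.r5.xlarge and the problem is the same, which makes me think memory is not the issue.
Do you know what could cause this behavior? Is there a way to debug it?
Thanks!
More useful data:
Replication instance class: dms.r5.large; I have also tried dms.r5.xlarge.
Storage: 40GB; I also tried 300GB, but the CDC phase eventually consumed all of it.
The database to migrate is about 80GB.
Task settings: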
{
"TargetMetadata": {
"TargetSchema": "","SupportLobs": true,"FullLobMode": false,"LobChunkSize": 0,"LimitedSizeLobMode": true,"LobMaxSize": 32,"InlineLobMaxSize": 0,"LoadMaxFileSize": 0,"ParallelLoadThreads": 0,"ParallelLoadBufferSize": 0,"BatchApplyEnabled": false,"TaskRecoveryTableEnabled": false,"ParallelLoadQueuesPerThread": 0,"ParallelApplyThreads": 0,"ParallelApplyBufferSize": 0,"ParallelApplyQueuesPerThread": 0
},"FullLoadSettings": {
"TargetTablePrepMode": "DROP_AND_CREATE","CreatePkAfterFullLoad": false,"StopTaskCachedChangesApplied": false,"StopTaskCachedChangesNotApplied": false,"MaxFullLoadSubTasks": 8,"TransactionConsistencyTimeout": 600,"CommitRate": 10000
},"Logging": {
"EnableLogging": true,"LogComponents": [{
"Id": "SOURCE_UNLOAD","Severity": "LOGGER_SEVERITY_DEFAULT"
},{
"Id": "SOURCE_CAPTURE"
},{
"Id": "TARGET_LOAD"
},{
"Id": "TARGET_APPLY","Severity": "LOGGER_SEVERITY_INFO"
},{
"Id": "TASK_MANAGER","Severity": "LOGGER_SEVERITY_DEBUG"
}]
},"ControlTablesSettings": {
"historyTimeslotInMinutes": 5,"ControlSchema": "","HistoryTimeslotInMinutes": 5,"HistoryTableEnabled": false,"SuspendedTablesTableEnabled": false,"StatusTableEnabled": false
},"StreamBufferSettings": {
"StreamBufferCount": 3,"StreamBufferSizeInMB": 8,"CtrlStreamBufferSizeInMB": 5
},"ChangeProcessingDdlHandlingPolicy": {
"HandleSourceTableDropped": true,"HandleSourceTableTruncated": true,"HandleSourceTableAltered": true
},"ErrorBehavior": {
"DataErrorPolicy": "LOG_ERROR","DataTruncationErrorPolicy": "LOG_ERROR","DataErrorEscalationPolicy": "SUSPEND_TABLE","DataErrorEscalationCount": 0,"TableErrorPolicy": "SUSPEND_TABLE","TableErrorEscalationPolicy": "STOP_TASK","TableErrorEscalationCount": 0,"RecoverableErrorCount": -1,"RecoverableErrorInterval": 5,"RecoverableErrorThrottling": true,"RecoverableErrorThrottlingMax": 1800,"RecoverableErrorStopRetryAfterThrottlingMax": false,"ApplyErrorDeletePolicy": "IGNORE_RECORD","ApplyErrorInsertPolicy": "LOG_ERROR","ApplyErrorUpdatePolicy": "LOG_ERROR","ApplyErrorEscalationPolicy": "LOG_ERROR","ApplyErrorEscalationCount": 0,"ApplyErrorFailOnTruncationDdl": false,"FullLoadIgnoreConflicts": true,"FailOnTransactionConsistencyBreached": false,"FailOnNoTablesCaptured": false
},"ChangeProcessingTuning": {
"BatchApplyPreserveTransaction": true,"BatchApplyTimeoutMin": 1,"BatchApplyTimeoutMax": 30,"BatchApplyMemoryLimit": 500,"BatchSplitSize": 0,"MinTransactionSize": 1000,"CommitTimeout": 1,"MemoryLimitTotal": 1024,"MemoryKeepTime": 60,"StatementCacheSize": 50
},"ValidationSettings": {
"EnableValidation": true,"ValidationMode": "ROW_LEVEL","ThreadCount": 5,"PartitionSize": 10000,"FailureMaxCount": 10000,"RecordFailureDelayInMinutes": 5,"RecordSuspendDelayInMinutes": 30,"MaxKeyColumnSize": 8096,"TableFailureMaxCount": 1000,"ValidationOnly": false,"HandleCollationDiff": false,"RecordFailureDelayLimitInMinutes": 0,"SkipLobColumns": false,"ValidationPartialLobSize": 0,"ValidationQueryCdcDelaySeconds": 0
},"PostProcessingRules": null,"CharacterSetSettings": null,"LoopbackPreventionSettings": null,"BeforeImageSettings": null
}
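Solution
The root cause was high target latency, which came from a problem with the database table structure.
Tables with a large number of records were missing primary keys or unique identifiers, so applying each change caused a full table scan. The changes could not be applied fast enough and were held in storage on the replication instance instead.
Eventually the instance ran out of storage.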
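One way to confirm that the backlog is building up rather than draining is to track the swap-file figures in the task log over time. The following is a minimal sketch in Python. It assumes the log lines keep the exact wording of the message quoted in the question; the names `SwapStatus`, `parse_swap_line` and `backlog_is_growing` are made up for illustration and are not part of DMS.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern modelled on the log line quoted in the question (assumed stable wording).
SWAP_LINE = re.compile(
    r"There are (\d+) swap files of total size (\d+) Mb\. "
    r"Left to process (\d+) of size (\d+) Mb"
)


class SwapStatus(NamedTuple):
    files: int
    total_mb: int
    pending_files: int
    pending_mb: int


def parse_swap_line(line: str) -> Optional[SwapStatus]:
    """Extract the four counters from one log line, or None if it does not match."""
    match = SWAP_LINE.search(line)
    if match is None:
        return None
    return SwapStatus(*(int(group) for group in match.groups()))


def backlog_is_growing(lines: Iterable[str]) -> bool:
    """True if the pending size in the last matching line exceeds the first one."""
    sizes = [s.pending_mb for s in map(parse_swap_line, lines) if s is not None]
    return len(sizes) >= 2 and sizes[-1] > sizes[0]
```

If the pending size keeps rising while the source is not unusually busy, the target is probably applying changes more slowly than they arrive, which matches the cause described above.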
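To fix this, run a premigration assessment to check whether your database is suitable for a DMS migration.
Another way to solve the problem is to add an extra column during the migration to create a unique key, and drop it once the migration is done.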
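To find the tables that need such a key before starting the task, you can compare the list of tables with the key constraints reported by the source database. The sketch below is only an illustration: the source engine is not stated in the question, so the query assumes an engine that exposes `information_schema.table_constraints` (MySQL, PostgreSQL and SQL Server do), and `tables_without_keys` is a hypothetical helper name. Running the query and fetching the rows is left to whatever database driver you use.

```python
from typing import Iterable, List, Tuple

# Assumed query; add a schema filter suited to your engine before running it.
KEY_CONSTRAINTS_SQL = """
SELECT table_schema, table_name, constraint_type
FROM information_schema.table_constraints
WHERE constraint_type IN ('PRIMARY KEY', 'UNIQUE')
"""

Table = Tuple[str, str]  # (schema, table)


def tables_without_keys(
    all_tables: Iterable[Table],
    key_constraints: Iterable[Tuple[str, str, str]],
) -> List[Table]:
    """Return the tables that have neither a PRIMARY KEY nor a UNIQUE constraint.

    all_tables:      (schema, table) pairs for every table in the migration.
    key_constraints: rows shaped like the result of KEY_CONSTRAINTS_SQL.
    """
    keyed = {(schema, table) for schema, table, _ctype in key_constraints}
    return sorted(t for t in set(all_tables) if t not in keyed)
```

Note that on some engines a unique index created without a constraint may not appear in this view, so treat the result as a list of candidates to check by hand.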