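Problem description
I have built two pipelines for the same functionality, each using different transformations.
Is there a benchmark for comparing the efficiency and/or resource utilization of the two pipelines?
To elaborate:
Pipeline 1: uses only 2 mapping data flows, one with 4 transformations and the other with 20 transformations.
Pipeline 2: uses 2 mapping data flows, one with 4 transformations and the second with 15 transformations, plus a Databricks notebook.
I want to compare the two pipelines in terms of: 1. efficiency 2. resource utilization 3. cost
Any input?
Thanks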
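Solution
I think you can compare the outputs of the pipeline runs. The output contains the values you need, such as the duration, the throughput, the integration units used, and the billable duration.
Here is an example of the output from a pipeline execution: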
{
  "dataRead": 8192,
  "dataWritten": 612,
  "filesRead": 1,
  "sourcePeakConnections": 1,
  "sinkPeakConnections": 2,
  "rowsRead": 1,
  "rowsCopied": 1,
  "copyDuration": 12,
  "throughput": 0.667,
  "errors": [],
  "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)",
  "usedDataIntegrationUnits": 4,
  "billingReference": {
    "activityType": "DataMovement",
    "billableDuration": [
      {
        "meterType": "AzureIR",
        "duration": 0.06666666666666667,
        "unit": "DIUHours"
      }
    ]
  },
  "usedParallelCopies": 1,
  "executionDetails": [
    {
      "source": {
        "type": "AzureBlobStorage",
        "region": "Central US"
      },
      "sink": {
        "type": "AzureSqlDatabase",
        "region": "East US"
      },
      "status": "Succeeded",
      "start": "2020-09-01T08:20:09.1734161Z",
      "duration": 12,
      "profile": {
        "queue": {
          "status": "Completed",
          "duration": 9
        },
        "transfer": {
          "status": "Completed",
          "duration": 3,
          "details": {
            "listingSource": {
              "type": "AzureBlobStorage",
              "workingDuration": 0
            },
            "readingFromSource": {
              "type": "AzureBlobStorage"
            },
            "writingToSink": {
              "type": "AzureSqlDatabase",
              "workingDuration": 0
            }
          }
        }
      },
      "detailedDurations": {
        "queuingDuration": 9,
        "transferDuration": 3
      }
    }
  ],
  "dataConsistencyVerification": {
    "VerificationResult": "NotVerified"
  },
  "durationInQueue": {
    "integrationRuntimeQueue": 0
  }
}
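To compare two runs side by side, one option is to save each activity's output JSON and pull out only the fields that relate to efficiency, resource use, and cost. The following is a minimal sketch, not an official tool: the function names `summarize` and `compare` are hypothetical, and the field names are taken only from the copy-activity sample above. Data flow and Databricks notebook activities may report different fields, so the keys would need adjusting for those.

```python
import json

# Trimmed copy of the sample output shown above (only the fields used here).
SAMPLE = json.loads("""
{
  "copyDuration": 12,
  "throughput": 0.667,
  "usedDataIntegrationUnits": 4,
  "billingReference": {
    "activityType": "DataMovement",
    "billableDuration": [
      {"meterType": "AzureIR", "duration": 0.06666666666666667, "unit": "DIUHours"}
    ]
  },
  "executionDetails": [
    {"detailedDurations": {"queuingDuration": 9, "transferDuration": 3}}
  ]
}
""")


def summarize(output: dict) -> dict:
    """Extract the comparison-relevant values from one activity output.

    Missing fields come back as None (or an empty dict) instead of raising,
    because other activity types may not emit the same keys.
    """
    # Sum billable duration per unit, in case several meters are reported.
    billed = {}
    for entry in output.get("billingReference", {}).get("billableDuration", []):
        unit = entry.get("unit", "unknown")
        billed[unit] = billed.get(unit, 0.0) + entry.get("duration", 0.0)

    # Sum queue time across all execution detail entries.
    queue_s = sum(
        detail.get("detailedDurations", {}).get("queuingDuration", 0)
        for detail in output.get("executionDetails", [])
    )

    return {
        "duration_s": output.get("copyDuration"),       # efficiency
        "throughput": output.get("throughput"),         # efficiency
        "queue_s": queue_s,                             # time spent waiting
        "diu": output.get("usedDataIntegrationUnits"),  # resource utilization
        "billed": billed,                               # cost driver
    }


def compare(outputs: dict) -> dict:
    """Map each pipeline name to its summary so the rows can be compared."""
    return {name: summarize(out) for name, out in outputs.items()}


if __name__ == "__main__":
    # With real data, pass one saved output per pipeline, for example
    # compare({"pipeline1": out1, "pipeline2": out2}).
    for name, row in compare({"sample": SAMPLE}).items():
        print(name, row)
```

For a pipeline with several activities, the same summary could be computed per activity and then added up per pipeline before comparing.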
In the portal: