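Problem description
This is one of the functions in a Python script that I am trying to write unit test cases for. It uses global variables, plus audit and BigQuery functions that are written as separate utility scripts, and I don't understand how to apply @patch and write unit test cases for it.
- How do I patch the global variables?
- How do I patch functions that return nothing, for example audit_event_source_table? Can such functions be ignored during unit testing, and if so, how?
- Since the function has no return value, only logger.info messages, how do I make assertions?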
import logging
from datetime import datetime
from pathlib import Path
import sys
import __main__
from intient_research_rdm_common.utils.audit_utils import audit_event_source_table,audit_event_job_table,\
    get_job_id,get_source_object_id
from intient_research_rdm_kg_core.common_utils.utils.bigquery_utils import bigquery_data_read
from intient_research_rdm_kg_core.common_utils.utils.conf_read import read_args,read_source_config,read_env_config

global project_id,service_account,conn_ip,debug,node_table_list,edge_table_list,source_name

def edge_validation():
    global edge_table_list
    global source_name
    edge_table_na = []
    edge_table_list_rowcount_zero = []
    dataset_e = "prep_e_" + source_name
    row_count = 0
    edge_table = ""
    source_object_start_timestamp = datetime.now()
    source_object_id = get_source_object_id(source_name,source_object_start_timestamp)
    source_object_type = AUDIT_SOURCE_OBJECT_TYPE_BIGQUERY
    job_id = get_job_id(source_object_start_timestamp)
    source_object_name = dataset_e
    try:
        for edge_table in edge_table_list:
            sql_query = " SELECT * FROM " + "`" + project_id + "." + dataset_e + ".__TABLES__` WHERE table_id =" + "'" + edge_table + "'"
            data_read,col_names = bigquery_data_read(service_account,sql_query,project_id)
            for ind in data_read.index:
                row_count = (data_read['row_count'][ind])
            if len(data_read.index) == 0:
                edge_table_na.append(edge_table)
            elif row_count == 0:
                edge_table_list_rowcount_zero.append(edge_table)
        if len(edge_table_na) > 0:
            logging.info("Missing Edge tables in preprocessing layer {} ".format(edge_table_na))
        if len(edge_table_list_rowcount_zero) > 0:
            logging.info("Edge tables with row count as zero in Pre-processing layer {} ".format(edge_table_list_rowcount_zero))
        if len(edge_table_na) == 0 and len(edge_table_list_rowcount_zero) == 0:
            logging.info(
                "Edge list validation for the source {} has been successfully completed with no discrepancies".format(
                    source_name))
            audit_event_source_table(source_object_id,source_object_name,source_object_type,source_name,job_id,AUDIT_JOB_STATUS_PASS,source_object_start_timestamp,datetime.now(),'NA',project_id)
        if len(edge_table_na) > 0 or len(edge_table_list_rowcount_zero) > 0:
            audit_event_source_table(source_object_id,project_id)
            sys.exit(1)
    except Exception as e:
        msg = '{} : Issue with the edge validation for {} is: \n{}\n'.format(datetime.now(),edge_table,e)
        logging.error(msg)
        audit_event_source_table(source_object_id,AUDIT_JOB_STATUS_FAIL,AUDIT_ERROR_TYPE_PREPROCESSING_KG_LAYER_VALIDATION,msg,project_id)
        raise Exception(msg)
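Solution
- Patching global variables: a global variable can be patched in the same way as a method of a class. The snippet does not make clear where the globals are defined (whether they are imported from another module or assigned at the top of the Python script). Either way, patch them in the namespace where the function looks them up. If you can confirm more details, I can help further.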
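- Personally, I patch and test functions with no return value the same way as any other function. For example, to pin the value assigned to source_object_start_timestamp, patch the datetime name in the namespace of the script under test (your_module is a placeholder for that script's module name, and datetime on the right-hand side is the real class imported in the test file):
mock_datetime = patch('your_module.datetime').start()
mock_datetime.now.return_value = datetime(2020, 8, 16, 20, 36, 6, 578174)
Patching now directly on the class does not work, because datetime.datetime is a built-in type whose attributes cannot be replaced. I would still patch the BigQuery and audit functions; in your unit test, use the call_count attribute of the unittest.mock.Mock object (or an assertion helper such as assert_called_once_with) to check that the function was called.
- Point 2 also covers your third question: use call_count to check how many times a mock was called. To check the logged text itself, patch logging.info the same way and inspect its call arguments, or capture the records with TestCase.assertLogs.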
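The three points above can be put together in one small test case. This is a minimal sketch under stated assumptions, not the asker's actual setup: validation_script and its shortened edge_validation are hypothetical stand-ins built in memory so the example runs without the real audit and BigQuery utilities, and the "PASS"/"FAIL" arguments are illustrative only. It patches the module globals (with create=True, since a bare global statement at module level never assigns them), replaces the helpers that return nothing with mocks, and asserts on call counts, call arguments, log output and the exit status.

```python
import types
import unittest
from unittest.mock import patch

# Hypothetical, simplified stand-in for the script under test. As in the
# question, the globals are used but never assigned at import time, and the
# BigQuery and audit helpers must not run for real inside a unit test.
SCRIPT_SOURCE = '''
import logging
import sys

def bigquery_data_read(service_account, sql_query, project_id):
    raise RuntimeError("real BigQuery call; patch this in tests")

def audit_event_source_table(*args):
    raise RuntimeError("real audit call; patch this in tests")

def edge_validation():
    missing = []
    for table in edge_table_list:
        rows = bigquery_data_read(service_account, table, project_id)
        if len(rows) == 0:
            missing.append(table)
    if missing:
        logging.info("Missing Edge tables in preprocessing layer {} ".format(missing))
        audit_event_source_table(source_name, "FAIL", project_id)
        sys.exit(1)
    logging.info("Edge list validation for the source {} passed".format(source_name))
    audit_event_source_table(source_name, "PASS", project_id)
'''
script = types.ModuleType("validation_script")
exec(SCRIPT_SOURCE, script.__dict__)


class EdgeValidationTest(unittest.TestCase):
    def setUp(self):
        # Undo every patch started below once the test finishes.
        self.addCleanup(patch.stopall)
        # create=True is required because these globals do not exist on the
        # module until something assigns them.
        test_globals = {"edge_table_list": ["e1", "e2"], "source_name": "src",
                        "project_id": "proj", "service_account": "sa"}
        for name, value in test_globals.items():
            patch.object(script, name, value, create=True).start()
        # Replace the helpers with mocks; .start() returns the MagicMock.
        self.mock_read = patch.object(script, "bigquery_data_read").start()
        self.mock_audit = patch.object(script, "audit_event_source_table").start()

    def test_all_tables_present(self):
        self.mock_read.return_value = [{"row_count": 5}]
        with self.assertLogs(level="INFO") as captured:
            script.edge_validation()
        # One BigQuery lookup per table, one audit record for the run.
        self.assertEqual(self.mock_read.call_count, 2)
        self.mock_audit.assert_called_once_with("src", "PASS", "proj")
        self.assertIn("passed", captured.output[0])

    def test_missing_tables_exit_with_status_1(self):
        self.mock_read.return_value = []
        # sys.exit raises SystemExit, which the test must expect explicitly.
        with self.assertLogs(level="INFO") as captured, \
                self.assertRaises(SystemExit) as exit_info:
            script.edge_validation()
        self.assertEqual(exit_info.exception.code, 1)
        self.mock_audit.assert_called_once_with("src", "FAIL", "proj")
        self.assertIn("Missing Edge tables", captured.output[0])


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(EdgeValidationTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

In a real project the in-memory module would be replaced by importing the script itself and passing that module to patch.object, or by using string targets such as patch('your_module.edge_table_list', [...], create=True), where your_module is again a placeholder. Note that in the original function sys.exit(1) sits inside the try block but is not caught by except Exception, because SystemExit does not derive from Exception, so assertRaises(SystemExit) is the right way to test that branch.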