Problem description
I'm trying to upload a local file to BigQuery using Python. Whenever I run it, I get this error:
ValueError: Could not determine schema for table 'Table(TableReference(DatasetReference('database-150318', 'healthanalytics'), 'pres_kmd'))'. Call client.get_table() or pass in a list of schema fields to the selected_fields argument.
import glob
import gzip
import pdb

from google.cloud import bigquery

client = bigquery.Client(project="database-150318")
job_config = bigquery.LoadJobConfig(autodetect=True)
table_ref = client.dataset('healthanalytics').table('pres_kmd')
table = client.get_table(table_ref)
#table = dataset.table("test_table")
deidrows = []
for filename in glob.glob('/Users/janedoe/kmd/health/*dat.gz'):
    with gzip.open(filename) as f:
        for line in f:
            #line = line.decode().strip().split('|')
            deidrows.append(line)
client.insert_rows(table, deidrows)
pdb.set_trace()
Can anyone help? I assumed that putting autodetect in there would take care of it.
Thanks in advance!
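For reference, the commented-out line inside the loop above is the step that turns each gzipped byte line into a list of column values. A minimal sketch of that transform, using an in-memory gzip stream as stand-in sample data (the record values are made up for illustration):

```python
import gzip
import io

# Stand-in for one of the *dat.gz files: two pipe-delimited records (hypothetical data).
raw = b"123|aspirin|2015-01-02\n456|ibuprofen|2015-01-03\n"
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(raw)
buf.seek(0)

rows = []
with gzip.open(buf) as f:
    for line in f:
        # gzip.open yields bytes: decode, strip the newline, split on the delimiter
        rows.append(line.decode().strip().split("|"))

print(rows)
```

Without this decode/split step, insert_rows receives raw bytes instead of per-column values, which is one reason the loop in the question cannot work as written.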
Solution
You can try the following example:
import csv

from google.cloud import bigquery

client = bigquery.Client()
table_ref = client.dataset('bq_poc').table('new_emp')
table = client.get_table(table_ref)

filename = "data.csv"
with open(filename) as f:
    reader = csv.reader(f, skipinitialspace=True)
    # convert each row to the types expected by the table schema
    rows = [[int(row[0]), str(row[1]), int(row[2])] for row in reader]
client.insert_rows(table, rows)
Note:
- job_config is not used and can be removed
- the data needs to be converted into a specific format (here called rows)
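The conversion step from the answer can be tried without a live table by reading from an in-memory string instead of data.csv (the column layout and sample values below are assumptions, not from the original post):

```python
import csv
import io

# Stand-in for data.csv: id, name, age, with stray spaces after the commas.
sample = "1, alice, 30\n2, bob, 25\n"

# skipinitialspace=True drops the whitespace that follows each delimiter
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
# Cast each column to the type the table schema expects: INTEGER, STRING, INTEGER
rows = [[int(row[0]), str(row[1]), int(row[2])] for row in reader]

print(rows)  # [[1, 'alice', 30], [2, 'bob', 25]]
```

This is the "specific format" the note refers to: insert_rows expects a sequence of rows whose values already match the table's schema types.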