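Problem description
I have data in JSON and CSV format that I sometimes need to hand over to external analysts. They are used to working with sav
files, so I would like to give them a utility that lets them load the data and work with it the same way they would with a sav file.
I am familiar with Python, and I have access to Stata to try things out (though I have never opened it).
Example data for this purpose: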
import pandas as pd
import numpy as np
# Value labels: for each column, code -> meaning
variable_value_labels = {
    "col_a": {
        1: "first thing",
        2: "something second",
        3: "meaning of three",
    },
    "col_b": {1: "No", 2: "Yes"},
}
# Column (variable) labels: column name -> description
column_names_to_labels = {
    "col_a": "this is a column here",
    "col_b": "and another",
    "col_c": "this is a column without variable value labels",
}
N = 10
df = pd.DataFrame(
    {
        "col_a": np.random.choice(list(variable_value_labels["col_a"].keys()), N),
        "col_b": np.random.choice(list(variable_value_labels["col_b"].keys()), N),
        "col_c": np.random.rand(N),
    }
)
The raw data for the above:
Column names to labels JSON:
{"col_a": "this is a column here", "col_b": "and another", "col_c": "this is a column without variable value labels"}
Variable value labels JSON:
{"col_a": {"1": "first thing", "2": "something second", "3": "meaning of three"}, "col_b": {"1": "No", "2": "Yes"}}
DataFrame CSV:
col_a,col_b,col_c
1,1,0.8360787635373775
2,0.3373961604172684
1,2,0.6481718720511972
2,0.36824153984054797
2,0.9571551589530464
3,0.14035078041264515
1,0.8700872583584364
3,0.4736080452737105
1,0.8009107519796442
1,0.5204774795512048
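One detail in the raw data is easy to miss: the value-label codes are integers in the Python dict but strings in the JSON, because JSON object keys are always strings. The sketch below shows the effect and one way to undo it after loading. The helper name restore_int_codes is made up for this illustration, and it assumes every code is a whole number.

```python
import json

variable_value_labels = {
    "col_a": {1: "first thing", 2: "something second", 3: "meaning of three"},
    "col_b": {1: "No", 2: "Yes"},
}

# Serialising to JSON turns the integer codes into strings.
loaded = json.loads(json.dumps(variable_value_labels))


def restore_int_codes(value_labels):
    """Convert the string keys produced by JSON back to integer codes."""
    return {
        column: {int(code): label for code, label in mapping.items()}
        for column, mapping in value_labels.items()
    }


restored = restore_int_codes(loaded)
print(list(loaded["col_b"]))              # ['1', '2']
print(restored == variable_value_labels)  # True
```

Without this step the labels would be keyed by "1" and "2" while the CSV columns hold the numbers 1 and 2, so the two would not line up.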
What I want is some function or procedure that does something like:
function(dataframe_path, variable_value_labels_path, column_names_to_labels_path):
    return <sav file with above info>
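One possible shape for such a function, as a sketch rather than a tested solution: it assumes the third-party pyreadstat package (not mentioned in the question) is installed alongside pandas, and uses its write_sav function. The extra sav_path parameter and the helper names read_json and prepare_labels are inventions of this sketch. It also assumes the value-label codes are whole numbers.

```python
import json


def read_json(path):
    """Load one JSON file as a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def prepare_labels(columns, column_labels, value_labels):
    """Shape the two label mappings for the sav writer."""
    # One entry per column, None where a column has no label.
    label_list = [column_labels.get(col) for col in columns]
    # JSON keys are strings; the codes in the CSV are integers.
    int_value_labels = {
        col: {int(code): text for code, text in mapping.items()}
        for col, mapping in value_labels.items()
        if col in columns
    }
    return label_list, int_value_labels


def csv_json_to_sav(dataframe_path, variable_value_labels_path,
                    column_names_to_labels_path, sav_path):
    """Write a .sav file from the CSV and the two JSON label files."""
    # Third-party imports are kept local so that prepare_labels can be
    # used without pandas or pyreadstat installed.
    import pandas as pd
    import pyreadstat

    df = pd.read_csv(dataframe_path)
    label_list, value_labels = prepare_labels(
        list(df.columns),
        read_json(column_names_to_labels_path),
        read_json(variable_value_labels_path),
    )
    pyreadstat.write_sav(
        df,
        sav_path,
        column_labels=label_list,
        variable_value_labels=value_labels,
    )
    return sav_path
```

The function returns the path of the written file rather than a file object, since analysts would open the result in their own tool anyway.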
It would also be nice to be able to pass the path of a directory that contains these files.
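A directory-based variant could first resolve the three file paths and then hand them to whatever function does the conversion. The sketch below only covers the lookup step. The file names in EXPECTED_FILES are hypothetical, since the question does not say how the files are named, and resolve_input_paths is a name chosen for this example.

```python
from pathlib import Path

# Hypothetical file names; adjust to the convention the directory uses.
EXPECTED_FILES = {
    "dataframe_path": "dataframe.csv",
    "variable_value_labels_path": "variable_value_labels.json",
    "column_names_to_labels_path": "column_names_to_labels.json",
}


def resolve_input_paths(directory):
    """Return the three input paths inside `directory`.

    Raises FileNotFoundError listing every expected file that is absent.
    """
    directory = Path(directory)
    paths = {arg: directory / name for arg, name in EXPECTED_FILES.items()}
    missing = [str(path) for path in paths.values() if not path.is_file()]
    if missing:
        raise FileNotFoundError("missing input files: " + ", ".join(missing))
    return paths
```

The keys of the returned dict match the parameter names in the signature sketched in the question, so the result can be unpacked with ** into a function that takes those three arguments.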