How do I link fields to each other in Python when generating fake data with Faker?

Problem description

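I want to link the "interests" and "skills" columns to "propensity_to_buy". For example, 80% of records with the skill athlete should have the highest propensity to buy (0.7-1), 80% of records with singing should have a medium propensity (0.3-0.6), and 80% of records with painting should have the lowest propensity (0-0.3). How can I implement this logic with Faker?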

import csv
import random

import numpy as np
from faker import Faker


def datagenerate(records, headers):
    fake = Faker('en_US')

    #age_range = ["18-45","45-65","14-17"]
    interests = ["movies,bikes","youtube,food","hockey,books"]
    profession = ["doctor","engineer","architect"]
    preferred_language = ["English","hindi","french"]
    schools = ["High School","Degree","PhD"]
    skills = ["painting","singing","athlete"]
    campaign_content = "get_the_best_insurance"
    no_of_days = [0,1,2,3,4,5,6,7,8,9,10]

    # newline='' keeps the csv module from writing blank rows on Windows
    with open("C:\\Users\\PRIYANKI\\OneDrive\\Documents\\My_matches\\PropensityToBuy_test1.csv", 'wt', newline='') as csvFile:
        writer = csv.DictWriter(csvFile, fieldnames=headers)
        writer.writeheader()

        countries = ["Canada","India","US"]
        cities = {"Canada":["Orlando","Ontario"],"India":["Delhi","Mumbai"],"US":["New York","Washington"]}

        for i in range(records):
            # Pick the country first, then a city that belongs to it
            country = random.choice(countries)
            city = random.choice(cities[country])

            writer.writerow({
                #"campaign_transaction_id" : fake.fixed_width(data_columns=[(30,'name'),(5,'pyint',{'min_value':30,'max_value':40})],align='right',num_rows=1),
                "campaign_transaction_id" : fake.sha1(raw_output=False),
                "organization_id" : "1aadefa1-4a35-45fb-bcf5-0ed1e2921bff",
                "campaign_id": "1fe32202-8473-4218-af74-11d1df5e5dbc",
                "referred_by" : fake.name(),
                "referred_to" : fake.name(),
                "loyalty_id" : "603c695b-1dd4-49c9-b80b-2d9637bbda71",
                "campaign_start_date" : fake.date_this_month(),
                "campaign_end_date" : fake.future_date(end_date='+10d',tzinfo=None),
                #"campaign_end_date" : fake.date_time_between_dates(),
                #"campaign_end_date": fake.date(),
                "age" : random.randint(14,70),
                "country" : country,
                "city" : city,
                #"city" : fake.city(),
                #"country" : fake.country(),
                "interests" : random.choice(interests),
                "profession" : random.choice(profession),
                "preferred_language" : random.choice(preferred_language),
                "schools" : random.choice(schools),
                "skills": random.choice(skills),
                "time_stamp": fake.date_time_between_dates(),
                "campaign_content" : campaign_content,
                "no_of_days" : random.choice(no_of_days),
                # "propensity_to_buy" : random.uniform(0,1)
                "propensity_to_buy" : np.random.choice(no_of_days)/10,
            })


if __name__ == '__main__':
    records = 10000
    headers = ["campaign_transaction_id","organization_id","campaign_id","loyalty_id","referred_by","referred_to","campaign_start_date","campaign_end_date",
               "age","city","country","interests","profession","preferred_language","schools","skills","time_stamp","campaign_content","no_of_days","propensity_to_buy"]
    datagenerate(records, headers)
    print("CSV generation complete!")
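
One possible way to tie "propensity_to_buy" to "skills" is to pick the skill first and then draw the propensity from a range that depends on it, the same way the code picks a city that depends on the country. The following is a minimal sketch, not a Faker feature: it uses only the standard random module. The names SKILL_BANDS and propensity_for are hypothetical. The question does not say where the remaining 20% of records should fall, so the sketch assumes they land in one of the other skills' ranges.

```python
import random

# Ranges taken from the question: the typical propensity band for each skill.
SKILL_BANDS = {
    "athlete": (0.7, 1.0),
    "singing": (0.3, 0.6),
    "painting": (0.0, 0.3),
}


def propensity_for(skill, rng=random, p_in_band=0.8):
    """Return a propensity that lies in the skill's own band about 80% of the time."""
    low, high = SKILL_BANDS[skill]
    if rng.random() < p_in_band:
        return round(rng.uniform(low, high), 2)
    # Assumption: the other ~20% are drawn from one of the other skills' bands.
    other_bands = [band for name, band in SKILL_BANDS.items() if name != skill]
    low, high = rng.choice(other_bands)
    return round(rng.uniform(low, high), 2)


if __name__ == "__main__":
    skill = random.choice(list(SKILL_BANDS))
    print(skill, propensity_for(skill))
```

Inside the row loop this would mean storing the choice in a variable first (skill = random.choice(skills)) and then writing "skills": skill and "propensity_to_buy": propensity_for(skill) in the same row, so both columns come from one draw. The same pattern could be extended to "interests" by keying a second table on the interest value.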

