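Problem description
I have a large DataFrame [870 MB, shape=(938050, 31)] and I am trying to run a multinomial logit regression on it with statsmodels. When I run the regression line, it runs until I get
Process finished with exit code -1066598273 (0xC06D007F)
It runs successfully on 10k rows, but I want to use the whole df. What can I do to run this efficiently?
Thanks for your help!
Code: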
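import os
from io import StringIO

import pandas as pd
import statsmodels.api as st

# export_path and data_path are assumed to be defined elsewhere.
writer = pd.ExcelWriter(path=os.path.join(export_path, 'regression.xlsx'), engine='xlsxwriter')
vars_matrix_df = pd.read_csv(data_path, skipinitialspace=True)
# Note: day_of_week and time_of_day are listed here and also enter the formula as C(...) terms.
corr_cols = ['sales_vs_service','agent_experience','minutes_passed_since_shift_started','stage_in_conv','current_cust_wait_time','prev_cust_line_words','total_cust_words_in_conv','agent_total_turns','sentiment_score','max_sentiment','min_sentiment','last_sentiment','agent_response_time','customer_response_rate','is_last_cust_answered','conversation_opening','queue_length','total_lines_from_rep','agent_number_of_conversations','concurrency','rep_shift_start_time','first_cust_line_num_of_words','queue_wait_time','day_of_week','time_of_day']
# Fit the multinomial logit model; this is the line on which the process dies.
reg_equation = st.formula.mnlogit(f'visitor_was_answered ~ C(day_of_week) + C(time_of_day) + {" + ".join(corr_cols)}', vars_matrix_df).fit()
# The original referenced an undefined vars_summary; the summary is taken from the fitted result.
reg_summary_as_html = reg_equation.summary().tables[0].as_html()
# Wrap the HTML string in StringIO, which newer pandas versions expect for literal HTML.
reg_summary_df = pd.read_html(StringIO(reg_summary_as_html), header=0, index_col=0)[0].reset_index()
reg_summary_df.to_excel(writer, sheet_name='reg_summary_results', index=False)
# close() writes the workbook to disk; ExcelWriter.save() was removed in pandas 2.0.
writer.close()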
Solution
No working solution to this problem has been found yet.
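In the absence of a confirmed fix, one way to narrow the problem down is to load only the columns the formula uses and to fit on progressively larger random samples, since the question reports that 10k rows work. The sketch below is an illustration under assumptions, not a verified cure for the exit code: the helper name load_model_columns and its parameters sample_frac and seed are hypothetical, and the tiny in-memory CSV holds made-up values standing in for the real file.

```python
import io

import pandas as pd


def load_model_columns(csv_source, columns, sample_frac=None, seed=0):
    """Read only `columns` from a CSV and optionally keep a random fraction of rows.

    Hypothetical helper: the name and parameters are illustrative, not part of
    pandas or statsmodels.
    """
    # usecols skips every column the model does not need while parsing.
    df = pd.read_csv(csv_source, usecols=columns, skipinitialspace=True)
    if sample_frac is not None:
        # A fixed random_state makes the sample reproducible between runs.
        df = df.sample(frac=sample_frac, random_state=seed)
    return df.reset_index(drop=True)


# Tiny in-memory CSV standing in for the real data file (made-up values).
demo_csv = io.StringIO(
    "visitor_was_answered,queue_length,concurrency,unused_column\n"
    "1,3,2,a\n"
    "0,5,1,b\n"
    "1,2,4,c\n"
    "0,7,3,d\n"
)
needed = ["visitor_was_answered", "queue_length", "concurrency"]
demo_df = load_model_columns(demo_csv, needed, sample_frac=0.5, seed=0)
print(demo_df.shape)
```

With the real data, the column list would presumably be ['visitor_was_answered'] + corr_cols and the returned frame would go into the same mnlogit call. Raising sample_frac step by step toward 1.0 would show at which size the failure first appears, which helps tell a size-related problem from one that is independent of the data.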