Applying SciPy's Newton method to optimize a Weibull sum over a Pandas DataFrame


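I'm a novice programmer, but I know my way around Excel. I'm trying to teach myself Python so that I can work with larger datasets, mainly because I find it really interesting and enjoyable.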

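I'm trying to figure out how to recreate Excel's Goal Seek function in a script I've written (I believe SciPy's newton should be the equivalent). However, I don't have a simple function f(x) whose root I can find; instead, I'm trying to find the root of the sum of a dataframe column. I have no idea how to approach this.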


import pandas as pd
import os
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import weibull_min

# need to use a gamma function later on, so import math

import math

%matplotlib inline

# create dataframe using lidar experimental data

df = pd.read_csv(r'C:\Users\Latitude\Documents\Coursera\Wind Resource\Proj'
                 r'ect\Wind_Lidar_40and140.txt', sep=' ', header=None,
                 names=['Year', 'Month', 'Day', 'Hour', 'v_40', 'v_140'])

# add in columns for velocity cubed

df['v_40_cubed'] = df['v_40']**3
df['v_140_cubed'] = df['v_140']**3

# calculate mean wind speed, mean cubed wind speed, mean wind speed cubed
# use these to calculate energy pattern factor, c and k

v_40_bar = df['v_40'].mean()
v_40_cubed_bar = df['v_40_cubed'].mean()
v_40_bar_cubed = v_40_bar ** 3

# energy pattern factor = epf

epf = v_40_cubed_bar / v_40_bar_cubed

# shape parameter = k

k_40 = 1 + 3.69/epf**2

# scale factor = c
# use imported math library to use gamma function math.gamma

c_40 = v_40_bar / math.gamma(1+1/k_40)

# create new dataframe from current, using bins of 0.25 and generate frequency for these
# bins

bins_1 = np.linspace(0,16,65,endpoint=True)
freq_df = df.apply(pd.Series.value_counts,bins=bins_1)

# tidy up the dataframe by dropping superfluous columns and adding in a % time column for 
# frequency

freq_df_tidy = freq_df.drop(['Year','v_40_cubed','v_140_cubed'],axis=1)
freq_df_tidy['v_40_%time'] = freq_df_tidy['v_40']/freq_df_tidy['v_40'].sum()

# add in usable bin value for potential calculation of weibull

# left edge of each 0.25-wide bin: 0, 0.25, ..., 15.75 (64 values, one per bin)
freq_df_tidy['windspeed_bin'] = np.linspace(0,16,64,endpoint=False)

# calculate weibull column and wind power density from the weibull fit

freq_df_tidy['Weibull_40'] = weibull_min.pdf(freq_df_tidy['windspeed_bin'],k_40,loc=0,scale=c_40)/4
freq_df_tidy['Wind_Power_Density_40'] = 0.5 * 1.225 * freq_df_tidy['Weibull_40'] * freq_df_tidy['windspeed_bin']**3

# calculate wind power density from experimental data

df['Wind_Power_Density_40'] = 0.5 * 1.225 * df['v_40']**3

At this stage, the result for the Weibull data, round(freq_df_tidy['Wind_Power_Density_40'].sum(),2), gives 98.12.

The result for the experimental data, round(df['Wind_Power_Density_40'].mean(),2), gives 101.14.
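
For reference, the quantities computed in the script can be written out as formulas. This only restates the code; the symbol names are my own shorthand, and reading the constant 1.225 as air density in kg/m³ is an assumption, since the script does not say so. The bin width Δv = 0.25 corresponds to the division by 4.

```latex
\begin{aligned}
\mathrm{EPF} &= \frac{\overline{v^{3}}}{\bar{v}^{\,3}}, \qquad
k = 1 + \frac{3.69}{\mathrm{EPF}^{2}}, \qquad
c = \frac{\bar{v}}{\Gamma\!\left(1 + \tfrac{1}{k}\right)}, \\[4pt]
f(v; k, c) &= \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}
              \exp\!\left[-\left(\frac{v}{c}\right)^{k}\right], \\[4pt]
P_{W}(c) &= \sum_{i} \tfrac{1}{2}\,\rho\, f(v_i; k, c)\,\Delta v\; v_i^{3},
\qquad \rho = 1.225,\quad \Delta v = 0.25 .
\end{aligned}
```

Here v_i are the values in windspeed_bin, and P_W(c) is the sum of the Wind_Power_Density_40 column in freq_df_tidy, viewed as a function of the scale parameter c.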

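My aim now is to optimize the parameter c_40, which is used in the Weibull calculation of the Weibull power density (98.12), so that the result of round(freq_df_tidy['Wind_Power_Density_40'].sum(),2) comes out as close as possible to the experimental wind power density (101.14).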
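
One way this goal could be framed as a root-finding problem is sketched below: wrap the column sum in a function of c, subtract the target, and pass that residual to scipy.optimize.newton. This is a minimal sketch, not a tested solution for the real data. The arrays and the values of k and c_start are hypothetical stand-ins for windspeed_bin, k_40 and c_40, and the function names weibull_wpd and residual are made up for illustration. The round() call is left out of the residual because rounding would make the function piecewise constant.

```python
import numpy as np
from scipy.optimize import newton
from scipy.stats import weibull_min

# Hypothetical stand-in values; in the real script these would come
# from freq_df_tidy['windspeed_bin'], k_40 and c_40.
bin_width = 0.25
speeds = np.arange(0, 16, bin_width)  # left edges of the 64 bins
k = 2.0                               # assumed shape parameter
c_start = 6.0                         # assumed starting guess for c
rho = 1.225                           # constant used in the script
target = 101.14                       # experimental value quoted in the post

def weibull_wpd(c):
    """Binned Weibull wind power density for scale parameter c."""
    prob = weibull_min.pdf(speeds, k, loc=0, scale=c) * bin_width
    return np.sum(0.5 * rho * prob * speeds**3)

def residual(c):
    """Zero when the Weibull sum matches the experimental value."""
    return weibull_wpd(c) - target

# With no derivative (fprime) supplied, newton uses the secant method.
c_opt = newton(residual, c_start)
print(c_opt, weibull_wpd(c_opt))
```

The point of the design is that the dataframe column is rebuilt inside the function each time c changes, so the "sum of a column" becomes an ordinary f(x). If a starting guess far from the answer causes trouble, a bracketing solver such as scipy.optimize.brentq might be a more robust choice.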

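Any help with this would be greatly appreciated. Apologies if I've put too much code in the request; I wanted to give as much detail as possible. From my research, I think the SciPy newton method should do the trick, but I can't figure out how to apply it here.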




