pandas DataFrame: integrating over time gives strange results

Problem description

I have tabular data of acceleration values over time, such as the following example:

time(s)  acc_x  acc_y
0.1       0      0
0.2      -0.98   1.66
0.3       1.42   1.72
0.4      -1.98  -0.3
0.5      -0.3   -0.79
0.6      -1.15   1.65
0.7       1.2   -0.5
0.8       1.97   0.51
0.9      -0.74  -0.39
1        -0.47  -1.06
1.1       1.77   0.87
1.2      -0.35  -0.67
1.3       1.4    0.51
1.4       1.72   1.47
1.5      -0.37  -0.83
1.6       1.65  -0.07
1.7       1.51  -0.53
1.8      -0.46  -0.8
1.9      -0.35  -0.18
2         0      0

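From this, I want to calculate position values by double integration, so that I can use the position coordinates as keyframe locations in Blender. Because I don't always know the time base of the input data, I want to resample it to the inter-frame time interval.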

This is what I have tried so far, mostly by adapting the code example from: Rolling integral over pandas dataframe with time index

import pandas as pd
from scipy import integrate
    
cur_fps=25 #bpy.context.scene.render.fps
acc_table=pd.read_excel("C:/Temp/exampleaccelerations.xlsx",index_col=0) #read the table from disk
timedeltastrings={"interp":"%d"%(timedelta_base/100)+"us","vel":"%d"%(timedelta_base/10)+"us","pos":"%d"%(timedelta_base)+"us"}
acc_table_interp=acc_table.resample(timedeltastrings["interp"]).interpolate(method="linear")
vel_table=acc_table_interp.rolling(timedeltastrings["vel"]).apply(integrate.trapz)
vel_table_interp=vel_table.resample(timedeltastrings["vel"]).interpolate(method="linear")
pos_table=vel_table.rolling(timedeltastrings["pos"]).apply(integrate.trapz)
pos_table_interp=pos_table.resample(timedeltastrings["pos"]).interpolate(method="linear")

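The code may not be particularly tidy, but it runs and gives results. However, the resulting values are far too high compared with a manual evaluation (e.g. in Excel). I can't see how the results relate to the input at all.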

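In case you're wondering, the resampling is there to give the rolling integrator some values to work with. Without resampling and with a window size of 100 ms (which is how I understood the answer linked above), the result of the integration is an all-zero DataFrame.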

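Can someone point me in the right direction on how to use the scipy integrator (or any other equivalent function) correctly, so that I get the right results?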

Solution

You can use scipy.integrate.cumtrapz for numerical integration (newer SciPy releases name this function cumulative_trapezoid).

Assuming your data is stored in a pd.DataFrame df:

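from scipy.integrate import cumulative_trapezoid as cumtrapz  # cumtrapz was renamed in newer SciPy

# integrate acceleration over the time column to get velocity (one row shorter than df)
velocities = cumtrapz(df[['acc_x','acc_y']],df['time(s)'],axis=0)
# integrate velocity with a fixed 0.1 s step to get position (two rows shorter than df)
positions = cumtrapz(velocities,dx=0.1,axis=0)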
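The two calls above return arrays that are one and two rows shorter than the input, which makes it awkward to line the results up with the time column. As a rough sketch of the same trapezoid rule written out with NumPy, the helper below (a hypothetical name, not part of SciPy) starts each integral at 0 so that every sample keeps its row. It assumes zero initial velocity and position and uses only the first five rows of the example table.

```python
import numpy as np
import pandas as pd

# first five rows of the example table from the question
df = pd.DataFrame({
    "time(s)": [0.1, 0.2, 0.3, 0.4, 0.5],
    "acc_x": [0.0, -0.98, 1.42, -1.98, -0.3],
    "acc_y": [0.0, 1.66, 1.72, -0.3, -0.79],
})

def cumulative_trapezoid_from_zero(values, times):
    """Cumulative trapezoid integral along axis 0, starting at 0 (hypothetical helper)."""
    steps = np.diff(times)[:, None]                      # time step of each interval
    areas = 0.5 * (values[1:] + values[:-1]) * steps     # trapezoid area of each interval
    start = np.zeros((1, values.shape[1]))               # assumed initial value of 0
    return np.vstack([start, np.cumsum(areas, axis=0)])

t = df["time(s)"].to_numpy()
acc = df[["acc_x", "acc_y"]].to_numpy()

vel = cumulative_trapezoid_from_zero(acc, t)   # same number of rows as df
pos = cumulative_trapezoid_from_zero(vel, t)   # uses the real time steps, not a fixed dx

df["v_x"], df["v_y"] = vel[:, 0], vel[:, 1]
df["s_x"], df["s_y"] = pos[:, 0], pos[:, 1]
```

Because the time column supplies the step for both integrations, this also works when the samples are not evenly spaced.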

To interpret the integration results, you can plot the position, velocity and acceleration:

import matplotlib.pyplot as plt

plt.figure(figsize=(8,8))
plt.scatter(positions[:,0],positions[:,1],label='Position')
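# velocity arrows anchored at each position (velocities has one more row than positions)
plt.quiver(
    positions[:,0],positions[:,1],
    velocities[1:,0],velocities[1:,1],
    color='blue',width=0.003,angles='xy',label='Velocity')
# acceleration arrows anchored at each position (df has two more rows than positions)
plt.quiver(
    positions[:,0],positions[:,1],
    df['acc_x'].iloc[2:],df['acc_y'].iloc[2:],
    color='green',label='Acceleration')
plt.legend()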

Out: [plot of the positions with velocity and acceleration arrows]

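This is what I would do based on my understanding of the physics.

The x and y distances can be treated separately. The distance is calculated from Newton's laws of motion: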

s = 1/2 * a * t ** 2 + v0 * t + s0
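Applied one sample at a time, with the acceleration treated as constant over each interval, the formula turns into a recurrence. This is only a sketch under that assumption; the index i and the step Δt_i = t_i - t_{i-1} are notation introduced here.

```latex
\begin{aligned}
v_i &= v_{i-1} + a_i \,\Delta t_i \\
s_i &= s_{i-1} + v_{i-1}\,\Delta t_i + \tfrac{1}{2}\, a_i \,\Delta t_i^{2}
\end{aligned}
```

With v_0 = 0 and s_0 = 0, each row depends only on the row before it.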

First set up the DataFrame (a numpy array would be simpler):

import pandas as pd

cols=['time','acc_x','acc_y']
dat=[[0.1,0,0],[0.2,-0.98,1.66],[0.3,1.42,1.72],[0.4,-1.98,-0.3],[0.5,-0.3,-0.79],[0.6,-1.15,1.65],[0.7,1.2,-0.5],[0.8,1.97,0.51],[0.9,-0.74,-0.39],[1,-0.47,-1.06],[1.1,1.77,0.87],[1.2,-0.35,-0.67],[1.3,1.4,0.51],[1.4,1.72,1.47],[1.5,-0.37,-0.83],[1.6,1.65,-0.07],[1.7,1.51,-0.53],[1.8,-0.46,-0.8],[1.9,-0.35,-0.18],[2,0,0]]

df = pd.DataFrame(data=dat,columns=cols)

In general, delta t can vary between samples (real, precisely timed measurements), so I calculate delta t ('dt') as a helper column:

df['dt'] = df['time'].diff().fillna(0)

The initial conditions are v0 = 0 and s0 = 0:

# use 0.0 so the columns are float and can hold the integrated values
df['s_x'] = 0.0
df['s_y'] = 0.0
df['v_x'] = 0.0
df['v_y'] = 0.0

Unfortunately, the i-th value always depends on the (i-1)-th value (a lambda or iter does not work here):

for i in range(1,len(df)):
    # df.loc[row, column] writes into df itself; chained df[col].loc[i] = ... may not
    df.loc[i,'v_x'] = df.loc[i-1,'v_x'] + df.loc[i,'acc_x']*df.loc[i,'dt']
    df.loc[i,'v_y'] = df.loc[i-1,'v_y'] + df.loc[i,'acc_y']*df.loc[i,'dt']
    df.loc[i,'s_x'] = df.loc[i-1,'s_x'] + df.loc[i-1,'v_x']*df.loc[i,'dt'] \
                    + 0.5*df.loc[i,'acc_x']*df.loc[i,'dt']**2
    df.loc[i,'s_y'] = df.loc[i-1,'s_y'] + df.loc[i-1,'v_y']*df.loc[i,'dt'] \
                    + 0.5*df.loc[i,'acc_y']*df.loc[i,'dt']**2

xpos = round(df['s_x'].loc[len(df)-1],3)
ypos = round(df['s_y'].loc[len(df)-1],3)

print('final position: ',xpos,ypos)

vx_end = round(df['v_x'].loc[len(df)-1],3)
vy_end = round(df['v_y'].loc[len(df)-1],3)

print('final speed (x,y): ',vx_end,vy_end)
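Since each row only adds one increment to the previous row, the same recurrence can probably be written as running sums instead of an explicit loop. The following is a minimal sketch for the x axis only, assuming the same zero initial conditions and using the first five rows of the example data; `v_prev` is a helper name introduced here.

```python
import pandas as pd

# first five rows of the example data, x axis only
df = pd.DataFrame({
    "time": [0.1, 0.2, 0.3, 0.4, 0.5],
    "acc_x": [0.0, -0.98, 1.42, -1.98, -0.3],
})
df["dt"] = df["time"].diff().fillna(0)

# velocity: running sum of a_i * dt_i (dt is 0 in the first row, so v starts at 0)
df["v_x"] = (df["acc_x"] * df["dt"]).cumsum()

# position: running sum of v_(i-1) * dt_i + 0.5 * a_i * dt_i**2
v_prev = df["v_x"].shift(1).fillna(0)
df["s_x"] = (v_prev * df["dt"] + 0.5 * df["acc_x"] * df["dt"] ** 2).cumsum()
```

The y axis would follow the same pattern with `acc_y`.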

I hope this is closer to what you expect. It gives the position (s_x, s_y) and the velocity (v_x, v_y) of the object at each given point in time.
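The question also asks for positions at the frame interval rather than at the measurement times. One possible way to get there, sketched here with linear interpolation via `np.interp`, is to integrate on the original time base first and resample only the finished positions. The frame rate is taken from the question; the position values are hypothetical placeholders, not results computed from the example table.

```python
import numpy as np

cur_fps = 25  # frame rate, as in the question

# measurement times and already-integrated positions (placeholder values for illustration)
t = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
s_x = np.array([0.0, 0.01, 0.03, 0.02, 0.05])

# one time stamp per frame between the first and last measurement
n_frames = int(np.floor((t[-1] - t[0]) * cur_fps + 1e-9)) + 1
frame_times = t[0] + np.arange(n_frames) / cur_fps

# linear interpolation of the positions at the frame times
s_x_frames = np.interp(frame_times, t, s_x)
```

Each entry of `s_x_frames` could then serve as the x coordinate of one keyframe.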
