Problem description
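My dataframe has 11,516,015 rows. For each ts_code, I need to set pre_close to the close value from the previous Trade_date of that same ts_code. How can I do this as a batch operation? Sample data is shown below: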
print (df.head(10))
ts_code Trade_date open high low close pre_close change pct_chg vol amount adj_factor
0 000001.SZ 20210602 2673.79269 2677.15032 2616.71298 2673.79269 23.92 -0.03 -0.1254 497527.02 1176608.126 111.921
1 000002.SZ 20210602 4078.37650 4183.02918 4049.13531 4118.39076 26.60 0.16 0.6015 853545.06 2287264.276 153.901
2 000004.SZ 20210602 65.87744 69.81952 64.69888 67.34048 15.98 0.59 3.6921 47125.57 77192.135 4.064
3 000005.SZ 20210602 16.68240 16.96044 16.40436 16.68240 1.79 0.01 0.5587 82388.69 14812.102 9.268
4 000006.SZ 20210602 193.12203 193.12203 190.56654 191.29668 5.28 -0.04 -0.7576 58093.43 30539.090 36.507
5 000007.SZ 20210602 30.65080 30.73364 30.15376 30.31944 3.70 -0.04 -1.0811 29560.28 10841.980 8.284
6 000008.SZ 20210602 50.86616 51.31432 50.86616 51.09024 2.28 0.00 0.0000 126807.00 28933.202 22.408
7 000009.SZ 20210602 88.81000 89.60929 87.83309 88.09952 10.05 -0.13 -1.2935 253313.77 252740.741 8.881
8 000010.SZ 20210602 43.20775 43.63875 43.10000 43.31550 4.03 -0.01 -0.2481 45925.00 18472.845 10.775
9 000011.SZ 20210602 49.83250 49.98750 48.12750 48.51500 12.97 -0.45 -3.4695 91615.92 115647.098 3.875
print(df[df.Trade_date=='20210601'].head(10))
ts_code Trade_date open high low close pre_close change pct_chg vol amount adj_factor
4315 000001.SZ 20210601 2708.48820 2714.08425 2630.14350 2677.15032 24.20 -0.28 -1.1570 5584.456536 1490476.624 111.921
4316 000002.SZ 20210601 4136.85888 4150.70997 4073.75947 4093.76660 26.70 -0.10 -0.3745 3962.999656 1622419.892 153.901
4317 000004.SZ 20210601 65.30848 65.79616 64.29248 64.94272 16.04 -0.06 -0.3741 5646.429626 36524.364 4.064
4318 000005.SZ 20210601 16.21900 16.86776 16.03364 16.58972 1.75 0.04 2.2857 10139.708675 16756.998 9.268
4319 000006.SZ 20210601 192.75696 193.12203 190.93161 192.75696 5.26 0.02 0.3802 1634.333689 31401.554 36.507
4320 000007.SZ 20210601 30.65080 30.89932 29.90524 30.65080 3.70 0.00 0.0000 4805.589087 14552.560 8.284
4321 000008.SZ 20210601 50.64208 51.09024 50.41800 51.09024 2.26 0.02 0.8850 6865.234738 34876.868 22.408
4322 000009.SZ 20210601 90.76382 90.76382 88.36595 89.25405 10.23 -0.18 -1.7595 42079.017003 376049.211 8.881
4323 000010.SZ 20210601 43.63875 43.63875 43.10000 43.42325 4.05 -0.02 -0.4938 5253.744780 22713.494 10.775
4324 000011.SZ 20210601 50.02625 50.29750 49.05750 50.25875 12.85 0.12 0.9339 18663.819355 92935.863 3.875
print(df[df.ts_code=='000001.SZ'].head(10))
ts_code Trade_date open high low close pre_close change pct_chg vol amount adj_factor
0 000001.SZ 20210602 2673.79269 2677.15032 2616.71298 2673.79269 23.92 -0.03 -0.1254 4445.341089 1176608.126 111.921
4315 000001.SZ 20210601 2708.48820 2714.08425 2630.14350 2677.15032 24.20 -0.28 -1.1570 5584.456536 1490476.624 111.921
8627 000001.SZ 20210531 2723.03793 2745.42213 2676.03111 2708.48820 24.50 -0.30 -1.2245 4604.594401 1244209.045 111.921
12938 000001.SZ 20210528 2762.21028 2765.56791 2704.01136 2742.06450 24.79 -0.29 -1.1698 4399.538335 1200523.315 111.921
17247 000001.SZ 20210527 2787.95211 2815.93236 2744.30292 2774.52159 25.01 -0.22 -0.8796 4503.992638 1246712.048 111.921
21553 000001.SZ 20210526 2757.73344 2811.45552 2742.06450 2799.14421 24.60 0.41 1.6667 8213.505776 2286540.248 111.921
25857 000001.SZ 20210525 2634.62034 2767.80633 2624.54745 2753.25660 23.48 1.12 4.7700 8688.044424 2363145.902 111.921
30156 000001.SZ 20210524 2627.90508 2641.33560 2595.44799 2627.90508 23.49 -0.01 -0.0426 3073.021417 806092.207 111.921
34449 000001.SZ 20210521 2672.67348 2696.17689 2577.54063 2629.02429 23.82 -0.33 -1.3854 4818.896722 1263058.114 111.921
38744 000001.SZ 20210520 2620.07061 2668.19664 2604.40167 2665.95822 23.60 0.22 0.9322 3625.667837 957478.853 111.921
The statement below does this for a single ts_code; I need to run it for every ts_code:
df['pre_close']=df[df.ts_code=='000001.SZ'].close.shift(-1)
print(df.head(10))
ts_code Trade_date open high low close pre_close change pct_chg vol amount adj_factor
0 000001.SZ 20210602 2673.79269 2677.15032 2616.71298 2673.79269 2677.15032 -0.03 -0.1254 4445.341089 1176608.126 111.921
1 000002.SZ 20210602 4078.37650 4183.02918 4049.13531 4118.39076 NaN 0.16 0.6015 5546.065718 2287264.276 153.901
2 000004.SZ 20210602 65.87744 69.81952 64.69888 67.34048 NaN 0.59 3.6921 11595.858760 77192.135 4.064
3 000005.SZ 20210602 16.68240 16.96044 16.40436 16.68240 NaN 0.01 0.5587 8889.586750 14812.102 9.268
4 000006.SZ 20210602 193.12203 193.12203 190.56654 191.29668 NaN -0.04 -0.7576 1591.295642 30539.090 36.507
5 000007.SZ 20210602 30.65080 30.73364 30.15376 30.31944 NaN -0.04 -1.0811 3568.358281 10841.980 8.284
6 000008.SZ 20210602 50.86616 51.31432 50.86616 51.09024 NaN 0.00 0.0000 5659.005712 28933.202 22.408
7 000009.SZ 20210602 88.81000 89.60929 87.83309 88.09952 NaN -0.13 -1.2935 28523.113388 252740.741 8.881
8 000010.SZ 20210602 43.20775 43.63875 43.10000 43.31550 NaN -0.01 -0.2481 4262.180974 18472.845 10.775
9 000011.SZ 20210602 49.83250 49.98750 48.12750 48.51500 NaN -0.45 -3.4695 23642.818065 115647.098 3.875
print(df[df.ts_code=='000001.SZ'].head(10))
ts_code Trade_date open high low close pre_close change pct_chg vol amount adj_factor
0 000001.SZ 20210602 2673.79269 2677.15032 2616.71298 2673.79269 2677.15032 -0.03 -0.1254 4445.341089 1176608.126 111.921
4315 000001.SZ 20210601 2708.48820 2714.08425 2630.14350 2677.15032 2708.48820 -0.28 -1.1570 5584.456536 1490476.624 111.921
8627 000001.SZ 20210531 2723.03793 2745.42213 2676.03111 2708.48820 2742.06450 -0.30 -1.2245 4604.594401 1244209.045 111.921
12938 000001.SZ 20210528 2762.21028 2765.56791 2704.01136 2742.06450 2774.52159 -0.29 -1.1698 4399.538335 1200523.315 111.921
17247 000001.SZ 20210527 2787.95211 2815.93236 2744.30292 2774.52159 2799.14421 -0.22 -0.8796 4503.992638 1246712.048 111.921
21553 000001.SZ 20210526 2757.73344 2811.45552 2742.06450 2799.14421 2753.25660 0.41 1.6667 8213.505776 2286540.248 111.921
25857 000001.SZ 20210525 2634.62034 2767.80633 2624.54745 2753.25660 2627.90508 1.12 4.7700 8688.044424 2363145.902 111.921
30156 000001.SZ 20210524 2627.90508 2641.33560 2595.44799 2627.90508 2629.02429 -0.01 -0.0426 3073.021417 806092.207 111.921
34449 000001.SZ 20210521 2672.67348 2696.17689 2577.54063 2629.02429 2665.95822 -0.33 -1.3854 4818.896722 1263058.114 111.921
38744 000001.SZ 20210520 2620.07061 2668.19664 2604.40167 2665.95822 2641.33560 0.22 0.9322 3625.667837 957478.853 111.921
Solution
You can try using .groupby() to group by ts_code and apply .shift() to the close column within each group, as shown below:
df['pre_close'] = df.groupby('ts_code')['close'].shift(-1)
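Note that shift(-1) picks up the previous trading day only because the sample rows are ordered newest first within each ts_code. The sketch below is a minimal illustration on a made-up six-row frame (the prices are invented, not taken from the question). It also shows a variant that sorts by Trade_date first, so the result should not depend on the row order; the column name pre_close_sorted is hypothetical and used only for comparison. The variant assumes Trade_date is a YYYYMMDD string or integer, so that it sorts chronologically.

```python
import pandas as pd

# Toy data (invented values): two codes, three trading days, newest first,
# mirroring the row order of the sample in the question.
df = pd.DataFrame({
    "ts_code": ["000001.SZ", "000002.SZ", "000001.SZ",
                "000002.SZ", "000001.SZ", "000002.SZ"],
    "Trade_date": ["20210602", "20210602", "20210601",
                   "20210601", "20210531", "20210531"],
    "close": [10.0, 20.0, 11.0, 21.0, 12.0, 22.0],
})

# Variant 1: relies on rows being ordered newest first within each ts_code.
# shift(-1) pulls the close from the next row of the same group,
# which is the previous trading day under that ordering.
df["pre_close"] = df.groupby("ts_code")["close"].shift(-1)

# Variant 2: sort ascending by date, then take the prior row with shift(1).
# The result keeps the original index, so the assignment aligns row by row.
df["pre_close_sorted"] = (
    df.sort_values(["ts_code", "Trade_date"])
      .groupby("ts_code")["close"]
      .shift(1)
)

print(df)
```

In both variants the earliest date of each ts_code has no previous close, so it ends up as NaN.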