How to batch-update pre_close

Problem description

My DataFrame has 11,516,015 rows. I need to set pre_close to the close of the previous Trade_date for the same ts_code. How can I do this as a batch operation? Sample data:

print(df.head(10))

     ts_code Trade_date        open        high         low       close  pre_close  change  pct_chg        vol       amount  adj_factor
0  000001.SZ   20210602  2673.79269  2677.15032  2616.71298  2673.79269      23.92   -0.03  -0.1254  497527.02  1176608.126     111.921
1  000002.SZ   20210602  4078.37650  4183.02918  4049.13531  4118.39076      26.60    0.16   0.6015  853545.06  2287264.276     153.901
2  000004.SZ   20210602    65.87744    69.81952    64.69888    67.34048      15.98    0.59   3.6921   47125.57    77192.135       4.064
3  000005.SZ   20210602    16.68240    16.96044    16.40436    16.68240       1.79    0.01   0.5587   82388.69    14812.102       9.268
4  000006.SZ   20210602   193.12203   193.12203   190.56654   191.29668       5.28   -0.04  -0.7576   58093.43    30539.090      36.507
5  000007.SZ   20210602    30.65080    30.73364    30.15376    30.31944       3.70   -0.04  -1.0811   29560.28    10841.980       8.284
6  000008.SZ   20210602    50.86616    51.31432    50.86616    51.09024       2.28    0.00   0.0000  126807.00    28933.202      22.408
7  000009.SZ   20210602    88.81000    89.60929    87.83309    88.09952      10.05   -0.13  -1.2935  253313.77   252740.741       8.881
8  000010.SZ   20210602    43.20775    43.63875    43.10000    43.31550       4.03   -0.01  -0.2481   45925.00    18472.845      10.775
9  000011.SZ   20210602    49.83250    49.98750    48.12750    48.51500      12.97   -0.45  -3.4695   91615.92   115647.098       3.875
print(df[df.Trade_date=='20210601'].head(10))
        ts_code Trade_date        open        high         low       close  pre_close  change  pct_chg           vol       amount  adj_factor
4315  000001.SZ   20210601  2708.48820  2714.08425  2630.14350  2677.15032      24.20   -0.28  -1.1570   5584.456536  1490476.624     111.921
4316  000002.SZ   20210601  4136.85888  4150.70997  4073.75947  4093.76660      26.70   -0.10  -0.3745   3962.999656  1622419.892     153.901
4317  000004.SZ   20210601    65.30848    65.79616    64.29248    64.94272      16.04   -0.06  -0.3741   5646.429626    36524.364       4.064
4318  000005.SZ   20210601    16.21900    16.86776    16.03364    16.58972       1.75    0.04   2.2857  10139.708675    16756.998       9.268
4319  000006.SZ   20210601   192.75696   193.12203   190.93161   192.75696       5.26    0.02   0.3802   1634.333689    31401.554      36.507
4320  000007.SZ   20210601    30.65080    30.89932    29.90524    30.65080       3.70    0.00   0.0000   4805.589087    14552.560       8.284
4321  000008.SZ   20210601    50.64208    51.09024    50.41800    51.09024       2.26    0.02   0.8850   6865.234738    34876.868      22.408
4322  000009.SZ   20210601    90.76382    90.76382    88.36595    89.25405      10.23   -0.18  -1.7595  42079.017003   376049.211       8.881
4323  000010.SZ   20210601    43.63875    43.63875    43.10000    43.42325       4.05   -0.02  -0.4938   5253.744780    22713.494      10.775
4324  000011.SZ   20210601    50.02625    50.29750    49.05750    50.25875      12.85    0.12   0.9339  18663.819355    92935.863       3.875
print(df[df.ts_code=='000001.SZ'].head(10))
         ts_code Trade_date        open        high         low       close  pre_close  change  pct_chg          vol       amount  adj_factor
0      000001.SZ   20210602  2673.79269  2677.15032  2616.71298  2673.79269      23.92   -0.03  -0.1254  4445.341089  1176608.126     111.921
4315   000001.SZ   20210601  2708.48820  2714.08425  2630.14350  2677.15032      24.20   -0.28  -1.1570  5584.456536  1490476.624     111.921
8627   000001.SZ   20210531  2723.03793  2745.42213  2676.03111  2708.48820      24.50   -0.30  -1.2245  4604.594401  1244209.045     111.921
12938  000001.SZ   20210528  2762.21028  2765.56791  2704.01136  2742.06450      24.79   -0.29  -1.1698  4399.538335  1200523.315     111.921
17247  000001.SZ   20210527  2787.95211  2815.93236  2744.30292  2774.52159      25.01   -0.22  -0.8796  4503.992638  1246712.048     111.921
21553  000001.SZ   20210526  2757.73344  2811.45552  2742.06450  2799.14421      24.60    0.41   1.6667  8213.505776  2286540.248     111.921
25857  000001.SZ   20210525  2634.62034  2767.80633  2624.54745  2753.25660      23.48    1.12   4.7700  8688.044424  2363145.902     111.921
30156  000001.SZ   20210524  2627.90508  2641.33560  2595.44799  2627.90508      23.49   -0.01  -0.0426  3073.021417   806092.207     111.921
34449  000001.SZ   20210521  2672.67348  2696.17689  2577.54063  2629.02429      23.82   -0.33  -1.3854  4818.896722  1263058.114     111.921
38744  000001.SZ   20210520  2620.07061  2668.19664  2604.40167  2665.95822      23.60    0.22   0.9322  3625.667837   957478.853     111.921

The statement below does this for a single ts_code; I need to do the same for every ts_code:

df['pre_close']=df[df.ts_code=='000001.SZ'].close.shift(-1)
print(df.head(10))
     ts_code Trade_date        open        high         low       close   pre_close  change  pct_chg           vol       amount  adj_factor
0  000001.SZ   20210602  2673.79269  2677.15032  2616.71298  2673.79269  2677.15032   -0.03  -0.1254   4445.341089  1176608.126     111.921
1  000002.SZ   20210602  4078.37650  4183.02918  4049.13531  4118.39076         NaN    0.16   0.6015   5546.065718  2287264.276     153.901
2  000004.SZ   20210602    65.87744    69.81952    64.69888    67.34048         NaN    0.59   3.6921  11595.858760    77192.135       4.064
3  000005.SZ   20210602    16.68240    16.96044    16.40436    16.68240         NaN    0.01   0.5587   8889.586750    14812.102       9.268
4  000006.SZ   20210602   193.12203   193.12203   190.56654   191.29668         NaN   -0.04  -0.7576   1591.295642    30539.090      36.507
5  000007.SZ   20210602    30.65080    30.73364    30.15376    30.31944         NaN   -0.04  -1.0811   3568.358281    10841.980       8.284
6  000008.SZ   20210602    50.86616    51.31432    50.86616    51.09024         NaN    0.00   0.0000   5659.005712    28933.202      22.408
7  000009.SZ   20210602    88.81000    89.60929    87.83309    88.09952         NaN   -0.13  -1.2935  28523.113388   252740.741       8.881
8  000010.SZ   20210602    43.20775    43.63875    43.10000    43.31550         NaN   -0.01  -0.2481   4262.180974    18472.845      10.775
9  000011.SZ   20210602    49.83250    49.98750    48.12750    48.51500         NaN   -0.45  -3.4695  23642.818065   115647.098       3.875
print(df[df.ts_code=='000001.SZ'].head(10))
         ts_code Trade_date        open        high         low       close   pre_close  change  pct_chg          vol       amount  adj_factor
0      000001.SZ   20210602  2673.79269  2677.15032  2616.71298  2673.79269  2677.15032   -0.03  -0.1254  4445.341089  1176608.126     111.921
4315   000001.SZ   20210601  2708.48820  2714.08425  2630.14350  2677.15032  2708.48820   -0.28  -1.1570  5584.456536  1490476.624     111.921
8627   000001.SZ   20210531  2723.03793  2745.42213  2676.03111  2708.48820  2742.06450   -0.30  -1.2245  4604.594401  1244209.045     111.921
12938  000001.SZ   20210528  2762.21028  2765.56791  2704.01136  2742.06450  2774.52159   -0.29  -1.1698  4399.538335  1200523.315     111.921
17247  000001.SZ   20210527  2787.95211  2815.93236  2744.30292  2774.52159  2799.14421   -0.22  -0.8796  4503.992638  1246712.048     111.921
21553  000001.SZ   20210526  2757.73344  2811.45552  2742.06450  2799.14421  2753.25660    0.41   1.6667  8213.505776  2286540.248     111.921
25857  000001.SZ   20210525  2634.62034  2767.80633  2624.54745  2753.25660  2627.90508    1.12   4.7700  8688.044424  2363145.902     111.921
30156  000001.SZ   20210524  2627.90508  2641.33560  2595.44799  2627.90508  2629.02429   -0.01  -0.0426  3073.021417   806092.207     111.921
34449  000001.SZ   20210521  2672.67348  2696.17689  2577.54063  2629.02429  2665.95822   -0.33  -1.3854  4818.896722  1263058.114     111.921
38744  000001.SZ   20210520  2620.07061  2668.19664  2604.40167  2665.95822  2641.33560    0.22   0.9322  3625.667837   957478.853     111.921

Solution

You can try grouping by ts_code with .groupby() and applying .shift() to close within each group, as follows:

df['pre_close'] = df.groupby('ts_code')['close'].shift(-1)
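Note that shift(-1) takes the value from the next row of the group, so it only gives the previous trading day's close if each ts_code is ordered newest-first, as in the sample output. Here is a minimal sketch of the idea on a five-row frame built from close values in the question. The explicit sort is my own addition, and it assumes Trade_date is a YYYYMMDD string, so sorting it as text matches date order.

```python
import pandas as pd

# Tiny frame using close values from the question (two codes, newest first).
df = pd.DataFrame({
    "ts_code": ["000001.SZ", "000002.SZ", "000001.SZ", "000002.SZ", "000001.SZ"],
    "Trade_date": ["20210602", "20210602", "20210601", "20210601", "20210531"],
    "close": [2673.79269, 4118.39076, 2677.15032, 4093.76660, 2708.48820],
})

# Assumption: make the newest-first order explicit instead of relying on it.
df = df.sort_values(["ts_code", "Trade_date"], ascending=[True, False])

# Within each ts_code, pull the close from the next row (the previous trading day).
df["pre_close"] = df.groupby("ts_code")["close"].shift(-1)

# Restore the original row order.
df = df.sort_index()
print(df)
```

The earliest row of each ts_code has no previous day, so its pre_close comes out as NaN. If the data were sorted oldest-first instead, shift(1) would be the equivalent call.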