Subsetting panel data to apply a function

Problem

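I am trying to create a dummy variable column that records whether a firm ever received a treatment. If the treatment (grant) was applied in any particular year, the variable should be 1 for every year of that firm. I know that lapply/sapply or dplyr's group_by() would be appropriate, but I am not sure how to apply them. Here is the raw data: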

head(q3data_a)
# A tibble: 6 x 30
   year  fcode employ  sales avgsal scrap rework tothrs union grant   d89   d88 totrain hrsemp lscrap lemploy
  <int>  <dbl>  <int>  <dbl>  <dbl> <dbl>  <dbl>  <int> <int> <int> <int> <int>   <int>  <dbl>  <dbl>   <dbl>
1  1987 410032    100 4.70e7  35000    NA     NA     12     0     0     0     0     100  12        NA    4.61
2  1988 410032    131 4.30e7  37000    NA     NA      8     0     0     0     1      50   3.05     NA    4.88
3  1987 410440     12 1.56e6  10500    NA     NA     12     0     0     0     0      12  12        NA    2.48
4  1988 410440     13 1.97e6  11000    NA     NA     12     0     0     0     1      13  12        NA    2.56
5  1987 410495     20 7.50e5  17680    NA     NA     50     0     0     0     0      15  37.5      NA    3.00
6  1988 410495     25 1.10e5  18720    NA     NA     50     0     0     0     1      10  20        NA    3.22
# ... with 14 more variables: lsales <dbl>, lrework <dbl>, lhrsemp <dbl>, lscrap_1 <dbl>, grant_1 <int>,
#   clscrap <dbl>, cgrant <int>, clemploy <dbl>, clsales <dbl>, lavgsal <dbl>, clavgsal <dbl>,
#   cgrant_1 <int>, chrsemp <dbl>, clhrsemp <dbl>

Below is my temporary solution. It works, but it does not generalize (for example, it would be hard to extend to more than 2 time periods).

# Encode the treatment across all time periods: if a firm receives a grant
# in 1988, its 1987 row is marked 1 as well.
# Relies on each firm occupying exactly two consecutive rows (1987, then 1988).
dummy1 <- rep(0, nrow(q3data_a))
for (i in seq_len(nrow(q3data_a))) {
  if (i %% 2 == 0) {
    if (q3data_a[i, ]$grant == 1) {
      dummy1[i - 1] <- 1
      dummy1[i] <- 1
    }
  }
}

Thanks for any suggestions.

Solution

Is this what you need?

library(dplyr)
df %>% group_by(fcode) %>% mutate(dummy1 = as.integer(any(grant > 0)))

where df looks like this:

# A tibble: 12 x 3
    year  fcode grant
   <int>  <dbl> <int>
 1  1985 410032     0
 2  1986 410032     1
 3  1987 410032     1
 4  1988 410032     1
 5  1985 410440     1
 6  1986 410440     0
 7  1987 410440     1
 8  1988 410440     1
 9  1985 410495     0
10  1986 410495     0
11  1987 410495     0
12  1988 410495     0

The output is:

# A tibble: 12 x 4
# Groups:   fcode [3]
    year  fcode grant dummy1
   <int>  <dbl> <int>  <int>
 1  1985 410032     0      1
 2  1986 410032     1      1
 3  1987 410032     1      1
 4  1988 410032     1      1
 5  1985 410440     1      1
 6  1986 410440     0      1
 7  1987 410440     1      1
 8  1988 410440     1      1
 9  1985 410495     0      0
10  1986 410495     0      0
11  1987 410495     0      0
12  1988 410495     0      0
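
Since the question also mentions lapply/sapply, here is a minimal base R sketch of the same per-firm logic using ave(), which applies a function within each group and returns a vector aligned with the original rows. The data frame below just retypes the example values printed above. The na.rm = TRUE argument is an added assumption (it treats a missing grant as "no grant") and is not part of the original answer.

```r
# Rebuild the example data shown above (values copied from the printed tibble).
df <- data.frame(
  year  = rep(1985:1988, times = 3),
  fcode = rep(c(410032, 410440, 410495), each = 4),
  grant = c(0L, 1L, 1L, 1L,
            1L, 0L, 1L, 1L,
            0L, 0L, 0L, 0L)
)

# For each fcode, check whether any year has grant > 0 and spread that
# single 0/1 flag over all of the firm's rows.
# na.rm = TRUE is an assumption: NA grants are ignored rather than propagated.
df$dummy1 <- ave(df$grant, df$fcode,
                 FUN = function(g) as.integer(any(g > 0, na.rm = TRUE)))

print(df)
```

Like the group_by() version, this does not depend on row order or on how many years each firm has. Note also that the dplyr result above is still grouped by fcode (see the "# Groups:" line); appending ungroup() to the pipeline returns an ordinary ungrouped tibble if later steps should not run per firm.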
