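Problem description
In an R data frame, if the value in the GID column starts with the character "N" and ColB is empty, how can I replace the empty ColB value with "N"?
Code: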
DataFile <- extract_tables("new.pdf",pages = c(87),method = "stream",output = "data.frame",guess = TRUE)
DataFrame<-as.data.frame(DataFile)
#removing No. and A# from columns
df2<-subset(DataFrame,Group!="No." & Group!="A#")
Output:
GID ColA ColB
1 2 2
2 3 4
3 5 4
4 6 5
5 6 5
NG1 8
MG2 8 1
MG3 8 1
NG4 8
Expected output
GID ColA ColB
1 2 2
2 3 4
3 5 4
4 6 5
5 6 5
NG1 8 N
MG2 8 1
MG3 8 1
NG4 8 N
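Solution
We can build a logical condition that is TRUE where GID starts with "N" and ColB is empty, then assign to those rows, for example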
i1 <- with(df1,substr(GID,1,1) == 'N' & ColB == "")
df1$ColB[i1] <- "N"
Or use grepl with an anchored pattern
i1 <- with(df1,grepl("^N",GID) & ColB == "")
df1$ColB[i1] <- "N"
If we want to do the replacement in every column other than GID
i1 <- grepl("^N", df1$GID)
nm1 <- setdiff(names(df1),"GID")
df1[nm1] <- lapply(df1[nm1],function(x) replace(x,i1 & x == "","N"))
-output
df1
# GID ColA ColB
#1 1 2 2
#2 2 3 4
#3 3 5 4
#4 4 6 5
#5 5 6 5
#6 NG1 8 N
#7 MG2 8 1
#8 MG3 8 1
#9 NG4 8 N
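The multi-column replacement can also be wrapped in a small function. The sketch below is one possible variant, not the answer's own code: the name fill_blank_by_prefix and its arguments are hypothetical, and it only touches character columns so that numeric columns such as ColA are left as they are.

```r
# Hypothetical helper: in rows whose key column starts with `prefix`,
# replace empty strings in the other character columns with `fill`
fill_blank_by_prefix <- function(df, key = "GID", prefix = "N", fill = "N") {
  rows <- startsWith(as.character(df[[key]]), prefix)
  # Only character columns can hold "", so skip all other columns
  cols <- setdiff(names(df)[vapply(df, is.character, logical(1))], key)
  df[cols] <- lapply(df[cols], function(x) {
    replace(x, rows & !is.na(x) & x == "", fill)
  })
  df
}

# Hypothetical sample data shaped like the question's table
demo <- data.frame(
  GID  = c("1", "NG1", "MG2", "NG4"),
  ColA = c(2L, 8L, 8L, 8L),
  ColB = c("2", "", "1", ""),
  stringsAsFactors = FALSE
)
res <- fill_blank_by_prefix(demo)
```

Here res$ColB has "N" in the two rows whose GID starts with "N", and res$ColA is still an integer column.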
Data
df1 <- structure(list(GID = c("1","2","3","4","5","NG1","MG2","MG3","NG4"),ColA = c(2L,3L,5L,6L,6L,8L,8L,8L,8L),ColB = c("2","4","4","5","5","","1","1","")),row.names = c(NA,-9L),class = "data.frame")