Replace an empty column value in an R data frame when another column's value starts with "N"

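Problem description

In an R data frame, if the value in the GID column starts with the character "N" and ColB is empty, how can I replace the empty ColB value with "N"?

Code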

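# extract_tables() is assumed to come from the tabulizer package
library(tabulizer)

DataFile <- extract_tables("new.pdf", pages = c(87), method = "stream", output = "data.frame", guess = TRUE)
DataFrame <- as.data.frame(DataFile)

# Drop the rows whose Group value is "No." or "A#"
df2 <- subset(DataFrame, Group != "No." & Group != "A#")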

Output

GID    ColA    ColB 
1       2       2
2       3       4
3       5       4
4       6       5
5       6       5
NG1     8 
MG2     8       1
MG3     8       1
NG4     8 

Expected output

GID    ColA    ColB 
1       2       2
2       3       4
3       5       4
4       6       5
5       6       5
NG1     8       N
MG2     8       1
MG3     8       1
NG4     8       N

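Solution

We can build a logical index from the two conditions (GID starts with "N" and ColB is an empty string) and assign "N" to the matching rows, for example: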

i1 <- with(df1,substr(GID,1,1) == 'N' & ColB == "")
df1$ColB[i1] <- "N"

Or use grepl with an anchored pattern:

i1 <- with(df1,grepl("^N",GID) & ColB == "")
df1$ColB[i1] <- "N"
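
Both versions above test ColB == "", which matches the sample data. If the PDF extraction instead leaves missing cells as NA or as whitespace (an assumption, not something shown in the question), the comparison would not flag those rows. A minimal sketch of a more tolerant check, using a small made-up data frame and a hypothetical helper vector named blank:

```r
# Small illustrative data frame (not the question's data); the last ColB is NA
df1 <- data.frame(
  GID  = c("1", "NG1", "MG2", "NG4"),
  ColA = c(2L, 8L, 8L, 8L),
  ColB = c("2", " ", "1", NA),
  stringsAsFactors = FALSE
)

# Treat NA, "" and whitespace-only cells as blank
blank <- is.na(df1$ColB) | trimws(df1$ColB) == ""

# startsWith() is base R (3.3.0 or later) and needs a character vector
i1 <- startsWith(df1$GID, "N") & blank
df1$ColB[i1] <- "N"
```

If GID were a factor, it would need as.character(df1$GID) before the startsWith() call.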

If we want to apply the same replacement to every column other than GID:

i1 <- grepl("^N", df1$GID)
nm1 <- setdiff(names(df1), "GID")
df1[nm1] <- lapply(df1[nm1], function(x) replace(x, i1 & x == "", "N"))
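
A variation on the same idea, sketched under the assumption that only character columns can hold empty strings: restrict the loop to character columns so numeric columns such as ColA are never passed through replace(). The data frame and the ColC column here are made up for illustration, and chr_cols is a hypothetical name.

```r
# Illustrative data; ColC is a hypothetical extra character column
df1 <- data.frame(
  GID  = c("1", "NG1", "MG2", "NG4"),
  ColA = c(2L, 8L, 8L, 8L),
  ColB = c("2", "", "1", ""),
  ColC = c("x", "", "y", "z"),
  stringsAsFactors = FALSE
)

# Rows whose GID starts with "N"
i1 <- grepl("^N", df1$GID)

# Character columns only, excluding the key column GID
is_chr <- vapply(df1, is.character, logical(1))
chr_cols <- setdiff(names(df1)[is_chr], "GID")

# Fill empty strings with "N" on the flagged rows
df1[chr_cols] <- lapply(df1[chr_cols], function(x) replace(x, i1 & x == "", "N"))
```

Here ColA is left out of the loop entirely, so it stays an integer column.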

Output

df1
#  GID ColA ColB
#1   1    2    2
#2   2    3    4
#3   3    5    4
#4   4    6    5
#5   5    6    5
#6 NG1    8    N
#7 MG2    8    1
#8 MG3    8    1
#9 NG4    8    N

Data

df1 <- structure(list(
  GID = c("1", "2", "3", "4", "5", "NG1", "MG2", "MG3", "NG4"),
  ColA = c(2L, 3L, 5L, 6L, 6L, 8L, 8L, 8L, 8L),
  ColB = c("2", "4", "4", "5", "5", "", "1", "1", "")
), row.names = c(NA, -9L), class = "data.frame")