Problem description
The following dataset is in wide format and contains the repeated measurements "ql", "st", and "xy", prefixed with "a", "b", and "c":
df <- data.frame(id=c(1,2,3,4), ex=c(1,0,0,1), aql=c(5,4,NA,6), bql=c(5,7,NA,9),
                 cql=c(5,7,NA,9), bst=c(3,7,8,9), cst=c(8,7,5,3),
                 axy=c(1,9,4,4), cxy=c(5,3,1,4))
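I am looking for a way to insert a dot after the prefix letters "a", "b", and "c" while leaving the other columns (i.e. id, ex) unchanged. I have been trying to solve this with the gsub function, for example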
names(df) <- gsub("","\\.",names(df))
but this gives an undesired result. The expected output looks like
  id ex a.ql b.ql c.ql b.st c.st a.xy c.xy
1  1  1    5    5    5    3    8    1    5
2  2  0    4    7    7    7    7    9    3
3  3  0   NA   NA   NA    8    5    4    1
4  4  1    6    9    9    9    3    4    4
Solution
Try
sub("(^[a-c])(.+)","\\1.\\2",names(df))
# [1] "id" "ex" "a.ql" "b.ql" "c.ql" "b.st" "c.st" "a.xy" "c.xy"
or
sub("(?<=^[a-c])",".",names(df),perl = TRUE)
# [1] "id" "ex" "a.ql" "b.ql" "c.ql" "b.st" "c.st" "a.xy" "c.xy"
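Both calls above only return the new names; they do not change df. A minimal sketch of applying them, assuming the data values reconstructed from the expected output in the question: assign the result back with names(df) <- .... Note also that "^[a-c]" matches any column whose name starts with a, b, or c, so a hypothetical column such as "age" would become "a.ge". The stricter pattern below, which lists the known suffixes explicitly, is one possible way to avoid that.

```r
# Sample data (values reconstructed from the expected output; an assumption)
df <- data.frame(id = c(1, 2, 3, 4), ex = c(1, 0, 0, 1),
                 aql = c(5, 4, NA, 6), bql = c(5, 7, NA, 9), cql = c(5, 7, NA, 9),
                 bst = c(3, 7, 8, 9), cst = c(8, 7, 5, 3),
                 axy = c(1, 9, 4, 4), cxy = c(5, 3, 1, 4))

# sub() returns a character vector, so assign it back to rename the columns
names(df) <- sub("(^[a-c])(.+)", "\\1.\\2", names(df))
names(df)

# Stricter variant: only rename when the prefix is followed by a known suffix.
# "age" is a hypothetical column name used purely for illustration.
strict <- sub("^([a-c])(ql|st|xy)$", "\\1.\\2", c("aql", "age", "cxy"))
strict
```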
You can do
setNames(df,sub("(ql$)|(st$)|(xy$)","\\.\\1\\2\\3",names(df)))
#>   id ex a.ql b.ql c.ql b.st c.st a.xy c.xy
#> 1  1  1    5    5    5    3    8    1    5
#> 2  2  0    4    7    7    7    7    9    3
#> 3  3  0   NA   NA   NA    8    5    4    1
#> 4  4  1    6    9    9    9    3    4    4
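Another way to try
library(dplyr)
library(stringr)  # str_replace() comes from stringr, not dplyr
df %>%
  # insert a dot after the first character of every column from aql to cxy
  rename_at(vars(aql:cxy), ~ str_replace(., "(?<=\\w{1})", "."))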
#   id ex a.ql b.ql c.ql b.st c.st a.xy c.xy
# 1  1  1    5    5    5    3    8    1    5
# 2  2  0    4    7    7    7    7    9    3
# 3  3  0   NA   NA   NA    8    5    4    1
# 4  4  1    6    9    9    9    3    4    4
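In dplyr 1.0.0 and later, rename_at() is superseded by rename_with(). A minimal sketch of the same renaming with rename_with() and base sub(), so that stringr is not needed; the data values are an assumption reconstructed from the expected output in the question, and `renamed` is just an illustrative name:

```r
library(dplyr)

# Sample data (values reconstructed from the expected output; an assumption)
df <- data.frame(id = c(1, 2, 3, 4), ex = c(1, 0, 0, 1),
                 aql = c(5, 4, NA, 6), bql = c(5, 7, NA, 9), cql = c(5, 7, NA, 9),
                 bst = c(3, 7, 8, 9), cst = c(8, 7, 5, 3),
                 axy = c(1, 9, 4, 4), cxy = c(5, 3, 1, 4))

# Apply the renaming function only to the columns aql through cxy;
# id and ex are outside the selection and keep their names
renamed <- df %>%
  rename_with(~ sub("^([a-c])", "\\1.", .x), .cols = aql:cxy)

names(renamed)
```

Restricting the selection with .cols leaves id and ex untouched regardless of what the pattern would match.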
You can also try a tidyverse approach that reshapes the data, as follows:
library(tidyverse)
#Data
df <- data.frame(id=c(1,2,3,4), ex=c(1,0,0,1), aql=c(5,4,NA,6), bql=c(5,7,NA,9),
                 cql=c(5,7,NA,9), bst=c(3,7,8,9), cst=c(8,7,5,3),
                 axy=c(1,9,4,4), cxy=c(5,3,1,4))
#Reshape: go long, put a dot after the first character of each name, go wide again
df %>% pivot_longer(-c(1,2)) %>%
  mutate(name=paste0(substring(name,1,1),'.',substring(name,2,nchar(name)))) %>%
  pivot_wider(names_from = name,values_from=value)
Output:
# A tibble: 4 x 9
     id    ex  a.ql  b.ql  c.ql  b.st  c.st  a.xy  c.xy
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1     5     5     5     3     8     1     5
2     2     0     4     7     7     7     7     9     3
3     3     0    NA    NA    NA     8     5     4     1
4     4     1     6     9     9     9     3     4     4