R: re-running the `sample` function for each new row

Question

library(tidyverse)
fruit %>% 
  as_tibble() %>%
  transmute(fruit = value,fruit.abr = substring(value,1,sample(3:6,1)))

#> # A tibble: 80 x 2
#>    fruit        fruit.abr
#>    <chr>        <chr>    
#>  1 apple        app      
#>  2 apricot      apr      
#>  3 avocado      avo      
#>  4 banana       ban      
#>  5 bell pepper  bel      
#>  6 bilberry     bil      
#>  7 blackberry   bla      
#>  8 blackcurrant bla      
#>  9 blood orange blo      
#> 10 blueberry    blu      
#> # ... with 70 more rows

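I want my abbreviated fruit column to hold strings of a random length between 3 and 6 characters, with a different length (between 3 and 6) drawn for each row.

As written, my code draws one value between 3 and 6 and then uses it for every row. How can I "recycle" or "loop" this sample() call so that it picks a new value for each row (e.g. 3, 6, 4, 3, 5, and so on)?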

Answers

Add rowwise(), so that sample() is evaluated once per row:

fruit %>% 
     as_tibble() %>% 
     rowwise() %>% 
     transmute(fruit = value,fruit.abr = substring(value,1,sample(3:6,1)))

# A tibble: 80 x 2
# Rowwise: 
   fruit        fruit.abr
   <chr>        <chr>    
 1 apple        apple    
 2 apricot      apri     
 3 avocado      avocad   
 4 banana       bana     
 5 bell pepper  bell     
 6 bilberry     bil      
 7 blackberry   black    
 8 blackcurrant bla      
 9 blood orange blo      
10 blueberry    blu      
# ... with 70 more rows
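
The difference between one recycled draw and one draw per row can also be seen without dplyr. The following is only a minimal base-R sketch: the `words` vector and the seed are illustrative assumptions, not part of the original data, and `vapply()` merely mimics the per-row evaluation that rowwise() provides.

```r
# Hypothetical sample words, each at least 6 characters long so that
# the requested length is never cut short by the word itself.
words <- c("apricot", "avocado", "banana", "bilberry", "blackberry")

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# One draw: sample() runs once and its single result is recycled,
# so every abbreviation ends up with the same length.
one_draw <- substring(words, 1, sample(3:6, 1))

# One draw per element: sample() runs again for every word,
# which is the effect rowwise() has inside transmute().
per_row <- vapply(
  words,
  function(w) substring(w, 1, sample(3:6, 1)),
  character(1),
  USE.NAMES = FALSE
)

nchar(one_draw)  # a single repeated length
nchar(per_row)   # lengths drawn independently for each word
```

Note that rowwise() calls sample() once per row, which is convenient here but can become slow on large tables.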

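sample(3:6, 1) returns a single value, which is then recycled to the number of rows. Instead, draw a sample as large as the number of rows in one call. Remember to set replace = TRUE so that values are sampled with replacement.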

fruit %>% 
  as_tibble() %>%
  transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, n(), replace = TRUE)))

# # A tibble: 10 x 2
#    fruit        fruit.abr
#    <chr>        <chr>    
#  1 apple        "app"    
#  2 apricot      "apr"    
#  3 avocado      "avoca"  
#  4 banana       "banana" 
#  5 bell pepper  "bell "  
#  6 bilberry     "bilbe"  
#  7 blackberry   "blac"   
#  8 blackcurrant "blac"   
#  9 blood orange "blo"    
# 10 blueberry    "blu"

Data

fruit <- structure(list(value = c("apple","apricot","avocado","banana","bell pepper","bilberry","blackberry","blackcurrant","blood orange","blueberry")),class = "data.frame",row.names = c(NA,-10L))
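
The advice about replace = TRUE in this answer can be checked in isolation. This is a small sketch under assumed values (`n = 10` matches the sample data above; the seed is arbitrary): only four distinct values are available in 3:6, so drawing ten of them without replacement is not possible.

```r
n <- 10       # number of rows in the sample data
set.seed(42)  # arbitrary seed, only for reproducibility

# With replacement: one length per row, each between 3 and 6.
lens <- sample(3:6, n, replace = TRUE)

# Without replacement: sample() signals an error because the
# population (4 values) is smaller than the requested size (10).
no_replace <- tryCatch(
  sample(3:6, n),
  error = function(e) "error"
)

lens
no_replace
```

Because `substring()` is vectorized over its `last` argument, passing a vector like `lens` gives each row its own cut-off without any explicit loop.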

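Try this; it may be closer to what you want. You can use runif to draw a random length between 3 and 6, then use sample() to pick that many characters from the original word in random order. Here is the code: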

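# Data
df <- data.frame(fruit = c('apple', 'orange'), stringsAsFactors = FALSE)
# Build a random ID from the characters of one word
myfunc <- function(x)
{
  # Split the word into single characters
  y <- unlist(strsplit(x, split = ''))
  # Random length between 3 and 6, capped at the word length so sample() cannot fail
  index <- min(round(runif(1, 3, 6), 0), length(y))
  # Draw `index` characters without replacement and paste them together
  var <- paste0(sample(y, index), collapse = '')
  # Return
  return(var)
}
# Apply to each row (MARGIN = 1)
df$ID <- apply(df, 1, myfunc)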

Output:

   fruit   ID
1  apple eppa
2 orange egnr
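
One detail of the runif approach is worth checking: rounding a uniform draw on [3, 6] does not make the four lengths equally likely, because 3 and 6 each collect only half a unit of the interval. The simulation below is a rough sketch with an arbitrary seed and sample size, and `sample(3:6, ...)` is shown as one possible alternative rather than as the answer's own method.

```r
set.seed(123)    # arbitrary seed, only for reproducibility
n_draws <- 1e5   # arbitrary simulation size

# Lengths produced by rounding a continuous uniform draw.
draws_runif <- round(runif(n_draws, 3, 6))

# Lengths produced by sampling the integers directly.
draws_sample <- sample(3:6, n_draws, replace = TRUE)

# Observed share of each length under both approaches.
prop_runif <- prop.table(table(draws_runif))
prop_sample <- prop.table(table(draws_sample))

prop_runif   # 3 and 6 are roughly half as frequent as 4 and 5
prop_sample  # all four lengths are roughly equally frequent
```

If equally likely lengths matter, replacing `round(runif(1, 3, 6), 0)` with `sample(3:6, 1)` inside the function would be a small change.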