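Convert a list column into a regular column in a data.frame?

Problem description

Here is a data.frame whose second column is a list column (note that it also contains a NULL).

How can we convert each list element into a regular element, so that the column behaves like any other character column? (The NULL can become NA.)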

df <- structure(list(Year = c(2014L, 2014L, 2013L, 2014L),
                     Country = list(Country = "Canada", Country = "Germany",
                                    Country = "France", Country = "Mexico",
                                    Country = "Canada", NULL,
                                    Country = "United States of America",
                                    Country = "Germany")),
                class = "data.frame", row.names = c(NA, -20L))
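
As posted, the dput() output above looks truncated: it lists 4 Year values and 8 Country values against 20 row names, so it will not rebuild a valid data frame. Here is a sketch that rebuilds the 20 rows from the table printed in the first answer below, assuming that printout reflects the original data. The helper names `countries` and `country_list` are not from the original post.

```r
# Country values in row order, read off the printed table.
# NA marks the row whose list cell is NULL (row 9).
countries <- c("Canada", "Germany", "France", "Germany", "Mexico",
               "Germany", "Germany", "Canada", NA, "Germany",
               "Mexico", "Canada", "Mexico", "Germany", "Canada",
               "United States of America", "Canada", "Mexico", "Canada", "Germany")

# Build the list column: one element per row, NULL where the value is missing.
country_list <- lapply(countries, function(x) if (is.na(x)) NULL else x)
names(country_list) <- ifelse(is.na(countries), "", "Country")

# Rows 1-14 are 2014, row 15 is 2013, rows 16-20 are 2014 (per the printed table).
df <- data.frame(Year = rep(c(2014L, 2013L, 2014L), times = c(14L, 1L, 5L)))
df$Country <- country_list

sapply(df, class)
```

The list column is attached with `df$Country <- ...` after the data frame exists, because passing a list directly to data.frame() would try to spread it into several columns.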

Note

df %>% sapply(class)
     Year   Country 
"integer"    "list" 

Desired result:

  • The same data, but with Country as a character column:
df %>% sapply(class)
       Year     Country 
  "integer" "character" 

Solution

I would suggest an approach that applies a function to each element of the list column:

# Return the first value of a list element, or NA when the element is NULL
myfun <- function(x) {
  if (is.null(x)) {
    y <- NA_character_
  } else {
    y <- x[[1]]
  }
  return(y)
}

# Apply the function to every element and store the result as a plain vector
df$Newvar <- as.vector(do.call(rbind, lapply(df$Country, myfun)))

Output:

df
   Year                  Country                   Newvar
1  2014                   Canada                   Canada
2  2014                  Germany                  Germany
3  2014                   France                   France
4  2014                  Germany                  Germany
5  2014                   Mexico                   Mexico
6  2014                  Germany                  Germany
7  2014                  Germany                  Germany
8  2014                   Canada                   Canada
9  2014                     NULL                     <NA>
10 2014                  Germany                  Germany
11 2014                   Mexico                   Mexico
12 2014                   Canada                   Canada
13 2014                   Mexico                   Mexico
14 2014                  Germany                  Germany
15 2013                   Canada                   Canada
16 2014 United States of America United States of America
17 2014                   Canada                   Canada
18 2014                   Mexico                   Mexico
19 2014                   Canada                   Canada
20 2014                  Germany                  Germany

And a check:

str(df)
'data.frame':	20 obs. of  3 variables:
 $ Year   : int  2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
 $ Country:List of 20
  ..$ Country: chr "Canada"
  ..$ Country: chr "Germany"
  ..$ Country: chr "France"
  ..$ Country: chr "Germany"
  ..$ Country: chr "Mexico"
  ..$ Country: chr "Germany"
  ..$ Country: chr "Germany"
  ..$ Country: chr "Canada"
  ..$        : NULL
  ..$ Country: chr "Germany"
  ..$ Country: chr "Mexico"
  ..$ Country: chr "Canada"
  ..$ Country: chr "Mexico"
  ..$ Country: chr "Germany"
  ..$ Country: chr "Canada"
  ..$ Country: chr "United States of America"
  ..$ Country: chr "Canada"
  ..$ Country: chr "Mexico"
  ..$ Country: chr "Canada"
  ..$ Country: chr "Germany"
 $ Newvar : chr  "Canada" "Germany" "France" "Germany" ...

Newvar is a plain character vector, no longer a list.


One option:

df$Country <- sapply(df$Country,function(x) if (length(x)) x else NA)
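
sapply() chooses its return type from the data, so a cell holding more than one value would silently give back a list. Below is a stricter sketch using vapply(), which raises an error unless every cell reduces to exactly one string. The data frame `toy` is made up for illustration and is not the data from the question.

```r
# Hypothetical toy data with one NULL cell in a list column.
toy <- data.frame(Year = c(2014L, 2013L, 2014L))
toy$Country <- list(Country = "Canada", NULL, Country = "Mexico")

# vapply() enforces "exactly one character value per row";
# empty cells (NULL) become NA_character_.
toy$Country <- vapply(
  toy$Country,
  function(x) if (length(x)) as.character(x) else NA_character_,
  FUN.VALUE = character(1),
  USE.NAMES = FALSE
)

sapply(toy, class)
```

The trade-off is that vapply() stops with an error on unexpected input instead of quietly returning a different type, which is usually preferable inside scripts.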

Another:

df$Country[lengths(df$Country) == 0] <- list(NA)
# flatten the list into an atomic vector
df$Country <- unlist(df$Country, use.names = FALSE)
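
A point worth checking for this variant: a list already counts as a vector in R, so as.vector() leaves a list as a list, while unlist() is what flattens it into an atomic vector. A small sketch on made-up values (the name `cells` is illustrative only):

```r
# Hypothetical list column contents, including one NULL cell.
cells <- list(Country = "Canada", NULL, Country = "Mexico")

# Replace empty cells with NA so that every cell has length 1.
cells[lengths(cells) == 0] <- list(NA)

class(as.vector(cells))                  # still "list"
class(unlist(cells, use.names = FALSE))  # "character"
```

Filling the empty cells first matters, because unlist() drops NULL elements and the result would otherwise be shorter than the number of rows.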

Another approach, which keeps it consistent with dplyr's mutate:

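library(dplyr)

# Inside mutate(), refer to the column as Country rather than df$Country.
# Replace "MISSING" with NA_character_ if NULL should become NA.
df2 <- df %>%
  mutate(NewCountry = if_else(
    sapply(Country, is.null), "MISSING", as.character(Country)
  ))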

> sapply(df2,class)
       Year     Country  NewCountry 
  "integer"      "list" "character"