JSON imported into R as one huge list, with some elements being tables: how do I handle these?

Problem description

I have a JSON dataset from Kaggle.

beer <- fromJSON('recipes_full.txt')

It is imported as one huge list. Each beer is numbered, starting from 0, and each beer is itself a list of 19 elements.

What 'beer' looks like

Each of these elements holds a single value, except for 4 that are tables of varying sizes, such as 8x2, 5x5, 2x4, and so on.

Expanded '0'
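One way to see which elements of a recipe are tables is to test every element with is.matrix(). This is only a minimal sketch: `recipe` is a small hypothetical stand-in modelled on the sample data, and it assumes the tables arrive as character matrices, as they do in the printed output.

```r
# Hypothetical stand-in for one recipe (not the real dataset).
recipe <- list(
  name  = "Vanilla Cream Ale",
  abv   = 5.48,
  hops  = matrix(c("14", "14", "Cascade", "saaz"), nrow = 2),
  yeast = c("76%", "Low", "Yes")
)

# TRUE for elements that are matrices (the "tables"), FALSE otherwise.
is_table <- vapply(recipe, is.matrix, logical(1))

# Names of the table elements and their dimensions.
table_names <- names(recipe)[is_table]
table_dims  <- lapply(recipe[is_table], dim)

print(table_names)
print(table_dims)
```

Note that a plain vector such as "yeast" is not a matrix, so it is not flagged here even though it holds more than one value.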

Sample data:

$`0`
$`0`$name
[1] "Vanilla Cream Ale"

$`0`$url
[1] "/homebrew/recipe/view/1633/vanilla-cream-ale"

$`0`$method
[1] "All Grain"

$`0`$style
[1] "Cream Ale"

$`0`$batch
[1] 21.8

$`0`$og
[1] 1.055

$`0`$fg
[1] 1.013

$`0`$abv
[1] 5.48

$`0`$ibu
[1] 19.44

$`0`$color
[1] 4.83

$`0`$`ph mash`
[1] -1

$`0`$fermentables
     [,1]    [,2]                                  [,3] [,4]  [,5]  
[1,] "2.381" "American - Pale 2-Row"               "37" "1.8" "44.7"
[2,] "0.907" "American - White Wheat"              "40" "2.8" "17"  
[3,] "0.907" "American - Pale 6-Row"               "35" "1.8" "17"  
[4,] "0.227" "Flaked Corn"                         "40" "0.5" "4.3" 
[5,] "0.227" "American - Caramel / Crystal 20L"    "35" "20"  "4.3" 
[6,] "0.227" "American - Carapils (Dextrine Malt)" "33" "1.8" "4.3" 
[7,] "0.113" "Flaked Barley"                       "32" "2.2" "2.1" 
[8,] "0.34"  "Honey"                               "42" "2"   "6.4" 

$`0`$hops
     [,1] [,2]      [,3]     [,4]  [,5]   [,6]     [,7]    [,8]  
[1,] "14" "Cascade" "pellet" "6.2" "Boil" "60 min" "11.42" "33.3"
[2,] "14" "Cascade" "pellet" "6.2" "Boil" "20 min" "6.92"  "33.3"
[3,] "14" "saaz"    "pellet" "3"   "Boil" "5 min"  "1.1"   "33.3"

$`0`$`hops Summary`
     [,1] [,2]               [,3]    [,4]  
[1,] "28" "Cascade (pellet)" "18.34" "66.6"
[2,] "14" "saaz (pellet)"    "1.1"   "33.3"

$`0`$other
     [,1]     [,2]                           [,3]     [,4]        [,5]     
[1,] "2 oz"   "pure vanilla extract"         "Flavor" "Boil"      "0 min." 
[2,] "1 oz"   "pure vanilla extract"         "Flavor" "Bottling"  "0 min." 
[3,] "1 tsp"  "yeast nutrient"               "Other"  "Boil"      "15 min."
[4,] "1 each" "whirlfloc"                    "Fining" "Boil"      "15 min."
[5,] "4 each" "Vanilla beans - in 2oz Vodka" "Other"  "Secondary" "0 min." 

$`0`$yeast
[1] "Wyeast - Kölsch 2565" "76%"                  "Low"                 
[4] "56"                   "70"                   "Yes"                 

$`0`$rating
[1] 0

$`0`$`num rating`
[1] 16

$`0`$views
[1] 289454

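As you can see, everything except "fermentables", "hops", "other", and "yeast" is easy to handle. I don't know how to deal with these tables. I have been searching StackOverflow and trying different approaches, but most of them require converting the dataset to a data frame, and these tables block me from doing that. I think I want to isolate them and convert them to long-format data, but I'm not sure how to isolate them. Are there any suggestions, libraries, or documentation I could read? Thanks for your help! I tried handling this in Python earlier and still couldn't figure it out.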
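A possible way to get past the "blocked by tables" problem is to build the data frame from the single-value fields only and leave the tables for a separate step. The sketch below is not a tested solution for the full dataset: `beer` is a hypothetical two-recipe stand-in, "Example Recipe B" is made up, the helper name `scalar_fields` is my own, and it assumes every recipe has the same single-value fields.

```r
# Hypothetical stand-in for `beer` (not the real dataset).
beer <- list(
  `0` = list(name = "Vanilla Cream Ale", style = "Cream Ale", abv = 5.48,
             hops = matrix(c("14", "Cascade"), nrow = 1)),
  `1` = list(name = "Example Recipe B", style = "Example Style", abv = 6.1,
             hops = matrix(c("28", "saaz"), nrow = 1))
)

# Keep only the length-one, non-matrix elements of a recipe.
scalar_fields <- function(recipe) {
  keep <- vapply(
    recipe,
    function(x) is.atomic(x) && length(x) == 1L && !is.matrix(x),
    logical(1)
  )
  recipe[keep]
}

# One row per recipe, built from the scalar fields only.
rows <- lapply(beer, function(r) {
  as.data.frame(scalar_fields(r), stringsAsFactors = FALSE)
})
recipes <- do.call(rbind, unname(rows))

# Carry the list names ("0", "1", ...) along as an id column.
recipes <- cbind(beer_id = names(beer), recipes, stringsAsFactors = FALSE)
rownames(recipes) <- NULL

print(recipes)
```

Elements with several values that are not matrices, such as "yeast", are dropped by this filter and would need their own handling.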

Edit: I know I can access them individually with "beer[["0"]][["fermentables"]]", but I don't know how to access several at once, which has also been frustrating.
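For reaching the same table in many recipes at once, one option is lapply() with `[[` as the function, followed by stacking the matrices into one long data frame keyed by recipe id. This is a sketch under assumptions: `beer` here is a hypothetical stand-in, "Example Recipe B" is made up, and the code assumes "fermentables" has the same number of columns in every recipe and is never missing.

```r
# Hypothetical stand-in for `beer` (not the real dataset).
beer <- list(
  `0` = list(
    name = "Vanilla Cream Ale",
    fermentables = matrix(
      c("2.381", "0.907", "American - Pale 2-Row", "American - White Wheat"),
      nrow = 2
    )
  ),
  `1` = list(
    name = "Example Recipe B",
    fermentables = matrix(c("4.5", "Pilsner"), nrow = 1)
  )
)

# Pull the "fermentables" element out of every recipe in one call.
fermentables <- lapply(beer, `[[`, "fermentables")

# Convert each matrix to a data frame tagged with its recipe id, then stack.
long <- do.call(rbind, lapply(names(fermentables), function(id) {
  m <- fermentables[[id]]
  data.frame(
    beer_id = id,
    as.data.frame(m, stringsAsFactors = FALSE),
    stringsAsFactors = FALSE
  )
}))

print(long)
```

The matrices have no column names, so the columns come out as V1, V2, and so on. They are still character and would need renaming and type conversion afterwards.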

dput() of the first beer's data:

list(`0` = list(name = "Vanilla Cream Ale", url = "/homebrew/recipe/view/1633/vanilla-cream-ale",
    method = "All Grain", style = "Cream Ale", batch = 21.8, og = 1.055, fg = 1.013,
    abv = 5.48, ibu = 19.44, color = 4.83, `ph mash` = -1L,
    fermentables = structure(c("2.381", "0.907", "0.907", "0.227", "0.227", "0.227", "0.113", "0.34",
    "American - Pale 2-Row", "American - White Wheat", "American - Pale 6-Row", "Flaked Corn",
    "American - Caramel / Crystal 20L", "American - Carapils (Dextrine Malt)", "Flaked Barley", "Honey",
    "37", "40", "35", "40", "35", "33", "32", "42",
    "1.8", "2.8", "1.8", "0.5", "20", "1.8", "2.2", "2",
    "44.7", "17", "17", "4.3", "4.3", "4.3", "2.1", "6.4"), .Dim = c(8L, 5L)),
    hops = structure(c("14", "14", "14", "Cascade", "Cascade", "saaz", "pellet", "pellet", "pellet",
    "6.2", "6.2", "3", "Boil", "Boil", "Boil", "60 min", "20 min", "5 min",
    "11.42", "6.92", "1.1", "33.3", "33.3", "33.3"), .Dim = c(3L, 8L)),
    `hops Summary` = structure(c("28", "14", "Cascade (pellet)", "saaz (pellet)",
    "18.34", "1.1", "66.6", "33.3"), .Dim = c(2L, 4L)),
    other = structure(c("2 oz", "1 oz", "1 tsp", "1 each", "4 each",
    "pure vanilla extract", "pure vanilla extract", "yeast nutrient", "whirlfloc", "Vanilla beans - in 2oz Vodka",
    "Flavor", "Flavor", "Other", "Fining", "Other",
    "Boil", "Bottling", "Boil", "Boil", "Secondary",
    "0 min.", "0 min.", "15 min.", "15 min.", "0 min."), .Dim = c(5L, 5L)),
    yeast = c("Wyeast - Kölsch 2565", "76%", "Low", "56", "70", "Yes"),
    rating = 0L, `num rating` = 16L, views = 289454L))
