Problem description
I have a data table with this schema:
date,key_a,key_b,key_c,key_d,value
I want to build a list with the following structure:
[1]
date,value
[2]
date,value
[3]
date,value
I want to aggregate my dt into a list, where each entry is dt aggregated by a single key.
Here is my code:
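library(data.table)
setDT(dt)
list_of_dts <- list()
# one aggregation per key column, written out by hand
list_of_dts[[1]] <- dt[, .(value = sum(value)), .(date, key_a)]
list_of_dts[[2]] <- dt[, .(value = sum(value)), .(date, key_b)]
# (and so on for key_c and key_d)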
Is there a more efficient way to do this?
Solution
Perhaps reshape the data to long format and then aggregate it:
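library(data.table)
setDT(dt)
# value.name is set so the melted key values do not reuse the column name "value"
dt1 <- melt(dt, id.vars = c('date', 'value'), value.name = 'key_value')
# sum value per date and per key column (the key column name is in variable)
dt1 <- dt1[, .(value = sum(value)), .(date, variable)]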
Now, if you want a list of data frames, you can use split:
split(dt1,dt1$variable)
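The melt-then-aggregate idea above can be sketched end to end as follows. The sample data and the names long, by_key, by_key_value, and result are illustrative assumptions, not part of the original answer. The second aggregation also groups by the key value, which is an assumption about what the question's dt[, .(value = sum(value)), .(date, key_a)] is meant to produce.

```r
library(data.table)

# Hypothetical sample data with the schema from the question.
dt <- data.table(date = c(1, 1, 2),
                 key_a = c(11, 11, 13), key_b = c(21, 21, 23),
                 key_c = c(31, 31, 33), key_d = c(41, 41, 43),
                 value = c(51, 51, 53))

# Long format: one row per (original row, key column).
long <- melt(dt, id.vars = c("date", "value"),
             variable.name = "key", value.name = "key_value")

# Sum value per date and key column, ignoring the key's own value.
by_key <- long[, .(value = sum(value)), by = .(date, key)]

# Alternative: keep the key value as a grouping column as well.
by_key_value <- long[, .(value = sum(value)), by = .(date, key, key_value)]

# One table per key column; keep.by = FALSE drops the now-constant key column.
result <- split(by_key_value, by = "key", keep.by = FALSE)
```

Here result is a named list with one entry per key column, each holding date, key_value, and the summed value.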
We can use tidyverse:
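library(dplyr)
library(tidyr)
dt %>%
  # key values go to value1 so they do not clash with the existing value column
  pivot_longer(cols = starts_with('key'), values_to = 'value1') %>%
  group_by(date, name) %>%
  # note: this sums the key values (value1); use sum(value) to total the value column
  summarise(value = sum(value1))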
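The pipeline above sums value1, which holds the key values. If the goal is to total the value column per key, as in the question, a variant might look like the sketch below. The sample data and the names long, summed, and result are assumptions for illustration, and base split is used here to get a list.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample data with the schema from the question.
dt <- data.frame(date = c(1, 1, 2),
                 key_a = c(11, 11, 13), key_b = c(21, 21, 23),
                 key_c = c(31, 31, 33), key_d = c(41, 41, 43),
                 value = c(51, 51, 53))

# Long format: the key column name goes to key, its value to key_value.
long <- dt %>%
  pivot_longer(cols = starts_with("key"),
               names_to = "key", values_to = "key_value")

# Total the original value column within each (date, key, key_value) group.
summed <- long %>%
  group_by(date, key, key_value) %>%
  summarise(value = sum(value), .groups = "drop")

# One tibble per key column.
result <- split(summed, summed$key)
```

Each element of result is a tibble for one key column; drop key_value from group_by if only per-date totals are needed.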
One option is to convert to a disk.frame and do the group-by operations there:
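library(disk.frame)
library(dplyr)
library(tidyr)
dt %>%
  pivot_longer(cols = starts_with('key'), values_to = 'value1') %>%
  as.disk.frame() %>%
  group_by(date, name) %>%
  # note: as above, this sums the key values stored in value1
  summarise(value = sum(value1)) %>%
  # bring the aggregated result back into memory
  collect()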
If there are multiple .csv files, they can be read directly with csv_to_disk.frame:
df <- csv_to_disk.frame(file.path(tempdir(), "df.csv"), inmapfn = function(chunk) {
  # convert date_str to Date format and store it as "date"
  chunk[, date := as.Date(date_str, "%Y-%m-%d")]
  chunk[, date_str := NULL]
  chunk[, new := col1 + 5]
})
Try this data.table-native approach. First, build sample data and separate the key columns from the other columns:
library(data.table)
dt <- data.table(date = c(1, 1, 2), key_a = c(11, 11, 13), key_b = c(21, 21, 23),
                 key_c = c(31, 31, 33), key_d = c(41, 41, 43), value = c(51, 51, 53))
# columns whose names start with "key", and everything else
keynames <- grep("^key", colnames(dt), value = TRUE)
othnames <- setdiff(colnames(dt), keynames)
keynames
# [1] "key_a" "key_b" "key_c" "key_d"
othnames
# [1] "date"  "value"
To split the raw data into one table per key column (this only selects columns; it does not aggregate):
lapply(setNames(nm = keynames), function(kn) subset(dt, select = c(othnames, kn)))
# $key_a
#    date value key_a
# 1:    1    51    11
# 2:    1    51    11
# 3:    2    53    13
# $key_b
#    date value key_b
# 1:    1    51    21
# 2:    1    51    21
# 3:    2    53    23
# $key_c
#    date value key_c
# 1:    1    51    31
# 2:    1    51    31
# 3:    2    53    33
# $key_d
#    date value key_d
# 1:    1    51    41
# 2:    1    51    41
# 3:    2    53    43
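The same lapply pattern can also aggregate rather than just select columns. The following is a minimal sketch, assuming the goal is the per-key sums from the question; the name list_of_dts is taken from the question, and the grouping by both date and the key column mirrors the asker's own code.

```r
library(data.table)

# Same sample data as above.
dt <- data.table(date = c(1, 1, 2),
                 key_a = c(11, 11, 13), key_b = c(21, 21, 23),
                 key_c = c(31, 31, 33), key_d = c(41, 41, 43),
                 value = c(51, 51, 53))
keynames <- grep("^key", colnames(dt), value = TRUE)

# One aggregated table per key column: sum value within each (date, key) group.
# by accepts a character vector, so the key column name can be passed as a string.
list_of_dts <- lapply(setNames(nm = keynames), function(kn) {
  dt[, .(value = sum(value)), by = c("date", kn)]
})

list_of_dts$key_a
```

On the sample data, list_of_dts$key_a has two rows, with value 102 for date 1 and 53 for date 2. Using by = "date" alone would give the date,value structure described in the question, at the cost of every list entry being identical.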