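Problem description
I'm new to Vega-Lite and am trying to aggregate my data by week. The existing options for displaying data by week don't work for me, because I want weeks to start on Monday (rather than on Sunday, as they do now), and the week numbers shown are actually wrong.
Here is my basic code.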
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"date": "2020-09-29", "count": "13", "outcome": "invalid"},
      {"date": "2020-09-29", "count": "14", "outcome": "fail"},
      {"date": "2020-09-29", "count": "20", "outcome": "pass"},
      {"date": "2020-09-27", "count": "70", "outcome": "invalid"},
      {"date": "2020-09-27", "count": "30", "outcome": "fail"},
      {"date": "2020-09-26", "count": "5", "outcome": "invalid"},
      {"date": "2020-09-26", "count": "15", "outcome": "fail"},
      {"date": "2020-09-26", "count": "13", "outcome": "pass"}
    ]
  },
  "width": 280, "height": 200,
  "mark": {"type": "bar", "tooltip": true},
  "encoding": {
    "x": {"title": "Week", "field": "date", "type": "ordinal", "timeUnit": "week", "axis": {"format": "%W"}},
    "y": {"title": "Number of tests", "field": "count", "aggregate": "sum", "type": "quantitative", "axis": {"orient": "right"}},
    "color": {
      "field": "outcome", "type": "nominal",
      "scale": {"domain": ["invalid", "fail", "pass"], "range": ["#c7c7c7", "#8fd7f9", "#ef9292"]},
      "legend": {"title": "Test results"}
    }
  }
}
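In principle, I could use the window function in the excerpt below to compute a running week count, but each date appears in multiple rows and I don't want to collapse over the "outcome" variable. Also, my data can start on an arbitrary date, so week numbers that start from 0 are not an option either.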
{"calculate": "day(datum.date) == 0","as": "sundays"},{
"window": [{"op": "sum","field": "sundays","as": "week"}],"sort": "date"
}
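To see why repeated dates get in the way, here is a minimal plain-Python sketch (it does not model Vega-Lite's window transform exactly). It assumes three rows per date, one per outcome; the names `rows` and `sunday_flags` are made up for illustration. A running total of Sunday flags counts every Sunday row, so a single Sunday advances the counter by three instead of one.

```python
from datetime import date
from itertools import accumulate

# Assumed layout: three rows per date (one per outcome), sorted by date.
rows = [date(2020, 9, 26)] * 3 + [date(2020, 9, 27)] * 3 + [date(2020, 9, 29)] * 3

# Same idea as the "sundays" field: 1 for a Sunday row, 0 otherwise.
sunday_flags = [1 if d.isoweekday() == 7 else 0 for d in rows]

# Row-by-row running total of the flags.
running_total = list(accumulate(sunday_flags))

# Number of distinct Sundays actually present in the data.
distinct_sundays = len({d for d, flag in zip(rows, sunday_flags) if flag})

print(running_total[-1], distinct_sundays)  # prints "3 1"
```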
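I also thought of a less elegant solution: binning by 7 days on the x-axis and aggregating on the y-axis (while making sure the data starts on a Monday). This gives the correct total count per week, but then I struggle to label the x-axis with the right week numbers.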
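A rough Python sketch of that binning idea, assuming ISO-style weeks that start on Monday: each date is mapped to the Monday of its week, and the ISO week number from Python's `date.isocalendar()` serves as the label. The helper name `week_start` is hypothetical.

```python
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Return the Monday on or before d (date.weekday() is 0 for Monday)."""
    return d - timedelta(days=d.weekday())

for d in [date(2020, 9, 26), date(2020, 9, 27), date(2020, 9, 29)]:
    # isocalendar() returns (ISO year, ISO week, ISO weekday).
    print(d, "->", week_start(d), "week", d.isocalendar()[1])
```

With these dates, the first two fall in the week starting 2020-09-21 (ISO week 39) and the third in the week starting 2020-09-28 (ISO week 40).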
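Finally, even if I could live with weeks starting on Sunday (that is, using the basic code above), I still see unexpected week numbers. For some reason (perhaps because I don't understand how week numbers are calculated), the displayed week numbers are 37 and 38 (as shown in the attached image), whereas they should be 39 and 40. How can I fix this?
Any tips would be appreciated.
Solution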
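Vega's week timeUnit has well-defined behavior, which is described in the timeUnit documentation:
"week": Sunday-based weeks. Days before the first Sunday of the year are considered to be in week 0, the first Sunday of the year is the start of week 1, the second Sunday is the start of week 2, and so on.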
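As a cross-check of that definition, here is a small Python sketch (the function name `sunday_based_week` is made up for illustration). It only illustrates the quoted definition next to ISO week numbers from `date.isocalendar()`; it does not reproduce how the chart formats its axis labels.

```python
from datetime import date, timedelta

def sunday_based_week(d: date) -> int:
    """Week number per the quoted definition: week 1 starts on the year's first Sunday."""
    jan1 = date(d.year, 1, 1)
    # date.weekday() is 0 for Monday ... 6 for Sunday.
    first_sunday = jan1 + timedelta(days=(6 - jan1.weekday()) % 7)
    if d < first_sunday:
        return 0  # days before the first Sunday belong to week 0
    return 1 + (d - first_sunday).days // 7

for d in [date(2020, 9, 26), date(2020, 9, 27), date(2020, 9, 29)]:
    # Columns: date, Sunday-based week, ISO week.
    print(d, sunday_based_week(d), d.isocalendar()[1])
```

For the three dates in the question, the Sunday-based numbering gives 38, 39 and 39, while the ISO numbering gives 39, 39 and 40.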
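The package does not currently have a built-in alternative week definition, but you can use Vega expressions in a calculate transform to compute arbitrary quantities from your data.
If I've done the calculation correctly, I think this will give you the ISO week number: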
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"date": "2020-09-29", "count": "13", "outcome": "invalid"},
      {"date": "2020-09-29", "count": "14", "outcome": "fail"},
      {"date": "2020-09-29", "count": "20", "outcome": "pass"},
      {"date": "2020-09-27", "count": "70", "outcome": "invalid"},
      {"date": "2020-09-27", "count": "30", "outcome": "fail"},
      {"date": "2020-09-26", "count": "5", "outcome": "invalid"},
      {"date": "2020-09-26", "count": "15", "outcome": "fail"},
      {"date": "2020-09-26", "count": "13", "outcome": "pass"}
    ]
  },
  "transform": [
    {"calculate": "day(datetime(utcyear(datum.date), 0, 1))", "as": "startingDay"},
    {"calculate": "(4 - datum.startingDay) % 7 - 2", "as": "mondayOfFirstWeek"},
    {"calculate": "1 + floor((utcdayofyear(datum.date) - datum.mondayOfFirstWeek) / 7)", "as": "ISOweek"}
  ],
  "width": 280, "height": 200,
  "mark": {"type": "bar", "tooltip": true},
  "encoding": {
    "x": {"title": "Week", "field": "ISOweek", "type": "ordinal"},
    "y": {"title": "Number of tests", "field": "count", "aggregate": "sum", "type": "quantitative", "axis": {"orient": "right"}},
    "color": {
      "field": "outcome", "type": "nominal",
      "scale": {"domain": ["invalid", "fail", "pass"], "range": ["#c7c7c7", "#8fd7f9", "#ef9292"]},
      "legend": {"title": "Test results"}
    }
  }
}
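A brief explanation of the transforms:
- {"calculate": "day(datetime(utcyear(datum.date), 0, 1))", "as": "startingDay"}
  This computes the day of the week on which January 1 of the given year falls (Sunday = 0, Monday = 1 ... Saturday = 6). Months in datetime() are zero-based, so 0 is January.
- {"calculate": "(4 - datum.startingDay) % 7 - 2", "as": "mondayOfFirstWeek"}
  This computes the day of the year of the Monday that starts the first week containing a Thursday. For example, if startingDay = 5, then January 1 is a Friday, so the 4th day of the year is the Monday of the first week containing a Thursday. If startingDay = 4, then January 1 is a Thursday, so day -2 is the Monday of the first week containing a Thursday. The startingDay = 5 example assumes a non-negative remainder. Vega expressions follow JavaScript, where (4 - 5) % 7 evaluates to -1, so for years that begin on a Friday or Saturday the expression as written lands on a Monday one week too early; writing it as (11 - datum.startingDay) % 7 - 2 keeps the remainder non-negative.
- {"calculate": "1 + floor((utcdayofyear(datum.date) - datum.mondayOfFirstWeek) / 7)", "as": "ISOweek"}
  This is the number of whole 7-day weeks, rounded down, counted from the first Monday determined above, plus one.
Note that the utc versions of the time functions are used so that incomplete timestamps such as 2020-09-29 are handled correctly when datum.date is parsed. Without them, ISOweek could be incorrect on January 1.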
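The three calculate steps can be mirrored in plain Python and compared with Python's own ISO week numbers from `date.isocalendar()`. This is only a sketch under stated assumptions: `js_remainder` emulates JavaScript's `%` (the remainder takes the sign of the dividend), and `numerator_base` is a hypothetical parameter that switches between the `4 - startingDay` form and an `11 - startingDay` variant whose remainder is never negative. Days near a year boundary that ISO assigns to the neighbouring year are not handled by this arithmetic.

```python
import math
from datetime import date

def js_remainder(a: int, b: int) -> int:
    """Remainder with the sign of the dividend, as JavaScript's % behaves."""
    return int(math.fmod(a, b))

def iso_week_like_transform(d: date, numerator_base: int = 4) -> int:
    """Mirror the three calculate steps for a date d."""
    # Step 1: weekday of January 1 with Sunday = 0 ... Saturday = 6.
    jan1 = date(d.year, 1, 1)
    starting_day = (jan1.weekday() + 1) % 7
    # Step 2: day of the year of the Monday that starts the first ISO week.
    monday_of_first_week = js_remainder(numerator_base - starting_day, 7) - 2
    # Step 3: whole 7-day weeks since that Monday (day of year is one-based).
    day_of_year = d.timetuple().tm_yday
    return 1 + math.floor((day_of_year - monday_of_first_week) / 7)

for d in [date(2020, 9, 26), date(2020, 9, 29), date(2021, 9, 29)]:
    # Columns: date, base 4, base 11, ISO week from the standard library.
    print(d, iso_week_like_transform(d), iso_week_like_transform(d, 11), d.isocalendar()[1])
```

For the 2020 dates all three columns agree (39 and 40). For 2021-09-29, a year that starts on a Friday, the base-4 form gives 40 while the base-11 variant and `isocalendar()` both give 39.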