Vega-Lite: weeks starting on Monday, and generally incorrect week numbers

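Problem description

I am new to Vega-Lite and am trying to aggregate my data by week. The existing options for displaying data by week do not work for me: I want weeks to start on Monday (rather than on Sunday, as they do now), and the week numbers shown are actually wrong.

Here is my basic code: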

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"date": "2020-09-29", "count": "13", "outcome": "invalid"},
      {"date": "2020-09-29", "count": "14", "outcome": "fail"},
      {"date": "2020-09-29", "count": "20", "outcome": "pass"},
      {"date": "2020-09-27", "count": "70", "outcome": "invalid"},
      {"date": "2020-09-27", "count": "30", "outcome": "fail"},
      {"date": "2020-09-26", "count": "5", "outcome": "invalid"},
      {"date": "2020-09-26", "count": "15", "outcome": "fail"},
      {"date": "2020-09-26", "count": "13", "outcome": "pass"}
    ]
  },
  "width": 280, "height": 200, "mark": {"type": "bar", "tooltip": true},
  "encoding": {
    "x": {
      "title": "Week", "field": "date", "type": "ordinal",
      "timeUnit": "week", "axis": {"format": "%W"}
    },
    "y": {
      "title": "Number of tests", "field": "count", "aggregate": "sum",
      "type": "quantitative", "axis": {"orient": "right"}
    },
    "color": {
      "field": "outcome", "type": "nominal",
      "scale": {
        "domain": ["invalid", "fail", "pass"],
        "range": ["#c7c7c7", "#8fd7f9", "#ef9292"]
      },
      "legend": {"title": "Test results"}
    }
  }
}

[Figure: rendered plot]

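In principle, I could compute a week index with the window transform in the excerpt below, but each date occurs in several rows, and I do not want to collapse across the "outcome" variable. In addition, my data can start on an arbitrary date, so week numbers that start from 0 are not an option either.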

{"calculate": "day(datum.date) == 0", "as": "sundays"},
{
  "window": [{"op": "sum", "field": "sundays", "as": "week"}],
  "sort": [{"field": "date"}]
}
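
To see why repeated dates get in the way, here is a minimal TypeScript sketch of the same idea outside Vega. It assumes the running sum for a row covers every row dated on or before it, including rows that tie on the date, and it reads weekdays in UTC. The function name `cumulativeSundayRows` is made up for illustration.

```typescript
// Dates mirror the question's data: several rows share the same date.
const sampleDates: string[] = [
  "2020-09-26", "2020-09-26", "2020-09-26",
  "2020-09-27", "2020-09-27",
  "2020-09-29", "2020-09-29", "2020-09-29",
];

// For each row, count the Sunday rows dated on or before it (ties included).
// Assumption: this is how the flag + running-sum approach treats tied dates.
function cumulativeSundayRows(dates: string[]): number[] {
  // Date-only ISO strings are parsed as UTC, so the weekday is read in UTC.
  const isSunday = (d: string): boolean => new Date(d).getUTCDay() === 0;
  // ISO date strings sort correctly as plain strings.
  return dates.map((d) => dates.filter((e) => e <= d && isSunday(e)).length);
}

const weekIndex = cumulativeSundayRows(sampleDates);
console.log(weekIndex); // [0, 0, 0, 2, 2, 2, 2, 2]
```

The index jumps by 2 at the Sunday because two rows carry that date, so it counts Sunday rows rather than weeks.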

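I also thought of a less elegant solution: binning the x-axis into 7-day bins and aggregating on the y-axis (while making sure the data start on a Monday). This gives the correct total count per week, but then I struggle to label the x-axis with the right week numbers.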
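
As a rough sketch of the bucketing half of that idea, the TypeScript function below maps a date to the Monday that opens its week, which removes the need for the data to start on a Monday. It works in UTC and assumes date-only ISO strings; `mondayOfWeek` is a hypothetical helper, not part of Vega-Lite.

```typescript
// Map a date-only ISO string to the Monday that starts its week (UTC).
function mondayOfWeek(isoDate: string): string {
  const d = new Date(isoDate); // date-only strings parse as UTC midnight
  // getUTCDay() is Sunday = 0 ... Saturday = 6; shift so Monday = 0 ... Sunday = 6.
  const daysSinceMonday = (d.getUTCDay() + 6) % 7;
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

console.log(mondayOfWeek("2020-09-29")); // "2020-09-28"
console.log(mondayOfWeek("2020-09-27")); // "2020-09-21"
```

Rows that share a Monday belong to the same week, so summing counts per returned value gives weekly totals. Labelling those buckets with week numbers is still a separate step.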

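Finally, even if I accepted weeks starting on Sunday (that is, using the basic code above), I would still see unexpected week numbers. For some reason (perhaps because I do not understand how week numbers are calculated), the weeks shown are 37 and 38 (see the plot above), when they should be 39 and 40. How can I fix this?

Any hints would be greatly appreciated.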

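Solution

Vega's week timeUnit has well-defined behavior, which is described in detail in the timeUnit documentation:

"week": Sunday-based weeks. Days before the first Sunday of the year are considered to be in week 0, the first Sunday of the year is the start of week 1, the second Sunday is the start of week 2, and so on.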

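To make that definition concrete, here is a small TypeScript sketch that applies it literally to a date-only string, working in UTC. It is an illustration of the quoted rule, not Vega's implementation, and `sundayBasedWeek` is a made-up name.

```typescript
// Sunday-based week number per the quoted definition: days before the year's
// first Sunday are week 0, and the first Sunday starts week 1.
function sundayBasedWeek(isoDate: string): number {
  const d = new Date(isoDate); // date-only strings parse as UTC midnight
  const jan1 = Date.UTC(d.getUTCFullYear(), 0, 1);
  const dayOfYear = Math.floor((d.getTime() - jan1) / 86400000) + 1; // Jan 1 = 1
  const jan1Weekday = new Date(jan1).getUTCDay(); // Sunday = 0
  const firstSunday = 1 + ((7 - jan1Weekday) % 7); // day of year of the first Sunday
  return dayOfYear < firstSunday ? 0 : 1 + Math.floor((dayOfYear - firstSunday) / 7);
}

console.log(sundayBasedWeek("2020-09-26")); // 38
console.log(sundayBasedWeek("2020-09-27")); // 39
console.log(sundayBasedWeek("2020-09-29")); // 39
```

Under this rule the sample dates fall in weeks 38 and 39, one less than the ISO numbers 39 and 40 that the question expects.
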
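The package does not currently have a built-in alternative week definition, but you can use Vega expressions in a transform to compute arbitrary quantities from your data.

If I did the calculation correctly, I think this will give you ISO week numbers: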

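{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"date": "2020-09-29", "count": "13", "outcome": "invalid"},
      {"date": "2020-09-29", "count": "14", "outcome": "fail"},
      {"date": "2020-09-29", "count": "20", "outcome": "pass"},
      {"date": "2020-09-27", "count": "70", "outcome": "invalid"},
      {"date": "2020-09-27", "count": "30", "outcome": "fail"},
      {"date": "2020-09-26", "count": "5", "outcome": "invalid"},
      {"date": "2020-09-26", "count": "15", "outcome": "fail"},
      {"date": "2020-09-26", "count": "13", "outcome": "pass"}
    ]
  },
  "transform": [
    {"calculate": "day(datetime(utcyear(datum.date), 0, 1))", "as": "startingDay"},
    {"calculate": "(11 - datum.startingDay) % 7 - 2", "as": "mondayOfFirstWeek"},
    {"calculate": "1 + floor((utcdayofyear(datum.date) - datum.mondayOfFirstWeek) / 7)", "as": "ISOweek"}
  ],
  "width": 280, "height": 200, "mark": {"type": "bar", "tooltip": true},
  "encoding": {
    "x": {"title": "Week", "field": "ISOweek", "type": "ordinal"},
    "y": {
      "title": "Number of tests", "field": "count", "aggregate": "sum",
      "type": "quantitative", "axis": {"orient": "right"}
    },
    "color": {
      "field": "outcome", "type": "nominal",
      "scale": {
        "domain": ["invalid", "fail", "pass"],
        "range": ["#c7c7c7", "#8fd7f9", "#ef9292"]
      },
      "legend": {"title": "Test results"}
    }
  }
}

A brief explanation of the transforms:

  • {"calculate": "day(datetime(utcyear(datum.date), 0, 1))", "as": "startingDay"}
    Computes the day of the week on which January 1 of the given year falls (Sunday = 0, Monday = 1 ... Saturday = 6).
  • {"calculate": "(11 - datum.startingDay) % 7 - 2", "as": "mondayOfFirstWeek"}
    Computes the day of the year of the Monday that opens the first week containing a Thursday. For example, if startingDay = 5, January 1 is a Friday, so day 4 of the year is that Monday. If startingDay = 4, January 1 is a Thursday, so day -2 is that Monday. The offset is (4 - startingDay) mod 7; it is written as 11 - startingDay because % keeps the sign of its left operand, and 4 - startingDay is negative when the year starts on a Friday or Saturday.
  • {"calculate": "1 + floor((utcdayofyear(datum.date) - datum.mondayOfFirstWeek) / 7)", "as": "ISOweek"}
    Counts the whole 7-day weeks elapsed since the first Monday determined above, rounded down, and adds one.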

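Note that we use the utc versions of the date functions (utcyear, utcdayofyear) on datum.date so that date-only timestamps such as 2020-09-29, which are parsed as UTC, are handled correctly. Without this, ISOweek would be incorrect on January 1.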
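
The three steps can be checked outside Vega with a small TypeScript sketch. It works in UTC throughout and assumes a one-based day of the year (January 1 = 1); `isoWeekSketch` is a hypothetical name. The remainder is taken on 11 - startingDay rather than 4 - startingDay: the two agree whenever 4 - startingDay is non-negative, and the extra 7 keeps the result correct for years that start on a Friday or Saturday, since `%` keeps the sign of its left operand.

```typescript
// Mirrors the three calculate steps for a date-only ISO string, in UTC.
function isoWeekSketch(isoDate: string): number {
  const d = new Date(isoDate); // date-only strings parse as UTC midnight
  const jan1 = Date.UTC(d.getUTCFullYear(), 0, 1);
  // Step 1: weekday of January 1 (Sunday = 0 ... Saturday = 6).
  const startingDay = new Date(jan1).getUTCDay();
  // Step 2: day of year of the Monday opening the first week that contains a Thursday.
  // 11 = 4 + 7 keeps the left operand of % non-negative.
  const mondayOfFirstWeek = ((11 - startingDay) % 7) - 2;
  // Step 3: whole weeks elapsed since that Monday, counted from 1.
  const dayOfYear = Math.floor((d.getTime() - jan1) / 86400000) + 1; // Jan 1 = 1
  return 1 + Math.floor((dayOfYear - mondayOfFirstWeek) / 7);
}

console.log(isoWeekSketch("2020-09-26")); // 39
console.log(isoWeekSketch("2020-09-29")); // 40
console.log(isoWeekSketch("2021-01-04")); // 1
```

Dates that ISO assigns to a week of the neighbouring year are not wrapped around: 2021-01-01, which ISO places in week 53 of 2020, comes out as week 0 with this arithmetic.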