问题描述
我想从一个给定的日期获取Hive中所有星期日和星期六的日期。 例如,如果给定的日期是2020-10-01,则需要返回两行,其中sunday_dates,saturday_dates的日期为“ 2020-10-01”之后的所有星期日和星期六。
我尝试过类似的操作,但似乎对我不起作用。
spark.sql("select date_sub('2020-10-01',cast(date_format(current_date(),'u')%7 as int)) as sunday_dates").show(10,False)
+------------+
|sunday_dates|
+------------+
|2020-09-29 |
+------------+
在Hive或pyspark中有什么方法可以实现这一目标。
谢谢!
解决方法
您需要使用 date_trunc()
才能到达星期开始日期 date_sub()
和 date_sub()
以获取周六和周日
在此处创建数据框
df = spark.createDataFrame([("2020-11-02",1),("2020-11-03",2),("2020-11-04",3)],["event_dt","word"])
df.show()
df = df.withColumn("week_start",F.date_trunc('WEEK',F.col("event_dt")))
#`In case you want to get backward weekdays`
df = df.selectExpr('*','date_sub(week_start,2) as backward_Saturday')
df = df.selectExpr('*',1) as backward_Sunday')
# In case you want forward weekends
df = df.selectExpr('*','date_add(week_start,5) as forward_Saturday')
df = df.selectExpr('*',6) as forward_Sunday')
df.show()
输入
+----------+----+
| event_dt|word|
+----------+----+
|2020-11-02| 1|
|2020-11-03| 2|
|2020-11-04| 3|
+----------+----+
输出
+----------+----+-------------------+-----------------+---------------+----------------+--------------+
| event_dt|word| week_start|backward_Saturday|backward_Sunday|forward_Saturday|forward_Sunday|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+
|2020-11-02| 1|2020-11-02 00:00:00| 2020-10-31| 2020-11-01| 2020-11-07| 2020-11-08|
|2020-11-03| 2|2020-11-02 00:00:00| 2020-10-31| 2020-11-01| 2020-11-07| 2020-11-08|
|2020-11-04| 3|2020-11-02 00:00:00| 2020-10-31| 2020-11-01| 2020-11-07| 2020-11-08|
+----------+----+-------------------+-----------------+---------------+----------------+--------------+