Postgres: calculate total working hours from IN and OUT entries

Question

I have the following tables:

1) My company table

 id |   c_name   |  c_code  | status 
----+------------+----------+--------
  1 | AAAAAAAAAA |  AA1234  | Active

2) My users table

 id |    c_id    | u_name   | status | emp_id 
----+------------+----------+--------+--------
  1 |      1     | XXXXXXXX | Active |    1   
  2 |      1     | YYYYYYYY | Active |    2

3) My attendance table

 id |  u_id  |        swipe_time      | status 
----+--------+------------------------+--------
  1 |   1    |  2020-08-20 16:00:00   | IN     
  2 |   1    |  2020-08-20 20:00:00   | OUT    
  3 |   1    |  2020-08-20 21:00:00   | IN     
  4 |   1    |  2020-08-21 01:00:00   | OUT    
  5 |   1    |  2020-08-21 16:00:00   | IN     
  6 |   1    |  2020-08-21 19:00:00   | OUT

I need to calculate attendance grouped by date and u_id, as shown below.
Note: the query parameters are "from date", "to date" and "company ID".

u_id |   u_name  |     date    |        in_time       |        out_time      | hrs 
-----+-----------+-------------+----------------------+----------------------+-----
 1   |  XXXXXXXX | 2020-08-20  |  2020-08-20 16:00:00 |  2020-08-21 01:00:00 |  7  
 1   |  XXXXXXXX | 2020-08-21  |  2020-08-21 16:00:00 |  2020-08-21 19:00:00 |  4  
 2   |  YYYYYYYY |     null    |        null          |        null          |  0

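Is this possible in PostgreSQL?

Solution

Using the lead window function makes this simpler and more readable. The approach works for attendance events whose IN and OUT entries are balanced; otherwise the attendance time is null. That makes sense, because in that case the person has not left yet, did not come in at all, or the attendance data is corrupt.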

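select u.id as u_id,
       u.u_name,
       t.date_in as date,
       t.t_in as in_time,
       t.t_out as out_time,
       extract('hour' from t.t_out - t.t_in) as hrs
from users u
left outer join 
(
  -- pair every swipe with the next swipe of the same user
  select u_id,
         date_trunc('day', swipe_time) as date_in,
         swipe_time as t_in,
         lead(swipe_time, 1) over (partition by u_id order by swipe_time) as t_out,
         status
  from attendance
) t 
  -- the status filter belongs in the join condition: in a WHERE clause it
  -- would drop users without any swipes and undo the outer join
  on u.id = t.u_id and t.status = 'IN';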
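One detail of the query above is worth illustrating: `extract('hour' from ...)` on an interval returns only the hours field, not the total number of hours. A minimal sketch of the difference, using made-up timestamps (not taken from the sample data) that lie more than 24 hours apart:

```sql
-- The two timestamps are hypothetical; their difference is the
-- interval '1 day 02:30:00'.
select extract(hour from d)          as hour_field,   -- 2 (hours field only)
       extract(epoch from d) / 3600  as total_hours   -- 26.5 (whole interval)
from (
  values (timestamp '2020-08-22 03:30:00' - timestamp '2020-08-21 01:00:00')
) as t(d);
```

For IN/OUT pairs shorter than 24 hours with whole-hour swipe times, as in the sample data, both expressions give the same number.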

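The tricky part is expanding a row that covers two (calendar) days into two rows and correctly assigning the hours that belong to the "next day".

The first step is to get a pivot that combines each IN/OUT pair into a single row.

A simple (but not very efficient) approach is: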

  select ain.u_id,
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'

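The next step is to split every row that spans more than one day into two rows.

This assumes that your IN/OUT pairs never span more than two days!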

with inout as (
  select ain.u_id,
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, time_in::date as "date", time_in, time_out
  from inout
  where time_in::date = time_out::date
  union all
  select i.u_id, x.time_in::date as date, x.time_in, x.time_out
  from inout i
    cross join lateral (
       select i.u_id, i.time_in, i.time_in::date + 1 as time_out
       union all
       select i.u_id, i.time_out::date, i.time_out
    ) x
  where i.time_out::date > i.time_in::date
)
select *
from expanded;

For your sample data, the above returns the following:

u_id | date       | time_in             | time_out           
-----+------------+---------------------+--------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-20 20:00:00
   1 | 2020-08-20 | 2020-08-20 21:00:00 | 2020-08-21 00:00:00
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 01:00:00
   1 | 2020-08-21 | 2020-08-21 16:00:00 | 2020-08-21 19:00:00

How does this work?

First, we select all rows that start and end on the same day with this part:

  select u_id, time_in::date as "date", time_in, time_out
  from inout
  where time_in::date = time_out::date

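The second part of the union splits the rows that span two days. It uses a cross join that generates one row running from the original start time to midnight, and another row running from midnight to the original end time: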

  select i.u_id, x.time_in::date as date, x.time_in, x.time_out
  from inout i
    cross join lateral (
       -- this generates a row for the first of the two days
       select i.u_id, i.time_in, i.time_in::date + 1 as time_out
       union all
       -- this generates the row for the next day
       select i.u_id, i.time_out::date, i.time_out
    ) x
  where i.time_out::date > i.time_in::date
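If an IN/OUT pair can span more than two calendar days, the two-row split is no longer enough. One possible generalization, offered only as a sketch and not part of the approach above, generates one row per calendar day with `generate_series` and clips each row to that day. The single sample pair and the alias `g(d)` are assumptions made for illustration:

```sql
-- Hypothetical sample pair spanning four calendar days.
with inout(u_id, time_in, time_out) as (
  values (1, timestamp '2020-08-20 21:00:00', timestamp '2020-08-23 01:00:00')
)
select i.u_id,
       g.d::date                                   as "date",
       greatest(i.time_in, g.d)                    as time_in,
       least(i.time_out, g.d + interval '1 day')   as time_out
from inout i
  cross join lateral generate_series(
       date_trunc('day', i.time_in),   -- midnight of the first day
       date_trunc('day', i.time_out),  -- midnight of the last day
       interval '1 day'
  ) as g(d)
-- skip the empty piece produced when time_out falls exactly on midnight
where greatest(i.time_in, g.d) < least(i.time_out, g.d + interval '1 day');
```

Each generated day `d` contributes the part of the pair that falls between `d` and the following midnight, so the rows can be aggregated per user and date in the same way as the two-day version.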

Finally, the new "expanded" rows are aggregated by grouping them by user and date, and left joined to the users table to also get the user name.

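with inout as (
  select ain.u_id,
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, time_in::date as "date", time_in, time_out
  from inout
  where time_in::date = time_out::date
  union all
  select i.u_id, x.time_in::date as date, x.time_in, x.time_out
  from inout i
    cross join lateral (
       select i.u_id, i.time_in, i.time_in::date + 1 as time_out
       union all
       select i.u_id, i.time_out::date, i.time_out
    ) x
  where i.time_out::date > i.time_in::date
)
select u.id as u_id, e."date", min(e.time_in) as time_in, max(e.time_out) as time_out, sum(e.time_out - e.time_in) as duration
from users u
  left join expanded e on u.id = e.u_id
group by u.id, e."date"
order by u.id, e."date";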

This results in:

u_id | date       | time_in             | time_out            | duration                                     
-----+------------+---------------------+---------------------+----------------------------------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-21 00:00:00 | 0 years 0 mons 0 days 7 hours 0 mins 0.0 secs
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 19:00:00 | 0 years 0 mons 0 days 4 hours 0 mins 0.0 secs

The "duration" column is an interval, which you can format to your liking.
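As a small illustration of formatting the interval, the sketch below renders a duration as hours and minutes with `to_char`. The literal interval stands in for the `duration` column and is an assumption made to keep the example self-contained:

```sql
-- interval '7 hours' stands in for a value of the duration column
select to_char(interval '7 hours', 'HH24:MI') as duration_hh_mi;  -- 07:00
```

In the final query the same call would wrap the `sum(e.time_out - e.time_in)` expression.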


Tags: postgresql-9.4, sql