How do I do this giant join in SQLite?

Problem description

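I am using Python 3 to analyze some data in a SQLite database file. I want to join all of the tables into one giant table from Python. I have some idea of the Python commands for this, but the SQL statement is too complicated for me. I need help writing the SQL statement that will be executed against the database file. I would also like all of this data output as a pandas DataFrame.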

The SQLite file contains the following tables:

station
     id
     name
     lat
     long
     dock_count
     city
     installation_date
status
     station_id
     bikes_available
     docks_available
     time
trip
     id
     duration
     start_date
     start_station_name
     start_station_id
     end_date
     end_station_name
     end_station_id
     bike_id
     subscription_type
     zip_code
weather
     date
     max_temperature_f
     mean_temperature_f
     min_temperature_f
     max_dew_point_f
     mean_dew_point_f
     min_dew_point_f
     max_humidity
     mean_humidity
     min_humidity
     max_sea_level_pressure_inches
     mean_sea_level_pressure_inches
     min_sea_level_pressure_inches
     max_visibility_miles
     mean_visibility_miles
     min_visibility_miles
     max_wind_Speed_mph
     mean_wind_speed_mph
     max_gust_speed_mph
     precipitation_inches
     cloud_cover
     events
     wind_dir_degrees
     zip_code
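
A layout like the one above can be checked directly from Python before writing any joins. The following is a minimal sketch using only the standard `sqlite3` module; the helper name `describe_tables` is made up for illustration, and the demo runs against a cut-down in-memory copy of two of the tables rather than the real file.

```python
import sqlite3

def describe_tables(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    layout = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        layout[table] = [col[1] for col in cols]
    return layout

# Demo on an in-memory database holding a reduced version of two tables.
# For the real data, connect to the database file instead of ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE station (id INTEGER, name TEXT, city TEXT)")
conn.execute("CREATE TABLE status (station_id INTEGER, bikes_available INTEGER, time TEXT)")

layout = describe_tables(conn)
for table, columns in layout.items():
    print(table, columns)
```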

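I want to join all of the tables into one giant table, then select 1000 trips with all of the joined data included. That means I need to know the foreign keys in the trip table, which are: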

start_date, points to weather and status

start_station_id, points to station

end_date, points to status

end_station_id, points to station

The join I have in mind looks like this:

select 1000 rows from trip join (
    weather where trip.start_date = weather.date as startweather
) and join (
    weather where trip.end_date = weather.date as endweather
) and join (
    station where trip.start_station_id = station.id as startstation
) and join (
    station where trip.end_station_id = station.id as endstation
) and join (
    status where trip.start_station_id = status.station_id and trip.start_date = status.time as startstationstatus
) and join (
    status where trip.end_station_id = status.station_id and trip.end_date = status.time as endstationstatus
)
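
The pseudocode above uses the station table twice, once for the start of the trip and once for the end. In standard SQL this is usually written by giving each use of the table its own alias, and "select 1000 rows" becomes a LIMIT clause. Here is a small sketch of that pattern on made-up data in an in-memory database; the station names and trip ids are invented for the demo, and only the alias technique carries over to the real file.

```python
import sqlite3

# Toy in-memory data; the real tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE trip (id INTEGER PRIMARY KEY,
                   start_station_id INTEGER,
                   end_station_id INTEGER);
INSERT INTO station VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO trip VALUES (10, 1, 2), (11, 2, 2), (12, 2, 1);
""")

# The same table is joined twice under two aliases (startst, endst),
# so each trip row picks up both its start and its end station.
query = """
SELECT tr.id, startst.name AS start_name, endst.name AS end_name
FROM trip AS tr
JOIN station AS startst ON startst.id = tr.start_station_id
JOIN station AS endst   ON endst.id   = tr.end_station_id
ORDER BY tr.id
LIMIT 2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the real data, LIMIT 1000 would take the place of LIMIT 2.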

Solution

I am posting an answer to this question because the query I ended up using shows off many different features of SQLite. This is the query I used:

SELECT count(*)
FROM trip AS tr
INNER JOIN station AS startst ON startst.id = tr.start_station_id
INNER JOIN station AS endst ON endst.id = tr.end_station_id
INNER JOIN weather AS startwea ON startwea.date = substr(tr.start_date, 1, 9)
INNER JOIN status AS ststat
    ON trim(substr(ststat.time, 6, 2), '0') = substr(tr.start_date, 1, instr(tr.start_date, '/') - 1)
    AND trim(substr(ststat.time, 9, 2), '0') = substr(tr.start_date, instr(tr.start_date, '/') + 1, instr(substr(tr.start_date, instr(tr.start_date, '/') + 1), '/') - 1)
    AND substr(ststat.time, 1, 4) = substr(tr.start_date, instr(tr.start_date, ' '), -4)
WHERE tr.id > 0 AND tr.id <= 7000000 AND tr.id % 100000 = 0
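
The question also asked for the result as a pandas DataFrame. One way to get there is `pandas.read_sql_query`, which accepts a `sqlite3` connection directly. The sketch below is not the query above: it is a shortened join over made-up rows, and the function name `load_trips` and the selected columns are chosen only for illustration.

```python
import sqlite3
import pandas as pd

def load_trips(conn, limit=1000):
    """Run a station join and return at most `limit` trips as a DataFrame."""
    query = """
    SELECT tr.id, tr.duration,
           startst.city AS start_city,
           endst.city   AS end_city
    FROM trip AS tr
    JOIN station AS startst ON startst.id = tr.start_station_id
    JOIN station AS endst   ON endst.id   = tr.end_station_id
    ORDER BY tr.id
    LIMIT ?
    """
    # params fills the "?" placeholder, so the limit is bound, not pasted in.
    return pd.read_sql_query(query, conn, params=(limit,))

# Toy in-memory data; for the real file, pass its path to sqlite3.connect.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE trip (id INTEGER PRIMARY KEY, duration INTEGER,
                   start_station_id INTEGER, end_station_id INTEGER);
INSERT INTO station VALUES (1, 'Alpha', 'San Jose'), (2, 'Beta', 'Palo Alto');
INSERT INTO trip VALUES (10, 300, 1, 2), (11, 450, 2, 1), (12, 120, 2, 2);
""")

df = load_trips(conn, limit=2)
print(df)
```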

The query ended up this complicated because I had to join two tables on their date columns, and the date format is different in each column.
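
An alternative to slicing both date strings inside the ON clause is to register a small Python function with SQLite and normalize one side to the other's format. The sketch below assumes trip dates look like "8/29/2013 9:08" and status times look like "2013/08/29 12:06:01"; these formats are inferred from the substr offsets in the query, not confirmed by the data, and the function name `trip_day` is hypothetical. One thing worth knowing about the original approach is that SQLite's two-argument trim strips the given characters from both ends, so trimming "0" from a day such as "10" leaves "1"; parsing the date avoids that case.

```python
import sqlite3
from datetime import datetime

def trip_day(value):
    """Convert 'M/D/YYYY H:MM' (assumed trip format) to 'YYYY/MM/DD'."""
    if value is None:
        return None
    date_part = value.split(" ")[0]
    return datetime.strptime(date_part, "%m/%d/%Y").strftime("%Y/%m/%d")

conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL as trip_day(x).
conn.create_function("trip_day", 1, trip_day)

# Made-up rows in the assumed formats.
conn.executescript("""
CREATE TABLE trip (id INTEGER PRIMARY KEY, start_station_id INTEGER, start_date TEXT);
CREATE TABLE status (station_id INTEGER, bikes_available INTEGER, time TEXT);
INSERT INTO trip VALUES (10, 1, '8/29/2013 9:08'), (11, 1, '10/10/2013 7:55');
INSERT INTO status VALUES (1, 5, '2013/08/29 12:06:01'), (1, 7, '2013/10/10 08:00:00');
""")

# Compare the first 10 characters of status.time with the normalized trip date.
query = """
SELECT tr.id, st.bikes_available
FROM trip AS tr
JOIN status AS st
  ON st.station_id = tr.start_station_id
 AND substr(st.time, 1, 10) = trip_day(tr.start_date)
ORDER BY tr.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

This matches on the calendar day only, so if status holds several readings per station per day, each trip would pair with all of that day's readings unless the time of day is also compared.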
