Notes taken by Abd @ITNA Digital

Links

🔗 Link to the Video

Keywords

SQL

Table of Contents


Taxi Zone Lookup Data

taxi+_zone_lookup.csv

taxi+_zone_lookup.csv

# this url contains taxi zone data
# assume we have downloaded the file as "taxi+_zone_lookup.csv"
url = "<https://s3.amazonaws.com/nyc-tlc/misc/taxi+_zone_lookup.csv>"

df_zones = pd.read_csv("taxi+_zone_lookup.csv")
df_zones.to_sql(name='zones', con=engine, if_exists='replace') # zones is a new table

Joining the trips db with lookup db

Instead of seeing the numbers as in location id in trips database we are going to join that database with zone location database.

SELECT
  tpep_pickup_datetime,
  tpepe_dropoff_datetime,
  total_amount,
  CONCAT(zpu."Borough", ' / ', zpu."Zone") As "pickup_loc",
  CONCAT(zdo."Borough", ' / ' , zpu."Zone") As "dropoff_loc"
FROM
  yellow_taxi_trips t,
  zones zpu,
  zones zdo
WHERE
  t."PULocationID" = zpu."LocationID" AND
  t."DOLocationID" = zdo."LocationID"
LIMIT 100;

This one way to do JOINs by -

If we want to use JOIN we can do -