Introduction
The goal of this project is to perform data analytics on Uber data using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and Looker Studio.
Technology Used
Programming Language - Python
Google Cloud Platform - Google Storage, Compute Instance, BigQuery, Looker Studio
Modern Data Pipeline Tool - mage.ai
Contribute to this open source project - github.com/mage-ai/mage-ai
Dataset Used
TLC Trip Record Data
Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.
Here is the dataset used in the video - github.com/darshilparmar/uber-etl-pipeline-..
More information about the dataset can be found here:
Website - nyc.gov/site/tlc/about/tlc-trip-record-data..
Data Dictionary - nyc.gov/assets/tlc/downloads/pdf/data_dicti..
Data Model
passenger_count_dim: passenger_count_id, passenger_count
trip_distance_dim: trip_distance_id, trip_distance
rate_code_dim: rate_code_id, RatecodeID, rate_code_name
payment_type_dim: payment_type_id, payment_type, payment_type_name
datetime_dim: datetime_id, tpep_pickup_datetime, pick_hour, pick_day, pick_month, pick_year, pick_weekday, tpep_dropoff_datetime, drop_hour, drop_day, drop_month, drop_year, drop_weekday
pickup_location_dim: pickup_location_id, pickup_latitude, pickup_longitude
dropoff_location_dim: dropoff_location_id, dropoff_latitude, dropoff_longitude
fact_table: trip_id, VendorID, datetime_id, passenger_count_id, trip_distance_id, rate_code_id, store_and_fwd_flag, pickup_location_id, dropoff_location_id, payment_type_id, fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount
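To make the relationship between the tables above more concrete, here is a minimal sketch of how datetime_dim and a trimmed-down fact_table could be derived from raw trip records. It is an illustration, not the project's actual pipeline code. The two sample trips are made up, the function names build_datetime_dim and build_fact_table are hypothetical, and using the row position as the surrogate key (datetime_id, trip_id) is an assumption. It uses only the Python standard library. A real pipeline would more likely do the same work on pandas DataFrames.

```python
from datetime import datetime

# Made-up sample trips; the column names follow the data model above.
trips = [
    {"VendorID": 1,
     "tpep_pickup_datetime": "2016-03-01 00:00:00",
     "tpep_dropoff_datetime": "2016-03-01 00:07:55",
     "fare_amount": 9.0, "total_amount": 12.35},
    {"VendorID": 2,
     "tpep_pickup_datetime": "2016-03-10 07:07:08",
     "tpep_dropoff_datetime": "2016-03-10 07:31:42",
     "fare_amount": 21.5, "total_amount": 25.3},
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format of the raw timestamps


def build_datetime_dim(rows):
    """Return one datetime_dim row per trip, keyed by the row position."""
    dim = []
    for datetime_id, row in enumerate(rows):
        pickup = datetime.strptime(row["tpep_pickup_datetime"], TIMESTAMP_FORMAT)
        dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIMESTAMP_FORMAT)
        dim.append({
            "datetime_id": datetime_id,
            "tpep_pickup_datetime": pickup,
            "pick_hour": pickup.hour,
            "pick_day": pickup.day,
            "pick_month": pickup.month,
            "pick_year": pickup.year,
            "pick_weekday": pickup.weekday(),  # Monday is 0, Sunday is 6
            "tpep_dropoff_datetime": dropoff,
            "drop_hour": dropoff.hour,
            "drop_day": dropoff.day,
            "drop_month": dropoff.month,
            "drop_year": dropoff.year,
            "drop_weekday": dropoff.weekday(),
        })
    return dim


def build_fact_table(rows):
    """Keep the measures and point at datetime_dim through datetime_id.

    Only the datetime dimension is wired up here; the other *_id columns
    from the data model would be added the same way.
    """
    return [
        {
            "trip_id": i,
            "VendorID": row["VendorID"],
            "datetime_id": i,  # same position-based key as in datetime_dim
            "fare_amount": row["fare_amount"],
            "total_amount": row["total_amount"],
        }
        for i, row in enumerate(rows)
    ]


datetime_dim = build_datetime_dim(trips)
fact_table = build_fact_table(trips)

# Look up the pickup hour of the second trip through its datetime_id.
second_trip = fact_table[1]
print(datetime_dim[second_trip["datetime_id"]]["pick_hour"])
```

Keeping the raw timestamps out of the fact table and referencing them by datetime_id is what lets a BI tool group trips by hour, weekday, or month without re-parsing dates for every query.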