Route Stats

gtfs_utils.core_computations.compute_route_stats(trip_stats_subset: pandas.core.frame.DataFrame, date: datetime.date, source_files_base_name: List[str], headway_start_time: str = '07:00:00', headway_end_time: str = '19:00:00') → pandas.core.frame.DataFrame

Compute route stats for the given subset of trip stats.

Parameters:
  • trip_stats_subset – Subset of the output of compute_trip_stats()
  • date – The original schedule date
  • source_files_base_name – The original zips the data is based on (GTFS, Tariff, etc.)
  • headway_start_time – HH:MM:SS time string indicating the start time for computing headway stats
  • headway_end_time – HH:MM:SS time string indicating the end time for computing headway stats
Returns:

A DataFrame with columns as described below

Route stats table has the following columns:

  • agency_id - Same as in gtfs_utils.compute_trip_stats()
  • agency_name - Same as in gtfs_utils.compute_trip_stats()
  • all_start_time - All of the start times (formatted as HH:MM:SS) of the trips on the route, separated by semicolons
  • all_stop_code - Same as in gtfs_utils.compute_trip_stats()
  • all_stop_desc_city - Same as in gtfs_utils.compute_trip_stats()
  • all_stop_id - Same as in gtfs_utils.compute_trip_stats()
  • all_stop_latlon - Same as in gtfs_utils.compute_trip_stats()
  • all_stop_name - Names of all stops of the trip (as described in stop_name field in stops.txt file), separated by semicolons
  • all_trip_id - All of the identifiers (trip_id, as specified in trips.txt file) of the trips in the route, separated by semicolons
  • all_trip_id_to_date - All of the trip_id_to_date identifiers that match this route, separated by semicolons
  • cluster_id - Same as in gtfs_utils.compute_trip_stats()
  • cluster_name - Same as in gtfs_utils.compute_trip_stats()
  • cluster_sub_desc - Same as in gtfs_utils.compute_trip_stats()
  • date - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_city - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_desc - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_id - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_lat - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_lon - Same as in gtfs_utils.compute_trip_stats()
  • end_stop_name - Same as in gtfs_utils.compute_trip_stats()
  • end_time - Same as in gtfs_utils.compute_trip_stats(), referring to the last trip of the route
  • end_zone - Same as in gtfs_utils.compute_trip_stats()
  • is_bidirectional - 1 if the route has trips in both directions, otherwise 0
  • is_loop - Same as in gtfs_utils.compute_trip_stats()
  • line_type - Same as in gtfs_utils.compute_trip_stats()
  • line_type_desc - Same as in gtfs_utils.compute_trip_stats()
  • max_headway - The maximal duration (in minutes) between trip starts on the route between headway_start_time and headway_end_time
  • mean_headway - The mean duration (in minutes) between trip starts on the route between headway_start_time and headway_end_time
  • mean_trip_distance - Mean travel distance of the trips on the route, in meters (calculated as service_distance/num_trips)
  • mean_trip_duration - Mean duration of the trips on the route, in hours (calculated as service_duration/num_trips)
  • min_headway - The minimal duration (in minutes) between trip starts on the route between headway_start_time and headway_end_time
  • num_stops - Same as in gtfs_utils.compute_trip_stats()
  • num_trip_ends - Number of trips on the route in the subset with non-null end times before 23:59:59
  • num_trip_starts - Number of trips on the route in the subset with non-null start times
  • num_trips - Number of trips on the route in the subset
  • num_zones - Same as in gtfs_utils.compute_trip_stats()
  • num_zones_missing - Same as in gtfs_utils.compute_trip_stats()
  • peak_end_time - End time of first longest period during which the peak number of trips (peak_num_trips) occurs
  • peak_num_trips - Maximal number of simultaneous trips in the service (for a given direction)
  • peak_start_time - Start time of first longest period during which the peak number of trips (peak_num_trips) occurs
  • route_alternative - Same as in gtfs_utils.compute_trip_stats()
  • route_direction - Same as in gtfs_utils.compute_trip_stats()
  • route_id - Same as in gtfs_utils.compute_trip_stats()
  • route_long_name - Same as in gtfs_utils.compute_trip_stats()
  • route_mkt - Same as in gtfs_utils.compute_trip_stats()
  • route_short_name - Same as in gtfs_utils.compute_trip_stats()
  • route_type - Same as in gtfs_utils.compute_trip_stats()
  • service_distance - Total travel distance of all trips on the route, in meters, where the distance of each trip is its maximal shape_dist_traveled, as specified in stop_times.txt file
  • service_duration - Total duration of all trips on the route in hours
  • service_speed - Average speed of the trips on the route, in km/h
  • source_files - Base names of the files the data is based on (as they are saved on S3)
  • start_stop_city - Same as in gtfs_utils.compute_trip_stats()
  • start_stop_desc - Same as in gtfs_utils.compute_trip_stats()
  • start_stop_id - Same as in gtfs_utils.compute_trip_stats()
  • start_stop_lat - Same as in gtfs_utils.compute_trip_stats()
  • start_stop_lon - Same as in gtfs_utils.compute_trip_stats()
  • start_stop_name - Same as in gtfs_utils.compute_trip_stats()
  • start_time - Same as in gtfs_utils.compute_trip_stats(), referring to the first trip of the route
  • start_zone - Same as in gtfs_utils.compute_trip_stats()
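
The headway columns above (max_headway, mean_headway, min_headway) can be illustrated with a minimal standard-library sketch. It is not the gtfs_utils implementation: the helper names to_seconds and headway_stats are hypothetical, and treating both window boundaries as inclusive and returning None for fewer than two trip starts are assumptions made for the example.

```python
from statistics import mean


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def headway_stats(start_times, headway_start_time="07:00:00",
                  headway_end_time="19:00:00"):
    """Return (max, mean, min) headway in minutes for trip starts in the window.

    Assumption: both window boundaries are inclusive, and None is returned
    when fewer than two trip starts fall inside the window.
    """
    window_start = to_seconds(headway_start_time)
    window_end = to_seconds(headway_end_time)
    in_window = sorted(
        t for t in map(to_seconds, start_times)
        if window_start <= t <= window_end
    )
    if len(in_window) < 2:
        return None
    # Gaps between consecutive trip starts, converted from seconds to minutes.
    gaps = [(later - earlier) / 60
            for earlier, later in zip(in_window, in_window[1:])]
    return max(gaps), mean(gaps), min(gaps)


starts = ["06:30:00", "07:00:00", "07:20:00", "08:00:00", "19:30:00"]
print(headway_stats(starts))  # (40.0, 30.0, 20.0)
```

Only the 07:00:00, 07:20:00 and 08:00:00 starts fall inside the default window, giving gaps of 20 and 40 minutes.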

If trip_stats_subset is empty, return an empty DataFrame.
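
The peak columns (peak_num_trips, peak_start_time, peak_end_time) describe the first longest period during which the maximal number of trips run simultaneously. The following is a hedged sketch of one way to compute such a period with a sweep over start and end events. The function peak_trips and its helpers are hypothetical names, and treating a trip that ends at the same instant another starts as not overlapping is an assumption, not documented gtfs_utils behavior.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Format seconds since midnight as HH:MM:SS."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def peak_trips(trips):
    """Return (peak_num_trips, peak_start_time, peak_end_time) or None.

    trips is a list of (start_time, end_time) HH:MM:SS string pairs.
    """
    events = []
    for start, end in trips:
        events.append((to_seconds(start), 1))   # a trip begins
        events.append((to_seconds(end), -1))    # a trip ends
    # Tuples sort by time first; at equal times -1 sorts before +1, so a
    # trip ending at t is closed before one starting at t (assumption).
    events.sort()

    active = 0
    periods = []  # (count, start, end): maximal runs with a constant count
    for (t, delta), (next_t, _) in zip(events, events[1:]):
        active += delta
        if next_t == t:
            continue  # apply all events at this instant before recording
        if periods and periods[-1][0] == active:
            # Same count as the previous (contiguous) period: extend it.
            periods[-1] = (active, periods[-1][1], next_t)
        else:
            periods.append((active, t, next_t))
    if not periods:
        return None

    peak = max(count for count, _, _ in periods)
    # max() returns the first maximal item, i.e. the first longest period.
    _, start, end = max((p for p in periods if p[0] == peak),
                        key=lambda p: p[2] - p[1])
    return peak, to_hms(start), to_hms(end)


trips = [("07:00:00", "08:00:00"),
         ("07:30:00", "08:30:00"),
         ("07:45:00", "08:15:00")]
print(peak_trips(trips))  # (3, '07:45:00', '08:00:00')
```

All three trips overlap only between 07:45:00 and 08:00:00, so that interval is reported as the peak period.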