Guidelines for Producing GTFS Static Data for Transit

GTFS Static Data is a comprehensive representation of scheduled service on your network. Producing GTFS doesn’t have to be a headache! By following a few guidelines, you can make sure that your GTFS data is represented in Transit (and other apps) as intended, and that you’re always providing public transit riders with the best, clearest, and most up-to-date information possible. To see how GTFS is displayed, check How Transit Displays GTFS Data.


In this article

Integration of New Transit Networks

Dataset Publishing

Dataset Formatting

Required GTFS files and fields

Optional GTFS files and fields

Beyond GTFS

Useful Links

Contact


Integration of New Transit Networks

  • If you work for (or on behalf of) a transit agency, you may request that your agency’s transit network be integrated into Transit. Please email partners@transitapp.com with data@transitapp.com in cc.
    • In your email, you must include a URL linking to the GTFS dataset of your agency’s transit network. Requests without GTFS datasets will not be considered.
    • In your email, you should also include any URLs linking to real-time data feeds. Real-time data must match the GTFS dataset IDs (e.g. route, trip and stop IDs).
    • In your email, you must share with us the email address of a contact to whom we can send any feedback about your agency’s GTFS data. This contact should be the one who takes care of your agency’s GTFS data production, and the email will ideally be one that doesn’t change with employee turnover (e.g. in our case, data@transitapp.com).
  • If you have already talked to our Partnerships team about integrating your agency’s transit network into Transit, please send the URL linking to your GTFS dataset to data@transitapp.com at least two weeks before the planned launch in the app.

Dataset Publishing

How to share static data

  • Any dataset must be accessible via a permanent and static URL so our system can pick it up automatically. That means the case and characters must remain the same in the URL over time.
    • The URL should be publicly accessible and not behind any geolocation restriction or firewall.
    • A good example of this is our home agency STM, where new GTFS is published at https://www.stm.info/sites/default/files/gtfs/gtfs_stm.zip and the link URL is never modified.
    • If you share your GTFS through an API, it may be possible for Transit to integrate it if we are given the appropriate keys and access. We cannot guarantee that this is possible, however, and it results in worse performance than a stable public URL endpoint. We recommend that agencies require vendors to publish data as GTFS rather than through an API, as this will make the data more widely available.
    • If you do not have GTFS posted to a URL, it is possible for Transit to integrate .zip datasets that are sent via email to data@transit.app, but only on special or emergency occasions and not on a regular basis.

How often Transit fetches static data

  • New datasets published at the permanent and static URL will be automatically fetched by Transit within 4 hours.
  • Once fetched, a new dataset takes on average 2 hours to be uploaded in the app if it does not contain any errors or trigger any issues on our end. Note that this time may vary, as upload time greatly depends on the size of the dataset.

When to publish new static data

  • Recommended, according to GTFS Best Practices - at least 7 days. Any new dataset must be published at least 7 days before new schedules take effect.
    • Providing new data 7 days in advance will help ensure that all Transit app features such as holiday service notifications work properly, and allows your riders to plan trips in advance.
    • Errors in data production may arise on your end, so aiming for a 7-day buffer helps ensure things get fixed in time.
  • Minimum timeline to guarantee Transit’s data quality checks - at least 4 days
    • Providing new data at least 4 days in advance guarantees that our team can perform manual data quality verification if needed. Data consumption and cleaning take time to work through our pipeline and may trigger manual checks or improvements. It takes time on your side to produce the data; it also takes time on our side to properly consume it!
  • For minor dataset changes - up to same day
    • In case of minor schedule revisions, our system should detect the latest dataset and process it automatically within a few hours (assuming there are no errors). If the last-minute changes are important, email us at data@transit.app to let our team know.
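
The timelines above can be turned into a simple pre-release check. The sketch below is only an illustration: the function name `classify_lead_time` and the returned labels are hypothetical, and only the 7-day and 4-day thresholds come from the guidelines above.

```python
from datetime import date

RECOMMENDED_LEAD_DAYS = 7  # recommended lead time stated above
MINIMUM_LEAD_DAYS = 4      # minimum lead time for manual quality checks stated above


def classify_lead_time(publish_date: date, effective_date: date) -> str:
    """Label how much lead time a dataset gives before its schedules take effect."""
    lead_days = (effective_date - publish_date).days
    if lead_days >= RECOMMENDED_LEAD_DAYS:
        return "recommended"
    if lead_days >= MINIMUM_LEAD_DAYS:
        return "minimum"
    return "late"  # same-day or last-minute publication


# Example: published on June 1 for schedules starting June 10 (9 days of lead time).
print(classify_lead_time(date(2024, 6, 1), date(2024, 6, 10)))
```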

The most common cause of rider confusion is day-of or late static data uploads that result in the wrong information being displayed in the app.

How long static data should be valid

  • Any dataset must have valid schedules for a period of at least 7 days, and preferably for 30 days or the entire service period (use feed_start_date and feed_end_date in feed_info.txt to declare this period).
  • A dataset should not be valid for more than a year for schedule reliability reasons.
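
As a rough illustration of these validity rules, the sketch below checks the span between feed_start_date and feed_end_date. The function names are hypothetical, and counting both dates inclusively and treating "more than a year" as more than 366 days are assumptions made for this example.

```python
from datetime import date, datetime


def parse_gtfs_date(value: str) -> date:
    """Parse a GTFS date in YYYYMMDD format."""
    return datetime.strptime(value, "%Y%m%d").date()


def check_validity_window(feed_start_date: str, feed_end_date: str) -> list[str]:
    """Return warnings about the validity period declared in feed_info.txt."""
    start = parse_gtfs_date(feed_start_date)
    end = parse_gtfs_date(feed_end_date)
    days = (end - start).days + 1  # assumption: both dates are inclusive
    problems = []
    if days < 7:
        problems.append("valid for fewer than 7 days")
    elif days < 30:
        problems.append("valid for fewer than the preferred 30 days")
    if days > 366:  # assumption: "more than a year" means more than 366 days
        problems.append("valid for more than a year")
    return problems


print(check_validity_window("20240101", "20240103"))
```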

When we receive future schedules, Transit extends the active schedules until the start date of the upcoming schedules, such that the new schedules are only displayed in the app the day they go into service. Please ensure that the new dataset’s start dates in feed_info.txt, calendar.txt and calendar_dates.txt match the day the new schedules go into effect. Do not include calendar_dates in the past, as they will cause merging issues. If a dataset expires and there is no new dataset published, Transit will extend the schedules for the expired dataset for one month into the future. We do both these things to ensure users can always see schedules in the app. However, if schedules are not updated within a reasonable time after expiry, they are eventually removed from the app.

Dataset Formatting

GTFS Specification and Best Practices

  • Transit integrates the transit network’s data provided in GTFS. Any GTFS dataset must follow the GTFS specification closely.
    • Transit advises data producers to read and follow the GTFS specification as stated in the document reference.md in the GTFS GitHub repository, as it is always the most up-to-date document.
    • Any GTFS dataset must follow the File Requirements, must include all required files and fields, and should include any other optional files and fields when applicable, for the sake of helping transit riders as much and as accurately as possible in their journeys.
    • Transit advises data producers to check the compliance of any GTFS dataset with the GTFS specification before its release, using the canonical GTFS Validator from MobilityData’s GitHub repository.

If a GTFS dataset does not follow the GTFS specification, this dataset will be rejected by our system and its schedules will not be displayed in Transit.

If you work for (or on behalf of) a transit agency whose network is already available in Transit and you are unsure about how to model GTFS data, don’t hesitate to reach out to us at data@transitapp.com.

Data Consistency

  • Consistent Data Formatting: Ensure datasets remain consistent across all iterations. While automated formatting rules may be applied by Transit, inconsistencies can lead to issues in formatting.
  • Alignment with Real-World Information: All GTFS rider-facing information (e.g., names, headsigns, colors) should accurately reflect what riders encounter in person and on digital materials.

Transit may change values to improve data display for riders. However, Transit is not responsible for any dataset inaccuracies and cannot guarantee fixing issues on behalf of transit agencies.


Required GTFS files and fields

The table below outlines the minimum required or conditionally required GTFS files and fields for a functional and accurate dataset. For information on how Transit displays these fields, refer to How Transit Displays GTFS Data.

Minimum Required Files | Required Fields
agency.txt | agency_name, agency_url, agency_timezone
stops.txt | stop_id, stop_name, stop_lat, stop_lon
routes.txt | route_id, route_type, route_short_name or route_long_name (conditionally required)
trips.txt | route_id, service_id, trip_id
stop_times.txt | trip_id, arrival_time, departure_time, stop_id, stop_sequence
calendar.txt* | service_id, monday, tuesday, wednesday, thursday, friday, saturday, sunday, start_date, end_date

*Either calendar.txt or calendar_dates.txt may be used. calendar.txt is usually used to define regular weekday and weekend services, whereas calendar_dates.txt is used to describe exceptions to regular service, such as holidays or events. We strongly recommend including both for human readability.
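
A quick way to catch missing files or columns before publishing is to inspect the zip directly. The sketch below is a minimal illustration and not a replacement for the canonical GTFS Validator: the function names are hypothetical, it only checks the fields listed in the table above, and it assumes the files sit at the root of the zip.

```python
import csv
import io
import zipfile

# Required fields per file, copied from the table above.
REQUIRED_FIELDS = {
    "agency.txt": ["agency_name", "agency_url", "agency_timezone"],
    "stops.txt": ["stop_id", "stop_name", "stop_lat", "stop_lon"],
    "routes.txt": ["route_id", "route_type"],
    "trips.txt": ["route_id", "service_id", "trip_id"],
    "stop_times.txt": ["trip_id", "arrival_time", "departure_time",
                       "stop_id", "stop_sequence"],
}


def read_header(archive: zipfile.ZipFile, filename: str) -> list[str]:
    """Return the column names from the first line of a file in the zip."""
    with archive.open(filename) as handle:
        first_line = handle.readline().decode("utf-8-sig")
    return [name.strip() for name in next(csv.reader([first_line]), [])]


def find_missing(zip_bytes: bytes) -> list[str]:
    """List missing required files and fields in a GTFS zip."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
        for filename, fields in REQUIRED_FIELDS.items():
            if filename not in names:
                problems.append(f"missing file: {filename}")
                continue
            header = read_header(archive, filename)
            problems += [f"{filename}: missing field {field}"
                         for field in fields if field not in header]
            # routes.txt needs at least one of the two name fields.
            if filename == "routes.txt" and not (
                    {"route_short_name", "route_long_name"} & set(header)):
                problems.append("routes.txt: needs route_short_name or route_long_name")
        if not {"calendar.txt", "calendar_dates.txt"} & names:
            problems.append("missing file: calendar.txt or calendar_dates.txt")
    return problems
```

Calling `find_missing(open("gtfs.zip", "rb").read())` on a local copy of a dataset (the file name here is only an example) returns an empty list when nothing from the table is missing.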

agency.txt

  • Ensure that the agency_timezone field is accurate to your area. You can look here to see the correct time zone for your agency.
  • Don’t forget the agency_url! It’s required in GTFS.

stops.txt

  • stop_name
    • If any of the symbols/words (+, &, @, at) appear between words, Transit replaces them with "/" to ensure consistency and readability of stop_names across transit networks within the app
    • This field should be less than 50 characters.
    • This field should not include redundant information such as the words “stop” or “station” unless necessary
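
To preview how a stop_name might look after the separator replacement described above, a producer could run something like the sketch below. Transit's actual rules are not published in this article, so the regular expression, the " / " spacing, and the function names are all assumptions made for illustration.

```python
import re

# Assumption: a separator counts as "between words" when it has non-space
# text on both sides; "at" must also be surrounded by whitespace.
_SEPARATORS = re.compile(r"(?<=\S)(?:\s*[+&@]\s*|\s+at\s+)(?=\S)", flags=re.IGNORECASE)


def normalize_stop_name(name: str) -> str:
    """Replace +, &, @ and the word 'at' between words with ' / '."""
    return _SEPARATORS.sub(" / ", name.strip())


def stop_name_warnings(name: str) -> list[str]:
    """Flag stop names that conflict with the guidance above."""
    warnings = []
    if len(name) >= 50:
        warnings.append("50 characters or more")
    if re.search(r"\b(?:stop|station)\b", name, flags=re.IGNORECASE):
        warnings.append("contains 'stop' or 'station'")
    return warnings


print(normalize_stop_name("5th & Main"))
print(stop_name_warnings("Central Station"))
```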

routes.txt

First, you should determine whether a route is unidirectional or bidirectional. This also applies to loop and lasso routes.

  • One way to determine whether a route is unidirectional or bidirectional is to put yourself in the shoes of a rider on the route.
    • If, along most of the route, the rider can walk across the street, or a block away (in the case of one-way streets), to catch vehicles of the same route going in the opposite direction, the route is bidirectional.
    • Otherwise, the route is unidirectional.
  • Any bidirectional route should have 2 groups of trips. Transit recommends providing the field trips.direction_id, and the trips of the same group must have the same trips.direction_id value (either 0 or 1).
    • If the bidirectional route is shaped like a straight line, one group of trips represents outbound trips (or direction "A") and the other group of trips represents inbound trips (or direction "B").
    • If the bidirectional route is shaped like a loop, one group of trips represents clockwise trips (or direction "A") and the other group of trips represents counterclockwise trips (or direction "B").
    • If the bidirectional route is shaped like a lasso (or lollipop), and even if the route has only one actual terminus (the vehicle may have no dead time or timepoint in the loop section of the lasso, and the riders may stay onboard), there should be at least 2 groups of trips.
      • Two termini should exist in the data, one of which should be located in the loop section of the lasso.
      • One group of trips represents trips heading to the loop section of the lasso (or direction "A") and the other group of trips represents trips leaving from the loop section of the lasso (or direction "B").
      • All trips "A" and trips "B" operated in a row by the same vehicle should be linked with either in-seat trip transfers (transfers.transfer_type=4) when riders may stay onboard the vehicle, or off-seat trip transfers (transfers.transfer_type=5) when riders must disembark. trips.block_id may be used instead, but it is not explicit regarding in-seat or off-seat situations.
  • Any bidirectional routes, including routes shaped like a loop or like a lasso (or lollipop), should provide different headsign values for each direction.
    • The riders must not see the same headsign (e.g. "Loop" or "Downtown") in both directions as they will get confused on which vehicle to board.
    • In the case of a bidirectional loop, the trip_headsign values "clockwise" and "counterclockwise" may be used (or anything else that has the same meaning). Alternatively, stop_headsign values can be populated with, in one direction, "A via X" until A is reached, then "B via Y"; and, in the other direction, "A via Y" until A is reached, then "B via X".
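
The rule that riders must not see the same headsign in both directions is easy to check automatically. The sketch below assumes trips are available as dictionaries read from trips.txt (for example with csv.DictReader); the function name and the case-insensitive comparison are choices made for this example.

```python
from collections import defaultdict


def routes_with_shared_headsigns(trips: list[dict]) -> list[str]:
    """Return route_ids whose two directions share at least one trip_headsign."""
    headsigns = defaultdict(lambda: defaultdict(set))
    for trip in trips:
        # Compare headsigns case-insensitively and ignore surrounding spaces.
        label = trip["trip_headsign"].strip().lower()
        headsigns[trip["route_id"]][trip["direction_id"]].add(label)

    flagged = []
    for route_id, by_direction in headsigns.items():
        if by_direction.get("0", set()) & by_direction.get("1", set()):
            flagged.append(route_id)
    return sorted(flagged)


trips = [
    {"route_id": "10", "direction_id": "0", "trip_headsign": "Downtown"},
    {"route_id": "10", "direction_id": "1", "trip_headsign": "Downtown"},
    {"route_id": "20", "direction_id": "0", "trip_headsign": "Clockwise"},
    {"route_id": "20", "direction_id": "1", "trip_headsign": "Counterclockwise"},
]
print(routes_with_shared_headsigns(trips))
```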

A route should encompass all services under the same branding that is communicated to passengers, meaning there should not be a separate route for each direction or itinerary. Routes with multiple branches or stop sequences should be considered as one route if they fall under the same branding. If two routes share a numeral (e.g. 16A & 16B) but are distinct in agency publications, they should be treated as separate routes.

  • route_short_name
    • Required by Transit and used as the main display name
    • Should be concise and contain the briefest designation in use by riders or in agency publications
    • If a route has a numerical short name, this value should go in route_short_name
    • The route_short_name for a single route should stay consistent across all versions of the data. Errors may occur if this is changed.
  • route_long_name
    • Optional, and used by Transit to display additional information about a line when users enter it in the search bar
    • Should not contain the same information as in route_short_name
    • The route_long_name should encompass as much information as possible in a relatively small number of characters.
    • It is best to be as stylistically consistent as possible with route_long_name values within a dataset when using character formatting (-, /, &, @, etc.)

route_short_name | route_long_name
✅ 18 | ✅ Beaubien
✅ Blue | ✅ Campus Counterclockwise Circulator
✅ 604 | ❌ 604
✅ 14 | ❌ Route 14
❌ ndodge | ✅ North Dodge

If either the route_short_name or route_long_name is not provided, Transit will duplicate the populated field into the unpopulated field so that there are always values for both fields.
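
The fallback described above can be pictured with the following sketch. It is not Transit's code; the function name is hypothetical and whitespace-only values are treated as empty by assumption.

```python
def fill_route_names(route: dict) -> dict:
    """Copy the populated name field into the unpopulated one."""
    short_name = (route.get("route_short_name") or "").strip()
    long_name = (route.get("route_long_name") or "").strip()
    filled = dict(route)  # leave the input row untouched
    filled["route_short_name"] = short_name or long_name
    filled["route_long_name"] = long_name or short_name
    return filled


print(fill_route_names({"route_id": "r1", "route_short_name": "18", "route_long_name": ""}))
```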

  • route_type
    • Transit has default display settings based on route_type. This field must always be filled and consistent for the route from one data publication to the next. For example, a train route 1 should not be changed from route_type=1 to 3 as this will change the display in the app.
    • Note that if a route is operated by a bus, even if it is serving as rail replacement, it should have a route_type of 3
  • route_color and route_text_color
    • This should be consistent with any paper or online communication materials made by the transit agency
    • This should not be too bright and should be able to provide enough contrast to be visible to the user
      • Ensure that colors are Web Content Accessibility Guidelines (WCAG) compliant using a contrast checker
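
If an online contrast checker is not at hand, the WCAG contrast ratio between route_color and route_text_color can be computed directly. The sketch below follows the WCAG 2.x relative luminance formula; the function names are hypothetical, and the 4.5:1 threshold mentioned in the comment is the WCAG AA level for normal text, which may not be the exact threshold Transit applies.

```python
def _linear_channel(value: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a GTFS color such as 'FFFFFF' (no leading '#')."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * _linear_channel(r)
            + 0.7152 * _linear_channel(g)
            + 0.0722 * _linear_channel(b))


def contrast_ratio(route_color: str, route_text_color: str) -> float:
    """WCAG contrast ratio, from 1.0 (no contrast) up to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(route_color), relative_luminance(route_text_color)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# WCAG AA asks for at least 4.5:1 for normal-size text.
print(contrast_ratio("FFFFFF", "000000") >= 4.5)
```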

trips.txt

The fields in trips.txt are extremely helpful to clarify complex trip information to Transit users. The fields below are used front and center within the app, so we encourage agencies to make sure they’re as accurate as possible.

  • trip_headsign
    • This field is used by Transit to display the direction or destination of each trip.
    • It should convey information about the direction or destination of the trip, and not the origin or waypoints along the route. Consistency when naming is key to providing a good user experience.
    • It should not contain any information that is already defined elsewhere in the GTFS, such as the route_short_name or route_long_name.
    • For loop routes and routes that have more than one general destination, or hit multiple waypoints along the route, use the stop_headsign field.
    • For routes that have different branches, ensure that the trip_headsign fields reflect the different destinations for each branch
    • Transit will automatically recognize letter branch codes if a trip_headsign begins with a single letter character followed by a hyphen (e.g. A - Queen West)

❌ Trip Headsign | ✅ Trip Headsign
Route name, e.g. (Route 5) | Northbound/Outbound
Generic stop name, e.g. (5th & Main) | Neighborhood name, e.g. (Downtown)
Origin Name to Destination Name | Destination name, e.g. (Verdun)

  • direction_id
    • This field is used by Transit to display the two different directions along a route.
    • It should be populated as accurately as possible to ensure that swiping between directions is possible in the app.
    • In the case of a unidirectional loop route, only one direction can be specified. It is recommended to use stop_headsigns to add clarification for riders.
  • wheelchair_accessible
    • This field should be filled to reflect the accessibility of the trip as a whole.
    • This usually reflects the vehicle serving the trip.
  • bikes_allowed
    • A value of 1 in this field allows the trip planner to consider this trip as part of multimodal trip plans for users who have “Personal Bike” enabled in their settings. These trips are then considered as part of a bike + transit trip plan based on the following criteria:
    • Subway, train, light rail, and ferry trips are considered as long as the departing stop is within 15 km of the user’s origin. In other words, Transit will not suggest that users bike more than 15 km to catch their departing vehicle.
    • Bus trips are evaluated using an algorithm that considers trip distance, speed, and distance to the departing stop. This check is in place to exclude buses that are typically slower than biking for the same distance.
    • Note: These limitations are designed to maintain the processing speed of the trip planner. However, if these limitations are not satisfactory, we can adjust the settings to be either more permissive or more restrictive regarding personal bikes on transit vehicles.
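
To get a feel for the 15 km rule that applies to subway, train, light rail, and ferry trips, the sketch below measures the distance between a rider's origin and the departing stop. It uses straight-line (haversine) distance, which is an assumption: the article does not say how Transit measures the distance, and the bus algorithm is not reproduced here. All names are hypothetical.

```python
import math

MAX_BIKE_DISTANCE_KM = 15.0  # limit stated above for rail and ferry trips
EARTH_RADIUS_KM = 6371.0     # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rail_trip_is_bikeable(bikes_allowed: str, origin: tuple, stop: tuple) -> bool:
    """True when bikes are allowed and the stop is within 15 km of the origin."""
    if bikes_allowed != "1":
        return False
    return haversine_km(*origin, *stop) <= MAX_BIKE_DISTANCE_KM


origin = (45.50, -73.60)  # (lat, lon) of the rider
stop = (45.55, -73.60)    # (lat, lon) of the departing stop, about 5.6 km away
print(rail_trip_is_bikeable("1", origin, stop))
```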

stop_times.txt

The fields in stop_times.txt contain more details that complete and complement the information from trips.txt. The accuracy of stop_times is extremely important as this is the basis of all trips.

Please note that, when provided, stop_headsigns will override trip_headsigns.

  • stop_headsign
    • If the headsign changes along a trip, using stop_headsigns allows you to change the headsign information as the trip is progressing. This is useful when trip_headsigns are not enough to identify the directionality of a trip.
    • stop_headsigns can be assigned on a per-stop basis; however, the headsign at each stop should not be the name of the following stop. Rather, it should change across a group of stops as the trip progresses
    • For loop routes, stop_headsigns can be used to differentiate between different portions of the route depending on how the agency communicates directionality. See more examples in the GTFS best practices.

Here is a good use of stop_headsigns for a bus line that loops across downtown and two distinct neighborhoods.

stop_sequence | ✅ stop_headsign
Until Downtown | Downtown
Until Neighborhood A | Neighborhood A
Until Neighborhood B | Neighborhood B
Until Downtown | Downtown

Here is a bad use of stop_headsigns that change with the name of the next stop.

stop_sequence | ❌ stop_headsign
1 | Saint-Joseph / Saint-Laurent
2 | Casgrain / Maguire
3 | Casgrain / Saint-Viateur
4 | Saint-Viateur / Saint-Laurent


  • pickup_type & drop_off_type
    • These fields are used by Transit to show the availability of service at a particular stop for a specific trip during the day
    • If not specified correctly, the app could route the user to a stop where they will not be picked up by the bus. Or, if they are already on the bus, they may be unable to be dropped off at their stop. This often applies to commuter buses or trains, as well as express routes.

We do not support the pickup_type and drop_off_type values 2: "Must phone agency" and 3: "Must coordinate with driver". A value of 2 will be displayed as if it were a 1, and a value of 3 will be displayed as if it were a 0.
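
In other words, the four possible values collapse to two displayed behaviors. The mapping below is a small sketch of that collapse; the names are hypothetical, and treating an empty value as 0 follows the GTFS default.

```python
# How each pickup_type / drop_off_type value is displayed, per the note above.
DISPLAYED_AS = {
    "0": "0",  # regularly scheduled
    "1": "1",  # no pickup / drop-off available
    "2": "1",  # "Must phone agency" is shown as if it were 1
    "3": "0",  # "Must coordinate with driver" is shown as if it were 0
}


def displayed_value(raw: str) -> str:
    """Return the value effectively shown for a pickup_type or drop_off_type."""
    value = raw.strip() or "0"  # an empty field defaults to 0 in GTFS
    return DISPLAYED_AS[value]


print([displayed_value(v) for v in ("0", "1", "2", "3", "")])
```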

calendar.txt & calendar_dates.txt

The dates and days of operation defined in calendar.txt and in calendar_dates.txt describe the active services in the GTFS. 

  • Calendars should be valid for at least the next 7 days, preferably covering the entire service period.
  • Expired or unused service_ids should not be included
  • It is helpful to both GTFS producers and consumers to provide clearly labelled and descriptive service_id fields when possible.

To describe holiday or special service in the GTFS period, the best method is to use calendar_dates.txt to remove regular service (exception_type = 2) and replace it with special service (exception_type = 1). Ensure that the holiday service_ids reference the correct trips in trips.txt so that holiday service applies to those trips.
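
As an illustration of this remove-and-replace pattern, the sketch below builds the two calendar_dates.txt rows for a single holiday. The service_id values WEEKDAY and HOLIDAY and the function names are made up for the example.

```python
import csv
import io


def holiday_exception_rows(holiday: str, regular_service_id: str,
                           holiday_service_id: str) -> list[dict]:
    """Rows that remove regular service and add holiday service on one date."""
    return [
        {"service_id": regular_service_id, "date": holiday, "exception_type": "2"},
        {"service_id": holiday_service_id, "date": holiday, "exception_type": "1"},
    ]


def to_csv(rows: list[dict]) -> str:
    """Serialize rows using the calendar_dates.txt column names."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["service_id", "date", "exception_type"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


print(to_csv(holiday_exception_rows("20241225", "WEEKDAY", "HOLIDAY")))
```

The trips that should run on the holiday would then use the HOLIDAY service_id in trips.txt.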

Transit has automated checks for large anomalies in GTFS service levels. We do this to prevent bad schedules from making it to riders in the app. Our system compares the daily schedules of the next 7 days from the new dataset with the average daily schedules of the past 4 weeks on a per-route basis. The sensitivity of this check varies depending on the route. Generally, a dataset is rejected if one or more of the following conditions are met:

  • 100% decrease in daily trips on rail routes (route_type 0, 1, or 2)
  • 100% decrease in daily trips on 25% or more of frequent routes (routes with more than 40 trips a day)
  • 100% decrease in daily trips on 75% or more of any routes

These rejected datasets generate an error notice to our team and are not pushed to the app. One of our Transit Analysts will investigate and validate the schedules if the changes are expected, or take corrective action by contacting the agency about the unintended gaps in service.


For smaller decreases in service, our pipeline will trigger a warning notice to our team and the dataset will be uploaded as usual. 
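
Producers who want to anticipate a rejection can run a similar comparison on their own data before publishing. The sketch below is a simplified reading of the three conditions listed above, not Transit's implementation. It assumes that a "100% decrease" means a route that had trips in the past has none in the new dataset, that "frequent" is judged on the past daily average, and that each route is summarized by a single daily trip count. All names are hypothetical.

```python
from dataclasses import dataclass

RAIL_ROUTE_TYPES = {0, 1, 2}
FREQUENT_TRIPS_PER_DAY = 40


@dataclass
class RouteService:
    route_id: str
    route_type: int
    past_daily_trips: float  # average over the past 4 weeks
    new_daily_trips: float   # from the new dataset, over the next 7 days


def lost_all_service(route: RouteService) -> bool:
    """A 100% decrease: the route ran before and has no trips now."""
    return route.past_daily_trips > 0 and route.new_daily_trips == 0


def should_reject(routes: list[RouteService]) -> bool:
    active = [r for r in routes if r.past_daily_trips > 0]
    if not active:
        return False
    lost = [r for r in active if lost_all_service(r)]
    # Condition 1: any rail route lost all of its trips.
    if any(r.route_type in RAIL_ROUTE_TYPES for r in lost):
        return True
    # Condition 2: 25% or more of frequent routes lost all of their trips.
    frequent = [r for r in active if r.past_daily_trips > FREQUENT_TRIPS_PER_DAY]
    frequent_lost = [r for r in frequent if lost_all_service(r)]
    if frequent and len(frequent_lost) / len(frequent) >= 0.25:
        return True
    # Condition 3: 75% or more of all routes lost all of their trips.
    return len(lost) / len(active) >= 0.75


routes = [
    RouteService("1", 1, 200, 0),  # a metro line with no trips in the new data
    RouteService("18", 3, 60, 60),
]
print(should_reject(routes))
```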

If you have a big service change coming up, please email data@transit.app to inform us of the upcoming changes so that our analysts can ensure that your GTFS properly reflects the intended service change.


Optional GTFS files and fields

frequencies.txt 

If you provide frequencies.txt, we will use it to generate a trip_id and stop_times for each departure (based on headway_secs), and populate trips.txt and stop_times.txt accordingly. Please note that we do not support GTFS-RT for frequency-based trips, as the trip_ids in the static and the real-time feeds will not match.
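
To see roughly what this expansion produces, the sketch below turns one frequencies.txt row into a list of departures. The generated trip_id pattern is purely hypothetical (Transit's own naming is not documented here), and excluding a departure at end_time itself is an assumption.

```python
def to_seconds(hhmmss: str) -> int:
    """Convert a GTFS time such as '25:10:00' to seconds since service start."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hhmmss(total: int) -> str:
    """Format seconds as a GTFS time; hours may exceed 24."""
    return f"{total // 3600:02d}:{(total % 3600) // 60:02d}:{total % 60:02d}"


def expand_frequency(trip_id: str, start_time: str, end_time: str,
                     headway_secs: int) -> list[tuple]:
    """Return (generated_trip_id, departure_time) for each departure."""
    start, end = to_seconds(start_time), to_seconds(end_time)
    return [
        (f"{trip_id}_{index}", to_hhmmss(departure))  # hypothetical id scheme
        for index, departure in enumerate(range(start, end, headway_secs))
    ]


print(expand_frequency("T1", "06:00:00", "07:00:00", 1200))
```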

Wheelchair accessibility

The accessibility of a trip depends on the accessibility of both the stop and the vehicle. These are assigned using the following fields:

  • stops.wheelchair_boarding
    • This field indicates whether wheelchair boarding is possible from the stop itself
    • Possible values are 0, 1, or 2, and their interpretation depends on the location_type value of that stop
  • trips.wheelchair_accessible
    • This field reflects the vehicle’s capability to onboard and carry wheelchairs
    • This field should be populated to match the accessibility of the transit agency’s fleet
  • pathways.pathway_mode
    • Pathway modes 2 (Stairs) and 4 (Escalator) will not be displayed for accessible trips where pathways data exists

If you are unable to modify these fields in your GTFS, we can modify all your data to indicate whether all stops and/or all trips are accessible. For example, if 100% of your fleet is wheelchair accessible, we can ensure that your trips appear as such in the app. Just let us know at data@transit.app.

Text accessibility

  • Text-To-Speech fields
    • Transit consumes text-to-speech values from the fields routes.tts_route_short_name, routes.tts_route_long_name, and stops.tts_stop_name.
    • These fields should contain the same information as the parent field, avoiding abbreviations for generic names, such as "St", which could be either "Street" or "Saint". Other examples to avoid include Pr (Park), Ln (Lane), Blvd (Boulevard), and Pl (Plaza or Place).
    • If acronyms are commonly used in speech, they should be kept as-is in the text-to-speech field (e.g. JFK or LAX).
  • translations.txt
    • Currently, Transit does not consume the translations.txt file. If you are interested in displaying translations.txt data in-app, please contact us at data@transit.app.

shapes.txt

  • Transit consumes all shapes provided in a GTFS dataset except the shapes of rail routes. The shapes of rail routes are produced by Transit’s system based on OpenStreetMap data.
  • Shapes are optional in GTFS, so if they are not provided in a GTFS dataset, Transit’s system produces the missing shapes based on OpenStreetMap data. Note that all non-rail shapes must follow a road that exists on OpenStreetMap, i.e. the shape must not pass through a building block, a field, or a lake.
  • In the following circumstances, shapes of non-rail routes provided in a GTFS dataset may be considered invalid by Transit’s system and may not be displayed in the app.
    • shapes.shape_pt_lat and shapes.shape_pt_lon: One or more stops on a trip do not lie within a small distance of the shape for that trip. This includes the first stop and the last stop.
    • shapes.shape_pt_sequence: The values provided in this field do not increase as required. The values must not decrease and must not repeat, i.e. all rows of the same shape in shapes.txt must have different shape_pt_sequence values.
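
The sequence rule can be checked with a few lines of code before publishing. The sketch below assumes that rows are read from shapes.txt in file order (for example with csv.DictReader) and that sequences are compared within each shape_id; the function name is hypothetical.

```python
def shapes_with_bad_sequences(rows: list[dict]) -> list[str]:
    """Return shape_ids whose shape_pt_sequence values decrease or repeat."""
    last_seen = {}  # shape_id -> last shape_pt_sequence read for that shape
    bad = set()
    for row in rows:
        shape_id = row["shape_id"]
        sequence = int(row["shape_pt_sequence"])
        if shape_id in last_seen and sequence <= last_seen[shape_id]:
            bad.add(shape_id)
        last_seen[shape_id] = sequence
    return sorted(bad)


rows = [
    {"shape_id": "A", "shape_pt_sequence": "1"},
    {"shape_id": "A", "shape_pt_sequence": "2"},
    {"shape_id": "B", "shape_pt_sequence": "1"},
    {"shape_id": "B", "shape_pt_sequence": "1"},  # repeated value
]
print(shapes_with_bad_sequences(rows))
```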

transfers.txt

  • In-seat trip transfer
    • In-seat trip transfers occur when the same vehicle is operated on two or more trips in a row, and riders may stay onboard the vehicle when the trip changes
    • In-seat trip transfers allow real-time data propagation between consecutive trips
    • Providing this information is useful for loop or lollipop routes, but it also helps when the route name changes mid-trip
    • Please use transfers.txt and define the from_trip_id and to_trip_id fields, along with transfer_type=4
  • Off-seat trip transfer
    • Off-seat trip transfers occur when the same vehicle is operated on two or more trips in a row, but riders must disembark when the trip changes
    • Off-seat trip transfers allow real-time data propagation between consecutive trips
    • Providing this information is useful for back-and-forth trips
    • Please use transfers.txt and define the from_trip_id and to_trip_id fields, along with transfer_type=5
  • trips.block_id
    • Defining block_id in trips.txt allows real-time data propagation between consecutive trips
    • Since in-seat or off-seat trip transfers are not explicitly defined using the block_id field, Transit will attempt to classify the transfer type based on the interlining behavior. Therefore, we do not recommend solely using this field for expressing transfers, as they may be deduced wrongly. The best practice is to use both block_id and transfers.txt. 
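
For producers who already know which trips a vehicle operates in a row, the corresponding transfers.txt rows can be generated mechanically. The sketch below is one possible way to do it; the function name and the trip_ids are made up, and only from_trip_id, to_trip_id and transfer_type are filled in.

```python
IN_SEAT_TRANSFER = "4"   # riders may stay onboard
OFF_SEAT_TRANSFER = "5"  # riders must disembark


def link_consecutive_trips(trip_ids: list[str], riders_may_stay_onboard: bool) -> list[dict]:
    """Build one transfers.txt row per pair of consecutive trips of a vehicle."""
    transfer_type = IN_SEAT_TRANSFER if riders_may_stay_onboard else OFF_SEAT_TRANSFER
    return [
        {"from_trip_id": current, "to_trip_id": following, "transfer_type": transfer_type}
        for current, following in zip(trip_ids, trip_ids[1:])
    ]


# A lasso route where riders stay onboard between the "A" and "B" trips.
for row in link_consecutive_trips(["lasso_A_1", "lasso_B_1", "lasso_A_2"], True):
    print(row)
```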

pathways.txt

  • Transit consumes information from the pathways.txt file for step-by-step navigation within a trip plan
  • Ensure pathways connect stops with location_type 2 (entrances/exits) and location_type 0 (stops/platforms)
  • All fields of pathways.txt are consumed and processed to check for data errors
  • Only length, pathway_mode and signposted_as are displayed in the app
  • Pathway traversal_time is aggregated and displayed in the trip plan. This value takes precedence over transfer_times defined in transfers.txt
  • Transit does not currently consume levels.txt
  • For detailed examples, see the GTFS Examples on pathways made by gtfs.org

Fares

  • Fares are displayed in the results of Transit’s Trip Planner and in Route Details. GTFS-Fares V2 is supported by Transit. GTFS-Fares V1, based on the fare_attributes.txt and fare_rules.txt files, is not supported by Transit. We highly recommend publishing all Fares V2 files, including the optional rider_categories and fare_media. 
    • At the moment only one rider category can be displayed, so the app uses is_default_fare_category = 1. If the default is not specified, Transit will choose the regular adult fare as the default category.
    • At the moment, only one fare_media can be displayed. The app prioritizes contactless fare pricing (fare_media_type = 3) if it is set. Otherwise, the choice is made to display fares according to the media most riders use. Usually, it is the physical transit card (when applicable). Otherwise, it is cash.

Does your system accept contactless credit and debit cards? Check out our guide to integrating contactless payments into Transit via GTFS.


Beyond GTFS

Transit Home Screen Map

  • On the home screen, Transit displays the map of the network’s main routes, i.e. grade-separated, high-capacity, frequent-service routes.
  • By default, routes with route_type 0, 1, or 2 are visible on Transit’s map. Bus Rapid Transit routes may also be added. Unfortunately, given design and back-end system constraints, all other routes (e.g. frequent bus routes) are not added.
  • The shapes of each route are produced by Transit based on OpenStreetMap data.
  • To learn more about our home screen maps, check out this blog post.

Transit Images

  • See How Transit Displays GTFS to view how custom logos for routes are displayed
  • Branded route logos will appear with the route_color and route_text_color
  • When providing images to display in Transit, ensure that they are always in SVG format, or any other non-proprietary vector format. We do not accept raster images such as JPGs or PNGs.

Useful Links

  • General Transit Feed Specification Reference
    • This is the source-of-truth document about GTFS. Here, you’ll find the most up-to-date information about required and optional data.
  • GTFS Best Practices
    • MobilityData’s Best Practices to follow for data publishing and representation. All of Transit’s Best Practices are compliant with MobilityData’s Best Practices and may overlap.
  • California GTFS Guidelines
    • California has comprehensive GTFS guidelines to provide legible and consistent data. These are helpful supplements to the standard specification and are useful for all transit agencies, regardless of location.
  • Canonical GTFS Validator
    • MobilityData’s Canonical GTFS Validator is a great way to make sure your static GTFS is valid for consumption. We recommend using the validator after you have created your dataset and before you have uploaded it. There is also a desktop app available.

Contact

If you have any questions about how data is used in Transit, let us know at data@transit.app - we’re always happy to help!

Don't work with Transit, but want to? Contact our partnerships team at partners@transit.app.
