Guidelines for Producing GTFS Static Data for Transit
GTFS Static Data is a comprehensive representation of scheduled service on your network. Producing GTFS doesn’t have to be a headache! By following a few guidelines, you can make sure that your GTFS data is represented in Transit (and other apps) as intended, and that you’re always providing public transit riders with the best, clearest, and most up-to-date information possible. To see how GTFS is displayed, check How Transit Displays GTFS Data.
Integration of New Transit Networks
Dataset Publishing and Formatting
Dataset Publishing
- When to publish new static data
- How long static data should be valid for
- How to share static data
- How often Transit updates static data
Dataset Formatting
- GTFS Specification and Best Practices
- Data Consistency
- Unidirectional vs. Bidirectional Routes (Loop and Lasso)
GTFS Field Requirements
Core GTFS
Supplemental GTFS
- Physical Accessibility
- Text Accessibility
- Trip Shapes
- Transit Home Screen Map
- Transit Images
- Transfers and Blocks
- Fares
Useful Links
Contact
Integration of New Transit Networks
- If you work for (or on behalf of) a transit agency, you may request that your agency’s transit network be integrated into Transit. Please send an email to partners@transitapp.com with data@transitapp.com in cc.
- In your email, you must include a URL linking to the GTFS dataset of your agency’s transit network. Requests without GTFS datasets will not be considered.
- In your email, you should also include any URLs linking to real-time data. Real-time data must match the GTFS dataset IDs.
- In your email, you must share with us the email address of a contact to whom we can send any feedback about your agency’s GTFS data. This contact should be the one who takes care of your agency’s GTFS data production, and the email will ideally be one that doesn’t change with employee turnover (e.g. in our case, data@transitapp.com).
- If you have already talked to our Partnerships team about integrating your agency’s transit network into Transit, please send the URL linking to your GTFS dataset to data@transitapp.com at least two weeks before the planned launch in the app.
Dataset Publishing and Formatting
Dataset Publishing
When to publish new static data
- Any new dataset must be published at the link at least 7 days in advance of new schedules taking effect to make sure everything is working by the data’s start date.
- Publishing new schedules less than 4 days in advance may result in your data not being updated in time in the app, as data consumption and cleaning requires a number of hours to work through our pipeline and may trigger manual checks or improvements. Data consumption is not instantaneous!
- Providing data 7 days in advance will help ensure that all Transit features (such as notifying users of holiday service changes) work properly.
- Errors may happen both in data production and in data consumption, so a buffer helps get everything fixed before new schedules start. The most common cause of scheduling issues and rider confusion is day-of or late static data uploads.
- In case of last-minute changes or minor schedule revisions, you may provide the new dataset as soon as it’s available. If the last-minute changes are important, you may email us at data@transitapp.com so we can make sure they are displayed in the app. However, routinely providing schedule changes at the last minute is not compliant with these guidelines.
- Providing schedules before they go live allows your riders to plan trips in advance.
- If, for a specific reason, your agency is unable to consistently publish datasets in advance, contact data@transitapp.com to discuss possible workarounds.
How long static data should be valid for
- Any dataset must have valid schedules for a period of at least 7 days, and preferably for 30 days or the entire service period. (Use feed_start_date and feed_end_date in feed_info.txt)
- A dataset should not be valid for more than a year for schedule reliability reasons.
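The validity rules above can be checked before publishing. The following is a minimal sketch, not Transit's own validation: the function names are hypothetical, and treating "more than a year" as more than 366 days is an assumption made for illustration.

```python
from datetime import date, datetime


def parse_gtfs_date(value: str) -> date:
    """Parse a GTFS date string in YYYYMMDD form."""
    return datetime.strptime(value, "%Y%m%d").date()


def check_validity_window(feed_start_date: str, feed_end_date: str) -> list[str]:
    """Return a list of problems with the feed_info.txt validity window."""
    start = parse_gtfs_date(feed_start_date)
    end = parse_gtfs_date(feed_end_date)
    # Both dates are inclusive, so a feed starting and ending on the
    # same day covers 1 day.
    span_days = (end - start).days + 1

    problems = []
    if span_days < 7:
        problems.append(f"valid for only {span_days} days; at least 7 are required")
    # Assumption: "more than a year" is read as more than 366 days.
    if span_days > 366:
        problems.append(f"valid for {span_days} days; should not exceed a year")
    return problems


print(check_validity_window("20240101", "20240105"))
print(check_validity_window("20240101", "20240331"))
```

An empty list means the window satisfies both limits. The preference for 30 days or a full service period is not enforced here, since it is a recommendation rather than a requirement.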
How to share static data
- Any dataset must be accessible via a permanent and static URL so the app can pick it up automatically. That means the case and characters must remain the same in the URL over time.
- The URL should be publicly accessible.
- A good example of this is our home agency STM, where new GTFS is published at https://www.stm.info/sites/default/files/gtfs/gtfs_stm.zip and the link URL is never modified.
- If you do not have GTFS posted to a URL, it is possible for Transit to integrate .zip datasets that are sent via other means, but only on special or emergency occasions and not on a regular basis.
How often Transit updates static data
- Any new dataset published at a permanent and static URL will be automatically fetched by Transit within 24 hours.
- Usually, Transit fetches data every night at 1:00 AM in the transit network’s local time. For example: Transit fetches data from Montreal’s STM source at 1:00 AM Eastern Time and Transit fetches data from Vancouver’s TransLink source at 1:00 AM Pacific Time.
- Upon request, Transit may fetch data at another time.
- Upon request, Transit may fetch data more than once a day.
- Once fetched, a new dataset takes on average 2 hours to be uploaded in the app if it does not contain any errors. Note that this time may vary, as upload time greatly depends on the size of the dataset.
Dataset Formatting
GTFS Specification and Best Practices
- Transit integrates the transit network’s data provided in GTFS. Any GTFS dataset must follow the GTFS specification closely.
- Transit advises data producers to read and follow the GTFS specification as stated in the document reference.md from Google Transit’s GitHub repository, as it is always the most up-to-date document.
- Any GTFS dataset must follow the File Requirements, must include all required files and fields, and should include any other optional files and fields when applicable, for the sake of helping transit riders as much and accurately as possible in their journeys.
- If a GTFS dataset does not follow the GTFS specification, it will very likely be rejected by our system and its schedules will not be displayed in Transit.
- Transit advises data producers to check the compliance of any GTFS dataset with the GTFS specification before its release, using the canonical GTFS Validator from MobilityData’s GitHub repository.
- Besides following the GTFS specification closely, any GTFS dataset should also follow the GTFS Best Practices. Transit’s Best Practices are fully compliant with the GTFS Best Practices, which are divided into 3 categories:
- The Dataset Publishing & General Practices, for general advice on GTFS publication.
- The Practice Recommendations Organized by File, for advice on how to efficiently populate files and fields in order to make the information as useful and understandable as possible for riders.
- The Practice Recommendations Organized by Case, for specific advice on how to model loop-shaped and lasso-shaped routes, as well as routes with branches.
Data Consistency
- Data formatting should be consistent throughout dataset iterations. If there are data inaccuracies, Transit may create rules to fix the inaccuracies, and any rules created will be applied to all the subsequent datasets.
- Even if consistent in type, too many inaccuracies may lead to dataset rejections.
- Transit is not responsible for any dataset inaccuracies and does not commit any fixes on behalf of transit agencies.
- Data formatting should be consistent with information the riders see in real life. All GTFS rider-facing strings should be populated with values that correspond to the information the riders need on their journeys.
- All the agency names, route names, route colors, trip and stop headsigns should match the names, colors, and headsigns that riders see at stop poles, in stations, on board vehicles, on transit maps, on printed or PDF schedules, and on the transit agency’s website.
- Transit may change values to better reflect information the riders see in the field.
Unidirectional vs. Bidirectional Routes (Loop and Lasso)
- One way to determine whether a route is unidirectional or bidirectional is to put yourself in the shoes of a rider on the route.
- If, along most of the route, the rider can walk across the street, or a block away (in the case of one-way streets), to catch vehicles of the same route going in the opposite direction, the route is bidirectional.
- Otherwise, the route is unidirectional.
- Any bidirectional route should have 2 groups of trips. Transit recommends providing the field trips.direction_id, and the trips of the same group must have the same trips.direction_id value (either 0 or 1).
- If the bidirectional route is shaped like a straight line, one group of trips represents outbound trips (or direction “A”) and the other group of trips represents inbound trips (or direction “B”).
- If the bidirectional route is shaped like a loop, one group of trips represents clockwise trips (or direction “A”) and the other group of trips represents counterclockwise trips (or direction “B”).
- If the bidirectional route is shaped like a lasso (or lollipop), and even if the route has only one actual terminus (the vehicle may have no dead time or timepoint in the loop section of the lasso, and the riders may stay onboard), there should be at least 2 groups of trips.
- Two termini should exist in the data, one of which should be located in the loop section of the lasso.
- One group of trips represents trips heading to the loop section of the lasso (or direction “A”) and the other group of trips represents trips leaving from the loop section of the lasso (or direction “B”).
- All trips “A” and trips “B” operated in a row by the same vehicle should be linked with either in-seat trip transfers (transfers.transfer_type=4) when riders may stay onboard the vehicle, or off-seat trip transfers (transfers.transfer_type=5) when riders must disembark. The field trips.block_id may be used instead, but it does not state explicitly whether a transfer is in-seat or off-seat.
- Any bidirectional routes, including routes shaped like a loop or like a lasso (or lollipop), should provide different headsign values for each direction.
- The riders must not see the same headsign (e.g. “Loop” or “Downtown”) in both directions as they will get confused on which vehicle to board.
- In the case of a bidirectional loop, the trip_headsign values “clockwise” and “counterclockwise” may be used (or anything else that has the same meaning). Alternatively, stop_headsign values can be populated with, in one direction, “A via X” until A is reached, then “B via Y”; and, in the other direction, “A via Y” until A is reached, then “B via X”.
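The rule that riders must not see the same headsign in both directions lends itself to an automated check. The sketch below is illustrative only: `shared_headsigns` is a hypothetical helper, and it assumes trips have already been read from trips.txt into dictionaries with string values, as `csv.DictReader` would produce.

```python
from collections import defaultdict


def shared_headsigns(trips):
    """Map each route_id to the headsigns used in both directions."""
    # route_id -> direction_id -> set of trip_headsign values
    by_route = defaultdict(lambda: defaultdict(set))
    for trip in trips:
        by_route[trip["route_id"]][trip["direction_id"]].add(trip["trip_headsign"])

    conflicts = {}
    for route_id, directions in by_route.items():
        common = directions.get("0", set()) & directions.get("1", set())
        if common:
            conflicts[route_id] = sorted(common)
    return conflicts


trips = [
    {"route_id": "10", "direction_id": "0", "trip_headsign": "Downtown"},
    {"route_id": "10", "direction_id": "1", "trip_headsign": "Downtown"},
    {"route_id": "20", "direction_id": "0", "trip_headsign": "Clockwise"},
    {"route_id": "20", "direction_id": "1", "trip_headsign": "Counterclockwise"},
]
print(shared_headsigns(trips))  # {'10': ['Downtown']}
```

Route 10 is flagged because "Downtown" appears in both directions, while the loop route 20 passes. A route that relies on stop_headsign values to distinguish directions would need a similar check against stop_times.txt.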
GTFS Field Requirements
Core GTFS
Core GTFS is the minimum GTFS data needed to make a functional GTFS dataset. To see how these fields are shown in Transit, check How Transit Displays GTFS Data.
agency.txt
- Ensure that the agency_timezone field is accurate to your area. You can look here to see the correct time zone for your agency.
- Don’t forget the agency_url! It’s required in GTFS.
stops.txt
- stop_name
- If any of the symbols/words ( +, &, @, at) appear between words, Transit replaces them with “/” to ensure consistency and readability of stop_names across transit networks within the app
- This field should be less than 50 characters.
- This field should not include redundant information such as ‘stop’ or ‘station’ unless necessary
- wheelchair_boarding
- Indicates whether wheelchair boardings are possible from the stop itself.
- Possible values are 0, 1, or 2, depending on the type of the stop, as indicated in the GTFS specification.
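To preview how the stop_name separator replacement described above might affect your stop names, here is a rough sketch. It is not Transit's implementation: the function names are hypothetical, and two details are assumptions, namely that a separator only counts when it has whitespace on both sides and that the replacement is written as " / " with surrounding spaces.

```python
import re

# Assumption: a separator is one of +, &, @ or the word "at",
# surrounded by whitespace on both sides.
_SEPARATOR = re.compile(r"\s+(?:\+|&|@|at)\s+", flags=re.IGNORECASE)


def normalize_stop_name(stop_name: str) -> str:
    """Replace separators between words with a slash."""
    return _SEPARATOR.sub(" / ", stop_name.strip())


def is_short_enough(stop_name: str) -> bool:
    """Check the guideline that stop_name stays under 50 characters."""
    return len(stop_name) < 50


print(normalize_stop_name("Main St & 1st Ave"))  # Main St / 1st Ave
print(normalize_stop_name("King at Queen"))      # King / Queen
print(normalize_stop_name("Battery Park"))       # Battery Park
```

Requiring whitespace around the separator keeps names such as "Battery Park" intact even though they contain the letters "at".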
routes.txt
A route should encompass all services under the same branding that is communicated to passengers, meaning there should not be a separate route for each direction or itinerary. Routes with multiple branches or stop sequences should be considered as one route if they fall under the same branding. If two routes share a numeral (e.g. 16A & 16B) but are distinct in agency publications, they should be treated as separate.
- route_short_name
- Required by Transit and used as the main display name
- Should be concise and contain the briefest designation in use by riders or in agency publications
- If a route has a numerical short name, this value should go in route_short_name
- The route_short_name for a single route should stay consistent across all versions of the data. Errors may occur if this is changed.
- route_long_name
- Required by Transit to display route details from search
- Should not contain the same information as route_short_name
- The route_long_name should encompass as much information as possible in a relatively small number of characters.
- It is best to be as consistent as possible from one route_long_name to the next when using separator characters such as (-, /, &, @)
If either the route_short_name or route_long_name is not provided, Transit will duplicate the populated field into the unpopulated field so that there are always values for both fields.
Good examples
route_short_name | route_long_name |
---|---|
18 | Beaubien |
BxM11 | Wakefield - Midtown |
Blue | Campus Counterclockwise Circulator |
901 | Metro Orange Line |
-- | Ventura County Line |
-- | Blue Line |
62 | -- |
Bad examples
route_short_name | route_long_name |
---|---|
✅ 604 | ❌ 604 |
✅ 14 | ❌ ROUTE 14 |
❌ ndodge | ✅ North Dodge |
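The fallback described above, where a missing route name is copied from the populated one, can be sketched as follows. This only illustrates the stated behaviour and is not Transit's code; `fill_route_names` is a hypothetical helper operating on one routes.txt row held as a dictionary.

```python
def fill_route_names(route: dict) -> dict:
    """Copy the populated name into the empty one, if exactly one is missing."""
    short = (route.get("route_short_name") or "").strip()
    long_ = (route.get("route_long_name") or "").strip()

    filled = dict(route)  # leave the input row untouched
    filled["route_short_name"] = short or long_
    filled["route_long_name"] = long_ or short
    return filled


print(fill_route_names({"route_short_name": "", "route_long_name": "Blue Line"}))
print(fill_route_names({"route_short_name": "62", "route_long_name": ""}))
```

This is also why repeating the short name inside the long name (as in the "604 | 604" bad example) adds nothing: both fields would already show the same value.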
- route_type
- Transit has default display settings based on route_type. This field must always be filled and consistent for the route from one data publication to the next. For example, a train route 1 should not be changed from route_type=1 to 3 as this will change the display in the app.
- Note that if a route is operated by a bus, even if it is serving as rail replacement, it should have a route_type of 3.
- route_color and route_text_color
- This should be consistent with any paper or online communication materials made by the transit agency.
- This should not be too bright and should be able to provide enough contrast to be visible to the user.
- Ensure that colours are Web Content Accessibility Guidelines (WCAG) compliant using a contrast checker.
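As an alternative to an online contrast checker, the WCAG contrast ratio between route_color and route_text_color can be computed directly. The sketch below follows the WCAG definitions of relative luminance and contrast ratio; the function names are hypothetical, and using the 4.5:1 threshold (WCAG level AA for normal-size text) is an assumption about which level applies to your materials.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit hex colour such as 'FFCC00'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(route_color: str, route_text_color: str) -> float:
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    l1 = relative_luminance(route_color)
    l2 = relative_luminance(route_text_color)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(route_color: str, route_text_color: str) -> bool:
    # Assumption: 4.5:1 (AA, normal text) is the relevant threshold.
    return contrast_ratio(route_color, route_text_color) >= 4.5


print(meets_aa("FFFFFF", "000000"))  # True
print(meets_aa("FFFF00", "FFFFFF"))  # False
```

Bright yellow with white text fails, which matches the advice that route colours should not be too bright.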
trips.txt
The fields in trips.txt are extremely helpful to clarify complex trip information to Transit users. The fields below are used front and center within the app, so we encourage agencies to make sure they’re as accurate as possible.
- trip_headsign
- This field is used by Transit to display the direction or destination of each trip.
- It should convey information about the direction or destination of the trip, and not the origin or waypoints along the route. Consistency when naming is key to providing a good user experience.
- It should not contain any information that is already defined elsewhere in the GTFS, such as the route_short_name or route_long_name.
- For loop routes and routes that have more than one general destination, or hit multiple waypoints along the route, use the stop_headsign field.
- For routes that have different branches, ensure that the trip_headsign fields reflect the different destinations for each branch
- Transit will automatically recognize letter branch codes if a trip_headsign begins with a single letter character followed by a hyphen (e.g. A - QUEEN WEST)
- direction_id
- This field is used by Transit to display the two different directions along a route.
- It should be populated as accurately as possible to ensure that swiping between directions is possible in the app.
- In the case of a true loop route, only one direction can be specified. It is recommended to use stop_headsigns to add clarification for riders.
- wheelchair_accessible
- This field should be filled to reflect the accessibility of the trip as a whole.
- This usually reflects the vehicle serving the trip.
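For the letter branch codes mentioned under trip_headsign, the pattern "a single letter followed by a hyphen" can be tested with a short regular expression. This is a guess at the shape of the rule rather than Transit's actual parser: `split_branch_code` is a hypothetical helper, and allowing optional spaces around the hyphen is an assumption.

```python
import re

# Assumption: exactly one letter, optional spaces, a hyphen, optional
# spaces, then the rest of the headsign.
_BRANCH = re.compile(r"^([A-Za-z])\s*-\s*(.+)$")


def split_branch_code(trip_headsign: str):
    """Return (branch_code, destination); branch_code is None if absent."""
    match = _BRANCH.match(trip_headsign.strip())
    if match:
        return match.group(1), match.group(2)
    return None, trip_headsign.strip()


print(split_branch_code("A - QUEEN WEST"))  # ('A', 'QUEEN WEST')
print(split_branch_code("Downtown"))        # (None, 'Downtown')
```

Running a check like this over your trip_headsign values shows which headsigns would be read as carrying a branch code, including any that match by accident.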
stop_times.txt
The fields in stop_times.txt contain more details that complete and complement the information from trips.txt. The accuracy of the stop_times is extremely important as this is the basis of all trips.
- stop_headsign
- Adding stop_headsigns to each stop of a trip allows you to change the headsign information along a trip. This is useful when trip_headsigns are not enough to identify the directionality of a trip
- stop_headsigns can be assigned on a per-stop basis. The headsign at each stop is the name of the following stop.
- For loop routes, stop_headsigns can be used to differentiate between different portions of the route depending on how the agency communicates directionality. See more examples in the GTFS best practices.
stop_headsigns will override trip_headsigns when provided
- pickup_type & drop_off_type
- These fields are used by Transit to show the availability of service at a particular stop for a specific trip during the day
- If not specified correctly, the app could route the user to a stop where they will not be picked up by the bus. Or, if they are already on the bus, they may be unable to be dropped off at their stop. This often applies to commuter buses or trains, as well as express routes.
We do not support the pickup_type and drop_off_type values 2: “Must phone agency” and 3: “Must coordinate with driver”. A value of 2 will be displayed as if it were a 1, and a value of 3 will be displayed as if it were a 0.
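The display rule for unsupported values can be summarised as a small lookup. This sketch only restates the mapping given above; the names are hypothetical, and treating an empty field as 0 follows the GTFS specification's default for pickup_type and drop_off_type.

```python
# How each pickup_type / drop_off_type value is displayed:
# 2 ("Must phone agency") is shown like 1 (not available),
# 3 ("Must coordinate with driver") is shown like 0 (regular).
DISPLAYED_AS = {0: 0, 1: 1, 2: 1, 3: 0}


def displayed_value(raw_value) -> int:
    """Map a raw pickup_type or drop_off_type to the value shown to riders."""
    if raw_value is None or str(raw_value).strip() == "":
        return 0  # empty means regularly scheduled in GTFS
    return DISPLAYED_AS[int(raw_value)]


def can_board(pickup_type) -> bool:
    """True if the stop is shown as available for boarding on this trip."""
    return displayed_value(pickup_type) == 0


print(displayed_value("2"))  # 1
print(displayed_value("3"))  # 0
print(can_board(""))         # True
```

In practice this means a stop that requires phoning the agency appears as having no pickup at all, so agencies relying on value 2 may want to describe that service elsewhere.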
Services
Services are dates and days of operation defined in calendar.txt and in calendar_dates.txt.
- Calendars should be valid for at least the next 7 days, preferably covering the entire service period.
- It is helpful to both GTFS producers and consumers to provide clearly labelled and descriptive service_id fields when possible.
Supplemental GTFS
Physical Accessibility
The accessibility of a trip depends on the accessibility of both the stop and the vehicle. These are assigned using the following fields:
- stops.wheelchair_boarding
- This field indicates whether wheelchair boarding is possible from the stop itself.
- Possible values are 0, 1, or 2, and their meaning depends on the location_type value of that stop.
- trips.wheelchair_accessible
- This field reflects the vehicle’s capability to onboard and carry wheelchairs.
- This field should be populated according to the transit agency’s fleet of vehicles.
- pathways.txt and levels.txt
- Transit consumes information from the pathways.txt file to inform trip plans.
- Transit does not consume the levels.txt file.
- For detailed examples, see the GTFS Examples on pathways made by gtfs.org
Text Accessibility
- Text-To-Speech fields
- Transit consumes text-to-speech values from the fields routes.tts_route_short_name, routes.tts_route_long_name, and stops.tts_stop_name.
- These fields should contain the same information as the parent field, avoiding abbreviations for generic names, such as “St”, which could be either “Street” or “Saint”. Other examples to avoid include Pr (Park), Ln (Lane), Blvd (Boulevard), and Pl (Plaza or Place).
- If abbreviations are commonly used for names, they should be kept as-is in the text-to-speech field (e.g. JFK or LAX).
- translations.txt
- Currently, Transit does not consume the translations.txt file. If you are interested in displaying translations.txt data in-app, please contact us.
Trip Shapes
- Transit consumes all shapes provided in a GTFS dataset except the shapes of rail routes. The shapes of rail routes are produced by Transit’s system based on OpenStreetMap data.
- Shapes are optional in GTFS, so they may not be provided in a GTFS dataset. In this case, Transit’s system produces the missing shapes based on OpenStreetMap data.
- Shapes of non-rail routes provided in a GTFS dataset may be considered invalid by Transit’s system and may not be displayed in the app. In this case, Transit’s system replaces the invalid shapes with new shapes based on OpenStreetMap data. The causes for invalid shapes are described below:
- shapes.shape_pt_lat and shapes.shape_pt_lon
- All stops on a trip must lie within a small distance of the shape for that trip. It includes the first stop and the last stop.
- shapes.shape_pt_sequence and shapes.shape_dist_traveled
- The values provided in these two fields must increase. It means the values must not decrease but also must not repeat, i.e. all rows in shapes.txt must have different shape_pt_sequence and shape_dist_traveled values.
- Note that all shapes must follow a road that exists on OpenStreetMap, i.e. the shape must not pass through a building block, a field, or a lake.
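The requirement that shape_pt_sequence and shape_dist_traveled strictly increase can be verified with a few lines of code. The following is a sketch under stated assumptions, not the check Transit runs: `check_shapes` is a hypothetical helper, rows are dictionaries as produced by `csv.DictReader` over shapes.txt, and an empty shape_dist_traveled is simply skipped.

```python
from collections import defaultdict


def check_shapes(rows):
    """Return (shape_id, shape_pt_sequence, problem) tuples for bad points."""
    points = defaultdict(list)
    for row in rows:
        dist = row.get("shape_dist_traveled") or None
        points[row["shape_id"]].append(
            (int(row["shape_pt_sequence"]), None if dist is None else float(dist))
        )

    problems = []
    for shape_id, pts in points.items():
        # Rows may appear in any order in the file, so sort by sequence first.
        pts.sort(key=lambda p: p[0])
        for (seq_a, dist_a), (seq_b, dist_b) in zip(pts, pts[1:]):
            if seq_b == seq_a:
                problems.append((shape_id, seq_b, "repeated shape_pt_sequence"))
            if dist_a is not None and dist_b is not None and dist_b <= dist_a:
                problems.append((shape_id, seq_b, "shape_dist_traveled does not increase"))
    return problems


rows = [
    {"shape_id": "s1", "shape_pt_sequence": "1", "shape_dist_traveled": "0.0"},
    {"shape_id": "s1", "shape_pt_sequence": "2", "shape_dist_traveled": "120.5"},
    {"shape_id": "s1", "shape_pt_sequence": "3", "shape_dist_traveled": "120.5"},
]
print(check_shapes(rows))
```

The third point is reported because its distance repeats the previous value. Checking that stops lie close to the shape, or that the shape follows an OpenStreetMap road, needs geographic data and is outside this sketch.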
Transit Home Screen Map
- On the home screen, Transit displays the map of the network’s main routes, i.e. grade-separated, high-capacity, frequent-service routes.
- By default, routes with route_type 0, 1, and 2 are visible on Transit’s map. Bus Rapid Transit routes may also be added. Unfortunately, given design and back-end system constraints, frequent bus routes are not added.
- The shapes of each route are produced by Transit based on OpenStreetMap data.
- To learn more about our home screen maps, check out this blog post.
Transit Images
- When providing images to display in Transit, ensure that they are always in SVG format.
- See How Transit Displays GTFS to view how custom logos for routes are displayed.
- Branded route logos will appear with the route_color and route_text_color.
Transfers and Blocks
- In-seat trip transfer
- In-seat trip transfers occur when the same vehicle is operated on two or more trips in a row, and riders may stay onboard the vehicle when the trip changes.
- In-seat trip transfers allow real-time data propagation between consecutive trips.
- Providing this information is useful for loop or lollipop routes, but it also helps when the route name changes mid-trip.
- Please use transfers.txt and define the from_trip_id and to_trip_id fields, along with transfer_type=4.
- Off-seat trip transfer
- Off-seat trip transfers occur when the same vehicle is operated on two or more trips in a row, but riders must disembark when the trip changes.
- Off-seat trip transfers allow real-time data propagation between consecutive trips.
- Providing this information is useful for back-and-forth trips.
- Please use transfers.txt and define the from_trip_id and to_trip_id fields, along with transfer_type=5.
- block_id
- Defining block_id in trips.txt allows real-time data propagation between consecutive trips.
- In-seat or off-seat trip transfers are not explicitly defined using the block_id field. Transit does not recommend using this field for this purpose, as block_id transfers may be interpreted as off-seat transfers by default.
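To illustrate the transfers.txt approach, the sketch below builds trip-to-trip transfer rows for a sequence of trips operated in a row by the same vehicle. The helper names and trip IDs are hypothetical. Only from_trip_id, to_trip_id, and transfer_type are written here; whether your feed also needs from_stop_id and to_stop_id depends on your data and should be checked against the GTFS specification.

```python
import csv
import io

FIELDS = ["from_trip_id", "to_trip_id", "transfer_type"]


def trip_to_trip_transfers(trip_ids, riders_stay_onboard: bool):
    """Link each trip to the next one operated by the same vehicle."""
    # 4 = in-seat transfer, 5 = off-seat transfer (riders must disembark)
    transfer_type = 4 if riders_stay_onboard else 5
    return [
        {"from_trip_id": a, "to_trip_id": b, "transfer_type": transfer_type}
        for a, b in zip(trip_ids, trip_ids[1:])
    ]


def to_csv(rows) -> str:
    """Serialise the rows as transfers.txt content."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical lasso route: the vehicle runs trip "A1", then "B1", then "A2".
print(to_csv(trip_to_trip_transfers(["A1", "B1", "A2"], riders_stay_onboard=True)))
```

Stating the transfer type per pair of trips is what makes this more explicit than block_id, which groups the trips without saying whether riders may stay onboard.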
Fares
- GTFS-Fares V2, based on the fare_leg_rules.txt and fare_transfer_rules.txt files, is supported by Transit. Fares are displayed in the results of Transit’s trip planner.
- Only one rider category can be displayed, so the app displays the fare that the largest group of potential riders can travel with. Most of the time, the default fare is the adult one-way fare.
- Only one fare_media can be displayed, so the choice is made to display fares according to the media most riders use. Usually, the default fare_media is the physical transit card (when applicable); otherwise, it is cash.
Note that GTFS-Fares V1, based on the fare_attributes.txt and fare_rules.txt files, is not supported by Transit.
Useful Links
- General Transit Feed Specification Reference
- This is the source-of-truth document about GTFS. Here, you’ll find the most up-to-date information about required and optional data.
- GTFS Best Practices
- MobilityData’s Best Practices to follow for data publishing and representation. All of Transit’s Best Practices are compliant with MobilityData’s Best Practices and may overlap.
- California GTFS Guidelines
- California has strict GTFS guidelines to provide legible and consistent data. These are helpful supplements to the standard specification and are useful for all transit agencies, regardless of location.
- Canonical GTFS Validator
- MobilityData’s Canonical GTFS Validator is a great way to make sure your static GTFS is valid for consumption. We recommend using the validator after you have created your dataset and before you have uploaded it. There is also a desktop app available.
Contact
If you have any questions about how data is used in Transit, let us know at data@transitapp.com - we’re always happy to help!
Don't work with Transit, but want to? Contact our partnerships team at partners@transitapp.com.