Foursquare ITP prototyped the future of data infrastructure at WMATA by leveraging modern data analytics and implementing TIDES, an open-source transit data specification. WMATA’s intelligent transportation systems (ITS) produce huge volumes of data, but the lack of data format standards and transit data analysis frameworks has limited WMATA’s ability to mine this data for insights into transit performance. Data outages are difficult to detect and triage, and data governance is a challenge for all organizations.
The consultant team transformed agency data from raw source systems into tables that adhere to the cutting-edge Transit ITS Data Exchange Specification (TIDES). At the same time, we prototyped a new enterprise data infrastructure for WMATA that supports this transition and provides deeper insight into data health metrics.
WMATA analysts can now quickly and effectively analyze transit data that has been synthesized from CAD/AVL, APC, and AFC systems into a standard data format, and WMATA is applying this approach on other projects. The project team developed five data pipelines that automatically extract GTFS, AVL, APC, and AFC data, providing raw data for 95 data tables. A suite of more than 300 automated data tests checks data quality so that data consumers receive high-quality data.
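To give a sense of what an automated data test can look like, here is a minimal sketch in Python. It is not the project's actual test suite or tooling: the helper functions, the sample rows, and the column names (`location_ping_id`, `vehicle_id`, `latitude`) are assumptions chosen for illustration and have not been checked against the TIDES schema. Each test returns the indexes of the rows that fail it, so an empty list means the test passed.

```python
from typing import Any, Callable

Row = dict[str, Any]
# A check takes all rows of a table and returns the indexes of failing rows.
Check = Callable[[list[Row]], list[int]]


def not_null(column: str) -> Check:
    """Fail every row where the column is missing or None."""
    return lambda rows: [i for i, r in enumerate(rows) if r.get(column) is None]


def unique(column: str) -> Check:
    """Fail every row whose value repeats one seen in an earlier row."""
    def check(rows: list[Row]) -> list[int]:
        seen: set[Any] = set()
        failures: list[int] = []
        for i, r in enumerate(rows):
            value = r.get(column)
            if value in seen:
                failures.append(i)
            seen.add(value)
        return failures
    return check


def in_range(column: str, low: float, high: float) -> Check:
    """Fail rows whose value lies outside [low, high]; None is left to not_null."""
    def check(rows: list[Row]) -> list[int]:
        return [
            i for i, r in enumerate(rows)
            if r.get(column) is not None and not (low <= r[column] <= high)
        ]
    return check


def run_tests(rows: list[Row], tests: dict[str, Check]) -> dict[str, list[int]]:
    """Run every named check and collect the failing row indexes per test."""
    return {name: check(rows) for name, check in tests.items()}


# Hypothetical vehicle-location rows; column names are illustrative only.
SAMPLE_ROWS: list[Row] = [
    {"location_ping_id": "p1", "vehicle_id": "bus-1", "latitude": 38.90},
    {"location_ping_id": "p2", "vehicle_id": None, "latitude": 38.91},
    {"location_ping_id": "p2", "vehicle_id": "bus-2", "latitude": 95.0},
]

TESTS: dict[str, Check] = {
    "ping_id_not_null": not_null("location_ping_id"),
    "ping_id_unique": unique("location_ping_id"),
    "vehicle_id_not_null": not_null("vehicle_id"),
    "latitude_in_range": in_range("latitude", -90.0, 90.0),
}

results = run_tests(SAMPLE_ROWS, TESTS)
for name, failures in results.items():
    status = "PASS" if not failures else f"FAIL rows {failures}"
    print(f"{name}: {status}")
```

Returning the failing row indexes, rather than a bare pass or fail, is one way to make a data outage easier to triage, since a failed test points directly at the records that need attention.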
Solutions and Outcomes
- Enterprise analytics infrastructure design based on emerging transit data standards.
- Modular approach to processing and analyzing data from transit operations systems like APC, AVL, and fare collection technology.
- Robust and high-visibility data pipelines.
- Transparent data transformation steps from noisy source data to curated dashboard-ready tables.
- Platform on which to build organizational data governance.