
mtaOpenData provides simple, reproducible access to MTA-related datasets from the
NY State Open Data portal, directly from R,
with no API keys or manual downloads required. Working directly with Socrata APIs can be cumbersome; mtaOpenData wraps those calls in a clean, reproducible workflow.

Version 0.1.0 introduces a streamlined, catalog-driven interface for MTA Open Data.

The package provides three core functions:

  • mta_list_datasets() — Browse available datasets from the live MTA Open Data catalog
  • mta_pull_dataset() — Pull any cataloged dataset by key, with filtering, ordering, and optional date controls
  • mta_any_dataset() — Pull any MTA Open Data dataset directly via its Socrata JSON endpoint

Datasets pulled via mta_pull_dataset() automatically apply sensible defaults from the catalog (such as default ordering and date fields), while still allowing user control over:

  • limit
  • filters
  • date / from / to
  • where
  • order
  • clean_names
  • coerce_types

This redesign reduces maintenance burden, improves extensibility, and provides a more scalable interface for working with MTA Open Data.

All functions return clean tibble outputs and support filtering via
filters = list(field = "value").
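To make the filters idea concrete, here is a minimal base R sketch of how a named list of equality filters could be turned into a Socrata (SoQL) where clause. The helper build_where() is hypothetical and is not part of mtaOpenData; the package's actual translation may differ. The field names route_id and direction_id are taken from the example output further down, and the sketch assumes one value per field.

```r
# Hypothetical helper, NOT part of mtaOpenData: illustrates one way a
# filters list could map onto a SoQL $where clause.
build_where <- function(filters) {
  # No filters means no where clause
  if (length(filters) == 0) return("")

  clauses <- vapply(names(filters), function(field) {
    value <- filters[[field]]
    if (is.character(value)) {
      # SoQL string literals use single quotes; double any embedded quote
      value <- paste0("'", gsub("'", "''", value, fixed = TRUE), "'")
    }
    # Numeric values are pasted as-is
    paste(field, "=", value)
  }, character(1))

  # All conditions must hold, so join them with AND
  paste(clauses, collapse = " AND ")
}

where <- build_where(list(route_id = "QM3", direction_id = 1))
print(where)
```

Under these assumptions, each list element becomes one equality condition, and multiple elements are combined with AND.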


Installation

Development version (GitHub)

devtools::install_github("martinezc1/mtaOpenData")

Example

library(mtaOpenData)

bus_stops <- mta_pull_dataset(dataset = "mta_bus_stops", limit = 5000)

head(bus_stops)
## # A tibble: 6 × 25
##   valid_from          valid_to            in_effect route_id route_short_name
##   <dttm>              <dttm>              <lgl>     <chr>    <chr>           
## 1 2020-11-20 00:00:00 2020-12-15 00:00:00 FALSE     QM3      QM3             
## 2 2020-11-20 00:00:00 2020-12-15 00:00:00 FALSE     QM44     QM44            
## 3 2020-11-20 00:00:00 2020-12-15 00:00:00 FALSE     SHNRD    SHNRD           
## 4 2020-10-28 00:00:00 2020-11-19 00:00:00 FALSE     YOAS     <NA>            
## 5 2021-10-19 00:00:00 2021-11-14 00:00:00 FALSE     CPAS     <NA>            
## 6 2021-10-19 00:00:00 2021-11-14 00:00:00 FALSE     YOAS     <NA>            
## # ℹ 20 more variables: route_long_name <chr>, route_description <chr>,
## #   route_color <chr>, stop_id <dbl>, stop_name <chr>, direction_id <dbl>,
## #   direction <chr>, revenue_stop <dbl>, timepoint <dbl>, boarding <dbl>,
## #   alighting <dbl>, is_cbd <lgl>, latitude <dbl>, longitude <dbl>,
## #   bundle <chr>, computed_region_wbg7_3whc <dbl>,
## #   computed_region_kjdx_g34t <dbl>, computed_region_yamh_8v7k <dbl>,
## #   georeference_type <chr>, georeference_coordinates <list>

About

mtaOpenData makes MTA-related datasets from New York State's open data portal accessible to students,
educators, analysts, and researchers through a unified and user-friendly R interface.
It was developed to support reproducible research, open-data literacy, and real-world analysis.


Comparison to Other Software

While the RSocrata package provides a general interface for any Socrata-backed portal, mtaOpenData is specifically tailored to the MTA-related datasets published on NY State Open Data.

This package is part of a broader ecosystem of tools for working with New York open data:

  • nycOpenData — streamlined access to NYC Open Data
  • nysOpenData — streamlined access to NY State Open Data
  • mtaOpenData — streamlined access to MTA-related NY State Open Data

Together, these packages provide a consistent, user-friendly interface for working with civic data across jurisdictions.

Compared with a general-purpose Socrata client, mtaOpenData emphasizes:

  • Ease of Use: No need to hunt for 4x4 dataset IDs (e.g., 2ucp-7wg5); use catalog-based keys instead.
  • Open Literacy: Designed specifically for students and researchers to lower the barrier to entry for civic data analysis.
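For context on what a catalog key hides, the sketch below builds the kind of Socrata JSON endpoint URL that a 4x4 ID corresponds to. The helper socrata_endpoint() is hypothetical and not part of mtaOpenData, and the domain data.ny.gov is an assumption about where the NY State Open Data portal is hosted. The ID is the example from the list above.

```r
# Hypothetical helper, NOT part of mtaOpenData: shows the URL shape behind
# a Socrata 4x4 dataset ID. The default domain is an assumption.
socrata_endpoint <- function(id, domain = "data.ny.gov") {
  # A 4x4 ID is two groups of four lowercase alphanumerics joined by a hyphen
  if (!grepl("^[a-z0-9]{4}-[a-z0-9]{4}$", id)) {
    stop("not a 4x4 dataset ID: ", id)
  }
  sprintf("https://%s/resource/%s.json", domain, id)
}

url <- socrata_endpoint("2ucp-7wg5")
print(url)
```

With catalog-based keys, users work with a readable name such as "mta_bus_stops" and never need to assemble a URL like this by hand.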

Contributing

We welcome contributions! If you find a bug or would like to request a wrapper for a specific MTA dataset, please open an issue or submit a pull request on GitHub.


Authors & Contributors

Maintainer

Christian A. Martinez
GitHub: @martinezc1