A Brief History of Big Data (in Federal Transportation)

Guest op-ed by Thomas Grogan, Senior Economist, HDR, Inc.

Big Data is big.

Big enough to generate 3.5 billion search results each day.

Big enough to process over 25 million transit trips daily, finding ridership patterns that can improve safety and service and reduce costs.

Big enough to potentially eliminate six billion metric tons of GHG pollution by reducing inefficient vehicle operations and travel demand.

Big Data can seem complicated because it means many things to many people. But it is not new. At its core, Big Data is still the collection, analysis, and communication of information. What has changed over time is the size, speed, and variety of the data that is collected and communicated. The challenge lies in using innovative methods to analyze that data and draw out meaningful insights that can inform decision makers.

The Federal Government has been collecting data in increasing size and scale since the first U.S. decennial census in 1790. The methods were rudimentary by today’s standards: the law required that every household be visited; that completed census schedules be posted in public places; and that “the aggregate amount of each description of persons” for every district be transmitted to the president.

Even at a time of great uncertainty about our nation’s future, the government understood the value of quantitatively capturing a moment in time in the country. The total cost was $44,000, or approximately 0.55% of Federal spending at the time (assuming $8 million in nominal terms, based on various estimates). As the nation grew, population data alone would not be enough to give decision makers the tools to solve the problems faced by citizens. The next step would be to collect data on how the population moved.
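The cost share quoted above can be checked with quick arithmetic. The short Python sketch below only reproduces the figures given in the text; the $8 million spending total is the article's stated assumption, not an independently verified number.

```python
# Figures quoted in the text. The $8 million spending total is the
# article's own assumption ("based on various estimates").
census_cost = 44_000          # cost of the 1790 census, nominal dollars
federal_spending = 8_000_000  # assumed Federal spending, nominal dollars

# Share of Federal spending devoted to the census.
share = census_cost / federal_spending
print(f"{share:.2%}")  # prints 0.55%
```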

The first mail-out census, in 1960, was also the first decennial census to formally collect transportation information. Under the “Employment Status and Work Experience” section, it asked the multiple-choice question “How did this person get to work last week?” This data began to inform transportation planning products used by the Federal Highway Administration. The Federal government’s concentrated effort to collect transportation data, which began as a single observational data series 56 years ago, has grown exponentially in scope, type, and volume as a result of technological progress. This information now influences programmatic spending, determines compliance with laws, and guides private sector investment and operational decisions.

In June 2016 alone, the Department of Transportation updated or made available nine data sets, ranging from numerical to geographic, across different modes and types (employment, freight, traffic, performance, maps). Much of this information can be collected and reported automatically by devices; it does not require mailing paper surveys or interviewing people in person. Unlike the data from the first Census, which was accessible only at a single geographical location in a printed document, these data sets are accessible to anyone in the country with a strong internet connection and the right tools and skills to sort through them.

In many cases, real-time information from transit agencies is made developer-friendly in the form of APIs (application programming interfaces). When you go online or open an app to track a bus route or order a car service, you’re accessing Big Data, right at your fingertips. A current challenge for agencies and organizations is processing this data on the back end to gain the appropriate insights for long-run transportation policies.
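As a rough illustration of that back-end processing step, the Python sketch below averages reported delays by route from a small, made-up feed. The payload, its field names ("route", "vehicle", "delay_seconds"), and the function name are all hypothetical and do not correspond to any particular agency's API; a real feed would be fetched over the network and would carry far more records.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical payload: the field names below are illustrative and do
# not come from any specific transit agency's API.
SAMPLE_FEED = """
[
  {"route": "10", "vehicle": "A1", "delay_seconds": 120},
  {"route": "10", "vehicle": "A2", "delay_seconds": 60},
  {"route": "22", "vehicle": "B7", "delay_seconds": 0}
]
"""


def average_delay_by_route(feed_json: str) -> dict[str, float]:
    """Group vehicle records by route and average their reported delay."""
    delays = defaultdict(list)
    for record in json.loads(feed_json):
        delays[record["route"]].append(record["delay_seconds"])
    return {route: mean(values) for route, values in delays.items()}


# Summarize the sample feed: one average delay (in seconds) per route.
summary = average_delay_by_route(SAMPLE_FEED)
print(summary)
```

Collected over months rather than a single snapshot, summaries like this are the kind of input a planner might use to spot chronically late routes.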

Today, the Census collects much more than the original question asked, and the government has established programs with dedicated resources just for transportation data. DOT inherited decades-old data collection programs on the railroad and aviation sectors when it absorbed the functions of the Interstate Commerce Commission and the Civil Aeronautics Board via deregulation. Some might argue that DOT established its own “Big Data” program with the creation of the Bureau of Transportation Statistics (BTS) under ISTEA in 1991. Soon after, advances in information technology helped expand the Department’s data and research programs at a rapidly increasing pace.

The Research and Innovative Technology Administration (RITA) was created through the Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (SAFETEA-LU), and BTS was moved into this administration in 2005. Currently, the Office of the Assistant Secretary for Research and Technology (OST-R) oversees eight multi-modal research programs, including BTS.

Other organizational developments occurring across all agencies include the growing number of Chief Innovation Officers and Chief Data Officers (CDOs); US DOT appointed its first CDO in July 2014. Perhaps as important as the number of programs are the strategic initiatives OST-R is spearheading.

The extremely popular “Smart Cities Challenge” by DOT was primarily championed by OST-R. Making the knowledge and lessons learned from the “Smart Cities Challenge” available to everyone for replication is an important component of this program. This makes structured data programs and analysis all the more important as the transportation industry embraces Big Data.

This increase in Big Data isn’t limited to just transportation. The future of urban areas depends on accessing Big Data from a wide range of industries including energy, education, health, employment, environment, and housing, as well as transportation. For these “Digital Cities” to become a sustainable reality, we must learn to analyze these different sources of data in conjunction with each other.

The views and opinions expressed in this article are those of Thomas Grogan and do not express those of HDR Inc. or of the Eno Center for Transportation.
