Unleashing Public Data Creates a Smarter Criminal Justice System

Lili Dworkin
Humphrey Obuobi
March 8, 2023
Technology

The United States criminal justice system is the only one in the world to be split across 50 states, 3,000 counties, and tens of thousands of cities and towns. In each locale, the system is further split between law enforcement, jails, prosecutors and public defenders, the courts, corrections, and parole and probation. 

This fragmentation of the justice system hinders the public’s understanding of what occurs on a day-to-day basis and makes coordination nearly impossible. Right now, we can’t answer simple questions like: How many people have been arrested in the past year? For what charge? How many people are in jail today? And how likely are people to return to prison?

The problem isn’t the generation of data, but the consistency and availability of that data. Each corner of the system records different data points, tracks different metrics, and performs different analyses. Even if the data points were comparable, policymakers and criminal justice practitioners often can’t access the data sources reliably and consistently. These challenges make it impossible for leaders to make informed decisions and know which policies are working – and which aren’t.

Justice Counts is an effort to solve this problem. The initiative works by uniting the country’s criminal justice system around a set of metrics, and ensuring those metrics are freely and reliably available to the public. Justice Counts is co-led by the U.S. Department of Justice’s Office of Justice Programs’ Bureau of Justice Assistance (BJA) and The Council of State Governments (CSG) Justice Center, with Recidiviz as the chief technical partner.

To achieve this goal, Justice Counts first defines which data points are most important for the ecosystem. It then equips agencies with the tools they need to publish those metrics consistently, across the entire criminal justice system, every month. With this monthly heartbeat, the initiative aims to create a world where three things are, for the first time, possible in the U.S. criminal justice system:

  1. States can learn from and collaborate with each other using consensus-driven performance metrics.
  2. Policymakers can investigate which policies are most effective based on real-time data.
  3. Communities, researchers, and advocates can evaluate interventions and highlight new opportunities.

Early feedback from practitioners in the field demonstrates the initiative’s promise.

"Transparency is the foundational element of building trust,” says Abdul D. Pridgen, chief of the San Leandro Police Department in California. “We intend to use the data to inform our direction and decisions related to policies and procedures, which will improve our service.”

Theory of Change

Recidiviz is founded on the idea that the criminal justice system can only operate safely and fairly if decision-makers have insight into which policies and practices are effective. All of our tools are built with this principle in mind and fundamentally work the same way – by aggregating data from disparate sources and getting it into the hands of people who can use it to improve outcomes:

  • For government leaders and staff, data makes it possible to compare their performance to that of their peers and act confidently in moments of crisis (such as the COVID-19 pandemic). 
  • For advocates, data clarifies injustices and disparities, so it is easier to push for better policy and practice. 
  • For researchers, data powers models that predict how the system could change under different circumstances, driving decisions about what practices to implement.

Making data fully public is one of the most effective strategies for driving change. But the fragmentation of the criminal justice system means that the existing data represents different statistics, using different definitions, aggregated across different time periods. To be effective, data must be current, comprehensive and comparable across states and agencies. Strong data also includes historical information.

Justice Counts aims to accomplish all of the above. Collecting and streamlining fragmented data is challenging. But this initiative makes it easy for agencies to publicly provide recent data in different formats, using a simple, scalable framework. 

Here’s how it works:

The Justice Counts Approach

Justice Counts is an open-source, publisher-subscriber ecosystem where agencies publish data and policymakers, practitioners, and the public consume it.

To reach that goal, the team works along four different dimensions:

  1. Define consensus-driven metrics
  2. Create a technical specification
  3. Build a publisher tool, including the underlying data infrastructure
  4. Build client tools, including dashboards to easily visualize data

High-level overview of the Justice Counts technical infrastructure

1: Define Consensus-Driven Metrics

Justice Counts has defined a set of metrics – about ten per sector (such as law enforcement or prisons) – that each agency in that sector can share on a regular basis. The metrics were developed in partnership with a wide range of experts representing every corner of the country’s state, county, and municipal justice systems. A few examples are below:

  • Funding and expenses (all sectors)
  • Total arrests (law enforcement)
  • Cases referred (defense and prosecution)
  • Admissions (jails and prisons)
  • Supervision terminations (supervision)

The metrics are designed to be:

  • Simple: The metrics capture vital data points while accounting for the fact that agencies collect, define, and maintain data in different ways and that data quality may vary by agency or metric.
  • Feasible: The metrics rely on data points that are already collected by many agencies and should be easy to share. They take into account that baselines may vary across agencies and localities.
  • Effective: The metrics are easy to understand and will provide data that policymakers and agency leaders can utilize in their decision-making. The initiative also allows agencies to provide context behind the numbers to enable fair, accurate use of the data collected.

The consensus-driven metrics are the starting point, and create a strong foundation for Justice Counts, upon which the team can build other technology and tools.

2: Create a Technical Specification

The Justice Counts ecosystem is composed of publishers and subscribers. The publishers (i.e. criminal justice agencies) share the Justice Counts metrics regularly. And the subscribers (i.e. the public) access and use that data. The system is therefore designed based on the “pub/sub paradigm”, like an RSS or GTFS feed subscription.

To work, a pub/sub paradigm requires a technical specification that both publishers and subscribers adhere to. This prescribes both which data points each agency needs to share and how to share them. Typically, such specifications use structured data formats such as JSON or XML. Like the GTFS Schedule, Justice Counts uses flat comma-separated .txt files. This is a simple, lightweight format that is easy for criminal justice agencies to generate (even simple software systems can output plain text files), and it is easy for both humans and machines to parse.

Example of a file in the technical specification corresponding to the metric “arrests by offense type”

The technical specification defines how publishers provide and subscribers access the set of metrics that Justice Counts is built upon.
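
As a rough illustration of why a flat comma-separated file is easy for machines to consume, the sketch below parses a small feed file using only Python's standard library. The column names (`year`, `month`, `offense_type`, `value`) and the sample rows are assumptions made for this example; they are not taken from the actual Justice Counts specification.

```python
import csv
import io

# Hypothetical feed contents. The column names and rows are illustrative
# assumptions, not the real Justice Counts file layout.
SAMPLE_FEED = """\
year,month,offense_type,value
2022,1,PERSON,120
2022,1,PROPERTY,85
2022,2,PERSON,110
"""


def parse_feed(text):
    """Parse a comma-separated feed into a list of dicts with typed fields."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        rows.append(
            {
                "year": int(row["year"]),
                "month": int(row["month"]),
                "offense_type": row["offense_type"],
                "value": int(row["value"]),
            }
        )
    return rows


rows = parse_feed(SAMPLE_FEED)
for row in rows:
    print(row)
```

A subscriber that agrees with the publisher on the column layout needs nothing more than a CSV reader, which is the practical appeal of a plain-text format.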

3: Build a Publisher Tool

A key feature of this open pub/sub ecosystem is that all agencies can publish data anytime by following the technical specification. Because there are many steps to creating, hosting, and maintaining a coordinated public feed, Recidiviz developed a tool, called Justice Counts Publisher, to make this process easy. Justice Counts Publisher is a web application that provides a user-friendly interface for sharing metrics, and then generates and hosts the data feed described above. Agencies that use Justice Counts Publisher get a compliant data feed for free, without having to learn the details of the specification or worry about hosting their own data.

Screenshot of the manual data entry page in Publisher

Data entry is always vulnerable to mistakes such as misspellings or missing values in the spreadsheet. To address this, the Justice Counts Publisher automatically fixes as many errors as possible, and provides detailed messaging to the user for the rest. For instance, rather than erroring outright when encountering the value “Hawaian”, the system calculates the similarity of this string with all valid race and ethnicity values, and concludes that it is closest to “Hawaiian”. 
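
The “Hawaian” example can be approximated with a few lines of Python. This is a minimal sketch, not Publisher's actual code: it uses the standard library's `difflib` as a stand-in similarity measure (the article does not say which measure Publisher uses), and the list of valid values and the `cutoff` threshold are assumptions chosen for illustration.

```python
import difflib

# Illustrative list only; the real set of valid race and ethnicity values
# is defined by the Justice Counts specification.
VALID_VALUES = [
    "American Indian or Alaska Native",
    "Asian",
    "Black",
    "Hawaiian",
    "White",
    "Other",
]


def normalize_value(raw, valid_values, cutoff=0.8):
    """Return the closest valid value, or None if nothing is similar enough.

    The cutoff is an assumed threshold: matches scoring below it are
    treated as errors to report back to the user instead of auto-fixing.
    """
    matches = difflib.get_close_matches(raw, valid_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize_value("Hawaian", VALID_VALUES))
print(normalize_value("xyz", VALID_VALUES))
```

Returning `None` for poor matches keeps the two behaviors described above separate: confident corrections are applied automatically, and everything else becomes a message for the user.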

The tool also infers missing data when possible. For instance, if an agency uploads data for the disaggregated “arrests by offense type” metric, but not for the aggregate “arrests” metric, the tool performs the calculation automatically. Uploading data will continue to get easier and more innovative based on feedback from our partners and learning from issues that arise.  
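
Inferring an aggregate from its breakdown is simple arithmetic, sketched below under stated assumptions. The function name, the dictionary layout, and the rule of refusing to infer a total when any breakdown value is missing are all hypothetical choices for this example, not a description of how Publisher behaves.

```python
def infer_total_arrests(arrests_by_offense_type, reported_total=None):
    """Return the reported total if present, else the sum of the breakdown.

    Returns None when no total can be inferred safely: the breakdown is
    empty, or one of its values is missing (represented here as None).
    """
    if reported_total is not None:
        return reported_total
    values = list(arrests_by_offense_type.values())
    if not values or any(v is None for v in values):
        return None
    return sum(values)


# Made-up numbers for illustration.
breakdown = {"PERSON": 120, "PROPERTY": 85, "DRUG": 40}
print(infer_total_arrests(breakdown))
```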

Justice Counts Publisher enables criminal justice agencies to provide the Justice Counts metrics in accordance with the technical specification – and to do so regularly, accurately, and easily.

4: Build Client Tools

Just as Justice Counts Publisher facilitates the “pub” side of the pub/sub ecosystem, Recidiviz is also building tools to help with the “sub” side – making it easier for leaders, researchers, and advocates to find and use the data effectively. 

There are three main components of the “Client Tools”:

  • The aggregator service crawls all published data feeds on a daily basis, parsing the data.
  • The aggregator datastore holds all of the data from the crawled data feeds.
  • The public dashboards read from the aggregator datastore and generate visualizations and insights that make the data interpretable and actionable.

Overview of the Client Tools infrastructure

This three-part system serves as an example of how the published Justice Counts data feeds can be consumed, aggregated, and ultimately used to provide value to the public. On its own, an individual public feed may be of limited utility, but when combined with other feeds into state or national dashboards, the data becomes powerful.
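
The crawl, store, and read flow described above can be sketched in a few lines. Everything here is hypothetical: the feed URLs, the column names, the in-memory dictionary standing in for the datastore, and the injected `fetch` function that replaces real network access so the example runs offline.

```python
import csv
import io

# Stand-in for published feeds; URLs and contents are made up.
FAKE_FEEDS = {
    "https://agency-a.example/feed.txt": "year,month,value\n2022,1,100\n2022,2,90\n",
    "https://agency-b.example/feed.txt": "year,month,value\n2022,1,40\n",
}


def fake_fetch(url):
    """Simulate downloading a feed; raise OSError if it is unreachable."""
    if url not in FAKE_FEEDS:
        raise OSError(f"cannot reach {url}")
    return FAKE_FEEDS[url]


def crawl_feeds(feed_urls, fetch):
    """Aggregator service: fetch and parse each feed, skipping failures."""
    datastore = {}
    for url in feed_urls:
        try:
            text = fetch(url)
        except OSError:
            continue  # an unreachable feed should not stop the crawl
        datastore[url] = list(csv.DictReader(io.StringIO(text)))
    return datastore


def totals_by_month(datastore):
    """Dashboard-style query: sum values across all feeds per (year, month)."""
    totals = {}
    for rows in datastore.values():
        for row in rows:
            key = (int(row["year"]), int(row["month"]))
            totals[key] = totals.get(key, 0) + int(row["value"])
    return totals


urls = list(FAKE_FEEDS) + ["https://agency-c.example/feed.txt"]
datastore = crawl_feeds(urls, fake_fetch)
print(totals_by_month(datastore))
```

The combined totals are something no single feed can provide, which is the point the paragraph above makes about individual feeds versus aggregated dashboards.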

With the metrics defined, technical specification in place, and publisher tool available, Justice Counts can get that data into the hands of researchers and the public through client tools. Now, for the first time, a cross-system picture is in clear view.

How Data Drives Impact

The data dashboards that Justice Counts produces don’t solve problems on their own. But they can bring about significant impact when they create productive dialogue between publishers and consumers. Users should be able to ask questions about the data and then receive answers in the form of customized visualizations, detailed context, and insights that automatically highlight important statistics and trends.

For example, the dashboards should surface: what is “normal” for a given metric, how the metric has changed over time, how different breakdowns compare to one another, and what this data means in the context of other agencies and states.

Screenshot of the data visualization page in Justice Counts Publisher

Each visualization is accompanied by a set of insights that summarizes and highlights the key takeaways, from simple statistics to averages and year-over-year change. Eventually, these insights will provide more sophisticated analyses, like highlighting anomalies and other statistically significant trends. 
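
For a sense of what the simpler insights involve, here is a minimal sketch that computes an average and a year-over-year change from yearly values. The function name, the output keys, and the numbers are invented for illustration; the dashboards may compute these statistics differently.

```python
from statistics import mean


def summarize(yearly_values):
    """Return the average and the latest year-over-year percent change.

    yearly_values maps a year to a metric value. The change is None when
    the two most recent years are not consecutive or the earlier value is 0.
    """
    years = sorted(yearly_values)
    summary = {
        "average": mean(yearly_values[y] for y in years),
        "yoy_percent_change": None,
    }
    if len(years) >= 2 and years[-1] - years[-2] == 1:
        previous, current = yearly_values[years[-2]], yearly_values[years[-1]]
        if previous != 0:
            summary["yoy_percent_change"] = (current - previous) / previous * 100
    return summary


# Made-up values for illustration.
print(summarize({2020: 1000, 2021: 800, 2022: 900}))
```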

For example, a county looking to create a new policy on decriminalizing marijuana could learn how a similar move landed in another county. Corrections departments will be able to explore how a neighboring state’s prison and community supervision populations are trending to inspire new strategies. And prosecution offices will have more insight into diversion efforts and proven alternatives to incarceration across their state.

Lessons Learned

Justice Counts built an initial version of the publisher tool and onboarded ten pilot agencies. Four of these agencies have successfully entered data from the past year. From that pilot, the team learned three key lessons:

1. It’s challenging to align behind a common specification.

The hard part of writing the technical specification is communicating with and aligning the different stakeholders. While criminal justice agencies are accustomed to pulling and exporting data, they aren’t used to doing so in a set, high-quality format. In an open pub/sub ecosystem, the data must look exactly right. Therefore, Justice Counts invests heavily in documentation, examples, and validation tools that will help agencies publish data in alignment with the metrics.
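
A validation tool of the kind mentioned above might look something like the following sketch. The required columns and the specific checks are assumptions made for this example; they are not the actual Justice Counts validation rules.

```python
import csv
import io

# Hypothetical required columns, for illustration only.
REQUIRED_COLUMNS = ("year", "month", "value")


def _as_int(text):
    """Return text as an int, or None if it is missing or not an integer."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


def validate_feed(text):
    """Return a list of readable problems; an empty list means no issues found."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        month = _as_int(row["month"])
        if month is None or not 1 <= month <= 12:
            problems.append(f"line {line_no}: month must be 1-12, got {row['month']!r}")
        value = _as_int(row["value"])
        if value is None or value < 0:
            problems.append(
                f"line {line_no}: value must be a non-negative integer, got {row['value']!r}"
            )
    return problems


print(validate_feed("year,month,value\n2022,13,abc\n"))
```

Reporting every problem with its line number, rather than stopping at the first one, fits the goal of helping agencies fix a file in one pass.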

2. Fast and painless data sharing is critical for compliance.

Data sharing needs to be as fast and painless as possible. Many agencies already track and share data and are hesitant to take on another recurring commitment. To make this easier, we allow agencies to personalize their data entry experience, including turning off metrics and dimensions that they cannot share or providing contextual information that explains any differences. The goal is for agencies to share data even when it doesn’t match the specification exactly, rather than share no data at all.

3. The way data is presented on dashboards matters.

Early data visualizations demonstrated that agencies care deeply about how their data is presented to the public. Even small changes in wording can make a big difference in how the information is perceived. It’s essential that publishing agencies feel a sense of trust that the dashboard will represent their work fairly and accurately – otherwise they’ll choose not to participate. Therefore, dashboards are designed to minimize the risk of misinterpretation, clearly surfacing any contextual information that agencies have supplied.

Looking Ahead

The hardest part of the Justice Counts initiative is still to come – namely, onboarding the thousands of criminal justice agencies across the U.S. and empowering them to publish data, and then encouraging other states, policymakers, and advocates to actually use that data to drive change. Without the former, the initiative can’t succeed because we have no data; without the latter, the initiative won’t succeed because the data is not being acted upon. Along with the CSG Justice Center and BJA, we launched the Founding States Program, which will provide technical assistance to ten state agencies sharing data with Justice Counts.

Over time, the Justice Counts team will continue to incorporate feedback from stakeholders and make improvements to both sides of the experience. We look forward to achieving the goal of helping criminal justice agencies get real-time data into the hands of the public – and we are even more excited about the improvements in outcomes in the criminal justice system that can follow. 

More information is available on the Justice Counts website. 

---

Justice Counts is supported by Grant No. 2019-ZB-BX-K005 awarded by the Bureau of Justice Assistance. The Bureau of Justice Assistance is a component of the Department of Justice’s Office of Justice Programs, which also includes the Bureau of Justice Statistics, the National Institute of Justice, the Office of Juvenile Justice and Delinquency Prevention, the Office for Victims of Crime, and the SMART Office. Points of view or opinions in this document are those of the author and do not necessarily represent the official position or policies of the U.S. Department of Justice.

Copyright © 2017, Recidiviz. All Rights Reserved.