Data Analysis as Community Practice

How did we make these documents and our pattern analysis accessible?

By utilizing a community model where most of our volunteers were not data analysts, we were also able to gain insights from the ways our volunteers read these complaints. Our volunteers worked in reading groups to support one another through reading difficult documents. They were able to provide feedback, gain insights, and learn from one another. Additionally, emerging patterns are not clearly visible in the data; this investigative work requires slow time and attention for careful review of full complaint files alongside the machine learning model we built. From those documents, we identified over 27,000 complaints. Our volunteers labeled more than 4,000 complaints and tagged human rights violations to train our algorithm, which we named Judy.

Who is Judy?

Judy is an algorithm. We created Judy to process complaints that volunteers tagged. The machine learns from the volunteers’ decisions to then “read” and tag tens of thousands more complaints. The data science team, led by the Human Rights Data Analysis Group, built this machine learning tool. Judy has analyzed 30,000 Chicago police complaint files and identified hidden patterns, including clusters of cases alleging sexual assault and domestic violence. Building Judy took us about three months. We intentionally took this time because our goal is to do this work with and for the communities most impacted by gender-based violence at the hands of police. As we utilize technology, it is crucial to consider what we’re building and how it will play a role in future communities. It is imperative to build relationships to gain context outside of the data frames. 

From the beginning, community participation was critical as we worked to make data science tools accessible to the public and to build data literacy among communities for greater agency over the data they hold.  For five months, more than 200 volunteers met weekly to read the full narrative texts of police complaint records from 2011-2015. These documents were made public through the lawsuit Green v. CPD, in which the plaintiff Charles Green filed a Freedom of Information Act request to the Chicago Police Department in 2015, seeking all police complaint records going back to 1967. Upholding the ruling of the Appellate Court which sided with the city in this matter, the case was decided by the Illinois Supreme Court in September 2022.

Tagging Examples:

  • Neglect of duty - instances where people called the police for help and were ignored or mistreated

  • Children/Parents - instances that included a parent and/or child explicitly

  • Home - instances where police officers entered a person’s home

  • LGBTQ - instances where there was language used that alluded to the person’s sexual orientation or gender expression. 

  • Disabled - instances where a disabled person was present or impacted

We created tagging “rules” for the volunteers to follow as they read complaints, so they knew which categories to mark for each story. Importantly, volunteers were able to tag records with every type of  violation described in the report. This is a clear departure from how the police department administration currently handles complaints. Their use of primary categorization, which is to categorize each complaint into a single category, routinely buries the other allegations described in the report. You can read our dictionary of these categories here. 

An example of Neglect CRID 1069480

Multiple volunteers read each complaint. Judy was only trained by complaints where 100 percent of the volunteers independently agreed that the complaint fell within a defined category. In the data science world, this is called inter-rater reliability. As we identified differences of opinion / disagreement rates, we were able to look closer at these documents and ensure that our training materials and other tools could help volunteers properly label and process complaints. By implementing volunteer feedback, our data team was able to continue a community learning model. The complaints Judy identified comprise a wide collection of documents we call our Library of Misconduct Complaints.