In October 2011, DataKind (then called Data Without Borders) held its inaugural DataDive in New York City, where it brought together UN Global Pulse, The Microfinance Information Exchange Market (MIX Market), and the New York Civil Liberties Union with data scientists, hackers, and an assortment of other interested people, such as myself. I had applied to have the volunteers take a crack at the data compiled by Internews’ Media Map Project, and while we were not chosen as one of the featured non-profit organizations, I was still curious about how the event would work and what they were trying to do. So I went up to New York to see what it was all about.
A few students from Columbia University’s Quantitative Methods in the Social Sciences program, who worked with me on Media Map data, were also at the event. We chose to work with the New York Civil Liberties Union (NYCLU). The NYCLU had a database of police records tracking encounters under a practice called “Stop-and-Frisk.” NYCLU was convinced that Stop-and-Frisk, which gave police officers free rein to stop, question, and pat down thousands of pedestrians each year, was being implemented in a highly discriminatory way. And they had data – police were required to record each of these encounters, including location, gender, race, and other descriptive information. Sarah LaPlante – the lone NYCLU data analyst charged with making sense of all of it in the hopes that it might form the backbone of a potential civil liberties case – needed support. Could we visualize the data to see if there was evidence to support the NYCLU’s suspicions that there was racial bias in who the police stopped and frisked?
The idea was quite exciting, but the data was a mess, and our group spent most of the weekend just cleaning and organizing it. Finally, when it was in good enough shape, the geospatial analysts in the group took over and created some simple maps to highlight the patterns.
The last day of the DataDive was exciting, with the different groups presenting what they had done. It was a bit hard to grasp what had been accomplished though, particularly as the projects were so different from each other and the results presented so quickly (the effects of fertilizer in Uganda? Creating a database on microfinance loans in Africa?). And as far as I knew, that was it. I went back to DC. The students went back to school. The rest of the group went back to their lives.
We had a few follow-up emails from Sarah after the event, but after a while I heard nothing more and forgot about it. So did the hackathon matter?
On August 12, 2013, a federal judge in New York City ruled in a class action suit brought by the NYCLU that the NYPD’s stop-and-frisk practices amounted to a “policy of indirect racial profiling,” and thus violated New Yorkers’ constitutional rights.
Victory! Hail to the hackathon!
Obviously, the truth is much more complicated.
There has been a bit of lively debate about what hackathons are good for and what they are not good for, and whether they actually make a difference. A few examples of this debate can be found here, here, and here. A few important highlights from these debates demonstrate why the very first DataKind DataDive did make a difference for the NYCLU:
- Organizations that want help with data need to spend considerable time scoping a question they want answered before the hackathon. (I’ll admit, this was our failure in the Media Map application. We had lots of data! And lots of unanswered questions! But we hadn’t narrowed down our scope to a single data set or data sets and a good, juicy question.) By contrast, the NYCLU had a hypothesis, the data, and some evidence based on observations about what they might find.
- Organizations should be able to benefit from an intensive jumpstart of a project, to “make small dents” in their problems, but should not expect to get a perfect product that answers all of their needs over the course of a weekend. The hackathon was a helpful boost along the way for the NYCLU, which employed only a single data analyst.
- There should be processes in place to follow up on, further develop, and iterate on what was created during that intense, caffeine- and pizza-fueled weekend. The New York DataDive was one step that helped to facilitate and speed up a process that NYCLU had already put in place and was prepared to follow up on.
Two years later at Internews, we are entering into the redesign of the Media Map Project, and one of the big questions is: what to do with all of the data that we compiled? The first step will be defining what questions are most important to ask. The hackathon may come. But for now, it can wait.