What problems are we trying to solve? Why are there so many failed tools and so much overlap? The Who’s Got Dirt model. Presented by Jonathan Stray, Overview.
See full presentation.
We’ve all been exposed too much to this sort of naive story that technology is going to change journalism. Especially for big collaborative work, which is our main focus this weekend, just managing who knows what, where the data is and who has it is a huge part of the battle. It’s kind of an unsexy problem, but it’s kind of the real problem.
Where we are now / Current problems:
- Multiple ongoing platform projects
- Nothing interoperates (e.g. OC <-> Investigative Dashboard); we don’t have the protocols or even the concepts that would allow these tools to work together. A lot of the tools available have no sustainability model, don’t interoperate and have not produced the kinds of platforms that we hoped for. This is a problem for many reasons; one is that these platforms are shockingly expensive to build.
- Duplication of expensive work (Overview: 655k, DocCloud: 2,629k, Investigative Dashboard: 400k). Then we have basic workflow problems (interoperability is part of them). Later we’ll hear about the Who’s Got Dirt API, which tries to solve the problem “who should I be talking to?”, and which crucially separates that problem from “can I have access to these documents?”. But right now, if you are collaborating with ten people and you need information or documents, you send e-mails, which clearly doesn’t scale.
- Poorly understood analysis tasks (by developers, anyway). In our experience, journalists don’t need big maps. We asked a lot of questions about what they do with the data. This is an open question to solve at this conference. Overview was set up to be a document set analytics platform. I feel that we succeeded, but it looks different than we thought; for instance, I thought topic modeling and document clustering were crucial, and they weren’t. The technologist’s mindset has this idea that we are going to build a tool like “the global map of everything”, but the journalists we work with usually have a specific thing they are looking for. Not big maps, but specific information. We’ve asked ourselves what reporters do after they look for the information.
- Poorly understood workflows: think about the unsexy stuff. Where are the files stored? What would it take to get them into a central repository? Who is going to enter the metadata? What formats does source data come in? Where is the data stored? How well does the OCR work? This is stuff you can only learn from intensive user testing.
Three categories of solutions:
- Collaboration models - how to get people working together
- Interoperability APIs and formats - how the software works together
- Requires APIs + standardization
- Workflow definitions: what should the software do?
Sustainability
There is a model that puts bread in technologists’ and journalists’ mouths forever.
- Why are we writing code? What’s wrong with (every) commercial system?
- Grants are not a great long-term funding model
- What’s the problem with Palantir, or any other commercial spinoffs? What business models could sustain this work?
Thoughts
- Drew Sullivan: Technology is going to drive the form of journalism in the future; we have to talk with journalists and work together.
- Miguel Paz: Turn this issue into a real thing.
- Drew: Semi-structured interviews. Do user research and spend a lot of time on usability testing. It has to be a long-term commitment.
- Mar Cabra: A year and a half ago, ICIJ didn’t have a data “area”. Now half of the journalists work around it. We work directly with the users, and we actually get user feature requests all the time. Let’s try to keep them as a captive audience and develop while we have one.
- Drew: That’s actually a solution: have users.
- Smari McCarthy: More focus on generalized, general-purpose tools that are meaningful and not so specific.
- Paul Radu: Thinking a little bigger: building a general-purpose tool that could also work for specific purposes.
- Mar: We work with investigative journalists and end users all the time, so we could work on keeping them captive.
- Chris Taggart: As a journalist your main interest is reporting. Why would I spend a day wondering why I should do something to make it more interesting to other people?
- Miguel: Journalists are the same type of user you’ll have in a startup: “I don’t need you / why would I need you?” You need to collaborate. What impact will this have on your work life? One strategy is to work as a team with developers, so it is clear that they are helping you. Hubs: we’ve been building hubs, where we ask people to be present.
- Paul: What Chris was describing is the old journalist. At OCCRP impact is really small; one reason is that we are a small group. The startup approach, I think, is what’s wrong; we should think about collaboration (there are stories for everyone).
- Adam Hooper: Now that we are talking about user research: OpenRefine is a widely used tool, but it hasn’t been updated for years. The tools that we use now are part of bigger programs (tradition?). The business model, I guess, is to take something that’s done and get it to work better. But it works anyway.
- Drew: It’s difficult to convert our journalistic/reporting knowledge into a model. We can develop tools and try to get money, but that takes you away from the reporting.
- Jonathan Stray: There’s a commercial version of OpenRefine being developed.
- Blaine Cook: Open source is important if you want collaboration. I’m all for developing tools, but how do we build the Google Maps (instead of the OpenStreetMap)?
- Mar: Let’s also think about the content. Journalists don’t care about the technology behind their work. Also, around sustainability, what could be useful would be a micropayment model that could enable journalists to do their work.
- Drew: How many investigative journalists are out there? 2,000? (Some people agree)
- Eva Constantaras: I think we have to be careful not to develop only for a niche of investigative journalism. There are a lot of people working in investigative journalism, maybe not full time, but they are making an effort.
- A lot of what you are doing is great, but it’s too much for investigative journalists. I’ve trained hundreds, and they can break a Google Spreadsheet.
- Drew: I think there are 2,000 investigative journalists (IJ) who could use a tool. At 100 dollars a month, if we get them all, it’s sustainable. These are not good numbers; maybe we should approach not full-time IJs, but other communities, such as activists.
- Adam: Also, if you get all 2,000 with one tool, what tool would that be?
- Notes: Think about how startups work around sales teams.
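Drew’s back-of-the-envelope estimate (2,000 potential users at 100 dollars a month) can be worked through in a few lines. This is only a rough sketch: the function name and the adoption rates below are illustrative assumptions, not figures discussed in the session. Only the 2,000 users and the 100-dollar fee come from the notes.

```python
def annual_revenue(potential_users: int, monthly_fee: float, adoption_rate: float) -> float:
    """Yearly subscription revenue if a given share of potential users pays."""
    if not 0.0 <= adoption_rate <= 1.0:
        raise ValueError("adoption_rate must be between 0 and 1")
    return potential_users * adoption_rate * monthly_fee * 12


POTENTIAL_USERS = 2000  # Drew's estimate of investigative journalists who could use a tool
MONTHLY_FEE = 100       # dollars per user per month, as suggested in the discussion

# Adoption rates are hypothetical; "if we get them all" corresponds to 1.0.
for rate in (1.0, 0.5, 0.1):
    revenue = annual_revenue(POTENTIAL_USERS, MONTHLY_FEE, rate)
    print(f"{rate:.0%} adoption: ${revenue:,.0f} per year")
```

Even full adoption yields about 2.4 million dollars a year, which is why the discussion turns to audiences beyond full-time investigative journalists.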