The internet may have eaten some newspapers’ lunch, but it’s also blessed them with something extraordinary: data journalism.
However, not many people understand what that actually involves or how important it is. So here’s all you need to know about data journalism (to begin with).
What’s data journalism?
Data journalism is a marriage of a reporter’s nose for news with a statistician’s love for data analysis. By analyzing the massive data sets that have been made available through widespread connectivity, data journalists use data to uncover and tell stories.
This could mean that, after an enormous dump of documents, data journalists scour the repository using digital tools to quickly find interesting anomalies and turn them into stories. It might mean writing about changes in polling data, or, as has become recently popular, examining why polls don’t work as well as we want them to. Or it could mean analyzing raw data to gather evidence that supports a story, just like reporters use quotes and public documents to support stories.
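To make "finding interesting anomalies" a little more concrete, here is a minimal sketch in Python. It is not a tool that any particular newsroom is known to use. It assumes the data has already been boiled down to a plain list of numbers, and it flags outliers with a modified z-score based on the median, which is one common statistical choice among many. The function name, the threshold and the sample figures are all invented for illustration.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Return the positions of values that sit unusually far from the median.

    Uses the modified z-score (0.6745 * |x - median| / MAD). The 3.5 cutoff
    is a conventional default, not a rule: treat it as an assumption to tune.
    """
    med = median(values)
    # MAD: the median of the absolute deviations from the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values are identical, so the score is undefined.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]

# Invented monthly expense claims, for illustration only.
claims = [410, 395, 430, 402, 388, 2950, 415, 399]
print(flag_anomalies(claims))  # prints [5], the position of the 2950 claim
```

A flagged value is only a lead. The reporting still has to establish whether the outlier is an error, a legitimate exception or a story.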
Regardless of the exact form it takes, modern data journalism is possible thanks to the magic of the internet and computers. As the internet has become ubiquitous, we’ve seen data of all types become broadly available and easily accessible. What was once locked in paper records is now publicly searchable, a Control-F away from revealing its secrets.
How do reporters use data?
Some things are secrets because information is restricted. But some secrets hide in plain sight, buried beneath vast troves of disconnected data. Using the tools of statisticians, computer programmers and investigative reporters, data journalists endeavor to uncover these secrets, lifting the blanket of obscurity.
There’s a huge range of stories that are possible only now, thanks to data becoming more accessible. Accessibility might seem like a small improvement, but it has been nothing short of revolutionary. Just think about how long it takes to find something in an index versus searching a PDF: that change alone moves data from “technically available” to “actually useful.”
Today, unfunded bloggers and deadline-focused journalists can examine in five minutes data that would have taken a full day at the Hall of Records. That speed has created a brand new method of internet-friendly journalism with interactive charts, searchable data sets and innovative storytelling tech.
The most basic application of data journalism is to empower investigative journalists by providing huge, searchable records that help them uncover new connections. Think of the Panama Papers, which contained 2.6 TB of data and 11.5 million documents. That trove was useful only because of custom software built specifically for analyzing these records, and investigating it properly required a team of journalists using forensic data techniques.
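What makes a pile of records "searchable" can be sketched in a few lines of Python. The example below builds a simple inverted index, a lookup table from each word to the documents that contain it. It is a toy illustration of the general idea, not the software used on the Panama Papers, and the document IDs and text are invented.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map every word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the sorted IDs of documents containing every word in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)

# Invented documents, for illustration only.
documents = {
    "memo-001": "Transfer to offshore holding company approved",
    "memo-002": "Quarterly report for the holding company",
    "memo-003": "Offshore account opened in March",
}
index = build_index(documents)
print(search(index, "offshore company"))  # prints ['memo-001']
print(search(index, "holding"))           # prints ['memo-001', 'memo-002']
```

Building the index takes one pass over the records. After that, each search is a handful of set lookups rather than a fresh read of every document, which is the difference between an afternoon in the archives and an answer in seconds.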
More pedestrian data stores take on new life when presented to the public in a searchable format. Take what the Telegraph did when analyzing expenditures by members of Parliament. Using software, reporters at the Telegraph. And the. Without these investigations, crucial revelations might have remained obscure for months, if not permanently.
Sometimes, it’s about presenting the data in a new, useful way by creating visual aids that explain the impact of dry statistics. The Wall Street Journal illustrated the effectiveness of vaccines with one such visualization, dramatically showing how well vaccines drive out disease. Interactive projects like the St. Louis Post-Dispatch’s “Build a New St. Louis” challenge users to re-district St. Louis while saving money. Even dust-dry data can become fascinating when paired with a map: Shipmap.org tracks the publicly available activity of container ships, revealing the vast web of maritime routes that tie together our planet-scale economy.
Why is it useful?
The increase in the public and governmental use of the internet also means that more documents about how our government operates now exist. This means that leakers and whistleblowers now have a treasure trove of information they can draw on. Whatever you might think of the morality of Edward Snowden’s actions, there is no question that he could not have smuggled out that kind of data in the days of paper records. Sites like WikiLeaks bear testament to this, leaking the private communications of public officials on a daily basis. The paper trail is now digital, and it’s deeper than ever.
Beyond opening up enormous sources of information, readily available data makes it easier to fact-check statements quickly. If President Trump, for example, claims that unemployment numbers have never been lower, it’s trivially easy to search the Department of Labor’s statistics and verify that claim. Ironically, this hasn’t seemed to reduce the number of lies told by our President.
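Once the numbers are in hand, a "never been lower" claim reduces to a simple comparison. The Python sketch below shows the logic under loud assumptions: the figures are invented placeholders, not real Department of Labor statistics, and the function name is made up for this example. A real check would load the published series instead.

```python
def has_never_been_lower(series, period):
    """Check a "never been lower" claim for one period.

    `series` maps a period (here, a year) to a value. The claim holds if no
    earlier period has a value below the one being checked. A tie still
    counts as "never lower".
    """
    current = series[period]
    return all(
        current <= value
        for earlier, value in series.items()
        if earlier < period
    )

# Placeholder rates, invented for illustration. NOT real statistics.
rates = {2015: 5.0, 2016: 4.6, 2017: 4.9, 2018: 4.7}

print(has_never_been_lower(rates, 2018))  # prints False: 2016 was lower
print(has_never_been_lower(rates, 2016))  # prints True
```

The comparison is the easy part. Most of the work in a real fact-check goes into picking the right series, the right time span and the right definition of the number being quoted.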
With the rise of data journalism, we’ve seen a new breed of news aggregation and tracking site emerge. Websites like Trump Today and Google Trends combine the techniques of data journalism and web aggregation to surface trending keywords and visualize their impact. With these insights, reporters and news consumers alike can quickly evaluate today’s trending topics. The research required to do the same thing even twenty years ago would have been staggering.
Why should you care?
If you’re not a journalist or a news fanatic, it might be hard to muster up any interest in data journalism. It might be transforming the news industry, but that probably won’t change your life. But as an intelligent news consumer, you should at least be aware of how data journalism functions.
A critical news consumer needs to evaluate the source of information whenever possible. In a world overflowing with fake news, a critical eye towards the media is invaluable for any citizen. And to correctly assess a claim, you need to understand it. By understanding the techniques that journalists might use to search through huge data stores, you can better understand whether information from those sources is trustworthy.
Part of the media’s job is holding government accountable, and data journalism provides new opportunities to keep government honest. With massive data troves available for the public to scour, journalists can surface stories about the problems our society faces, evaluating potential solutions and keeping politicians accountable to the truth.
In the right hands, data can be incredibly powerful, holding politicians accountable and revealing hidden truths about our world.