Local newspapers have been documenting our lives for centuries. Decades of institutional knowledge are buried in antiquated archival systems or in the minds of journalists who eventually retire or leave the industry. French political thinker Alexis de Tocqueville once said, “When the past no longer illuminates the future, the spirit walks in darkness.” I would like to develop a way to preserve and analyze digital archives to help journalists better cover their communities.
How would solving this problem help journalism?
In a marketplace where virality and page views are valued over analysis and context, local newspapers are losing their identity. They report on towns that national media rarely cover, even as their own newsrooms continue to shrink. It is time for that to change. By preserving and potentially visualizing years of institutional knowledge and news coverage, we could help reporters and editors deepen their understanding of their communities and strengthen the role newspapers play within them.
Who is tackling a similar problem and how is your approach different?
In journalism, data analysis has primarily been a way to tell stories. But there’s more we can do internally to store and illustrate our own institutional knowledge and make smarter coverage decisions. Other fields offer models. A professor at Columbia University created MedLEE (Medical Language Extraction and Encoding System), which extracts medical information from past patient reports and visualizes it. Many other organizations use natural language processing (NLP) and data visualization techniques; Palantir Technologies, for example, is a data analytics company that tackles problems like climate change and cybersecurity. My approach would apply these techniques to newsrooms’ own archives.
What are the first questions you plan to pursue?
What types of digital information would be most helpful to preserve and access in local newsrooms? What are the best NLP and machine learning techniques to organize and extract data from digital news archives? What are the most effective ways to visualize that information?
What are the first steps you plan to take in working on your challenge?
Talk to reporters and editors in local newsrooms to learn about their newsgathering process. Learn from researchers and experts who are tackling the same problem in academia, finance, libraries and other industries. Research current archival systems and the latest developments in NLP, machine learning and data mining. Design a rough prototype.