What I learned about digital news archives

April 22, 2015 0 comments

Before I changed the focus of my Knight journalism project (more on that in the coming weeks), I spent the first six months of my fellowship learning a lot about the state of digital news archives. In fact, this was my original innovation proposal for my Knight Fellowship application.

TL;DR: How can we preserve and analyze digital news archives to better cover our communities?

I interviewed dozens of historians, archivists, librarians, journalists and executives who care about preserving the news, but no one has quite figured it out. Here are a few of the challenges, and naturally the opportunities, for journalists and news organizations to consider:

Historical vs. digital preservation. In the past, archiving the news was relatively straightforward. Newspapers, radio and TV broadcasts were a frozen moment in time. Once they were published, or hit the airwaves, they could no longer be altered. Companies, such as LexisNexis, ProQuest, Factiva, and MerlinOne, and memory institutions, like the National Digital Newspaper Program at the Library of Congress, are refining the process of digitizing physical newspapers, but at least there’s a process. When it comes to born-digital content, it’s a bit more complicated.

The nature of digital is dynamic. Stories are constantly updated and delivered in a variety of formats. At what point do we preserve them? Is it when they’re first published online? Or later when the news dies down and the story is more complete? How do you capture tweets, Vines, Instagrams, interactives, links, comments, ads, surveys, quizzes, and other types of content?

A few organizations have taken a stab at it. The Internet Archive’s Wayback Machine, for instance, takes snapshots of millions of webpages over time so you can see what the NYT’s homepage looked like in 1996 compared to what it is now. But you can’t search for specific news stories. NewsDiffs is a neat tool that tracks changes in articles from the NYT, WaPo, CNN, Politico and BBC, but it has yet to include videos, photos and other forms of multimedia. Wikipedia addresses the Internet’s revisionist tendencies by documenting user edits in its “view history” tab. Then there’s the Knight Foundation-backed Digital Public Library of America, which is digitizing and visualizing historical collections, but those collections aren’t centered on news. If journalism’s mission is to document our lives, how can we preserve our journalism?

Newsroom culture and priorities. In the past eight months, the San Francisco Bay Guardian, GigaOm, the Bold Italic and Homicide Watch DC (just to name a few) have ceased to exist. The financial conundrum of running a news organization is real. Newsroom leaders are still wrestling with how to make journalism sustainable, and hopefully profitable, but the drive to measure an immediate ROI doesn’t allow for experimentation and discovery, especially when it comes to reimagining news archives. It encourages newsrooms to revert to what they know and accept things the way they are. The culture needs to change.

I’m not the only one who believes this, but a key competitive advantage legacy news organizations have over digital news startups is the depth of their institutional knowledge. Local journalists have been covering their beats and communities for decades, producing stories, photos and other forms of multimedia all along the way. That’s a lot of data, which, if structured correctly, could be valuable to reporters and residents alike. How can we leverage that inherent strength? The NYT’s Cooking collection is a taste (pun intended) of how to surface, showcase and monetize archive stories. The LAT developed a similar recipes section too. Legacy news organizations are sitting on a trove of content that could evolve into a range of potential products, but it requires a shift in newsroom culture and priorities to create or adopt something that never existed before.

Structured journalism. Championed by Reg Chua, the executive editor at Reuters, and Bill Adair, the creator of PolitiFact, structured journalism is a movement “to change the way we create content so as to maximize its shelf-life, as well as structuring – as much as possible – the information in stories, at the time of creation, for use in databases that can form the basis of new stories or information products.” Essentially, how can we rethink how we produce stories and present them in different ways? Spaceprob.es, Emergent and Event Registry are just a few projects that have been mentioned on the structured journalism listserv. Why does this matter? How we structure our stories is connected to the value we can derive from our archives. Imagine if we could navigate through our own content in visual ways. How could that help editors make more informed decisions about news coverage? How quickly could reporters learn a new beat, historically contextualize their coverage, and generate new story ideas?
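To make that idea a little more concrete, here is a minimal sketch in Python of what capturing a story's facts as structured records at the time of creation might look like. Everything in it is an assumption made for illustration: the `StoryEvent` fields, the `group_by_beat` helper and the sample records are all invented, and none of it comes from an actual newsroom system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryEvent:
    """One fact captured in structured form when a story is filed (hypothetical schema)."""
    date: str      # ISO date string, e.g. "2014-06-10", so plain sorting is chronological
    beat: str      # coverage area
    place: str     # neighborhood or city
    headline: str


def group_by_beat(events):
    """Index events by beat, each list in date order, so a beat's history can be scanned."""
    index = defaultdict(list)
    for event in sorted(events, key=lambda e: e.date):
        index[event.beat].append(event)
    return dict(index)


# Made-up sample records, for illustration only.
ARCHIVE = [
    StoryEvent("2014-06-10", "housing", "Eastside", "Rent board hears eviction appeals"),
    StoryEvent("2013-11-02", "transit", "Downtown", "Bus route cut draws protest"),
    StoryEvent("2013-01-15", "housing", "Eastside", "Council debates rent cap"),
]

timeline = group_by_beat(ARCHIVE)
for event in timeline["housing"]:
    print(event.date, event.headline)
```

Once stories carry fields like these, the same records could feed a timeline, a map or a beat primer for a new reporter without anyone re-reading the original articles.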

So, what’s next? I’m collaborating with a co-conspirator, Tiago Etiene, a programmer based in SF, who’s equally interested in reaching out to news organizations and testing our hypothesis. We believe digital news archives are a source of untapped data and a natural competitive advantage for news organizations, but their full potential has yet to be realized. We want to build a tool that can help journalists leverage their institutional knowledge.

If you’re a news outlet that’s game for experimenting (at no financial cost), please reach me at yleow [at] stanford [dot] edu. 

And if you’re a designer who nerds out about news, history or data visualizations, please shoot me a note.

The more we test in this space, the more we’ll learn. The Knight Foundation has been actively funding libraries in an effort to “build more knowledgeable communities,” and it’s no coincidence that they’re investing in institutions dedicated to preserving the past. The Educopia Institute is hosting a conference, Dodging the Memory Hole II: An Action Assembly, on May 11-12, 2015 at UNC to bring together news publishers, press associations, technologists, researchers, libraries, corporations and funding agencies to tackle the challenge of preserving digital news content.

My time at Stanford officially wraps up on June 5, but it’s not over. If there’s anything I learned this year, it’s that this project, like all worthwhile ideas, is a constant work in progress.

A special thanks to:
James Robinson, New York Times
Evan Sandhaus, New York Times
Liz McClure, JSK fellow 2012
Jeremy Hay, JSK fellow 2015
Michael Morisy, JSK fellow 2015
Zena Barakat, JSK fellow 2015
Donna Borak, JSK fellow 2015
Christina Passariello, JSK fellow 2015
Akoto Ofori-Atta, JSK fellow 2015
Anne Kornblut, JSK fellow 2015
Charla Bear, JSK fellow 2015
Leigh Pointinger, San Jose Mercury News
Carlos DelaSerna, JSK fellow 2014
Michelle Price, Associated Press
Amy Wang, Arizona Republic
Tom Huang, Dallas Morning News
Michelle Holmes, Alabama Media Group
Frank Shyong, LAT
Matt Stevens, LAT
Paolo Carretta, Universidade de São Paulo
Andy Waters, Columbia Daily Tribune
Clea Benson, Bloomberg News
Peter Rippon, BBC
Cary Schneider, LAT
Ted Han, DocumentCloud
Trei Brundrett, Vox Media
Logan McClure, Palantir
Miguel Paz, Poderopedia
Steve Jones, SF Bay Guardian
Mark Bieschke, SF Bay Guardian
Anne Wooton, Pop-up Archive
Bailey Smith, Pop-up Archive
Kathleen Hansen, University of Minnesota 
Nora Paul, University of Minnesota
Edward McCain, Donald W. Reynolds Institute
Victoria McCargar, Mount St. Mary’s College
David Hansen, UNC Chapel Hill
Jonathan Kalan, Timeline
Heather Corcoran, Colloq
Zach Kaplan, Colloq
Paul Quinn, Minezy
T. Christian Miller, ProPublica
Deborah Thomas, Library of Congress
Kenny Whitebloom, Digital Public Library of America
David Riordan, New York Public Library
Abigail Grotke, Library of Congress
Amy Rudersdorf, Digital Public Library of America 
Gretchen Gueguen, Digital Public Library of America 
Matt Galligan, Circa


Project Thunderdome One Year Later

April 20, 2015 0 comments

This was originally published on Poynter on Friday, April 17th, 2015. Read the rest of the feature along with our lessons learned.

“In our industry right now, journalists get laid off. Newsrooms shut down. Publications fold. But what happened to most of Thunderdome’s people is different from other stories we’ve told one year later for a few reasons. To start, most of Thunderdome’s staff lived and worked in a media-rich city and the end of Thunderdome was quite public. Also, they all lost their jobs, not just some staff or even a whole department. The nature of the work they did — adapting quickly, testing, learning, failing and starting again, meant a lot of them were both attractive to other news organizations and that they knew how to pivot and keep moving. And for many of them, the shaky state of the journalism industry is their normal. They’ve been laid off before.

“As hard as situations like this are, they really give you the opportunity to do a gut check about your career path and about what’s important to you,” Tomlin said. “I think each one of us had to do some soul-searching about where we ‘fit’ within the industry after this experience. And while we were all scrambling to find work, most of us wanted to find jobs that left us feeling both as challenged and fulfilled as the ones we were leaving.”

Here’s a look at how seven of the journalists from Project Thunderdome have managed since their final day one year ago.


Yvonne Leow doesn’t quite remember how she felt on that last day. It wasn’t nostalgic — they’d all had plenty of chances to say goodbye. Maybe it was more a sense of excitement for what might come next.

She’d never been laid off before and didn’t know what to expect, but about two weeks after her last day at Thunderdome, she was chosen as a John S. Knight fellow at Stanford. Six months before the Thunderdome layoffs, Leow’s dream job would have been to work as an interactive editor.

“I would have loved to be producing Snow Fall after Snow Fall,” she said.

Now, she wants to understand the business side of journalism. She’s interested in messaging apps and companies that bring together journalism and tech, such as Medium. Leow doesn’t know exactly what her future looks like, but she wants to get her hands dirty.

“I would like to know how we make money,” she said. “I think overall as an industry, we don’t spend enough time trying to know that or learn that.”

Knight Journalism Challenge: Part 1

My Knight Journalism Challenge

November 16, 2014 0 comments

Local newspapers have been documenting our lives for centuries. Decades of institutional knowledge are buried in antiquated archival systems or in the minds of journalists who eventually retire or leave the industry. French political thinker Alexis de Tocqueville once said, “When the past no longer illuminates the future, the spirit walks in darkness.” I would like to develop a way to preserve and analyze digital archives to help journalists better cover their communities.

How would solving this problem help journalism?

In a marketplace where virality and page views are valued over analysis and context, newspapers are losing their identity. They report on towns that are rarely covered by national media; meanwhile their newsrooms continue to shrink. It is time for that to change. By preserving and potentially visualizing years of institutional knowledge and news coverage, we could help reporters and editors deepen their understanding of their communities and strengthen the role newspapers have within them.

Who is tackling a similar problem and how is your approach different?

Data analysis has primarily been a way to tell stories in journalism. But there’s more we can do internally to store and illustrate our own institutional knowledge to help make smarter coverage decisions. A professor at Columbia University recently created MedLEE (Medical Language Extraction and Encoding System), which extracts medical information from past patient reports and visualizes it. There are all kinds of organizations that use natural language processing (NLP) and data visualization techniques. Palantir Technologies is a data analytics company that tackles problems like climate change and cybersecurity.

What are the first questions you plan to pursue?

What types of digital information would be most helpful to preserve and access in local newsrooms? What are the best NLP and machine learning techniques to organize and extract data from digital news archives? What are the most effective ways to visualize that information?

What are the first steps you plan to take in working on your challenge?

Talk to reporters and editors in local newsrooms to learn about their newsgathering process. Learn from researchers and experts who are tackling the same problem in academia, finance, libraries and other industries. Research current archival systems and the latest developments in NLP, machine learning and data mining. Design a rough prototype.

Yvonne Leow’s bio. Have suggestions or questions about this challenge? Email yleow@stanford.edu.

Knight Journalism Fellowship at Stanford

May 10, 2014 0 comments

Stanford's campus.

Last week I learned I was one of twelve U.S. fellows to be invited to participate in the John S. Knight Journalism Fellowship at Stanford. The news was overwhelming. As I listened to Jim Bettinger over the phone, my face could do nothing but smile. My mind flooded with joy, my heart brimmed with gratitude. Ever since I submitted my application in early January and interviewed with Knight program officers in late March, we have all been cautiously waiting for the final decision. I write “we” because, like everything else in my life, there’s no way I could have done this on my own.

From the very beginning, Mandy Jenkins, Robyn Tomlin and Jim Brady were pillars of support. When my ideas and self-confidence began to falter during the application process, Raghu Vadarevu and Harry Lin put their names on the line and helped me keep it together. Former and current Knight Fellows Andy Donohue, Martin Kotynek, Latoya Petersen and Shazna Nessa were incredibly generous with their advice, and it still amazes me how Frank Shyong, Lam Thuy Vo, Desiree Li, Brian Hernandez, Eric Olander and so many others I’m grateful to call friends put up with my anxious G-chats and incoherent ramblings about the misty future of newspapers. I would not be embarking on this adventure if it were not for them. They deserve all of my thanks.

Now here comes the fun part. What will I be working on at Stanford this fall? My proposal is to develop a tool that analyzes and visualizes archive stories to help journalists better report and cover their communities. Former Thunderdome colleague Adrienne LaFrance eloquently explains why yesterday’s news may be far more important than we think. She asks, “how can news organizations expect anyone to find their stories valuable today if those same organizations are sending the message that their archives aren’t worth showcasing tomorrow?” I completely agree.

The concept is vague, broad and ridiculously open-ended right now, but it’s a start with more to come. There are others who are thinking about this too. If you are one of them, let’s collaborate. If your newsroom is interested in experimenting with Knight, let’s talk. Because at the end of the day, ideas are just ideas; what makes them exciting and worthwhile is bringing them to life.

Beyond Project Thunderdome

April 3, 2014 5 comments

Yesterday reminded me of where I was two weeks ago. Burrowed into the corner of my window seat on a cross-country flight watching “You’ve Got Mail.” Yes. The Nora Ephron nineties rom-com starring Meg Ryan, Tom Hanks and AOL dial-up. I was particularly sappy that week so I couldn’t resist. As I was watching the film for the third or fourth time, this scene felt especially poignant.

For those who haven’t seen it (why?!), here’s some context. Mega bookstore owner Joe Fox explains, and partially apologizes, to Kathleen Kelly for putting her family-owned bookstore out of business by saying “it wasn’t personal.” 

We’ve all heard that expression before. It’s synonymous with the realities of doing business. Buyouts, bankruptcies, takeovers, mergers, acquisitions, firings, layoffs, all of it can be absolved in one phrase. But this is Kelly’s response:

“What is that supposed to mean? I am so sick of that. All that means is that it wasn’t personal to you. But it was personal to me. It’s personal to a lot of people. And what’s so wrong with being personal, anyway?…Because whatever else anything is, it ought to begin by being personal.”

Who knew in-flight entertainment could be so profound.

What a good-lookin’ interactives team.

Yesterday, everyone learned Project Thunderdome, John Paton’s Digital First initiative, was imploding. Yes, it’s all over. Having listened to our CEO explain the strategic decision to shutter our 50-person newsroom, I have every reason to believe that Thunderdome’s demise was driven by economics. A company’s survival depends on the bottom line, and leaders have to make tough calls to preserve it. I get it, and I agree: the decision wasn’t personal.

Winning Thunderdome’s all-staff scavenger hunt was a career highlight. I will be trash talking for years.

But standing in the meeting, exchanging glances of empathy with my peers, seeing tears well up in some of their eyes, I realized that’s not entirely true. Most if not all of us joined Thunderdome because we believed in it. Wholeheartedly. It was a job, sure, but we spent nearly two years aspiring to accomplish something greater. And in that time we got to know each other’s stories. Where we grew up, our loved ones, our favorite beers, our quirky Internet obsessions. I mean, I know every editor’s leadership-personality traits, for crying out loud. We did not all bond with one another the same way, but Thunderdome always felt like more than just a newsroom; it felt like a team.

While filming Kitchen Pop, David informed us that “There is only one way to peel a potato,” and without missing a beat, Courtney replied “You’re full of shit.” Yea, that’s my team.

Looking back, I’m glad I struck up a conversation with Robyn Tomlin at the 2012 UNITY convention. I’m thankful she introduced me to Jim Brady, who vetted me for the role. I’m grateful Mandy Jenkins invited me to come along for the ride. Thunderdome rekindled my love for journalism, and I feel fortunate to have had the chance to work with everyone at DFM. (A special thanks to Tim Rasmussen and my scrappy video team, David Freid and Courtney Wells, for also reminding me why we do what we do, especially when I needed to hear it most.)

My last day at DFM is April 17. While being unemployed is far from ideal, I’m excited to see what’s next and I’m allowing myself to be open to the possibilities. Every city is up for grabs. Every opportunity is another journey. Every newsroom a potential second home. Despite the circumstances, I’m happy to be a journalist right now. The level of empathy and support from the broader community is astounding. Visual storytelling combined with data journalism has never been more innovative. And I can’t even count the number of companies launching digital initiatives and experimenting with new business models.

Our industry is alive, messy and evolving. There’s no better time to be a bullish optimist. So here we are. My newsroom “imploded,” but we’re all walking away from the rubble far more experienced at building a digital newsroom from scratch, wiser for having tried and more resilient for having survived. And I’m confident we will. Because it’s not just the story of Project Thunderdome that strived to accomplish something greater. It’s the spirit of journalism.

Project Thunderdome circa 2012. My how far we’ve come.

Drone School

January 9, 2014 0 comments

We’ve been thinking a lot about drones recently at Thunderdome. Tom Meagher, our data editor, and I, his sidekick, lead a committee to establish guidelines for how and where our journalists can fly drones. The FAA is still trying to figure out how to regulate drone flights, so the legalities are murky. For instance, it is illegal to fly drones out in the open, but flying them inside a university gym is perfectly fine. So that’s what we did. Thanks to David Freid for producing our Drone School video.
