This is a hell of a read:
www.wired.com
Contact chaining on a scale as grand as a whole nation’s phone records was a prodigious computational task, even for Mainway. It called for mapping dots and clusters of calls as dense as a star field, each linked to others by webs of intricate lines. Mainway’s analytic engine traced hidden paths across the map, looking for relationships that human analysts could not detect. Mainway had to produce that map on demand, under pressure of time, whenever its operators asked for a new contact chain. No one could predict the name or telephone number of the next Tsarnaev. From a data scientist’s point of view, the logical remedy was clear. If anyone could become an intelligence target, Mainway should try to get a head start on everyone.
“You have to establish all those relationships, tag them, so that when you do launch the query you can quickly get them,” Rick Ledgett, the former NSA deputy director, told me years later. “Otherwise you’re taking like a month to scan through a gazillion-line phone bill.” And that, right there, was where precomputation came in. Mainway chained through its database continuously—“operating on a 7x24 basis,” according to the classified project summary. You might compare its work, on the most basic level, to indexing a book—albeit a book with hundreds of millions of topics (phone numbers) and trillions of entries (phone calls). One flaw in this comparison is that it sounds like a job that will be finished eventually. Mainway’s job never ended. It was trying to index a book in progress, forever incomplete. The FBI brought the NSA more than a billion new records a day from the telephone companies. Mainway had to purge another billion a day to comply with the FISA Court’s five-year limit on retention. Every change cascaded through the social graph, redrawing the map and obliging Mainway to update ceaselessly.
Mainway’s purpose, in other words, was neither storage nor preparation of a simple list. Constant, complex, and demanding operations fed another database called the Graph-in-Memory.
When the Boston marathon bombs exploded in April 2013, the Graph-in-Memory was ready. Absent unlucky data gaps, it already held a summary map of the contacts revealed by the Tsarnaev brothers’ calls. The underlying details—dates, times, durations, busy signals, missed calls, and “call waiting events”—were easily retrieved on demand. Mainway had already processed them. With the first hop precomputed, the Graph-in-Memory could make much quicker work of the second and the third.
To keep a Tsarnaev graph at the ready, Mainway also had to precompute a graph for everyone else. And if Mainway had your phone records, it also held a rough and ready diagram of your business and personal life.
This is an interesting bit of history. Unfortunately, we're all still living in it.

Inside the NSA’s Secret Tool for Mapping Your Social Network
Edward Snowden revealed the agency’s phone-record tracking program. But thanks to “precomputed contact chaining,” that database was much more powerful than anyone knew.
Contact chaining on a scale as grand as a whole nation’s phone records was a prodigious computational task, even for Mainway. It called for mapping dots and clusters of calls as dense as a star field, each linked to others by webs of intricate lines. Mainway’s analytic engine traced hidden paths across the map, looking for relationships that human analysts could not detect. Mainway had to produce that map on demand, under pressure of time, whenever its operators asked for a new contact chain. No one could predict the name or telephone number of the next Tsarnaev. From a data scientist’s point of view, the logical remedy was clear. If anyone could become an intelligence target, Mainway should try to get a head start on everyone.
“You have to establish all those relationships, tag them, so that when you do launch the query you can quickly get them,” Rick Ledgett, the former NSA deputy director, told me years later. “Otherwise you’re taking like a month to scan through a gazillion-line phone bill.” And that, right there, was where precomputation came in. Mainway chained through its database continuously—“operating on a 7x24 basis,” according to the classified project summary. You might compare its work, on the most basic level, to indexing a book—albeit a book with hundreds of millions of topics (phone numbers) and trillions of entries (phone calls). One flaw in this comparison is that it sounds like a job that will be finished eventually. Mainway’s job never ended. It was trying to index a book in progress, forever incomplete. The FBI brought the NSA more than a billion new records a day from the telephone companies. Mainway had to purge another billion a day to comply with the FISA Court’s five-year limit on retention. Every change cascaded through the social graph, redrawing the map and obliging Mainway to update ceaselessly.
Mainway’s purpose, in other words, was neither storage nor preparation of a simple list. Constant, complex, and demanding operations fed another database called the Graph-in-Memory.
When the Boston marathon bombs exploded in April 2013, the Graph-in-Memory was ready. Absent unlucky data gaps, it already held a summary map of the contacts revealed by the Tsarnaev brothers’ calls. The underlying details—dates, times, durations, busy signals, missed calls, and “call waiting events”—were easily retrieved on demand. Mainway had already processed them. With the first hop precomputed, the Graph-in-Memory could make much quicker work of the second and the third.
To keep a Tsarnaev graph at the ready, Mainway also had to precompute a graph for everyone else. And if Mainway had your phone records, it also held a rough and ready diagram of your business and personal life.
This is an interesting bit of history. Unfortunately, we're all still living in it.