October 22, 2013

BOUNDLESSINFORMANT only shows metadata

(Updated: January 23, 2017)

The day before yesterday, the French paper Le Monde broke a story saying that NSA is intercepting French telephone communications on a massive scale. This is mainly based upon a graph from the BOUNDLESSINFORMANT program, which shows that during one month, 70.3 million telephone data records of French citizens were recorded by the NSA.

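Here, it will be clarified that the BOUNDLESSINFORMANT tool only shows numbers of metadata records. Some screenshots will also be analysed, showing information about collection related to France, the Netherlands, Germany, Spain, Norway, Afghanistan, Italy and the WINDSTOP program.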




Metadata

As the Le Monde article, written by Jacques Follorou and Glenn Greenwald, failed to clarify the exact nature of the 70.3 million, it was unclear whether this number was about metadata or also about the content of phone calls. Combined with some sensationalism, this led to headlines like U.S. intercepts French phone calls on a 'massive scale'.

But this is incorrect. According to a presentation and a FAQ document, the BOUNDLESSINFORMANT tool is for showing the collection capabilities of NSA's Global Access Operations (GAO) division, which is responsible for intercepts from satellites and other international SIGINT platforms.

The program presents this information through counting and analysing all DNI (internet) and DNR (telephony) metadata records passing through the NSA SIGINT systems.

This means that all figures shown in the BOUNDLESSINFORMANT screenshots are about metadata and not about content. It is unclear how many phone calls are represented by the numbers of metadata records, but it's likely far fewer.

So for France, we only know for sure that NSA collected 70.3 million metadata records and not how many phone calls were actually intercepted in the sense of recording the call contents.

It should also be noted that BOUNDLESSINFORMANT is apparently only showing metadata collected by the GAO division. Therefore, data gathered by NSA's other main Signals Intelligence divisions, SSO (for collection from commercial companies) and TAO (for collection by hacking networks and computers), may not be included in the charts and the heat maps.


UPDATE #1:
On October 29, 2013, the Wall Street Journal reported that according to US officials, the metadata records for France and Spain were not collected by the NSA, but by French and Spanish intelligence services. The metadata were gathered outside their borders, for example in war zones, and then shared with NSA. This confirms the explanation of the numbers of German metadata, given by Der Spiegel on August 5.

UPDATE #2:
On October 30, 2013, Glenn Greenwald published a statement claiming that his original reports, saying that NSA massively collected data in foreign countries, are still correct.

UPDATE #3:
On February 5, 2014, the Dutch interior minister revised his earlier statement from October by declaring that the 1.8 million "Dutch" metadata records were actually collected from foreign sources by the Dutch military intelligence service MIVD in order to support military operations abroad.

CONCLUSION:
This means that the initial interpretation of the BOUNDLESSINFORMANT charts showing that NSA intercepted phone calls of European citizens is not correct. Instead they show metadata which were collected by European intelligence agencies for military purposes and subsequently shared with partner agencies like NSA.



France

Below is a screenshot from BOUNDLESSINFORMANT that shows information about collection from France between December 10, 2012 and January 8, 2013. In total, almost 70.3 million metadata records were collected:


The bar chart in the top part shows the numbers by date, with DNR (telephony) in green and DNI (internet) in blue. In this case only telephony metadata were collected, so we only see green bars.

In the lower part of the screenshot we see three sections with breakdowns for "Signal Profile", "Most Volume" and "Top 5 Techs".

Signal Profile

The Signal Profile section shows a pie chart which can show the following types of communication:

- PCS: Personal Communications Service (mobile phone networks)
- INMAR: INMARSAT (satellite communications network)
- MOIP: Mobile communications over IP
- VSAT: Very Small Aperture Terminal
- HPCP: High Power Cordless Phone
- PSTN: Public Switched Telephone Network
- DNI: Digital Network Intelligence (internet data)

In this case, the majority of the signals are from PCS or mobile phone networks (dark blue) and a minor fraction from the Public Switched Telephone Network (dark yellow).

Most Volume

This section shows that all French metadata during the one-month period were collected by a facility designated US-985D. This SIGAD is seen here for the first time, and Le Monde has no further information either, except for the suggestion that it's from a range of numbers corresponding to the NSA's third party partners.

As the French metadata are all collected from mobile and traditional telephone networks, they may have been intercepted with the help of a (foreign or even French) telecommunications provider. In that case, it's possible that the metadata are from French phone numbers which are used by foreign targets (see Germany below).

Top 5 Techs

The techniques used for these interceptions appear under the codenames DRTBOX and WHITEBOX, which are disclosed here for the first time. Le Monde wasn't able to provide any more details about these programs or systems, but if we compare the numbers collected by these programs with the pie chart under Signal Profile, it seems likely that DRTBOX (which collected 89% of the data) accounts for the big PCS part of the pie chart, and WHITEBOX (11%) for the small PSTN part.



The Netherlands

Almost immediately after Le Monde came out with their story on October 20, 2013, the Dutch IT website Tweakers.net noticed that the German magazine Der Spiegel had published a similar screenshot about collection from the Netherlands in early August:


In this case we only have the top part, with a bar chart showing that during a one-month period, about 1.8 million telephony metadata records were collected from the Netherlands.

Again, this number is only about metadata, and therefore it doesn't tell us how many phone calls, let alone how many phone numbers were possibly involved.

The report by Tweakers.net was correct in explaining that the chart only shows metadata, but unfortunately, the headline initially said "NSA intercepted 1.8 million phonecalls in the Netherlands". This gave many people, including politicians, the idea that NSA was actually eavesdropping on a vast number of Dutch phone calls, which is not what the chart says, and which is also probably not what NSA is doing.

UPDATE:
On February 5, 2014, the Dutch interior minister and the defense minister came out with an official statement saying that the 1.8 million metadata records, as shown in the aforementioned screenshot, were actually collected from foreign sources by the Dutch military intelligence service MIVD in order to support military operations abroad. A few days later, NRC Handelsblad also published the complete BOUNDLESSINFORMANT chart, including the lower sections.

> Latest details about the chart: BOUNDLESSINFORMANT: metadata collection by Dutch MIVD instead of NSA

> The whole story: Dutch government tried to hide the truth about metadata collection



Germany

On July 29, the German magazine Der Spiegel published a screenshot from BOUNDLESSINFORMANT which shows information about collection from Germany between December 10, 2012 and January 8, 2013. In total, more than 552 million metadata records were collected:


The bar chart in the top part shows the numbers by date, with DNR (telephony) in green and DNI (internet) in blue.

Signal Profile

In case of Germany, the pie chart shows that the communication systems are roughly divided into:

- 40% PCS (mobile communications)
- 25% PSTN (traditional telephony)
- 35% DNI (internet traffic)

Most Volume

This section shows that all German metadata were collected by two facilities, designated by the following SIGADs:

- US-987LA (471 million records)
- US-987LB (81 million records)

In a follow-up article by Der Spiegel from August 5, the German foreign intelligence agency BND said that it collected the 552 million metadata and believed "that the SIGADs US-987LA and US-987LB are associated with Bad Aibling and telecommunications surveillance in Afghanistan". Bad Aibling is a small town in Southern Germany which had a huge listening post during the Cold War, which was also part of the ECHELON system. In 2004, the listening post was moved to a smaller facility nearby.

According to Der Spiegel, the BND collects metadata from communications which it had placed under surveillance and passes them, in massive amounts, on to the NSA. BND says that it's operating within German law and doesn't spy on German citizens. Therefore, Der Spiegel suggests that the data are only technically acquired in Germany, but are actually about foreign targets.

However, this explanation would only make sense if those foreigners were contacting (or using) German phone numbers and e-mail addresses, because otherwise there would be no reason for NSA to count their metadata as being German.

Top 5 Techs

The techniques used for these interceptions appear under the following codenames:

- XKEYSCORE (182 million records or 33% of the total of 552 million)
- LOPERS (131 million records or 24%)
- JUGGERNAUT (93 million records or 17%)
- CERF CALL MOSES1 (39 million records or 7%)
- MATRIX (8 million records or 1.4%)

(The record numbers don't add up to the total of 552 million. Apparently there are more, smaller systems involved than the 5 shown here.)
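The percentages in this list can be re-derived from the record counts, which also shows how much of the total is not covered by the Top 5. Below is a minimal Python sketch that uses only the rounded figures quoted in this post; the variable and function names are illustrative and not taken from any NSA document.

```python
# Record counts in millions, as read from the Germany screenshot (rounded).
TOTAL_MILLIONS = 552
top5_millions = {
    "XKEYSCORE": 182,
    "LOPERS": 131,
    "JUGGERNAUT": 93,
    "CERF CALL MOSES1": 39,
    "MATRIX": 8,
}


def share(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


for name, count in top5_millions.items():
    print(f"{name}: {count} million = {share(count, TOTAL_MILLIONS):.1f}%")

# Whatever the Top 5 does not account for must come from other systems.
top5_sum = sum(top5_millions.values())
remainder = TOTAL_MILLIONS - top5_sum
print(f"Top 5 together: {top5_sum} million")
print(f"Not shown in Top 5: {remainder} million = "
      f"{share(remainder, TOTAL_MILLIONS):.1f}%")
```

With these rounded inputs, roughly 99 million records, or about 18% of the total, would come from systems that are not listed in the Top 5.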

If we compare these percentages with the pie chart showing the signal profiles, we see that XKEYSCORE corresponds to the DNI or internet metadata. XKEYSCORE is a tool used for indexing and analysing internet data, and therefore it's possible that the other programs mentioned in the Top 5 Techs section are likewise not for collecting data, but for processing and analysing them.

According to Der Spiegel, LOPERS is a system to intercept the public switched telephone network. Indeed, the approximately 24% of the data collected by LOPERS fits the PSTN part of the pie chart.

This leaves the other three programs, and also those not mentioned in this Top 5, being used for data from mobile communication networks. Der Spiegel confirms this for JUGGERNAUT, but we can assume this for CERF CALL MOSES1 and MATRIX too.
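This comparison between the Top 5 Techs and the Signal Profile can be written out as a small consistency check. The Python sketch below uses the approximate percentages given in this section. The mapping of XKEYSCORE to DNI and LOPERS to PSTN follows the reasoning in this post, and attributing everything else to mobile networks is an assumption, not a confirmed fact.

```python
# Approximate shares in percent, taken from the pie chart description.
signal_profile = {"PCS": 40, "PSTN": 25, "DNI": 35}

# Shares of the two systems whose signal type is stated in the text.
tech_share = {"XKEYSCORE": 33, "LOPERS": 24}

# Assumed mapping of systems to signal types, following the reasoning above.
assumed_type = {"XKEYSCORE": "DNI", "LOPERS": "PSTN"}

explained = {signal: 0 for signal in signal_profile}
for tech, pct in tech_share.items():
    explained[assumed_type[tech]] += pct

# Assumption: everything not handled by XKEYSCORE or LOPERS is mobile (PCS).
explained["PCS"] = 100 - sum(tech_share.values())

for signal, pie_pct in signal_profile.items():
    diff = explained[signal] - pie_pct
    print(f"{signal}: pie chart {pie_pct}%, attributed {explained[signal]}%, "
          f"difference {diff:+d} points")
```

The differences stay within a few percentage points. That is consistent with a rough reading of the pie chart, but it does not prove the mapping.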



Spain

In the print edition of the Spanish paper El Mundo from October 28, 2013, there was the following screenshot from BOUNDLESSINFORMANT showing information about collection from Spain between December 10, 2012 and January 8, 2013. In total, 60 million metadata records were collected:


(screenshot via koenrh)

The various parts of this figure are the same as described above, so here we only look at the specifics for Spain.

Signal Profile / Most Volume

All records were collected from mobile communications networks (PCS) and this was done through an unknown facility designated by the following SIGAD:

- US-987S (60 million records)

This SIGAD is very similar to the ones used for collecting the German data (US-987LA and US-987LB) and it's assumed they stand for 3rd party facilities, that is, collection sites run by 3rd party partner agencies of NSA. It is also rather similar to US-985D, which collected the French metadata.

Top 5 Techs

All records were processed or analysed by only one system or program:

- DRTBOX (60 million records)

In the screenshot about France, we saw DRTBOX also being used for handling (meta)data derived from mobile communication networks, so we can assume this system is not specifically used for French communications, but for traffic from mobile communication systems in general.

DRTBOX

As almost all NSA codenames are (composed of) real words, it looks like DRTBOX is a spelling error, but a reader of this weblog pointed to another, very interesting option: DRT is also the abbreviation of Digital Receiver Technology, Inc. of Germantown, Maryland, which was taken over by US military contractor Boeing in 2009.

This makes it quite likely that the intercept devices of DRT are also used by NSA for collecting data from mobile communication networks. This equipment might then be installed at facilities with designators like US-987S and others. DRTBOX (or DRT Box) itself seems to be a system for processing the collected data or an interface for analysing them, just as XKEYSCORE is for collected internet data.

> See for more about DRT: DRTBOX and the DRT surveillance systems



Norway

On November 19, the website of the Norwegian tabloid Dagbladet published the following screenshot from BOUNDLESSINFORMANT which shows information about collection from Norway between December 10, 2012 and January 8, 2013. In total, over 33 million metadata records were collected:


Once again, only telephony metadata were gathered, so we see only green bars in the bar chart.

Signal Profile / Most Volume

All records were collected from mobile communications networks (PCS), which was done through an unknown facility designated by the following SIGAD:

- US-987F (33 million records)

After US-987L for Germany and US-987S for Spain, US-987F is now the third known SIGAD starting with US-987, which indicates that this is an umbrella-designator for collection facilities in or targeted at different countries, each designated by a different letter.
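The idea of an umbrella designator with a per-country suffix can be illustrated by splitting the SIGADs seen so far into their parts and grouping them by base number. The Python sketch below is only an illustration. The assumed layout (letters, hyphen, base number, optional suffix) is inferred from the examples in this post and is not an official specification of the SIGAD format.

```python
import re
from collections import defaultdict

# Assumed layout: letters, a hyphen, a base number, then an optional suffix
# that starts with a letter (e.g. US-987LA -> "US", "987", "LA").
SIGAD_PATTERN = re.compile(r"^([A-Z]+)-(\d+)([A-Z][A-Z0-9]*)?$")


def split_sigad(sigad):
    """Split a SIGAD string into (prefix, base number, suffix)."""
    match = SIGAD_PATTERN.match(sigad)
    if match is None:
        raise ValueError(f"unexpected SIGAD format: {sigad!r}")
    prefix, base, suffix = match.groups()
    return prefix, base, suffix or ""


# SIGADs from the France, Germany, Spain and Norway screenshots.
seen = ["US-985D", "US-987LA", "US-987LB", "US-987S", "US-987F"]

suffixes_by_base = defaultdict(list)
for sigad in seen:
    prefix, base, suffix = split_sigad(sigad)
    suffixes_by_base[f"{prefix}-{base}"].append(suffix)

for base, suffixes in sorted(suffixes_by_base.items()):
    print(base, suffixes)
```

Grouped this way, US-987 carries the suffixes LA, LB, S and F, while US-985 only has D, which is the pattern the umbrella-designator reading relies on.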

Following the interpretation of former Guardian journalist Glenn Greenwald, the Norwegian paper Dagbladet wrote that NSA monitored 33 million Norwegian phone calls. This was almost immediately corrected by the Norwegian military intelligence agency Etterretningstjenesten (or E-tjenesten), which said that they collected the data "to support Norwegian military operations in conflict areas abroad, or connected to the fight against terrorism, also abroad" and that "this was not data collection from Norway against Norway, but Norwegian data collection that is shared with the Americans".

This explanation is very similar to the one given by the German foreign intelligence agency about the metadata which appeared as being 'German' (see above), but here too the question is on what grounds these data are counted as being Norwegian. If we follow the BOUNDLESSINFORMANT FAQ document, at least one end of the communication should be a Norwegian phone number.
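To make that reading of the FAQ concrete, here is a minimal Python sketch of binning telephony records into a country by the calling code of a normalized phone number. This is purely a hypothetical illustration and not NSA's actual logic. The record layout, the function names and the phone numbers are invented; only the country calling codes are real.

```python
# Country calling codes (ITU-T E.164) for the countries discussed in this post.
CALLING_CODES = {
    "33": "France",
    "31": "Netherlands",
    "49": "Germany",
    "34": "Spain",
    "47": "Norway",
    "39": "Italy",
    "93": "Afghanistan",
}


def country_of(number):
    """Return the country for a normalized '+<digits>' number, or None."""
    digits = number.lstrip("+")
    for code, country in CALLING_CODES.items():
        if digits.startswith(code):
            return country
    return None


def counts_for(record, country):
    """True if at least one end of the call uses a number from `country`."""
    ends = (country_of(record["caller"]), country_of(record["callee"]))
    return country in ends


# Hypothetical metadata records; the numbers are invented for illustration.
records = [
    {"caller": "+93700000001", "callee": "+93700000002"},  # Afghan to Afghan
    {"caller": "+93700000003", "callee": "+4740000000"},   # Afghan to Norwegian
    {"caller": "+4790000000", "callee": "+4740000001"},    # Norwegian to Norwegian
]

norwegian = [r for r in records if counts_for(r, "Norway")]
print(len(norwegian))
```

Under this reading, a call between two Afghan numbers would not be counted as Norwegian, even if it was intercepted by a Norwegian agency, which is exactly why the grounds for the country attribution remain an open question.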

Top 5 Techs

All records were processed or analysed by only one system or program:

- DRTBOX (33 million records)

Also in this case, the DRTBOX system is used for the communications collected from mobile networks, just like we saw in the BOUNDLESSINFORMANT screenshots about France and Spain.



Afghanistan

On November 22, the Norwegian tabloid Dagbladet published a screenshot from BOUNDLESSINFORMANT about Afghanistan:



This screenshot about Afghanistan, published by Glenn Greenwald, only shows information about some 35 million telephony (DNR) records, collected by a facility only known by its SIGAD US-962A5 and processed or analysed by DRTBOX. But this number is just a tiny fraction of the billions of records from both internet and telephone communications from Afghanistan as listed in the global overview map of BOUNDLESSINFORMANT.

Afghanistan is undoubtedly being monitored by numerous SIGINT collection stations and facilities (like US-3217, codenamed SHIFTINGSHADOW which targets the MTN Afghanistan and Roshan GSM telecommunication companies), so seeing only one SIGAD in this screenshot proves that it can never show the whole collection from that country.

> See for more: Screenshots from BOUNDLESSINFORMANT can be misleading



Italy

On December 6, the website of the Italian newspaper L'Espresso published the following screenshot from BOUNDLESSINFORMANT which shows information about collection from Italy between December 10, 2012 and January 8, 2013. In total, almost 46 million metadata records were collected:


Once again, only telephony metadata were gathered, so we see only green bars in the bar chart.

Signal Profile / Most Volume

All records were collected from mobile communications networks (PCS), which was done through an unknown facility designated by the following SIGAD:

- US-987A3005 (45.9 million records)

This SIGAD is once again from the US-987 series, and comparing this one with the others suggests that the suffix A could stand for Italy. After the A follows an unusually long number (3005). So far, only SIGADs with two additional characters were known.

Top 5 Techs

- DRTBOX (45.9 million records)

After the screenshots related to France, Spain, Norway and Afghanistan, this one about Italy is the fifth in which the technique used to process and analyse the collected (meta)data is DRTBOX, a system made by DRT Inc. to intercept wireless communication signals.




WINDSTOP

On November 4, the Washington Post published a screenshot from BOUNDLESSINFORMANT which shows information about collection under the WINDSTOP program. Between December 10, 2012 and January 8, 2013, more than 14 billion metadata records were collected:


The bar chart in the top part shows the numbers by date, with DNR (telephony) in green and DNI (internet) in blue.

According to the Washington Post, WINDSTOP is an umbrella program for at least four collection systems which are jointly operated by NSA and one or more signals intelligence agencies of the 2nd Party countries Britain, Canada, Australia and New Zealand.

Signal Profile

The pie chart shows that more than 95% of the metadata are collected from internet traffic (DNI), while less than 5% are from mobile networks (PCS).

Most Volume

This section shows that under WINDSTOP, the metadata were collected by at least the following two facilities, designated by their SIGADs:

- DS-300 (14100 million records)
- DS-200B (181 million records)

In a sidenote, the Washington Post says that DS-300 is the SIGAD for an interception facility which is also known under the codename INCENSER. With 14 billion internet metadata records in one month, INCENSER seems to be one of NSA's major internet collection programs, given that for March 2013, the total of internet metadata collected worldwide was 97 billion records. For now, it's unclear where this enormous amount of data comes from.

DS-200B is a facility codenamed MUSCULAR, which is used for tapping the cables linking the big data centers of Google and Yahoo outside the US. This intercept facility is located somewhere in the United Kingdom and operated by GCHQ and NSA jointly. MUSCULAR collected some 181 million records, a small number compared to the 14 billion of INCENSER, but still far too many given their low intelligence value, according to NSA's Analysis and Production division.

It's interesting to see data from MUSCULAR mentioned in this screenshot, because a FAQ document about BOUNDLESSINFORMANT from 2010 said that no metadata from MUSCULAR were counted by this tool. But as this chart shows records from December 2012 and January 2013, it seems that metadata from MUSCULAR have been added in the meantime.
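The proportions mentioned for INCENSER and MUSCULAR can be put into numbers. The Python sketch below uses only figures quoted in this post. Note that the 97 billion worldwide total is for March 2013, while the WINDSTOP chart covers December 2012 to January 2013, so the first ratio is an order-of-magnitude estimate at best; the constant names are illustrative.

```python
# Figures quoted in this post, in millions of metadata records.
INCENSER_DS300 = 14_100          # one month, December 2012 to January 2013
MUSCULAR_DS200B = 181            # same period
GLOBAL_DNI_MARCH_2013 = 97_000   # worldwide internet metadata, March 2013


def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


# The periods differ, so this first ratio is only a rough estimate.
incenser_vs_global = percent(INCENSER_DS300, GLOBAL_DNI_MARCH_2013)
muscular_vs_incenser = percent(MUSCULAR_DS200B, INCENSER_DS300)

print(f"INCENSER vs. global DNI total: {incenser_vs_global:.1f}%")
print(f"MUSCULAR vs. INCENSER: {muscular_vs_incenser:.1f}%")
```

On these figures, INCENSER alone would be in the region of one seventh of a month's worldwide internet metadata, while MUSCULAR amounts to little more than one percent of INCENSER's volume.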

Top 5 Techs

The programs used for processing and analysing these interceptions are:

- XKEYSCORE (14100 million records)
- TURMOIL (141 million records)
- WEALTHYCLUSTER (1 million records)

Just like we saw in the chart about the German metadata, the internet (DNI) data are processed by the XKEYSCORE tool. Almost all these internet data are collected by the facility designated DS-300 and codenamed INCENSER.

TURMOIL is a database or a system which is part of the TURBULENCE program, and seems to be used for selecting and storing common internet encryption technologies, so they can be exploited by NSA. If we compare the numbers, we see that TURMOIL is used for processing most of the data collected by DS-200B or MUSCULAR. An NSA presentation confirms that data collected by MUSCULAR are ingested and processed by TURMOIL.

WEALTHYCLUSTER is also related to the TURBULENCE program and is described as "a smaller-scale effort to hunt down tips on terrorists and others in cyberspace" and is said to have helped find members of al-Qaida.

(Updated with the information about the German metadata, the new explanation by the Wall Street Journal, the WINDSTOP metadata, the data from Spain and Norway, and the revised interpretation of the Dutch chart)

Update:
During a hearing of the German parliamentary investigation commission on January 19, 2017, former BND president Schindler said that the BOUNDLESSINFORMANT charts that Snowden took were from training course material. This was said here for the first time and given the problems these charts caused for BND, it's possible that they asked NSA for more details, after which this explanation came up. However, this still doesn't explain why the charts were interpreted incorrectly.



Links and Sources
- HuffingtonPost.com: NSA: Europeans Did Spying, Handed Data To Americans
- Dagbladet.no: NSA-files repeatedly show collection of data «against countries» - not «from»
- VoiceOfRussia.com: Denmark admits to tapping phones in conflict zones abroad
- Wall Street Journal: U.S. Says France, Spain Aided NSA Spying
- Cryptome.org: Translating Telephone metadata records to phone calls
- The Week: Why the NSA spies on France and Germany
- Le Monde: France in the NSA's crosshair : phone networks under surveillance
- Tweakers.net: NSA onderschepte in maand metadata 1,8 miljoen telefoontjes in Nederland
- De Correspondent: Wat doet de NSA precies met het Nederlandse telefoonverkeer?
- Der Spiegel: Daten aus Deutschland

6 comments:

Rob1 said...

Thanks for the analysis. That's seriously lacking in the newspapers.

Regarding the SIGAD, US-985D is also close to US-983, US-984 and US-990, all corporate partners.

Given that the only kind of data provided by US-985D is mobile phone and public switched phone metadata, maybe it is a telephony corporation ?

I have a hard time figuring out how it could be a third party -- or more honestly, I'm scared by what it could mean.

Anonymous said...

Boundless reports Metadata , but since these are not US citizens , there's no content restrictions .

they collect and analyze metadata because it 'cheap' and it reveals networks . they only bother with content for terrorists , criminals , and financial and social movers for purposes of intelligence , interdiction , and entrapment .

the game doesn't change , only the means

Bauke Jan Douma said...

You're rather apologetic of the NSA.
Metadata IS data.

P/K said...

Of course, metadata are data, but metadata are not the same as the content of phone calls. The latter is what people think of if you use words like 'intercepting'.

Anonymous said...

Regarding the German metadata, you write: "According to Der Spiegel, the BND collects metadata from communications which it had placed under surveillance and passes them, in massive amounts, on to the NSA. (..) Therefore, Der Spiegel suggests that the data are only technically acquired in Germany, but are actually about foreign targets. (...) However, this explanation would only make sense if those foreigners were contacting (or using) German phone numbers and e-mail addresses, because otherwise there would be no reason for NSA to count their metadata as being German."

Could the explanation just be that Boundless Informant just shows all information that is collected from German sigads? All this data is counted as 'German' metadata. This includes for instance sigad 1: info that a German intel agency collected from German soil, sigad 2: metadata the NSA collected on German soil?

P/K said...

That's an interesting theory, and I think we shouldn't rule out that option.

My remark was based upon the BOUNDLESSINFORMANT FAQ document, which says on page 2: "In order to bin the records into a country, a normalized phone number (DNR) or an administrative region atom (DNI) must be populated within the record". That seems to indicate that it's a phone number which is used to pin a record to a country, either the number of the caller or the number of the receiver.
