https://techinquiry.org/SiliconValley-Military/

Reports of a
Silicon Valley/Military Divide
Have Been Greatly Exaggerated

Jack Poulson <jack@techinquiry.org>
July 7, 2020

Overview

We performed an in-depth analysis of all public US federal (sub)contracting data over the last four and a half years to estimate the rankings of tech companies, both in and out of Silicon Valley, as contractors with the military, law enforcement, and diplomatic arms of the United States.

Inspired by Mijente's groundbreaking report on the US tech companies supporting the Immigration and Customs Enforcement (ICE) agency of the Department of Homeland Security (DHS), we also ranked contractors with the Justice Department, Department of Defense (DoD), State Department, Agency for International Development (AID), and Agency for Global Media (AGM).

We hope to address the alarmist claims from the Chairman of the Joint Chiefs of Staff and corporate executives that a tech company -- namely, Google -- electing to not directly contribute to weapons systems is "treasonous" and part of a divide between Silicon Valley and the US military that is a "national-security threat". [1, 2, 3]


We separately address:

  1. The implication that defense contracting is rare in Silicon Valley -- the capacity question, [4] and
  2. The observation that the Department of Defense appears to publicly criticize U.S. tech companies that facilitate authoritarian suppression of dissent only as a means of pressuring them into contracting with the Department of Defense -- the human rights question.


The Alleged Silicon Valley/Military Divide: Capacity

Our analysis shows a diversity of contracting postures (see Tables 2 and 3), not a systemic divide from Washington. Within a substantial list of namebrand tech companies, only Facebook, Apple, and Twitter look to be staying out of major military and law enforcement contracts. [5, 6]

The accusations of treason stemmed from Google's 2018 release of its worker-demanded AI Principles; because one of the principles involved a commitment to not building weapons systems, Google promised to not renew its contract providing custom-built drone object tracking AI for a Joint Artificial Intelligence Center (JAIC) pathfinder project (Maven). [7, 8]  Months later, the company cited potential conflicts with its AI Principles, along with its missing government certifications, as its motivation for walking away from the Joint Enterprise Defense Infrastructure (JEDI) cloud competition. The missing certifications, as well as the near-certainty that the award would go to Amazon or Microsoft, indicated that Google was not so much exiting federal cloud contracting as limiting embarrassment from a loss.

This intuition was later borne out through the company's deep involvement with the Department of Defense's efforts to include Artificial Intelligence / Machine Learning in the modernization of its battle networks: the Defense Innovation Initiative. [9]  Google's former CEO, Eric Schmidt, has chaired two of its components since their inception: the Defense Innovation Board (DIB) and the National Security Commission on AI (NSCAI). And, in May, Google Cloud poached the Executive Director of the DIB and then landed a secure cloud contract through its sibling, the Defense Innovation Unit (DIU). [10, 11]  On balance, Google's position became supporting the DoD's cloud and cybersecurity while avoiding direct contributions to weapons systems. [12]

Federal procurement data also suggests, contrary to popular narrative, that Palantir and Anduril are, financially speaking, modest players in the defense contracting space relative to Hewlett Packard (especially its spinoff, Perspecta), IBM, Microsoft, Dell Technologies, AT&T, Verizon, and Amazon. Even Accenture, through Accenture Federal Services, and Johns Hopkins University, through its Applied Physics Laboratory, rank substantially higher within the studied DoD and DHS agencies. [13]

We conclude that the US weapons and intelligence community dramatically overreacted to a prominent tech company democratically deciding to not contribute to weapons systems. [14]  In July 2019, as part of the Aspen Security Forum, Defense Innovation Unit director Michael Brown stated that coverage of Maven was "overblown" before pointing out that many tech companies, namely Amazon and Microsoft, understand that workplaces should not be democracies and that

"the place to exercise [concern over weapons systems] is at the ballot box and we need to support the government." [15, 16, 17]
The namebrand tech companies, universities, and tech-oriented traditional defense contractors, by and large, appear to be willing and able to modernize US battle networks. [18]


The Alleged Silicon Valley/Military Divide: Human Rights

We argue that two of the primary tech defense contractors, Microsoft and IBM, helped normalize their industry's suppression of human rights in exchange for market access. Despite the human rights messaging of Microsoft's head lawyer and lobbyist, the company has proactively suppressed dissent in Bing for more than a decade, and its subsidiary, LinkedIn, is infamous for doing the same; in early 2010 Bill Gates pointedly criticized Google for -- as it turns out, temporarily -- taking a principled stand for human rights. [19]  IBM's "safe city" products are known to have involved a video surveillance system for strongman Rodrigo Duterte in Davao City. And IBM still defends its then-CEO's having directly led sales of the cataloguing equipment to Nazi Germany that directly contributed to the Holocaust.

Similarly, Perspecta, a close Hewlett Packard affiliate/spin-off that dominates our chart of tech defense contractors, incorporates components of both Computer Sciences Corporation, which helped charter CIA rendition flights to secret prisons, and QinetiQ North America. QinetiQ, the privatization of the United Kingdom's former Defence Evaluation and Research Agency, has had on its board a former CIA director who personally authorized his agency's use of torture. [20]

Another tech defense contractor, Cisco, has been in court since 2013 for not only helping custom-build China's "Golden Shield" project -- commonly known as the "Great Firewall of China" -- but even purpose-building a censorship, video surveillance, and "forced conversion" module for suppressing a dissident religious minority. Some of Cisco's internal marketing materials mentioning this work even leaked the day before a Senate human rights hearing. And despite Google's 2010-2018 public stance against suppressing dissent, Google's former CEO, and primary interface to the DoD, Eric Schmidt, has defended complying with authoritarian demands since 2006. [21]

Companies and executives which comply with authoritarian demands are only openly criticized by the US national security community if they do not significantly interface with the DoD. When they are willing to partner with the DoD, as is the case with Eric Schmidt and Reid Hoffman, they are welcomed as leaders.

Data Availability

Alongside this report, we are officially releasing the initial version of our procurement and lobbying explorer: techinquiry.org/lobbying/. In addition to providing summaries of over one hundred thousand companies' procurement and lobbying behavior, and lists of heavy-hitter vendors for over 100 government agencies, it includes custom nearest-neighbor utilities for finding lesser-known analogues of contractors.

Our manual curation of the noisy FPDS data is also made public in JavaScript Object Notation (JSON) via:


And since this work was sparked by a need to understand federal procurement data well enough to properly submit Freedom of Information requests for contracts, we are also releasing the responses to our requests. They involve three tech companies (two based in Silicon Valley):


Lastly, we make public our lists of annotated subcontract award summaries, which we describe in detail in a later section.

Analyzing Direct Contracting

This project began with the submission of a large number of Freedom of Information (FOI) requests, through the Office of the Secretary of Defense, for contracts relating to Silicon Valley's role in the Defense Innovation Initiative. Intermediate responses made clear that at least a passing familiarity with the Federal Procurement Data System was required. And due to the extended wait times, high redaction rates, and low response rates, we decided that a detailed study of the hundreds of gigabytes of federal procurement awards might prove intrinsically valuable, in addition to improving our ability to target future FOI requests.

While Tech Inquiry is entirely made up of tech workers, roughly our first year was spent pointedly avoiding the common pitfall of assuming that change is best achieved by throwing up a website and writing software. But such an acknowledgement of the centrality of organizing and communications should not prevent detailed curation and analysis of datasets that allow us to study up, such as databases of corporate registrations, corporate lobbying and political contributions, or, in the case of this report, military and law enforcement contracting.

Given that the author previously built recommendation systems for a major tech company, one of our first questions was whether the FPDS award data was rich enough to allow for the automatic retrieval of answers to questions of the form: "Which companies contract like Palantir?". As it turns out, the answer is yes -- we discuss our system for producing nearest neighbors of FPDS vendors in a later section. But, through our own usage of our tool, we found the following, more basic, questions to typically be more important:

Of all of these questions, the last turns out to be the most critical -- and, by far, the most laborious -- for understanding tech defense contracting, and we delay its discussion to a later section. The others can be fairly quickly answered in each instance through the curation of a Postgres database of geocoded FPDS award data with an associated interface (in our case, techinquiry.org/lobbying/).


But it is helpful for us to take a step back and briefly review the context of US federal procurement data before diving into the structure of the associated data feed and how we extract answers to our above questions from it.

Overview of the Federal Procurement Data System and its Curation

For an in-depth overview of the roots of U.S. federal procurement, dating back to the Revolutionary War, we highly recommend The U.S. Federal Procurement System: An Introduction by Christopher R. Yukins. But this report only involves the Federal Procurement Data System (FPDS), which, since 2010, has been run by IBM through the General Services Administration. At roughly 1:30AM ET each day, anywhere from 10,000 to 100,000 new award modification summaries are made available through an ATOM feed provided by the official source for FPDS data, fpds.gov. [22]

While FPDS is the definitive source for US federal procurement data, it is known to have numerous shortcomings, such as inconsistencies and inaccuracies in award amounts, slow and incomplete uploads from contract officers (including 90-day delays for DoD procurement), and corrections frequently taking place years after the signing date. The Government Accountability Office (GAO) issued a report in late 2019 covering many such shortcomings.

One of the more delicate issues is that FPDS integrates the proprietary Data Universal Numbering System (DUNS), an assignment of a nine digit number to each business entity. Beyond accuracy and precision issues associated with DUNS, bulk dissemination of FPDS data is alleged to violate the intellectual property of the owner of the data, Dun & Bradstreet. [23]  As a result, we avoid overreliance on DUNS information (which includes some information on parent companies) and manually maintain our own vendor name normalizations and corporate parentage hierarchies.

Anatomy of an FPDS Award

To give a concrete example of what type of data exists in the FPDS ATOM feed, a boiled down version of the JSON equivalent of the October 2019 JEDI award, which went to Microsoft, is shown in Figure 1.

{
 "title": "New IDC HQ003420D0001 awarded to MICROSOFT CORPORATION for the amount of $0",
 "content": {
  "IDV": {
   "contractID": {
    "IDVID": {
     "PIID": "HQ003420D0001",
     "modNumber": "0"
    }
   },
   "relevantContractDates": {
    "signedDate": "2019-10-25 00:00:00",
    "effectiveDate": "2019-10-25 00:00:00",
    "lastDateToOrder": "2030-10-24 00:00:00"
   },
   "dollarValues": {
    "obligatedAmount": "0.00",
    "baseAndAllOptionsValue": "10000000000.00"
   },
   "totalDollarValues": {
    "totalObligatedAmount": "0.00",
    "totalBaseAndAllOptionsValue": "10000000000.00"
   },
   "purchaserInformation": {
    "contractingOfficeAgencyID": {
     "@name": "WASHINGTON HEADQUARTERS SERVICES (WHS)",
     "@departmentName": "DEPT OF DEFENSE"
    },
    "contractingOfficeID": { "@name": "WASHINGTON HEADQUARTERS SERVICES" },
    "fundingRequestingAgencyID": {
     "@name": "DEPT OF DEFENSE",
     "@departmentName": "DEPT OF DEFENSE"
    },
    "fundingRequestingOfficeID": { "@name": "DOD CIO" }
   },
   "contractMarketingData": {
    "websiteURL": "HTTPS://WWW.CLOUD.MIL",
    "whoCanUse": "DEFENSE",
    "emailAddress": "support@cloud.mil",
    "maximumOrderLimit": "1000000000.00",
   },
   "contractData": {
    "contractActionType": { "@description": "IDC" },
    "descriptionOfContractRequirement": "ENTERPRISE LEVEL, COMMERCIAL INFRASTRUCTURE AS A SERVICE (IAAS) AND PLATFORM AS A SERVICE (PAAS) TO SUPPORT DEPARTMENT OF DEFENSE BUSINESS AND MISSION OPERATIONS."
   },
   "vendor": {
    "vendorHeader": { "vendorName": "MICROSOFT CORPORATION" },
    "vendorSiteDetails": {
     "vendorLocation": {
      "streetAddress": "1 MICROSOFT WAY",
      "state": { "@name": "WASHINGTON", }
      "ZIPCode": {
       "@city": "REDMOND",
       "#text": "980528300"
      },
      "countryCode": { "@name": "UNITED STATES" },
      "phoneNo": "5713532701"
     },
     "vendorDUNSInformation": {
      "DUNSNumber": "081466849",
      "globalParentDUNSNumber": "081466849",
      "globalParentDUNSName": "MICROSOFT CORPORATION"
     }
    }
   },
   "competition": {
    "extentCompeted": { "@description": "FULL AND OPEN COMPETITION" },
    "solicitationProcedures": { "@description": "NEGOTIATED PROPOSAL/QUOTE" },
    "numberOfOffersReceived": "7"
   },
   "transactionInformation": {
    "createdBy": "ANGELA.DEREN.HQ0034@WHS.MIL",
    "createdDate": "2019-10-29 13:24:08",
    "lastModifiedBy": "HABERLACHJ",
    "lastModifiedDate": "2019-10-31 16:12:48",
    "status": { "@description": "FINAL" }
   }
  }
 }
}

Figure 1: A small, relevant subset of a JSON translation of the official FPDS XML (via the python xmltodict module) for the original award to Microsoft for the Joint Enterprise Defense Infrastructure (JEDI) contract.
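
As a minimal sketch of that conversion step (the filename and the exact nesting of the parsed dictionary are assumptions that depend on how an individual ATOM entry was saved), the translation from the official XML into the JSON used throughout this report looks roughly like:

import json
import xmltodict

# Parse a single saved FPDS ATOM entry (hypothetical filename).
with open("jedi_award.xml", "rb") as handle:
    entry = xmltodict.parse(handle)["entry"]

# Drill into the award payload using the field names shown in Figure 1.
idv = entry["content"]["IDV"]
print(idv["vendor"]["vendorHeader"]["vendorName"])
print(idv["dollarValues"]["baseAndAllOptionsValue"])

# Persist the JSON translation for later curation and indexing.
with open("jedi_award.json", "w") as handle:
    json.dump(entry, handle, indent=1)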

We can read off a number of useful conclusions from this -- again, truncated -- snippet:

  • The Procurement Instrument IDentifier (PIID) for JEDI is HQ0034-20-D-0001.
  • The original JEDI award was signed, and also made effective, on October 25, 2019, whereas the last modification to this original award was made on October 31, 2019 (and an email address associated with the modifier is provided).
  • There is no legally obligated amount, whereas the potential value of the contract, if all options become exercised, is $10B.
  • The award was contracted through Washington Headquarters Services but was requested by the DoD's Chief Information Officer (as it happens, later awards moved the requesting agency down to the Defense Information Systems Agency).
  • The official description of the award is to provide:
    Enterprise level, commercial Infrastructure as a Service (IAAS) and Platform as a Service (PAAS) to support Department of Defense business and mission operations.
  • Microsoft's listed address is:
    1 Microsoft Way
    Redmond, WA 98052-8300
    and their listed phone number is (571) 353-2701. While the address of large companies, such as Microsoft, is readily available, aggregated lists for obscure contractors are somewhat rare. This consistent source for company addresses is behind our ability to show maps of geocoded headquarters for the vast majority of federal contractors.
  • Microsoft's associated DUNS number is 081466849, and it has no listed parent company (technically, the award lists its parent as itself).
  • The award is claimed to have undergone a full and open competition that received 7 offers.
We underline that, on a typical day, between twenty and thirty thousand new award summaries are made public via the FPDS ATOM feed.


Three Measures of Contract Value

One of the significant complications in analyzing federal procurement is that there is no straightforward answer for how to measure money flow. One of the best such examples is the above JEDI award, which reports a minimum, or "obligated", award amount of zero, and a "base and all options" -- or "potential" -- value of ten billion dollars. That is to say, the minimum amount of the award was set to zero, the maximum amount was set to ten billion, and the actual value could be anywhere in between. This was pointedly explained by the Department of Defense's Chief Information Officer, Dana Deasy:

"People think JEDI is a 10-year, $10 billion contract. It’s not – not necessarily. While that’s the maximum value and duration of the contract, the Pentagon has the option to terminate it after two years. There’s another end-it-or-extend-it decision three years later, and a third three years after that. The minimum the winning contractor is guaranteed to get? Just $1 million over two years."


To understand a possible source for the million dollar floor on the contract value, we incorporate two further complications:

  • Awards often cannot be analyzed in isolation -- they can have both modification awards (using the same PIID) and subawards (which reference the original award's PIID).
  • There are not just lower and upper bounds on contract values -- the "obligated" amount and "potential", a.k.a. "base and all options" value -- but also a typically more accurate lower bound, called the "current", or "base and exercised options" value.



Figure 2: A subaward to the original JEDI award showing all three contract value types as one million dollars.

One of the several subawards to the original JEDI award is shown in Figure 2, with each of the three measures of contract value set to the same mysterious one million dollar floor value mentioned by Deasy. But we emphasize that one should not expect FPDS award data to always be correct: a 2017 Government Accountability Office report claimed that less than one percent of awards are fully consistent!

The takeaway should be that, when FPDS award data is consistent, the running totals of the contract values should satisfy the ordering:

    obligated <= current <= potential.

In several extreme cases, typically involving DoD contracts run through the GSA's Federal Acquisition Service (FAS), both the "obligated" and "current" contract values are zero, while the "potential" value is almost exactly one trillion dollars.

We therefore argue that it is a serious, qualitative mistake to pick any one of these three contract values as representative. Our approach is to recognize the noise and inconsistency in FPDS data and instead focus on companies' rankings within government agencies when sorted by contract values.

Which contract value should we sort by to determine rankings? We do so for all three, and then take the most significant of the three results -- that is, the highest ranking -- as a numerical indicator of influence of a company on a particular agency. To give a specific example: we measured IBM as the 2nd largest vendor with the DHS's Customs and Border Protection by obligated amount, 3rd largest by current values, and 19th largest by potential values. As a result, we set IBM's influence ranking with CBP as 2nd. By focusing on per-agency rankings, rather than dollar amounts, we help avoid coming to misleading conclusions due to systemic idiosyncrasies and imprecision in agency data.
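
A minimal sketch of this ranking rule, using hypothetical per-(vendor, agency) totals rather than our actual aggregated FPDS data, is:

from collections import defaultdict

# Illustrative totals per (vendor, agency): the three running sums of
# obligated, current, and potential values (hypothetical numbers).
totals = {
    ("ibm", "Homeland Security: CBP"): (9.1e8, 9.3e8, 1.1e9),
    ("dell", "Homeland Security: CBP"): (4.2e8, 4.4e8, 2.3e9),
}

def influence_ranks(totals):
    """Rank vendors within each agency by each of the three value measures,
    then keep the most significant (smallest) of the three ranks."""
    by_agency = defaultdict(list)
    for (vendor, agency), sums in totals.items():
        by_agency[agency].append((vendor, sums))
    influence = {}
    for agency, rows in by_agency.items():
        for measure in range(3):  # obligated, current, potential
            ordered = sorted(rows, key=lambda row: row[1][measure], reverse=True)
            for rank, (vendor, _) in enumerate(ordered, start=1):
                key = (vendor, agency)
                influence[key] = min(influence.get(key, rank), rank)
    return influence

print(influence_ranks(totals))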

Normalizing Company Names

Most questions that we would like to investigate involve aggregating large numbers of awards for each vendor, which therefore requires adoption of some form of company identifier. An obvious approach would be to set this to the DUNS number, but, for the accuracy and intellectual property reasons described above, we instead use a specific spelling of the business name (potentially qualified with the state of incorporation) and manually curate a list of corrections from encountered misspellings (possibly restricted to its pairing with a particular DUNS number).

While our JEDI award example properly provided Microsoft's (lowercased) name as "microsoft corporation", this is far from always the case. Our current normalization map includes corrections from the encountered misspellings: "microsoft corporation sitz in", "microsoft corporation sitz in redmond corporation", and "microsoftcorporation".

To demonstrate why we also allow for the restriction of vendor name modifications to particular DUNS numbers, we take the -- exceedingly complicated -- example of the major defense contractor Science Applications International Corporation (SAIC).

In 2013, the company decided that it could avoid legal barriers arising from conflicts between different contracts by splitting off its legacy national security unit. Rather than making the reasonable choice of giving the spun-off company a new name, because the spin-off would be maintaining legacy contracts, they gave the spin-off the name SAIC and renamed the parent company Leidos (taken from the middle of the word "kaleidoscope").

As a result of the SAIC corporate shell game, knowing when to rename an occurrence of "SAIC" to "Leidos" requires knowledge of an extensive list of DUNS numbers (it turns out, more than 60 in this case) to build custom normalization rules around. These rules, and more than 9000 others, can be found in our vendor normalization map.
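
The following sketch illustrates the shape of such rules; the dictionary layout and the placeholder DUNS number are illustrative, not the exact schema of the JSON normalization map released alongside this report:

# Illustrative structure for vendor name normalization: plain spelling
# corrections, plus corrections restricted to particular DUNS numbers.
VENDOR_CORRECTIONS = {
    "microsoftcorporation": {"name": "microsoft corporation"},
    "microsoft corporation sitz in": {"name": "microsoft corporation"},
    # Only rename "saic" to "leidos" for DUNS numbers tied to the legacy entity.
    "saic": {"name": "leidos", "duns": {"000000000"}},  # placeholder DUNS
}

def normalize_vendor(raw_name, duns=None):
    key = raw_name.strip().lower()
    rule = VENDOR_CORRECTIONS.get(key)
    if rule is None:
        return key
    if "duns" in rule and duns not in rule["duns"]:
        return key  # the correction does not apply to this DUNS number
    return rule["name"]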

Normalizing Requesting and Contracting Agencies

As was the case for vendor name spellings, we can substantially improve the accuracy and consistency of the listings for the contracting and requesting agencies for an award. Our above approach can be adapted via the analogy: DUNS is to vendor name as office name is to agency name. That is, we build a normalization map that corrects listed agency names and provide overriding exceptions for specific pairings of offices and agencies.

For example, in the Microsoft JEDI award of Figure 1, the requesting agency identifier sets both the "@name" (which typically refers to the agency level) and "@departmentName" (which typically refers to the parent department of the agency) to "DEPT OF DEFENSE". A reasonable default rule is to normalize such a pairing to a unique string associated with the DoD -- we standardized on "Defense".

But we have the side information that the requesting office name was given as "DOD CIO", which refers to the DoD's Chief Information Officer. We therefore override our agency normalization map to normalize to "Defense: OCIO" when the agency was simply listed as the DoD but the office name is "DOD CIO". This rule, along with more than 650 others, can be found in our current agency normalization map.
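
A sketch of this office-aware override (with the map layout being our own illustration rather than the released file's exact schema):

# Default normalizations keyed on (department, agency), with overrides keyed on
# (agency, office) for cases like the DoD CIO.
AGENCY_DEFAULTS = {("DEPT OF DEFENSE", "DEPT OF DEFENSE"): "Defense"}
OFFICE_OVERRIDES = {("DEPT OF DEFENSE", "DOD CIO"): "Defense: OCIO"}

def normalize_agency(department, agency, office):
    override = OFFICE_OVERRIDES.get((agency, office))
    if override is not None:
        return override
    return AGENCY_DEFAULTS.get((department, agency), agency)

print(normalize_agency("DEPT OF DEFENSE", "DEPT OF DEFENSE", "DOD CIO"))  # Defense: OCIO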

We note that there exist interesting agencies, such as the National Security Agency and the National Reconnaissance Office, which we have only observed through the requesting office rather than the requesting agency. And several other agencies, such as the Defense Intelligence Agency (DIA), are often obscured as simply being the Department of Defense in the agency field -- in the case of the DIA, we often made use of the office being an obvious reference to the Defense Attache System. We corrected many such improper coarsenings up to the Department of Defense via a manual review of the list of all occurring pairings of agencies and offices.

Mapping Out Parent Companies and Subsidiaries

The defense contracting space is notorious for its frequent mergers, acquisitions, spin-offs, and even more exotic exchanges, such as the spin-merger. For example, when we colloquially refer to "Hewlett Packard", we mean a network of four affiliates: the personal computer company HP Inc., the enterprise IT provider Hewlett Packard Enterprise (HPE), HPE's B2B spin-merge DXC Technology, DXC's spin-off of its public sector segment, Perspecta, and all of their subsidiaries.

Two of HPE's major subsidiaries include:

  • Hewlett Packard Enterprise Government, and
  • the supercomputer manufacturer Cray.

When we speak of the sum of contracts between HPE and a particular agency -- for example, the Defense Information Systems Agency -- we could either be referring to the contracts awarded directly to HPE (an "exclusive" count), or all awards to HPE, HPE Government, Cray, and its other subsidiaries (an "inclusive" count). For this report, we focus on inclusive counts.

As mentioned in our discussion of the JEDI award, FPDS DUNS information sometimes provides a hint for a single parent company. But, for the same reasons that we avoid overdependence on DUNS information for unique vendor identifiers, as well as the fact that many companies have multiple parents (e.g., all joint ventures), we manually curate our own JSON map of parent companies. Our current map contains more than 5000 parent links, which, in combination with our vendor and agency normalization maps, allows us to automate inclusive award sums for a large number of pairings of vendors and agencies.
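
A minimal sketch of the inclusive aggregation (the parent map entries here are illustrative; ours is the JSON map of more than 5000 parent links mentioned above):

# Compute "inclusive" award totals by walking a (possibly multi-parent) map
# from subsidiaries up to their parents.
PARENTS = {  # hypothetical entries
    "hpe government": ["hewlett packard enterprise"],
    "cray": ["hewlett packard enterprise"],
}

def ancestors(vendor):
    """All companies that should absorb this vendor's awards, including itself."""
    seen, frontier = {vendor}, [vendor]
    while frontier:
        for parent in PARENTS.get(frontier.pop(), []):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen

def inclusive_totals(award_rows):
    """award_rows: iterable of (vendor, agency, obligated_amount) tuples."""
    totals = {}
    for vendor, agency, amount in award_rows:
        for company in ancestors(vendor):
            key = (company, agency)
            totals[key] = totals.get(key, 0.0) + amount
    return totals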

As another layer of complication, we consider HP, HPE, and Perspecta to be affiliates, not parents or subsidiaries of each other. And so when we compute influence rankings of the nebulous "Hewlett Packard" network, we use the maximum influence ranking between HP, HPE, and Perspecta. In almost all cases, this corresponds to Perspecta's influence rank.

Logistics of Data Retrieval and Indexing

At this point, we are ready to describe the construction of a table of direct contracting influence rankings, but we take the opportunity to add in the logistical details of efficiently retrieving, searching, and summarizing the federal procurement data.

As mentioned in the FPDS ATOM feed FAQ, each pull from the ATOM feed is limited to retrieving only ten records, but up to ten simultaneous requests are allowed -- which allows up to 100 award records to be in flight at once. Given that days with 30,000 to 100,000 award modifications are not unusual, performing multithreaded retrievals leads to very significant savings (often, from an hour down to a few minutes). And while the FAQ recommends 9AM ET retrievals, we find that the data is typically available around 1:30AM ET.
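
A rough sketch of such a multithreaded pull, assuming the public feed endpoint at fpds.gov and its ten-record paging (the query syntax, parameter names, and record limits below should be checked against the official ATOM feed FAQ rather than taken as definitive):

import concurrent.futures
import requests
import xmltodict

FEED_URL = "https://www.fpds.gov/ezsearch/FEEDS/ATOM"  # public feed endpoint

def fetch_page(query, start):
    """Fetch one ten-record page of the ATOM feed at the given offset."""
    params = {"FEEDNAME": "PUBLIC", "q": query, "start": start}
    response = requests.get(FEED_URL, params=params, timeout=60)
    response.raise_for_status()
    parsed = xmltodict.parse(response.text)
    entries = parsed["feed"].get("entry", [])
    return entries if isinstance(entries, list) else [entries]

def fetch_day(date_string, max_records=100000, batch=10):
    """Pull a day's award modifications with up to ten requests in flight."""
    query = f"LAST_MOD_DATE:[{date_string},{date_string}]"  # assumed query syntax
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=batch) as pool:
        for start in range(0, max_records, batch * 10):
            offsets = range(start, start + batch * 10, 10)
            pages = pool.map(lambda s: fetch_page(query, s), offsets)
            chunk = [entry for page in pages for entry in page]
            results.extend(chunk)
            if len(chunk) < batch * 10:  # ran out of records for the day
                break
    return results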

One can get an idea of the storage requirements for mirroring and indexing the entire FPDS database by reading the USASpending.gov database guide: more than 1.5TB of hard disk space are required for their system. Getting access to this much space on DigitalOcean would cost $500 per month, which is out of the price range of our small, grassroots non-profit.

We therefore adopt a two-tiered approach: we store an entire copy of the raw FPDS data (converted into JSON via python's xmltodict module) on a private workstation and export salient subsets to CSV for loading into a Postgres database with full-text search indices. We then expose an interface to the indices via our public website, techinquiry.org/lobbying/, which is a straightforward combination of Express.js, jQuery, Pug, and Leaflet.
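
A condensed sketch of the second tier (the table schema and column choices are illustrative; only the full-text index over the award descriptions is essential to the searches described in this report):

import csv
import psycopg2

# Export the salient subset of each award to CSV (fields are illustrative).
with open("awards.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["piid", "vendor", "agency", "description", "obligated"])
    # ... one row per award modification ...

# Load the CSV into Postgres and add a full-text index on the description.
connection = psycopg2.connect(dbname="procurement")  # connection details assumed
with connection, connection.cursor() as cursor:
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS awards (
            piid TEXT, vendor TEXT, agency TEXT, description TEXT, obligated NUMERIC);
    """)
    with open("awards.csv") as handle:
        cursor.copy_expert("COPY awards FROM STDIN WITH CSV HEADER", handle)
    cursor.execute("""
        CREATE INDEX IF NOT EXISTS awards_description_fts
            ON awards USING GIN (to_tsvector('english', description));
    """)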

Addresses for companies are retrieved from the procurement awards and geocoded via a combination of geocodio (for U.S. addresses) and Nominatim (for international addresses).
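
For the international half of that pipeline, a minimal Nominatim lookup can be sketched as follows (the geocodio half for U.S. addresses is analogous but requires an API key; Nominatim's usage policy asks for a descriptive User-Agent and modest request rates):

import requests

def geocode_nominatim(address):
    """Geocode a single vendor address via the public Nominatim API."""
    response = requests.get(
        "https://nominatim.openstreetmap.org/search",
        params={"q": address, "format": "json", "limit": 1},
        headers={"User-Agent": "techinquiry-procurement-geocoder"},
        timeout=30,
    )
    response.raise_for_status()
    results = response.json()
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])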

Our lists of "similar contractors" for vendors are generated by extracting nearest neighbors from the results of the embedding processed, which we execute on a private workstation, described in a later section.

Building a picture of direct contracting influence

We can now combine all of the previously discussed FPDS curation mechanisms to compute maximum ranks of direct financial flow between various tech-related companies and U.S. federal agencies -- which we treat as proxies for influence/impact. Brief summaries of many of the studied companies are available in a section of the appendix.

Agencies

For the sake of brevity, we will simply list the agencies of focus rather than providing short summaries. One exception is the Justice Department's Offices, Boards, and Divisions (OBD) unit, which is less widely documented. We recommend the book Badges without Borders (or the associated interview with Historic.ly), which details the history of the international, often explicitly anti-communist, policing programs once run through USAID's Office of Public Safety. These programs transitioned into the Justice Department's International Criminal Investigative Training Assistance Program (ICITAP) under the OBD agency. There are numerous recent awards from this program to Science Applications International Corporation and its subsidiary, Engility Holdings, for international policing in Indonesia, the Kingdom of Saudi Arabia, and Pakistan.

For the Justice Department as a whole, we investigate rankings within:

  • the Federal Bureau of Investigation (FBI),
  • the Drug Enforcement Administration (DEA),
  • the Offices, Boards, and Divisions (OBD), and
  • the Bureau of Prisons (BoP).

From the Treasury Department, we only include the Internal Revenue Service (IRS).


Several members of the Department of Homeland Security are of interest:

  • the Secret Service (SS),
  • Immigration and Customs Enforcement (ICE),
  • Customs and Border Protection (CBP),
  • Citizenship and Immigration Services (CIS),
  • the Transportation Security Administration (TSA), and
  • the Science and Technology Directorate (DHS S&T).

The largest number of agencies are from the Department of Defense:

  • the Air Force,
  • the Navy,
  • the Army,
  • Special Operations Command (SOCOM),
  • the Defense Intelligence Agency (DIA),
  • the Defense Logistics Agency and the U.S. Transportation Command (DLA/TRANSCOM),
  • Washington Headquarters Services (WHS),
  • the Defense Information Systems Agency (DISA),
  • the Defense Advanced Research Projects Agency (DARPA),
  • the National Geospatial-Intelligence Agency (NGA), and
  • the National Security Agency (NSA).

Lastly, we include rankings within several independent agencies:

  • the State Department,
  • the Agency for International Development (AID),
  • the Agency for Global Media (AGM), and
  • the General Services Administration's Federal Acquisition Service (FAS).



| Company | FBI | DEA | OBD | BoP | IRS | SS | ICE | CBP | CIS | TSA | DHS S&T | Air Force | Navy | Army | SOCOM | DIA | DLA/TRANSCOM | WHS | DISA | DARPA | NGA | NSA | State | AID | AGM | FAS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HP / Perspecta | 21 | 311 | 31 | 828 | 7 | 3 | 10 | 6 | 6 | 41 | 13 | 82 | 7 | 25 | 302 | 35 | 225 | 365 | 10 | 20 | - | - | 80 | 635 | 165 | 5 |
| Deloitte | 8 | 51 | 6 | 172 | 2 | 27 | 14 | 24 | 152 | 7 | 42 | 125 | 21 | 97 | - | 13 | 94 | 16 | 38 | 297 | - | 47 | 17 | 12 | 925 | 15 |
| MITRE | 2 | 24 | 285 | 469 | 1 | 23 | 49 | 34 | 80 | 109 | 2 | 16 | 85 | 26 | - | 1 | 6471 | 100 | - | - | 1 | - | 155 | 247 | 451 | 955 |
| IBM | 24 | 75 | 3 | 2483 | 3 | 286 | 88 | 2 | 1118 | 10 | 221 | 137 | 118 | 27 | 143 | 61 | 62 | 23 | 18 | 19 | - | - | 241 | 25 | 302 | 18 |
| Accenture | 10 | 100 | 17 | 412 | 1 | 602 | 56 | 28 | 18 | 8 | - | 92 | 99 | 49 | 2438 | 62 | 40 | 126 | 80 | - | 34 | 3 | 13 | 197 | - | 33 |
| AT&T | 99 | 37 | 1 | 248 | 6 | 7 | 95 | 18 | 171 | 294 | 219 | 109 | 107 | 60 | 126 | 85 | 2019 | 24 | 9 | 891 | 32 | 70 | 26 | 20 | 15 | 38 |
| Microsoft | 128 | 105 | 33 | 1528 | 1463 | 68 | 146 | 66 | 37 | 75 | 255 | 102 | 253 | 113 | 364 | - | 441 | 1 | 2 | - | - | - | 57 | 561 | 490 | 448 |
| Verizon | 32 | 28 | 23 | 296 | 12 | 6 | 163 | 133 | 26 | 359 | - | 234 | 132 | 72 | 257 | 76 | 1134 | 595 | 1 | 204 | 47 | - | 553 | 2073 | 39 | 59 |
| Dell | 56 | 241 | 343 | 8396 | 53 | 20 | 4 | 28 | 33 | 30 | 123 | 38 | 252 | 47 | 64 | 59 | 307 | 63 | 26 | 1612 | - | 28 | 539 | 446 | 73 | 14 |
| Johns Hopkins | 5550 | - | - | - | - | 144 | 261 | 43 | - | - | 36 | 183 | 23 | 231 | 45 | 861 | - | 5 | 104 | 13 | 29 | 5 | 428 | 71 | - | 4814 |
| IDA | - | - | - | - | - | - | - | - | - | - | - | 17,896 | - | 62 | - | - | - | 1 | - | 1629 | - | - | - | 1150 | - | - |
| Cisco | 3250 | - | - | - | - | 2580 | - | - | - | - | - | 301 | 309 | 175 | - | - | 717 | - | 6 | - | - | - | - | - | 98 | - |
| Palantir | 44 | 3265 | 25 | - | 84 | - | 27 | - | - | - | - | 10,015 | 763 | 468 | 14 | - | - | - | 946 | - | - | - | 1208 | - | - | 1531 |
| Oracle | 1873 | - | - | - | 931 | - | 294 | 266 | 292 | 684 | - | 825 | 388 | 642 | 2270 | 111 | 315 | 337 | 40 | - | - | - | - | - | 142 | 175 |
| Anduril | - | - | - | - | - | - | - | 163 | - | - | - | 1255 | 5788 | - | - | - | - | 100 | - | 49 | - | - | - | - | - | - |
| NVIDIA | 5671 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 95 | - | - | - | - | - | - |
| Intel | - | - | - | - | - | - | - | - | - | - | - | 5504 | 11,060 | 91,055 | - | - | - | - | - | 102 | - | - | - | - | - | 12,104 |
| Apple | 1593 | 809 | - | 7400 | - | 1356 | 509 | 5380 | - | - | - | 19,124 | 20,847 | 13,639 | 3538 | - | 18,757 | - | 2039 | - | - | - | 1473 | 392 | 156 | - |
| Google | - | - | - | - | - | - | - | - | - | - | - | 8951 | 76,708 | 17,248 | - | - | - | 2227 | - | - | - | - | 2908 | - | 255 | 12,353 |
| Amazon | 1176 | - | 1096 | 6559 | 733 | - | 3219 | - | - | - | - | 13,141 | 7472 | 19,578 | 1208 | - | - | 730 | 1013 | - | - | - | 1895 | 986 | 1314 | 841 |
| Facebook | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 8902 | 3054 | 1498 | - |
| Twitter | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1009 | - |

Table 1: Corporate influence rankings by contracts directly between various tech-related companies and U.S. agencies between the beginning of 2016 and July 4, 2020. Measurements were generated by combining all awards listed in the Federal Procurement Data System (FPDS) which were signed or modified within the time window. In order to smooth out known inconsistencies in FPDS data, companies were ranked by each of the three ways to measure contract values (obligated, current, and potential) and their influence was set to the most significant (that is, smallest) of the three ranks. In the interactive version of this table, the rankings are binned into color tiers (top 10, top 25, top 50, top 100, top 250, top 500, top 1000, or below), with the numerical rankings shown on mouse-over; here the numerical rankings are reproduced directly. If a company was not observed contracting with an agency, the cell is marked with a dash.


Because we have restricted the analysis behind the influence rankings in Table 1 to direct contracts -- missing, for example, a high-profile $25M (base and all options value) award between TRANSCOM and Amazon through ECS Federal -- we can interpret the results as underestimates. Given this caveat, we immediately notice that the Hewlett Packard network of companies noticeably dominates, with top-10 influence rankings in 8 of the 26 agencies. [25]

Equally surprising is that the consulting companies Accenture and Deloitte Consulting -- which contributed much of the award money to parent company Deloitte -- were more influential via direct contracting than essentially every tech company except HP and IBM. Next after the management consulting companies is the not-for-profit MITRE Corporation, which was built specifically to manage Federally Funded Research and Development Centers (FFRDCs); its high rankings should thus serve more as a benchmark than a surprise. We underline that it ranked first with the DIA, NGA, and DHS S&T, and second with the FBI.

If we restricted our attention to the "Big Five" tech companies: Google, Apple, Facebook, Amazon, and Microsoft, then a purely direct contracting analysis would suggest that Microsoft is the only significant defense contractor -- which, again, misses the numerous multi-million dollar Amazon Web Services awards passed through subcontracts (such as ECS Federal and JHC Technology). The question of how to account for more commoditized relationships -- namely, Intel, NVIDIA, and Apple hardware sales -- is more complicated, and we will partially address it in the next section.

We also notice the minor influence rankings of Palantir and Anduril relative to: management consulting companies, HP, Dell, telecoms, and even the Johns Hopkins Applied Physics Laboratory (which is responsible for the vast majority of awards to Johns Hopkins). Palantir's highest influence was within the U.S. Special Operations Command, which publicly praised Palantir's software during the company's years-long struggle to win the DCGS-A contract (and a subsequent "capability drop"). Anduril's biggest influences are observed through DARPA, Washington Headquarters Services, and Customs and Border Protection; direct inspection of their awards reveals, for example, a $250M (base and all options value) award through CBP for "autonomous surveillance towers" and a $100M (base and all options value) award requested by DARPA and contracted through the Air Force for "advance[d] battle management anduril phase 3 idiq". [26, 27]

A few of the agency columns are worth dedicated discussion. For example, Microsoft's second place influence rank within the Defense Information Systems Agency is entirely due to the $10B (base and all options value) JEDI award -- only Leidos ranked higher -- and so the award being transferred to Amazon due to a bid protest would give it the same position. The other top-ten influencers within DISA, unsurprisingly, included several communication infrastructure companies: Verizon, Cisco, and AT&T, as well as HP (IBM was 18th, and Dell was 26th).

The column combining the National Security Agency and the U.S. Cyber Command -- which we note only has a modest amount of reported awards -- contains top-ten rankings for Accenture (3rd) and Johns Hopkins (5th), but known NSA contractors Dell and AT&T were only 28th and 70th, respectively. The FBI rankings, like many others, show the professional services companies, Accenture and Deloitte, as heavier-hitters than tech giants. Nevertheless, Perspecta and IBM are both in the top 25, Verizon and Palantir are in the top 50, and Dell and AT&T are in the top 100.

As we will see in the next section, Palantir and Anduril's results are not significantly changed by incorporating subcontracts -- indeed, Anduril's are not changed at all -- and so, if they are setting the bar for what it means for a tech company to be a defense contractor, then HP, IBM, Microsoft, Dell, Cisco, AT&T, and Verizon are more than meeting it, even without incorporating their subcontracting passthroughs. And the fact that Intel chips are a core component of most HP and Dell machines suggests that we should already be able to justify Intel's addition to the list.

Incorporating Conservative Subcontracting Passthrough Estimates

We now discuss our approach for incorporating conservative subcontracting estimates into the direct contracting influence rankings of the last section (summarized in Table 1). Our approach was simple, yet exceedingly laborious:

  1. Generate a list of awards whose descriptions include key terms (we built a custom API to return such lists in a minimal JSON format).
  2. Manually remove all awards not attributable to the target company after direct inspection.
  3. Expand out to the entire award series of those that remain and annotate each such award with an estimated passthrough percentage.
  4. Augment the three types of total award amounts from the direct award analysis by adding in the products of the subcontract values and their estimated passthrough rates.
  5. Re-rank.
In practice, we combine steps (2-4) so that we need only perform a single pass through the original list. And we first sort the awards by maximum absolute value so that the most impactful items are handled first.
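
A sketch of step (4), with hypothetical annotation entries (our real annotations are the released lists of annotated subcontract award summaries):

# Fold conservative subcontract passthroughs into the per-(vendor, agency)
# totals before re-ranking. Each annotation entry is hypothetical:
# (company, agency, (obligated, current, potential), passthrough_rate).
annotations = [
    ("amazon", "Justice: Bureau of Prisons", (1.2e6, 1.2e6, 2.5e6), 0.85),
    ("amazon", "Defense: DISA", (3.0e6, 3.0e6, 6.0e6), 0.0),  # mere mention, no passthrough
]

def augment_totals(direct_totals, annotations):
    """direct_totals: {(company, agency): [obligated, current, potential]}."""
    augmented = {key: list(values) for key, values in direct_totals.items()}
    for company, agency, values, rate in annotations:
        row = augmented.setdefault((company, agency), [0.0, 0.0, 0.0])
        for index, value in enumerate(values):
            row[index] += rate * value
    return augmented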


This process was repeated for ten companies, and, when restricted to our agencies of study, resulted in:

| Company | Keywords | Kept subcontracts* | Major intermediaries |
|---|---|---|---|
| Microsoft | "microsoft" or "azure" or "windows licenses" or "windows server" | 6860 subcontracts | |
| Amazon | "amazon" or "aws" or "govcloud" | 477 subcontracts | Four Points Technology, JHC Technology, and ECS Federal |
| Google | "google" | 384 subcontracts | |
| Facebook | "facebook" | 172 subcontracts | Chaise Management Group, Sage Communications, and ZilYen (now doing business as Forge Branding) |
| NVIDIA | "nvidia" or "tesla" | 165 subcontracts | |
| Twitter | "twitter" | 43 subcontracts | Sage Communications, ZilYen (now doing business as Forge Branding), and Chaise Management Group |
| Palantir | "palantir" or "gotham" | 26 subcontracts | i3 Federal, Pat V. Mack, Sava Workforce Solutions, and Affigent |
| IDA | "institute for defense analyses" | 7 subcontracts | telecoms (e.g., AT&T) |
| MITRE | "mitre" | 6 subcontracts | MIT: each award was for "mitre-lincoln laboratory research&development" |
| Johns Hopkins | "johns hopkins" | 2 subcontracts | |
| Anduril | "anduril" | 0 subcontracts | |

*: We emphasize that these counts of the number of manually annotated contracts -- which we colloquially described as the number of "kept subcontracts" -- contain a small percentage of awards that simply make reference to the company in question. In such cases, the percentage of money estimated to be passed through to the company is set to zero. Such awards are kept in the list to help demonstrate that a simple keyword search is insufficient.


As part of the subcontract review process, we noticed that, during the 2005-2011 period in which an Alaska Native corporation subsidiary, Eyak Technology, was operating a kickback scheme relating to its billion-dollar prime contract with the U.S. Army Corps of Engineers, it was also serving as a supplier of Google technology to the U.S. Army through numerous awards in 2009. We also discovered that Amazon, through JHC Technology, has received several million dollars in AWS cloud contracts through the Federal Bureau of Prisons.

It also became clear that many of the NVIDIA subcontracted awards were for the acquisition of its DGX compact General-Purpose Graphics Processing Unit (GPGPU) supercomputers. Recipients included the Army, Navy, Air Force, Washington Headquarters Services, and, most surprisingly, even Veterans Affairs.

While we would have preferred to have performed similar analyses for the remaining companies, we remind the reader that such analyses can only improve (that is, numerically decrease) a company's rankings. Thus, incorporating any missed subcontracts for HP, Deloitte, IBM, AT&T, Verizon, and Dell would only make them more dominant. The opportunities for significant qualitative change are with extending our subcontract analysis to Apple, Cisco, Oracle, and Intel.



| Company | FBI | DEA | OBD | BoP | IRS | SS | ICE | CBP | CIS | TSA | DHS S&T | Air Force | Navy | Army | SOCOM | DIA | DLA/TRANSCOM | WHS | DISA | DARPA | NGA | NSA | State | AID | AGM | FAS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HP / Perspecta | 21 | 311 | 32 | 829 | 7 | 3 | 10 | 6 | 6 | 41 | 13 | 83 | 7 | 25 | 303 | 35 | 226 | 366 | 10 | 20 | - | - | 81 | 636 | 168 | 5 |
| Deloitte | 8 | 51 | 6 | 172 | 2 | 28 | 14 | 24 | 153 | 7 | 42 | 125 | 21 | 97 | - | 13 | 94 | 16 | 38 | 297 | - | 48 | 17 | 12 | 928 | 15 |
| MITRE* | 2 | 24 | 285 | 470 | 1 | 24 | 49 | 34 | 81 | 109 | 2 | 16 | 85 | 26 | - | 1 | 6471 | 100 | - | - | 1 | - | 155 | 248 | 455 | 956 |
| IBM | 24 | 75 | 3 | 2484 | 3 | 286 | 88 | 2 | 1119 | 10 | 222 | 137 | 118 | 27 | 144 | 61 | 62 | 23 | 18 | 19 | - | - | 241 | 25 | 305 | 18 |
| Accenture | 10 | 101 | 18 | 413 | 1 | 602 | 56 | 28 | 18 | 8 | - | 93 | 99 | 49 | 2439 | 62 | 40 | 126 | 80 | - | 34 | 3 | 13 | 197 | - | 33 |
| AT&T | 100 | 37 | 1 | 248 | 6 | 7 | 95 | 18 | 172 | 294 | 220 | 109 | 108 | 60 | 127 | 85 | 2020 | 24 | 9 | 892 | 32 | 70 | 26 | 20 | 15 | 38 |
| Microsoft* | 73 | 98 | 8 | 1216 | 19 | 21 | 97 | 41 | 27 | 29 | 150 | 60 | 106 | 56 | 19 | - | 244 | 1 | 1 | - | - | - | 29 | 198 | 31 | 270 |
| Verizon | 32 | 28 | 24 | 296 | 12 | 6 | 163 | 133 | 26 | 359 | - | 235 | 133 | 73 | 258 | 76 | 1135 | 596 | 1 | 204 | 47 | - | 553 | 2074 | 40 | 59 |
| Dell | 56 | 241 | 343 | 8398 | 54 | 20 | 4 | 28 | 34 | 31 | 123 | 38 | 253 | 47 | 65 | 59 | 308 | 63 | 26 | 1613 | - | 29 | 539 | 447 | 74 | 14 |
| Amazon* | 555 | - | 709 | 150 | 731 | - | 620 | 207 | 47 | - | 474 | 2611 | 4827 | 9133 | 742 | - | 133 | 269 | 709 | 463 | - | 7 | 787 | 987 | 437 | 826 |
| Johns Hopkins* | 5552 | - | - | - | - | 144 | 261 | 44 | - | - | 36 | 183 | 23 | 231 | 46 | 862 | - | 5 | 104 | 13 | 29 | 5 | 428 | 71 | - | 4814 |
| IDA* | - | - | - | - | - | - | - | - | - | - | - | 17897 | - | 62 | - | - | - | 1 | - | 1630 | - | - | - | 1151 | - | - |
| Cisco | 3252 | - | - | - | - | 2581 | - | - | - | - | - | 301 | 309 | 175 | - | - | 718 | - | 7 | - | - | - | - | - | 100 | - |
| Palantir* | 41 | 3266 | 26 | - | 85 | 374 | 27 | - | - | - | - | 9455 | 763 | 460 | 14 | - | - | - | 947 | - | - | - | 1210 | - | - | 1532 |
| Oracle | 1875 | - | - | - | 933 | - | 294 | 267 | 293 | 684 | - | 825 | 388 | 642 | 2271 | 111 | 316 | 337 | 40 | - | - | - | - | - | 145 | 175 |
| NVIDIA* | 1104 | 3292 | - | 8097 | 2055 | - | - | - | - | - | - | 10,362 | 7238 | 18,956 | - | 168 | 26,173 | 1764 | - | 95 | - | - | 6969 | - | 3527 | 7990 |
| Anduril* | - | - | - | - | - | - | - | 163 | - | - | - | 1255 | 5789 | - | - | - | - | 100 | - | 49 | - | - | - | - | - | - |
| Apple | 1595 | 809 | - | 7401 | - | 1357 | 509 | 5382 | - | - | - | 19,126 | 20,849 | 13,641 | 3539 | - | 18,758 | - | 2040 | - | - | - | 1475 | 393 | 159 | - |
| Google* | 336 | 1197 | 1829 | 2545 | 745 | - | - | 755 | - | 963 | - | 5412 | 6859 | 6692 | 1928 | - | 540 | 1651 | 1363 | - | - | - | 793 | 390 | 123 | 2101 |
| Facebook* | - | - | - | - | - | - | - | - | - | 887 | - | - | - | 52,865 | - | - | - | - | - | - | - | - | 2807 | 2715 | 96 | - |
| Twitter* | - | - | - | - | - | - | - | - | - | 983 | - | - | - | - | - | - | - | - | - | - | - | - | 7833 | - | 254 | - |

Table 2: Corporate influence rankings by a combination of direct contracts and subcontracting passthroughs between various tech-related companies and U.S. agencies between the beginning of 2016 and July 4, 2020 (subcontract analysis only included up to June 20). Direct contract measurements are generated in the same manner as for Table 1. Conservative subcontract passthroughs were estimated and incorporated for companies annotated with '*' through manual review of thousands of FPDS awards whose descriptions mentioned the company or one of its major products (e.g., "aws govcloud"). Due to the intense time demands of manually reviewing tens of thousands of awards, several companies were not augmented with subcontracting passthrough estimates. In the interactive version of this table, the rankings are binned into color tiers (top 10, top 25, top 50, top 100, top 250, top 500, top 1000, or below), with the numerical rankings shown on mouse-over; here the numerical rankings are reproduced directly. If a company was not observed contracting with an agency, the cell is marked with a dash.




| Company | FBI | DEA | OBD | BoP | IRS | SS | ICE | CBP | CIS | TSA | DHS S&T | Air Force | Navy | Army | SOCOM | DIA | DLA/TRANSCOM | WHS | DISA | DARPA | NGA | NSA | State | AID | AGM | FAS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HP / Perspecta | 21 | 311 | 32 | 829 | 7 | 3 | 10 | 6 | 6 | 41 | 13 | 83 | 7 | 25 | 303 | 35 | 226 | 366 | 10 | 20 | - | - | 81 | 636 | 168 | 5 |
| IBM | 24 | 75 | 3 | 2484 | 3 | 286 | 88 | 2 | 1119 | 10 | 222 | 137 | 118 | 27 | 144 | 61 | 62 | 23 | 18 | 19 | - | - | 241 | 25 | 305 | 18 |
| Microsoft* | 73 | 98 | 8 | 1216 | 19 | 21 | 97 | 41 | 27 | 29 | 150 | 60 | 106 | 56 | 19 | - | 244 | 1 | 1 | - | - | - | 29 | 198 | 31 | 270 |
| Dell | 56 | 241 | 343 | 8398 | 54 | 20 | 4 | 28 | 34 | 31 | 123 | 38 | 253 | 47 | 65 | 59 | 308 | 63 | 26 | 1613 | - | 29 | 539 | 447 | 74 | 14 |
| Amazon* | 555 | - | 709 | 150 | 731 | - | 620 | 207 | 47 | - | 474 | 2611 | 4827 | 9133 | 742 | - | 133 | 269 | 709 | 463 | - | 7 | 787 | 987 | 437 | 826 |
| Cisco | 3252 | - | - | - | - | 2581 | - | - | - | - | - | 301 | 309 | 175 | - | - | 718 | - | 7 | - | - | - | - | - | 100 | - |
| Palantir* | 41 | 3266 | 26 | - | 85 | 374 | 27 | - | - | - | - | 9455 | 763 | 460 | 14 | - | - | - | 947 | - | - | - | 1210 | - | - | 1532 |
| Oracle | 1875 | - | - | - | 933 | - | 294 | 267 | 293 | 684 | - | 825 | 388 | 642 | 2271 | 111 | 316 | 337 | 40 | - | - | - | - | - | 145 | 175 |
| NVIDIA* | 1104 | 3292 | - | 8097 | 2055 | - | - | - | - | - | - | 10,362 | 7238 | 18,956 | - | 168 | 26,173 | 1764 | - | 95 | - | - | 6969 | - | 3527 | 7990 |
| Anduril* | - | - | - | - | - | - | - | 163 | - | - | - | 1255 | 5789 | - | - | - | - | 100 | - | 49 | - | - | - | - | - | - |
| Apple | 1595 | 809 | - | 7401 | - | 1357 | 509 | 5382 | - | - | - | 19,126 | 20,849 | 13,641 | 3539 | - | 18,758 | - | 2040 | - | - | - | 1475 | 393 | 159 | - |
| Google* | 336 | 1197 | 1829 | 2545 | 745 | - | - | 755 | - | 963 | - | 5412 | 6859 | 6692 | 1928 | - | 540 | 1651 | 1363 | - | - | - | 793 | 390 | 123 | 2101 |
| Facebook* | - | - | - | - | - | - | - | - | - | 887 | - | - | - | 52,865 | - | - | - | - | - | - | - | - | 2807 | 2715 | 96 | - |
| Twitter* | - | - | - | - | - | - | - | - | - | 983 | - | - | - | - | - | - | - | - | - | - | - | - | 7833 | - | 254 | - |

Table 3: A restriction of Table 2 to tech companies.


The results of incorporating our subcontracting estimations are demonstrated in Table 2 and its restriction to tech companies, Table 3. The most significant qualitative difference is with Amazon, whose large numbers of Justice, DHS, and DoD cloud contracts are almost entirely through intermediaries, such as Four Points Technology, JHC Technology, and ECS Federal (who was also the prime contractor for Google's Maven contracts).

Another significant change is that Microsoft moves into the top 10 influencers within the Justice Department's Offices, Boards, and Divisions. As we explained in a previous section, this little-known agency is the current home for U.S. international policing program ICITAP. We also notice that Google has moved into the top 500 contractors with the FBI (through supplying FISMA-certified Google Apps for Government).

Google, Facebook, and Twitter's contracting with the U.S. Agency for Global Media -- which was formerly known as the Broadcasting Board of Governors, which itself grew out of the propaganda-focused U.S. Information Agency -- as well as with USAID and, to a lesser degree, the State Department, became more pronounced, as did minor amounts of contracting with the TSA.

The conservative estimation of relative financial flow between major tech companies and various military, prosecutory, law enforcement, and diplomatic organizations in Table 3 makes clear that each of these companies is playing at least a minor support role to the U.S. government. And, in the cases of: HP, IBM, Microsoft, Dell, Amazon, Cisco, Palantir, Oracle, NVIDIA, and Anduril, these roles are significant. In light of this data, continuing to claim that Silicon Valley has abandoned Washington would be disingenuous -- even if one technically excluded Microsoft, Amazon, and IBM.

Generating Contractor "Embeddings" and Nearest Neighbors

The original motivation for mirroring the entire U.S. federal procurement database was to answer the question:

"Are the description text fields and contracting/requesting agencies in procurement data enough to generate leads for companies similar to a given one?"
More specifically, the goal was to generate glue between otherwise siloed company profiles through the nearest neighbors resulting from the left factor of a weighted low-rank approximation to the co-occurrence score matrix between terms from the contract description, the vendor receiving the award, and the contracting and requesting agencies.


Variants of such an approach are at the heart of many commercial recommendation systems -- which have been frequently criticized as being both unacceptably opaque and detrimentally engagement-driven. We repurpose a basic variant here for the purpose of exploring federal contracting. Given that the resulting recommendations only involve federally mandated records of corporate entities, and our site is not monetarily incentivized by engagement (we are a non-profit and we do not sell ads), we do not foresee any analogues of the typical failure modes.

We assert no conclusions about the resulting nearest neighbors, other than that they often contain companies which contract in similar areas to the generating company. They are simply useful for expanding one's breadth of knowledge of companies.

Our Approach to Generating Similar Contractor Lists

We built such a system on top of SciPy: a stand-alone Alternating Weighted Least Squares (AWLS) embedding and nearest neighbor extraction utility and a driver specific to the Federal Procurement Data System.

After the co-occurrence score matrix, C, is formed -- which, to be clear, is where most of the "art" of this model resides [28] -- the objects of interest are the rows of the tall-skinny factor X in the low-rank matrix X Y^T, which approximates the co-occurrence matrix in the sense of approximately, locally minimizing

    || sqrt(W) o (C - X Y^T) ||_F^2,

where W is a nonnegative weight matrix with a particular sparse plus rank-one structure, "o" represents the entrywise product, sqrt(.) is the entrywise square-root, and ||.||_F is the Frobenius norm.

More specifically, the weight matrix is required to be of the form

    W = S + 1 w^T,

where:

  • S is a binary matrix with the same sparsity pattern as the co-occurrence score matrix,
  • 1 is a column vector of ones whose length is the number of rows of the co-occurrence matrix, and
  • the j'th entry of the vector w is analogous to the tf-idf weighting for the item represented by the j'th column.

We standardized on a rank of 150 and 10 alternating iterations. And to keep memory usage and runtimes in check, the rows of the co-occurrence matrix were restricted to the dominant 50,000 terms, 150,000 vendors, and 25,000 agencies. Likewise, the columns were restricted to the dominant 75,000 terms, 150,000 vendors, and 100,000 agencies.


Given such a structure for the weight matrix, it was shown by Hu, Koren, and Volinsky in 2008 -- incidentally, while working at two companies studied in this report, AT&T and Yahoo! -- that each factor update could be formed in linear time by precomputing a certain 'background' Gramian and sparsely updating it to solve the normal equations for each row's update.

After the final iteration of the minimization process is completed, we normalize each row of the left factor X to have unit Euclidean norm and then extract 20 to 30 of its nearest neighbors in a cosine-similarity sense.
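
A compact sketch of one such row update and of the final neighbor extraction is shown below (using NumPy/SciPy; the small ridge term and the helper names are our own additions for illustration, and the binary weight component's sparsity pattern is taken directly from C, as described above):

import numpy as np
import scipy.sparse as sp

def update_left_factor(C, w, Y, ridge=1e-2):
    """One half-sweep of the alternating weighted least squares update for the
    left factor X, given weights W = S + 1 w^T, where S shares C's sparsity
    pattern. The dense ('background') part of each normal-equation Gramian,
    Y^T diag(w) Y, is shared across rows and therefore precomputed once."""
    C = sp.csr_matrix(C)
    rank = Y.shape[1]
    background = (Y * w[:, None]).T @ Y + ridge * np.eye(rank)  # ridge: our addition
    X = np.zeros((C.shape[0], rank))
    for i in range(C.shape[0]):
        columns = C.indices[C.indptr[i]:C.indptr[i + 1]]
        values = C.data[C.indptr[i]:C.indptr[i + 1]]
        Y_i = Y[columns]
        gramian = background + Y_i.T @ Y_i           # sparse correction from S
        rhs = Y_i.T @ ((w[columns] + 1.0) * values)  # weights on observed entries
        X[i] = np.linalg.solve(gramian, rhs)
    return X

def cosine_neighbors(X, row, count=25):
    """Nearest neighbors of one row of X after unit-normalizing every row."""
    normalized = X / np.linalg.norm(X, axis=1, keepdims=True)
    scores = normalized @ normalized[row]
    ranked = np.argsort(-scores)
    return [int(j) for j in ranked if j != row][:count]

An analogous update is applied to the right factor Y on each alternation.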

Examples of Generated Neighbors

Each of the "vendor" pages on the website associated with this report, techinquiry.org/lobbying/, contains a list of "Similar Contractors" within the "US Federal Contracting" tab that are generated with the algorithm described above. We show a few of the examples from our most recently trained model, which was trained on roughly the last year and a half of procurement data.

Cellebrite

Our first example is that of Cellebrite, a "mobile forensics" company which has allegedly been used by Michigan State Police to conduct unlawful searches. It has also been reported that Cellebrite sells its software to the governments of Turkey, the United Arab Emirates, and Russia.

The five closest neighbors produced for Cellebrite Inc., a subsidiary of Cellebrite DI Ltd., were:

  1. Cellebrite DI Ltd.
  2. Grayshift, LLC
  3. Pen-Link, Ltd.
  4. Avail Forensics LLC
  5. Berla Corporation

We note that these five neighbors were chosen out of the set of more than 130,000 companies kept by the truncation procedure described above. Given that the parent company was picked up as the nearest neighbor, and the other four are also forensics companies, we believe that these are quality results.


Palantir Technologies

We return to investigate the answers to our original question: "Which companies contract like Palantir [Technologies]?". Our top-five results are:

  1. XFinion Inc.
  2. Ardent Management Consulting, Inc.
  3. VIRE Consulting, Inc.
  4. iWorks Corporation
  5. Alethix, LLC


Anduril Industries

Similarly, the top-five results for Anduril are currently:

  1. Orbital Insight, Inc.
  2. Solid State Scientific Corporation
  3. Omni Fed LLC
  4. G2 Ops, Inc.
  5. Entheleon Technologies, Inc.

Orbital Insight markets itself with the blurb "Access the most current visibility, intelligence, and transparency of the world’s physical activity ... all on one secure, private geospatial data platform," and Solid State Scientific Corporation appears to focus on applying machine learning to Air Force problems, much like Anduril.


Microsoft Corporation

Given that our nearest-neighbor lists are generated entirely from contracting behavior, they produce lists of companies which contract similarly to the given company, rather than companies which are, in a vague sense, generally thought of as similar. This distinction becomes clear when we look at a very large tech company like Microsoft.

The most recently produced top-five list of vendors who contract similarly to Microsoft was:

  1. Four Inc.
  2. Dynamic Systems, Inc.
  3. Forcepoint Federal LLC
  4. Emergent LLC
  5. Accelera Solutions, Inc.

According to Forcepoint Federal's own brochure, it is a subsidiary of the weapons manufacturer Raytheon and was formerly known as Websense.


Future Extensions of Embeddings

There exist namebrand tech companies -- for example, Google -- which do very little direct federal contracting, but perform a large amount of registered lobbying. We can therefore expect to improve the quality of "similar companies" recommendations by extending from our current procurement-only approach to one which includes Senate OPR LD-1/LD-2 and LD-203 filings. We have already included said filings in our website, but have not yet worked them into the embedding/neighbor generation pipeline.

Conclusions and Future Work

We have demonstrated a framework for converting the Federal Procurement Data System (FPDS) into a set of rankings meant to indicate the degree of (financial) influence of a company, including its subsidiaries, within a particular government agency -- including some which are only indirectly indicated in procurement data. When we applied this methodology to major tech companies, and augmented the award amounts with conservative estimates of subcontract passthrough amounts, we demonstrated that recent narratives decrying a massive divide between Silicon Valley and the military are anecdotal and qualitatively false.

We also demonstrated a recommendation system for automatically generating lists of similar contractors for the vast majority of -- more than 130,000 -- U.S. federal contractors. We plan to extend these "embeddings" to incorporate U.S. registered lobbying data, and provide a similar map of tech company lobbying influence, in a future report.

One of the fundamental missing features of our website is a means of accepting user-contributed corrections and additions -- ideally coupled with citations that could be verified before their inclusion. We hope to begin experimenting with such interfaces alongside our incorporation of more datasets (e.g., Canadian and European procurement records).

Lastly, as a means of both connecting our work to that of Mijente's "Who's Behind ICE?" report and demonstrating the breadth of our database's coverage, we provide a subsection of the appendix that links to our profile page for each company mentioned in their report. A similar mapping is provided for a collection of contractors with the Defense Innovation Unit -- and its spin-out, Kessel Run -- that we curated. An autocomplete interface is also provided for each via the "DIU" and "Who's Behind ICE?" radio options on techinquiry.org/lobbying/. We have also incorporated the Project on Government Oversight's Pentagon Revolving Door and Federal Contractor Misconduct databases but refrain from listing them in this report.

Acknowledgements

The author would like to thank Irene Knapp (@ireneista) for detailed help with this project's database (Postgres) and the web server's operating system configuration (NixOS). He would also like to separately thank Liz O'Sullivan (@lizjosullivan) and Shauna Gordon-McKeon (@shauna_gm) for detailed suggestions that significantly improved an early draft of this document. Lastly, he would like to thank Cornell's Center for Applied Mathematics, and the Balsillie School of International Affairs, for hosting him to talk about very early versions of this work.

Conflict of Interest Disclosure

This work was entirely self-funded. The author acknowledges that they were formerly an employee at Google, as was Irene Knapp. Likewise, Liz O'Sullivan formerly worked at Clarifai.

Appendix

Company Summaries