Automating Cyber Attacks
HYPE AND REALITY

NOVEMBER 2020

AUTHORS
Ben Buchanan
John Bansemer
Dakota Cary
Jack Lucas
Micah Musser

Center for Security and Emerging Technology
CSET.GEORGETOWN.EDU | CSET@GEORGETOWN.EDU

Established in January 2019, the Center for Security and Emerging Technology (CSET) at Georgetown’s Walsh School of Foreign Service is a research organization focused on studying the security impacts of emerging technologies, supporting academic work in security and technology studies, and delivering nonpartisan analysis to the policy community. CSET aims to prepare a generation of policymakers, analysts, and diplomats to address the challenges and opportunities of emerging technologies. During its first two years, CSET will focus on the effects of progress in artificial intelligence and advanced computing.

ACKNOWLEDGMENTS
The authors would like to thank Max Guise, Drew Lohn, Igor Mikolic-Torreira, Chris Rohlf, Lynne Weil, and Alexandra Vreeman for their comments on earlier versions of this manuscript.

PRINT AND ELECTRONIC DISTRIBUTION RIGHTS
© 2020 by the Center for Security and Emerging Technology. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit: https://creativecommons.org/licenses/by-nc/4.0/.

Cover photo: KsanaGraphica/ShutterStock.
Contents

EXECUTIVE SUMMARY
INTRODUCTION
1 | THE CYBER KILL CHAIN
2 | HOW MACHINE LEARNING CAN (AND CAN’T) CHANGE OFFENSIVE OPERATIONS
3 | CONCLUSION: KEY JUDGMENTS
ENDNOTES

Executive Summary

Hacking is a well-established part of statecraft. Machine learning is rapidly becoming an arena of competition between nations as well. With the continued importance of computer hacking and the increasing drumbeat of AI advances due to machine learning, important questions emerge: what might machine learning do for cyber operations? How could machine learning improve on the techniques that already exist, ushering in faster, stealthier, and more potent attacks? On the other hand, how might its importance to cyber operations be misleadingly overhyped?

We examine how machine learning might, and might not, reshape the process of launching cyber attacks. We examine the cyber kill chain and consider how machine learning could enhance each phase of operations. We expect certain offensive techniques to benefit from machine learning, including spearphishing, vulnerability discovery, delivering malicious code into targeted networks, and evading cyber defenses. However, we caution that machine learning has notable limitations that are not reflected in much of the current hype. As a result of these constraints and flaws, attackers are less likely to apply machine learning techniques than many expect, and will likely do so only if they see unique benefits.

Our core conclusions are:

• Current cyber automation techniques are powerful and meet the objectives of many attackers. Most attackers will not have an obvious need to augment their operations with machine learning, especially given the complexity of some machine learning techniques and their need for relevant data. If current methods of automation become less effective or machine learning techniques become more accessible, this may change.
• In the near term, machine learning has the potential to increase both the scale and success rate of spearphishing and social engineering attacks.

• Of the machine learning techniques reviewed in this paper, reinforcement learning promises the most operational impact over the medium-to-long term. Though its potential impact is speculative, it could reshape how attackers plan and execute cyber operations.

• Machine learning systems have substantial limitations, such as their reliance on salient data, their vulnerability to adversarial attacks, and their complexity in deployment.

• Like other cyber capabilities, many machine learning capabilities are inherently dual-use, with the advantage accruing to those who have the resources and expertise to use them best rather than always favoring attackers or defenders.

The paper proceeds in three parts. The first part covers the state of the art in cyber operations today, showing how attackers progress through the kill chain and taking care to demonstrate how traditional automation assists them in their efforts. The second part considers machine learning in more depth, exploring its differences from traditional automation and probing how those differences might, and might not, reshape key parts of the kill chain. Among other things, it highlights the way in which machine learning could improve discovery of the software vulnerabilities that enable cyber operations, raise the effectiveness of spearphishing emails that deliver malicious code, increase the stealthiness of cyber operations, and enable malicious code to function more independently of human operators. The conclusion takes stock, drawing out key themes of geopolitical and technical importance. It argues that machine learning is overhyped and yet still important, that structural
factors will limit the relevance of machine learning in cyber operations for most attackers, that the dual-use nature of cyber operations will continue, and that great powers, including the United States, should be proactive in exploring how machine learning can improve their operations.

Introduction

The use of a computer at Lawrence Berkeley Laboratory in 1986 cost $300 per hour.[1] One day, when reviewing the accounting ledgers, a system administrator discovered a seventy-five-cent discrepancy. The administrator asked another staffer, Clifford Stoll, to investigate. What followed was one of the first and most well-documented hunts for a cyber criminal. Stoll, in his classic book The Cuckoo’s Egg, details a case study in persistence on the part of both the attacker and the defenders that, from today’s vantage point, seems to develop in slow motion. This attack was not a highly automated quick strike. Instead, it unfolded over the course of months. The attack techniques, directed from a computer halfway around the world, were manual and relatively unsophisticated, yet effective.

The slow pace and lack of automation are not surprising. The internet at the time had about 20,000 connected computers, transmission speeds were measured in kilobytes, and computing power was a fraction of what is available on today’s mobile devices. The attacker followed an operational process, or kill chain, that has largely endured: reconnaissance, initial entry, exploitation of known vulnerabilities, establishment of command and control channels, and lateral movement across networks. Each of these steps contributed to the ultimate objective of exfiltrating sensitive documents from defense contractors, universities, and the Pentagon.[2] With striking simplicity, the attacker attempted logging onto systems with known account names and commonly used passwords, such as “guest.” Even with this rudimentary technique, the attacker gained unauthorized access upwards of 5 percent of the time.[3]

While the attack was largely manual, automation aided the defenders. Stoll and others established automated systems to alert them when the attacker accessed key machines and networks, enabling the team to begin tracing the ultimate source of the attacks. Sometimes with the help of court orders and telephone companies, the defenders systematically worked back through the tangled network of infected computers toward the attacker. Stoll and his team baited the attacker with enticing (but fake) files related to the highly sensitive Strategic Defense Initiative, the rough equivalent of Cold War catnip.[4] The attacker spent so much time online examining the bogus files that technicians were able to trace the intruder’s location: Hanover, Germany. Local authorities eventually charged Markus Hess and four of his German associates with espionage for their various roles in feeding pilfered documents and network
details to the Soviet security agency then known as the KGB.

About a year later, Stoll received an alarming call about a new threat: an automated attack was cascading across the internet, digitally destroying everything in its path. Stoll and other computer security experts raced to stop the self-propagating code, which became known as the Morris Worm. They succeeded, but not before the worm disabled more than 2,000 computers in the span of 15 hours.[5] This attack stood in sharp contrast to the manual operations of the period and introduced the concept of automated cyber attacks.

The two cases neatly bookend the spectrum of conceptual possibilities when it comes to cyber operations. On one end are the plodding manual efforts, painstakingly carried out by attackers and thwarted by system administrators and their tools in a cat-and-mouse game that unfolds over months. On the other end are the automated attack sequences, often lacking nuance or control, that tear across the internet at high speed and destroy everything in their path. Operations at both ends of the spectrum continue today, though human-directed efforts benefit from more automation and automated attacks exhibit greater
control than before.

In this context arrives machine learning, a technology at the core of almost all the hype surrounding AI today. Within the last decade, machine learning has achieved technical feats that were not too long ago thought to be decades or even centuries away. Machine learning algorithms have beaten world champion players at fiendishly complex board and video games, demonstrating something akin to intuition. These algorithms have devised convincing photos and videos of people who never existed, painted compelling portraits, and written music and stories so good that they seem humanlike in their creativity. They have done so with rapidly increasing speed and quality, charting a growth curve in capabilities that seems to point ever upward.

Against this backdrop of advances, important questions emerge: what might machine learning do for cyber operations? How could this improved automation technology enhance the techniques that already exist, ushering in faster, stealthier, and more potent attacks? On the other hand, how might it be misleadingly overhyped?

In this paper, we tackle these questions. To do so, we proceed in three parts. The first part covers the state of the art in cyber operations today, showing how attackers progress through the kill chain and taking care to demonstrate how traditional automation assists them in their efforts. The second part considers machine learning in more depth, exploring its
differences from traditional automation and probing how those differences might, and might not, reshape key parts of the kill chain. Among other things, it highlights the way in which machine learning could improve discovery of the software vulnerabilities that enable cyber operations, raise the effectiveness of spearphishing emails that deliver malicious code, increase the stealthiness of cyber operations, and enable malicious code to function more independently of human operators. The conclusion takes stock, drawing out key geopolitical and technical judgments. It argues that machine learning is overhyped and yet still important, that structural factors will limit the relevance of machine learning in cyber operations for most attackers, that reinforcement learning techniques show promise in the medium-to-long term, that the dual-use nature of cyber operations will continue, and that great powers, including the United States, should be proactive in exploring how machine learning can improve their operations.

1 | The Cyber Kill Chain

The kill chain is an established method of conceptualizing cyber operations by presenting a checklist of tasks that attackers work through on their way to their objective. Lockheed Martin researchers published a canonical paper outlining the idea in 2010.[6] Other organizations, such as MITRE, have introduced more complex versions.[7] While the kill chain model has limitations, such as portraying cyber operations as overly linear, it is a common and useful way to begin to understand cyber attacks. We therefore use it as a foundation to explore how cyber operations work and how automation that does not use machine learning aids attackers; this is the status quo that machine learning-enabled automation seeks to advance.

Attackers will perform some or all of the kill chain’s steps. Depending upon their overall objective, attackers may merge several steps by employing commonly used exploit tools or techniques. Each step they execute represents an opportunity for a defender to stop an attack. We discuss six steps widely agreed to be important: reconnaissance, weaponization, delivery, command and control, pivoting, and actions on objective. Each step involves its own processes, challenges, and techniques, all of which continue to evolve, including with greater automation.

This section illustrates well-known cases of each step of the kill chain and current state-of-the-art techniques. Readers familiar with the cyber kill chain and how automation helped enable major operations (especially NotPetya, CRASHOVERRIDE, Agent.BTZ, Conficker, and the 2015 Ukraine blackout) should feel free to skip ahead to our discussion of machine learning in the following section.

RECONNAISSANCE

Attackers must first pick their target. Their process of reconnaissance and target selection depends on the objectives. Some attackers will be interested in infecting broad categories of users and will largely forgo this process. Others are more selective in their choice of victims, requiring a more extensive reconnaissance effort. In this phase, attackers first identify humans and machines that are worth targeting and then gather information about the technical vulnerabilities of those targets.

To inform their search for human targets, attackers can gather important details about an organization and its personnel through internet searches, social media analysis, and scraping technical online forums. These passive techniques have the added advantage of being largely undetectable. Traditional techniques of automation offer a means to collect, sort, and analyze data gathered in this way, significantly shortening time spent in this phase and helping attackers plot their next