Bytecode Jiu-Jitsu: Choking Interpreters to Force Execution of Malicious Bytecode
#BHUSA @BlackHatEvents

Toshinori Usui (1), Yuto Otsuki (1)
Contributors: Ryo Kubota (1), Yuhei Kawakoya (1), Makoto Iwamura (1), Kanta Matsuura (2)
(1) NTT Security Holdings Corporation
(2) Institute of Industrial Science, The University of Tokyo

Toshinori Usui, Ph.D.
- Research scientist, security principal
- Research interests: malware analysis, reverse engineering, and exploit development
- CTF lover
- Brazilian Jiu-Jitsu enthusiast

Yuto Otsuki, Ph.D.
- Senior researcher
- Research interests: memory analysis, reverse engineering, and operating system security

Code Injection Attack
1. Allocate a memory region
2. Write malicious code (31C0B001...)
3. Execute the code

Code Injection Attack with bytecode
2. Write malicious code (bytecode)

Today's Topic: Bytecode Jiu-Jitsu
Diagram: Injector (malware) and Interpreter

Outline
- 入門 Introduction to Code Injection Attack
- 理合 Bytecode Jiu-Jitsu Overview
- 稽古 Interpreter Implementation Basics
- 打込 Interpreter Analysis
- 試合 Bytecode Jiu-Jitsu Attack
- 亂取 Experiments and Evaluations
- 受身 Countermeasures against Bytecode Jiu-Jitsu
- 総括 Takeaways

入門 Introduction to Code Injection Attack

Code Injection Attack
- Malware tries to conceal its malicious behavior on the target host
- Code injection is a technique to blend malicious behavior with benign behavior by forcing a benign process to execute malicious code
Diagram: the injector (malware) holds the injector code and the malicious code for injection. It creates or opens a benign process, allocates a memory region, injects the malicious code, and starts a thread to execute it. The benign process then holds both the legitimate benign code and the malicious code.
Process Hollowing
Diagram: the injector (malware) holds the injector code and the malicious image for injection.
- Create a suspended benign process from a benign executable
- Unmap the image
- Inject the malicious image (replacement)
- Adjust the instruction pointer (IP)
- Resume
- Result: the benign executable and the image in the process are not the same

Process Hollowing Variants
Process Doppelgänging
1. Starts a transaction and writes malicious code to a benign file
2. Creates an in-memory image from the file
3. Rolls the file back
4. Creates a process from the image
Process Herpaderping
1. Writes malicious code to a benign file
2. Creates an in-memory image from the file
3. Creates a process from the image
4. Overwrites the file to make it benign
5. Creates the first thread
6. Closes the file

理合 Bytecode Jiu-Jitsu Overview

Our New Technique: Bytecode Jiu-Jitsu
- We introduce a novel code injection attack technique
- We call it Bytecode Jiu-Jitsu
- The attack technique injects malicious bytecode into an interpreter process (e.g., Python)

                        Existing attack techniques   Bytecode Jiu-Jitsu
Injection target        Arbitrary process            Interpreter process
Code to be injected     Native code                  Bytecode
Behavior blended into   Executable                   Script
Bytecode Jiu-Jitsu Overview
Preparation phase (attacker's environment)
- The interpreter takes the malicious script as input; its malicious bytecode (and data) is extracted as the injection payload
- The interpreter takes the target script (the benign script to be replaced) as input; its target bytecode (and data) is extracted as the signature for memory scan
- Both are embedded into the injector
Attack phase (victim's environment)
- The injector infiltrates into the victim's environment
- It scans the memory of the interpreter running the target script to locate the target bytecode (and data) by signature
- It injects the malicious bytecode (and data)

How to realize Bytecode Jiu-Jitsu?
- Problem: Bytecode Jiu-Jitsu requires the internal specifications of target interpreters, i.e., the data structures of bytecode and data. However, they are sometimes not publicly available
- Solution: Manual reverse engineering?

稽古 Interpreter Implementation Basics

Script Execution Mechanism
Diagram: Script -> Analysis phase -> Code-gen phase -> Virtual Machine
- The virtual machine holds the bytecode cache, the symbol table, the virtual stack / virtual register, and the Virtual Program Counter (VPC)
- Execution cycle in the interpretation function: Fetcher -> Decoder/Dispatcher -> VM instruction handler
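The execution cycle on the Script Execution Mechanism slide can be sketched as a toy stack-based virtual machine. This is a minimal illustration and not the implementation of any real interpreter: the opcode names, the `Instr` structure, and the `run` function are hypothetical, chosen only to show how the VPC indexes the bytecode cache and how operands index the symbol tables.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical opcodes for a toy VM (not those of any real interpreter).
LOAD_CONST, STORE_VAR, LOAD_VAR, ADD, HALT = range(5)


@dataclass
class Instr:
    opcode: int
    operand: Optional[int] = None  # index into a symbol table, if any


def run(bytecode, consts, nvars):
    """Toy interpretation function: fetch, decode/dispatch, handle, repeat."""
    variables = [None] * nvars  # symbol table for variables
    stack = []                  # virtual stack
    vpc = 0                     # virtual program counter
    while True:
        instr = bytecode[vpc]   # fetcher: read the instruction at the VPC
        vpc += 1
        op = instr.opcode       # decoder/dispatcher
        if op == LOAD_CONST:
            stack.append(consts[instr.operand])
        elif op == STORE_VAR:
            variables[instr.operand] = stack.pop()
        elif op == LOAD_VAR:
            stack.append(variables[instr.operand])
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == HALT:
            return variables
        else:
            raise ValueError(f"unknown opcode {op}")


# Bytecode cache for the script: age = 25; next_age = age + 1
consts = [25, 1]
program = [
    Instr(LOAD_CONST, 0), Instr(STORE_VAR, 0),
    Instr(LOAD_VAR, 0), Instr(LOAD_CONST, 1), Instr(ADD),
    Instr(STORE_VAR, 1), Instr(HALT),
]
result = run(program, consts, nvars=2)
print(result)
```

Note that the bytecode never contains the values themselves, only indices. This is why bytecode and symbol tables have to be handled together.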
Bytecode Cache Implementation
- Typically implemented with an array of structures {Opcode, Operand}
- The operand is an index for a symbol table (bytecode depends on symbol tables for data access)

Opcode        Operand
LOAD_CONST    1
STORE_FAST    0
LOAD_FAST     0
LOAD_CONST    2
COMPARE_OP    2
POP_TOP
LOAD_CONST    0

Symbol Table Implementation
- Symbol tables are composed of references between multiple structures and arrays
- Management structure: manages references to symbol tables (start node of chains)
- Value object: contains actual data, such as integers, strings, etc. (end node)
Diagram: for the script age = 25, the management structure refers to the consts, vars, and global tables; the entry for age leads to a value object (Type: Int, Val: 25).

Symbol Table Implementation
- Interpretation function: interp(script_ctx_info, func_info, ...)
- Arguments include pointers to management structures
- Each management structure has symbol tables for each scope

Interpreter Analysis Issues
- These data structures are complicated
- They are not easy to extract because bytecode and symbol tables must be kept consistent with each other
- Interpreters share this overall design, but the concrete implementation details differ across interpreters and versions
- Manual reverse engineering of interpreters requires heavy effort
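For an open-source interpreter such as CPython, the relationship between operands and symbol tables can be inspected directly with the standard `dis` module. The sketch below is only an illustration for CPython and is not part of the analysis technique in the slides. The helper name `operand_lookups` is hypothetical, and opcode names vary across CPython versions, which is one concrete instance of the version differences noted above. Proprietary interpreters offer no comparable view.

```python
import dis


def operand_lookups(source):
    """Compile a script and resolve selected operands via their symbol table."""
    code = compile(source, "<test script>", "exec")
    lookups = []
    for instr in dis.get_instructions(code):
        if instr.opname == "LOAD_CONST":
            # The operand is an index into the constants table.
            lookups.append((instr.opname, instr.arg, code.co_consts[instr.arg]))
        elif instr.opname == "STORE_NAME":
            # The operand is an index into the names table.
            lookups.append((instr.opname, instr.arg, code.co_names[instr.arg]))
    return lookups


for opname, index, resolved in operand_lookups("global_var = 123456"):
    print(f"{opname:<12} operand={index} -> {resolved!r}")
```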
Does this mean Bytecode Jiu-Jitsu is not practical? No, the reverse engineering can be automated!

How to realize Bytecode Jiu-Jitsu?
- Problem: Bytecode Jiu-Jitsu requires the internal specifications of target interpreters, i.e., the data structures of bytecode and data. However, they are sometimes not publicly available
- Solution: Manual reverse engineering? Too tedious. Automated reverse engineering!
  - Dynamic analysis of interpreter binaries with crafted test scripts to analyze implementation details
  - Tracking pointer dereferences and analyzing memory accesses to reveal reference relationships and data structures

打込 Interpreter Analysis: Prepare Bytecode and Symbol Tables to Inject

Interpreter Analysis Technique
- Input: test scripts (generated manually from knowledge of the language specification) and the interpreter binary
- Our analysis technique observes the interpreter's behavior and analyzes its memory access traces
- Output: bytecode and symbol table information, which is used for the injector

Technical Overview
Diagram: interp(script_ctx_info, ...) is the interpretation function; its argument leads through the management structure to a value object (12345).
- Find the interpretation function
- Find memory regions accessed during bytecode interpretation
- Find a value object
- Find a dereference path to the object
- Find a symbol table, identify its data structure

Key Steps of Interpreter Analysis
1. Find the interpretation function
2. Find accessed memory regions
3. Find a value object
4. Find a dereference path to the object
5. Find a symbol table, identify its data structure
6. Extract bytecode and symbol tables

What do we need to know first?
- Detection goal: the components of the script execution mechanism (bytecode cache, VPC, interpretation function)
- Detecting the VPC first is the key

Key Assumptions for Detection
- The number of memory reads to the VPC is proportional to the number of statements in the input script
- An instruction in a bytecode cache
  is always pointed to by the VPC
- The interpretation function has repeated memory reads to the VPC

Detection
- VPC
  - Run scripts of various lengths
  - Find a memory region whose number of reads is proportional to the script length
- Bytecode cache / interpretation function
  - Detect by using memory accesses to the VPC
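The VPC detection idea can be sketched as follows. The code assumes a hypothetical trace format (a mapping from each memory address to its read count for one run) and a hypothetical tolerance, since the slides do not specify how traces are collected or how proportionality is tested. It reports addresses whose read count grows in proportion to the number of statements in the test script.

```python
def find_vpc_candidates(runs, tolerance=0.1):
    """Return addresses whose read counts scale with script length.

    runs: list of (num_statements, {address: read_count}) pairs, one per
    test script. Both this format and the tolerance are assumptions.
    """
    if len({n for n, _ in runs}) < 2:
        raise ValueError("need runs with at least two different script lengths")
    # Only addresses that are read in every run can be the VPC.
    common = set(runs[0][1])
    for _, reads in runs[1:]:
        common &= set(reads)
    candidates = []
    for addr in sorted(common):
        # Reads per statement should be roughly constant across runs.
        ratios = [reads[addr] / n for n, reads in runs]
        lo, hi = min(ratios), max(ratios)
        if lo > 0 and (hi - lo) / hi <= tolerance:
            candidates.append(addr)
    return candidates


# Synthetic traces: 0x7000 behaves like a VPC (about 3 reads per statement),
# 0x8000 is read a nearly fixed number of times, 0x9000 appears in one run only.
runs = [
    (10, {0x7000: 30, 0x8000: 5, 0x9000: 1}),
    (100, {0x7000: 301, 0x8000: 5}),
    (1000, {0x7000: 3002, 0x8000: 6}),
]
print([hex(addr) for addr in find_vpc_candidates(runs)])
```

Once a VPC candidate is known, the addresses it points to and the code that reads it narrow down the bytecode cache and the interpretation function, as the Detection slide describes.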
Accessed Memory Region Detection
- Pointer tainting: assign a taint tag to the pointer to the management structure, which is an argument of the interpretation function interp(script_ctx_info, ...)
- Propagate and check: when a tagged pointer is dereferenced, the tag is propagated to the destination address
- Determine a memory region with the tag as accessed
- The analyses hereafter focus only on the accessed memory regions

Features of Test Script
- We manually craft test scripts to:
  - Run dynamic analysis
  - Control the memory state for the convenience of later analysis
- Example: global_var = 123456
  - Feature 1: has an assignment statement in each scope (this example is for the global scope)
  - Feature 2: uses a characteristic value searchable in memory

Value Object Detection
- Find a value object by searching memory for a characteristic value
- Example: for the test script global_var = 123456, search the memory reachable from interp(script_ctx_info, ...) for 123456

Structure/Array Dereference Analysis
- Find structure/array accesses
- Determine base addresses and offsets
Example:
    mov rcx, [rdx+0x40]
    mov rbx, [rcx+rsi*8]
    mov rax, [rbx+0x10]
- Get the base address (e.g., rbx, obtained by a pointer dereference)
- Find memory accesses that use the base register
- Get the offset/index (e.g., +0x10) of the member/element in the struct/array
- Repeat

Dereference Analysis of Symbol Tables
- Analyze all accessed structures and arrays
- Find dereference paths from the management structure to value objects
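Value object detection and dereference path discovery can be illustrated on a simulated memory snapshot. This is a sketch under stated assumptions, not the authors' tool. The snapshot layout, the 8-byte little-endian word size, the `span` limit, and the function names are hypothetical, and the offsets simply mirror the assembly example above. The real technique derives paths from memory access traces, whereas this sketch swaps in a plain breadth-first search over pointers.

```python
import struct
from collections import deque

WORD = 8          # assumed pointer size in bytes
BASE = 0x10000    # assumed base address of the simulated snapshot


def find_value(mem, base, value):
    """Return addresses of aligned words that hold the characteristic value."""
    packed = struct.pack("<Q", value)
    return [base + off for off in range(0, len(mem) - WORD + 1, WORD)
            if mem[off:off + WORD] == packed]


def find_dereference_path(mem, base, start, target, span=0x80, max_depth=4):
    """Breadth-first search over pointers from `start` to address `target`.

    Returns a list of offsets: every offset but the last is dereferenced,
    and the last one locates the value inside the final object.
    """
    last = base + len(mem) - WORD
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        obj, path = queue.popleft()
        for off in range(0, span, WORD):
            addr = obj + off
            if not base <= addr <= last:
                break
            if addr == target:
                return path + [off]
            ptr = struct.unpack_from("<Q", mem, addr - base)[0]
            if base <= ptr <= last and ptr not in seen and len(path) < max_depth:
                seen.add(ptr)
                queue.append((ptr, path + [off]))
    return None


# Simulated snapshot: management structure -> array -> value object.
mem = bytearray(0x200)
struct.pack_into("<Q", mem, 0x40, 0x10080)   # management +0x40 -> array
struct.pack_into("<Q", mem, 0x88, 0x10100)   # array[1]         -> value object
struct.pack_into("<Q", mem, 0x100, 1)        # value object: type tag
struct.pack_into("<Q", mem, 0x110, 123456)   # value object: value at +0x10

hits = find_value(mem, BASE, 123456)
path = find_dereference_path(mem, BASE, BASE, hits[0])
print([hex(h) for h in hits], [hex(o) for o in path])
```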
Structure Analysis of Symbol Tables
- A symbol table containing an arbitrary number of variables must be handled
- If references to value objects in the symbol table are managed with arrays:
  - Only the array length varies
  - The reference structure does not vary
Diagram: interp(script_ctx_info, ...) -> management structure -> array -> value objects (1234, 5678, 9012)
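One way to express "only the array length varies" is to compare dereference paths learned for several variables and turn the positions that differ into a wildcard array index. This is a minimal sketch. The path representation (a list of byte offsets), the word size, and the function name are hypothetical and not taken from the slides.

```python
def generalize_paths(paths, word=8):
    """Merge offset paths that differ only at array-index positions.

    paths: dereference paths (lists of byte offsets) to different value
    objects. Returns a template in which varying positions become "*",
    or None if the paths do not share one reference structure.
    """
    if not paths or len({len(p) for p in paths}) != 1:
        return None
    template = []
    for offsets in zip(*paths):
        if len(set(offsets)) == 1:
            template.append(offsets[0])   # fixed structure member
        elif all(off % word == 0 for off in offsets):
            template.append("*")          # array element: only the index varies
        else:
            return None
    return template


# Paths to three variables stored in the same symbol table array.
paths = [[0x40, 0x00, 0x10], [0x40, 0x08, 0x10], [0x40, 0x10, 0x10]]
print(generalize_paths(paths))
```

A template of this kind can then be applied to a script with any number of variables by iterating over the wildcard position.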
Time to Extract!
Preparation phase: in the attacker's environment, the malicious bytecode (and data) is extracted as the injection payload, the target bytecode (and data) of the benign script to be replaced is extracted as the signature for memory scan, and both are embedded into the injector.

Extraction of Bytecode and Symbol Tables
- Execute a malicious script with the behavior to inject
- Suspend the execution at the beginning of the interpretation function
- Explore the structures from the management structure to the symbol tables based on the obtained structural information
- Read their memory to extract bytecode and symbol tables

試合 Bytecode Jiu-Jitsu Attack: Determine Place to Inject in Victim's Environment

Time to Inject!
Attack phase: the injector infiltrates into the victim's environment, scans memory to locate the target bytecode by signature, and injects the malicious bytecode (and data).

Know Your Victim
- Final step: locate the proper position to inject to
- Memory space layout is randomized
  - The location of bytecode and symbol tables differs across executions
- It is difficult to reveal the internal memory state of the interpreter in the victim's environment
  - Debuggers should not be used because that is too suspicious
- Approach: memory search and exploration
  - Identify the internal state by memory read only
  - Without using debuggers

Recognizing Structure of Target Interpreter
1. Suspend execution and enumerate all stack and heap memory
2. Detect management structures by backtracking from a value object
   - Find a value object by searching for a value (e.g., 1234) in memory
   - Find the pointer to the current structure with memory search
   - Calculate the base address of the structure or array that contains the pointer
   - Repeat these two steps
Diagram: Management Structure A (pointer to B) -> Structure B (pointer to C) -> Array C (element C[i], pointer to D) -> Structure D (value object)

Injection of Bytecode and Symbol Tables
- Traverse memory in the forward direction
- Write bytecode and symbol tables
- Overwrite the VPC to point to the bytecode entry
- Resume the execution

亂取 Experiments and Evaluations

Experimental Setup
- Target interpreters: Python, Lua, VBScript
- Feature: widely used / frequently used by attackers
- Implementation type: open source (Python, Lua); both open source and proprietary (VBScript)
- We chose open-source interpreters as targets to verify detection points

Analysis/Injection Test
- Interpreters: Python, Lua, VBScript
- Detection: VPC, bytecode cache, interpretation function
- Analysis: symbol tables, value object
- Extraction: bytecode, symbol tables
- Injection: code execution
- All steps of our analysis technique could analyze
  interpreters correctly

Detectability of Bytecode Jiu-Jitsu
- We built two types of Bytecode Jiu-Jitsu injectors
  - Inject infinite loop: for evaluating detectability of just the injection behavior
  - Inject downloader malware: for evaluating detectability of injection + bytecode behavior
- We evaluated whether each security tool can detect them

Security tools                          Tools used for the experiment
Anti-virus (AV)                         72 AV products
Malware analysis sandbox                CAPE sandbox
Endpoint Detection and Response (EDR)   System monitoring tool (frequently used as simple EDR)
Memory forensics tools                  Volatility with hollowfind/imgmalfind/ptemalfind

Detectability of Bytecode Jiu-Jitsu: Result

Security tools           Infinite loop   Downloader
AV                       9/72            9/72
Sandbox
EDR
Memory forensics tools

- Only 9 AI-based engines flagged it as suspicious
- Injection requires only memory read/write, which makes it difficult to detect
- Detected the behavior of injected bytecode
- Detection relies on the executable permission of memory; Bytecode Jiu-Jitsu does not require it and is out of their scope

Demo

受身 Countermeasures against Bytecode Jiu-Jitsu

Countermeasures with Existing Tools
- AV
  - Flag memory read/write APIs as suspicious
- EDR and sandbox
  - Detect memory writes to an interpreter process
  - Determine whether the written data is bytecode using signatures, etc.
- Memory forensics
  - Analyze an injector binary, detect unnatural parent-child relationships
- OS security
  - Protect interpreter processes and restrict memory write accesses
- Manual analysis
  - Difficult. No bytecode specification, debuggers, or disassemblers

Countermeasures in Future Studies
- Bytecode / malicious bytecode identification
  - A learning-based approach may be applicable
- Manual analysis support
  - Analyze the instruction set of bytecode, build debuggers/disassemblers

Identification       Input                   Output             Applies to
Bytecode             Unknown byte sequence   Bytecode/Not       EDRs and sandboxes
Malicious bytecode   Bytecode                Malicious/Benign   Memory forensics

総括 Takeaways

Takeaways
- Utilizing bytecode for code injection had not been much discussed before
- Our reverse engineering techniques revealed it to be a realistic threat
- Be more careful about bytecode as payload from now on!
- Security researchers should discuss further countermeasures
- We hope our PoC tools will help them
Our PoC tools will be available soon here: https:/

Thank you!