#BHASIA BlackHatEvents

Game of Cross Cache: Let's win it in a more effective way!
Le Wu, Baidu Security

About me
- Le Wu, NVamous on Twitter
- Focus on Android/Linux vulnerability research
- Dirty Pagetable: a novel technique to rule the Linux kernel [1] (Black Hat USA, Europe, Asia)

[1] https://yanglingxi1993.github.io/dirty_pagetable/dirty_pagetable.html

Agenda
- Introduction to the cross-cache attack
- Challenges in the cross-cache attack
- Advancing towards a more effective cross-cache attack
- Exploiting a file UAF with Dirty Pagetable
- Summary

Introduction to the cross-cache attack

A simplified cross-cache attack for a UAF (object A and object B could also be pages or other kinds of memory regions):
1. Trigger the UAF to release the victim object A.
2. Reclaim the victim slab of object A to the page allocator.
3. kmem_cache B reuses the pages of the victim slab, and object A is reallocated as object B.
4. Perform operations on the victim object A to corrupt object B.
5. Make use of the corrupted object B to get ROOT.

The cross-cache attack is getting popular because it can:
- Exploit an otherwise unexploitable vulnerable object, especially one allocated from a dedicated kmem_cache.
- Transform an unknown vulnerability into a well-known one to simplify the exploitation.
- Build data-only exploitation techniques to defeat growing mitigations like KASLR, PAN, and CFI.

Method           | Cross-cache from | Cross-cache to
ret2dir          | *                | direct mapping
ret2page         | *                | kernel-allocated page
Dirty Cred       | *                | struct cred
Dirty Pagetable  | *                | user page table

Well, it's known as an unstable technique. Can we make it less unstable, or in other words, more efficient?

Common workflow of the cross-cache attack

Step 0. Common knowledge about the SLUB allocator:
- objs_per_slab: the number of objects in a single slab
- order: the order of the pages backing a single slab
- cpu_partial: the maximum number of slabs that can be put in the percpu partial list

The deterministic method for putting a slab into the percpu partial list:
1. Create a full slab.
2. Pin on cpu#0 and release an object from the full slab.

Flushing of the percpu partial list:
- Slabs containing some in-use objects are placed on SLUB's per-NUMA-node partial list.
- Slabs that are completely empty are freed back to the page allocator.
Step 1. Pin our task to a single CPU, for example cpu#0. [2]
Step 2. Drain partially-free slabs of all their free objects.
Step 3. Allocate around objs_per_slab * (1 + cpu_partial) objects.
Step 4. Allocate objs_per_slab - 1 objects as pre-alloc objects.
Step 5. Allocate the victim object.
Step 6. Trigger the vulnerability (UAF) to release the victim object.
Step 7. Allocate objs_per_slab + 1 objects as post-alloc objects.
Step 8. Release all the pre-alloc and post-alloc objects.
Step 9. Free one object per slab from the allocations in Step 3. After releasing cpu_partial - 1 objects, releasing one more object triggers the flushing of the cpu partial list.
Step 10. Heap spray with object B to occupy the victim slab; victim object A gets reallocated as object B.
Step 11. Construct primitives for privilege escalation.

[2] https:/

Challenges in the cross-cache attack
- Challenge 1: how to discard the victim slab under a constrained allocation primitive.
- Challenge 2: how to make a high-order slab reuse the low-order slab deterministically.

Challenge 1: how to discard the victim slab under a constrained allocation primitive

Step 3 above ("allocate around objs_per_slab * (1 + cpu_partial) objects") requires us to:
- Allocate a large number of objects.
- Keep this large number of objects unreleased for a while.

This is getting hard:
- A dedicated kmem_cache is becoming a mitigation for cross-cache
attacks, and we can hardly find suitable allocation primitives, given known mitigations like CONFIG_RANDOM_KMALLOC_CACHES and AUTOSLAB.
- Limited system resources.
- Constraints of kernel components, e.g. a temporary kernel object that gets allocated and then released right away.

Challenge 2: how to make a high-order slab reuse the low-order slab deterministically (order-N pages -> order-M pages, N < M)

This could be done by allocating tons of object B, so that the order-N pages will eventually be reused as order-M pages. But this may require too many instances of object B, which can be really hard under limited system resources.
Allocating tons of object B won't help. We need to let the order-N pages get compacted into order-M pages, so that object B can reuse those order-N pages. So how? By shaping the heap!

Advancing Towards a More Effective Cross-Cache Attack

CVE-2023-21400 [3]: a race in the msm_npu driver, where two tasks unload the same network concurrently:

Task A (on cpu1):
    mutex_lock(&host_ctx->lock);
    network = get_network_by_hdl(host_ctx, unload->network_hdl);
    unload_cmd1 = npu_alloc_network_cmd(host_ctx, 0);
    npu_queue_network_cmd(network, unload_cmd1);
    mutex_unlock(&host_ctx->lock);

    wait_for_completion_timeout(&unload_cmd1->cmd_done, NW_CMD_TIMEOUT); /* 20s */

    mutex_lock(&host_ctx->lock);
    npu_dequeue_network_cmd(network, unload_cmd1);
    npu_free_network_cmd(host_ctx, unload_cmd1);
    free_network(host_ctx, client, network->id);
    mutex_unlock(&host_ctx->lock);

Task B (on cpu2):
    mutex_lock(&host_ctx->lock);
    network = get_network_by_hdl(host_ctx, unload->network_hdl);
    unload_cmd2 = npu_alloc_network_cmd(host_ctx, 0);
    npu_queue_network_cmd(network, unload_cmd2);
    mutex_unlock(&host_ctx->lock);

    wait_for_completion_timeout(&unload_cmd2->cmd_done, NW_CMD_TIMEOUT);

    mutex_lock(&host_ctx->lock);
    npu_dequeue_network_cmd(network, unload_cmd2);
    npu_free_network_cmd(host_ctx, unload_cmd2);
    free_network(host_ctx, client, network->id);
    mutex_unlock(&host_ctx->lock);

When the race hits, unload_cmd1 gets released in the other task's path: a UAF or double free happens! (NW_CMD_TIMEOUT is 20s, which leaves a wide race window.)

[3] https:/

With the bug, look at the freeing path:

    static void npu_dequeue_network_cmd(struct npu_network *network,
                                        struct npu_network_cmd *cmd)
    {
        list_del(&cmd->list);
    }

    static void npu_free_network_cmd(struct npu_host_ctx *ctx,
                                     struct npu_network_cmd *cmd)
    {
        if (cmd->stats_buf)
            kmem_cache_free(ctx->stats_buf_cache, cmd->stats_buf);
        kmem_cache_free(ctx->network_cmd_cache, cmd);
    }

So we obtain:
- a list_del() primitive,
- a double free primitive,
- an arbitrary kmem_cache_free() primitive.

The victim object is struct npu_network_cmd:

    struct npu_network_cmd {
        struct list_head list;
        ...
        struct completion cmd_done;
        /* stats buf info */
        uint32_t stats_buf_size;
        void __user *stats_buf_u;
        void *stats_buf;
        int ret_status;
    };

It is allocated from a dedicated kmem_cache, IPA_TX_PKT_WRAPPER: a clean and inactive kmem_cache.

Exploitation plan: victim npu_network_cmd object -> victim file array -> a file UAF -> get ROOT! Data-only exploitation, woohoo! But the cross-cache attack is known for being unstable.

Outline: trigger the issue -> cross-cache attack (making use of the arbitrary kfree() primitive) -> Dirty Pagetable.

Step 1. Trigger the issue.
Step 2. Cross-cache attack: cross from kmem_cache IPA_TX_PKT_WRAPPER to the file array (kmalloc-8k).
- IPA_TX_PKT_WRAPPER: order-0 slab.
- file array: allocated from kmem_cache kmalloc-2k ~ kmalloc-8k, which are all order-3 slabs:

    static struct fdtable *alloc_fdtable(unsigned int nr)
    {
        struct fdtable *fdt;
        void *data;
        ...
        nr /= (1024 / sizeof(struct file *));
        nr = roundup_pow_of_two(nr + 1);
        nr *= (1024 / sizeof(struct file *));
        ...
        data = kvmalloc_array(nr, sizeof(struct file *), GFP_KERNEL_ACCOUNT);
        ...
        fdt->fd = data;
        ...
        return fdt;
    }

We choose kmalloc-8k to allocate the file array from.

For Step 2, the two challenges become:
- Challenge 1: how to discard the victim order-0 slab under a constrained allocation primitive.
- Challenge 2: how to make an order-3 slab reuse the order-0 slab deterministically.
Challenge 1: how to discard the victim order-0 slab under a constrained allocation primitive

We can't allocate a large number of npu_network_cmd objects and keep them unreleased for a while: the npu_network_cmd object is a temporary kernel object, it gets allocated and then released right away. The relevant ioctls:
- MSM_NPU_LOAD_NETWORK_V2
- MSM_NPU_UNLOAD_NETWORK
- MSM_NPU_EXEC_NETWORK_V2 (we use this one later)

A really constrained allocation primitive:

    struct npu_network_cmd *cmd = NULL;

    mutex_lock(&host_ctx->lock);
    cmd = kmem_cache_zalloc(ctx->network_cmd_cache, GFP_KERNEL);
    mutex_unlock(&host_ctx->lock);

    wait_for_npu_firmware();

    mutex_lock(&host_ctx->lock);
    kmem_cache_free(ctx->network_cmd_cache, cmd);
    mutex_unlock(&host_ctx->lock);

Well, we found another kernel object sharing the same kmem_cache IPA_TX_PKT_WRAPPER because of slab merging, from the msm_cvp driver:

    struct msm_cvp_frame {
        struct list_head list;
        struct msm_cvp_list bufs;
        u64 ktid;
    };

But system privilege is required to access that driver. So we can't even discard the victim order-0 slab with
the old method.

Solving Challenge 1: discard the empty slab in a racy way

The slab move primitive: move the cpu slab from one cpu to another cpu's percpu partial list. Example: move the cpu slab of cpu#1 into the percpu partial list of cpu#0:

Step 1. Pin the task on cpu#1.
Step 2. Make the cpu slab of cpu#1 full by allocating OBJS_PER_SLAB objects.
Step 3. Pin the task on cpu#0.
Step 4. Release all the objects allocated in Step 2. The "slab move" happens as soon as the first object of the full slab gets released.

With the help of the slab move primitive, we can put one more slab into the cpu partial list of the target cpu by allocating at most OBJS_PER_SLAB objects!
By repeating the slab move primitive, we can put a controllable number of slabs into the percpu partial list of the target cpu. With this new way of putting slabs into the percpu partial list, we can remove Step 3 of the common workflow of the cross-cache attack, and replace its Step 9 with repeated slab moves.

Repeating the slab move primitive lets us discard the victim slab under a very constrained allocation of objects: ideally, we can finish the attack with only OBJS_PER_SLAB objects! However, that is still not good enough for this issue: we only have the ability to allocate one npu_network_cmd object at a time, and to hold it for a very short time.

The race-style slab move primitive: run N tasks (N >= OBJS_PER_SLAB), all pinned on cpu#1, all racing on the constrained primitive:

    /* Task 1 .. Task N, pinned on cpu#1 */
    struct npu_network_cmd *cmd;

    mutex_lock(&host_ctx->lock);
    cmd = kmem_cache_zalloc(...);
    mutex_unlock(&host_ctx->lock);
    ...
    mutex_lock(&host_ctx->lock);
    kmem_cache_free(..., cmd);
    mutex_unlock(&host_ctx->lock);

If OBJS_PER_SLAB of these tasks race so that their allocations land back to back, the OBJS_PER_SLAB allocations create a full slab on cpu#1, with every racing task pinned on
cpu#1 (N >= OBJS_PER_SLAB). Then, before all the frees land, switch some of the tasks to cpu#0; when such a task releases its object from the now-full slab, the full slab gets moved from cpu#1 to the percpu partial list of cpu#0:

    /* task for switching cpus */
    for (i = 0; i < SWITCH_CPU_NUM; i++)
        pin Task i to cpu#0;    /* SWITCH_CPU_NUM < N */

By adjusting:
- the number of race tasks,
- SWITCH_CPU_NUM,
- the race time (maybe with some time-window-expanding technique?),

we can move a relatively stable number of slabs into the percpu partial list of cpu#0.

Will there be side effects on the original percpu slabs of cpu#0? Not really. In the worst case, we might allocate SWITCH_CPU_NUM objects on cpu#0, which won't create a full slab on cpu#0, so:
- If any of these objects gets released on cpu#0, no slab move happens, because we are on the same cpu.
- If any of these objects gets released on cpu#1, no slab move happens, because that slab is not full.

With the race-style slab move primitive, we can easily add enough slabs into the percpu partial list, and then succeed in reclaiming the empty slab with a really constrained allocation primitive.

The new optimized workflow of the cross-cache attack for this issue:

Step 1. Defragmentation with the race-style slab move primitive; a new slab gets created.
Step 2. Allocate the victim object.
Step 3. Trigger the vulnerability (UAF) to release the victim object.
Step 4. Move the victim slab to the percpu partial list of cpu#1, without triggering the flushing of the percpu partial list.
Step 5. Move the victim slab from the percpu partial list of cpu#1 to cpu#0, triggering the flushing of the percpu partial list of cpu#0.
Step 6. Heap spray with file arrays to occupy the victim slab.

Back to Step 2 of the exploitation (cross from kmem_cache IPA_TX_PKT_WRAPPER to the file array in kmalloc-8k):
- Challenge 1: how to discard the victim order-0 slab under a constrained allocation primitive.
- Challenge 2: how to make an order-3 slab reuse the order-0 slab deterministically.
Challenge 1: SOLVED! On to Challenge 2: how to make an order-3 slab reuse the order-0 slab deterministically.

Pre-knowledge about the page allocator (based on kernel 4.14)

A simplified view of the page allocator on Android devices (a single pgdata and a single zone): kernel space allocates through alloc_pages(), user space through mmap(). The allocator state is exported by procfs:
- /proc/pagetypeinfo (unreadable by an untrusted app): per-order, per-migration-type free counts.
- /proc/zoneinfo (unreadable by an untrusted app): the high watermark of the zone; the current number of order-0 pages in the pcplist; the maximum number of order-0 pages in the pcplist; the specific number of order-0 pages for pcplist shrinking or bulk freeing.

Characteristics of the pcplist:
- Order-0 allocation and releasing use the pcplist first, in a stack-like way.
- Flushing of the pcplist flushes from the tail.

Deterministic page merging:

    static inline void __free_one_page(struct page *page, unsigned long pfn,
                                       struct zone *zone, unsigned int order,
                                       int migratetype)
    {
        ...
    continue_merging:
        while (order < max_order - 1) {
            ...
            list_del(&buddy->lru);
            zone->free_area[order].nr_free--;
            rmv_page_order(buddy);
            combined_pfn = buddy_pfn & pfn;
            page = page + (combined_pfn - pfn);
            pfn = combined_pfn;
            order++;
        }
        ...
    }

The page allocator tends to merge low-order pages into high-order pages when the low-order pages get reclaimed into the free_area.

Solving Challenge 2: deterministic heap shaping

Step 1. Pin the task on cpu#0.
Step 2. Allocate a specific number of order-0 pages: the maximum number of order-0 pages that the pcplist can hold. Releasing these pages later will then definitely trigger the flushing of the pcplist.

Requirements for the page allocation:
- able to allocate a large number of order-0 pages,
- allocated from the UNMOVABLE free_area,
- relatively clean: no allocations other than the order-0 pages.

Requirements for the page releasing:
- synchronized releasing (no cpu switching),
- able to release the pages partially.

Choosing the proper kernel component among ION, pipe, socket, GPU (kgsl), ...: ION, socket and kgsl release their pages asynchronously, so we use the pipe. Writing to a pipe allocates an order-0 page, and reading from it releases the page:

    static ssize_t pipe_write(struct kiocb *iocb, struct iov_iter *from)
    {
        ...
        if (bufs < pipe->buffers) {
            int newbuf = (pipe->curbuf + bufs) & (pipe->buffers - 1);
            struct pipe_buffer *buf = pipe->bufs + newbuf;
            struct page *page = pipe->tmp_page;
            int copied;

            if (!page) {
                page = alloc_page(GFP_HIGHUSER | __GFP_ACCOUNT);
                if (unlikely(!page)) {
                    ret = ret ? : -ENOMEM;
                    break;
                }
                pipe->tmp_page = page;
            }
            ...

    static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
                                      struct pipe_buffer *buf)
    {
        struct page *page = buf->page;

        /*
         * If nobody else uses this page, and we don't already have a
         * temporary page, let's keep track of it as a one-deep
         * allocation cache. (Otherwise just release our reference to it)
         */
        if (page_count(page) == 1 && !pipe->tmp_page)
            pipe->tmp_page = page;
        else
            put_page(page);
    }

(The very first page won't be released because of the tmp_page cache, so we need to pre-allocate it before the heap shaping.)

Step 3. Allocate a few hundred physically contiguous order-0 pages from the UNMOVABLE free_area, owned by pipe_n, pipe_n+1, pipe_n+2, ...
Step 4. Create order-0 page holes by releasing one order-0 page out of every 8 order-0 pages.
After Step 4, the pcplist of cpu#0 holds exactly these order-0 hole pages.

Step 5. Trigger Step 1 of the new optimized workflow of the cross-cache attack (defragmentation with the race-style slab move primitive; a new slab gets created). The empty slab now comes from one of the order-0 page holes: this new slab is the victim slab.

Step 6. Occupy all the other order-0 page holes, except the one that has been used as the new slab. The requirements here (able to allocate a large number of order-0 pages, allocated from the UNMOVABLE free_area) are satisfied by ION, so ION occupies those pages.

Step 7. Finish Steps 2 to 5 of the new optimized workflow. After its Step 5, the victim slab is reclaimed to the page allocator and sits in the pcplist of cpu#0.

Step 8. Release all the pages owned by the pipes.
There must now be one and only one free order-3 page range in the shaped region, and the released victim slab must be inside it!

Step 9. Release all the pages created in Step 2 to force the flushing of the pcplist. The victim slab and the other order-0 pages are reclaimed into the free_area, and page merging happens because of the deterministic page merging: the region coalesces into order-3 pages.

Step 10. Heap spray lots of file arrays to occupy the order-3 page where the victim slab lies.

In actual practice, the success rate of the entire exploitation largely depends on Step 3: allocating a few hundred physically contiguous order-0 pages from the UNMOVABLE free_area. How? Detect the status of the page allocator in a side-channel way.

If we keep allocating order-0 pages with the __GFP_KSWAPD_RECLAIM flag set from the UNMOVABLE free_area, the allocator walks through four states:
- State 1: pages are allocated from the pcplist first.
- State 2: the pcplist becomes empty, and the Unmovable free_area is used, starting from the low orders. kswapd is woken up to reclaim pages if the free pages of the zone drop under the high watermark.
- State 3: the Unmovable free_area becomes empty, and free_areas of other migration types are used according to the fallback list:

    static int fallbacks[MIGRATE_TYPES][4] = {
        [MIGRATE_UNMOVABLE] = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE, MIGRATE_TYPES },
        ...
    };

- State 4: the other migration types' free_areas become empty, and the slow path for order-0 allocation is entered: wake up kswapd for reclaiming pages, then direct reclaim.

Reclaiming pages (kswapd or direct reclaim) shrinks LRU_INACTIVE_ANON, LRU_INACTIVE_FILE, LRU_ACTIVE_ANON, LRU_ACTIVE_FILE and the shrinker_list. These LRU sizes are exported by /proc/meminfo, which is accessible from an untrusted app. If the Active(anon)/Inactive(anon)/Active(file)/Inactive(file) counters get reduced frequently, the page allocator is probably in State 3 or State 4, which means the Unmovable free_area is almost empty!
This was tested on a device running kernel 4.14 and confirmed against /proc/pagetypeinfo.

Strategy for allocating a few hundred physically contiguous order-0 pages from the UNMOVABLE free_area:

Step 1. Reserve a dozen order-8/order-9 pages with ION:

    #if defined(CONFIG_IOMMU_IO_PGTABLE_ARMV7S)
    static const unsigned int orders[] = {8, 4, 0};
    #else
    static const unsigned int orders[] = {9, 4, 0};
    #endif

Step 2. Create and detect the empty state of the Unmovable free_area:

2.1 Consume a large amount of memory from both the Unmovable and the Movable free_area, putting the memory of the zone under pressure (for example, under the high watermark):

    Allocate_large_memory_with_ION();   /* consume the Unmovable free_area */
    Allocate_large_memory_with_mmap();  /* consume the Movable free_area */

2.2 Run a loop to detect the empty state of the Unmovable free_area:

    while (1) {
        Allocate_a_few_order0_pages();
        Detect_page_allocator_state_by_watching_meminfo();
        if (page_allocator_enter_state_3_or_4)
            break;
    }

Step 3. Release the order-8 pages with ION.
Step 4. Allocate some order-0 pages to reduce the noise.
Step 5. Allocate a few hundred order-0 pages from the UNMOVABLE free_area. These order-0 pages come from the splitting of the just-released high-order pages, so they will be physically contiguous.

Challenge 1: SOLVED! Challenge 2: SOLVED!

Exploit File UAF with Dirty Pagetable

The original approach: 1. use the old method to discard the victim filp slab; 2. occupy the released victim filp slab with user page tables by heap-spraying many user page tables. Our workflow:

Step 1. Use the method above to make the Unmovable free_area become almost empty.
Step 2. Discard the victim filp slab.
Step 3. Heap spray many user page tables to occupy the released victim filp slab. The occupation is more likely to succeed because the free_area is relatively clean.

Adapt Dirty Pagetable to the Samsung device

Constructing physical AARW with Dirty Pagetable (https://yanglingxi1993.github.io/dirty_pagetable/dirty_pagetable.html) is not working :( because of the mitigations on the Samsung device:
- Physical KASLR
- RO kernel text

So instead, corrupt a kernel object to construct a virtual AARW: corrupt a pipe_buffer. Make the page holding the pipe buffer follow the page owned by ION, using the similar technique as for allocating physically contiguous order-0 pages, then advance the victim PTE page by page (victim_pte += 0x1000) until it maps the pipe buffer's page, and use the pipe primitive to construct the AARW!
With the virtual AARW, corrupt the SELinux decision state. The access-vector computation:

    static void security_compute_av(u32 ssid, u32 tsid, u16 orig_tclass,
                                    struct av_decision *avd,
                                    struct extended_perms *xperms)
    {
        ...
        if (!ss_initialized)
            goto allow;
        tclass = unmap_class(orig_tclass);
        ...
        context_struct_compute_av(scontext, tcontext, tclass, avd, xperms);
        map_decision(orig_tclass, avd, policydb.allow_unknown);
    out:
        read_unlock(&policy_rwlock);
        return;
    allow:
        avd->allowed = 0xffffffff;
        goto out;
    }

    static void map_decision(u16 tclass, struct av_decision *avd,
                             int allow_unknown)
    {
        if (tclass < current_mapping_size) {
            unsigned i, n = current_mapping[tclass].num_perms;
            u32 result;

            for (i = 0, result = 0; i < n; i++) {
                if (avd->allowed & current_mapping[tclass].perms[i])
                    result |= 1 << i;
                if (allow_unknown && !current_mapping[tclass].perms[i])
                    result |= 1 << i;
            }
            avd->allowed = result;
        }
        ...
    }

Results:
- Attack with system privilege required: less than 10% success rate.
- Attack from an untrusted app: 65% (13/20) success rate.

Win the game!

Mitigations for the cross-cache attack:
- SLAB_VIRTUAL: https:/

#BHASIA BlackHatEvents

Summary
- Advancing towards a more effective cross-cache attack:
  - Solved Challenge 1: discard the victim order-0 slab under a really limited allocation primitive.
  - Solved Challenge 2: make the order-3 slab reuse the order-0 slab deterministically.
- Dirty Pagetable on a Samsung device.

Acknowledgements: Ye Zhang, Teacher Jin

Q&A