Unraveling the Mind behind the APT
Analyzing the Role of Pretexting in CTI and Attribution
Speaker: Sanne Maasakkers
BlackHat USA 2024 briefings

Contents
01 Introduction
02 Research concept
03 Analyzing content
04 Analyzing context
05 Result & demos
06 Conclusion & outlook

01 Introduction

Sanne
- Joined Mandiant Intelligence / Google Cloud in 2023 as Senior Analyst
- Previously worked in the Red Team / Research & Intel Fusion Team (Fox-IT) and the Fusion Centre (NCSC-NL), analyzing threats against The Netherlands
- <3 malware and being creative with (actor/threat) data
- Coach of the European CTF team, creator of Hackchallenges
- EU lead at (DEFCON's) Adversary Village

[Chart: percentage (%) by category: Exploit, Phishing, Prior Compromise, Stolen Credentials, Brute Force, Web Compromise, Server Compromise, Third-Party Compromise, Other, Phishing (Social Media), SIM Swap]

Threat groups
[Figure: threat groups and UNC clusters]
Clustering
Emails are associated with a threat group mostly through various technical, tactical and strategic indicators, including:
- Technical: reuse of malware or code within malware attachments, reuse of infrastructure, including IP addresses, domains, and hosting providers.
- Tactical: consistent use of specific tactics in the infection chain, patterns in infrastructure.
- Strategic: common geographical and industry targeting.

Behavioral
Spear phishing

02 Research concept

Concept
This research focuses on the behavioral characteristics of APT phishing emails, including the pretext and email scenario, and their importance in linking (new) phishing campaigns to their authors. This includes both the content and the context of the email.

Example
Subject: software update
Dear,
If you already have software installed on your computer, you'll be asked to download and install the update. Once the new update is installed, software should function normally.
[install instructions including download link]
You must have administrative privileges on your computer to install software.
Service Desk

Subject: Access has been changed
Dear,
This message is to notice you that we have built a new type system. The certificate for the current software client will soon expire and prevent users from logging on.
[install instructions]
Please contact the staff if you have any questions.
Service desk

*Emails are slightly altered for security and privacy purposes

VIBEINT refers to information obtained from a gut feeling or intuition, often based on previous experience. It is mostly unverified and unreliable, but it can sometimes provide insights or lead to further investigation.

Scenario
https:/

Email
Content features, shown on the "software update" example email:
- Subject
- Salutation
- Language
- Textual features
- Attachment or URL
- Signature

Context
Context features, shown on the same example email:
- Theme
- Persuasion
- Sender type
- Goal
- Design

Analysis
[Diagram labels: Dataset; Textual features; Contextual features; Combined model; Context analysis; Language analysis; Stylometric analysis]

03 Analyzing content

Stylometry is the statistical analysis of linguistic style in written or spoken language, aiming to identify patterns and features unique to specific authors. This analysis can be applied to attribute authorship.

Stylometry
It uses statistics to analyze an author's lexical and syntactic features.
- Lexical features: word frequencies, word length distribution, hapax legomena, vocabulary richness.
- Syntactic features: sentence length, average word length, punctuation usage.
Think of it as identifying someone based on how they talk, not (just) what they say. It is a common technique and already used to
analyze (anonymous) authors, threatening letters or ransom texts.

Example

Stylometric 1
Dear Sir,
For your information. See the attach.

Stylometric 2
month Financial Data Table. Have you got it? Please check it.

Feature              Stylometric 1   Stylometric 2
average_length       4.625           4.63
short_words          0.50            0.36
proportion_digits    0.0             0.0
proportion_capital   0.09            0.09
text_richness        1               0.82
hapax_legomena       8               9

Lingua

Stylometry
However, stylometry has several limitations, including:
- Semantic understanding: it does not understand the meaning of words or the nuances of language.
- Contextual awareness: it struggles to analyze relationships between non-sequential words, sentences, or paragraphs, missing the broader context of the text.
- Domain-specific knowledge: it lacks understanding of specialized fields or jargon, which can be crucial for accurate analysis in certain types of texts.

Stylometry
While stylometry has been used for years in authorship attribution, its efficacy on APT emails is limited. A model trained on a relevant dataset results in an overall accuracy of 41%.
01 Text richness
02 Average number of words/sentence
03 Distribution of unicode characters

A language model analyzes text by considering the context of each word, capturing subtle nuances in meaning, and understanding complex word and sentence relationships.

Language model
Pre-trained models can be used to perform various natural language processing tasks like text classification. They provide a powerful starting point for fine-tuning on specific tasks, saving time and resources compared to training from scratch.
BERT is a pre-trained language model based on the transformer architecture. Transformers are deep learning models that use multi-head attention to weigh the importance of different words in a sentence, allowing for better understanding of context and meaning.

Language model
This resulted
in an accuracy of 60% on all 33 actors. But how?
Machine learning models can be explained by SHAP (SHapley Additive exPlanations). So, what happens if you try to predict a new text on the fine-tuned language model?
https:/

Important information. Due to the deterioration of the epidemiological situation, as well as due to the increase in the number of sick of the Omicron COVID-19 embassy staff, the Embassy of the Republic of Turkey is being transferred to a state of isolation and closed to the public. Please check the list of sick employees to identify the possibility of contact with them. All detailed information about the sick, as well as about the new mode of operation of the embassy in the attachment.
- Please confirm receipt of the email with a return response.
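SHAP builds on Shapley values: each token is credited with its average marginal contribution to the model score over all orders in which tokens could be added. The sketch below computes exact Shapley values for a toy scoring function. It does not use the shap library or the fine-tuned model from the talk; `toy_score`, its weights, and the three tokens are hypothetical stand-ins chosen only to make the arithmetic visible.

```python
from itertools import permutations


def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        previous = value_fn(frozenset(present))
        for feature in order:
            present.add(feature)
            current = value_fn(frozenset(present))
            totals[feature] += current - previous
            previous = current
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(present_tokens):
    """Hypothetical score for one actor; NOT the model from the talk."""
    score = 0.1  # base score with no tokens present
    if "embassy" in present_tokens:
        score += 0.3
    if "attachment" in present_tokens:
        score += 0.2
    # Interaction term: the two tokens together add extra evidence.
    if "embassy" in present_tokens and "attachment" in present_tokens:
        score += 0.2
    return score


tokens = ["embassy", "attachment", "please"]
phi = shapley_values(tokens, toy_score)
for token, value in phi.items():
    print(f"{token:12s} {value:.3f}")
```

The interaction bonus is split evenly between the two tokens that cause it, and the values sum to the difference between the full score and the base score. Exact enumeration grows factorially with the number of tokens, which is why practical tooling relies on approximations.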
[Figure: model output for the email above; the numeric values are not recoverable from the extraction]

Content analysis highlights
01 For replies, the language used was not always consistent with the language of the initial email.
02 Similar emails could be written in completely different languages and discuss entirely different topics.
03 It's not just about using theme-specific words; it also focuses on the grammar used, such as "had + adverb + past participle" or speaking
in the first person.

04 Analyzing context

The context of an email includes elements that shape its meaning and purpose, such as theme, goal and the social engineering techniques employed to influence the recipient.

Extracting these features
Large Language Models (LLMs) can effectively extract key contextual elements from emails, including:
- The inclusion of personal touches or signatures
- The overall theme of the email
- The social engineering techniques used to influence the recipient
The local LLM is given extra training documents to better understand and classify these features from emails and do simple categorization tasks.

Theme
Analysis of email themes reveals the most common themes are as follows:
- Invitations or requests (meetings, interviews, events)
- COVID-19 related (absences, changes)
- Account issues (resets, problems, settings)
These are categorized into the following categories:
- A recent event (COVID-19 or global events)
- An important value for the receiver (proposals)
- A timeless and generic theme (please
find attached)

prompt = (
    f"I will give you the subject and content of an email. "
    f"First of all, give me the main theme of the email. "
    f"Additionally, you know everything about Cialdini's 6 principles of influence: "
    f"Reciprocity, Commitment and Consistency, Social Proof, Authority, Liking, and Scarcity. "
    f"Based on the supplied text, I want you to give me the most likely principle used in the text "
    f"(or None if none of the principles match) and the reason why in maximum of 30 words."
    f"\nFormat instructions: {format_instructions}"
    f"\nEmail subject: {subject}"
    f"\nEmail content: {body}\n"
)

Social engineering
The principles of influence, defined by Cialdini, are a set of psychological and social phenomena that can be used to influence behavior and decision-making. By leveraging these principles, phishers can create a sense of urgency, trust, or authority that overrides the recipient's natural caution.
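The extraction prompt leaves open how the model's answer is consumed. The following is a minimal sketch, assuming the format instructions ask for a JSON object with the keys `theme`, `principle` and `reason`; that reply format, the helper names `build_prompt` and `parse_reply`, and the sample reply are hypothetical and not taken from the talk. It fills the prompt template and normalizes the returned principle to one of Cialdini's six, or `None`.

```python
import json

PRINCIPLES = (
    "Reciprocity",
    "Commitment and Consistency",
    "Social Proof",
    "Authority",
    "Liking",
    "Scarcity",
)


def build_prompt(subject: str, body: str, format_instructions: str) -> str:
    """Fill the prompt template with one email."""
    return (
        "I will give you the subject and content of an email. "
        "First of all, give me the main theme of the email. "
        "Additionally, you know everything about Cialdini's 6 principles of influence: "
        + ", ".join(PRINCIPLES[:-1]) + ", and " + PRINCIPLES[-1] + ". "
        "Based on the supplied text, I want you to give me the most likely principle "
        "used in the text (or None if none of the principles match) and the reason "
        "why in maximum of 30 words."
        f"\nFormat instructions: {format_instructions}"
        f"\nEmail subject: {subject}"
        f"\nEmail content: {body}\n"
    )


def parse_reply(reply: str) -> dict:
    """Validate a JSON reply (assumed format) against the six principles."""
    data = json.loads(reply)
    wanted = str(data.get("principle", "")).strip().casefold()
    # Anything outside the six known principles is mapped to "None".
    principle = next((p for p in PRINCIPLES if p.casefold() == wanted), "None")
    reason_words = str(data.get("reason", "")).split()
    return {
        "theme": str(data.get("theme", "")).strip(),
        "principle": principle,
        "reason": " ".join(reason_words[:30]),  # enforce the 30-word limit
    }


example_prompt = build_prompt(
    subject="software update",
    body="Dear, If you already have software installed on your computer ...",
    format_instructions='Return a JSON object with keys "theme", "principle", "reason".',
)
example_reply = '{"theme": "Account issues", "principle": "authority", "reason": "Sender poses as the service desk."}'
print(parse_reply(example_reply))
```

Normalizing the answer keeps the principle usable as a categorical feature even when the model varies capitalization or returns a label outside the list.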
Principles of influence

Principle: Authority
Greetings!
On behalf of [important person in policy], I would like to invite you to a briefing with [important person in policy] on [date]. [person] will discuss [topic] and your input will be appreciated.
Kind regards,
[name]
Invite.hta

Principle: Commitment and Consistency
Dear [name],
As a follow up on our conversation, I'm sending you the job profile of the developer position at [organization] attached. Looking forward to hearing from you soon.
Kind regards,
[name]
[recruiter at organization]
Job profile.doc

Principle: Liking
Hey [name],
Long time no see and best wishes for the New Year! I hope that you will find good health and luck in the upcoming year. Please find my New Year's wishes attached on this URL: [URL]
[name]

Principle: Reciprocity
Hi [name],
Sorry for sending this via [platform], but I've had a lot of struggles uploading the files. Hope this is OK! Hope it works for you now, it should only be accessible by you. Let me know if there are problems.
[URL to platform]
[name]

Principle: Scarcity
Hi [name],
As mentioned, just wanted to pass this document. The password is 123456. This is a confidential document, so please don't share it with anyone. Thank you and we keep in touch.
[name]
Files.rar

Principle: Social Proof
Hey [name],
My name is [name] and I've recently had a talk with [person X], [person Y]. We were wondering if you would be interested in joining [project Z], we definitely think you're the right person with valuable insights. Please find more details here: [URL]
Kind regards,
[name]

Context
The prediction model built on contextual features achieved a 67% accuracy across all authors, with the following features being the most prevalent:
01 Principle of influence
02 Sender category
03 Theme

05 Result & demos

The models are combined with a meta-model. A meta-model is a higher-level model that learns how to best integrate the predictions or outputs of the three individual models and is used for this analysis.

Result
The three models combined (and tuned) result in an overall accuracy of 88-96%, after removing the least performing actors from the set. Insufficient data for certain actors impacts the model's ability to learn their patterns effectively,
so the model is not fully able to make predictions for those actors. The remaining groups are interesting groups that have been active in recent years.

Analysis
1. Get insights in clusters: find similarities and differences
2. Finding outliers: reconsider the links
3. Find author: cluster with more confidence

Let's dive into some visualizations and examples of how it helps clustering.

Result
[Figures: visualizations shown on five "Result" slides]

Subclusters
Multiple actors (labels) have subclusters within their respective clusters.
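The talk does not state how outliers and subclusters were computed. As one possible illustration, the sketch below flags emails whose feature vector has low cosine similarity to the centroid of their labeled actor. The feature vectors, the actor label `actor_a`, and the 0.8 threshold are hypothetical values picked for the example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def find_outliers(emails_by_actor, threshold=0.8):
    """Return (actor, email index, similarity) for emails far from their cluster."""
    flagged = []
    for actor, vectors in emails_by_actor.items():
        center = centroid(vectors)
        for index, vector in enumerate(vectors):
            similarity = cosine(vector, center)
            if similarity < threshold:
                flagged.append((actor, index, similarity))
    return flagged


# Hypothetical behavioral feature vectors; the last email differs from the rest.
emails_by_actor = {
    "actor_a": [
        [1.0, 0.0, 0.0],
        [0.9, 0.1, 0.0],
        [1.0, 0.1, 0.0],
        [0.0, 1.0, 0.0],
    ],
}
outliers = find_outliers(emails_by_actor)
for actor, index, similarity in outliers:
    print(f"{actor}: email {index} has similarity {similarity:.2f} to its cluster")
```

A flagged email is only a prompt for an analyst to reconsider the link: it may be a mislabeled sample, or the start of a subcluster.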
A review of these subclusters revealed the following:
- Change in targeting: actors have adapted their writing styles based on the targets (geographical, industry or person).
- Change over time: subclusters showed emails sent around the same time. Although the content is totally different, these emails might be part of the same campaign.
- Distinct cluster: UNCs are considered part of the actor, but this isn't necessarily the case. They could represent a separate cluster, an affiliate, or simply another individual.

Find author
comment:"#muddywater" has:email_parents
18 results

06 Conclusion & outlook

Conclusion
The proposed model for clustering campaigns based on behavioral features has proven effective in analyzing the majority of emails from both APTs and TEMPs. This underscores the potential of behavioral analysis to contribute to the accurate clustering of groups or linking new attacks to groups, next to clustering techniques already in place. It can aid threat intelligence analysts to understand trends
and new phishing TTPs leveraged by specific threat actors and support in threat hunting.

Outlook & implications
Further research could involve incorporating technical, tactical and strategic attributes into the model to have a full overview of a campaign. As discussed, those models have limitations, but so does this model:
- LLM usage: The use of LLMs for gene