Get Better Analytics by Putting Less Data in Your Database
April 2024
Paige Roberts, Director of Product Innovation

The Inherent Challenges
- "All our tools only work on numbers. We need to convert the data."
- "We can't handle the volume of events we are receiving without loss."
- "We need to be able to utilize additional data assets to ask the right questions."
- "We will have too many false positives. How do we prioritize them?"

Data Increases Faster Than Budgets
[Chart, 2018 to 2025: Data Volume vs. Analytics Budget. Source: IDC/Statista]
- Annual data volume, average increase: 23.4%
- Annual analytics budget, average increase: 11.0%

Increasing Analytic Challenges

Exploding Analytic Data Volumes (human generated)
- Web clickstreams
- Call center phone logs
- Email and text messages
- Social media firehoses
- Telco call detail records
- Digital orders and payments

Streaming Data Overload (machine generated: vehicles, phones, robots, networks, devices)
- Machine logs
- Sensor readings
- SCADA streams
- Geolocation information

Too Slow
- Bogged down analytic databases
- Unhappy customers: real-time response expectations not getting met
- Fraud detection, not fraud prevention
- Cyber intrusions found months later
- Machine alerts not acted on until too late

Drowning in Noise
- False alarms obscuring real alerts
- Machine learning needs more focused data for training
- Duplicates from multiple data sources
- Mountains of sensor/machine data with very little of it valuable

Do It Faster

Current State of the Art
ALL the Data Goes In: MTTA = Hours
[Diagram labels: Streaming Sources; Event Stream Processing; ETL/ELT; Data Lake/Lakehouse; Analytics; Entity Resolution; Data Noise Reduction; Anomaly Detection; Prediction (+76, +152, +304)]

Realtime Stream Processing Data Refinement
Only Valuable Data Goes In: MTTA = Seconds
[Diagram labels: Streaming Sources; Event Stream Processing; Entity Resolution; Data Noise Reduction; Anomaly Detection; Prediction (+76, +152, +304); ETL/ELT; Data Lake/Lakehouse; Better Analytics]

Boost the Signal
The Best Way to Boost Signal Is to Filter Out Noise
- Reduce data volume
- Increase data value
High-Volume Data becomes High-Value Data
- Feature Extraction
- Complex Event Processing
- Recommendations
- Algorithm Implementation
- Entity Resolution
- Anomaly Detection

Graph Data Model Can Represent Anything
Subject, Predicate, Object = Node, Edge, Node
[Diagram: nodes "The Legend", "Zelda", "Link", "Hyrule", "Ganon"; edges "of", "isn't named after", "defeats", "seals away", "explores"]

But Graph Databases Are Slow
STOP
How do I get analytics at the speed of event processing with the deep analytics of graph?

thatDot Streaming Graph and Quine Open Source

Event Stream Processing
- No time windows
- Out-of-order data processing
- Parallel asynchronous processing
- Back-pressured stream processing

Graph Data Representation
- Analyze categorical data
- Link heterogeneous data
- Query past, present, and future
- Resolve dupes, find anomalies, etc.

All thatDot products are powered by Quine open source software.
- Quine: Detect known patterns. Filter out low value data. Detect unknown patterns.
- Streaming Graph: At cluster scale with commercial support and features. thatDot Streaming Graph provides commercial support and licensing for distributed use cases.
- Novelty: A self-learning graph AI anomaly detecting application.

How It Works: Quine and thatDot Streaming Graph

Thank You
We're hiring! Star us on GitHub.
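
The deck's core idea, "only valuable data goes in," can be illustrated with a minimal Python sketch. It shows a refinement step that sits between streaming sources and the database, dropping duplicates reported by multiple sources and sensor readings that barely changed. This is not Quine or thatDot code: the `refine` function, the event fields (`id`, `device`, `value`, `source`), and the `min_change` threshold are all hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical event shape: {"id": ..., "device": ..., "value": ..., "source": ...}
Event = Dict[str, object]


def refine(events: Iterable[Event], min_change: float = 0.5) -> Iterator[Event]:
    """Yield only the events worth storing (illustrative noise filter).

    Drops (1) events whose id was already forwarded, e.g. the same event
    arriving from a second source, and (2) readings that differ from the
    last kept reading for the same device by less than min_change.
    """
    seen_ids = set()   # ids of events already forwarded
    last_kept = {}     # device -> last forwarded reading

    for event in events:
        event_id = event["id"]
        if event_id in seen_ids:
            continue  # duplicate from another source
        seen_ids.add(event_id)

        device = event["device"]
        value = float(event["value"])
        previous = last_kept.get(device)
        if previous is not None and abs(value - previous) < min_change:
            continue  # low-value reading: nothing meaningful changed

        last_kept[device] = value
        yield event


# Made-up sample data, for demonstration only.
raw = [
    {"id": "a1", "device": "pump-1", "value": 20.0, "source": "gateway-A"},
    {"id": "a1", "device": "pump-1", "value": 20.0, "source": "gateway-B"},
    {"id": "a2", "device": "pump-1", "value": 20.1, "source": "gateway-A"},
    {"id": "a3", "device": "pump-1", "value": 35.0, "source": "gateway-A"},
    {"id": "b1", "device": "pump-2", "value": 7.0, "source": "gateway-B"},
]

kept = list(refine(raw))
print(f"kept {len(kept)} of {len(raw)} events")  # kept 3 of 5 events
```

The sketch keeps every seen id in memory, which would not hold up on an unbounded stream; a real pipeline would need some form of bounded state or expiry. It is meant only to show the shape of the filtering step, not how any particular product implements it.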