Jet Streaming Data and Predictive Analytics: How the Lakehouse and Apache Spark Enable Collins Aerospace to Keep Aircraft Flying
Databricks 2023
Sanket Amin, Senior Manager, CAS Data Science and Analytics
2023 Collins Aerospace | This document contains no export controlled technical data.

My Flight to a Bucket List Destination Was Delayed... and delayed, and delayed
5 Times in a Single Day

Why Did My Flight Get Delayed?
- Air Carrier Delay
- National Aviation System
- Extreme Weather
- Security Issue
Cost to Large US Airline Operator: $8M/Month

Who Is Collins Aerospace?
By the numbers.
Look Left, Look Right, Look Up, Look Down: Collins Aerospace Is There

CAS Is Leading the Digital Transformation in Aerospace
End to End Digital Technology Landscape
- Monitor fleet health and issues requiring attention
- Drill into issues by looking into specific parameters
- View Mx (maintenance) recommendations
- Set thresholds on specific parameters via custom alerts

Fleet Overview
- Fleet health
- Fault Codes, PHM Alerts
- Issues requiring attention
- Drill down to aircraft issue details

Data Visualization & Analysis
- Aircraft full flight & snapshot data plotting
- Maintenance & fault timelines
- Custom alerts

2023 Collins Aerospace. | Collins Aerospace Proprietary. | This document does not include any export controlled technical data.

Maintenance Recommendations
- Aircraft issue documentation
- Prognostic notifications
- Issue life cycle tracking
- Direct communication with Ascentia team

Fleet Health Score & Analytics
- Develop your own analytics
- Filter by flight phase and apply statistical aggregations
- Monitor your fleet

How We Learned To Develop and Deploy Predictive Analytics

The Early Days (Circa 2017-2018)
Trying to fly with what we had.
- Compute power limited to individual corporate-issued laptops.
- 30 Mins to Process 1 Month of Data From Two Aircraft

We Needed Scalability for Big Data and Rapid Prototyping
Ingestion, Data Preparation, Analysis, Presentation

Big Data Is In The Air
- Terabytes
- 50 GBs/day
- Sensors, Weather, Positional, Fault Codes, Maintenance Logs
- Fault Prediction, What If Analyses, Operational Monitoring
- Data From Direct Source, Quality Varies

Our Wright Brothers Moment (Dec 2019)
When We Began to Rapidly Prototype Analytics
- The first organization within Collins and RTX to use Databricks
- Notebooks and Python naturally enabled Data Scientists to analyze data
- Easy adoption of parallel computing on big datasets due to Apache Spark integration
- Aircraft data stored as Delta tables in the Data Lake introduced additional efficiency

We Saved Cost While Going Faster
- A home-grown scouting tool for finding potential analytics for Ascentia
- 168,000 Features Produced Per Flight

We Saved Cost While Going Faster
- Leveraged Notebook widgets with Azure Data Factory for dynamic metadata-driven orchestration
- Implemented more efficient Spark processing techniques
- Optimization of Delta tables: Optimize THEN Partition
A Databricks Consulting Partner Company

The Jumbo Jet Era (2022-Current)
When We Began to Rapidly Deploy Analytics
[Architecture diagram. Labels: Raw, Refined, Curated, SQL DB, API, Data Science, Azure Data Factory, Data Ingestion, Cosmos, Airline Data Sources, Databricks, ITC Tech Check, Ascentia, Analytic Results Generator]

The Jumbo Jet Era: Where We Departed vs Where We Are
- 2 Aircraft, 7 GBs of Data, 0 Analytics
- 80 Aircraft, 7.5 TBs of Data, 12 Analytics
- 768 Aircraft, 300 TBs of Data, 55 Analytics
30% decrease in potential delays
20% decrease in unplanned maintenance

Where We Are Headed
Continue adoption of platform solutions to reach new heights
- Multi-cloud Interoperability
- Data Sharing Across the Organization
- Increase opportunity for data science and analytics
- Promote Growth of Data Citizens

Where We Are Headed
Implement MLOps (https://ml-ops.org/content/mlops-principles)
- Everything is Manual!
- Automate the Model Development and Deployment
- Automate the Monitoring and Updating with Reinforcement Learning

Where We Are Headed
Continue adoption of platform solutions to reach new heights
- 3rd Party Connectivity
- Partner Connect
Leveraging the Databricks Python SQL Connector, DBSQL, and Serverless SQL Cluster enables:
- Dashboards (Tableau, Power BI)
- Engineering analysis tools (MATLAB)
- Delivery of production data apps (Plotly Dash)

Transforming Data into Value
Thank you
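
Sketch: querying a SQL warehouse from Python
The connectivity slide names the Databricks Python SQL Connector as the route from DBSQL to dashboards and data apps. The following is a minimal sketch of that pattern, not the team's actual code. It assumes the `databricks-sql-connector` package is installed; the environment variable names, the table `ascentia.fleet_health`, and its columns are hypothetical and do not come from the presentation.

```python
import os


def connection_settings(env=os.environ):
    """Collect SQL warehouse settings from environment variables.

    The variable names are assumptions made for this sketch.
    """
    return {
        "server_hostname": env["DATABRICKS_SERVER_HOSTNAME"],
        "http_path": env["DATABRICKS_HTTP_PATH"],
        "access_token": env["DATABRICKS_TOKEN"],
    }


def fetch_rows(connection, query):
    """Run a query on a DB-API style connection and return rows as plain tuples."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        return [tuple(row) for row in cursor.fetchall()]
    finally:
        # Always release the cursor, even if the query fails.
        cursor.close()


def main():
    # Imported here so the helpers above can be used without the package installed.
    from databricks import sql

    connection = sql.connect(**connection_settings())
    try:
        # Hypothetical table and columns, for illustration only.
        rows = fetch_rows(
            connection,
            "SELECT tail_number, health_score FROM ascentia.fleet_health LIMIT 10",
        )
    finally:
        connection.close()
    for tail_number, health_score in rows:
        print(tail_number, health_score)
```

Calling `main()` requires a reachable SQL warehouse and valid credentials. A dashboard or Plotly Dash app would typically call something like `fetch_rows` from its data-loading layer rather than printing the result.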