Maximum Availability Architecture Internals for Exadata On-Premises and Cloud

Alex Blyth, Senior Principal Product Manager, Exadata and Exascale, Oracle
Natarajan Shankar, Principal Product Manager, Exadata and Exascale, Oracle
Jony Safi, Director, Maximum Availability Architecture, Oracle

Thursday September 12, 10:15 AM
Oracle CloudWorld
Copyright 2024, Oracle and/or its affiliates

IDC: The State of Ransomware and Disaster Preparedness: 2022
  94% of respondents reported at least one outage
  $250K+ is what a single hour of downtime costs on average
  60% experienced unrecoverable data in the last 12 months

Oracle Maximum Availability Architecture
  Reference architectures: Bronze, Silver, Gold, Platinum
  Deployment choices: Generic Systems, Engineered Systems, BaseDB, ExaDB, Autonomous DB, Multicloud
  HA features, configuration and operational practices
  Customer insights and expert recommendations
  [Diagram: production site and replicated site linked by replication, 24/7; Zero Downtime Migration (ZDM). Feature labels: Continuous availability, Application Continuity, Edition-based Redefinition, Online Redefinition, Active replication, Data protection, Flashback, ZDLRA + ZRCV, RMAN, High performance, True Cache, Resource Management, Database In-Memory, Scale out & Lifecycle, RAC, Globally Distributed Database, FPP, Real Application Testing, Active Data Guard, GoldenGate, Full Stack DR]

Exadata: The MAA Platform of Choice
Evolution: We Continue to Protect your Service Level from the Most Difficult HA Problems

X10M Storage Server
  Low I/O latency preservation during unplanned and planned outages
  Tightly integrated hardware & software with auto repair of sick storage
  New Exadata X10 Extreme Flash storage server with both performance-optimized and capacity-optimized flash
  As low as 17 microseconds to retrieve a database I/O from the storage server XRMEM Cache

X10M Database Server
  Zero impact major Linux upgrades, e.g. OL8 in Exadata release 23.1
  Zero impact security software upgrades including STIG compliance
  MS (Management Server) alerting of key Database and Grid Infrastructure software incidents

Human Error Prevention!
  MAA Best Practice Full Stack Compliance Checks with Exachk

Exadata: The MAA Platform of Choice
Metrics and Insights Made Easy
  How do I really know what is going on inside of Exadata?
  Performance data, including Exadata metrics, has been around since Exadata's inception, but it was sometimes difficult to consume and understand
  Enter Real Time Insight in
  Exadata release 22.1. Simply zoom into one of the dashboards to observe performance trends or shine a bright light on performance anomalies
  https:/

Exadata Maximum Availability Architecture
  Redundant Hardware - Servers, Disks, Flash, Network, Power
  Redundant Software - Active clusters, Disk/flash mirroring
  Online patching, reconfiguration, expansion
  Fastest RAC Instance and Node Failure Recovery | Fastest Backup - RMAN Offload to Storage | Recovery Appliance
  Deep ASM Integration | Fastest Data Guard Redo Apply | Complete Failure Testing with Shortest Brownouts
  Within Exadata: Full Fault Tolerance
  Within a Site: Local Data Guard
  Across Sites: Data Guard for DR
  Redo-based change replication with data consistency checking
  Remote Standby for Disaster Recovery
  Recovery Appliance: Immutable Backups
  Zero data loss protection with continual recovery validation
  [Diagram labels: LAN, WAN]

Exadata: Built-in High Availability
  Fast network failure detection
  Redundancy protection on cellsrv shutdown
  Reduced brownout for instance recovery
  ILOM hang detection and repair
  Redundancy protection on cell shutdown
  Automatic ASM mirror read on I/O error or corruption
  I/O error prevention with Exadata disk scrubbing/ASM corruption repair
  Exadata HARD
  Corruption prevention with HARD support
  Elimination of false positive drive failures
  Redundancy check during power down
  Blue OK-to-remove LED light notification
  Drop BBU for replacement
  Appliance mode support
  Cell Alert Summary
  Flash and Disk Life Cycle Management Alerts
  Auto disk management
  Priority rebalance support
  EM failure reporting
  Failure monitoring on database servers
  Updating database nodes with patchmgr
  Optimized and faster Exadata patching
  Custom Diagnostic Package for Cell Alerts
  VLAN support and automation
  Active-Active RoCE Network
  Exadata Smart Write Back
  Exadata Smart Flash Logging
  Fastest Redo Apply and Instance Recovery
  Efficient resilver rebalance after flash failure
  I/O latency capping for reads and writes
  Cell I/O timeout threshold
  Smart Write Back Flash Cache persistence
  I/O and Network Resource Management
  Health factor on predictively failed disks
  Disk confinement
  I/O hang detection and repair
  Cell-to-cell offload for disk repair
  Cell-to-Cell Rebalance preserves Flash Cache
  Exadata Elastic Configuration
  Drop hard disk for replacement
  Automatic LED support for disk removal
  Auto online
  Exachk full stack healthcheck with critical issues alerts
  Automated repair from controller cache failure
  Cell-to-Cell Rebalance Data Accelerator Cache preservation

So what's new?
  Two ground-breaking releases in 2024

High-Availability Improvements in Oracle Database 23ai

Automatic Storage Management Improvements
  Up to 3x Faster Data Redundancy Restoration: restores redundancy from more disk partners
  Up to 3x Faster Startup: accelerates access to data after planned maintenance and unplanned outages
  Up to 6x Faster Shutdown: accelerates planned maintenance

ASM Disk Partnering - Concept
  ASM utilizes disk partnerships to choose disks for placing extents and their mirror copies
  Each disk partners with 8 other disks
  The Primary Extent is then mirrored on one (Normal Redundancy) or two (High Redundancy) partner disks
  Read I/O for rebalance, rebuild, resync, resilver, and disk/flash warmup operations is provided by 8 disks

ASM 23ai Increases Number of Disk Partners
  Each disk partners with up to 24 partner disks
  Read I/O for rebalance, rebuild, resync, resilver, and disk/flash warmup operations is provided by up to 24 disks
  Automatically managed by ASM
  New partnering scheme not applied during
  upgrade
  Partners updated by ADD DISK, ADD FAILGROUP (adding a new storage server), and REBALANCE operations
  Up to 3x faster redundancy restoration

Grid Infrastructure Improvements
  Up to 2.8x Smaller Image Size: enables faster deployment, node additions, and planned maintenance
  Up to 4.7x Faster Installation: 2-node cluster installation in less than 4 minutes
  Up to 2x Faster Startup: accelerates deployment, planned maintenance, and recovery from unplanned outages

Real Application Clusters Improvements
  Up to 2x Faster PDB Open: parallelizes PDB open and leverages Exadata RDMA Data Accelerator
  Up to 10x Faster OLTP Workload Resumption: faster resumption of SQL after database instance outage. Instance recovery completes up to 1.35x faster
  Up to 1.4x Faster Concurrent Sub-second Smart Scan Queries: avoids cross-instance coordination between database instances using ultra-fast RDMA checks

New Capabilities in Exadata System Software 24ai

Exadata System Software 24ai
  AI Smart Scan: AI Vector Search offloaded to storage servers performs high-performance, low-latency scans of massive volumes of data
  Performance: faster analytics with columnar scan in XRMEM, enhanced I/O optimizations with Flash Cache
  Security: secure boot for KVM guests; smart scan during online encryption and for AES-XTS encrypted tablespaces

Exadata System Software 24ai
  Exadata Exascale: new intelligent data and software architecture that delivers the best of Exadata and the best of Cloud
  High Availability: database server OS updates with no downtime; ASM, Grid Infrastructure and RoCE Network optimizations to further enhance resilience
  Observability and Ease-of-Use: analytical visibility into key metrics such as network fabric, XRMEM, Flash Cache, Disk, I/O trends, etc.

Exadata Exascale

Exadata Exascale Architecture
  Exascale decouples storage management from the database servers
  Resources are pooled for greater efficiency and faster provisioning
  Storage management performed in storage servers
  Improves developer productivity
  Significantly increases deployment flexibility
  Distributed Resource Management
  [Diagram labels: Storage Server (x4); Exadata Database Server with Virtual Machine (x3)]
  Exadata Exascale Services: Storage Pools, Vault and Volume Management, redundancy and partnering, caching and tiering, file metadata management, snapshots and clones, data integrity
  Exadata Smart Features: Smart Flash Cache, XRMEM Data Accelerator, Smart Scan, Storage Indexes, Columnar Caching, Bloom filters, etc.
  Oracle Database 23ai
  Oracle Grid Infrastructure
  Cluster management per cluster storage management

Exadata Exascale Delta Store Accelerates Storage Resync
Higher data durability and faster resync
  Exascale accelerates resync of extents by assigning a delta disk to store new writes that should go to an offline extent, in addition to the staleness registry
  When the offline storage is available again, data is copied from the delta disk to the newly online disk
  If the delta disk is unavailable, the 8 MB extent is copied from another mirror using the staleness registry
  Delta Store resync applies fine-grained changes to the newly online disk
  A single 8 KB block update results in 8 KB of copied data
  [Diagram labels: Primary Extent, Secondary Extent, Tertiary Extent, Exascale Staleness Registry (x2), Delta Disk, "file 1, block 10", "file 2, block 250"]

Exascale Extent Staleness Registry
  Accelerates resync of extents after storage server planned maintenance and unplanned downtime
  Automatically created on storage servers when an extent mirror is absent, and tracks extent modifications for replay
  When the extent mirror returns, tracked writes are replayed from an online mirror
  [Diagram labels: Primary Extent, Secondary Extent, Tertiary Extent, Exascale Staleness Registry (x2)]

Planned Maintenance

Exadata Live Update
Increase security and minimize database server and VM reboots
  Exadata System Software provides operating system, firmware, and Exadata software updates that are crucial for the optimal and secure operation of Exadata Database Servers and Oracle Database
  Updates are applied in a rolling fashion across database servers
  Exadata Live Update applies updates online and defers any remaining work to occur at a scheduled time
  Exadata Live Update uses familiar Linux technologies, including RPM and ksplice, to apply updates online to database servers/VMs, avoiding the need to reboot
  [Diagram label: Exadata Database Server]

Exadata Live Update Options
  Exadata Live Update offers multiple options based on the Common Vulnerability Scoring System (CVSS). When using Exadata Live Update, you choose from the following options:
  highcvss
    Applies only security updates to address vulnerabilities with a CVSS score of 7 or greater
  allcvss
    Applies only security updates to address vulnerabilities with any CVSS score
  full
    Performs a full update, which includes all security-related updates and all other non-security updates. Equivalent to regular updates applied
    with a server/VM reboot

  $ patchmgr --dbnodes kvm_guests.lst --upgrade --repo --rolling --target_version 24.1.0.0.0.240517 --live-update-target highcvss|allcvss|full

Viewing Outstanding Work
  Not all updates can be applied online or activated without a reboot, e.g. firmware, booting with the latest kernel, JDK
  These updates are called outstanding work and are staged for activation at the next graceful shutdown
  Use patchmgr --live-update-list-outstanding-work to show outstanding items

  $ patchmgr --dbnodes kvm_guests.lst --live-update-list-outstanding-work
  * Summary of outstanding work for Exadata Live Update:
  (*) 2024-08-15 00:17:08: Exadata Live Update outstanding work is scheduled for completion at the next reboot
    - The Linux kernel will be updated from version 5.4.17-2136.330.7.5.el8uek to 5.4.17-2136.333.5.1.el8uek. Current Uptrack kernel version: 5.4.17-2136.333.5.1.el8uek.x86_64
    - New package uptrack-updates-5.4.17-2136.333.5.1.el8uek.x86_64 (version 20240725-0) will be installed.

Exadata Live Update
Applying monthly maintenance releases - examples

  Month      Release  Quarterly Update Windows (Recommended)  Bi-Yearly Update Windows
  August     24.1.3   Full Update, server/VM reboot           Full Update, server/VM reboot
  September  24.1.4   Exadata Live Update, no reboot          Exadata Live Update, no reboot
  October    24.1.5   Exadata Live Update, no reboot          Exadata Live Update, no reboot
  November   24.1.6   Full Update, server/VM reboot           Exadata Live Update, no reboot
  December   24.1.7   Exadata Live Update, no reboot          Exadata Live Update, no reboot
  January    24.1.8   Exadata Live Update, no reboot          Exadata Live Update, no reboot
  February   24.1.9   Full Update, server/VM reboot           Full Update, server/VM reboot
  March      24.1.10  Exadata Live Update, no reboot          Exadata Live Update, no reboot

Observability and Ease-of-Use

Exadata RoCE Network Resilience
  Exadata relies on highly available RoCE IPs
  Database and Storage servers automatically fail over if the switch port is "down"
  Unhealthy switches or network may leave ports "up" but traffic stalled
  Network stalls may impact database availability
  The new ExaPortMon process monitors live RoCE network traffic
  Migrates the IP to an operational port if a stall is detected
  Returns the IP to the original port when the upstream issue is resolved
  [Diagram labels: 192.168.1.1, 192.168.1.2, Spine Switch, Leaf Switch (x2), Operator configuration error, ExaPortMon (x5)]

Exadata Cache Observability
  Exadata Cache Stats (ecstat) is a utility that peers into the storage server caches
  Simplifies observability of client I/O
  Aggregates client I/O by media type and reason
  Top I/O reasons for Flash Cache hits
  Top I/O reasons for disk hits
  Automatically included in Exawatcher collection

EXAchk MAA Best Practices advisor
  Holistic report enabling customers to keep their Exadata MAA compliant
  EXAchk specifically developed for Exadata
  Configuration checks for database, storage, and network fabric switches
  MAA Scorecard
  Automatic Correction (when applicable)
  Prereq checks for DB/GI upgrades and application continuity readiness
  Prioritized, continuous improvement based on Development-backed best practices

EXAchk MAA Best Practices advisor
  Automatically scheduled to run weekly
  Should be executed before and after any major configuration change, e.g. patching, storage addition
  Important to keep EXAchk up-to-date
  Includes an auto-update facility
  Review reports regularly and implement findings

Exadata is the best MAA platform of choice
  Maximum Availability Architecture built-in
    Holistic approach, from database connection all the way through to firmware
    Uniquely leverages hardware for faster notification and resolution
  Commitment to ongoing improvement
    Significant improvements in Grid Infrastructure, Automatic Storage Management, and Real Application Clusters
    New Exadata Live Update, ExaPortMon, and Exadata Cache Stats capabilities
    Best practices and checks continuously enhanced in the EXAchk utility
  A foundational principle of Exadata Exascale
    Exascale decouples compute and storage
    Increases developer productivity and agility
    Introduction of new Exascale MAA capabilities, including Staleness Registry and Delta Store, to increase availability and performance
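
The Delta Store and Staleness Registry figures given earlier in the deck (an 8 KB block update copies 8 KB via the delta disk, while the fallback copies the whole 8 MB extent from another mirror) can be illustrated with a small Python sketch. This is a toy model only, not Exascale code: the function name `resync_bytes`, the `(file_id, block_no)` representation of a write, and the mapping of blocks to extents by integer division are all assumptions made for illustration.

```python
"""Toy model of resync cost with and without a delta disk (illustrative only)."""

EXTENT_BYTES = 8 * 1024 * 1024  # 8 MB extent, as stated in the deck
BLOCK_BYTES = 8 * 1024          # 8 KB block, as stated in the deck
BLOCKS_PER_EXTENT = EXTENT_BYTES // BLOCK_BYTES


def resync_bytes(modified_blocks, delta_available=True):
    """Return the bytes copied to a mirror that has come back online.

    modified_blocks: iterable of (file_id, block_no) pairs written while
    the mirror was offline (a hypothetical representation).
    """
    distinct = set(modified_blocks)
    if delta_available:
        # Delta disk path: only the blocks that changed are replayed.
        return len(distinct) * BLOCK_BYTES
    # Fallback path: each stale extent is copied whole from another mirror.
    # Assumption: block b of a file lives in extent b // BLOCKS_PER_EXTENT.
    stale_extents = {(f, b // BLOCKS_PER_EXTENT) for f, b in distinct}
    return len(stale_extents) * EXTENT_BYTES


if __name__ == "__main__":
    writes = [(1, 10), (2, 250), (1, 10)]  # repeated write to the same block
    print("with delta disk:   ", resync_bytes(writes), "bytes")
    print("without delta disk:", resync_bytes(writes, delta_available=False), "bytes")
```

Under these assumptions, two distinct 8 KB updates cost 16 KB of copying when the delta disk is available and 16 MB when the resync has to fall back to whole-extent copies, which is the gap the "fine-grained changes" bullet refers to.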