October 21-24, 2024
Mandalay Bay Convention Center, Las Vegas, Nevada

Session 3067
High Performance Java with Open Liberty and Semeru Runtimes

Sompo Japan
Vijay Sundaresan, IBM, Performance Architect, IBM Hybrid Cloud

Notices and disclaimers

© 2023 International Business Machines Corporation. All rights reserved.

This document is distributed "as is" without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity.

Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM.

Not all offerings are available in every country in which IBM operates.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

IBM and the IBM logo are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information".

Comments made in this presentation may be characterized as forward looking under the Private Securities Litigation Reform Act of 1995. Forward-looking statements are based on the company's current assumptions regarding future business and financial performance. Those statements by their nature address matters that are uncertain to different degrees and involve a number of factors that could cause actual results to differ materially. Additional information concerning these factors is contained in the Company's filings with the SEC. Copies are available from the SEC, from the IBM website, or from IBM Investor Relations. Any forward-looking statement made during this presentation speaks only as of the date on which it is made. The company assumes no obligation to update or revise any forward-looking statements except as required by law; these charts and the associated remarks and comments are integrally related and are intended to be presented and understood together.

IBM TechXchange | © 2024 IBM Corporation

Semeru JDK performance

[Figure: throughput (increasing) over time (increasing), with Startup, Rampup, and Steady state phases. Caption: Simplified view of a hypothetical application's performance over time]
Original startup and rampup of throughput over time
Performance Report Card: Startup time B; Rampup time C; Throughput B; Memory footprint C

[Figure: throughput over time, with InstantOn]
InstantOn startup means rampup can start almost instantly
Performance Report Card: Startup time B -> A; Rampup time C -> B; Throughput B; Memory footprint C

Liberty + Semeru InstantOn: fast startup using Linux CRIU
- Target Liberty application container deployments
- Start application containers in milliseconds, ideal for serverless
- Leverages Linux CRIU to perform checkpoint/restore
- Make it really easy to consume for a user of Liberty containers

Characteristics            Semeru InstantOn   Semeru JVM   Graal Native
Full Java support          Yes                Yes          No
Instant on                 Yes                No           Yes
High throughput            Yes                Yes          No
Low memory (under load)    Yes                Yes          Yes?
Dev-prod parity            Yes                Yes          No

[Diagram: Dev, Build (checkpoint), then three Prod instances (restore)]

InstantOn: Updates since GA in June 2023
- InstantOn functionality available for Spring Boot applications
  o Released as part of Liberty 24.0.0.6 GA
- InstantOn support on OpenShift Container Platform (OCP)
  o Support added in Liberty 23.0.0.11 since OCP 4.14 was released
- InstantOn support on (Linux) Power and IBM Z platforms
  o Support added in Liberty 24.0.0.1
  o Needs Liberty images based on Semeru Java 21 UBI9 images for Power and IBM Z

Semeru Runtimes InstantOn Goals (in progress)
- Complete JDK CRaC support to allow non-Liberty applications to be tried
- More flexible restore: one example is to allow Java Debug on restore
- Improvements to Liberty InstantOn first response time
  o 15-30% faster vs InstantOn GA last year

Semeru Runtimes Startup Time Goals
- Goal: 10% startup time improvement for applications that haven't been changed to use the Semeru shared classes cache
- Results so far: JBoss EAP startup time stats shown
- In progress: Evaluation on other non-Liberty apps

Technical details
- A new option, -XX:[+|-]ShareOrphans, is added to enable class sharing from all class loaders, irrespective of whether the class loader implements the shared classes cache API
- We plan to automatically enable this option in the next (4Q) Semeru release if -Xshareclasses is specified

JBoss (AcmeAir app) start-up statistics with Semeru improvements: -11%

                                    Semeru Java 17 baseline   Semeru Java 17 with -XX:+ShareOrphans
# of classes in shared cache        3340                      18875
# of AOT methods in shared cache    3250                      5639

[Figure: throughput over time]
InstantOn improved startup and rampup, but can we ramp up even faster?
Performance Report Card: Startup time A; Rampup time B; Throughput B; Memory footprint C

[Figure: throughput over time, with Cloud Compiler]
Cloud Compiler generates compiled code faster, meaning we ramp up even faster
Performance Report Card: Startup time A; Rampup time B -> A; Throughput B; Memory footprint C

Semeru Cloud Compiler (aka JIT Server)
- Decouple the JIT compiler from the JVM and let it run as an independent process
- Offload JIT compilation to remote process
- Treat JIT compilation as a cloud service
- Auto-managed by orchestrator
- A mono-to-micro solution
- Local JIT still available

Semeru Runtimes Rampup Time Goals
- Goal: InstantHot = "instant" Liberty rampup
- Results so far: Liberty (AcmeAir app) rampup shown
- In progress: Liberty (AcmeAir app) rampup in 10-20 secs

Technical details
- 20-30% JIT compilation time reduction
- Use AOT code during rampup from Cloud Compiler
  - JVM client needs the -XX:+JITServerUseAOTCache option
- Increase the number of methods eligible for AOT
- Reduce AOT load failures

Results
- 36% better Liberty throughput for first 90 seconds (rampup) using Semeru changes this year
- 24% better Liberty throughput for first 30 seconds (rampup) using Semeru JITServer changes this year
- Chart legend: 0.41.0 is the baseline Semeru release at year start; dev_0725 is current Semeru

[Figure: throughput over time]
InstantOn and Cloud Compiler have improved startup and ramp up, but can we get better throughput?
Performance Report Card: Startup time A; Rampup time A; Throughput B; Memory footprint C
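The Semeru options named in the technical details above (-Xshareclasses, -XX:+ShareOrphans, -XX:+JITServerUseAOTCache) are passed on the JVM command line. As a minimal sketch, assuming an application wants to report at startup which of them it was launched with, the standard java.lang.management API can list the JVM input arguments. The class name JvmOptionCheck and its helper are hypothetical; only the option strings come from the slides. The sketch checks that a flag was passed, not that the JVM acted on it.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports which of the options named in the slides
// were present on the command line of the running JVM.
final class JvmOptionCheck {

    // Option strings taken from the slides. -Xshareclasses may carry
    // sub-options after a colon, so every option is matched by prefix.
    static final String[] OPTIONS = {
        "-Xshareclasses",
        "-XX:+ShareOrphans",
        "-XX:+JITServerUseAOTCache"
    };

    // Returns, for each option, whether any JVM argument starts with it.
    static Map<String, Boolean> check(List<String> jvmArgs) {
        Map<String, Boolean> found = new LinkedHashMap<>();
        for (String option : OPTIONS) {
            boolean present = false;
            for (String arg : jvmArgs) {
                if (arg.startsWith(option)) {
                    present = true;
                    break;
                }
            }
            found.put(option, present);
        }
        return found;
    }

    public static void main(String[] args) {
        // Standard API: arguments passed to the JVM itself (not to main).
        List<String> jvmArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();
        check(jvmArgs).forEach((option, present) ->
            System.out.println(option + " : " + (present ? "set" : "not set")));
    }
}
```

Run on a Semeru release that supports these options, the output shows one line per option. On other JVMs the options themselves may be rejected at launch, so the check is only meaningful where the flags are accepted.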
[Figure: throughput over time, with more compiler optimizations]
More compiler optimizations lead to higher throughput
Performance Report Card: Startup time A; Rampup time A; Throughput B -> A; Memory footprint C

Semeru Runtimes Throughput Goals
- Goal: 10-15% improvement on multiple apps
- Results so far: Apache Spark (TPC-H) and DaCapo shown
- In progress: 5-10% improvement for Liberty apps

Technical details
- Platform tuning on latest hardware: Java intrinsics, array copying, object allocation
- General performance enhancements
  - java.lang.invoke.* classes
  - Final static field folding
  - Change tradeoffs with other performance metrics
- Analyze and tune apps
  - Liberty, Elasticsearch, Spring, PrestoDB, Spark, DaCapo

Results
- 36% better Spark TPC-H throughput (mean) using Semeru x86 changes this year
- 8% better DaCapo throughput (geo mean) using Semeru x86 changes this year

[Figure: throughput over time]
Startup, rampup, throughput have improved, but what about memory footprint?
Performance Report Card: Startup time B -> A; Rampup time C -> A; Throughput B -> A; Memory footprint C

[Figure: memory footprint (increasing) over time (increasing)]
Original memory footprint shows highly spiky behavior over time
[Figure: memory footprint over time, with Cloud Compiler]
Cloud Compiler removes memory spikes due to compilations from JVM clients
Performance Report Card: Startup time A; Rampup time A; Throughput A; Memory footprint C -> B

Semeru Cloud Compiler (aka JIT Server)
- Cloud Compiler removes JIT-induced CPU and memory spikes from JVM clients

[Figure: memory footprint over time]
Cloud Compiler has improved memory footprint, but can we get even better?
Performance Report Card: Startup time A; Rampup time A; Throughput A; Memory footprint B

[Figure: memory footprint over time, with more memory management optimizations]
More memory management optimizations reduce memory footprint consistently during run
Performance Report Card: Startup time A; Rampup time A; Throughput A; Memory footprint B -> A

Semeru Runtimes Memory Footprint Goals
- Goal: 15-20% memory footprint improvement
- Results so far: Liberty (AcmeAir app) footprint shown
- In progress: Prototypes for another 10% memory footprint reduction in Liberty (AcmeAir app)

Technical details
- Use "memory disclaim" scheme for different memory areas
  - JIT persistent memory
  - (Coming soon) JIT code memory
  - (Coming soon) Class memory

Results
- 8.5% lower Liberty memory footprint under load using Semeru changes this year
- Lower memory footprint achieved with no loss of throughput/startup performance

Startup, rampup, throughput, memory footprint have all improved from where we started
Performance Report Card: Startup time B -> A; Rampup time C -> A; Throughput B -> A; Memory footprint C -> A
Liberty application server performance

EE9 Performance (Daytrader9)
- Liberty outperforms others on all metrics for Jakarta EE9 performance.
- Startup time, memory footprint and throughput are significantly better than competitors.
- Comparisons used each application server's Docker image
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 4 cpus, 4 GB RAM. JDK 17 version distributed with the docker images used for each server instance unless otherwise documented.
Note: * denotes the app server was running the non-Jakarta or EE8 version of the Daytrader benchmark

Daytrader9 Throughput, percent of Open Liberty 24.0.0.3 (higher is better)
- Open Liberty 24.0.0.3 Full: 100%
- WebSphere Liberty 24.0.0.3 Full: 101%
- Open Liberty 23.0.0.1 Full: 98%
- Open Liberty 24.0.0.9 InstantOn: 100%
- WildFly 31.0.0 Final: 55%
- JBoss 7.4.12*: 27%
- Payara 6.2024.2: 42%

Daytrader9 Memory footprint (First response), percent of Open Liberty 24.0.0.3 (lower is better)
- Open Liberty 24.0.0.3: 100%
- WebSphere Liberty 24.0.0.3: 101%
- Open Liberty 23.0.0.1: 107%
- Open Liberty 24.0.0.9 InstantOn: 100%
- WildFly Full 31.0.0.Final: 361%
- JBoss EAP 7.4.12*: 397%
- Payara Server-full 6.2024.2: 513%

Daytrader9 First response, percent of Open Liberty 24.0.0.3 (lower is better)
- Open Liberty 24.0.0.3: 100%
- WebSphere Liberty 24.0.0.3: 100%
- Open Liberty 23.0.0.1: 101%
- Open Liberty 24.0.0.9 InstantOn: 7%
- WildFly Full 31.0.0.Final: 183%
- JBoss EAP 7.4.12*: 176%
- Payara Server-full 6.2024.2: 508%

EE7 Performance (Trade7-jakarta)
- Liberty outperforms others on all metrics for EE7 performance (startup time and memory footprint are over 50% better)
- Trade7 was transformed to Jakarta naming.
- Comparisons used each application server's Docker image
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 4 cpus, 4 GB RAM. JDK 17 version distributed with the docker images used for each server instance.
Note: * denotes the app server was running the non-Jakarta version of the benchmark

Trade7-jakarta Throughput, percent of Open Liberty 24.0.0.2 (higher is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 100%
- Open Liberty 24.0.0.2 InstantOn: 101%
- WebSphere Traditional 9.0.5.18*: 101%
- WildFly Full 31.0.0.Final: 82%
- JBoss EAP 7.4.12: 61%
- Tomcat 10.1.19: 93%
- Payara Server-full 6.2024.2: 28%

Trade7-jakarta First response, percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 100%
- Open Liberty 24.0.0.2 InstantOn: 13%
- WildFly Full 31.0.0.Final: 248%
- JBoss EAP 7.4.12: 285%
- Tomcat 10.1.19: 252%
- Payara Server-full 6.2024.2: 721%

Trade7-jakarta Memory footprint (First response), percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 108%
- Open Liberty 23.0.0.3: 108%
- Open Liberty 24.0.0.2 InstantOn: 101%
- WildFly Full 31.0.0.Final: 426%
- JBoss EAP 7.4.12: 613%
- Tomcat 10.1.19: 161%
- Payara Server-full 6.2024.2: 701%

EE7 Performance (Acmeair Monolithic)
- Liberty outperforms others on all metrics for EE7 performance (startup time, throughput and memory footprint are much better)
- Acmeair Monolithic was transformed to Jakarta naming.
- Comparisons used each application server's Docker image
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 4 cpus, 4 GB RAM. JDK 17 version distributed with the docker images used for each server instance.
Note: * denotes the app server was running the non-Jakarta version of the benchmark

Acmeair Monolithic Throughput, percent of Open Liberty 24.0.0.2 (higher is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 99%
- Open Liberty 24.0.0.2 InstantOn: 100%
- WebSphere Traditional 9.0.5.18*: 80%
- WildFly Full 31.0.0.Final: 81%
- JBoss EAP 7.4.12: 85%
- TomEE Webprofile 9.1.2: 84%
- Payara Server-full 6.2024.2: 52%

Acmeair Monolithic First response, percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 104%
- Open Liberty 24.0.0.2 InstantOn: 12%
- WildFly 27.0.1.Final: 275%
- JBoss EAP 7.4.12*: 296%
- TomEE Webprofile 9.1.2: 151%
- Payara 6.2024.2: 606%

Acmeair Monolithic Memory footprint (First response), percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 101%
- Open Liberty 23.0.0.3: 105%
- Open Liberty 24.0.0.2 InstantOn: 90%
- WildFly 27.0.1.Final: 311%
- JBoss EAP 7.4.12*: 303%
- TomEE Webprofile 9.1.2: 163%
- Payara 6.2024.2: 461%

MicroProfile Performance (Acmeair Microservices)
- Liberty provides the most balanced performance across all MicroProfile implementations
- Comparisons used each application server's MicroProfile (latest spec version supported) Docker image
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 2 cpus, 1 GB RAM. JDK 17 version distributed with the docker images used for each server instance.

AcmeairMS (1 instance for each service) Throughput, percent of Open Liberty 24.0.0.2 (higher is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 99%
- Open Liberty 23.0.0.3: 97%
- Open Liberty 24.0.0.2 InstantOn: 100%
- WildFly Full 31.0.1.Final: 37%
- TomEE MicroProfile 9.1.2: 41%
- Helidon MP 4.0.6: 23%
- Payara Micro 6.2024.3: 14%

AcmeairMS AuthService First response, percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 105%
- Open Liberty 24.0.0.2 InstantOn: 17%
- WildFly Full 31.0.1.Final: 186%
- TomEE MicroProfile 9.1.2: 111%
- Helidon MP 4.0.6: 63%
- Payara Micro 6.2024.3: 300%

AcmeairMS AuthService Memory footprint (First response), percent of Open Liberty 24.0.0.2 (lower is better)
- Open Liberty 24.0.0.2: 100%
- WebSphere Liberty 24.0.0.2: 100%
- Open Liberty 23.0.0.3: 103%
- Open Liberty 24.0.0.2 InstantOn: 96%
- WildFly Full 31.0.1.Final: 168%
- TomEE MicroProfile 9.1.2: 96%
- Helidon MP 4.0.6: 75%
- Payara Micro 6.2024.3: 162%

Liberty scales better than other frameworks (raw throughput)
[Figure: Scaling Trade7-jakarta, raw throughput. Req/sec (higher is better) against number of clients. Series: Open Liberty 24.0.0.3 Full, Tomcat 10.1.19, JBoss 7.4.12*, WildFly 31.0.0 Final, Payara 6.2024.2]
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 4 cpus, 4 GB RAM. JDK 17 version distributed with the docker images used for each server instance.
Note: * denotes the app server was running the non-Jakarta or EE8 version of the benchmark

Liberty scales better than other frameworks (throughput:memory ratio)
[Figure: Scaling Trade7-jakarta, Throughput/Memory. Memory ratio (higher is better) against number of clients. Series: Open Liberty 24.0.0.3 Full, Tomcat 10.1.19, JBoss 7.4.12*, WildFly 31.0.0 Final, Payara 6.2024.2]
System Configuration:
- SUT: LinTel RHEL 8.7, Intel(R) Xeon(R) Gold 6338 2.00GHz, 4 cpus, 4 GB RAM. JDK 17 version distributed with the docker images used for each server instance.
Note: * denotes the app server was running the non-Jakarta or EE8 version of the benchmark
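The bar charts above report each result as a percentage of an Open Liberty baseline instead of as raw numbers. A minimal sketch of that normalization follows, assuming raw per-server measurements are available. The class name, method names, and the raw values in main are made up for illustration and are not measurements from this session; only the percent-of-baseline idea comes from the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that expresses measurements relative to a baseline.
final class PercentOfBaseline {

    // Returns value/baseline as a whole percent, the same granularity
    // as the chart labels.
    static long percentOf(double value, double baseline) {
        if (baseline <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return Math.round(100.0 * value / baseline);
    }

    // Normalizes every entry against the entry named baselineKey,
    // preserving the input order.
    static Map<String, Long> normalize(Map<String, Double> raw, String baselineKey) {
        Double baseline = raw.get(baselineKey);
        if (baseline == null) {
            throw new IllegalArgumentException("no baseline entry: " + baselineKey);
        }
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            result.put(e.getKey(), percentOf(e.getValue(), baseline));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical first-response times in seconds (illustrative only).
        Map<String, Double> firstResponse = new LinkedHashMap<>();
        firstResponse.put("baseline server", 4.0);
        firstResponse.put("server A", 1.0);
        firstResponse.put("server B", 8.0);
        // For a lower-is-better metric, a value under 100% beats the baseline.
        normalize(firstResponse, "baseline server")
            .forEach((name, pct) -> System.out.println(name + ": " + pct + "%"));
    }
}
```

The direction of "better" is not encoded in the percentage itself, which is why every chart title states whether higher or lower is better.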
Sompo Japan: Liberty Customer Testimonial

The reason SOMPO chose Liberty: improvement of app development productivity
- Reduce deployment time
  - A lightweight runtime limited to necessary features
  - Minimizing testing downtime due to JVM restarts that occur throughout development
- Compatibility with CI/CD
  - Highly compatible with various CI/CD tools: Subversion, Maven, Jenkins, UCD (UrbanCode Deploy), etc.
  - We have implemented CI/CD using these tools in our company as well

The reason SOMPO chose Liberty: cost reduction
- Reduction of server and mainframe resources
  - The server and mainframe require minimal resources for operation. Even though the reduction per unit is small, it all adds up in a large system
- Reducing memory and CPU resources is not feasible for all servers and hosts. While it is possible to make the JVM itself lightweight by using only the necessary features, this results in only a small reduction in resources. However, applications with high processing demands can benefit significantly, contributing to overall resource reduction. Among such cases, the following servers have shown actual savings in running costs.

Modernization Benefits: Refactoring to Java & Liberty

Example: Core Usage Reduction (Application 1)
- Previously: 56 cores across 8 servers
- Now: 24 cores
- 57% core reduction
- Annual savings: 15,012 USD (2,144,634 JPY)

Example: Memory Usage Reduction (Application 2)
- Previously: 96 GB across 6 servers
- Now: 64 GB
- 33% memory reduction
- Annual savings: 11,364 USD (1,623,516 JPY)

Calculated with exchange rate JPY to USD (1 yen = 0.0070 USD)

Benefits vary by application
- Some require similar resources post-modernization, while others, as shown, can achieve significant savings.
- Optimization may require analysis and tuning. For example, memory savings might not be realized until settings, like Java heap size, are adjusted to match the reduced usage.

Java SE language enhancements

Virtual Threads in Java 21: motivation
- Java unit of concurrency is the java.lang.Thread
- Historically, Java threads map 1-1 to OS threads
- OS threads are relatively expensive to create and manage
- Number of OS threads is limited by OS constraints and memory
- Scalability of thread-per-request model hits OS thread limits
- Async models created to improve concurrency add complexity
- Virtual thread: object in Java heap, create instances that map 1-1 to tasks
- Lightweight: no backing OS thread for a virtual thread
- Virtual threads are mounted on OS threads (aka platform threads) to run, and unmounted when they are completed or blocked

Evaluation of Virtual Threads for use in Liberty
- All comparisons are against the Liberty thread pool, which is self-tuning
- Bottom line: we decided against using virtual threads in Liberty

Throughput
- Virtual threads are slightly (10-15%) slower due to inherent overheads like mount/unmount, thread locals, etc.
- Virtual threads scheduling can cause worse CPU utilization (up to 40% lower)
  - OS scheduler (CFS) behavior on Linux kernel levels [...] 5k) of threads are needed quickly

Memory Footprint
- Virtual threads could take more or less overall physical memory
- Hard to predict because of GC interactions

Practical lessons from tuning Virtual Threads
- Considerations before deciding if virtual threads are right for your application
  o Baseline: may have a smart thread pool to begin with
  o Application: tasks must have sufficient "off-CPU" work (e.g., waiting on database)
  o Deployment: enough load and CPUs allocated to each app instance such that threads are the bottleneck
  o Locking: Java synchronization on call stacks pins the virtual thread to its carrier thread. Use java.util.concurrent (j.u.c.) locking instead
  o Dependencies: all relevant components (external and open source) need to be modified to support virtual threads
- Other parts of architecture (e.g., database) need to be able to handle high concurrency

Conclusion
- Think/experiment about your use case, to assess if virtual threads would help
- If you would like help discussing your use case, please reach out to us

Interested in learning more? Join our Feedback program!
Co-create novel solutions that leverage IBM technologies to solve business problems and speed innovation. Let's define success together.
Sign up here: https://ibm.biz/ibm-feedbackprogram
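The locking lesson from the virtual threads slides (Java synchronization can pin a virtual thread to its carrier thread, so prefer java.util.concurrent locks) can be sketched as follows. This is a minimal illustration and not Liberty code: the class and method names are hypothetical, and main uses a fixed pool of platform threads so that the sketch compiles on JDKs older than 21. On Java 21 the pool could be swapped for Executors.newVirtualThreadPerTaskExecutor(), which is the setting where the choice of lock matters.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by a java.util.concurrent lock instead of
// a synchronized block.
final class JucLockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long count;

    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    // Submits `tasks` tasks that each increment a shared counter `perTask`
    // times, waits for them to finish, and returns the final count.
    static long run(ExecutorService executor, int tasks, int perTask) {
        JucLockedCounter counter = new JucLockedCounter();
        for (int t = 0; t < tasks; t++) {
            executor.submit(() -> {
                for (int i = 0; i < perTask; i++) {
                    counter.increment();
                }
            });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // On Java 21+, Executors.newVirtualThreadPerTaskExecutor() can be
        // substituted here to run one virtual thread per task.
        ExecutorService executor = Executors.newFixedThreadPool(4);
        System.out.println(run(executor, 100, 1_000));
    }
}
```

Passing the executor in as a parameter keeps the locking code identical for platform and virtual threads, which makes it straightforward to measure both against an existing thread pool before deciding, as the slides recommend.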