z/OS Performance Management If You Only Have 20 Minutes a Day
Scott Chapman
Enterprise Performance Strategies, Inc.
Scott.chapmanEPS

Contact, Copyright, and Trademarks

Questions? Send email to performance.questionsEPS, or visit our website at https:/ or http:/.

Copyright Notice: Enterprise Performance Strategies, Inc. All rights reserved. No part of this material may be reproduced, distributed, stored in a retrieval system, transmitted, displayed, published or broadcast in any form or by any means, electronic, mechanical, photocopy, recording, or otherwise, without the prior written permission of Enterprise Performance Strategies. To obtain written permission please contact Enterprise Performance Strategies, Inc. Contact information can be obtained by visiting http:/.

Trademarks: Enterprise Performance Strategies, Inc. presentation materials contain trademarks and registered trademarks of several companies.

The following
are trademarks of Enterprise Performance Strategies, Inc.: Health Check, Reductions, Pivotor

The following are trademarks of the International Business Machines Corporation in the United States and/or other countries: IBM, z/OS, zSeries, WebSphere, CICS, DB2, S390, WebSphere Application Server, and many others.

Other trademarks and registered trademarks may exist in this presentation.

Abstract (why you're here!)

In today's IBM Z Enterprise environment, performance management is not always a full-time responsibility. Even in the world of AI (Artificial Intelligence), boots-on-the-ground performance skills are still needed. This 20-minute presentation will outline how to conduct a z/OS performance management analysis if you only had 20 minutes a day while using Actual Intelligence (AI). This session will be filled with great and useful information.

EPS: We do z/OS Performance
- Pivotor: Reporting and analysis software and services
  - Not just reporting, but analysis-based reporting based on our expertise
- Education and instruction
  - We have taught our z/OS performance workshops all over the world
- Consulting
  - Performance war rooms: concentrated, highly productive group discussions and analysis
- Information
  - We present around the world and participate in online forums
  - https:/

Why 20 minutes?
- TBH: Because that matched this time slot
- The real point is my recommendation that every shop have somebody who takes at least a cursory look at performance every day
- 20 minutes is only 1/24th of an 8 hour work day, so not a large investment
- When I was a customer, most days my daily check-out was much less
  - Of course some days life was more interesting

What is the goal?
- Gain a sense of the normal and expected
  - Train your neural network
- Gain a deeper understanding of the performance/capacity drivers
- Generate ideas for continuous improvements
  - May involve trade-offs that require a holistic view
- Be ready for problems
  - Problems will occur: if you've been using your tools to review performance every day, you'll be prepared to analyze a new problem faster
- Prevent future problems
  - By catching problems when they're small and making continuous improvements

What do you need?
- Some form of performance reporting you can easily access every day
  - Examples here are from our own tool: Pivotor
  - If you have to take 20 minutes running reports, that's not going to work so well!
- Knowledge of the business and the application
  - Need to know what applications are important to the business and how
  - Need to understand where/how those applications run in z/OS
  - Ideally should capture business metrics!
- Curiosity
  - While probably not every day, you should expect to do some
    digging some days
  - Being curious about even minor eye-catchers is a good way to practice and learn

What should you look at?
- System capacity utilization
  - How busy was the system yesterday?
- Application performance
  - Response times and batch completion: did you make all the SLAs?
- Business metrics
  - Whatever metrics drive utilization (and hence performance)
  - E.g. orders received, calls received, widgets produced, ATM transactions, whatever
  - Special events such as storm mode or maybe just a special promotion
- System performance
  - Technical measures that may impact application performance
  - E.g. overall disk response time, CF response times, delay samples, etc.

The chart everyone wants to look at first: how busy was the system yesterday? It won't take long before you'll have an expectation of what the chart should look like on a Wednesday.

While probably unimportant, this is an eye-catcher, so let's be curious: what was SYNF doing?

Hmm, STCLO. I wonder what is in STCLO?

The report class name seems to imply that this was OMVS work. So we apparently have somebody doing OMVS work that is using about a fifth of a CP for about 15 minutes. I wonder who, and whether this is something new they intend to do regularly.

Looks like there were a number of similarly named STCs which got extra intensive in the 9:00 hour. Probably the first part of the name is a userid. Follow up with the owner to find out what they were doing.

You want me to talk to people?
- Yeah, sorry about that: but it turns out people know things
- Be polite, friendly, and open
- The goal is to learn a little tidbit that may be useful in the future
- Yeah, this is an insignificant event: one tiny system used a bit more CPU briefly, but it may presage later changes
  - E.g. you may learn that the application team is testing some new process that's running new Unix scripts to do something
  - Or maybe it was just somebody doing a new product install on a sysprog test LPAR
- May give you an
  opportunity to get involved early in the process before they come looking for help
  - “I noticed that work was going to STCLO, which may be fine, but as you get a little further down the road to deploying this let's talk to make sure it's getting the service it needs to support the business.”

This is just an example of a CICS region doing about 100 tps over the course of the day with most minutes showing an average RT of less than 0.2 seconds. But that large spike in average RT in the 15:00 hour is “interesting”.

Zooming in on the response time distribution we can verify that there were
several transactions finishing in the 5-10 second RT bucket, so this was not just a single long-running transaction that ended and drove up the average. For time reasons, we're not going to try to dig into this.

Do we care about a 2-minute issue?
- Maybe so, maybe no
- Even if this wasn't an SLA violation, if it's unusual, it might be worth looking into so a small problem doesn't become a bigger one
- Also, helps you be prepared when maybe the users finally complain
  - Users don't necessarily report every hiccup until it becomes too annoying or problematic, at which point it becomes an emergency
  - “Every day a little after 3pm everything seems to lock up, and today that happened when trying to help a VIP so we need to get this fixed now!”

What about business metrics?
- Talk to the application folks!
  - The application does the business work, so often somebody can just write a database query to pull relevant daily business metrics
  - Even better would be to get them on an hourly or even per-minute basis!
    - Yes, I always want more data!
- Ideally collect that data and report on it with your performance/capacity measures
  - If you're a Pivotor customer: yes, we can set up custom reporting for you for that
- Can come in really handy for understanding if a change
  in performance or utilization is a result of doing more business work or a technical change
- Just counting CICS transactions is not necessarily enough: the application may change business flows that cause a change in transaction counts without a change in business volume

What about system performance?
- System performance metrics are interesting and important, but you can have bad performance metrics and still be satisfying the business needs
  - Hence the reason I don't start at system performance
  - Note system capacity is a different story!
- Don't necessarily need to look at things like CF or disk RTs daily
  - Probably fairly stable, and when they vary the first question will be what technical and business volumes changed: e.g. did we do more transactions yesterday?
- Do look in on them occasionally though
  - Fix easy things as they come to light, to help avoid problems in the future
    - Even if “nobody” notices
  - Also, sometimes the averages can hide the fact that something important is suffering significantly

That tiny (on average) bit of IOSQ time can probably be easily resolved with a parmlib change. It's not going to materially change the average disk RT, but if there were important I/Os waiting on a PAV, this could still be an important change.

Summary
- On a daily basis review:
  - Capacity utilization
  - Application performance/SLA attainment
  - Business metrics
- Occasionally review technical system performance numbers
- Stay curious!
  - Pull the thread, go down the rabbit hole!
  - You might learn something interesting
  - At the very least you're practicing looking into a problem, so you'll be faster at it when it's a “real” problem

Actual Intelligence Artificial Intelligence
- Artificial Intelligence & Machine Learning can be very helpful, but won't:
  - Be curious about “why” something happened
  - Understand that the thunderstorms that blew through yesterday likely account for more calls into the call center
  - Know that Jane was installing new software yesterday
  - Balance the workload performance based on non-technical needs
  - Build relationships between teams
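
Appendix sketch: average RT versus the RT distribution

The CICS example earlier in the deck turns on one check: did a single long-running transaction drive up the average response time, or did several transactions land in a slow bucket? Here is a minimal Python sketch of that check. The bucket edges, function names, and sample data are all made up for illustration; none of it comes from Pivotor, RMF, or any real z/OS data source.

```python
from statistics import mean

# Hypothetical bucket upper edges in seconds. The deck only mentions a
# 5-10 second bucket; the other edges are assumptions for this sketch.
BUCKET_EDGES = [0.2, 0.5, 1.0, 5.0, 10.0]


def bucket_label(rt, edges=BUCKET_EDGES):
    """Return a label such as '5-10s' for one response time in seconds."""
    lower = 0.0
    for upper in edges:
        if rt < upper:
            return f"{lower:g}-{upper:g}s"
        lower = upper
    return f">={lower:g}s"


def summarize_minute(response_times, edges=BUCKET_EDGES):
    """Return (average RT, {bucket label: count}) for one minute of work."""
    counts = {}
    for rt in response_times:
        label = bucket_label(rt, edges)
        counts[label] = counts.get(label, 0) + 1
    return mean(response_times), counts


# Two invented minutes of 100 transactions each with (nearly) the same average:
# one has a single very long transaction, the other has ten slow ones.
one_outlier = [0.1] * 99 + [60.1]
many_slow = [0.1] * 90 + [6.1] * 10

for name, sample in (("one outlier", one_outlier), ("many slow", many_slow)):
    avg, counts = summarize_minute(sample)
    print(f"{name}: average RT {avg:.2f}s, buckets {counts}")
```

Both minutes report about the same average, but only the second shows a cluster in the 5-10 second bucket, which is the pattern worth following up on.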