Hidden Treasures in Mainframe Performance


Hidden Treasures in Mainframe Performance
A real-life case study
20 April 2016
Günter Priller
About the Presentation
This presentation was developed by:
• Contigon informationstechnologie + consulting gmbh
• Vogelsberg Consulting GmbH
The presentation aims for:
• Showing an extended approach to performance analysis
• Describing how workload pattern recognition leads to optimizations
• Encouraging a deeper dive into performance data analytics
09.05.2016
2
About the Presentation Developers
• Specialized in performance and tuning
• More than 20 years of experience
• Focused on Mainframe performance
The starting point
• We had optimized an LPAR by approximately 35% with several changes
• On the system level, changes to MQ, LE, IMS, SMS and z/OS had been applied
• On the application level, we applied SQL and COBOL changes
• Best practices were implemented, the biggest CPU burners had been eliminated, and
some recommendations were waiting to be implemented within the next weeks
• This was the starting point for a deeper dive, because the 'undergrowth' had been removed
The starting point
The Challenge
Service Class BATCHNO
• The CPU consumption seemed to be unusually high compared to other LPAR workloads
• The LPAR itself was more of an online LPAR than a batch LPAR
• Further investigation was required, as it seemed that there were hidden CPU savings
which had been invisible before the first optimization phases
• The real challenge was: usually you stop after a 35% CPU reduction for one LPAR, but we
thought – THE REAL FUN STARTS NOW!
Address Space Level Breakdown
CATALOG Analysis
When we saw the high CATALOG CPU, we decided to analyze the
SMF 42 records for anomalies
• The first observation was a high usage of load libraries in terms of EXCPs
• A quick check of the JCL showed that every job was using the JOBLIB statement
• We also observed periodic high usage of datasets every full hour
• The root cause was the HSM interval migration and some 'miscellaneous' MGMTCLAS
definitions
• The last indicator for an anomaly was a high execution frequency of a few batch jobs
per hour
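The JOBLIB observation can be illustrated with a minimal JCL sketch (all dataset and program names are hypothetical). A JOBLIB DD is searched for every step's program, while a STEPLIB limits the search to the one step that actually needs the private library:

```jcl
//MYJOB    JOB (ACCT),'JOBLIB DEMO',CLASS=A
//* A JOBLIB library is searched for EVERY step's program,
//* driving catalog lookups and EXCPs even in steps that run
//* only standard utilities:
//JOBLIB   DD  DSN=APPL.LOADLIB,DISP=SHR
//STEP1    EXEC PGM=IEFBR14
//STEP2    EXEC PGM=APPLPGM1
//* Alternative: drop the JOBLIB and code a STEPLIB only on
//* the step that actually needs the private load library:
//*STEP2   EXEC PGM=APPLPGM1
//*STEPLIB DD  DSN=APPL.LOADLIB,DISP=SHR
```

With the STEPLIB variant, STEP1's program search goes straight to the system search order instead of passing through the application load library first.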
CATALOG Conclusions
The following anomalies could be determined as indicators of
high CPU usage in the CATALOG and application address spaces
• The JOBLIB statement, causing unnecessary CPU consumption in job-step initiator times
and in uncaptured CPU
• The hourly HSM interval migration, along with MGMTCLAS definitions (datasets with
0 days before migration)
• The high frequency of a few job steps (every 5 minutes) which used the quickly migrated
datasets
• Three application programs responsible for most of the dataset usage
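The two HSM-related anomalies can be addressed in two places. The following is a hedged sketch, not the exact settings used at the site: interval migration is controlled in the DFSMShsm startup member (ARCCMDxx), and the 0-day migration eligibility comes from an SMS management class attribute set via ISMF:

```jcl
/* ARCCMDxx (DFSMShsm parameters) - sketch only              */
SETSYS NOINTERVALMIGRATION  /* stop the hourly interval      */
                            /* migration runs                */
/* SMS management class (maintained via ISMF) - sketch only  */
/* Before: Primary Days Non-usage : 0  (eligible at once)    */
/* After:  Primary Days Non-usage : 7  (ages on primary      */
/*         storage before HSM migrates it)                   */
```

With 0 days before migration and jobs touching the same datasets every 5 minutes, datasets are migrated and recalled in a tight loop, which is exactly the pattern seen in the SMF 42 data.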
Initiator CPU times
Given the observation that the JOBLIB statement was used
in every batch job, we needed to investigate initiator CPU times
• SMF30 subtype 4 provides the new field ICU, which reports initiator CPU times per
job step
• An expected ratio between initiator CPU and total CPU time would be 1:25 or
even 1:30
• Our workload showed a different behavior
Initiator CPU times
Initiator CPU times application program1
Initiator CPU times application program2
Initiator CPU Analysis on application
• Most of the initiator CPU time was concentrated in 3 application programs
• An analysis of the purpose of these programs showed the following:
• Program1 copied one dataset to another
• Program2 concatenated different datasets according to the JCL
• Program3 sent messages via WTO to the job output
• Our first conclusion was:
• Replace Program1 with ICEGENER or SORT
• Replace Program2 with IEFBR14
• Replace Program3 with native JCL facilities
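The proposed replacements can be sketched in JCL (dataset and program names are hypothetical). ICEGENER copies SYSUT1 to SYSUT2 with no application code involved, and a plain DD concatenation on the consuming step makes a separate concatenation program unnecessary:

```jcl
//* Program1 (dataset copy) replaced by ICEGENER:
//COPY     EXEC PGM=ICEGENER
//SYSPRINT DD  SYSOUT=*
//SYSUT1   DD  DSN=APPL.INPUT,DISP=SHR
//SYSUT2   DD  DSN=APPL.OUTPUT,DISP=(NEW,CATLG,DELETE),
//             LIKE=APPL.INPUT
//SYSIN    DD  DUMMY
//* Program2 (concatenation) becomes a plain DD concatenation
//* on the step that reads the data; where a step must remain
//* for the scheduler, IEFBR14 does nothing at minimal cost:
//READ     EXEC PGM=CONSUMER
//INPUT    DD  DSN=APPL.PART1,DISP=SHR
//         DD  DSN=APPL.PART2,DISP=SHR
```

ICEGENER transparently invokes DFSORT's copy function when possible, which is where the CPU advantage over a home-grown copy program comes from.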
Initiator CPU Analysis on application
• Tests of the desired replacements showed a 60% CPU reduction for
the 3 programs
• The next step was to analyze the frequency and occurrence
• As these programs were considered self-written utilities, they occurred in
nearly every job, in different job steps
• At least 5 jobs were scheduled every 5 minutes
• These jobs contained at least 30 step-level occurrences of these self-written
utilities
• Now it was time to question the frequency
Initiator CPU Analysis on application
• The high frequency of execution for these jobs/programs was explained as:
"we run it every 5 minutes, and if there is nothing to do, it will not fail"
• Changing this time-driven behavior to an event-driven schedule could eliminate 80% or
more of the executions, i.e. the number of executions per program could be reduced by 80%
• Along with replacing the self-written utilities with standard utilities, this meant
a 90% CPU reduction for these processes (20% of the executions at 40% of the CPU
each leaves roughly 8% of the original consumption)
Final Conclusions
During the complete analysis we aimed for easy-to-apply
changes that would have a reasonable impact in terms of CPU
savings
• The recommended changes could be categorized as non-intrusive and concentrated on
JCL and scheduling
• Replacement of application programs in the JCL by common utilities
• Elimination of the JOBLIB statement
• Adjustment of scheduling plans (event-driven versus time-driven)
• Adjustment of HSM processing parameters
• Re-design of MGMTCLAS settings in SMS affecting HSM processing
The changes led to savings in CATALOG, application job steps, HSM CPU and uncaptured
CPU, and to simpler scheduling control
Final Conclusions
• The workload picture at the starting point did not really promise substantial savings
• One KPI led to the further analysis – the CATALOG address space CPU
consumption
• By combining SMF data, parameter settings and JCL design into one context,
we could get to the bottom of the issue
• The projected savings amounted to more than 10% of CPU consumption,
including batch workload, CATALOG, HSM and uncaptured CPU
The approach of combining different data sources into a complete picture succeeded and
revealed even further opportunities to pursue.
End of Presentation
Thank you for your attention and patience
Q/A