How to Monitor Performance

The following lists common performance issues which occur, together with proposals on how to spot and
counteract them.
Recognizing common performance problems
| Area | Symptom(s) | To increase capacity... | To reduce volume... |
|---|---|---|---|
| Client | High client CPU usage. | Install a client CPU with higher performance. | Simplify (HTML) layout. |
| | Low server CPU usage. | Upgrade to a faster browser. | Improve client-side cache. |
| | Some clients fast, some slow. | | |
| Server | | | |
| Network | CPU usage low on both servers and clients. | Remove any network bottlenecks. | Improve/optimize the configuration of the client cache. |
| | Browsing locally on the server is (comparatively) fast. | Increase network bandwidth. | Reduce the "weight" of your web pages (e.g. fewer images, optimized HTML). |
| Web-server | CPU usage on the web server is high. | Cluster your web servers. | Reduce the hits per page (visit). |
| | | Use a hardware load balancer. | |
| Application | Server CPU usage is high. | Cluster your CQ5 instances. | Search for, and eliminate, CPU and memory hogs (use code review, timing output, etc). |
| | High memory consumption. | | Improve caching on all levels. |
| | Slow response times. | | Optimize templates and components (e.g. structure, logic). |
| Repository | | | |
| Cache | | | |
Performance issues may stem from a number of causes that have nothing to do with your website, including
temporary slowdowns in connection speed, CPU load, and many more.
They may also impact either all your visitors, or only a subset of them.
All this information needs to be obtained, sorted and analyzed before you can either optimize the general
performance or solve specific issues.
• Before you experience a performance issue:
  • collect as much information as possible to build up a good working knowledge of the system under
    normal circumstances
• When you experience a performance issue:
  • try to replicate it with one (or preferably more) standard web browsers, on a different client that you
    know has good general performance, and/or on the server itself (if possible)
  • check whether anything (related to the system) has changed within an appropriate time span, and whether
    any of these changes could have impacted the performance
  • ask questions such as:
    • does the issue only occur at specific times?
    • does the issue only occur on specific pages?
    • are other requests impacted?
  • collect as much information as possible to compare with your knowledge of the system under normal
    circumstances
© 2012 Adobe Systems Incorporated.
All rights reserved.
Created on 2014-09-15
TOOLS FOR MONITORING AND ANALYZING PERFORMANCE
The following gives a short overview of some of the tools available for monitoring and analyzing
performance.
Some of these will be dependent on your operating system.
| Tool | Used to analyze... | Usage / More information... |
|---|---|---|
| request.log | Response times and concurrency. | Interpreting the request.log. |
| truss/strace | Page loads. | Unix/Linux commands to trace system calls and signals. Increase the log level to INFO. Analyze the number of page loads per request, which pages, etc. |
| Thread dumps | Observe JVM threads. Identify contentions, locks and long-runners. | Dependent on the operating system: Unix/Linux: kill -QUIT <pid>; Windows (console mode): Ctrl-Break. Analysis tools are also available, such as TDA. |
| Heap dumps | Out of Memory issues that cause slow performance. | Add the -XX:+HeapDumpOnOutOfMemoryError option to the java call to CQ. See the Troubleshooting Guide for Java SE 6 with HotSpot VM. |
| System calls | Identify timing issues. | Calls to System.currentTimeMillis() or com.day.util.Timing are used to generate timestamps from your code, or via HTML comments. Note: These should be implemented so that they can be activated / deactivated as required; when a system is running smoothly the overhead of collecting statistics will not be needed. |
| Apache Bench | Identify memory leaks, selectively analyze response time. | Basic usage is: ab -k -n <requests> -c <concurrency> <url> See Apache Bench and the ab man page for full details. |
| Search Analysis | Execute search queries offline, identify response time of query, test and confirm result set. | |
| JMeter | Load and functional tests. | http://jakarta.apache.org/jmeter/ |
| JProfiler | In-depth CPU and memory profiling. | http://www.ej-technologies.com/ |
| JConsole | Observe JVM metrics and threads. | Usage: jconsole See jconsole and Monitoring Performance using JConsole. Note: With JDK 1.6, JConsole is extensible with plug-ins; for example, Top or TDA (Thread Dump Analyzer). |
| Java VisualVM | Observe JVM metrics, threads, memory and profiling. | Usage: jvisualvm or visualvm See jvisualvm, visualvm and Monitoring Performance using (J)VisualVM. Note: With JDK 1.6, VisualVM is extensible with plug-ins. |
| truss/strace, lsof | In-depth kernel call and process analysis (Unix). | Unix/Linux commands. |
| Timing Statistics | See timing statistics for page rendering. | To see timing statistics for page rendering you can use Ctrl-Shift-U together with ?debugClientLibs=true set in the URL. |
| CPU and memory profiling tool | Used when analyzing slow requests during development. | For example, YourKit. |
| Information Collection | The ongoing state of your installation. | Knowing as much as possible about your installation can also help you track down what might have caused a change in performance, and whether these changes are justified. These metrics need to be collected at regular intervals so you can easily see significant changes. |
INTERPRETING THE REQUEST.LOG
This file registers basic information about every request made to CQ, from which valuable conclusions can be
drawn.
The request.log offers a built-in way to see how long requests take. For development purposes it
is useful to tail -f the request.log and watch for slow response times. To analyze a bigger request.log we
recommend using rlog.jar, which allows you to sort and filter by response time.
We recommend isolating the "slow" pages from the request.log, then tuning them individually for better
performance. This is usually done by including performance metrics per component or by using a performance
profiling tool such as YourKit.
Monitoring traffic on your website
The request log registers each request made, together with the response made:
09:43:41 [66] -> GET /author/y.html HTTP/1.1
09:43:41 [66] <- 200 text/html 797ms
By totaling all the GET entries within specific periods (e.g. over various 24-hour periods) you can make
statements about the average traffic on your website.
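As a minimal sketch of such totaling, the following Python script counts GET request lines per calendar day. It assumes the date-prefixed line format shown elsewhere in this document (for example `31/Mar/2009:11:35:34 +0200 [338] -> GET ...`); the function name and the sample lines are illustrative, not part of CQ.

```python
import re
from collections import Counter

# Matches request lines such as:
# 31/Mar/2009:11:35:34 +0200 [338] -> GET /path HTTP/1.1
# and captures the date part (assumed log format).
REQUEST_LINE = re.compile(r"^(\d{2}/\w{3}/\d{4}):\S+ \S+ \[\d+\] -> GET ")


def get_requests_per_day(lines):
    """Count GET request lines per calendar day."""
    counts = Counter()
    for line in lines:
        match = REQUEST_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample lines; in practice, iterate over the open log file.
sample = [
    "30/Mar/2009:16:45:56 +0200 [12] -> GET /libs/wcm/content/welcome.html HTTP/1.1",
    "30/Mar/2009:16:45:57 +0200 [12] <- 200 text/html 1729ms",
    "31/Mar/2009:11:15:11 +0200 [13] -> GET /libs/wcm/content/welcome.html HTTP/1.1",
    "31/Mar/2009:11:15:20 +0200 [14] -> GET /libs/cq/widgets.js HTTP/1.1",
    "31/Mar/2009:11:15:21 +0200 [15] -> POST /bin/save HTTP/1.1",
]
per_day = get_requests_per_day(sample)
average = sum(per_day.values()) / len(per_day)
print(dict(per_day), average)
```

Responses and non-GET requests are ignored, so the average here covers only the days that actually appear in the log.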
Monitoring response times with the CQ request.log
A good starting point for performance analysis is the request log:
<cq-installation-dir>/crx-quickstart/logs/request.log
The log looks as follows (the lines are shortened for simplicity):
31/Mar/2009:11:32:57 +0200 [379] -> GET /path/x HTTP/1.1
31/Mar/2009:11:32:57 +0200 [379] <- 200 text/html 33ms
31/Mar/2009:11:33:17 +0200 [380] -> GET /path/y HTTP/1.1
31/Mar/2009:11:33:17 +0200 [380] <- 200 application/json 39ms
This log has one line per request and one per response. Each line contains:
• The date and time at which the request or response was made.
• The number of the request, in square brackets. This number is the same for a request and its response.
• An arrow indicating whether this is a request (arrow pointing to the right) or a response (arrow pointing to the left).
• For requests, the line contains:
  • the method (typically GET, HEAD or POST)
  • the requested page
  • the protocol
• For responses, the line contains:
  • the status code (200 means “success”, 404 means “page not found”)
  • the MIME type
  • the response time
Using small scripts, you can extract the required information from the log file and assemble the statistics you
want. From these, you can see which pages or types of pages are slow, and if the overall performance is
satisfactory.
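One possible shape for such a script is sketched below in Python: it pairs each response with its request via the number in square brackets and reports the mean response time per page, slowest first. The function names are hypothetical, and the sketch assumes the line format described above. Request numbers may repeat (for example after a restart), so a robust script might need additional handling.

```python
import re
from collections import defaultdict

# "[379] -> GET /path/x HTTP/1.1": request number, method, requested page.
REQUEST = re.compile(r"\[(\d+)\] -> (\S+) (\S+) ")
# "[379] <- 200 text/html 33ms": request number, status, MIME type, time.
RESPONSE = re.compile(r"\[(\d+)\] <- (\d{3}) (.+) (\d+)ms$")


def response_times_by_path(lines):
    """Pair requests with responses and collect response times per page."""
    pending = {}  # request number -> requested page
    times = defaultdict(list)
    for line in lines:
        line = line.rstrip()
        req = REQUEST.search(line)
        if req:
            pending[req.group(1)] = req.group(3)
            continue
        resp = RESPONSE.search(line)
        if resp and resp.group(1) in pending:
            path = pending.pop(resp.group(1))
            times[path].append(int(resp.group(4)))
    return times


def slowest_paths(times):
    """Return (page, mean response time in ms) pairs, slowest first."""
    means = {path: sum(values) / len(values) for path, values in times.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# The sample log lines shown above.
sample = [
    "31/Mar/2009:11:32:57 +0200 [379] -> GET /path/x HTTP/1.1",
    "31/Mar/2009:11:32:57 +0200 [379] <- 200 text/html 33ms",
    "31/Mar/2009:11:33:17 +0200 [380] -> GET /path/y HTTP/1.1",
    "31/Mar/2009:11:33:17 +0200 [380] <- 200 application/json 39ms",
]
for path, mean_ms in slowest_paths(response_times_by_path(sample)):
    print(f"{mean_ms:8.1f} ms  {path}")
```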
Monitoring search response times with the CQ5 request.log
Search requests are also registered in the log file:
31/Mar/2009:11:35:34 +0200 [338] -> GET /author/playground/en/tools/search.html?query=dilbert&size=5&dispenc=utf-8 HTTP/1.1
31/Mar/2009:11:35:34 +0200 [338] <- 200 text/html 1562ms
So, as above, you can use scripts to extract the relevant information and build up statistics.
However, once you have determined the response time, you may need to analyze why the request is taking
the time it does, and what can be done to improve the response. Further information about the underlying
search functionality of CRX can be found at Searching in CRX.
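As a hedged illustration of extracting search timings, the Python sketch below collects the search term and response time for each search request. It assumes that search pages end in `/search.html` and carry the term in a `query` parameter, as in the sample above; other sites may use different paths and parameter names, and `search_timings` is a hypothetical helper.

```python
import re
from urllib.parse import parse_qs, urlsplit

REQUEST = re.compile(r"\[(\d+)\] -> GET (\S+) ")
RESPONSE = re.compile(r"\[(\d+)\] <- \d{3} .+ (\d+)ms$")


def search_timings(lines, search_suffix="/search.html"):
    """Return (search term, response time in ms) for each search request."""
    pending = {}  # request number -> search term
    results = []
    for line in lines:
        line = line.rstrip()
        req = REQUEST.search(line)
        if req:
            url = urlsplit(req.group(2))
            if url.path.endswith(search_suffix):
                terms = parse_qs(url.query).get("query", [""])
                pending[req.group(1)] = terms[0]
            continue
        resp = RESPONSE.search(line)
        if resp and resp.group(1) in pending:
            results.append((pending.pop(resp.group(1)), int(resp.group(2))))
    return results


# The search request and response shown above.
sample = [
    "31/Mar/2009:11:35:34 +0200 [338] -> GET /author/playground/en/tools/"
    "search.html?query=dilbert&size=5&dispenc=utf-8 HTTP/1.1",
    "31/Mar/2009:11:35:34 +0200 [338] <- 200 text/html 1562ms",
]
print(search_timings(sample))
```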
Monitoring the number and impact of concurrent users
Again the request.log can be used to monitor concurrency and the system's reaction to it.
Tests must be made to determine how many concurrent users the system can handle before a negative
impact is seen. Again scripts can be used to extract results from the log file:
• monitor how many requests are made within a specific time span, e.g. one minute
• test the effects of a specific number of users all making the same requests at (as close as possible) the
  same time; e.g. 30 users clicking Save at the same time.
31/Mar/2009:11:45:29 +0200 [333] -> GET /author/libs/Personalize/content/statics.close.gif HTTP/1.1
31/Mar/2009:11:45:29 +0200 [334] -> GET /author/libs/Personalize/content/statics.detach.gif HTTP/1.1
31/Mar/2009:11:45:30 +0200 [335] -> GET /author/libs/CFC/content/imgs/logo.rZMNURccynWcTpCxyuBNiTCoiBMmw000.default.gif HTTP/1.1
31/Mar/2009:11:45:32 +0200 [335] <- 304 text/html 0ms
31/Mar/2009:11:45:33 +0200 [334] <- 200 image/gif 31ms
31/Mar/2009:11:45:38 +0200 [333] <- 200 image/gif 31ms
31/Mar/2009:11:45:42 +0200 [336] -> GET /author/libs/CFC/content/imgs/logo.rZMNURccynWcTZRXunQbbQtvuuCMbRRBuWXz0000.default.gif HTTP/1.1
31/Mar/2009:11:45:43 +0200 [337] -> GET /author/titlebar_bg.gif HTTP/1.1
31/Mar/2009:11:45:43 +0200 [336] <- 304 text/html 0ms
31/Mar/2009:11:45:44 +0200 [337] <- 304 text/html 0ms
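The following Python sketch shows one way to derive both measures from such a log: the number of requests started per minute, and the peak number of requests "in flight" (requested but not yet answered). The function name is hypothetical and the line format is assumed to be the one shown above.

```python
import re
from collections import Counter

# Captures the timestamp up to the minute, the request number and the arrow.
EVENT = re.compile(r"^(\S+):\d{2} \S+ \[(\d+)\] (->|<-) ")


def concurrency_stats(lines):
    """Return (peak requests in flight, requests started per minute)."""
    in_flight = set()
    peak = 0
    per_minute = Counter()
    for line in lines:
        match = EVENT.match(line)
        if not match:
            continue
        minute, request_id, arrow = match.groups()
        if arrow == "->":
            in_flight.add(request_id)
            per_minute[minute] += 1
            peak = max(peak, len(in_flight))
        else:
            in_flight.discard(request_id)
    return peak, per_minute


# The first six lines of the sample above.
sample = [
    "31/Mar/2009:11:45:29 +0200 [333] -> GET /author/libs/Personalize/content/statics.close.gif HTTP/1.1",
    "31/Mar/2009:11:45:29 +0200 [334] -> GET /author/libs/Personalize/content/statics.detach.gif HTTP/1.1",
    "31/Mar/2009:11:45:30 +0200 [335] -> GET /author/libs/CFC/content/imgs/"
    "logo.rZMNURccynWcTpCxyuBNiTCoiBMmw000.default.gif HTTP/1.1",
    "31/Mar/2009:11:45:32 +0200 [335] <- 304 text/html 0ms",
    "31/Mar/2009:11:45:33 +0200 [334] <- 200 image/gif 31ms",
    "31/Mar/2009:11:45:38 +0200 [333] <- 200 image/gif 31ms",
]
peak, per_minute = concurrency_stats(sample)
print(peak, dict(per_minute))
```

In this sample, requests 333, 334 and 335 are all open before the first response arrives, so the peak is three concurrent requests.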
USING RLOG.JAR TO FIND REQUESTS WITH LONG DURATION TIMES
CQ includes various helper tools located in:
<cq-installation-dir>/crx-quickstart/opt/helpers
One of these, rlog.jar, can be used to quickly sort request.log so that requests are displayed by duration,
from longest to shortest time.
The following command shows the possible arguments:
$ java -jar rlog.jar
Request Log Analyzer Version 21584 Copyright 2005 Day Management AG
Usage:
  java -jar rlog.jar [options] <filename>
Options:
  -h                Prints this usage.
  -n <maxResults>   Limits output to <maxResults> lines.
  -m <maxRequests>  Limits input to <maxRequest> requests.
  -xdev             Exclude POST request to CRXDE.
For example, you can run it with the request.log file as a parameter to show the 10 requests with the
longest duration:
$ java -jar ../opt/helpers/rlog.jar -n 10 request.log
*Info * Parsed 464 requests.
*Info * Time for parsing: 22ms
*Info * Time for sorting: 2ms
*Info * Total Memory: 1mb
*Info * Free Memory: 1mb
*Info * Used Memory: 0mb
-----------------------------------------------------
18051ms 31/Mar/2009:11:15:34 +0200 200 GET /content/geometrixx/en/company.html text/html
2198ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/cq/widgets.js application/x-javascript
1981ms 31/Mar/2009:11:15:11 +0200 200 GET /libs/wcm/content/welcome.html text/html
1973ms 31/Mar/2009:11:15:52 +0200 200 GET /content/campaigns/geometrixx.teasers..html text/html
1883ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/security/cq-security.js application/x-javascript
1876ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/tagging/widgets.js application/x-javascript
1869ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/tagging/widgets/themes/default.js application/x-javascript
1729ms 30/Mar/2009:16:45:56 +0200 200 GET /libs/wcm/content/welcome.html text/html; charset=utf-8
1510ms 31/Mar/2009:11:15:34 +0200 200 GET /bin/wcm/contentfinder/asset/view.json/content/dam?_dc=1238490934657&query=&mimeType=image&_charset_=utf-8 application/json
1462ms 30/Mar/2009:17:23:08 +0200 200 GET /libs/wcm/content/welcome.html text/html; charset=utf-8
You may need to concatenate the individual request.log files if you need to do this operation on a large data
sample.
REQUEST COUNTERS
Information about request traffic (number of requests during a specific time period) gives you an indication
of the load on your instance. This information can be extracted from request.log, though using counters will
automate data collection to let you see:
• significant differences in activity (i.e. differentiate between "many requests" and "low activity")
• when an instance is not being used
• any restarts (counters are reset to 0)
To automate information collection you can also install a RequestFilter to increment a counter on every
request. Multiple counters can be used for different time periods.
The information gathered can be used to indicate:
• significant changes in activity
• a redundant instance
• any restarts (counter reset to 0)
HTML COMMENTS
It is recommended that every project includes HTML comments for server performance. Many good public
examples can be found; select a page, open the page source for viewing and scroll to the bottom. Code such
as the following can be seen:
</body>
</html>
<!-- Page took 58 milliseconds to be rendered by server
-->
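The idea behind such a comment can be sketched as follows. This is an illustration in Python only; in a CQ project the equivalent would be written in the project's own Java or JSP code (for example around calls to System.currentTimeMillis()). The names `render_with_timing` and `render_page` are hypothetical. The `enabled` flag reflects the recommendation that timing output can be switched off when it is not needed.

```python
import time


def render_with_timing(render, enabled=True):
    """Call render() and, if enabled, append a timing comment to its HTML."""
    start = time.perf_counter()
    html = render()
    if not enabled:
        return html
    elapsed_ms = round((time.perf_counter() - start) * 1000)
    comment = f"<!-- Page took {elapsed_ms} milliseconds to be rendered by server -->"
    return f"{html}\n{comment}"


def render_page():
    # Hypothetical stand-in for the real page rendering.
    return "<html><body>Hello</body></html>"


print(render_with_timing(render_page))
```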
APACHE BENCH
To minimize the impact of special cases (such as garbage collection), it is recommended to use a tool
such as apachebench (see the ab documentation for details) in the following way:
$ ab -c 5 -k -n 1000 "http://localhost:4503/content/geometrixx/en/company.html"
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:        Day-Servlet-Engine/4.1.8
Server Hostname:        localhost
Server Port:            4503

Document Path:          /content/geometrixx/en/company.html
Document Length:        14246 bytes

Concurrency Level:      5
Time taken for tests:   54.595 seconds
Complete requests:      1000
Failed requests:        943
   (Connect: 0, Receive: 0, Length: 943, Exceptions: 0)
Write errors:           0
Keep-Alive requests:    0
Total transferred:      14391487 bytes
HTML transferred:       14242487 bytes
Requests per second:    18.32 [#/sec] (mean)
Time per request:       272.974 [ms] (mean)
Time per request:       54.595 [ms] (mean, across all concurrent requests)
Transfer rate:          257.43 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   2.6      0      40
Processing:   121  271  72.9    258     653
Waiting:      114  256  69.3    244     628
Total:        121  272  72.9    260     654

Percentage of the requests served within a certain time (ms)
  50%    260
  66%    290
  75%    310
  80%    324
  90%    368
  95%    411
  98%    453
  99%    491
 100%    654 (longest request)
The numbers above are taken from a standard, single-CPU, dual-core Intel laptop accessing the geometrixx
company page, as included in a default CQ installation. The page is very simple, but not optimized for
performance.
apachebench also displays the time per request as the mean, across all concurrent requests; see Time per
request: 54.595 [ms] (mean, across all concurrent requests). You can change the value of the concurrency
parameter -c (number of multiple requests to perform at a time) to see any effects.
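The relationship between these figures can be checked with a little arithmetic. The Python sketch below recomputes them from the total time, request count and concurrency level reported above; the results agree with ab's output up to rounding of the reported total time.

```python
# Values taken from the ab output above.
total_seconds = 54.595
requests = 1000
concurrency = 5

# Throughput: completed requests divided by total test time.
requests_per_second = requests / total_seconds
# Mean time each request took, as seen by one of the concurrent clients.
time_per_request_ms = concurrency * total_seconds * 1000 / requests
# Mean time per request across all concurrent requests.
time_across_all_ms = total_seconds * 1000 / requests

print(f"{requests_per_second:.2f} requests/s")
print(f"{time_per_request_ms:.3f} ms per request (mean)")
print(f"{time_across_all_ms:.3f} ms per request (mean, across all concurrent requests)")
```

The second figure is the third multiplied by the concurrency level, which is why raising -c without a matching gain in throughput increases the mean time per request.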
MONITORING PERFORMANCE USING JCONSOLE
The tool command jconsole is available with the JDK.
1. Start your CQ5 instance.
2. Run jconsole.
3. Select your CQ instance and Connect.
4. From within the Local application, double-click com.day.crx.quickstart.Main; the Overview will be shown
   as default.
After this you can select other options.
MONITORING PERFORMANCE USING (J)VISUALVM
Since JDK 1.6, the tool command jvisualvm is available. After you have installed JDK 1.6 you can:
1. Start your CQ5 instance.
   NOTE
   If using Java 5 you can add the -Dcom.sun.management.jmxremote argument to the
   java command line that starts your JVM. JMX is enabled per default with Java 6.
2. Run either:
   • jvisualvm: in the JDK 1.6 bin folder (tested version)
   • visualvm: can be downloaded from VisualVM (bleeding edge version)
3. From within the Local application, double-click com.day.crx.quickstart.Main; the Overview will be shown
   as default.
After this you can select other options, including Monitor.
You can use this tool to generate thread dumps and memory heap dumps. This information is often
requested by the technical support team.
INFORMATION COLLECTION
Knowing as much as possible about your installation can help you track down what might have caused a
change in performance, and whether these changes are justified. These metrics need to be collected at
regular intervals so you can easily see significant changes.
The following information can be useful:
• How many authors are working with the system?
• What is the average number of page activations per day?
• How many pages do you currently maintain on this system?
• If you use MSM, what is the average number of rollouts per month?
• What is the average number of Live Copies per month?
• If you use CQ DAM, how many assets do you currently maintain in CQ DAM?
• What is the average size of the assets?
• How many templates are currently used?
• How many components are currently used?
• How many requests per hour do you have on the author system at peak time?
• How many requests per hour do you have on the publish system at peak time?
How many authors are working with the system?
To see the number of authors that have used the system since installation use the command line:
cd <cq-installation-dir>/crx-quickstart/logs
cut -d " " -f 3 access.log | sort -u | wc -l
To see the number of authors working on a given date:
grep "<date>" access.log | cut -d " " -f 3 | sort -u | wc -l
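For environments without these Unix tools, the same counts can be obtained with a short script. The Python sketch below mirrors the two pipelines above: it collects the third space-separated field of each line and counts the distinct values, optionally only for lines containing a given date. The sample lines and user names are illustrative assumptions about the access.log layout.

```python
def distinct_users(lines, date=None):
    """Count distinct values of the third space-separated field.

    If date is given, only lines containing that string are considered,
    mirroring grep "<date>" in the shell pipeline.
    """
    users = set()
    for line in lines:
        if date is not None and date not in line:
            continue
        fields = line.rstrip("\n").split(" ")
        if len(fields) >= 3:
            users.add(fields[2])
    return len(users)


# Illustrative lines only; the real access.log layout may differ.
sample = [
    '127.0.0.1 - admin 30/Mar/2009:16:45:56 +0200 "GET /libs/wcm/content/welcome.html HTTP/1.1" 200',
    '127.0.0.1 - admin 31/Mar/2009:11:15:34 +0200 "GET /libs/cq/widgets.js HTTP/1.1" 200',
    '127.0.0.1 - author 31/Mar/2009:11:16:02 +0200 "GET /content/geometrixx/en/company.html HTTP/1.1" 200',
]
print(distinct_users(sample))
print(distinct_users(sample, date="30/Mar/2009"))
```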
What is the average number of page activations per day?
To see the total number of page activations since server installation use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:AuditEvent)[@cq:type='Activate']
Then calculate the number of days that have elapsed since installation to calculate the average.
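The averaging step is simple date arithmetic. As a minimal sketch, with a hypothetical query result and hypothetical installation and current dates:

```python
from datetime import date


def average_per_day(total_events, installed_on, today):
    """Average number of events per day elapsed since installation."""
    days = (today - installed_on).days
    if days <= 0:
        raise ValueError("today must be later than the installation date")
    return total_events / days


# Hypothetical figures: 1200 activations returned by the query,
# installation on 2013-01-01, calculation made on 2013-04-11.
print(average_per_day(1200, date(2013, 1, 1), date(2013, 4, 11)))
```

With these figures 100 days have elapsed, giving an average of 12 activations per day.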
How many pages do you currently maintain on this system?
To see the number of pages currently on the server use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:Page)
If you use MSM, what is the average number of rollouts per month?
To determine the total number of rollouts since installation use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:AuditEvent)[@cq:type='PageRolledOut']
Calculate the number of months that have elapsed since installation to calculate the average.
What is the average number of Live Copies per month?
To determine the total number of Live Copies made since installation use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:LiveSyncConfig)
Again use the number of months that have elapsed since installation to calculate the average.
If you use CQ DAM, how many assets do you currently maintain in CQ DAM?
To see how many DAM assets you currently maintain, use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query /jcr:root/content/dam//element(*, dam:Asset)
What is the average size of the assets?
To determine the total size of the /var/dam folder:
1. Use WebDAV to map the CQ repository to the local file system.
2. Use the command line:
cd /Volumes/localhost/var
du -sh dam/
To get the average size, divide the global size by the total number of assets in /var/dam (obtained
above).
How many templates are currently used?
To see the number of templates currently on the server use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:Template)
How many components are currently used?
To see the number of components currently on the server use a repository query; via CRXDE - Tools - Query:
• Type XPath
• Path /
• Query //element(*, cq:Component)
How many requests per hour do you have on the author system at peak time?
To determine the requests per hour you have on the author system at peak time:
1. To determine the total number of requests since installation use the command line:
cd <cq-installation-dir>/crx-quickstart/logs
grep -R "\->" request.log | wc -l
2. To determine the start and end dates:
vim request.log
G / 1G: for the last/first lines
Use these values to calculate the number of hours that have elapsed since installation, then the
average number of requests per hour.
How many requests per hour do you have on the publish system at peak time?
Repeat the above procedure on your publish instance.
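The procedure above yields an average over the whole period. To find the busiest hour itself, the requests can be grouped by hour; the Python sketch below does this for either instance's request.log. It assumes the date-prefixed line format shown earlier; the function names and sample lines are illustrative.

```python
import re
from collections import Counter

# Captures date and hour from request lines such as:
# 31/Mar/2009:11:15:11 +0200 [1] -> GET /a HTTP/1.1
HOUR = re.compile(r"^(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} \S+ \[\d+\] -> ")


def requests_per_hour(lines):
    """Count request lines (not responses) per clock hour."""
    counts = Counter()
    for line in lines:
        match = HOUR.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def peak_hour(counts):
    """Return the (hour, request count) pair with the most requests."""
    return max(counts.items(), key=lambda item: item[1])


# Illustrative sample lines; in practice, iterate over the open log file.
sample = [
    "31/Mar/2009:11:15:11 +0200 [1] -> GET /a HTTP/1.1",
    "31/Mar/2009:11:15:20 +0200 [2] -> GET /b HTTP/1.1",
    "31/Mar/2009:11:15:21 +0200 [1] <- 200 text/html 10ms",
    "31/Mar/2009:12:01:00 +0200 [3] -> GET /c HTTP/1.1",
]
print(peak_hour(requests_per_hour(sample)))
```

Hours without any requests do not appear in the counts, so the average over all elapsed hours still needs the first and last timestamps as described above.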
Analyzing Specific Scenarios
The following is a list of suggestions on what to check if you start experiencing certain CQ performance
problems. The list is not (unfortunately) fully comprehensive.
CPU AT 100%
If the CPU of your system is constantly running at 100% then see:
• The Knowledge Base:
• Analyze Slow and Blocked Processes
OUT OF MEMORY
Although such errors should be detected during Development and Testing, certain scenarios can slip
through.
If your system is running out of memory this can be seen in various ways, including performance degradation
and error messages including the subtext:
java.lang.OutOfMemoryError
In these cases check:
• the JVM settings used to start CQ
• The Knowledge Base:
• Analyze Memory Problems
DISK I/O
If your system is either running out of disk space, or you notice disk thrashing starting, see:
• Optimizing Tar Files and Optimizing Tar Files in a Cluster
• Whether you have disabled collection of debug information; this can be configured in various locations,
including:
• Apache Sling JSP Script Handler
• Apache Sling Java Script Handler
• Apache Sling Logging Configuration
• CQ HTML Library Manager
• CQ WCM Debug Filter
• Loggers
• Whether and how you have configured Version Purging
• The Knowledge Base:
• Too Many Open Files
• Journal consumes too much diskspace
REGULAR PERFORMANCE DEGRADATION
If you see the performance of your instance deteriorating after each reboot (sometimes a week or more
later), then the following can be checked:
• Out of Memory
• The Knowledge Base:
• Unclosed Sessions
JVM TUNING
The Java Virtual Machine (JVM) has significantly improved in respect to tuning (especially since Java 7).
Because of this, specifying a reasonable fixed JVM size and using the defaults will often be suitable.
If the default settings are not suitable, then it is important to establish a method to monitor and assess GC
performance before attempting to tune the JVM; this can involve monitoring factors including heap size,
GC algorithm and other aspects.
Some common choices are:
• VerboseGC:
-verbose:gc \
-Xloggc:$LOGS/verbosegc.log \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps
The resulting log can be ingested by a GC visualizer such as:
http://www.ibm.com/developerworks/library/j-ibmtools2/
Or JConsole:
• These settings are for a "wide open" JMX connection:
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=8889 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false
• Then connect to the JVM with the JConsole; see:
http://docs.oracle.com/javase/6/docs/technotes/guides/management/jconsole.html
This will help you see how much memory is being used, what GC algorithms are being used, how long they
take to run, and what effect this has on your application performance. Without this, tuning is just "randomly
twiddling knobs".
NOTE
For Oracle's VM there is also information at:
http://docs.oracle.com/javase/7/docs/technotes/guides/vm/server-class.html