Statspack analysis with Mumbai

Here’s a real-world example that shows what the Statspack analysis functionality in Mumbai can do for you.

The main idea is that you usually don’t know where to look with the large amount of data gathered by Statspack, especially over longer periods of time. You could create one Statspack report over the complete gathering period, but of course aggregates conceal details and we might miss important information.
We could also create a number of Statspack reports for smaller intervals and look through all of these, but this is definitely awkward to do.

With Mumbai you can get a graphical representation of the amount of wait event and DB CPU time over a time interval that you have Statspack data for.

What I usually fo first is to look for the DB CPU time over elapsed time (by selecting the Wait Events report in Mumbau and going to the Absolute wait times per second tab) to get a general idea about the load in the database:

The graph above is for an 18-day interval and it’s easy to distinguish 5 working days and the weekend and day/night periods. Three database restarts can be recognized at the time where DB CPU graph crosses the y-axis.

Also, there are three significant peaks in the CPU load at discrete points in time. This is something that would be averaged out in a Statspack report for the complete interval. At this point I created three Statspack report for the three peak intervals and I was able to pinpoint the root causes for the high CPU load (a statement issued manually from TOAD and slower average wait times for storage related wait events assumed to be caused by concurrent access to the SAN by other applications/servers) .

The image above shows the Top 2 to Top 4 wait events for a 7-day interval (Top 1 was “enqueue” here and is omitted here for clarity). One interesting thing to note is that the DB file scattered read times are on a rather high level and while the DB CPU and DB file sequential read times go down, DB file scattered read stays on a high level even over the weekend. With some additional knowledge about the application using this database this was an indication that a separate interface process working on messages from interfaced subsystems running constantly in the background was doing FULL TABLE or FULL INDEX SCANS. A Statspack report for a small weekend interval and a look at the execution plan recorded by Statspack confirmed this (FULL INDEX SCAN) and the root cause was pretty easy to fix.

So, Mumbai can give you a better idea where to look in the large amount of Statspack data that you might have. An analysis on a deeper level, by creating separate Statspack reports for smaller intervals or even by going to the more detailed session level, is usually still necessary.