Today's post is the second in a long series. Part one covered the Microsoft Process Monitor tool. Part will cover interpreting Process Monitor results. In my lab I have a TPC-H "Like" environment (If testing is not sanctioned by the TPC committee then you are legally obliged to call it TPCH Like - http://www.tpc.org/tpch/). The environment consists of a 300GB data base with clustered and Non-Unique Indexes (depending on the table and keys). The data is generated by the TCPH utility called DBGEN.EXE (you can download the source files and documentation at http://www.tpc.org/tpch/).
Here is the database structure:
For the purposes of these examples I am using the Supplier table. It consists of 3000000 rows.
I fired up two different queries and ran Process Monitor to track the I/O (http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx - Version 2.95 as of this writing).
The first query simply counts the rows in the table:
Select COUNT(*) as "row count"
FROM (select * FROM [TPCH300R10Thk].[dbo].[H_Supplier]) as t
The query returns a value of 3000000. It scans the .mdf file with 64K and larger I/O (a total of 209 reads averaging 165K each). For quick setup I only created one data file and one log file.
I set up Process Monitor to track only sqlservr.exe events (see Part 1 http://sqlvelocity.typepad.com/blog/2011/04/detailed-windows-io-process-monitor.html). I saved all of the events that generated the query values as a .csv file:
Next, open the .csv file in Excel. We first need to separate the detail column into usable parts. Each detail records the offset, length and I/O flags:
Offset: 1,245,184, Length: 65,536, I/O Flags: Non-cached, Priority: Normal
I use the text to columns feature of Excel to separate the values into columns (Data tab then Text to Columns):
I use a Space as the delimiter of choice (using comma does not work as commas are used as both a 1000's separator and a value Delimiter):
Finally I use step 3 of the text to column wizard to clean up the import by skipping the text (as I don't want it for this analysis).
I am left with an offset column and a length column:
As a final step I clean up the data by removing the commas and converting the columns to number with no decimal places.
I inserted a Scatter graph (Scatter with Markers only) and plotted all of the offsets (Insert tab – charts – scatter). This creates a visual representation of the query:
This graph shows that the row count query created a nice sequential access pattern. In addition to the access pattern we can also look at the actual size of each I/O. In this example SQL server issued 30 64K I/O requests. As the SQL engine detected sequential access it began to size up the I/O request to 128, 192 and 256K.
Note: I wanted to specifically point out that the offset range is localized. If the data set was more varied, for instance starting at 0 and going all the way up to 361,380,839,424 the graph will appear to show only sequential data access. It is important to look at the data ranges before plotting to ensure the graph shows the appropriate data. In other words, zoom…
Excluding the first 6 I/O's and re-creating the chart will result in a more detailed data view:
The graph reveals that the initial I/O requests were purely sequential and extremely small. SQL Server realized the sequentially of the data and cranked up the request size. Realize that this is a 2TB volume with approximately 600GB of data (this is why the offsets values are so large). This query is accessing 463MB of data space with 33MB of index.
The second query lists the supplier's phone numbers that are from PERU (Country 17 is PERU – I skipped the join to simplify the I/O access):
Where s_nationkey =17
The query caused a more random I/O access pattern that is localized to the specific table and the I/O sizes were highly varied (from 16K all the way up to 512K).
As you can see, tying procmon.exe with Microsoft Excel results in an extremely detailed view of each I/O request. We can view the size of each request and exactly how the request accessed the disk. Using procmon.exe when testing new queries can lead to code optimization that is simply not possible with other tools. Perfmon is a great tool for looking at long term trends and finding problems but it is not granular enough to really diagnose specific problems.
Please remember to always experiment with these tools. Never run Process Monitor in production unless it is an emergency and you take precautions to limit system impact (as noted in the last post; this tool is tracing EVERYTHING in memory. Tracing SQLServer.exe as it scans a 100TB data warehouse will result in a runaway application and a system reboot). View the Process Monitor help file for notes on scripting etc…