Thursday, March 18, 2010

Getting the Improvement Measure of Missing Indexes from the Plan Cache

I have been interrogating the plan cache a lot more every since I was tuning the cost threshold for parallelism on one of my servers and ran across the post Tuning ‘cost threshold for parallelism’ from the Plan Cache by Jonathan Kehayias.

You may be aware that in SQL Server 2008, missing indexes show up in the execution plan. This also works if you are using SQL Server 2008 Management Studio to view SQL Server 2005 plans. One thing I noticed though is that if there are multiple batches in the plan, the same missing index shows for each query in the plan. I also noticed that there could be multiple missing indexes in the actual XML, but the GUI only shows one. The way the XML is structured, you can even have multiple missing indexes per statement and all you will ever see is one missing index.

You may have also seen the Missing Index Report you can create using DMV's. This report generates an improvement measure based on the characteristics of the missing index. The problem with the DMV's is that you can't tell what query caused the index recommendation.

I figured since everything I wanted was in the Plan Cache, I could write a query that would calculate an improvement measure based on the number of times the plan was used, the cost of the statement in the plan, and the impact the index will have on the statement. In addition, I can also see the statement that would benefit from the index.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
WITH XMLNAMESPACES  
   
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
SELECT TOP 100--ecp.plan_handle,
        
DENSE_RANK() OVER ORDER BY ecp.plan_handle AS ArbitraryPlanNumber ,
        
n.value('(@StatementSubTreeCost)[1]''VARCHAR(128)')
        * 
ISNULL(n.value('(./QueryPlan/MissingIndexes/MissingIndexGroup/@Impact)[1]','float'), 0)
        * 
ecp.usecounts AS Improvement ,
        
query_plan AS CompleteQueryPlan ,
        
n.value('(@StatementId)[1]''float'AS StatementID ,
        
n.value('(@StatementText)[1]''VARCHAR(4000)'AS StatementText ,
        
n.value('(@StatementSubTreeCost)[1]''VARCHAR(128)'AS StatementSubTreeCost ,
        
n.query('./QueryPlan/MissingIndexes'MissingIndex ,
        
n.value('(./QueryPlan/MissingIndexes/MissingIndexGroup/@Impact)[1]','float'IndexImpact ,
        
ecp.usecounts
FROM    sys.dm_exec_cached_plans AS ecp
        
CROSS APPLY sys.dm_exec_query_plan(plan_handleAS eqp
        
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple')
        
AS qn )
WHERE   n.value('(./QueryPlan/MissingIndexes/MissingIndexGroup/@Impact)[1]','float'IS NOT NULL AND
        
ecp.usecounts 100 AND
        
eqp.query_plan.exist('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan"; 
         /ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan/MissingIndexes'

ORDER BY Improvement DESC


I also included an arbitrary plan number, which is just an ID that gets assigned to the plan, so I can easily see if the statements are coming from the same plan. I could use the plan hash, but using an ID is easier for me to identify the same plans.

0 comments: