Search This Blog

Friday, July 30, 2010

Avoid Functions on Indexed Columns

Many of us have the have the habit of doing this common mistake of having functions over column names, which makes the column take a totally new value, and by which we make
SQL Server not use the indexes which results in poor performance of a query.

Some of the functions we use generally use in where clause are convert, substring, datediff, etc. Let us understand the performance implications by means of a example below.

SET NOCOUNT ON
SET STATISTICS TIME OFF

IF EXISTS ( SELECT 'X' FROM sysobjects where TYPE = 'U' and name = 'TEST_FUNCTION_PERF')
DROP TABLE TEST_FUNCTION_PERF
GO

--Create a table with 3 columns
CREATE TABLE TEST_FUNCTION_PERF
( Col1_Int INT IDENTITY(1,1) NOT NULL,
Col2_Date DATETIME NOT NULL,
Col3_Varchar VARCHAR(255) NOT NULL
)

--Let us create clustered index on Date column , it can be on any columns which we would use in where clause

CREATE CLUSTERED INDEX IX_TEST ON TEST_FUNCTION_PERF(Col2_Date)

--Populate table with 90000 records

DECLARE @iCount INTEGER
SELECT @iCount = 1

WHILE @iCount < = 90000 BEGIN INSERT INTO TEST_FUNCTION_PERF(Col2_Date,Col3_Varchar) SELECT GETDATE()- @iCount,'ABC' + CONVERT(VARCHAR,@iCount) SELECT @iCount = @iCount + 1 END Now let us enable statistics to know how our queries perform in terms of CPU Time and Elapsed Time for 3 different queries below which produces same results of 59 rows. SET STATISTICS TIME ON --Query 1: SELECT COL2_DATE FROM TEST_FUNCTION_PERF WHERE CONVERT(VARCHAR(10),COL2_DATE,101) > GETDATE()- 60

--Query 2
SELECT COL2_DATE FROM TEST_FUNCTION_PERF WHERE DATEDIFF(DD,COL2_DATE,GETDATE()) <> DATEADD(DD,-60,GETDATE())

/*
SQL Server Execution Times:
CPU time = 359 ms, elapsed time = 438 ms.

SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 63 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
*/

If you see execution plan of these queries first 2 queries would have gone for Index Scan and the last one would be Index Seek which is optimal and the best way to write such queries is using Query3 format i.e. by keeping index columns separated from functions.

No comments: