Community Server for Rent a Coder

How Software Gets Done
Welcome to Community Server for Rent a Coder Sign in | Join | Help
in Search

Real Life Microsoft IT

A "must use" SQL Server 2005 Feature-ROW_NUMBER()

If you run your SQL Server as the back end for a website, then paging is one of the most common things that you do.  In SQL Server 2000, doing this paging was complicated and resource intensive.  But in 2005, there is a new feature called ROW_NUMBER than you can combine with BETWEEN that will give you orders of magnitude improvement.

SQL Server Processor utlization before ROW_NUMBER (Last Saturday):




SQL Server Processor utlization after ROW_NUMBER (the cutover is plainly visible at around 1:30 AM this Saturday):

When pushed to production, he CPU INSTANTANEOUSLY dropped from an average of about 50% CPU toabout  25% CPU, and stayed there!  That's a 50% reduction in CPU just from one query change!  If you run a website with SQL Server, this should be one of the FIRST things you look at after you upgrade.

Here are the best sources I found on how to implement it:

1) http://sqljunkies.com/WebLog/amachanic/archive/2004/11/03/4945.aspx

Adam Mechanic ran some tests on the new ROW_NUMBER feature using paging and shows dramatic results.  The SQL 2000 style SQL returns a cost of 5,516.  The new 2005 style SQL returns a cost of 0.202!

SQL 2000 style:

SELECT x.SomewhatLargeString
FROM (
SELECT TOP 20010 A.SomewhatLargeString, COUNT(*) AS TheCount
FROM #BigTableOfStrings A
JOIN #BigTableOfStrings B ON B.SomewhatLargeString <= A.SomewhatLargeString
GROUP BY A.SomewhatLargeString
ORDER BY A.SomewhatLargeString ) x
WHERE x.TheCount BETWEEN 20001 AND 20010

SQL 2005 Style:

SELECT x.SomewhatLargeString,x.TheCount
FROM (
SELECT TOP 20010 A.SomewhatLargeString, ROW_NUMBER() OVER(ORDER BY A.SomewhatLargeString) AS TheCount
FROM #BigTableOfStrings A
ORDER BY A.SomewhatLargeString) x
WHERE x.TheCount BETWEEN 20000 AND 20010

When you look at any new queries implementing ROW_NUMBER with BETWEEN, you may first wonder why there are 2 queries (the base query nested inside a top level query).  I did too.  I wondered, "Why not combine it into one query"?  The reason was clarified to me in a support ticket to Microsoft while trying to do this with a SELECT DISTINCT subquery.  By design, SQL Server cannot referenced an aliased field using a WHERE clause (which is where the BETWEEN is located).  And you can't reference the "actual" field name for ROW_NUMBER(), becuase there is no actual field name.  So to work around it, the solution is to wrap the query in a top level query that pull at least the aliased field...and in that process, the field now becomes the "actual" field name at the top level and can be referenced by the WHERE/BETWEEN.  If that doesn't make sense...you don't need to understand it to use it...just copy the sytax above and USE IT.

The other tip is to also use the TOP setting in the inner select and set it to the highest # in your range of rows.  It will speed up the query and you don't need any rows past that anyway, so it's a "free boost".

2) http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql_05TSQLEnhance.asp

Microsoft's information

Ian Ippolito

Published Saturday, April 29, 2006 11:30 AM by admin
Filed Under:

Comments

 

Real Life Microsoft IT said:

We recently upgraded from SQL Server 2000 to SQL 2005.&amp;nbsp; We simultaneously did a serious upgrade...
May 26, 2006 2:32 PM
Anonymous comments are disabled

This Blog

Post Calendar

<April 2006>
SuMoTuWeThFrSa
2627282930311
2345678
9101112131415
16171819202122
23242526272829
30123456

Post Categories

Syndication

Powered by Community Server, by Telligent Systems