ADO.NET and SQL Server Optimization
Recently a compatriot and I got into this diatribe over the use of CommandType.Text and "EXEC spName param, param, ..." to make SQL Server stored procedure calls.
I had chimed in on his answer to a poster who was having difficulty with a sproc call, and he suggested "you don't need to use the Command Object" and proceeded to show an illustration like the above.
Even after I had provided some pretty good background material on why this is not a good idea, he continued to hammer on little pieces of the subject, in effect missing the big picture. There have been many times I've been corrected. As I've grown older (and hopefully wiser) I've learned to put my ego in my back pocket and thank the person for the education. And you? I guess what I am trying to say is this; you have two MVPs who didn't know about this. The only reason I found out about it is because I was lucky enough to attend Tech-Ed 2004 and sit in Gert Drapers' presentation. So if two C# MVP's weren't aware of it, how many other people do you suppose are writing whole applications using Language Event processing instead of RPC Event processing as they should?
The bottom line is, you should never use CommandType.Text and concatenated "EXEC SPNAME PARM PARM" convention. In fact, even if you are not using stored procedures at all - you should parameterize your text queries - because the SQL Server engine is smart enough to "Template-ize" them - and cache the compiled execution path for re-use. What I posted was Gert Drapers' PowerPoint Presentation to illustrate. Gert Drapers, for those who don't know, is a Software Architect in the SQL Server Development group at Microsoft. He is involved in the architecture design and performance tuning of the most challenging applications and large-scale database deployments around the world; worked on both SQL Server 7.0 and 2000; first as Software Design Engineer in the SQL Server Storage Engine and later as the Product Unit Manager of the SQL Server Management Tools team and as Architect of SQL Server DTS. So, he certainly has the credentials, IMHO.
CommandType.Text = Language Event
You should not execute stored procs using CommandType.Text , that's what your "EXEC spName param param" example is doing.
You should use RPC Event for All procedure invocations
"CommandType.StoredProcedure" and add the correct SqlParameter objects to the Command.
There are numerous reasons to do this, many are illustrated in the downloadable Powerpoint Presentation, which you are welcome (and encouraged) to view.
Here are some of the notes (slightly edited) that I took of Mr. Drapers' presentation at Tech-Ed 2004:
First, we need to make a decision which command type to use. It is NOT a convenience issue. From a server persepective, there is a significant difference.:
IN SQL Server you have language events, and RPC events, This is a split code path. In other words, If you want to run a stored procedure, you want to be sure that you both start and end inside the RPC Event source code path. You do this by explicitly setting commandType.StoredProcedure in the ADO.NET SQLClient class. You don't want to execute it as a Language Event, because that is not going to trigger the most optimal code path inside SQL Server.
If SQL Profiler shows you a SQL Statement starting event and then you see anything else than INSERT, UPDATE, DELETE, or if you see "EXECUTE" you are probably dealing with a stored procedure. That means that your program is actually choosing the wrong code path.
Parameter re-use: If you send in a "EXEC" - type select concatenated text vs Parameterized Command, the execution plan gets recompiled. But , when you use named parameters in ADO.NET, you are actually creating a Template. This Template is how SQL Server achieves plan re-use and does not have to recompile the execution plan for the query.
You can actually see this plan re-use and optimization by querying select * from master.dbo.syscacheobjects . In SyscacheObjects, there is a Sql column and a usecounts column. Sql holds the sql fragment, and usecounts shows how many times the SQL Engine has been able to re-use the code path without having to recompile, resulting in more optimized code and lower CPU utilization.
So, for example if you run 10,000 stored proc invocations with different (or even the same) parameter values using CommandType.Text and "EXEC SPNAME param1 param2" etc., these will be usually recompiled every single time (depending on whether autoparameterization is used by SQL Server). This burns up CPU cycles. Event though the performance difference may not be readily apparent, you can bet that this is an inefficient way to do it. When you use CommandType.StoreProcedure with SqlParameters, it gets compiled ONCE - and re-used by the Sql Server engine every subsequent time. This ain't rocket science!
Parameterized select statements are normally significantly faster than "text - only" select statements.
And of course, with Stored Procedures and selects / inserts, using fully parameterized execution results in significant performance increases.
Client - side Parameter Caching
You can also use a Client-side caching mechanism for SqlParameters. The MS Data Application Blocks "SqlHelper" class already has a very nice one built in. By caching the parameter set for each stored procedure, I've seen performance increases ; it seems that the creation of ADO.NET SqlParameter objects is expensive, and if we can cache and re-use them , there are substantial economies of throughput to be obtained.
If a multi-insert operation is bracketed within a transaction, the deferred writing to the transaction log creates much enhanced throughput. Drapers illustrated this in real time in his Tech-Ed presentation and I was, well - trust me, its hugely faster! Also client side Parameter Caching (a.k.a. SqlHelper) can increase performance.
What others have to say about the subject:
Kirk Allen Evans (Excerpts):
Databases create a query plan or "internal road map" to determine the quickest and most efficient route to deliver the requested data. The database's cache manager compiles the query plan and stores it in its cache when you execute a query for the first time. If you execute the same query text again, the database retrieves the query plan from the cache and uses it to retrieve the data. But if the query differs by even one character, such as the data in a WHERE clause, the database must compile the query again, generate a new query plan, and store the query and plan in its cache.
If an application searches for a customer by name, many different cache listings will appear in syscacheobjects for essentially the same query.
A more efficient approach than forcing the cache manager to recompile queries each time is to reuse the same query multiple times using parameters. This saves CPU cycles because it avoids recreating cache entries. SQL Server provides an excellent mechanism for doing this through its implementation of stored procedures. Create a stored procedure that accepts a variable and use the variable in the SQL WHERE clause.
The real performance difference occurs with complex SQL statements that are inlined, where the WHERE clause varies only by its values. This causes discrete cache entries into the cache plan, meaning the query is parsed for execution plan each time the query is run.
The Prepare statement caches the statement and indicates to the provider that the same statement will be run many times. The provider may or may not perform optimimizations to support this, which SQL Server does. By executing this each time within the loop, you cause extra processing. Move this out of the loop, because you only have to prepare it once.
Second, you will get better performance by using a parameterized SQL statement by replacing the literal values with question marks and using parameters in place of the literal values. The parameters can then be reused as well, where you simply set the values during each iteration.
Third, you can enhance performance a little more if your provider supports the .Prepared property for the Command object. The .Prepared property pre-compiles the SQL statement and caches it, avoiding parses for invaalid SQL statements.
One more comment: you are not going to see a noticeable time difference between using parameterized SQL queries that simply call a stored proc over using a CommandType of StoredProc. The real difference occurs back on the server, where the procedure is parsed and cached.
Jesse Ezell:
In any case, ALWAYS use parameterized queries. Anything else is a rediculously [sic] lame decision.
Barry Gervin:
In terms of SQLServer and Oracle, there are no implicit performance improvements between a stored proc and a parameterized inline sql statement. Both of them have their execution plans compiled at first execution - and then reused.
In SqlServer - with non-parameterized sql - the syntax will be parsed and auto-parameterized - so 2nd & nth executions on similar sql (different hard coded values) should exhibit performance on scale with parameterized sql (or stored procs).
Sometimes the auto-parming doing by Sql isn't as efficient as a developer could manually parm it. Think of a case when you have a constant in the query that neverchanges: where inactive="n" and state="CA". SqlServer will auto-parm the "n" and the "CA" but you know that "n" will never change. That would lead to not as efficient parameterization.
And Finally, an example of using the Prepare Statement--
public void SqlCommandPrepareEx() {
int id = 20;
string desc = "myFirstRegion" ;
SqlConnection rConn = new SqlConnection("Persist Security Info=False;Integrated Security=SSPI;database=northwind;server=mySQLServer");
rConn.Open();
SqlCommand command = new SqlCommand(null, rConn);
// Create and prepare an SQL statement.
command.CommandText = "insert into Region (RegionID, RegionDescription) values (@id, @desc)" ;
command.Parameters.Add ( "@id", id) ;
command.Parameters.Add ( "@desc", desc) ;
command.Prepare() ; // Calling Prepare after having set the Commandtext and parameters.
command.ExecuteNonQuery();
// Note- this command is now "Templatized- if you change parameter values and call it again,
// SQL Server will re-use the execution path without needing to recompile.
// Change parameter values and call ExecuteNonQuery.
command.Parameters[0].Value = 21;
command.Parameters[1].Value = "mySecondRegion";
command.ExecuteNonQuery();
}
Some useful links about all this:
SyscacheObjects( MSDN):
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sys-c_3dyr.asp
Kimberly Tripp:
http://www.sqlskills.com/blogs/kimberly/CategoryView.aspx?category=Optimizing%20Procedural%20Code
Kalen Delaney ("Inside SQL Server 2000"):
http://www.winnetmag.com/SQLServer/Articles/ArticleID/6113/pg/2/2.html
Andrew Novick:
http://www.databasejournal.com/features/mssql/article.php/2203601
SQL Server 2005:
http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx
The Bottom Line!
The lesson from all this is that just because something "works" and you have gotten used to doing it doesn't mean you automatically have the right to become a self-proclaimed "guru" and recommend it to everybody else in the face of facts, figures and arithmetic that explain why what you are doing is NOT "best practices code". And if you are obstinate and refuse to accept the facts, digging in to protect your already indefensible position, shame on you! Don't use CommandType.Text to execute stored procedures. If you are lazy and you don't like to type, use the SqlHelper class with (connectionstring,spname,object[] parmvalues) and your SqlParameters will be automatically CACHED for you! Finally, ALWAYS parameterize your text sql queries with parameter placeholders and REAL SqlParameters!
I had chimed in on his answer to a poster who was having difficulty with a sproc call, and he suggested "you don't need to use the Command Object" and proceeded to show an illustration like the above.
Even after I had provided some pretty good background material on why this is not a good idea, he continued to hammer on little pieces of the subject, in effect missing the big picture. There have been many times I've been corrected. As I've grown older (and hopefully wiser) I've learned to put my ego in my back pocket and thank the person for the education. And you? I guess what I am trying to say is this; you have two MVPs who didn't know about this. The only reason I found out about it is because I was lucky enough to attend Tech-Ed 2004 and sit in Gert Drapers' presentation. So if two C# MVP's weren't aware of it, how many other people do you suppose are writing whole applications using Language Event processing instead of RPC Event processing as they should?
The bottom line is, you should never use CommandType.Text and concatenated "EXEC SPNAME PARM PARM" convention. In fact, even if you are not using stored procedures at all - you should parameterize your text queries - because the SQL Server engine is smart enough to "Template-ize" them - and cache the compiled execution path for re-use. What I posted was Gert Drapers' PowerPoint Presentation to illustrate. Gert Drapers, for those who don't know, is a Software Architect in the SQL Server Development group at Microsoft. He is involved in the architecture design and performance tuning of the most challenging applications and large-scale database deployments around the world; worked on both SQL Server 7.0 and 2000; first as Software Design Engineer in the SQL Server Storage Engine and later as the Product Unit Manager of the SQL Server Management Tools team and as Architect of SQL Server DTS. So, he certainly has the credentials, IMHO.
CommandType.Text = Language Event
You should not execute stored procs using CommandType.Text , that's what your "EXEC spName param param" example is doing.
You should use RPC Event for All procedure invocations
"CommandType.StoredProcedure" and add the correct SqlParameter objects to the Command.
There are numerous reasons to do this, many are illustrated in the downloadable Powerpoint Presentation, which you are welcome (and encouraged) to view.
Here are some of the notes (slightly edited) that I took of Mr. Drapers' presentation at Tech-Ed 2004:
First, we need to make a decision which command type to use. It is NOT a convenience issue. From a server persepective, there is a significant difference.:
IN SQL Server you have language events, and RPC events, This is a split code path. In other words, If you want to run a stored procedure, you want to be sure that you both start and end inside the RPC Event source code path. You do this by explicitly setting commandType.StoredProcedure in the ADO.NET SQLClient class. You don't want to execute it as a Language Event, because that is not going to trigger the most optimal code path inside SQL Server.
If SQL Profiler shows you a SQL Statement starting event and then you see anything else than INSERT, UPDATE, DELETE, or if you see "EXECUTE
Parameter re-use: If you send in a "EXEC" - type select concatenated text vs Parameterized Command, the execution plan gets recompiled. But , when you use named parameters in ADO.NET, you are actually creating a Template. This Template is how SQL Server achieves plan re-use and does not have to recompile the execution plan for the query.
You can actually see this plan re-use and optimization by querying select * from master.dbo.syscacheobjects . In SyscacheObjects, there is a Sql column and a usecounts column. Sql holds the sql fragment, and usecounts shows how many times the SQL Engine has been able to re-use the code path without having to recompile, resulting in more optimized code and lower CPU utilization.
So, for example if you run 10,000 stored proc invocations with different (or even the same) parameter values using CommandType.Text and "EXEC SPNAME param1 param2" etc., these will be usually recompiled every single time (depending on whether autoparameterization is used by SQL Server). This burns up CPU cycles. Event though the performance difference may not be readily apparent, you can bet that this is an inefficient way to do it. When you use CommandType.StoreProcedure with SqlParameters, it gets compiled ONCE - and re-used by the Sql Server engine every subsequent time. This ain't rocket science!
Parameterized select statements are normally significantly faster than "text - only" select statements.
And of course, with Stored Procedures and selects / inserts, using fully parameterized execution results in significant performance increases.
Client - side Parameter Caching
You can also use a Client-side caching mechanism for SqlParameters. The MS Data Application Blocks "SqlHelper" class already has a very nice one built in. By caching the parameter set for each stored procedure, I've seen performance increases ; it seems that the creation of ADO.NET SqlParameter objects is expensive, and if we can cache and re-use them , there are substantial economies of throughput to be obtained.
If a multi-insert operation is bracketed within a transaction, the deferred writing to the transaction log creates much enhanced throughput. Drapers illustrated this in real time in his Tech-Ed presentation and I was, well - trust me, its hugely faster! Also client side Parameter Caching (a.k.a. SqlHelper) can increase performance.
What others have to say about the subject:
Kirk Allen Evans (Excerpts):
Databases create a query plan or "internal road map" to determine the quickest and most efficient route to deliver the requested data. The database's cache manager compiles the query plan and stores it in its cache when you execute a query for the first time. If you execute the same query text again, the database retrieves the query plan from the cache and uses it to retrieve the data. But if the query differs by even one character, such as the data in a WHERE clause, the database must compile the query again, generate a new query plan, and store the query and plan in its cache.
If an application searches for a customer by name, many different cache listings will appear in syscacheobjects for essentially the same query.
A more efficient approach than forcing the cache manager to recompile queries each time is to reuse the same query multiple times using parameters. This saves CPU cycles because it avoids recreating cache entries. SQL Server provides an excellent mechanism for doing this through its implementation of stored procedures. Create a stored procedure that accepts a variable and use the variable in the SQL WHERE clause.
The real performance difference occurs with complex SQL statements that are inlined, where the WHERE clause varies only by its values. This causes discrete cache entries into the cache plan, meaning the query is parsed for execution plan each time the query is run.
The Prepare statement caches the statement and indicates to the provider that the same statement will be run many times. The provider may or may not perform optimimizations to support this, which SQL Server does. By executing this each time within the loop, you cause extra processing. Move this out of the loop, because you only have to prepare it once.
Second, you will get better performance by using a parameterized SQL statement by replacing the literal values with question marks and using parameters in place of the literal values. The parameters can then be reused as well, where you simply set the values during each iteration.
Third, you can enhance performance a little more if your provider supports the .Prepared property for the Command object. The .Prepared property pre-compiles the SQL statement and caches it, avoiding parses for invaalid SQL statements.
One more comment: you are not going to see a noticeable time difference between using parameterized SQL queries that simply call a stored proc over using a CommandType of StoredProc. The real difference occurs back on the server, where the procedure is parsed and cached.
Jesse Ezell:
In any case, ALWAYS use parameterized queries. Anything else is a rediculously [sic] lame decision.
Barry Gervin:
In terms of SQLServer and Oracle, there are no implicit performance improvements between a stored proc and a parameterized inline sql statement. Both of them have their execution plans compiled at first execution - and then reused.
In SqlServer - with non-parameterized sql - the syntax will be parsed and auto-parameterized - so 2nd & nth executions on similar sql (different hard coded values) should exhibit performance on scale with parameterized sql (or stored procs).
Sometimes the auto-parming doing by Sql isn't as efficient as a developer could manually parm it. Think of a case when you have a constant in the query that neverchanges: where inactive="n" and state="CA". SqlServer will auto-parm the "n" and the "CA" but you know that "n" will never change. That would lead to not as efficient parameterization.
And Finally, an example of using the Prepare Statement--
public void SqlCommandPrepareEx() {
int id = 20;
string desc = "myFirstRegion" ;
SqlConnection rConn = new SqlConnection("Persist Security Info=False;Integrated Security=SSPI;database=northwind;server=mySQLServer");
rConn.Open();
SqlCommand command = new SqlCommand(null, rConn);
// Create and prepare an SQL statement.
command.CommandText = "insert into Region (RegionID, RegionDescription) values (@id, @desc)" ;
command.Parameters.Add ( "@id", id) ;
command.Parameters.Add ( "@desc", desc) ;
command.Prepare() ; // Calling Prepare after having set the Commandtext and parameters.
command.ExecuteNonQuery();
// Note- this command is now "Templatized- if you change parameter values and call it again,
// SQL Server will re-use the execution path without needing to recompile.
// Change parameter values and call ExecuteNonQuery.
command.Parameters[0].Value = 21;
command.Parameters[1].Value = "mySecondRegion";
command.ExecuteNonQuery();
}
Some useful links about all this:
SyscacheObjects( MSDN):
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sys-c_3dyr.asp
Kimberly Tripp:
http://www.sqlskills.com/blogs/kimberly/CategoryView.aspx?category=Optimizing%20Procedural%20Code
Kalen Delaney ("Inside SQL Server 2000"):
http://www.winnetmag.com/SQLServer/Articles/ArticleID/6113/pg/2/2.html
Andrew Novick:
http://www.databasejournal.com/features/mssql/article.php/2203601
SQL Server 2005:
http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx
The Bottom Line!
The lesson from all this is that just because something "works" and you have gotten used to doing it doesn't mean you automatically have the right to become a self-proclaimed "guru" and recommend it to everybody else in the face of facts, figures and arithmetic that explain why what you are doing is NOT "best practices code". And if you are obstinate and refuse to accept the facts, digging in to protect your already indefensible position, shame on you! Don't use CommandType.Text to execute stored procedures. If you are lazy and you don't like to type, use the SqlHelper class with (connectionstring,spname,object[] parmvalues) and your SqlParameters will be automatically CACHED for you! Finally, ALWAYS parameterize your text sql queries with parameter placeholders and REAL SqlParameters!
Obstinacy is the result of the will forcing itself into the place of the intellect.
--Arthur Schopenhauer 1788-1860, German Philosopher .
--Arthur Schopenhauer 1788-1860, German Philosopher .
Peter,
ReplyDeleteThanks for putting this all together. I've been fighting with other developers at my company over this for months. We have hundreds of ADO.NET stored proc and non-parameterized Sql calls that really need to be re-written. Now I have some ammunition.
Jim McMahon
Does the SqlHelper have an override that allows you to use parameterized sql statements?
ReplyDeleteYes. Download yourself a copy and try it out. All your questions will be magically answered!
ReplyDeletePeter,
ReplyDeleteThanks for the detailed information regarding this issue. Sometimes developers do not see the need for this optimization as there is always a large iron sitting in the server room and on top of there are no hard figures on the benefit. I have developed a solution where the DAL only calls a single procedure sp_executesql. The real business SP becomes a paramter and input data also added as additional parameters to sp_executesql. This solution freed me from using DAB and the overhead of maintaining the cache of sqlparameter objects in memory.
The sp_executesql stored procedure certainly permits SQL Server to reuse execution plans if the same query is executed again using different parameters.
ReplyDeleteHowever, what you describe still sounds like a bit more of a shortcut type of hack than a really robust implementation.
Peter, recently I have to use the WITH RECOMPILE option in a SP.
ReplyDeleteProblem: When I execute the SP form SSMS it works fine and quick. But when ADO.NET does, I get a Timeout.
After a couple of hours I tried this: http://arsalantamiz.blogspot.com/2008/08/query-timeout-in-adonet-but-not-in-sql.html
And I fixed it.
Why? well... really a I didn´t know, but since I disabled SQL Server execution plan cache for that SP, it works (not so quick) but works.
Can you please tell me why it happens?
I did use RPC events. CommandType.StoredProcedure.
Peter, recently I had to use WITH RECOMPILE option in a SQL 2005 SP.
ReplyDeleteProblem: When I executed the SP from SSMS it works fine and quick (near 1 second). But when ADO.NET does, I get a Timeout.
After a couple of hours I tried this: http://arsalantamiz.blogspot.com/2008/08/query-timeout-in-adonet-but-not-in-sql.html
And I fixed it.
Why? well... really I didn´t know, but since I disabled SQL Server execution plan cache for that SP, it works (not so quick) but works.
Can you please tell me why it happens with ADO.NET and dont with SSMS? Even when I ran the same command that ADO.NET does (captured from SQL Profiler).
I did use CommandType.StoredProcedure.
Best regards,
Alexander
Great article. Based on the date I am sure it was written for ASP.Net 1.1 and SQL Server 2000. How much has changed today given the upgrades to ASP.Net 3.5 and 4.0 and SQL Server 2008 R2? I am hoping you will catch this via comment moderation since the blog has been archived, since I was not able to find any other way to submit the question.
ReplyDeletePeter, I understood that it's wrong to use stored procedures with the CommandType.Text and also with literals because it's not just about performance issue, it's also about sql injection. I know it's better to use SqlParametrized SqlCommands with Procedures and CommandType.StoreProcedure. But what about statements without procedures like select, insert, update, delete with SqlParameters and CommandType.Text. I mean
ReplyDelete...
cn.Open();
SqlParameter prm = new SqlParameter("@type", "1");
SqlCommand cmd = new SqlCommand("select top 1000000 * from sales where inventorytype = @inv", cn);
cmd.Parameters.Add(prm);
cmd.CommandType = CommandType.Text;
SqlDataReader drd = cmd.ExecuteReader(CommandBehavior.CloseConnection);
...
Because I recently made a comparison between the example above and another with the statement inside a procedure called usp_sales and cmd.CommandType.StoredProcedure many times measured in milliseconds and the results showed me that there's a slight difference in time even the executed statement without procedure sometimes is faster. So In this scenario, is there a performance difference between this 2-methods calling? Besides my security and encapsulation code business rules from DB concerns, I also want for fast application for my users. What should I use?
Thanks in advance