Wednesday, January 4, 2012

Liking Linq-to-SQL

Prior to joining Rubicite, I hand-crafted every SQL query that I used in web development. In fact, I often tested my queries for accuracy of results and performance by replacing any variables with set values and then running them through a command-line interface to the database. I took steps to prevent problems like SQL Injection, but didn't really even take that as far as I should have; in relatively few cases did I use parameterized queries (preferring my own functions for white-list input validation, filtering, and testing variables in an SQL query). Part of this was based on the issues I have seen in past code from coworkers that was overly reliant on database frameworks, resulting in queries that retrieved millions of records to display a couple dozen entries.

As I moved into the ASP.Net world, there was a temptation to stick with what I've always done. The support for parameterized queries is more integrated into the base language structure, and initially I assumed that would be enough to satisfy me. Others at Rubicite disagreed, and pushed me to learn how to use LINQ to SQL. Now, I am glad they did.

I am not going to go over the details of the Linq-to-SQL methodology and implementation. I'll touch on a few highlights while discussing the things I like about it, but for full details check out Scott Gu's 9-part tutorial.

Basically, the components of a Linq-to-SQL implementation include a database model that is loaded into your .Net project, coupled with LINQ queries in your code. LINQ queries look rather different than a typical SQL query, but once you wrap your mind around the structure, they are just as easy to craft. Here's an example:

Rewriting that query in Linq-to-SQL looks like:


As you can see, the two are pretty similar. I included more than just the LINQ query there to illustrate that you must define and reference a datacontext on which to base the query. That comes from the database model you loaded into the project. I left out any protections from the SQL version, but clearly if you were using this query in real life, you would want to employ some of the techniques I mentioned in my first paragraph.

So, why would I choose the LINQ version over the SQL version? On the surface, the two look pretty similar. All things being equal, most would probably stick with the SQL version as it is more likely to be familiar. I certainly would. Except that all things aren't equal. You see, in the SQL version, there's the matter of implementing the security techniques noted above. That means wrapping the query with some kind of data structure or package to do the execution, and then adding logic and/or parameterization code to ensure we aren't opening ourselves up for a world of hurt. In the LINQ query, there is no need for a separate data structure. Instead, LINQ integrates into the base language to enable strong typing directly in the query itself. That means that if the type of `t.ValueField` does not match the type of `targetValue`, then the code will fail to complie. It knows these types because it is running directly in .Net and has loaded the types from the database into the datacontext. This is a big win.

Another big win is the simplicity of dealing with the data afterwards. Rather than having an obscure resultset with a generic set of columns that may or may not actually exist, the result of a LINQ query is an object (or, rather, a collection of objects). Each column of the results is automatically built into the resulting object as a parameter. So in later code, I could access the ValueField of the first result object through `Results.first().ValueField`. Types are preserved in these objects, improving the interaction in the related code quite a bit over what you would have in a generic resultset. Certainly, you could build your own object to create from a generic resultset of an SQL query, and that would provide you many of the same benefits, but there's no comparison to the speed and ease of use that the LINQ method gives you. And should you prefer to have a custom object for the LINQ query (to provide custom methods, perhaps), you can easily do that as well.

The two drawbacks I have found to LINQ are handling of complex queries and requirement to keep the database model up-to-date. Really complex queries can result in a mess of a query, and might even give give some performance issues. The solution I use is to move queries of this complexity to stored procedures, which is probably the right thing to do in a traditional SQL approach as well. Keeping the database model up-to-date is an extra step required any time you create a new table or alter the structure of an existing one. It is a simple as a few mouse-clicks to delete the old model of the table and add the new one(s) in. Definitely worth it for all the other benefits.

No comments:

Post a Comment