From SQL to Linq


Warning: Illegal string offset 'filter' in /var/sites/t/theproactiveprogrammer.com/public_html/wp-includes/taxonomy.php on line 1409

This week I decided that I wanted to know how to write queries in Linq. I am not talking about the Linq extension methods, which I have used extensively, but the query syntax which uses keywords such as “from” and “select”. I wasn’t sure if this would be a particularly useful skill, as I have always coped well enough using just the extension methods, however it bugged me a little that I had never really got to grips with Linq queries and wasn’t sure exactly how the syntax worked. I am comfortable writing queries in SQL so decided to create some SQL queries and learn how I would write the corresponding queries using Linq queries. This week’s blog post is essentially my notes from this exercise.

First things first, let’s get our definition out of the way. Linq is short for Language Integrated Query. It is a Microsoft .NET technology used for querying data. Whereas SQL can be used to update and delete data, Linq is just for querying data (although you can obviously update and delete query results using other .NET commands). In SQL we query database tables, and in Linq we query .NET collections. It is slightly odd to compare SQL to Linq, as the two operate in fundamentally different environments, however I found it useful to help me understand the Linq query syntax.

Setting Up the Data

Now for some sample data.

All the SQL below can be found here, and all the C# here.

Let’s say we have two types of objects: Products and Orders. An Order is associated with one or more Products.

In C# our classes look like this:

…or in SQL we have tables defined like this:

Notice how because we have a many-to-many relationship, we need to define 3 tables in SQL, but need only define 2 entities in C#.

Now for the actual data.

C#

SQL

Notice how I have used a ProductIds collection in the Order class, rather than a Products collection. This is purely to allow the exploration of the join capabilities of Linq.

Let’s now look at some of the most common use cases for querying our data.

Selecting Everything

SQL

C#

Nothing too interesting here. In Linq the select part comes at the end, rather than the beginning of the query. This feels less intuitive to me as the whole query sounds less like a natural English sentence than its SQL equivalent.

Selecting Specific Columns

SQL

C#

This case shows some of the inherent differences between Linq and SQL. Linq is built with object-orientation in mind. In SQL you can easily specify which columns you want to return, as your result is always a dataset. Because Linq is a part of .NET, which is an object-oriented framework, its queries return collections of objects (and the collections themselves are also objects). So if all we want are the Product Ids and Names, we would have to do something strange such as the anonymous type notation used here. In reality it is very unlikely that you would use C# in this way, you would instead return the complete Product objects and just access whichever properties you needed. This could however present performance problems if the objects being returned are large and complex, and you only need to access a property or two.

When we select something other than the actual objects being queried, this is called a projection.

Filtering (WHERE)

SQL

C#

Straightforward enough.

Ordering

SQL

C#

Again, not much difference here.

Aggregate Functions

SQL

C#

Regarding aggregate functions, I was slightly surprised to discover that there is no way to achieve the desired result without using extension methods. This is because statements built purely from the query syntax always return a collection, and since aggregate functions return a single value an extension method is required. In SQL, queries always return a dataset. If your SQL query returns a single value, then it actually returns a dataset consisting of one column and one row.

Joins

SQL

C#

Because of the many-to-many relationship between Orders and Products, we need to create an inner query in order to obtain all Products related to an Order. When we use the join keyword in Linq, we must use it with an equals operator.

Selecting Unique Values (Distinct)

SQL

C#

Again we need to use an extension method in Linq to retrieve distinct values.

Selecting the First / Top n Results

SQL

C#

Finally, to retrieve the “top” n results only, we again need to use an extension method.

Conclusions

Having now learnt the basics of the Linq query syntax, I must admit I am not a fan. Firstly, I find the Linq syntax to be less intuitive than SQL. It does not seem to match the structure of English sentences well enough. Secondly, it feels unnatural to write this kind of code in C#. The Linq extension methods are not only easier to read and write, but also more intuitively C#-shaped. Off the top of my head I can’t think of any scenario which would not be supported by extension methods.

Here are all the queries above written using extension methods only:

Consequently it may well be the case that this is the last time I will ever write a query using Linq query syntax. Still, it was fun to explore.

Share Button