A query is an expression that retrieves data from a data source. Queries are usually expressed in a specialized query language. Different languages have been developed over time for the various types of data sources, for example SQL for relational databases and XQuery for XML. Therefore, developers have had to learn a new query language for each type of data source or data format that they must support. LINQ simplifies this situation by offering a consistent model for working with data across various kinds of data sources and formats. In a LINQ query, you are always working with objects. You use the same basic coding patterns to query and transform data in XML documents, SQL databases, ADO.NET Datasets, .NET collections, and any other format for which a LINQ provider is available.
Three Parts of a Query Operation
All LINQ query operations consist of three distinct actions:
Three Parts of a Query Operation
All LINQ query operations consist of three distinct actions:
- Obtain the data source.
- Create the query.
- Execute the query.
class IntroToLINQ { static void Main() { // The Three Parts of a LINQ Query: // 1. Data source. int[] numbers = new int[7] { 0, 1, 2, 3, 4, 5, 6 }; // 2. Query creation. // numQuery is an IEnumerable<int> var numQuery = from num in numbers where (num % 2) == 0 select num; // 3. Query execution. foreach (int num in numQuery) { Console.Write("{0,1} ", num); } } }
The following illustration shows the complete query operation. In LINQ the execution of the query is distinct from the query itself; in other words you have not retrieved any data just by creating a query variable.
- The Data Source
In the previous example, because the data source is an array, it implicitly supports the generic IEnumerable(Of T) interface. This fact means it can be queried with LINQ. A query is executed in a foreach statement, and foreach requires IEnumerable or IEnumerable(Of T). Types that support IEnumerable(Of T) or a derived interface such as the generic IQueryable(Of T) are called queryable types.
A queryable type requires no modification or special treatment to serve as a LINQ data source. If the source data is not already in memory as a queryable type, the LINQ provider must represent it as such. For example, LINQ to XML loads an XML document into a queryable XElement type:
// Create a data source from an XML document. // using System.Xml.Linq; XElement contacts = XElement.Load(@"c:\myContactList.xml");
Note: Types such as ArrayList that support the non-generic IEnumerable interface can also be used as a LINQ data source. For more information, see How to: Query an ArrayList with LINQ.
The query specifies what information to retrieve from the data source or sources. Optionally, a query also specifies how that information should be sorted, grouped, and shaped before it is returned. A query is stored in a query variable and initialized with a query expression. To make it easier to write queries, C# has introduced new query syntax.
The query in the previous example returns all the even numbers from the integer array. The query expression contains three clauses: from, where and select. (If you are familiar with SQL, you will have noticed that the ordering of the clauses is reversed from the order in SQL.) The from clause specifies the data source, the where clause applies the filter, and the select clause specifies the type of the returned elements. These and the other query clauses are discussed in detail in the LINQ Query Expressions (C# Programming Guide) section. For now, the important point is that in LINQ, the query variable itself takes no action and returns no data. It just stores the information that is required to produce the results when the query is executed at some later point. For more information about how queries are constructed behind the scenes, see Standard Query Operators Overview.
Note: Queries can also be expressed by using method syntax. For more information, see LINQ Query Syntax versus Method Syntax (C#).
Deferred Execution
As stated previously, the query variable itself only stores the query commands. The actual execution of the query is deferred until you iterate over the query variable in a foreach statement. This concept is referred to as deferred execution and is demonstrated in the following example:
// Query execution. foreach (int num in numQuery) { Console.Write("{0,1} ", num); }
Because the query variable itself never holds the query results, you can execute it as often as you like. For example, you may have a database that is being updated continually by a separate application. In your application, you could create one query that retrieves the latest data, and you could execute it repeatedly at some interval to retrieve different results every time.
Forcing Immediate Execution
Queries that perform aggregation functions over a range of source elements must first iterate over those elements. Examples of such queries are Count, Max, Average, and First. These execute without an explicit foreach statement because the query itself must use foreach in order to return a result. Note also that these types of queries return a single value, not an IEnumerable collection. The following query returns a count of the even numbers in the source array:
var evenNumQuery = from num in numbers where (num % 2) == 0 select num; int evenNumCount = evenNumQuery.Count();
List<int> numQuery2 = (from num in numbers where (num % 2) == 0 select num).ToList(); // or like this: // numQuery3 is still an int[] var numQuery3 = (from num in numbers where (num % 2) == 0 select num).ToArray();