Leniel Maccaferri's blog: LINQ


Counting Value Frequency with LINQ (code snippet)

Posted by Leniel Maccaferri on 9/09/2010 12:26:00 AM

This post is a code snippet: a short code example that demonstrates a topic.

From now on I’ll use this blog to share useful code snippets as I write them.

A question on StackOverflow motivated me to write a simple C# console application that demonstrates a way of counting how often a value appears within a given set. I used LINQ to accomplish this.

I’ve commented the code so that it’s easier to understand what’s going on.

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Creating a list of Values that serves as the data source for LINQ query.
        List<int> values = new List<int>()
        {
            0, 1, 2, 3, 6, 2, 3, 1, 0, 4, 5, 7, 8, 9, 3, 7, 5
        };

        // For each Value in the data source, let's group by Value into g. g stores all Values grouped.
        // Using the select new constructor we create an Anonymous Type that has each distinct Value and its Count (how often it appears).
        var query = from v in values
                    group v by v into g
                    select new { Value = g.Key, Count = g.Count() };

        // A simple foreach that prints each Value and its frequency to the screen.
        foreach (var v in query)
        {
            Console.WriteLine("Value = {0}, Count = {1}", v.Value, v.Count);
        }
    }
}

This is the output:

Value = 0, Count = 2
Value = 1, Count = 2
Value = 2, Count = 2
Value = 3, Count = 3
Value = 6, Count = 1
Value = 4, Count = 1
Value = 5, Count = 2
Value = 7, Count = 2
Value = 8, Count = 1
Value = 9, Count = 1

Hope you enjoy and make good use of this code.

Feel free to comment and add any other way of doing this in a more efficient or elegant fashion.
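As one possible alternative, here is a minimal sketch of the same count written with LINQ method syntax. The FrequencyCounter class and its CountFrequencies method are names I made up for this illustration; they are not part of the original snippet. The result goes into a Dictionary so that a count can be looked up by value later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FrequencyCounter
{
    // Hypothetical helper (not in the original snippet):
    // maps each distinct value to the number of times it appears.
    public static Dictionary<int, int> CountFrequencies(IEnumerable<int> values)
    {
        return values
            .GroupBy(v => v)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}

class FrequencyDemo
{
    static void Main()
    {
        var values = new List<int> { 0, 1, 2, 3, 6, 2, 3, 1, 0, 4, 5, 7, 8, 9, 3, 7, 5 };

        // Sort by descending count, then by value, so the most frequent values come first.
        foreach (var pair in FrequencyCounter.CountFrequencies(values)
                     .OrderByDescending(p => p.Value)
                     .ThenBy(p => p.Key))
        {
            Console.WriteLine("Value = {0}, Count = {1}", pair.Key, pair.Value);
        }
    }
}
```

Whether this is more elegant than the query expression is a matter of taste; both rely on the same GroupBy operator.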

Parallel LINQ (PLINQ) with Visual Studio 2010/2012 - Perf testing

Posted by Leniel Maccaferri on 11/02/2009 10:49:00 PM

On the last day of May I wrote about how to calculate prime numbers with LINQ in C#. To close that post I said that I’d use the PrimeNumbers delegate to evaluate PLINQ (Parallel LINQ) and measure the performance gains when the same calculation is done in parallel instead of in a sequential fashion.

PLINQ is LINQ executed in parallel, that is, using as much processing power as your computer has available.

If your computer has 2 processor cores, as a dual-core processor does, the Language Integrated Query operators will do their work in parallel using both cores.

Using "only" LINQ you won't get as much performance because the standard Language Integrated Query operators don't parallelize your code. That means your code runs serially and doesn't take advantage of all your available processor cores.

PLINQ provides many query operators capable of executing your code using well-known parallel patterns.

After this brief introduction to PLINQ let’s get to the code.

As promised, today I show the performance gains when the PrimeNumbers delegate is run on 2 cores (in parallel) instead of on only 1 core (sequentially).

Here’s the delegate code:

Func<int, IEnumerable<int>> PrimeNumbers = max =>
from i in Enumerable.Range(2, max - 1)
where Enumerable.Range(2, i - 2).All(j => i % j != 0)
select i;

To make it a candidate for parallelization, we just need to call the AsParallel() extension method on the data source, which enables parallelization for the query:

Func<int, IEnumerable<int>> PrimeNumbers = max =>
from i in Enumerable.Range(2, max - 1).AsParallel()
where Enumerable.Range(2, i - 2).All(j => i % j != 0)
select i;
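One thing to keep in mind is that PLINQ does not preserve the source order by default, so the primes may come out in a different order than in the sequential version. The sketch below is not part of my test; it assumes you want the results in ascending order and want to cap the number of concurrent tasks. The OrderedPrimes class and the degreeOfParallelism parameter are names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrderedPrimes
{
    // Same query as above, written in method syntax, with two optional PLINQ operators:
    // AsOrdered() keeps the results in source order, and
    // WithDegreeOfParallelism() caps the number of concurrently running tasks.
    public static List<int> Compute(int max, int degreeOfParallelism)
    {
        return Enumerable.Range(2, max - 1)
            .AsParallel()
            .AsOrdered()
            .WithDegreeOfParallelism(degreeOfParallelism)
            .Where(i => Enumerable.Range(2, i - 2).All(j => i % j != 0))
            .ToList();
    }
}

class OrderedPrimesDemo
{
    static void Main()
    {
        // 2 is an illustrative value that matches a dual-core machine.
        foreach (int prime in OrderedPrimes.Compute(30, 2))
        {
            Console.Write("{0} ", prime);
        }
    }
}
```

Preserving order has a cost, so it is probably best left out when the order of the results doesn't matter.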

I set up a simple test to measure the time elapsed when using the two possible ways of calling the delegate function, that is, sequentially in one core and parallelized in my two available cores (I have an Intel Pentium Dual Core E2180 @ 2.00 GHz / 2.00 GHz).

Let’s calculate the prime numbers that are less than 50000 sequentially and in parallel:

IEnumerable<int> result = PrimeNumbers(50000);

// Stopwatch lives in the System.Diagnostics namespace.
Stopwatch stopWatch = new Stopwatch();

stopWatch.Start();

foreach(int i in result)
{
    Console.WriteLine(i);
}

stopWatch.Stop();

// Write time elapsed
Console.WriteLine("Time elapsed: {0}", stopWatch.Elapsed);
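Note that the loop above also times Console.WriteLine, and console output is relatively slow. A possible variation, sketched below under the assumption that only the computation should be measured, counts the primes instead of printing them. The PrimeBenchmark class and its methods are hypothetical names, and this is not the code that produced the timings reported in this post.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class PrimeBenchmark
{
    // Counts the primes <= max sequentially.
    public static int CountSequential(int max)
    {
        return Enumerable.Range(2, max - 1)
            .Count(i => Enumerable.Range(2, i - 2).All(j => i % j != 0));
    }

    // Counts the primes <= max with PLINQ.
    public static int CountParallel(int max)
    {
        return Enumerable.Range(2, max - 1)
            .AsParallel()
            .Count(i => Enumerable.Range(2, i - 2).All(j => i % j != 0));
    }

    // Runs the given function once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<int> work, out int result)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        result = work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}

class BenchmarkDemo
{
    static void Main()
    {
        const int max = 50000;
        int sequentialCount, parallelCount;

        TimeSpan sequential = PrimeBenchmark.Time(() => PrimeBenchmark.CountSequential(max), out sequentialCount);
        TimeSpan parallel = PrimeBenchmark.Time(() => PrimeBenchmark.CountParallel(max), out parallelCount);

        Console.WriteLine("Sequential: {0} primes in {1}", sequentialCount, sequential);
        Console.WriteLine("Parallel:   {0} primes in {1}", parallelCount, parallel);
    }
}
```

A single run like this is still only a rough measurement; repeating it a few times and discarding the first (warm-up) run would give steadier numbers.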

Now the results:

1 core
Time elapsed: 00:00:06.0252929

2 cores
Time elapsed: 00:00:03.2988351

8 cores*
Time elapsed: 00:00:00.8143775

* read the Update addendum below

When running in parallel using #2 cores, the result was great: the app took about 55% of the time it took to run sequentially on only #1 core (3.30 s versus 6.03 s, a speedup of roughly 1.8x).

The whole work gets divided into two worker threads/tasks as shown in Figure 1:

Figure 1 - The Parallel Stacks window in Visual Studio 2010 ( #2 cores )

You can see that each thread is responsible for a range of values (the data is partitioned among the available cores). Thread 1 is evaluating the value 32983 and Thread 3 is evaluating 33073. Both threads run concurrently.

If I had a computer with 4 cores, the work would be divided into 4 threads/tasks and so on. If the time kept decreasing proportionally, the app would run in about 1.5 seconds (6.03 s / 4). Fantastic, isn’t it?

The new Microsoft Visual Studio 2010 (currently in Beta 2) comes with great debugging tools for parallel applications, such as the Parallel Stacks window shown in Figure 1 and the Parallel Tasks window shown in Figure 2:

Figure 2 - The Parallel Tasks window in Visual Studio 2010 ( #2 cores )

This post gives you a quick look at PLINQ and how it can leverage the power of your current and future hardware.

The future as foreseen by hardware industry specialists is a multicore future. So why not get ready for it right now? You certainly can with PLINQ. It abstracts away all the low-level code needed to run in parallel and lets you focus on what’s important: your business domain.

If you want to go deep using PLINQ, I advise you to read Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4 by Stephen Toub.

Updated on February 15, 2013

Running this same sample app on an Intel Core i7-3720QM 2.6GHz quad-core processor (with #4 cores and #8 threads), this is the result:

Time elapsed: 00:00:00.8143775

This is consistent with the #1 core and #2 cores tests shown above. The time is divided "almost" evenly by 8 if we compare it with the first benchmark ( only #1 core ):

00:00:06.0252929 / 8 = 0.7532

Of course there have been lots of improvements between these different processor generations. The software algorithms used to parallelize the work have also been improved (now I’m running on Visual Studio 2012 with .NET 4.5). Both the hardware and the software specs are higher now. Nonetheless, these numbers give good insight into the performance gains in terms of both hardware and software. Software developers like me have many reasons to celebrate!

Figure 3 - The Parallel Stacks window in Visual Studio 2012 ( #8 threads )

If I take out the .AsParallel() operator, the program runs on a single core and the time increases substantially:

Time elapsed: 00:00:03.4362160

Compared with the #4 cores benchmark above, we have:

00:00:03.4362160 / 4 = 0.8591

0.8591 – 0.8144 = 0.0447 (practically no difference)

Note: this faster processor running on a single core has a performance equivalent to the old Intel #2 core processor. Pretty interesting.
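For anyone who wants to redo the arithmetic, here is a small sketch that computes the speedup (sequential time divided by parallel time) and the efficiency (speedup divided by the number of workers) from the timings reported in this post. The Speedup class and its method names are made up for this illustration.

```csharp
using System;

static class Speedup
{
    // Speedup = sequential time / parallel time (both in seconds).
    public static double Of(double sequentialSeconds, double parallelSeconds)
    {
        return sequentialSeconds / parallelSeconds;
    }

    // Efficiency = speedup / number of workers (1.0 would be perfect scaling).
    public static double Efficiency(double speedup, int workers)
    {
        return speedup / workers;
    }
}

class SpeedupDemo
{
    static void Main()
    {
        // Timings in seconds, copied from the measurements in this post.
        const double e2180OneCore = 6.0252929;
        const double e2180TwoCores = 3.2988351;
        const double i7OneCore = 3.4362160;
        const double i7EightThreads = 0.8143775;

        double oldSpeedup = Speedup.Of(e2180OneCore, e2180TwoCores);
        double newSpeedup = Speedup.Of(i7OneCore, i7EightThreads);

        Console.WriteLine("E2180, 2 cores:       speedup {0:F2}, efficiency {1:F2}",
            oldSpeedup, Speedup.Efficiency(oldSpeedup, 2));
        Console.WriteLine("i7-3720QM, 8 threads: speedup {0:F2}, efficiency {1:F2}",
            newSpeedup, Speedup.Efficiency(newSpeedup, 8));
    }
}
```

On the i7 the speedup is measured against its own single-core run, which is why it comes out close to the number of physical cores rather than the number of hardware threads.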

References
Features new to parallel debugging in VS 2010
Debugging Task-Based Parallel Applications in Visual Studio 2010 by Daniel Moth and Stephen Toub

Great lecture on what to expect from the multicore and parallel future…
Slides from Parallelism Tour by Stephen Toub

PLINQ documentation on MSDN
http://msdn.microsoft.com/en-us/library/dd460688%28VS.100%29.aspx

Parallel Computing Center on MSDN
http://msdn.microsoft.com/en-us/concurrency/default.aspx

Daniel Moth’s blog
http://www.danielmoth.com/Blog/index.htm

Microsoft Visual Studio 2010
http://www.microsoft.com/visualstudio/en-us/products/2010/default.mspx

Finding missing numbers in a list using LINQ with C#

Posted by Leniel Maccaferri on 10/12/2009 08:07:00 PM

Let’s say you have a list of integer values that represent the days of a month like this:

6, 2, 4, 1, 9, 7, 3, 10, 15, 19, 11, 18, 13, 22, 24, 20, 27, 31, 25, 28

Clearly we have missing numbers/days in the list above. They are:

5 8 12 14 16 17 21 23 26 29 30

It’s really easy to get a list of missing numbers using LINQ with C# and the Except operator. LINQ is the greatest addition to the C# language. I can only imagine how difficult life would be if we didn’t have LINQ!

This is how I implemented a missing numbers finder using a C# extension method:

public static class MyExtensions
{
    /// <summary>
    /// Finds the missing numbers in a list.
    /// </summary>
    /// <param name="list">List of numbers</param>
    /// <returns>Missing numbers</returns>
    public static IEnumerable<int> FindMissing(this List<int> list)
    {
        // Sorting the list (note: this sorts the caller's list in place)
        list.Sort();

        // First number of the list
        var firstNumber = list.First();

        // Last number of the list
        var lastNumber = list.Last();

        // Range that contains all numbers in the interval
        // [ firstNumber, lastNumber ). lastNumber itself is left out,
        // which is fine because it is in the list and so can't be missing.
        var range = Enumerable.Range(firstNumber, lastNumber - firstNumber);

        // Getting the set difference
        var missingNumbers = range.Except(list);

        return missingNumbers;
    }
}
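The extension method above sorts the list it receives and throws on an empty list. If that is a concern, a possible variant is sketched below: it accepts any IEnumerable<int>, leaves the source untouched and returns an empty sequence for empty input. The names MissingNumbersExtensions and FindMissingNonMutating are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MissingNumbersExtensions
{
    // Hypothetical variant of FindMissing that does not modify its input.
    public static IEnumerable<int> FindMissingNonMutating(this IEnumerable<int> source)
    {
        // Materialize once so the source is enumerated a single time.
        List<int> numbers = source.ToList();

        if (numbers.Count == 0)
        {
            return Enumerable.Empty<int>();
        }

        int min = numbers.Min();
        int max = numbers.Max();

        // All numbers in the closed interval [min, max] that are not in the source.
        return Enumerable.Range(min, max - min + 1).Except(numbers);
    }
}

class MissingNumbersDemo
{
    static void Main()
    {
        var days = new List<int> { 6, 2, 4, 1, 9, 7, 3, 10 };

        // The list keeps its original order after the call.
        Console.WriteLine(string.Join(" ", days.FindMissingNonMutating().Select(n => n.ToString()).ToArray()));
    }
}
```

Min and Max each scan the list once, so this avoids the sort entirely.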

Now you can call the extension method in the following way:

class Program
{
    static void Main(string[] args)
    {
        // List of numbers
        List<int> daysOfMonth =
            new List<int>() { 6, 2, 4, 1, 9, 7, 3, 10, 15, 19, 11, 18, 13, 22, 24, 20, 27, 31, 25, 28 };

        Console.Write("\nList of days: ");

        foreach(var num in daysOfMonth)
        {
            Console.Write("{0} ", num);
        }

        Console.Write("\n\nMissing days are: ");

        // Calling the extension method on the List of type int
        foreach(var number in daysOfMonth.FindMissing())
        {
            Console.Write("{0} ", number);
        }
    }
}

This is the output:

List of days: 6 2 4 1 9 7 3 10 15 19 11 18 13 22 24 20 27 31 25 28

Missing days are: 5 8 12 14 16 17 21 23 26 29 30

In this simple program I’m using 3 concepts of the C# language that are really interesting: implicitly typed local variables, extension methods and collection initializers.

Hope this simple extension method to find the missing elements of a sequence helps the developers out there.

Visual Studio 2008 C# Console Application
You can get the Microsoft Visual Studio Project at:

http://leniel.googlepages.com/MissingNumbersFinder.zip

To try out the code you can use the free Microsoft Visual C# 2008 Express Edition that you can get at: http://www.microsoft.com/express/vcsharp/

Calculating prime numbers with LINQ in C#

Posted by Leniel Maccaferri on 5/31/2009 03:33:00 AM

If you want to see how to use PLINQ (Parallel LINQ) to get performance gains when performing this calculation, take a look at the post titled
Parallel LINQ (PLINQ) with Visual Studio 2010.

I’m posting how to calculate prime numbers using a LINQ query expression so that it can serve as a processing test in a follow-up post.

In mathematics, a prime number (or a prime) is a natural number which has exactly two distinct natural number divisors: 1 and itself. The first twenty-five prime numbers are:

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97

The following code [1] shows how to calculate the prime numbers that are less than or equal to ( <= ) a given number “max”:

Func<int, IEnumerable<int>> primeNumbers = max =>
     from i in Enumerable.Range(2, max - 1)
     where Enumerable.Range(2, i - 2).All(j => i % j != 0)
     select i;

IEnumerable<int> result = primeNumbers(10);

foreach(int i in result)
{
  Console.WriteLine(i);
}

In my final computer engineering assignment I presented LINQ - Language Integrated Query and its constructs. With such constructs it’s possible to achieve a high degree of method chaining.

LINQ is declarative, not imperative. It allows us to simply state what we want to do without worrying about how it is done.

In the code above we declare a Func<T, TResult> generic delegate which has an int as input parameter and an IEnumerable<int> as the result.
Func delegates are very useful for encapsulating user-defined expressions that are applied to each element in a set of source data.

Using a lambda expression ( => ) we assign a query expression to the delegate.
A lambda expression is an anonymous function that can contain expressions and statements, and can be used to create delegates or expression tree types.
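To make the delegate and lambda terminology a bit more concrete, here is a minimal sketch that is separate from the prime numbers code. The LambdaDemo class and the IsDivisibleBy delegate are names chosen only for this illustration.

```csharp
using System;

class LambdaDemo
{
    // A Func<int, int, bool> takes two ints and returns a bool.
    // The lambda on the right-hand side is the anonymous function assigned to it.
    public static readonly Func<int, int, bool> IsDivisibleBy =
        (number, divisor) => number % divisor == 0;

    static void Main()
    {
        Console.WriteLine(IsDivisibleBy(10, 5)); // True:  10 % 5 == 0
        Console.WriteLine(IsDivisibleBy(7, 2));  // False: 7 % 2 == 1
    }
}
```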

The query expression states that, for each value i in Enumerable.Range(2, max - 1), we select i when all elements j of the range Enumerable.Range(2, i - 2) satisfy the condition i % j != 0.

max is the delegate input parameter.

The % symbol is the modulus operator in C#. It computes the remainder after dividing its first operand by its second.

For example: considering max = 10 we’d have the following…

The range in the from clause is: { 2, 3, 4, 5, 6, 7, 8, 9, 10 }

Taking the 1st value out of the from range, we have i = 2.

The range in the where clause is Range (2, 2 - 2) = Range (2 , 0) = { }.

Since the where range is empty, All has nothing to test and returns true, so i = 2 is included in the result without any evaluation. Taking the 2nd value and assigning it to i, we have i = 3.
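The case i = 2 depends on how All treats an empty sequence: with no elements to check, it returns true. A quick way to see this is the sketch below, where HasNoDivisors is a hypothetical helper that simply repeats the where clause of the query.

```csharp
using System;
using System.Linq;

class EmptyAllDemo
{
    // True when no number in [2, i - 1] divides i; this mirrors the where clause of the query.
    public static bool HasNoDivisors(int i)
    {
        return Enumerable.Range(2, i - 2).All(j => i % j != 0);
    }

    static void Main()
    {
        Console.WriteLine(Enumerable.Range(2, 0).Any()); // False: the range is empty
        Console.WriteLine(HasNoDivisors(2));             // True:  All over an empty range
        Console.WriteLine(HasNoDivisors(4));             // False: 4 % 2 == 0
    }
}
```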

The range in the where clause is Range (2, 3 - 2) = Range (2 , 1) = { 2 }.

Evaluating j => i % j we have 3 % 2 = 1, and so, 3 is a prime number.

Taking the 3rd value and assigning it to i we have i = 4.

The range in the where clause is Range (2, 4 - 2) = Range (2, 2) = { 2, 3 }.

Evaluating j => i % j we have 4 % 2 = 0 and 4 % 3 = 1, and so 4 is not a prime number because all elements of the where range must yield a result != 0 for the expression i % j != 0.

Now we have i = 5.

The range in the where clause is Range (2, 5 - 2) = Range (2 , 3) = { 2, 3, 4 }.

Evaluating j => i % j we have 5 % 2 = 1, 5 % 3 = 2 and 5 % 4 = 1. From this we have that 5 is a prime number.

Now we have i = 6.

The range in the where clause is Range (2, 6 - 2) = Range (2 , 4) = { 2, 3, 4, 5 }.

Evaluating j => i % j we have 6 % 2 = 0, 6 % 3 = 0, 6 % 4 = 2 and 6 % 5 = 1. From this we have that 6 is not a prime number.

Now we have i = 7.

The range in the where clause is Range (2, 7 - 2) = Range (2 , 5) = { 2, 3, 4, 5, 6 }.

Evaluating j => i % j we have 7 % 2 = 1, 7 % 3 = 1, 7 % 4 = 3, 7 % 5 = 2 and 7 % 6 = 1. From this we have that 7 is a prime number.

Now we have i = 8.

The range in the where clause is Range (2, 8 - 2) = Range (2 , 6) = { 2, 3, 4, 5, 6, 7 }.

Evaluating j => i % j we have 8 % 2 = 0, 8 % 3 = 2, 8 % 4 = 0, 8 % 5 = 3, 8 % 6 = 2 and 8 % 7 = 1. 8 isn’t a prime number.

Now we have i = 9.

The range in the where clause is Range (2, 9 - 2) = Range (2 , 7) = { 2, 3, 4, 5, 6, 7, 8 }.

Evaluating j => i % j we have 9 % 2 = 1, 9 % 3 = 0, 9 % 4 = 1, 9 % 5 = 4, 9 % 6 = 3, 9 % 7 = 2 and 9 % 8 = 1. 9 isn’t a prime number.

Now we have i = 10.

The range in the where clause is Range (2, 10 - 2) = Range (2 , 8) = { 2, 3, 4, 5, 6, 7, 8, 9 }.

Evaluating j => i % j we have 10 % 2 = 0, 10 % 3 = 1, 10 % 4 = 2, 10 % 5 = 0, 10 % 6 = 4, 10 % 7 = 3, 10 % 8 = 2 and 10 % 9 = 1. 10 isn’t a prime number.

Finally we have 4 prime numbers <= 10; they are: { 2, 3, 5, 7 }.

The following table illustrates the result:

i	where Range(2, i - 2)	All(j => i % j != 0)
2	{ }
3	{ 2 }	3 % 2 = 1
4	{ 2, 3 }	4 % 2 = 0, 4 % 3 = 1
5	{ 2, 3, 4 }	5 % 2 = 1, 5 % 3 = 2, 5 % 4 = 1
6	{ 2, 3, 4, 5 }	6 % 2 = 0, 6 % 3 = 0, 6 % 4 = 2, 6 % 5 = 1
7	{ 2, 3, 4, 5, 6 }	7 % 2 = 1, 7 % 3 = 1, 7 % 4 = 3, 7 % 5 = 2, 7 % 6 = 1
8	{ 2, 3, 4, 5, 6, 7 }	8 % 2 = 0, 8 % 3 = 2, 8 % 4 = 0, 8 % 5 = 3, 8 % 6 = 2, 8 % 7 = 1
9	{ 2, 3, 4, 5, 6, 7, 8 }	9 % 2 = 1, 9 % 3 = 0, 9 % 4 = 1, 9 % 5 = 4, 9 % 6 = 3, 9 % 7 = 2, 9 % 8 = 1
10	{ 2, 3, 4, 5, 6, 7, 8, 9 }	10 % 2 = 0, 10 % 3 = 1, 10 % 4 = 2, 10 % 5 = 0, 10 % 6 = 4, 10 % 7 = 3, 10 % 8 = 2, 10 % 9 = 1
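The table also suggests that many of these checks are redundant: a composite number i always has a divisor j with j * j <= i, so larger divisors never need to be tried. The sketch below applies that idea with TakeWhile. It is not the code from reference [1]; the FasterPrimes class and the PrimeNumbersFast delegate are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FasterPrimes
{
    // Hypothetical variant of the primeNumbers delegate: it only tests
    // divisors j with j * j <= i, because any composite i has such a divisor.
    // The cast to long avoids an int overflow when j * j gets large.
    public static readonly Func<int, IEnumerable<int>> PrimeNumbersFast = max =>
        from i in Enumerable.Range(2, max - 1)
        where Enumerable.Range(2, i - 2)
                        .TakeWhile(j => (long)j * j <= i)
                        .All(j => i % j != 0)
        select i;
}

class FasterPrimesDemo
{
    static void Main()
    {
        // Same call as in the post, so the result should again be 2, 3, 5 and 7.
        foreach (int prime in FasterPrimes.PrimeNumbersFast(10))
        {
            Console.WriteLine(prime);
        }
    }
}
```

For i = 10, for example, only j = 2 and j = 3 are candidates, and the check already stops at j = 2.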

I’ll use the primeNumbers delegate in a future post to evaluate PLINQ (Parallel LINQ) and measure the performance gains when the same calculation is done in parallel instead of in sequence.

References
[1] Perfetti, Michel. LINQ Quiz: The list of prime numbers in 3 clauses? 2007. Available at <http://blogs.developpeur.org/raptorxp/archive/2007/11/26/quizz-linq-la-liste-des-nombres-premiers-en-3-clauses.aspx>. Accessed on May 31, 2009.

Line prefixer suffixer in C#

Posted by Leniel Maccaferri on 1/20/2009 01:52:00 AM

I extracted a lot of Ids from a database table and needed to pass such Ids as a parameter to a webservice method. The webservice method was expecting a parameter of type List<long>. I didn’t find a way of passing such a list to the webservice using the built in webservice form constructed by Visual Studio. The cause is that a List<long> isn’t a primitive type.

Talking with my peers I learned of a tool called soapUI. It’s a tool used to test webservices. Using it I could pass the list of Ids.

I created a new project in soapUI passing to it the webservice WSDL URL and I was ready to go.

New soapUI Project

This is the value I’ve put in Initial WSDL/WADL:

http://localhost:7777/WebServices/MyWebserviceName.asmx?WSDL

After clicking OK, soapUI will then load the webservice definition.

Clicking in Request 1 as shown in the following picture, the XML of a SOAP envelope appears so that we can test the webservice method.

soapUI Request 1

The problem now was that I had a file called “input.txt” with only the Ids, one per line. The XML of the SOAP envelope expects each Id to be passed in the format:

<ns:long>?</ns:long>

For example,

<ns:long>7</ns:long>

As we can observe, my input data don’t fit the pattern required by the XML.

To make my data conform to the XML, I created a small but useful application called LinePrefixerSuffixer. It receives the name of an input file containing the initial data, the text to be “prefixed” to the start of each line, the text to be “suffixed” to the end of each line, and the name of the output file.

So for example, to comply with the above pattern, I’d call the console application with:

LinePrefixerSuffixer input.txt "<ns:long>" "</ns:long>" output.txt

Let’s say I have a file called input.txt with 1000 numbers in the same directory as the LinePrefixerSuffixer.exe executable.

Each line of the input.txt file has a number as:

1
2
3
4
5
6
7
.
.
.

Running the above command line in the command prompt I’d get a file called output.txt with each line now in the format I want, that is:

<ns:long>1</ns:long>
<ns:long>2</ns:long>
<ns:long>3</ns:long>
<ns:long>4</ns:long>
<ns:long>5</ns:long>
<ns:long>6</ns:long>
<ns:long>7</ns:long>
. 
. 
.

The C# code of the app is as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.IO;

namespace LinePrefixerSuffixer
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Read in all lines of the file using query expression (LINQ).
                // I "prefix" the start of each line with the content of args[1] and
                // "suffix" the end of each line with the content of args[2].
                IEnumerable<string> fileLines = from line in File.ReadAllLines(args[0])
                                                select args[1] + line + args[2];

                // Writing the prefixed and suffixed file lines to a file named with the content of args[3].
                File.WriteAllLines(args[3], fileLines.ToArray());

                Console.WriteLine("Operation done.");
            }
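            catch(Exception e)
            {
                // Missing arguments or a file error: report the cause and show the usage.
                Console.WriteLine("Error: {0}", e.Message);
                Console.WriteLine("Use: LinePrefixerSuffixer <input.txt> prefix suffix <output.txt>");
            }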
        }
    }
}
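For very large input files, a streaming variant may be preferable. The sketch below assumes .NET Framework 4 or later (the original app targets 3.5), where File.ReadLines and the File.WriteAllLines overload that takes an IEnumerable<string> are available. LineWrapper and StreamingPrefixerSuffixer are hypothetical names, and the argument check replaces the catch-all handler of the original.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LineWrapper
{
    // Hypothetical helper: wraps every line with the given prefix and suffix.
    // It is lazy, so lines are transformed one at a time as they are consumed.
    public static IEnumerable<string> Wrap(IEnumerable<string> lines, string prefix, string suffix)
    {
        return lines.Select(line => prefix + line + suffix);
    }
}

class StreamingPrefixerSuffixer
{
    static int Main(string[] args)
    {
        // Expect exactly 4 arguments: input file, prefix, suffix and output file.
        if (args.Length != 4)
        {
            Console.WriteLine("Use: LinePrefixerSuffixer <input.txt> prefix suffix <output.txt>");
            return 1;
        }

        // File.ReadLines streams the file instead of loading it all into memory
        // like File.ReadAllLines does. Input and output must be different files.
        File.WriteAllLines(args[3], LineWrapper.Wrap(File.ReadLines(args[0]), args[1], args[2]));

        Console.WriteLine("Operation done.");
        return 0;
    }
}
```

For a file with 1000 Ids the difference is negligible, so the original version is perfectly adequate for the scenario described here.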

Now I can pass the content of the output.txt file to soapUI without worrying about having to manually prefix/suffix each line of my input.txt file:

soapUI Request 1

Summary
In this post we saw how to build a simple but powerful application that prefixes and suffixes each line of a file.

We’ve used concepts related to file handling and LINQ, and with only 4 lines of code we managed to accomplish the task.

I think this shows how a powerful modern programming language such as C# enables a clean and beautiful coding experience.

Hope this helps.

Visual Studio C# Console Application
You can get the Microsoft Visual Studio Project and the app executable at:

http://leniel.googlepages.com/LinePrefixerSuffixer.zip

Note: As this program uses LINQ, you must have the Microsoft .NET Framework 3.5 runtime libraries installed on your computer. You can get them at:

http://www.microsoft.com/downloads/details.aspx?FamilyID=333325fd-ae52-4e35-b531-508d977d32a6&DisplayLang=en

References
[1] soapUI - the Web Services Testing tool. Available at <http://www.soapui.org>. Accessed on January 20, 2009.

[2] Sam Allen. File Handling - C#. Available at <http://dotnetperls.com/Content/File-Handling.aspx>. Accessed on January 20, 2009.

[3] LINQ. The LINQ Project. Available at <http://msdn.microsoft.com/en-us/netframework/aa904594.aspx>. Accessed on January 20, 2009.

[4] LINQ. Language Integrated Query. Available at <http://www.leniel.net/2008/01/linq-language-integrated-query.html>. Accessed on January 20, 2009.

LINQ - Language Integrated Query

Posted by Leniel Maccaferri on 1/29/2008 08:25:00 PM

LINQ
The LINQ Project is a codename for a set of extensions to the .NET Framework that encompass language-integrated query, set, and transform operations. It extends C# and Visual Basic with native language syntax for queries and provides class libraries to take advantage of these capabilities.

My bachelor's degree graduation project

As a result of my graduation project in the computer engineering course, I ended up with a concise document describing the idea behind LINQ. The document is available only in Portuguese, so I think it's a valuable source of information for people who know Portuguese, given that most great material about LINQ is available only in English.

In addition to the intrinsic subjects related to the integration of the query language (SQL) into the programming language (C#), in this paper you'll also find information about the great language extensions that form the base of LINQ:

Generics
Anonymous methods
Iterators
Partial types
Nullable types
Query expressions
Automatically implemented properties
Implicitly typed local variables
Extension methods
Partial methods
Lambda expressions
Object initializers
Collection initializers
Anonymous types
Implicitly typed arrays
Expression trees
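To give a taste of how several of these extensions work together, here is a small sketch that is not taken from the paper. The Product type, its data and the method names are all made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for this illustration.
class Product
{
    // Automatically implemented properties
    public string Name { get; set; }
    public int Stock { get; set; }
}

static class ProductExtensions
{
    // Extension method
    public static bool IsLowOnStock(this Product product)
    {
        return product.Stock < 5;
    }
}

class LanguageFeaturesDemo
{
    public static List<string> LowStockLabels(IEnumerable<Product> products)
    {
        // Query expression that projects into an anonymous type
        var query = from p in products
                    where p.IsLowOnStock()
                    orderby p.Name
                    select new { p.Name, p.Stock };

        // Lambda expression passed to the Select standard query operator
        return query.Select(x => x.Name + " (" + x.Stock + ")").ToList();
    }

    static void Main()
    {
        // Implicitly typed local variable, collection initializer and object initializers
        var products = new List<Product>
        {
            new Product { Name = "Pen", Stock = 2 },
            new Product { Name = "Notebook", Stock = 40 },
            new Product { Name = "Eraser", Stock = 0 }
        };

        foreach (string label in LowStockLabels(products))
        {
            Console.WriteLine(label);
        }
    }
}
```

The query expression is translated by the compiler into calls to the standard query operators (Where, OrderBy, Select), which is where extension methods and lambda expressions come into play.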

See the paper's abstract below (English/Português):

ABSTRACT

Macaferi, Leniel Braz de Oliveira. Query language integrated into the programming language. 2007. 96f. Monograph (bachelor’s degree in Computer Engineering) - Barra Mansa University Center, Barra Mansa, 2007. www.ubm.br

Data is the raw material of computation and is processed via software. Software products are generally structured in tiers, typically three, the data tier, the middle or business tier and the presentation or client tier. Each of these tiers has its own data model. These different paradigms cause the impedance mismatch problem between these three disparate models.

Instead of trying to unify at the data model level, a better approach is to unify at the level of algebraic operations that can be defined the same way over each data model. This allows us to define a single query language that can be used to query and transform any data model. All each data model needs to do is implement a small set of standard query operators, and each data model can do so in a way natural to itself.

The industry has reached a stable point in the evolution of object-oriented (OO) programming technologies. Programmers now take for granted the facilities of object-oriented programming languages and their features like classes, objects, methods and events. Such languages support the creation and use of higher order, functional style class libraries. The support is the result of new language extensions being developed. These extensions enable the construction of compositional application program interfaces (APIs) that have expressive power equal to that of query languages inside the programming language syntax. This makes it possible to implement the standard query operators. The standard query operators can then be applied to all sources of data, not just relational or XML domains.

This work aims to present and use the most important aspects of the language integrated query with special focus on the integration of the SQL query language into the C# programming language. These aspects include simplification of the way queries are written, unification of the syntax for querying any data source, reinforcement of the connection between relational data and the object-oriented world, and less time spent in the software development process.

Keywords: query language, programming language, data models, SQL, C#, LINQ

RESUMO

Macaferi, Leniel Braz de Oliveira. Linguagem de pesquisa integrada à linguagem de programação. 2007. 96f. Monografia (bacharelado em Engenharia de Computação) - Centro Universitário de Barra Mansa, Barra Mansa, 2007. www.ubm.br

Dados formam a matéria prima da computação e são processados via software. Produtos de software são geralmente estruturados em camadas, tipicamente três: a camada de dados, a camada intermediária ou de lógica e a camada de apresentação ou do cliente. Cada uma destas camadas possui seu próprio modelo de dados. Estes diferentes paradigmas causam o problema da combinação mal sucedida entre estes três modelos completamente diferentes.

Ao invés de tentar unificar no nível do modelo de dados, uma melhor alternativa é unificar no nível das operações algébricas que podem ser definidas do mesmo modo sobre cada modelo de dados. Isto nos permite definir uma única linguagem de pesquisa que pode ser usada para pesquisar e transformar qualquer modelo de dados. Tudo o que os modelos de dados precisam implementar é um pequeno conjunto de operadores de pesquisa padrão, e cada modelo de dados pode fazer isto de uma maneira natural.

A indústria chegou a um ponto estável na evolução das tecnologias de programação orientada a objetos (OO). Desenvolvedores agora têm por certo as facilidades das linguagens de programação OO e seus ricos recursos iguais a classes, objetos, métodos e eventos. Tais linguagens suportam a criação e uso de bibliotecas de classe de estilo funcional de ordem maior. O suporte é o resultado das novas extensões de linguagem de programação que estão sendo desenvolvidas. Estas extensões permitem a criação de interfaces para programação de aplicativos (APIs) composicionais que possuem poderosas capacidades de pesquisa dentro da sintaxe da linguagem de programação. Isto torna viável a implementação dos operadores de pesquisa padrão. Os operadores de pesquisa padrão podem ser aplicados em todas as fontes de informação, não somente em domínios de bancos de dados relacionais ou XML.

Este trabalho visa apresentar e utilizar os aspectos mais importantes da linguagem integrada de pesquisa com foco na integração da linguagem de pesquisa SQL à linguagem de programação C#. Aspectos como a simplificação da maneira de escrever pesquisas, unificação da sintaxe para pesquisar qualquer fonte de dados, reforço da conexão entre dados relacionais e o mundo orientado a objetos e o menor tempo gasto no processo de desenvolvimento de software.

Palavras-chave: linguagem de pesquisa, linguagem de programação, modelos de dados, SQL, C#, LINQ

TABLE OF CONTENTS
1 INTRODUCTION 15
  1.1 Scope of the topic 16
  1.2 Problem 16
  1.3 Statement of hypotheses 17
  1.4 General and specific objectives 18
  1.5 Justification of the work 18
2 THEORETICAL BACKGROUND 19
  2.1 Query language 19
      2.1.1 Query 19
  2.2 Programming language 19
  2.3 Impedance mismatch between query and programming languages 20
  2.4 Object-oriented programming 24
      2.4.1 Class and object 25
      2.4.2 Variable and type 25
      2.4.3 Member 25
      2.4.4 Accessibility 25
      2.4.5 Method 26
      2.4.6 Parameter 26
      2.4.7 Message passing 26
      2.4.8 Inheritance 26
      2.4.9 Encapsulation 26
      2.4.10 Abstraction 27
      2.4.11 Polymorphism 27
      2.4.12 Interface 27
      2.4.13 Delegate 27
  2.5 Relational database 28
      2.5.1 Relation or table 28
      2.5.2 Constraint 28
      2.5.3 Data domain 28
      2.5.4 Primary key 29
      2.5.5 Foreign key 29
      2.5.6 Stored procedure 29
      2.5.7 View 29
      2.5.8 User defined function 30
  2.6 .NET Framework 30
      2.6.1 Main features 32
      2.6.2 Architecture 33
      2.6.3 Common language infrastructure 33
      2.6.4 Assemblies 34
      2.6.5 Metadata 35
      2.6.6 Base class library 35
  2.7 SQL 35
  2.8 C# 35
3 METHODOLOGY 37
  3.1 Language extensions 37
      3.1.1 Generics 37
      3.1.2 Anonymous methods 38
      3.1.3 Iterators 38
      3.1.4 Partial types 40
      3.1.5 Nullable types 41
      3.1.6 Query expressions 43
      3.1.7 Automatically implemented properties 44
      3.1.8 Implicitly typed local variables 45
      3.1.9 Extension methods 46
      3.1.10 Partial methods 47
      3.1.11 Lambda expressions 49
      3.1.12 Object initializers 50
      3.1.13 Collection initializers 50
      3.1.14 Anonymous types 51
      3.1.15 Implicitly typed arrays 52
      3.1.16 Expression trees 53
4 DEVELOPMENT 54
  4.1 Query language integrated into the programming language 54
      4.1.1 Standard query operators 56
      4.1.2 Data source 61
      4.1.3 Query operation 62
      4.1.4 Object model 63
  4.2 Case study 64
      4.2.1 Object model classes 68
      4.2.2 DataContext 68
      4.2.3 Relationships 69
      4.2.4 Querying data 70
      4.2.5 Insert, update and delete operations 71
5 CONCLUSION 73
  5.1 Advances 73
  5.2 Limitations 74
  5.3 Related work 74
  5.4 Future work 76
6 BIBLIOGRAPHY 78
APPENDICES 81

You can get a PDF copy of the full paper at:

https://drive.google.com/file/d/1nDbZXqKsE_jzxz4qB1ZOlgKu3LulSIKi/view?usp=sharing (Portuguese - Brazil)