Showing posts with label thread safety. Show all posts
Showing posts with label thread safety. Show all posts

Creating Excel spreadsheets .XLS and .XLSX in C#

Interesting discussion at StackOverflow:
Create Excel (.XLS and .XLSX) file from C#

If you want to see how to combine NPOI + Excel Table and Chart,
take a look at the post titled NPOI with Excel Table and dynamic Chart.

NPOI 2.0 series of posts scheduled

Recently I had to implement some code to create an Excel spreadsheet/report using C#.

The task was: given an Excel spreadsheet template - a .XLS file (with formulas, pivot tables, macros, etc) I had to fill some data in one of the sheets of the spreadsheet and send this modified spreadsheet back to the user requesting such an operation (Excel report).

The following attests the need for Excel nowadays:

Excel has long been recognized as the de facto standard when it comes to presenting management reports. The unique combination of great calculation engine, excellent charting facilities, pivot tables and the possibility to perform “what if” analysis, make it the “must have” business intelligence tool.
by John Tunnicliffe

I had a great time while studying the possible ways of doing what the task asks for.

It appears to be a simple task at first but as the time passes by you get to know that this is not the case, well, till the moment this blog post was written at least, I think. :-)

Firstly I tried automating Excel through COM automation, but as the application was meant to be used in a web context it is not recommended to use automation. Why? Because COM automation for Excel is not thread safe, that is, EXCEL.EXE was not constructed to be used by concurrent users accessing the same process, in this case EXCEL.EXE; besides, Microsoft Excel must be installed on the server what is not always possible.

For more details on why you shouldn’t use Excel on the server, read this article on Microsoft Help and Support site: Considerations for server-side Automation of Office. The key part is this:

Microsoft does not currently recommend, and does not support, Automation of Microsoft Office applications from any unattended, non-interactive client application or component (including ASP, ASP.NET, DCOM, and NT Services), because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.

I’ve just experienced the above. EXCEL.EXE insisted in being alive in task manager even after processing the data and being closed in code. Each call to process a spreadsheet opens an EXCEL.EXE process on the server. With such EXCEL.EXE processes don’t being closed as they should you get lots of those processes on memory which could overload the server.

Do not use COM automation if you are developing server-side code.

ExcelPackageAfter struggling with COM automation I finally got to know ExcelPackage  which works with Office Open Document Format (OOXML). It can read an .XLSX (Microsoft Excel 2007) template and create another .XLSX file based on such template, giving you lots of possibilities to work with the template copy.

I’ve gotten really happy because I had found a way of doing what the task was asking for but with a minor detail: ExcelPackage works only with .XLSX file format letting the .XLS (Microsoft Excel 2003) format out of the game. Well it turned out to be a big impediment because the client (software buyer) wouldn’t allow us to install the famous Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats on user machines so that users could open and save OOXML file formats even using Microsoft Office 2003 suite.

Discovering ExcelPackage was good but it didn’t do the trick that implies the use of an .XLS template.

NPOII got back Googling again and strived to find an open source library that would allow me to do what I wanted. After some time I finally discovered NPOI.  Wow, it could read the .XLS template and generate the end result I wanted. Great. I downloaded NPOI 1.2.1 for .NET 2.0 binaries immediately and started playing with it to see what it could really do.

OK, after this short story, I’ll show you how to use both open source projects (ExcelPackage and NPOI).

I’ve created a new ASP.NET MVC project as can be seen in this picture:

 Excel Writer Solution Explorer

In the Content folder I’ve placed the template spreadsheets.
In the Libs folder I’ve placed the DLLs necessary to use both ExcelPackage and NPOI open source projects.

The controller that interests us is the one called ExcelWriterController:

Excel Writer Controller

The methods that handle the creation of the spreadsheet are: ExcelPackageCreate and NPOICreate.

For each controller action (method) there’s a corresponding view that renders the UI to the user. Those views are the ones shown inside the ExcelWriter folder: ExcelPackage.aspx and NPOI.aspx.

This is the Home Page of the Excel Writer MVC Application - take a look at the tabs (ExcelPackage and NPOI) that lead you to the View pages:

Excel Writer Home Page

Each view has a button which when clicked calls the corresponding action method on the ExcelWriterController.

This is the NPOI view page:

Excel Writer NPOI View Page

I’ll play with a simple spreadsheet I filled with the data I got from Excel’s blog post titled Formula to Access a List of Values Interspersed with Zeros or Blanks.

Let’s see the code that goes into the ExcelPackageCreate method:

/// <summary>
/// Creates a new Excel spreadsheet based on a template using the ExcelPackage library.
/// A new file is created on the server based on a template.
/// </summary>
/// <returns>Excel report</returns>
[AcceptVerbs(HttpVerbs.Post)]
public ActionResult ExcelPackageCreate()
{
    try
    {
        FileInfo template = new FileInfo(Server.MapPath(@"\Content\ExcelPackageTemplate.xlsx"));

        FileInfo newFile = new FileInfo(Server.MapPath(@"\Content\ExcelPackageNewFile.xlsx"));

        // Using the template to create the newFile...
        using(ExcelPackage excelPackage = new ExcelPackage(newFile, template))
        {
            // Getting the complete workbook...
            ExcelWorkbook myWorkbook = excelPackage.Workbook;

            // Getting the worksheet by its name...
            ExcelWorksheet myWorksheet = myWorkbook.Worksheets["Sheet1"];

            // Setting the value 77 at row 5 column 1...
            myWorksheet.Cell(5, 1).Value = 77.ToString();

            // Saving the change...
            excelPackage.Save();
        }

        TempData["Message"] = "Excel report created successfully!";

        return RedirectToAction("ExcelPackage");
    }
    catch(Exception ex)
    {
        TempData["Message"] = "Oops! Something went wrong.";

        return RedirectToAction("ExcelPackage");
    }
}

Let’s see the code that goes into the NPOICreate method:

/// <summary>
/// Creates a new Excel spreadsheet based on a template using the NPOI library.
/// The template is changed in memory and a copy of it is sent to
/// the user computer through a file stream.
/// </summary>
/// <returns>Excel report</returns>
[AcceptVerbs(HttpVerbs.Post)]
public ActionResult NPOICreate()
{
    try
    {
        // Opening the Excel template...
        FileStream fs =
            new FileStream(Server.MapPath(@"\Content\NPOITemplate.xls"), FileMode.Open, FileAccess.Read);

        // Getting the complete workbook...
        HSSFWorkbook templateWorkbook = new HSSFWorkbook(fs, true);

        // Getting the worksheet by its name...
        HSSFSheet sheet = templateWorkbook.GetSheet("Sheet1");

        // Getting the row... 0 is the first row.
        HSSFRow dataRow = sheet.GetRow(4);

        // Setting the value 77 at row 5 column 1
        dataRow.GetCell(0).SetCellValue(77);

        // Forcing formula recalculation...
        sheet.ForceFormulaRecalculation = true;

        MemoryStream ms = new MemoryStream();

        // Writing the workbook content to the FileStream...
        templateWorkbook.Write(ms);

        TempData["Message"] = "Excel report created successfully!";

        // Sending the server processed data back to the user computer...
        return File(ms.ToArray(), "application/vnd.ms-excel", "NPOINewFile.xls");
    }
    catch(Exception ex)
    {
        TempData["Message"] = "Oops! Something went wrong.";

        return RedirectToAction("NPOI");
    }
}

One drawback of the ExcelPackage library is that it must create a file on the server. There are some modifications to the library that enables you to create the template copy on memory and send it to the user as the NPOI library does. Take a look at the ExcelPackage’s discussion page at CodePlex and specifically this thread: Why create Excel spreadsheets on the server?

The great thing about NPOI of course is that it enables you to work with the template in code and then send a copy of the spreadsheet directly to the user. The template remains intact and the user receives a modified copy of the template which contains the data processed by the application.

With NPOI when you click on the Create Excel report button you get the download dialog window:

Excel Writer NPOI Download dialog

With ExcelPackage you’d have to get the path of the file created on the server, in this case \Content\ExcelPackageNewFile.xlsx and then send that file to the user. This is an extra step and adds an additional burden to the server. I didn’t implement it and so I let this as an exercise to you.

Well, the spreadsheet included in this simple project has only formulas but you can for sure have an Excel template with lots of formulas, pivot tables, macros, etc. This gives you the power of Excel in code in a clean fashion.

Hope this helps shed some light on this topic!

Note
Using the open source libraries presented in this post you won’t need Microsoft Excel installed on the server.

Updated on 9/22/2010

There’s now EPPlus that extends ExcelPackage.

EPPlus is a .net library that reads and writes Excel 2007 files using the Open Office XML format (XLSX).

EPPlus supports ranges, cell styling, charts, pictures, shapes, named ranges, autofilters and a lot of other stuff.

Updated on 6/2/2010

I’m working on an ASP.NET project that uses .NET Framework 1.1. In such a case I needed to use NPOI 1.2.1 for .NET 1.1.

The code you’ll use with NPOI 1.2.1 for .NET 1.1 is the same presented on this post. Just pay attention to include this using directive inside your C# code:

using NPOI.HSSF.UserModel

Updated on 1/3/2010

If you like NPOI, help spread the word about it voting up on this ad:

Help make NPOI even more Awesome!

Visual Studio 2008 C# ASP.NET MVC Web Application
You can get the Microsoft Visual Studio Project at:

http://sites.google.com/site/leniel/blog/ExcelWriterMvcProject.zip

To try out the code you can use the free Microsoft Visual Web Developer 2008 Express Edition that you can get at: http://www.microsoft.com/express/vwd/Default.aspx

References
ExcelPackage: Office Open XML Format file creation
http://excelpackage.codeplex.com/

ExcelPackage binaries download
http://excelpackage.codeplex.com/Release/ProjectReleases.aspx

NPOI
http://npoi.codeplex.com/

NPOI 1.2.1 for .NET 1.1 
http://npoi.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=33203

NPOI 1.2.1 for .NET 2.0
http://npoi.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=19351

NPOI samples
http://npoi.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=19351#DownloadId=70100

Thread Safe Circular Queue in C#

While completing a screen for a Software Development Engineer in Test (SDTE) position for Microsoft I had to implement a thread safe circular queue.

The question was the following:

Implement a circular queue of integers of user-specified size using a simple array. Provide routines to initialize(), enqueue() and dequeue() the queue. Make it thread safe. A really good observation was: Please do not to use existing class libraries for this question. Thanks!

My first attempt was obviously get more information about a circular queue and bring it back to my mind. That's because I had studied it a long time ago when I was in the beginning of the computer engineering course. If you don't use it you forget it. That's the truth.

This post aims to present an implementation of a thread safe circular queue in C#. I won't deal with theory. For theory, I advise you to read the Wikipedia article about circular queue (buffer) to get started.

I reused some code from the links I present in the references section at the end of this post.

I'll break down the code I implemented so that you can go through the functions that were asked in the problem statement.

class ThreadSafeCircularQueue
{
  private int[] queue;
  private int head;
  private int tail;
  private int length;

  static Object thisLock = new Object();

  public CircularQueue()
  {
     Initialize();
  }

  ...
}

The class ThreadSafeCircularQueue has five properties: an integer array (the queue) and the queue's head, tail and length. A static object thisLock is used to make the queue thread safe.

private void Initialize()
{
  head = tail = -1;

  Console.Write("Circular Queue size: ");

  string length = Console.ReadLine();

  this.length = int.Parse(length);

  queue = new int[this.length];
}

The function Initialize() does what its name means. The queue's head and tail are set to appropriate values and the user has the opportunity of specifying the queue's size (length) by entering an integer value. A new queue is then created and has the user specified size.

The Enqueue() and Dequeue() functions shown bellow do the hard work, that is, they control all the circular queue functionality.

The Enqueue() function receives an integer value to be enqueued. Note the use of the thisLock variable. It is used so that we get thread safety, that is, all the code section inside the lock statement can't be executed by two threads at the same time. This avoids the problem of two concurrent threads trying to access the queue data structure simultaneously. If it's not controlled in the source code level we can get invalid values in the queue what could lend to a corrupted queue. That's not what is expected. When the code section that is inside the lock statement is being executed by a thread and other thread reaches the same section of code, this second thread waits till the first thread exits such code section. This way the queue has a valid set of values and its data is maintained secure.

Firstly is checked if the queue is full. It is full when it's head points to index 0 and its tail points to length - 1 or when its tail + 1 equals the same value of head. Remember that it is a circular queue.

In case the queue is full a message is shown to the user. If it still has room we check if the "end" of the queue (tail) that points to the last element has an index of length minus 1. If it is then the tail will now point to the the "start" of the queue. The other possibility is to increment the tail variable. The position referenced by the tail variable is then used to enqueue the value parameter into the queue. Then the just enqueued value is printed in the screen. A check is done to verify if the head variable has a value of -1. If it's the case then head will point to index 0.

private void Enqueue(int value)
{
  lock(thisLock)
  {
    if((head == 0 && tail == length - 1)  (tail + 1 == head))
    {
      Console.WriteLine("Circular queue is full.");

      return;
    }
    else
    {
      if(tail == length - 1)
        tail = 0;
      else
        tail++;

      queue[tail] = value;

      Console.WriteLine("In -> {0}", value);
    }

    if(head == -1)
      head = 0;
  }
}

The Dequeue() function also uses the lock statement because it must be a thread safe function.

Firstly is checked if the queue's head points to -1. Remember that in the Initialize() function when the queue is initialized its head and tail variables are set to -1. If head has a value of -1 then the queue is empty and the user is informed about that. A value of -1 is returned to conform with the function signature that needs to return an integer value.

If the queue is not empty the integer variable v receives the value that is indexed in the queue's head position. The value 0 is then assigned to the position that was just dequeued.

Now it's necessary to update the head position inside the queue. To accomplish that it's checked if head is equal tail, if it is, then head = tail = -1. This last operation states that the queue is empty after the dequeue operation. Else it's checked if head is equal length - 1. If it is the case then head receives 0. The last possible condition increments the value of head. In the end the value being dequeued (v) is printed in the screen.

Printing the values in the screen is a good way of keeping track of what is going on while the program is running.

private void Dequeue()
{
  lock(thisLock)
  {
    int value;

    if(head == -1)
    {
      Console.WriteLine("Circular queue is empty.");

      value = -1;
    }
    else
    {
      value = queue[head];
      queue[head] = 0;

      if(head == tail)
        head = tail = -1;
      else
        if(head == length - 1)
          head = 0;
        else
          head++;
    }

    Console.WriteLine("Out -> {0}", value);
  }
}

Now take a look at the Show() function bellow. Again the lock is present. Why? In an instant you'll get to the point. Hold your horses! :-)

A check is done to verify if the queue is empty. If it is the case the callee function just returns to the caller function. If it is not the case we proceed to show the values already present in the queue. If tail is less than head a for loop starting in 0 and ending in length - 1 iterates the queue and each value is written in the screen. Otherwise a for loop starting in 0 end ending in tail iterates the queue and each value is written in the screen.

private void Show()
{
  lock(thisLock)
  {
    int i;

    if(head == -1)
    {
      Console.WriteLine("Circular queue is empty.");

      return;
    }
    else
    {
      if(tail < head)
      {
        //for(i = head; i <= size - 1; i++)
        for(i = 0; i <= length - 1; i++)
          Console.Write("{0} ", queue[i]);
      }
      else
        //for(i = head; i <= tail; i++)
        for(i = 0; i <= tail; i++)
          Console.Write("{0} ", queue[i]);

      Console.WriteLine();
    }
  }
}

The following is the EnqueueDequeue() function responsible for calling the Queue(), Dequeue() and Show() functions. In this function I execute a simple test case and comment the operations being carried out.

public void EnqueueDequeue()
{
  Enqueue(1);
  Enqueue(2);
  Enqueue(3);
  Enqueue(4);
  Show();
  Enqueue(5); // Circular queue is full!
  Dequeue();
  Dequeue();
  Dequeue();
  Dequeue();
  Dequeue(); // Circular queue is empty!
  Dequeue(); // Circular queue is empty!
  Show();
  Enqueue(6);
  Show();
  Enqueue(7);
  Show();
  Dequeue();
  Dequeue();
  Enqueue(8);
  Enqueue(9);
  Show();

  // Supposing a 4 size queue:  0 ¦ 0 ¦ 0 ¦ 0
  //
  //                            1 ¦ 0 ¦ 0 ¦ 0  <- Enqueue 1
  //
  //                            1 ¦ 2 ¦ 0 ¦ 0  <- Enqueue 2
  //
  //                            1 ¦ 2 ¦ 3 ¦ 0  <- Enqueue 3
  //
  //                            1 ¦ 2 ¦ 3 ¦ 4  <- Enqueue 4
  //              
  //                            1 ¦ 2 ¦ 3 ¦ 4  <- Circular queue is full when trying to enqueue 5.
  //
  //                            0 ¦ 2 ¦ 3 ¦ 4  <- Dequeue 1
  //
  //                            0 ¦ 0 ¦ 3 ¦ 4  <- Dequeue 2
  //
  //                            0 ¦ 0 ¦ 0 ¦ 4  <- Dequeue 3
  //                   
  //                            0 ¦ 0 ¦ 0 ¦ 0  <- Dequeue 4
  //
  //                            0 ¦ 0 ¦ 0 ¦ 0  <- Circular queue is empty when trying to dequeue.
  //
  //                            0 ¦ 0 ¦ 0 ¦ 0  <- Circular queue is empty when trying to dequeue.
  //
  //                            6 ¦ 0 ¦ 0 ¦ 0  <- Enqueue 6
  //
  //                            6 ¦ 7 ¦ 0 ¦ 0  <- Enqueue 7
  //
  //                            0 ¦ 7 ¦ 0 ¦ 0  <- Dequeue 6
  //
  //                            0 ¦ 0 ¦ 0 ¦ 0  <- Dequeue 7
  //
  //                            8 ¦ 0 ¦ 0 ¦ 0  <- Enqueue 8
  //
  //                            8 ¦ 9 ¦ 0 ¦ 0  <- Enqueue 9
  //
  // * 0 is set in a position when dequeueing so that it's easier to watch the queue variable.
}

Now the main start thread:

class Program
{
  static void Main(string[] args)
  {
    ThreadSafeCircularQueue circularQueue = new ThreadSafeCircularQueue();

    Thread[] threads = new Thread[10];

    for(int i = 0; i < threads.Length; i++)
    {
      threads[i] = new Thread(new ThreadStart(circularQueue.EnqueueDequeue));
    }

    for(int i = 0; i < threads.Length; i++)
    {
      threads[i].Start();
    }

    Console.ReadLine();
  }
}

As you can see above I declare an object of type ThreadSafeCircularQueue. An array of 10 objects of type Thread is then created. In a for loop I instantiate each Thread passing to it a delegate that represents the method that'll be invoked when the thread begins executing.

In the subsequent for loop I call the Start() method of each thread and they start executing simultaneously, tough they won't concur when accessing the functions Enqueue(), Dequeue() and Show().

Visual Studio C# Console Application
You can get the Microsoft Visual Studio Project and the app executable at: http://leniel.googlepages.com/ThreadSafeCircularQueueCSharp.zip

References
During the research phase of this implementation I recurred to some sites to get more information regarding the circular queue data structure. The following list can provide you a better understanding of such a data structure.

Thread Safety articles
[1] Venners, Bill. Designing for Thread Safety: Using Synchronization, Immutable Objects, and Thread-Safe Wrappers. 1998. Available at <http://www.artima.com/designtechniques/threadsafety.html>. Accessed on April 29, 2008.

[2] Suess, Michael. A Short Guide to Mastering Thread-Safety. 2006. Available at <http://www.thinkingparallel.com/2006/10/15/a-short-guide-to-mastering-thread-safety/>. Accessed on April 29, 2008.

[3] Allen, K. Scott. Statics & Thread Safety: Part II. 2004. Available at <http://www.odetocode.com/Articles/314.aspx>. Accessed on April 29, 2008.

Circular Queue sample code
[4] Kumar, Nunna Santhosh. Circular Queue Implementation using Arrays in C++. Available at <http://www.sourcecodesworld.com/source/show.asp?ScriptID=887>. Accessed on April 29, 2008.

Thread material
[5] Albahari, Joseph. Threading in C#. 2007. Available at <http://www.albahari.com/threading/>. Accessed on April 29, 2008.