Using IOptions from .NET Core in .NET Framework 4.5.1+ - Ninject dependency injection version

This post is based on this awesome article I found @ Medium:

Configuration Management in Legacy .NET Applications (.NET Framework 4.5.1+)

I wanted to use the cool IOptions interface to grab some configs from appsettings.json. The sad thing is that it's intended to be used with .NET Core, but the article mentioned above shows that we can circumvent that with some extension code.

In the end I won't be able to use it, because I just found out that the current project setup I'm working on doesn't allow it.

As I had already converted the Autofac code from the article to Ninject, I'm sharing it here in case it's useful...

You'll need to test it.

Note: make sure you have installed the following NuGet packages:

- Microsoft.Extensions.Configuration
- Microsoft.Extensions.Options
- Microsoft.Extensions.Options.ConfigurationExtensions
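Since I couldn't test it in the current project, here's a minimal sketch of what the Ninject registration could look like. Everything below is an assumption on top of the article's approach: AppSettings is a hypothetical POCO matching an "AppSettings" section in appsettings.json, and AddJsonFile / GetSection(...).Get<T>() also require the Microsoft.Extensions.Configuration.Json and Microsoft.Extensions.Configuration.Binder packages.

using System.IO;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Options;
using Ninject;

// Hypothetical POCO bound to the "AppSettings" section of appsettings.json
public class AppSettings
{
    public string ConnectionString { get; set; }

    public int TimeoutInSeconds { get; set; }
}

public static class NinjectConfig
{
    public static IKernel CreateKernel()
    {
        // Builds the configuration from appsettings.json
        // (AddJsonFile comes from Microsoft.Extensions.Configuration.Json)
        IConfigurationRoot configuration = new ConfigurationBuilder()
            .SetBasePath(Directory.GetCurrentDirectory())
            .AddJsonFile("appsettings.json", optional: false)
            .Build();

        // Binds the section to the POCO
        // (Get<T> comes from Microsoft.Extensions.Configuration.Binder)
        var settings = configuration.GetSection("AppSettings").Get<AppSettings>();

        var kernel = new StandardKernel();

        // Makes IOptions<AppSettings> injectable anywhere in the app
        kernel.Bind<IOptions<AppSettings>>().ToConstant(Options.Create(settings));

        return kernel;
    }
}

With that in place, any class resolved by Ninject can take an IOptions<AppSettings> constructor parameter, just like it would in .NET Core.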

Using Power BI parameter to specify Excel workbook path in Power Query

This is a quick post to share something I just got working. I tried to search for a sample but couldn't find one; maybe my search query wasn't quite right...

Let's say you use an Excel workbook as your data source in Power BI. Then you collaborate with a team and, as we know, not everyone has the same folder structure. This can cause a problem when trying to refresh data in Power BI. How can we overcome this in a simple way?

Just create a Power BI parameter and pass it to the Power Query responsible for opening the Excel workbook.

In the following screenshot you can see that I created a parameter called Excel workbook path. I set its value to the folder where the workbook is located on my machine.


Then we can use this parameter in Power Query like this:

= Excel.Workbook(File.Contents(#"Excel workbook path"&"Contingency Drawdown.xlsx"), null, true)

I had some difficulty making the string concatenation work, that is, joining the parameter with the workbook name. I tried some variants and got it working with the concatenation operator &.

As you can see, the workbook name doesn't change (it's hardcoded); however, the path is customizable through the parameter "Excel workbook path".


We just need to change this parameter value to point to our folder structure and we're done. Now we can make use of this parameter to open all Excel sheets (tables in Power BI) in a single shot.

Hope it helps.

References:

Power BI Introduction: Working with Parameters in Power BI Desktop —Part 4

Impersonation in Windows Service with ASP.NET Core

In a project one of the requirements was to be able to impersonate the current logged on Windows user.

The ASP.NET Core 2.1 project was hosted in a Windows Service. More info about this here:
Host ASP.NET Core in a Windows Service
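For context, this is roughly what the Program.cs of such a service looks like. A minimal sketch based on that article, assuming the Microsoft.AspNetCore.Hosting.WindowsServices package (which provides the RunAsService extension method) and the CreateWebHostBuilder method shown further below:

using System.Diagnostics;
using System.IO;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Hosting.WindowsServices;
using Microsoft.Extensions.Configuration;

public class Program
{
    public static void Main(string[] args)
    {
        // When running as a Windows Service the current directory is System32,
        // so we point the content root to the folder where the app lives.
        var pathToContentRoot = Path.GetDirectoryName(
            Process.GetCurrentProcess().MainModule.FileName);

        var config = new ConfigurationBuilder()
            .SetBasePath(pathToContentRoot)
            .AddJsonFile("appsettings.json", optional: true)
            .Build();

        var host = CreateWebHostBuilder(args, config)
            .UseContentRoot(pathToContentRoot)
            .Build();

        // RunAsService comes from Microsoft.AspNetCore.Hosting.WindowsServices
        host.RunAsService();
    }
}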

The problem was that WindowsIdentity.GetCurrent().Name returned the account name configured as Log On As for the service. That makes sense, since it is that account that is executing the service.

This particular project needed to get the currently logged-on Windows user and check their permissions system-wide, that is, check which claims the current user making use of the service has. Claims are the key component by which the outcome of all Web-based authentication and authorization requests is determined. More info on that here: The Role of Claims.

This way we can assert that a user belongs to a specific user group, that the user has read permission on some folder, etc.
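As an illustration only (not code from the project), once we have the right WindowsIdentity in hand these checks boil down to something like the sketch below; the group name is just an example:

using System.Linq;
using System.Security.Claims;
using System.Security.Principal;

public static class PermissionChecks
{
    public static void Inspect(WindowsIdentity identity)
    {
        // Checks membership in a Windows group (example group name)
        var principal = new WindowsPrincipal(identity);
        bool isUser = principal.IsInRole(@"BUILTIN\Users");
        System.Console.WriteLine("Member of BUILTIN\\Users: " + isUser);

        // Lists the group SID claims carried by the identity
        foreach (Claim claim in identity.Claims.Where(c => c.Type == ClaimTypes.GroupSid))
        {
            System.Console.WriteLine($"{claim.Type}: {claim.Value}");
        }
    }
}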

After trying some options without success, I searched for “Windows Authentication in ASP.NET Core” on Google and it led me to this page: Configure Windows Authentication in ASP.NET Core.
Since we’re not hosting the app in IIS, that was a no-go. However, further down that page there’s another option called HTTP.sys. The HTTP.sys server supports self-hosted scenarios on Windows, and that was the way to go.


The first thing we need to do is use it when creating the WebHost object:

public static IWebHostBuilder CreateWebHostBuilder(string[] args,
    IConfigurationRoot config) =>
        WebHost.CreateDefaultBuilder(args)
               .UseConfiguration(config)
               .UseStartup<Startup>()
               // Using HTTP.sys for Windows Authentication support
               .UseHttpSys(options =>
               {
                   options.Authentication.Schemes =
                   AuthenticationSchemes.NTLM | AuthenticationSchemes.Negotiate;
                   options.Authentication.AllowAnonymous = false;
               });

Then in Startup.cs inside the method ConfigureServices we do:

// This method gets called by the runtime.
// Use this method to add services to the container.
public void ConfigureServices(IServiceCollection services)
{
    services.AddLogging(builder =>
    {
        builder.AddFilter("Microsoft", LogLevel.Warning)
               .AddFilter("System", LogLevel.Warning)
               .AddConsole();
    });
    // HTTP.sys server
    // HttpSysDefaults requires the following import:
    // using Microsoft.AspNetCore.Server.HttpSys;
    services.AddAuthentication(HttpSysDefaults.AuthenticationScheme);
    services.AddMvc();
}


That's it.

Now to impersonate the user we do the following:


[HttpGet]
public async Task<IActionResult> MethodWithImpersonatedWindowsUser(
    [FromQuery] string filePath)
{
    // The user configured as Log On As for the Windows Service
    var serviceUser = WindowsIdentity.GetCurrent().Name;

    // The user to be impersonated
    var userToImpersonate = (WindowsIdentity)HttpContext.User.Identity;

    // Impersonating the current Windows user [HttpContext.User.Identity]...
    var result = await WindowsIdentity.RunImpersonated(
        userToImpersonate.AccessToken, async () =>
        {
            // This time WindowsIdentity.GetCurrent() retrieves the impersonated
            // user with their claims...
            var impersonatedUser = WindowsIdentity.GetCurrent().Name;

            // Your business logic code here...

            return Ok(impersonatedUser);
        });

    return result;
}

I saw lots of people asking the same question I had before knowing how to handle this requirement: why do we see the service user returned instead of the currently logged-on user when calling WindowsIdentity.GetCurrent()? The answers were to the point: in Windows you can have many user sessions (Remote Desktop, etc.) logged on at the same time, all making use of the Windows Service. So, how do we get the current logged-on user while still making use of the service? That must come from the HttpContext, that is, from the request arriving from the browser. WindowsIdentity.RunImpersonated makes it all possible: it runs the specified lambda action as the impersonated Windows identity.

Using the power of Razor Views embedded in external class library with FluentEmail

Just got this working and would like to share how to get it configured since I could not find an explanation anywhere.

FluentEmail - All in one email sender for .NET and .NET Core is a great library to send e-mails.

Currently I have a full .NET Framework 4.7.2 Console Application that makes use of FluentEmail.
My desire was to store the e-mail template (a Razor .cshtml view) in an external class library and be able to use this view\template in the Console Application.

FluentEmail's GitHub page has sample code that shows how to consume it:


var email = new Email("bob@hotmail.com")
    .Subject("Hey cool name!")
    .UsingTemplateFromEmbedded("Example.Project.Namespace.template-name.cshtml",
        new { Name = "Bob" },
        typeof(TypeFromYourEmbeddedAssembly).GetTypeInfo().Assembly);

Steps:

1 - Create a Razor Class Library project. More on that can be found here:
Create reusable UI using the Razor Class Library project in ASP.NET Core
It uses .NET Standard 2.0 as its Target framework.

2 - Add the Microsoft.AspNetCore.Mvc.Core NuGet package to the class library.
This is to have IntelliSense, etc. in Visual Studio.

Install-Package Microsoft.AspNetCore.Mvc.Core -Version 2.1.3

3 - Add a Razor view called EmailTemplate.cshtml to your class library.

4 - Right-click it in Solution Explorer and select Properties. Under Build Action, select Embedded resource.

5 - Add the following code to your Razor View:


@model dynamic
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <title>Some Title</title>
    <base href="/" />
@*<link href="css/bootstrap/bootstrap.min.css" rel="stylesheet" />
      <link href="css/site.css" rel="stylesheet" />
      <script src="_framework/blazor.webassembly.js"></script>*@
</head>
<body>
    E-mail template
    Hey @Model.Name,<br>
    <font size="14">You are so @Model.Compliment.</font>
</body>
</html>
Note that I’m using dynamic in the view code. This is just for testing. You can have full-blown models and use them in the templates as well.
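For example, a strongly-typed version could use a hypothetical model class like the one below, placed in the Razor class library; the view would then declare @model YourCompany.YourRazorClassLibrary.EmailModel instead of @model dynamic:

namespace YourCompany.YourRazorClassLibrary
{
    // Hypothetical model consumed by EmailTemplate.cshtml
    public class EmailModel
    {
        public string Name { get; set; }

        public string Compliment { get; set; }
    }
}

You would then pass new EmailModel { Name = "Leniel Maccaferri", Compliment = "Cool" } to UsingTemplateFromEmbedded instead of the anonymous object.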

6 - In your Console Application add a reference to the Class Library created in step 1.

7 - In your FluentEmail code, call the UsingTemplateFromEmbedded method like this:

email.To(configuration["emailTo"])
     .SetFrom(configuration["emailFrom"])
     .Subject(configuration["emailSubject"])
     .UsingTemplateFromEmbedded(
         "YourCompany.YourRazorClassLibrary.EmailTemplate.cshtml",
         new { Name = "Leniel Maccaferri", Compliment = "Cool" },
         Assembly.Load("YourCompany.YourRazorClassLibrary"));


The first parameter is the full class library (DLL) name + dot + the Razor view name with its .cshtml extension. The second parameter is the anonymous object that will be passed as the model to the view, and the third parameter is the class library assembly itself.


You're good to go!

This gives you a lot of flexibility.

You can add images, customize the HTML and CSS, have layout pages, etc.


Hope it helps.

Mixing C# source code with PowerShell script code to speed your development

I had some C# code ready but needed to use it inside a PowerShell script.

On Stack Overflow I found some code that put me on the right path. So here's a practical example of mixing the two inside a PowerShell script file:

. ".\Invoke-Parallel.ps1" # Importing another script into this PowerShell script file

$csharpSource = @"

using System;
using System.Collections.Generic;
using System.Linq;

public class DateExtensions
{
     public static List<Tuple<DateTime, DateTime>> GetWeeksBetweenDates(DateTime startDate, DateTime endDate)
     {
        var weeks = new List<Tuple<DateTime, DateTime>>();
        
    
         for (DateTime date = startDate; date <= endDate; date = date.AddDays(8))
         {
            var weekEnd = date.AddDays(7);
                            
            weeks.Add(new Tuple<DateTime, DateTime>(date, weekEnd <= endDate ? weekEnd : endDate));
         }

        return weeks;
    }
}
"@

Add-Type -TypeDefinition $csharpSource # Includes the C# code defined above to be called through the PowerShell script

$appPath = "C:\SomeFolder"
$startDate = [datetime]'6/1/2017'
$endDate = [datetime]'5/5/2018'

$weeks = [DateExtensions]::GetWeeksBetweenDates($startDate, $endDate)

#Calls MyApp in batches, that is, for each week...
$weeks | Invoke-Parallel -ImportVariables -ScriptBlock {   
    Set-Location -Path $appPath
           
    $date = [string]::Format("{0:yyyyMMdd},{1:yyyyMMdd}", $_.Item1, $_.Item2)

    #Write-Host $date

    dotnet MyApp.dll budat=$date cpudt=$date # Calls .NET Core "executable" DLL for each week
}

pause #Keeps PowerShell window open to see the results

As we can see in the code above, the variable $csharpSource holds the C# source code that will later be called from the PowerShell script.

We then add\mix the C# code into PowerShell with the Add-Type cmdlet, passing it the $csharpSource variable that holds the source code. Simple as that.

Inside PowerShell script code we call the method defined in C# code with:

$weeks = [DateExtensions]::GetWeeksBetweenDates($startDate, $endDate)

This is pretty useful because we don't need to convert our C# code to PowerShell idiom.

Hope it helps.

Reference:
How to convert C# code to a PowerShell Script?

Processing Stack Overflow data dump with Apache Spark

This post is about the final work I did for one of the disciplines of the Master's degree I'm currently attending at UFRJ - Federal University of Rio de Janeiro, in the branch of Data and Knowledge Engineering (Databases), which is part of the Computer and Systems Engineering department at COPPE\UFRJ.

The discipline is called Special Topics in Databases IV and is taught by professor Alexandre Bento de Assis Lima.

The presentation (PPT slides) is in Brazilian Portuguese. I'll translate the slides to English in this blog post. They give an overall view about the work done.

The final paper is written in English.

Files

Trabalho prático sobre Apache Spark envolvendo um problema típico de Big Data (apresentação\presentation).pdf (in Portuguese)

Processing Stack Overflow data dump with Apache Spark (in English)

Abstract. This paper describes the process involved in building an ETL tool based on Apache Spark. It imports XML data from Stack Overflow data dump.
The XML files are processed using Spark XML library and converted to a DataFrame object. The DataFrame data is then queried with Spark SQL library.
Two applications were developed: spark-backend and spark-frontend. The first one contains the code responsible for dealing with Spark, while the latter is user-centric, allowing users to consume the data processed by Spark.

All the code developed is in English and should be easy to read.

Presentation
  1. Objective
  2. Problem
  3. Technologies
  4. Strategy used to acquire the data
  5. Development
  6. Conclusion
  7. Links
  1. Objective
    • Put into practice the concepts presented during the classes.
    • Have a closer contact with modern technologies used to process Big Data.
    • Automate the Extraction\Mining of valuable\interesting information hidden in the immensity of data.

  2. Problem
    • Analyse the Stack Overflow data dump that is made available on the internet on a monthly basis.
    • The data dump is composed of a set of XML files compressed with the .7z extension.
    • Even after compression, the biggest file is 15.3 GB. This size is directly linked to the data volumes handled in Big Data.
    • Spark at first will be used as an ETL tool (ETL = Extract > Transform > Load) to prepare the data consumed by a front-end web app.
    • "At first" because there's also the possibility of using Spark as a tool to process the data that'll be shown in the web app.

  3. Technologies
    • Apache Spark 2.0.1 +
    • Spark XML 0.4.1 +
    • Spark SQL 2.0.2
    • Ubuntu 16.04 LTS (Xenial Xerus)
    • Linux VM (virtual machine) running on Parallels Desktop 12 for Mac
    • Scala 2.11.8
    • XML (Extensible Markup Language)
    • XSL (Extensible Stylesheet Language)
    • Play Framework 2.5 (front end)
    • Eclipse Neon 4.6.1 with Scala IDE 4.5.0 plugin as the IDE

  4. Strategy used to acquire the data
    • Got the .torrent file that contains all the data dumps from Stack Exchange family of sites - https://archive.org/details/stackexchange
    • Selected the eight .7z files related to StackOverflow: stackoverflow.com-Badges.7z, stackoverflow.com-Comments.7z, stackoverflow.com-PostHistory.7z, stackoverflow.com-PostLinks.7z, stackoverflow.com-Posts.7z, stackoverflow.com-Tags.7z, stackoverflow.com-Users.7z, stackoverflow.com-Votes.7z

  5. Development
    • To make the work viable (running locally, outside a cluster), a single .xml file [Users.xml] was used. A subset of 100,000 lines (32.7 MB) was selected. The full file has a total of 5,987,287 lines (1.8 GB).
    • hadoop@ubuntu:/media/psf/FreeAgent GoFlex Drive/Downloads$ head -100000 Users.xml > Users100000.xml
    • The file Users.xsl was used to convert the Users100000.xml data to the format expected by the spark-xml library. The result was saved to Users100000.out.xml.


    • The .xml and .xsl files were placed into the input folder of the Scala project [spark-backend] inside Eclipse.
    • The spark-backend application reads the file Users100000.out.xml through Spark XML and transforms it into a DataFrame object.
    • The Spark SQL library is subsequently used to query the data. Some sample queries were created.
    • Each query generates a CSV file (SaveDfToCsv) to be consumed in a later stage by a web application [spark-frontend], that is, Spark is used as an ETL tool.
    • The result of each query is saved in multiple files in the folder output. This happens because Spark was conceived to execute jobs in a cluster (multiple nodes\computers).
    • For testing purposes, a method that renames the CSV file was created. This method copies the generated CSV to a folder called csv. The destination folder can be configured in the file conf/spark-backend.properties.



    • The application [spark-backend] can be executed inside Eclipse or through the command line in Terminal using the command spark-submit.
    • In the command line we make use of the JAR file produced during the project build in Eclipse. We pass as parameters the necessary packages as below:
    • spark-submit --packages com.databricks:spark-xml_2.11:0.4.1 --class com.lenielmacaferi.spark.ProcessUsersXml --master local com.lenielmacaferi.spark-backend-0.0.1-SNAPSHOT.jar
    • The application [spark-frontend] was built with Play Framework (The High Velocity Web Framework For Java and Scala).
    • The user opens the spark-frontend main page at localhost:9000 and has access to the list of CSV files generated by the [spark-backend] application.
    • When clicking a file name, the CSV file is sent to the user's computer. The user can then use any spreadsheet software to open and post-process\analyse\massage the data.


  6. Conclusion
    • With Spark's help we can develop interesting solutions, for example a daily job that downloads data and uploads it to an "input" folder, processing it along the way in many different ways.
    • Using custom-made code we can work with the data in a cluster (fast processing) through a rich API full of methods and resources. In addition, we have at our disposal numerous libraries\plugins developed by the community, plus all the power of Scala and Java and their accompanying libraries.
    • The application demonstrated can easily be executed in a cluster. We only need to change some parameters in the SparkConf object.

  7. Links

Using Zotero to convert Springer Link CSV search result to BibTex format

Today I needed to generate a BibTex file to serve as input to Parsif.al.

Parsif.al is an online tool designed to support researchers in performing systematic literature reviews within the context of Software Engineering.

I hit a brick wall while doing a search in Springer Link because it only gives us a CSV file with the entire search result, capped at the first 1000 records. It'd be a pain to click and open each and every search result just to export the corresponding BibTex.

Using Zotero it's easy to get a BibTex out of the CSV file generated by Springer Link.


Follow these simple steps:

1 - Open the CSV file in Excel for example and copy the column that contains the item DOI [ Digital Object Identifier ];

2 - Paste the DOI(s) into Zotero's Add item(s) by identifier (see Figure 1 above). Wait while it imports...

3 - Select the folder where you imported the DOI(s); (Player Modeling in Figure 1)

4 - Right-click the folder, select Export collection... and pick BibTex as the format.

You're done.

Hope it helps.

References:

Adding Items to your Zotero Library