Functional Programming

The future of programming is functional

Some years ago, it became public that Professor Robert Harper of Carnegie Mellon University in Pittsburgh had removed Object Oriented Programming (OOP) from the freshman curriculum. His reasoning was that OOP is by its very nature anti-modular and anti-parallel, and therefore unsuitable for a modern computer science curriculum.

I thought: “Wow”. For generations, software programmers, analysts, architects and the whole industry have been thinking in “objects”, and now that is simply dropped. But what is true is true. Since processor clock rates can no longer be raised much further to speed up computers, the trend is toward parallel processing instead. With Windows Azure, for example, it is easy to scale the number of processors used by an application from one to thousands, but the application must be able to make use of them. So, let’s start thinking functionally by using Visual F#. Microsoft defines F# as “a multiparadigm language that supports functional programming in addition to traditional object-oriented programming and .NET concepts. It’s a first class member of the .NET Framework languages”.

There is no universally accepted definition of functional programming, but one of the most widely agreed attributes of the paradigm is that programming is done with expressions or declarations instead of statements. As a reminder, a statement does something, such as assigning a value to a variable, jumping to another line of code or calling a subroutine, while an expression is any section of code that evaluates to a value.
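To make the difference concrete, here is a minimal sketch in F#. The names temperature, description and parity are made up for this illustration. It shows that constructs which are statements in many languages, such as if and match, evaluate to values in F#:

```fsharp
// Hypothetical example value; not part of the original post.
let temperature = 25

// 'if ... then ... else' is an expression: it evaluates to a value
// that can be bound directly with 'let'.
let description = if temperature > 20 then "warm" else "cold"

// 'match' is an expression as well; every branch yields a value
// of the same type.
let parity n =
    match n % 2 with
    | 0 -> "even"
    | _ -> "odd"

printfn "%s, %s" description (parity 7) // prints "warm, odd"
```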

So, let’s start by giving names to values. In F# this is done with the let keyword, which binds an identifier to a value:

// Integer and string.
let num = 10
let str = "F#"
// Storing integers and strings in a list.
let integerList = [ 1; 2; 3; 4; 5; 6; 7 ]
let stringList = [ "one"; "two"; "three" ]
// Perform a simple calculation and bind intSquared to the result.
let intSquared = num * num

Besides making parallel and asynchronous tasks simpler to implement, the greatest benefit of functional programming for enterprises is that algorithms can be defined more briefly and simply. That can significantly lower the maintenance costs of applications.
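As a rough illustration of the parallelism point (a sketch, not a benchmark), a function without side effects can be handed to a parallel map without any other change to the code. The names square, numbers, sequential and inParallel are assumptions made for this example; Array.Parallel.map is part of the F# core library:

```fsharp
// A side-effect-free function: its result depends only on its argument
// and it changes no shared state, so calls can run in any order.
let square (x: int) = x * x

let numbers = [| 1 .. 10 |]

// Sequential evaluation.
let sequential = numbers |> Array.map square

// Parallel evaluation; only the name of the map function changes.
let inParallel = numbers |> Array.Parallel.map square

// Both variants produce the same array.
printfn "%b" (sequential = inParallel) // prints "true"
```

Whether the parallel version is actually faster depends on the amount of work per element; for ten multiplications it will not be.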

For a sample algorithm we will use an everyday example: we search our laptop for documents, our intranet for information related to our customers, or Google for news and other information. This is called information retrieval, and the algorithms behind it are mainly based on a similarity measure, which is obtained by calculating the distance between the search terms and the terms found in the documents. One of many such measures is the Jaccard similarity coefficient.

The Jaccard similarity coefficient, also known as the Jaccard index, is a statistic used for comparing the similarity and diversity of finite sample sets. It is defined as the size of the intersection divided by the size of the union of the sample sets.

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

Jaccard similarity coefficient
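The definition above (size of the intersection divided by size of the union) can be sketched directly with F# sets. The function name jaccardOfSets and the sample sets are assumptions for this illustration, and treating two empty sets as identical is a design choice made here, not something the definition prescribes:

```fsharp
// A sketch of the set-based definition; 'jaccardOfSets' is a hypothetical name.
let jaccardOfSets (a: Set<string>) (b: Set<string>) =
    let unionSize = Set.union a b |> Set.count
    // Assumption: two empty sets count as identical (similarity 1.0),
    // which also avoids a division by zero.
    if unionSize = 0 then 1.0
    else float (Set.intersect a b |> Set.count) / float unionSize

let left = set [ "a"; "b"; "c" ]
let right = set [ "b"; "c"; "d"; "e" ]

// The intersection {b, c} has 2 elements, the union {a, b, c, d, e} has 5.
printfn "%.2f" (jaccardOfSets left right) // prints "0.40"
```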

One point that should be mentioned is the difference between symmetric and asymmetric binary variables in the data set.

A binary variable has two states, 0 and 1, and is called symmetric when there is no preference for which outcome should be coded as 0 and which as 1. For example, the binary variable “gender” has the possible states “female” and “male”. Both are equally valuable and carry the same weight when a proximity measure is computed.

On the other hand, a binary variable is asymmetric if its two outcomes are not equally important, such as the positive and negative outcomes of a disease test. By convention, the rarer outcome is coded as 1 (HIV positive) and the other as 0 (HIV negative). For such variables, agreement on 1 says more about similarity than agreement on 0.

Let’s suppose that we have a set of sheets describing professional skills:

// A list of skills
// This employee is a system and data integration expert
let employee1Skills = [".NET"; "C#"; "WCF"; "WF"; "BizTalk"; "SOA"; "BPMN"; "EAI"; "ESB"]
// This employee is a project manager
let employee2Skills = ["PM"; "IPMA"; "PMI"; "PMP"; "PRINCE2"]
// This employee is a system engineer
let employee3Skills = ["SCCM"; "AD"; "DNS"; "DHCP"; "GPO"; "SAN"; "LAN"; "WAN"]

and, further, a set of skills required for a vacant job at a company:

// A company's vacant job for a Network Project Manager
let vacantJob = ["BPMN"; "WF"; "DNS"; "DHCP"; "LAN"; "PM"; "PMP"; "PRINCE2"]

The Jaccard coefficient is now a useful measure of the overlap between the attributes of one of the employee skill sets (which we will call ‘A’) and the vacant job skill set (which we will call ‘B’).

// Prepare the common set operations needed by the Jaccard coefficient function
let jaccardCoeff (first : string list) (second : string list) =
    // All distinct attributes that occur in either list
    let all = Set.union (first |> Set.ofList) (second |> Set.ofList) |> Set.toList
    // For every attribute, record whether it is present in each list
    let firstMatches = all |> List.map (fun t -> first |> List.contains t)
    let secondMatches = all |> List.map (fun t -> second |> List.contains t)
    // Next, the total number of each combination will be calculated

Each attribute of A and B can be either present (which we will code as ‘1’) or absent (which we will code as ‘0’). The total number of each combination of attributes for A and B is specified as follows:

M11 represents the total number of attributes where A and B both have a value of 1.
M01 represents the total number of attributes where the attribute of A is 0 and the attribute of B is 1.
M10 represents the total number of attributes where the attribute of A is 1 and the attribute of B is 0.
M00 represents the total number of attributes where A and B both have a value of 0.

    // Now, calculate the total number of each combination
    let zipped = List.zip firstMatches secondMatches
    let M11 = zipped |> List.filter (fun t -> fst t = true && snd t = true ) |> List.length
    let M01 = zipped |> List.filter (fun t -> fst t = false && snd t = true ) |> List.length
    let M10 = zipped |> List.filter (fun t -> fst t = true && snd t = false ) |> List.length
    let M00 = zipped |> List.filter (fun t -> fst t = false && snd t = false ) |> List.length

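Each attribute must fall into one of these four categories, meaning that
M11 + M01 + M10 + M00 = n
where n is the total number of attributes. Because the code builds its attribute list from the union of both skill lists, M00 is always 0 there.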

The Jaccard similarity coefficient, J, is given as:

$$J = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}$$

Jaccard similarity coefficient for asymmetric binary attributes

    // Calculate the Jaccard similarity coefficient
    let J = float M11 / float (M01 + M10 + M11)
    // Return the calculated value
    J
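As a small worked example with assumed data (these two sets are not from the skill sheets above), take A = {C#, WCF, SOA} and B = {SOA, DNS}. The attribute list is the union {C#, WCF, SOA, DNS}. C# and WCF are present only in A, SOA is present in both, and DNS is present only in B:

```latex
\begin{aligned}
M_{11} &= 1, \quad M_{10} = 2, \quad M_{01} = 1, \quad M_{00} = 0, \\
n &= M_{11} + M_{01} + M_{10} + M_{00} = 4, \\
J &= \frac{M_{11}}{M_{01} + M_{10} + M_{11}} = \frac{1}{1 + 2 + 1} = 0.25
\end{aligned}
```

The same value follows from the set view: the intersection has one element and the union has four.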

To use the jaccardCoeff function:

[<EntryPoint>]
let main argv = 
    let simJobEmp1 = jaccardCoeff employee1Skills vacantJob
    let simJobEmp2 = jaccardCoeff employee2Skills vacantJob
    let simJobEmp3 = jaccardCoeff employee3Skills vacantJob
    printfn "Emp1: %f\nEmp2: %f\nEmp3: %f" simJobEmp1 simJobEmp2 simJobEmp3
    0 // Exit code as int

With the skill lists given above, the result shows that the employee working as a Project Manager has the highest similarity, followed by the employee working as a System Engineer:

Emp1: 0.133333
Emp2: 0.300000
Emp3: 0.230769
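To turn the three scores into a ranking, the candidates can be sorted by similarity. The sketch below is self-contained: it repeats the skill lists from above and uses a hypothetical set-based helper named similarity. That helper is assumed to give the same values as jaccardCoeff, because M11 counts the intersection and M01 + M10 + M11 counts the union. It also assumes that the lists are not empty.

```fsharp
// Skill lists copied from the post.
let employee1Skills = [".NET"; "C#"; "WCF"; "WF"; "BizTalk"; "SOA"; "BPMN"; "EAI"; "ESB"]
let employee2Skills = ["PM"; "IPMA"; "PMI"; "PMP"; "PRINCE2"]
let employee3Skills = ["SCCM"; "AD"; "DNS"; "DHCP"; "GPO"; "SAN"; "LAN"; "WAN"]
let vacantJob = ["BPMN"; "WF"; "DNS"; "DHCP"; "LAN"; "PM"; "PMP"; "PRINCE2"]

// Hypothetical helper: intersection size divided by union size.
// Assumes at least one of the two lists is non-empty.
let similarity (first: string list) (second: string list) =
    let a = Set.ofList first
    let b = Set.ofList second
    float (Set.intersect a b |> Set.count) / float (Set.union a b |> Set.count)

// Pair every candidate with a score and sort with the best match first.
let ranking =
    [ ("Emp1", employee1Skills); ("Emp2", employee2Skills); ("Emp3", employee3Skills) ]
    |> List.map (fun (name, skills) -> (name, similarity skills vacantJob))
    |> List.sortByDescending snd

ranking |> List.iter (fun (name, score) -> printfn "%s: %f" name score)
```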

Original Post: https://www.redtoo.com/ch/blog/the-future-of-programming-is-functional/

Do you have problems with C# Tuple class because items are read only?

While I was working on a small Windows Forms tool in C#, which should help me save and load parameters for a command-line application, I ran into the problem that I did not want to save the complete TextBox controls, but only their names and text content. So, I had the requirement to save only a subset of the form fields. After thinking for a while, I came up with the idea of using LINQ in combination with the Tuple class, and I did some research on tuples in general.

In mathematics, a finite sequence of elements is called a tuple, and tuples gained popularity in programming with the rise of functional programming languages. As in mathematics, a variable in functional programming keeps a single value for its whole lifetime. So, tuples are immutable, or in other words “read only”, by design, and implementations such as the Tuple class in C# follow this design.
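For comparison, this is roughly how the same idea looks in a functional language. The following F# sketch uses assumed sample data (a ZIP code and a city name) and made-up names. Instead of changing an element of an existing tuple, a new tuple is built from the old one:

```fsharp
// Hypothetical sample data: a ZIP code and a city name.
let territory = (4153, "Reinach")

// A tuple element cannot be reassigned. Instead, the tuple is taken
// apart and a new tuple is built with the changed element.
let (zip, _) = territory
let renamed = (zip, "Reinach BL")

// The original tuple is unchanged; 'renamed' is a separate value.
printfn "%A" territory
printfn "%A" renamed
```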

Among other uses, tuples are commonly used to return a subset of data. Think about the following use case: there is a list of territory definitions with fields like ZIP code, city, region, population etc., and this list must be filtered so that only a subset of the territory fields is returned.

Here is a code snippet with the class definition mentioned above:

// Define a class to store data
class Territory { public int zip; public string city; public string region; };

With LINQ we can do the filtering quite easily, but it is harder to return only a subset of the fields. Here is a sample of how the filtering could be done:

// Define a sample list of objects to work with
Territory[] myTerritories = new Territory[] {
                                new Territory{zip=4153,city="Reinach",region="BL"},
                                new Territory{zip=8304,city="Wallisellen",region="ZH"},
                                new Territory{zip=3018,city="Bern",region="BE"}};

// Select a complete Territory subset variant 1
var subset1 = from Territory t in myTerritories
              where t.region.StartsWith("B")
              select t;
// Select a complete Territory subset variant 2
var subset2 = myTerritories.Where(t => t.region.StartsWith("B"));

So, this is where tuples come in. To return only a subset of the territory fields, these fields must be stored in new instances of the Tuple class. The result will be a list of tuples instead of a list of territories.

Here is a code snippet that returns a subset of the territory fields as a list of tuples after the filtering:

// Select only a subset of territory fields as a list of Tuples
var subset3 = myTerritories.Where(t => t.region.StartsWith("B"))
                           .Select(t => new Tuple<int, string>(t.zip, t.city));

The problem now is that tuples are immutable (read only), so the following code would not compile:

subset3.ElementAt(0).Item2 = "Reinach BL";

So, a simple solution is to define a new class similar to the Tuple class, with a proper constructor and settable properties, and to use this class instead of the Tuple class. Here is the complete sample code:

using System;
using System.Linq;

class Program
{
    // Define a class to store data
    class Territory { public int zip; public string city; public string region; };

    static void Main(string[] args)
    {
        // Define a sample list of objects to work with
        Territory[] myTerritories = new Territory[] {
                                      new Territory{zip=4153,city="Reinach",region="BL"},
                                      new Territory{zip=8304,city="Wallisellen",region="ZH"},
                                      new Territory{zip=3018,city="Bern",region="BE"}};

        // Select a complete Territory subset variant 1
        var subset1 = from Territory t in myTerritories
                      where t.region.StartsWith("B")
                      select t;
        // Select a complete Territory subset variant 2
        var subset2 = myTerritories.Where(t => t.region.StartsWith("B"));

        // Select only a subset of territory fields as a list of Tuples
        var subset3 = myTerritories.Where(t => t.region.StartsWith("B"))
                                   .Select(t => new Tuple<int, string>(t.zip, t.city));

        // Because tuples are immutable (read only), this line
        // would not compile and is therefore commented out:
        /* subset3.ElementAt(0).Item2 = "Reinach BL"; */

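        // Select only a subset of territory fields as a list of entities.
        // ToList() materializes the query. Without it, every ElementAt call
        // would re-run the query and return a freshly created Entity,
        // so the assignment below would be lost.
        var subset4 = myTerritories.Where(t => t.region.StartsWith("B"))
                                   .Select(t => new Entity<int, string>(t.zip, t.city))
                                   .ToList();

        // Because Entity is mutable, this works:
        subset4.ElementAt(0).Item2 = "Reinach BL";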
    }
}
public class Entity<T1, T2>
{
    public Entity(T1 t1, T2 t2)
    {
        Item1 = t1;
        Item2 = t2;
    }
    public T1 Item1 { get; set; }
    public T2 Item2 { get; set; }
}
public class Entity<T1, T2, T3>
{
    public Entity(T1 t1, T2 t2, T3 t3)
    {
        Item1 = t1;
        Item2 = t2;
        Item3 = t3;
    }
    public T1 Item1 { get; set; }
    public T2 Item2 { get; set; }
    public T3 Item3 { get; set; }
}

So, I hope you had fun and please leave a comment.

Original Post: https://www.redtoo.com/ch/blog/do-you-have-problems-with-c-tuple-class-because-items-are-read-only/