Sunday, August 25, 2013

Chapter 3 Scala Pattern Matching, case classes and Structural Types

As Java programmers, we all know the limitations of the switch statement when it comes to pattern matching.

In Java, to match on the type of an object, we have two choices:
1. Clutter the code with instanceof conditions
2. Use the visitor pattern, which needs an interface, a visitor implementation for each type and the visit method.

Scala's pattern matching is much more powerful. A lot of the code written for visitor classes is not required in Scala.
A usual example is printing an employee directory. For a manager, we have to print the number of reports and for an employee we don't. In Java we would write either an instanceof check or a visitor to achieve this task. In most cases we would write visitors to keep the design open.
All this boilerplate code can be replaced by something very concise in Scala.
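As a rough sketch of the employee directory idea, assuming two hypothetical classes Employee and Manager (these names and fields are for illustration only), a single match expression can do the work of the instanceof chain or the visitor:

```scala
object DirectoryDemo {
  // Hypothetical types, for illustration only.
  class Employee(val name: String)
  class Manager(name: String, val reports: Int) extends Employee(name)

  // One match expression stands in for the instanceof chain or the visitor.
  def directoryLine(e: Employee): String = e match {
    case m: Manager => m.name + " (manager, " + m.reports + " reports)"
    case other      => other.name
  }

  def main(args: Array[String]): Unit = {
    val staff = List(new Employee("Mike"), new Manager("Larry", 3))
    staff.foreach(e => println(directoryLine(e)))
  }
}
```

The Manager case has to come first, because a Manager would also match the more general case.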

Pattern matching can be used to match on the type as well as on the structural components of an object.
The class should be defined as a case class for structure-based pattern matching.
Let us take a look at such an example.

Let us define Person and extend it to Manager.

case class Person (var id: String, var name:String, var citizen:String)

class Manager(id: String, name:String, citizen: String, val reports: Int)
extends Person(id,name, citizen)
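To get a feel for what the case keyword buys us, here is a minimal sketch using the same Person shape. The region helper is a hypothetical name introduced only for this illustration.

```scala
object CaseClassDemo {
  // Same shape as the Person case class above.
  case class Person(var id: String, var name: String, var citizen: String)

  // The pattern relies on the extractor the compiler generates for a case class.
  // `region` is a made-up helper for this sketch.
  def region(p: Person): String = p match {
    case Person(_, _, "USA") => "US"
    case Person(_, _, other) => "non-US: " + other
  }

  def main(args: Array[String]): Unit = {
    val a = Person("10", "Larry", "USA") // no `new` needed: generated apply
    val b = Person("10", "Larry", "USA")
    println(a == b)    // compares field values, via the generated equals
    println(a)         // uses the generated toString
    println(region(a))
  }
}
```

A plain class would need all of this (equals, hashCode, toString, the factory and the extractor) written by hand.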

Now let us write a class that prints the number of reports for a Manager, validates USA citizens by calling the FBI fingerprint and SSN checks, and validates Euro citizens by calling the Euro validation.

1. Static methods in Java, companion object methods in Scala
Scala has no static methods like Java does; static members do not belong to any object, which goes against the OO concepts.
In Scala, a companion object (an object with the same name as the class) is defined to hold what would be static methods in Java.
In Java, we would typically define such validation methods as static methods. In Scala we define them as follows.
class ValidatePersons {
 
}
object ValidatePersons {
  def validateSsn(ssn: String): Boolean = { println("Will validate SSN"); true }

  def FBIFingerPrints(ssn: String): Boolean = { println("Will validate FBI Fingerprints"); true }

  def validateEuroPid(eid: String): Boolean = { println("Will validate Euro PID"); false }

}
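A small sketch of how a companion object is used from calling code. The isValidId rule is a made-up placeholder, not part of the validation code above.

```scala
object CompanionDemo {
  // Hypothetical example class; only the class/companion pairing matters here.
  class Person(val id: String, val name: String)

  // Companion object: same name as the class, defined in the same file.
  object Person {
    // Called as Person.isValidId(...), the way a Java static method would be.
    // The rule itself is a placeholder for this sketch.
    def isValidId(id: String): Boolean = id.nonEmpty && id.forall(_.isDigit)
  }

  def main(args: Array[String]): Unit = {
    println(Person.isValidId("20"))
    println(Person.isValidId("2x"))
  }
}
```

No instance of Person is needed to call the method, which is what makes it feel like a static method to a Java programmer.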


Now let us see the power of pattern matching in Scala. A Java programmer will see right away how much boilerplate code is avoided here.

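  import scala.collection.mutable.ListBuffer

  def validateCredentials(person: Person): Boolean = {

    // Lists of validation functions, each taking a String and returning a Boolean
    var usalist = new ListBuffer[(String) => Boolean]
    usalist += ValidatePersons.FBIFingerPrints
    usalist += ValidatePersons.validateSsn

    var eurolist = new ListBuffer[(String) => Boolean]
    eurolist += ValidatePersons.validateEuroPid

    // The match is the last expression, so its value is the result of the method
    person match {

      // Matches on the type only
      case person: Manager =>
        println(" Manager Record : Reports " + person.reports)
        true

      // Matches on the structure: any Person whose citizen field is "USA"
      case Person(id, name, "USA") =>
        println("Person record in USA " + person.name)
        for (func <- usalist) applyValidations(func, id)
        true

      case Person(id, name, "EURO") =>
        println("Person record in EURO " + person.name)
        for (func <- eurolist) applyValidations(func, id)
        true

      // Any other person; without this case the match would throw a MatchError
      case _ => false
    }
  }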

  def applyValidations(f: (String) => Boolean, v: String) = f(v)

}

Points to note here are

1. Look at the declaration     var usalist = new ListBuffer[(String) => Boolean]
    usalist holds a list of functions, each taking a String as input and returning a Boolean.
    (String) => Boolean is a function type, shorthand for Function1[String, Boolean].

2.    Scala's counterpart to the switch statement for the variable person is person match {
        There is no switch keyword and no break between the cases; a case never falls through to the next one.

3. We see two flavors of match.
    For a manager, the match is on the type only: case person: Manager =>
4. The next two cases match on the structure of the person object, which requires the class to be declared as a case class. If you run javap on this class, you will see hashCode and equals implemented.

5. The code just loops through the validation functions stored in the list (usalist or eurolist) and applies each of them to the id. This is what applyValidations does: def applyValidations(f: (String) => Boolean, v: String) = f(v)
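The loop described in point 5 applies each function but does not look at the results. A possible variation, assuming the intent is that every check must pass, is sketched below. The checks nonEmpty and allDigits and the helper passesAll are hypothetical stand-ins, not the validations from the post.

```scala
import scala.collection.mutable.ListBuffer

object ValidationListDemo {
  // Stand-in checks (hypothetical), written as function values of type String => Boolean.
  val nonEmpty: String => Boolean = id => id.nonEmpty
  val allDigits: String => Boolean = id => id.forall(_.isDigit)

  // Same shape as applyValidations in the post: apply one function to one value.
  def applyValidations(f: String => Boolean, v: String): Boolean = f(v)

  // True only if every function in the list accepts the id.
  def passesAll(checks: Iterable[String => Boolean], id: String): Boolean =
    checks.forall(f => applyValidations(f, id))

  def main(args: Array[String]): Unit = {
    val checks = new ListBuffer[String => Boolean]
    checks += nonEmpty
    checks += allDigits
    println(passesAll(checks, "12345"))
    println(passesAll(checks, "12a45"))
  }
}
```

forall stops at the first function that returns false, so later checks are skipped once one fails.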


In the next posts, I would like to explore the power of higher-order functions, contravariance and covariance in Scala.


Sunday, August 11, 2013

Chapter 2: Scala Features for Java Developer

In this post, I am going to walk through the Scala features that help write succinct and intuitive code. As a Java developer I appreciate the productivity they bring.

1. Concise Class definition
2. Type Inference
3. Defining mutable/immutable objects by declaration
4. Static Typing
5. Function definitions
6. Traits

1. Concise Class definition

I always thought that my beans were too verbose due to the getters and setters.
Let us define a class Person, with attributes id, name and citizen, in Scala.

All Java programmers know how many lines of code we would write just to define and access these values.
The same code in Scala is

class Person (var id: String, var name:String, var citizen:String)

Notice that
    there are no explicit getters and setters (since the attributes are declared var, they are mutable and the compiler generates both accessors).
    The body is completely optional for a class.

2. Type inference

Now let us instantiate a new Person, assign a new id to it and print it.

1. var p = new Person("20", "Mike", "EURO")
2. p.id = "30"
3. println (p.id)

We did not have to specify the type of the variable p; the Scala compiler infers its type from the value assigned to it.
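A short sketch of inference at work. The summary method and the extra values are made up for illustration; the explicit annotations on the last two values show that the inferred types are ordinary static types.

```scala
object InferenceDemo {
  class Person(var id: String, var name: String, var citizen: String)

  // `summary` is a hypothetical helper for this sketch.
  def summary(): String = {
    val p = new Person("20", "Mike", "EURO") // inferred: Person
    val count = 3                             // inferred: Int
    val names = List("Mike", "Larry")         // inferred: List[String]

    // Writing the types out is equivalent and is checked by the compiler.
    val q: Person = p
    val total: Int = count + names.length
    q.name + " " + total
  }

  def main(args: Array[String]): Unit = println(summary())
}
```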

3. Making objects mutable and/or immutable.

Let us say we wanted to make the Person class immutable. In Java we would have to take several steps:
remove the setters, make the fields final and so on (a favorite interview question).

In Scala, this needs just a change in the declaration from var to val: class Person (val id: String, val name:String)
var declares a variable; val declares a value, which cannot be reassigned and is hence immutable.

Now the statement at line 2 above will not compile. There is no setter available for a val.

To see the code generated by the compiler, go to the bin directory where the class files are generated. The output below is for a Person declared with val id and var name, so that both cases show up. If you run

scalap Person, you see
class Person extends scala.AnyRef {
  val id : scala.Predef.String = { /* compiled code */ }
  var name : scala.Predef.String = { /* compiled code */ }
  def this(id : scala.Predef.String, name : scala.Predef.String) = { /* compiled code */ }
}

To see the Java view of the same class, just run
javap Person.class and you will see

public class org.manu.blog.Person {
  public java.lang.String id();
  public java.lang.String name();
  public void name_$eq(java.lang.String);
  public org.manu.blog.Person(java.lang.String, java.lang.String);
}

As you can see, name has a getter and a setter, while id has just the getter defined.
Later we will see what $eq means in java code and what Predef means in the scala code.
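As a quick preview, name_$eq in the javap output is the encoded form of the Scala method name name_=, which is the setter generated for a var. The sketch below calls it directly; the rename helper is a hypothetical name used only here.

```scala
object SetterDemo {
  // Same declaration that produced the scalap/javap output: val id, var name.
  class Person(val id: String, var name: String)

  // `rename` is a made-up helper for this sketch.
  def rename(p: Person, newName: String): Person = {
    p.name_=(newName) // the call that an assignment p.name = newName is translated to
    p
  }

  def main(args: Array[String]): Unit = {
    val p = rename(new Person("10", "Larry"), "Mike")
    println(p.name)
    // There is no id_= method, so p.id = "20" would not compile.
  }
}
```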

4. Static Typing

Let us define another class Manager: class Manager (val reports: Int)

Now let us try the code below

      var p = new Person ("10", "Larry")
      p.id = "20"
      println (p.id)
      p = new Manager(3)
The compiler reports an error on the last line, since p was inferred to be of type Person and this Manager is not a Person.

Type inference does not make Scala a dynamically typed language. The types are still fixed at compile time, so all the refactoring tools still work.
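To see that the inferred type is a real static type, here is a sketch under one assumption that differs from the text: Manager extends Person, so that the reassignment is legal. The currentName helper is hypothetical.

```scala
object StaticTypingDemo {
  class Person(val id: String, val name: String)

  // Assumption for this sketch: Manager extends Person, unlike the standalone Manager above.
  class Manager(id: String, name: String, val reports: Int) extends Person(id, name)

  def currentName(): String = {
    var p = new Person("10", "Larry")  // static type inferred as Person
    p = new Manager("30", "Mike", 3)   // allowed: a Manager is a Person here
    // p.reports would not compile: the static type of p is still Person.
    // p = "Larry" would not compile either: a String is not a Person.
    p.name
  }

  def main(args: Array[String]): Unit = println(currentName())
}
```

The compiler decides what p can hold when the variable is declared, not when the program runs.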

5. Function Definitions

Functions are the foundation of functional programming. To explore its power, let us define some functions. I have person data consisting of US citizens and European citizens. For a US citizen I will validate the SSN; for a European person, I need to validate the European passport.
I define a trait called ValidatePersons as follows


trait ValidatePersons {
  def validateSsn(ssn: String): Boolean = { println("Will validate SSN"); true }
  def FBIFingerPrints(ssn: String): Boolean = { println("Will validate FBI Fingerprints"); true }
  def validateEuroPid(eid: String): Boolean = { println("Will validate Euro PID"); false }
}

We will look at traits in detail later; for now, think of a trait as an interface with implemented methods.
Scala defines functions starting with the keyword def.
The important thing to note is that these functions return Boolean, and not boolean.
This is because everything is an object in Scala: the language has no separate primitive types, and operators are methods.
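The claim that operators are methods can be checked directly. In this sketch (sum and both are made-up names), the usual infix operators are written as ordinary method calls:

```scala
object OperatorsDemo {
  // a.+(b) is the method-call spelling of a + b
  def sum(a: Int, b: Int): Int = a.+(b)

  // a.&&(b) is the method-call spelling of a && b
  def both(a: Boolean, b: Boolean): Boolean = a.&&(b)

  def main(args: Array[String]): Unit = {
    println(sum(1, 2))
    println(both(true, false))
  }
}
```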

6. Traits

This is another flexible feature. Traits are like interfaces with some implemented methods. In Java, interfaces are married to the class at its definition: once implemented for a class, an interface is always there for all its objects.
In Scala it is different; a trait can be attached to an individual object when it is created.

Let us see how it works.
  1.       var p = new Person("20", "Mike", "EURO") with ValidatePersons;
  2.       p  = new Person("10", "Larry", "USA") ;
  3.       p.validateSsn(p.id)
 This code does not compile. The type of p is inferred as Person with ValidatePersons, so the plain Person at line 2 cannot be assigned to it; a plain Person does not have the validateSsn method.
Now change it to the following and it will start working. I like the fact that one can mix in a trait at the time an object is created.
It is not married to the class.
  1.       var p = new Person("20", "Mike", "EURO") with ValidatePersons;  
  2.       p  = new Person("10", "Larry", "USA") with ValidatePersons;
  3.       p.validateSsn(p.id)
Right now, my functions are not very different from methods in terms of usage. In the next post I will explore the power of passing functions around.
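Here is a self-contained sketch of the mixin example, with the type of p written out to show what the compiler infers. The body of validateSsn is a placeholder rule and check is a hypothetical helper.

```scala
object MixinDemo {
  class Person(var id: String, var name: String, var citizen: String)

  trait ValidatePersons {
    // Placeholder rule for this sketch, not a real SSN check.
    def validateSsn(ssn: String): Boolean = ssn.nonEmpty
  }

  def check(): Boolean = {
    // The declared type is what inference would produce from the right-hand side.
    var p: Person with ValidatePersons =
      new Person("20", "Mike", "EURO") with ValidatePersons
    p = new Person("10", "Larry", "USA") with ValidatePersons
    p.validateSsn(p.id)
  }

  def main(args: Array[String]): Unit = {
    println(check())
    val plain = new Person("30", "Ann", "USA")
    // plain.validateSsn(plain.id) would not compile: the trait was not mixed in.
    println(plain.name)
  }
}
```

Only the objects created with the trait get its methods; the Person class itself is unchanged.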

Sunday, August 4, 2013

How I decided to learn Scala?

I started coding in Java in 1999 when I was working at Sun Microsystems. It was the time when servlets were the newest and hottest technology. Before that I was developing Perl CGI applications, was feeling the limitations of the Perl paradigm and wanted to shift to a new one, and OOP felt like a great option.

All these years, though, I never felt comfortable with the Java threading model. For concurrent code the developer has to delve into lower levels: to set up thread priorities, for example, you have to know how the operating system works, how the ++ operator is implemented, and how multi-core and single-core configurations differ. I was also writing way too much boilerplate code; even the code for a simple bean is too bulky with all its getters and setters. The tools can help generate the code, but the classes are still too bulky. Six months ago I was writing a piece of code for a DSL, where the idea was to use methods written by business analysts. It was not that easy to code in Java; Groovy made it work faster.


Considering all this, I thought it was time to explore other languages. I had of course been hearing a lot about Scala, and felt like it would be the best choice, but I wanted to make sure that it really was the best option for me. At the same time one of my friends had become a champion for learning Ruby, and another friend was complaining about the performance limitations he hit with JRuby for a large application. I was therefore looking for some objective information on the popular languages, and voilà, there it was

Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages (Pragmatic Programmers) 

The book covered Ruby, Io, Prolog, Scala, Erlang, Clojure and Haskell.
From the book, one of the best books I have read in a while, I slowly started to come to a conclusion:

Ruby is all about programmer productivity: very flexible (open classes, method_missing), object oriented with functional constructs, and great for bringing a product to market very quickly. Its concurrency support is not very evolved. Duck typing is good, but the absence of static typing can be a problem for IDEs and refactoring tools, and I am almost addicted to these tools and cannot imagine development without them. I also gathered that my friend's experience with Ruby performance is not all that unusual, rather expected.
Io is a purely prototype-based language with a very small syntax and no syntactic sugar; it reminded me of my graduate work using Lisp. On one hand, you don't have to spend a lot of time learning the syntax, but on the other hand, you have to understand many concepts, and I did not find a large community. It helped me understand some JavaScript concepts better, though I don't really develop much in JavaScript.
Prolog: I particularly enjoyed reading about Prolog, out of nostalgia for my graduation thesis, which was in the area of Indian music recognition. Prolog is of course the best choice when you can model your domain in rules, particularly for artificial intelligence. The best way to use Prolog would be inside a main application. It is not based on the JVM.
Erlang, described in three words, is Concurrency, Concurrency and Concurrency. It is also extremely reliable, since reliability was a major requirement for Ericsson's telecom applications. For a Java developer like me, it would be a totally new game. I am not sure about the community support, or how easy it would be for me to debug and dig into language features.
Clojure, from what I gathered from the book, is very powerful and flexible. From the syntax and examples I thought that to exploit its power, one has to spend a long time learning the language, and then I don't know how easy it would be to find a bunch of such programmers to build a project.
Haskell follows a pure functional paradigm, and it is not that easy for someone like me to shift my thinking away from the object oriented style. I think I would have to spend more time on design with Haskell; for most domains object oriented design fits so well.

In a nutshell, I would learn any one of the languages mentioned above only if the app I want to develop warrants it. E.g. to bring a web product very quickly to the market I will probably go for Ruby; to develop an extremely reliable, concurrent product I will think of using Erlang; for an artificial intelligence product, Prolog would probably be my first choice.

Right now I want to learn a language that I can pick up rather quickly, that resembles Java, has less boilerplate and less bulky code, makes concurrency features easier to understand and implement, can use existing Java libraries like JPA and JDBC and even home-grown code, essentially runs on the JVM, and lets me think in the OO paradigm. And yes, Scala fits the bill.
Scala has coexisting functional and OO paradigms, it is easier to learn for a Java programmer compared to Haskell or Erlang, it is a very general purpose language, it facilitates DSLs, and its static typing lets me use the IDEs and refactoring tools.

Once I came to this conclusion, I started looking for some good resources to learn Scala. It turned out that, although its learning curve is comparatively shorter, Scala too needs some serious reading and trying out of programs. I have finally settled on the book Scala in Action by Nilanjan RayChaudhuri.

I want to share my journey with Scala on this blog. In the next post, I plan to cover the basic features of Scala.