Scala 101
Scala is a programming language created by Martin Odersky in Switzerland, released internally in 2003 and publicly in early 2004. It caters for both the object-oriented and functional paradigms. It runs on the JVM and interoperates with Java, meaning Scala and Java source files can live side by side in the same project. It is the language Apache Spark is implemented in. In this article I will document the basics needed to get started with Scala.
Below is a list of some of the resources consulted for this article:
- Get Programming with Scala (Daniela Sfregola, Manning Books)
- Apache Spark 2.0 with Scala (Frank Kane, Udemy)
- Rock the JVM! Scala and Functional Programming for Beginners (Daniel Ciocîrlan & Andrei Taleanu, Udemy)
I will be using Jupyter Notebooks, configured to use a Scala Kernel, to illustrate the basics of using Scala.
First up, Hello World:
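A minimal sketch of such a cell, assuming a notebook or REPL where top-level statements are allowed (the name `greeting` is an illustrative choice):

```scala
// A string value; the type String is inferred, so no annotation is needed
val greeting = "Hello, World!"

// Print it; note there is no semicolon at the end of the line
println(greeting)
```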
The above expressions already illustrate a few concepts of Scala: firstly, although statically typed, Scala uses **type inference** to deduce types. Also, semicolons (;) are not required at the end of expressions.
Immutability, although it can incur some performance penalties, makes code easier to reason about. In addition, because immutable data can be shared between threads without locking, it sidesteps concurrency problems such as race conditions and lock-related deadlock and starvation. Scala separates immutable and mutable structures, and encourages the use of immutable structures wherever possible. The keyword val is used to define immutable values in Scala:
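As a short sketch, here is `val` contrasted with its mutable counterpart `var` (the names `pi` and `counter` are illustrative):

```scala
val pi = 3.14159          // immutable: cannot be reassigned
// pi = 3.0               // would not compile, because a val cannot be reassigned

var counter = 0           // mutable variable, declared with var
counter = counter + 1     // reassignment is allowed for a var
```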
Types:
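A small sketch of the common built-in types, with explicit type annotations added purely for illustration (the value names are arbitrary):

```scala
val count: Int = 42
val big: Long = 42L          // the L suffix marks a Long literal
val ratio: Double = 3.14
val small: Float = 3.14f     // the f suffix marks a Float literal
val flag: Boolean = true
val letter: Char = 'a'       // single quotes for a Char
val name: String = "Scala"   // double quotes for a String
val tiny: Byte = 127
```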
(Notice types are capitalized, e.g. Int, vs. int in Java)
Scala supports lazy evaluation: the initialization of a value declared with lazy val is delayed until the value is first used. This is illustrated in the following example:
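One way to make the delay visible is to record a side effect inside the initializer. This is a sketch only; the `evaluated` flag and the other names are hypothetical helpers:

```scala
var evaluated = false     // records whether the initializer has run

lazy val expensive: Int = {
  evaluated = true        // side effect that makes the evaluation visible
  21 * 2
}

val before = evaluated    // false: nothing has used `expensive` yet
val result = expensive    // first use triggers the initializer
val after = evaluated     // true: the initializer has now run
```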
Conditional execution:
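A brief sketch, using an illustrative `temperature` value. Note that `if`/`else` yields a value in Scala, so it can be assigned directly:

```scala
val temperature = 25

// if/else is an expression: it yields a value that can be assigned
val description =
  if (temperature > 30) "hot"
  else if (temperature > 15) "mild"
  else "cold"

// It can also be used purely for its side effects
if (temperature > 30) {
  println("Stay inside")
} else {
  println("Go for a walk")
}
```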
Scala's version of the switch/case statement is the match expression:
Imperative for and while loops are discouraged in idiomatic Scala. Functional constructs such as foreach and map should be used instead.
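A minimal sketch of pattern matching on an integer (the `day` value and the day names are illustrative):

```scala
val day = 3

// match is an expression, so its result can be assigned
val dayName = day match {
  case 1 => "Monday"
  case 2 => "Tuesday"
  case 3 => "Wednesday"
  case _ => "Some other day"   // wildcard: the default case
}
```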
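The contrast could be sketched as follows, with an illustrative `numbers` list:

```scala
val numbers = List(1, 2, 3, 4)

// Imperative style: mutate an accumulator inside a loop
var total = 0
for (n <- numbers) total += n

// Functional style: transform and iterate without mutation
val doubled = numbers.map(n => n * 2)   // List(2, 4, 6, 8)
numbers.foreach(n => println(n))        // prints each element on its own line
```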
Scala caters for outputting formatted text using printf:
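A short sketch with illustrative values. The same Java-style format string can also build a String through the `format` method instead of printing it:

```scala
val user = "Ada"
val age = 36

// printf writes the formatted text straight to standard output
printf("%s is %d years old%n", user, age)

// format returns the String; %5d right-aligns, %-5s left-aligns
val padded = "%5d|%-5s|".format(age, user)   // "   36|Ada  |"
```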
Variables can be substituted in a string by prefixing the string with an s, and prefixing variables with the $ sign:
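For instance, with illustrative `name` and `version` values (wrapping in `${...}` allows a whole expression rather than a single variable):

```scala
val name = "Scala"
val version = 3

val simple = s"Hello, $name"                      // "Hello, Scala"
val computed = s"Next version: ${version + 1}"    // "Next version: 4"
```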
Scala provides the placeholder ??? for code that has not been written yet. It type-checks anywhere, but throws a NotImplementedError if it is executed:
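A sketch of how this might look, using a hypothetical `computeTax` function that has a signature but no body yet:

```scala
// Compiles even though there is no implementation yet
def computeTax(amount: Double): Double = ???

// Calling it fails at runtime with scala.NotImplementedError
val message =
  try {
    computeTax(100.0)
    "computed"
  } catch {
    case _: NotImplementedError => "not implemented yet"
  }
```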
Due to the functional nature of Scala, expressions can be passed as values.
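As a small illustration (names are arbitrary), a block in braces is itself an expression whose value is that of its last line, and an `if`/`else` can be handed straight to a function:

```scala
// A block is an expression; its value is that of its last line
val area = {
  val width = 3
  val height = 4
  width * height
}

// if/else also yields a value, so it can be used directly as an argument
println(if (area > 10) "large" else "small")
```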
Functions are first-class citizens in Scala. They can be passed in as parameters to other functions, and functions can also return other functions. The following is an example of a function definition:
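A sketch covering all three cases; the function names `square`, `applyTwice` and `adder` are illustrative:

```scala
// A function definition: parameter types are required, the result type follows the colon
def square(x: Int): Int = x * x

// A higher-order function: takes another function as a parameter
def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

// A function that returns another function
def adder(n: Int): Int => Int = (x: Int) => x + n

val sixteen = applyTwice(square, 2)   // square(square(2)) = 16
val addFive = adder(5)
val eight = addFive(3)                // 3 + 5 = 8
```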
Lambda (anonymous) functions are also supported:
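For example (illustrative names), a lambda can be stored in a value or written inline, where the underscore is shorthand for its single parameter:

```scala
// A lambda stored in a value
val triple = (x: Int) => x * 3

val tripled = List(1, 2, 3).map(triple)           // List(3, 6, 9)
val evens = List(1, 2, 3, 4).filter(_ % 2 == 0)   // List(2, 4)
```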
Tuples are immutable, fixed-size groupings of values in Scala:
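A minimal sketch with an illustrative `person` tuple:

```scala
val person = ("Ada", 36, true)   // inferred as (String, Int, Boolean)

val name = person._1   // "Ada": the first element is _1, not _0
val age = person._2    // 36

// Destructuring assigns every element in one step
val (n, a, active) = person
```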
(Notice that tuple accessors are ONE-BASED: the first element is _1.)
Tuples can contain different types, but are limited to 22 elements in Scala 2 (Scala 3 lifts this limit).
Map:
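A short sketch of creating and reading an immutable Map (the `capitals` data is illustrative):

```scala
val capitals = Map("France" -> "Paris", "Japan" -> "Tokyo")

// Direct lookup; throws NoSuchElementException if the key is missing
val paris = capitals("France")

// Adding an entry returns a new Map; the original is unchanged
val extended = capitals + ("Kenya" -> "Nairobi")
```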
If you are not sure the key exists you can use this:
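One common option is `getOrElse`, which takes a fallback value. `get` and `contains` are alternatives, sketched here with illustrative data:

```scala
val capitals = Map("France" -> "Paris")

val known = capitals.getOrElse("France", "unknown")       // "Paris"
val missing = capitals.getOrElse("Atlantis", "unknown")   // "unknown"

val maybe = capitals.get("Atlantis")       // None: get returns an Option
val exists = capitals.contains("France")   // true
```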
Maps can be created by passing in tuples.
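A brief sketch with illustrative names. The arrow syntax `"a" -> 1` is simply another way of writing the tuple `("a", 1)`:

```scala
val pairs = List(("a", 1), ("b", 2))

val fromTuples = Map(("a", 1), ("b", 2))   // tuples passed directly
val fromList = pairs.toMap                 // a collection of pairs converted to a Map

// "a" -> 1 builds the same tuple as ("a", 1)
val sameTuple = ("a" -> 1) == (("a", 1))
```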
**Sequences**, part of Scala's collections hierarchy, are data structures that have a well-defined order and can be indexed. The following operations can be applied to them:
- apply, iterator, length, reverse
- Concatenate, append, prepend
- Group, sort, zip, search, slice
- map, flatMap, filter
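The operations listed above can be sketched on a small `Seq` (all names are illustrative):

```scala
val xs = Seq(3, 1, 2)

val second = xs(1)                         // apply: 1
val size = xs.length                       // 3
val reversed = xs.reverse                  // Seq(2, 1, 3)
val joined = xs ++ Seq(4)                  // concatenate: Seq(3, 1, 2, 4)
val withHead = 0 +: xs                     // prepend: Seq(0, 3, 1, 2)
val withLast = xs :+ 9                     // append: Seq(3, 1, 2, 9)
val sorted = xs.sorted                     // Seq(1, 2, 3)
val zipped = xs.zip(Seq("c", "a", "b"))    // Seq((3,"c"), (1,"a"), (2,"b"))
val sliced = xs.slice(1, 3)                // indices 1 and 2: Seq(1, 2)
val grouped = xs.groupBy(_ % 2 == 0)       // Map(false -> Seq(3, 1), true -> Seq(2))
val mapped = xs.map(_ * 10)                // Seq(30, 10, 20)
val flat = xs.flatMap(x => Seq(x, x))      // Seq(3, 3, 1, 1, 2, 2)
val odds = xs.filter(_ % 2 == 1)           // Seq(3, 1)
val where = xs.indexOf(2)                  // search: index 2
```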
Lists are a type of sequence, implemented as a singly-linked list. head, tail, isEmpty and prepending are fast: O(1). Most other operations, such as indexed access, length and append, are O(n).
List methods:
- ++ To concatenate
- .reverse
- .sorted
- .distinct
- .max
- .sum
- .contains
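A sketch of these methods on two illustrative lists, followed by the constant-time operations:

```scala
val a = List(3, 1, 2)
val b = List(2, 5)

val both = a ++ b                // List(3, 1, 2, 2, 5)
val backwards = a.reverse        // List(2, 1, 3)
val ordered = a.sorted           // List(1, 2, 3)
val unique = both.distinct       // List(3, 1, 2, 5)
val largest = both.max           // 5
val total = both.sum             // 13
val hasFive = both.contains(5)   // true

// head, tail and prepending with :: are the constant-time operations
val first = a.head               // 3
val rest = a.tail                // List(1, 2)
val longer = 0 :: a              // List(0, 3, 1, 2)
```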
Array:
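A minimal sketch of creating and indexing an Array (the `letters` data is illustrative):

```scala
val letters = Array("a", "b", "c")

val first = letters(0)                    // "a"
val last = letters(letters.length - 1)    // "c"
```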
Notice, Arrays are ZERO-INDEXED.
Arrays can be mutated:
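For example, with an illustrative `scores` array. Note that `val` only prevents reassigning the name; the contents can still be updated in place:

```scala
val scores = Array(10, 20, 30)

scores(1) = 25                            // in-place update of the element at index 1
val snapshot = scores.toList              // List(10, 25, 30)

// Sequence methods such as map still return a new collection
val bumped = scores.map(_ + 1).toList     // List(11, 26, 31)
```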
Arrays also have access to all the sequence methods above.
Vector:
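Used for immutable sequences. Indexed reads and updates are effectively constant time (O(log n) with a large branching factor); because a Vector is immutable, an update returns a new Vector. Append and prepend are fast. It offers good performance for large data sizes.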
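A short sketch of these operations on an illustrative Vector:

```scala
val v = Vector(1, 2, 3)

val second = v(1)                // indexed read: 2
val changed = v.updated(1, 20)   // "write" returns a new Vector(1, 20, 3)
val appended = v :+ 4            // Vector(1, 2, 3, 4)
val prepended = 0 +: v           // Vector(0, 1, 2, 3)
// v itself is still Vector(1, 2, 3)
```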
The following example (taken from Get Programming with Scala) illustrates the implementation of a class in Scala. It also shows access modifiers (i.e. private):
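As a stand-in sketch of the same ideas, and not the book's listing, here is a hypothetical `Counter` class with one private field:

```scala
class Counter(start: Int) {
  // private: only code inside Counter can read or change this field
  private var count: Int = start

  def increment(): Unit = count += 1   // a method with a side effect
  def current: Int = count             // read-only access to the private state
}

val c = new Counter(10)
c.increment()
c.increment()
val value = c.current   // 12
// c.count              // would not compile: count is private
```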