Our Sample Problem
Let's say we need to write a program which:
- Prompts the user for a file path
- Opens the file and begins reading lines
- Interprets each line as a URL
- Attempts an HTTP GET against each URL
- Prints the resulting status code (or else an error message)
Crucially, we want to keep going in the event of failure, so erroneous values should not halt the program. Additionally, the final version will need some kind of rich UI, so we can begin with a console application but cannot scatter print statements throughout the code.
With a little thought we can foresee the following error conditions:
- The user may supply an invalid file path
- A line in the file may not be a valid URL
- A networking error may occur when we request the URL
Some types to help us
Consider the type declarations below, which use standard FP types as implemented in the Arrow library. It may help to read them out loud.
typealias InvalidTextOrUri = Validated<String, URI>
typealias ExceptionOrInputs = Either<Exception, Set<InvalidTextOrUri>>
typealias InvalidTextOrFuture = Either<String, CompletableFuture<String>>
We have declared shorthands for three very useful types, which the compiler can use to prevent us shooting ourselves in the foot.
Our types are as follows:
- An InvalidTextOrUri is made up of a URI (in the valid case) or a String containing an error message (in the invalid case)
- An ExceptionOrInputs is made up of either an Exception (should our file IO fail) or a set of the type declared above
- An InvalidTextOrFuture is made up of either a String containing an error message or a Future which will hold a result
These types will guide us as we implement the functions required. I will introduce them one at a time, but all the code can be found at this Git repository. Note that in FP the convention is that 'right is right'. In a Validated the type on the right represents the correct result. In an Either the type on the right is the preferred one; the left is typically an error, a default, a placeholder or a stale value.
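To make the 'right is right' convention concrete, here is a minimal sketch that does not depend on Arrow. SimpleEither, checkLine and describe are names invented for this illustration; they are a hand-rolled stand-in, not Arrow's implementation and not code from the repository.

```kotlin
import java.net.URI

// SimpleEither is a hypothetical stand-in for illustration, not Arrow's Either.
sealed class SimpleEither<out L, out R> {
    data class Left<out L>(val value: L) : SimpleEither<L, Nothing>()
    data class Right<out R>(val value: R) : SimpleEither<Nothing, R>()

    // fold takes one handler per case, so neither case can be forgotten.
    fun <T> fold(ifLeft: (L) -> T, ifRight: (R) -> T): T = when (this) {
        is Left -> ifLeft(value)
        is Right -> ifRight(value)
    }
}

// Hypothetical helper: the error message goes on the left, the URI on the right.
fun checkLine(line: String): SimpleEither<String, URI> =
    if (line.startsWith("http://")) SimpleEither.Right(URI(line))
    else SimpleEither.Left("$line is not a URL")

fun describe(result: SimpleEither<String, URI>): String =
    result.fold({ "Invalid: $it" }, { "Valid: ${it.host}" })

fun main() {
    println(describe(checkLine("http://instil.co")))  // Valid: instil.co
    println(describe(checkLine("www.twitter.com")))   // Invalid: www.twitter.com is not a URL
}
```

Because the left handler receives a String and the right handler receives a URI, swapping the two lambdas passed to fold would fail to compile (a String has no host property).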
The functions to read from the file
If you examine the code below you can clearly see how we create the Valid / Invalid cases for the Validated and the Left / Right cases for the Either. The compiler will ensure that all cases are handled and that we use the right handler for the right case. We cannot accidentally associate a function that expects an Exception with a scenario that provides a URI, or vice versa.
Note also how failure is being encoded within the control flow:
- We are handling the expected error of a faulty file name by storing the exception within the Either
- Similarly the expected error of a badly formatted line is being stored within the Validated
// Expected failure: any exception from the file IO is captured on the left of the Either
fun readPageNames(name: String): ExceptionOrInputs = try {
    Either.right(readLinesFromFile(name))
} catch (ex: Exception) {
    Either.left(ex)
}

// Expected failure: a line that does not match the regex becomes an Invalid value
fun readLinesFromFile(name: String): Set<InvalidTextOrUri> {
    val siteRegex = "http://.+".toRegex()
    return File(name).useLines { lines ->
        lines.map { line ->
            if (line.matches(siteRegex)) {
                Valid(URI(line))
            } else {
                Invalid("$line does not match regex")
            }
        }.toSet()
    }
}
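The idea of storing an expected failure as a value, rather than letting it escape as an exception, can also be sketched with nothing but the Kotlin standard library. The version below uses Result and runCatching instead of Arrow's Either, so it is an analogy rather than the article's approach; readLinesOrFailure and summarise are hypothetical names, and returning Result from a function assumes Kotlin 1.5 or later.

```kotlin
import java.io.File

// Hypothetical helper: returns the lines, or the exception captured as a value.
// Note that runCatching captures any Throwable, not just Exception.
fun readLinesOrFailure(name: String): Result<List<String>> =
    runCatching { File(name).readLines() }

// Both outcomes are converted to a string in one place.
fun summarise(name: String): String =
    readLinesOrFailure(name).fold(
        onSuccess = { lines -> "Read ${lines.size} line(s)" },
        onFailure = { ex -> "Error: '${ex.message}'" }
    )

fun main() {
    // Write a small temporary input file so the sketch is self-contained.
    val temp = File.createTempFile("sites", ".txt")
    temp.writeText("http://instil.co\nwww.twitter.com\n")

    println(summarise(temp.path))               // Read 2 line(s)
    println(summarise(temp.path + ".missing"))  // prints a line starting with "Error:"

    temp.delete()
}
```

Neither call can halt the program: the missing file produces a message instead of a stack trace, which is the same property the Either gives us in readPageNames.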
The functions to ping the URL
The function below uses the HttpClient API (incubated in Java 9 and standardised in Java 11) to ping the provided URLs.
Note how invalid text passes through the function without causing exceptions. We use the fold method to handle both cases of the Validated, with the compiler keeping us right thanks to the types.
fun pingSites(input: Set<InvalidTextOrUri>): List<InvalidTextOrFuture> {
    val client = HttpClient.newBuilder().followRedirects(ALWAYS).build()
    val handler = HttpResponse.BodyHandlers.ofString()

    fun buildRequest(uri: URI) = HttpRequest.newBuilder().uri(uri).build()

    // Send the request asynchronously and turn either outcome into a string
    fun pingSite(uri: URI) = client
        .sendAsync(buildRequest(uri), handler)
        .handle { result, error ->
            result?.statusCode()?.toString() ?: errorToString(error)
        }

    // Invalid text passes straight through on the left; valid URIs are pinged
    return input.map { invalidOrUri ->
        invalidOrUri.fold({ Left(it) }, { Right(pingSite(it)) })
    }
}
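The handle call is what keeps networking errors inside the pipeline, and its behaviour can be seen without touching the network. The sketch below substitutes already-completed futures for real HTTP responses; describeOutcome and the sample exception message are assumptions made for the illustration.

```kotlin
import java.util.concurrent.CompletableFuture

fun errorToString(ex: Throwable) = "Error: '${ex.message}'"

// handle receives (result, error); exactly one of them is non-null here,
// so either outcome can be mapped to a printable string.
fun describeOutcome(future: CompletableFuture<Int>): CompletableFuture<String> =
    future.handle { status, error ->
        status?.toString() ?: errorToString(error)
    }

fun main() {
    // Stand-ins for a successful and a failed request (no network involved).
    val ok = CompletableFuture.completedFuture(200)
    val failed = CompletableFuture<Int>().apply {
        completeExceptionally(IllegalStateException("connection refused"))
    }

    println(describeOutcome(ok).get())      // 200
    println(describeOutcome(failed).get())  // Error: 'connection refused'
}
```

Calling get() on the handled future never throws for the failed case, because the exception has already been converted into an ordinary String.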
The main function (plus two utilities)
Our main function below is very simple, thanks to the groundwork that we have already done. Essentially we are building a pipeline where invalid input, exceptions and futures are all converted to strings and passed to forEach for outputting. In each case the compiler ensures that we are working with the type we expect, and corrects us if not.
Note that in the readFileName utility we would normally make use of an Option type (which Arrow does provide) but thanks to the support for null checking in the Kotlin language this can be omitted.
fun main() {
    fun wrapError(ex: Exception) = listOf(errorToString(ex))
    fun processInputs(input: Set<InvalidTextOrUri>) = pingSites(input).map { textOrFuture ->
        textOrFuture.fold({ it }, { it.get() })
    }

    val errorOrNames = readPageNames(readFileName())
    errorOrNames
        .fold(::wrapError, ::processInputs)
        .forEach(::println)
    println("All done...")
}

fun errorToString(ex: Throwable) = "Error: '${ex.message}'"

fun readFileName(defaultPath: String = "input/sites.txt"): String {
    println("Enter the filename ('$defaultPath')")
    val fileName = readLine() ?: ""
    return if (fileName == "") defaultPath else fileName
}
The big payoff - refactoring for parallelism
I will admit that there are people smart enough to do all the above in dynamically typed languages. I am, however, not one of them. But the real power of static types shows when we need to refactor.
Let's say I implement the solution above and all is well, but after a sound night's sleep and a pot of the good stuff (blessed coffee - ambrosia of our profession) I realise that I should be able to run all the HTTP requests in parallel by using CompletableFuture.allOf. With dynamic types this would be a fearsome refactoring, but with our type declarations to guide us it's fairly trivial:
fun main() {
    fun printError(ex: Exception) = println(errorToString(ex))

    fun processInputs(input: Set<InvalidTextOrUri>): List<CompletableFuture<Unit>> {
        fun wrapInFuture(str: String) = CompletableFuture.completedFuture(str)

        val futures = pingSites(input).map { it.fold(::wrapInFuture, { it }) }
        return futures.map { it.thenApply(::println) }
    }

    fun combineAndJoin(input: Set<InvalidTextOrUri>) = CompletableFuture
        .allOf(*processInputs(input).toTypedArray())
        .join()

    val errorOrNames = readPageNames(readFileName())
    errorOrNames.fold(::printError, ::combineAndJoin)
    println("All done...")
}
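The allOf and join combination is the heart of this refactoring, so here it is in isolation. This is a sketch under stated assumptions: fakePing is a hypothetical stand-in for an HTTP request, and pingAll collects results into a queue instead of printing them so that the outcome is easy to inspect.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical stand-in for an HTTP request: completes asynchronously with a string.
fun fakePing(site: String): CompletableFuture<String> =
    CompletableFuture.supplyAsync { "pinged $site" }

fun pingAll(sites: List<String>): List<String> {
    // A thread-safe collection, because the callbacks may run on different threads.
    val results = ConcurrentLinkedQueue<String>()

    // Start every request up front; nothing blocks at this point.
    val futures = sites.map { site -> fakePing(site).thenAccept { results.add(it) } }

    // allOf completes once every future has completed; join waits for that moment.
    CompletableFuture.allOf(*futures.toTypedArray()).join()

    // Completion order is not deterministic, so sort for a stable result.
    return results.sorted()
}

fun main() {
    println(pingAll(listOf("a.example", "b.example", "c.example")))
    // [pinged a.example, pinged b.example, pinged c.example]
}
```

The sort matters: with requests running in parallel the completion order can change from run to run, which is also why the asynchronous version of the program may print its results in a different order to the file.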
The program in action
Given a file containing the following:
http://instil.co
www.twitter.com
http://www.slackoverflow.com
www.facebook.com
http://www.bbc.co.uk/news
http://www.slashdot.org
This will be the output from our original program...
...and this will be the output from the asynchronous version
Summing up
Scripting languages are sometimes referred to as 'programmer's duct tape' because they enable simple tasks to be done rapidly with minimal overhead. In this role they still excel, and Moore's law has expanded the bounds of their applicability.
But when solving complex problems and/or working in an unfamiliar domain there's nothing more beneficial to productivity than a well-designed and enforced type system. The moment you start coding 'if this parameter is a T' in your script you would be better off switching to a compiled language. Otherwise you will end up writing an ad-hoc compiler as you go.
As Brian Hurt put it, 'the purpose of abstraction is to create a new level of semantics where you can be absolutely precise'. If you are programming in a dynamically typed language this precision can never be achieved at compile time, either in your head or (just as importantly) in the IDE. So there will inevitably be an overhead imposed as we run around trying to juggle chainsaws in the dark.
It will be objected that tests can take the place of static typing, especially when there is a unit test suite continually running in the background. I see three objections to this:
- By making the tests do the work of a type checker you conflate errors that can be avoided in advance with those that can only be determined at runtime.
- The volume of code required for the extra tests will always exceed that required for the type declarations, especially in modern languages (like Kotlin, Scala and F#) with concise syntax and type inference. As a general principle we should choose to write less code.
- We are all human and will forget to write tests, whereas the compiler is remorseless.
Final conclusions and future directions
I hope this article has highlighted some of the benefits of strong and static typing. Fans of FP will note that it would not take much effort to make this demo fully lazy in its evaluation, and if we pushed on past that we could make it a completely pure application. Such will be the subject of another blog post. Stay tuned :-)