Being a functional Pythonian

04 Jan 2006

In my first blog post, I’ll give an example of how functional programming (FP) in Python results in shorter, cleaner, and more concise code, compared to the equivalent in an imperative language like C#.

I’ll use a real-world build tool I’ve written as the driving example. Typical tasks for such a tool would include, but not be limited to, checking out sources from cvs, compiling the sources, and running NUnit against a number of generated targets.

Each such task may be defined as a function, taking any number of arguments and returning a list of the form:

[status, [(command, output), (...)]]

where status indicates binary success, command is the command run to complete the task and output is the output (stdout/stderr) of running the command. There may be more than one (command, output) tuple inside the sublist, if the task requires more than one command to complete.

In FP the list and tuple are fundamental and powerful concepts, because they can easily be used to define complex structures for you to use when passing data around. They also facilitate easy processing using list comprehension, the functions build into the language, or functions of your own, providing elegant syntax for common operations, such as insertion, deletion, searching, slicing, etc.

These lists and tuples are more dynamic in nature than their array and struct counterparts in, say, C#. In fact, rather than creating user defined types, a list, or a tuple, or a combination often suffices, eliminating the need to write a lot of boilerplate code.

Now, to provide an example where Python’s FP support really kicks in with the build tool, is in the implementation of the code responsible for running NUnit against a number of targets:

def NUnit(nunitconsole, projectsOrAssemblies):
    results = [Run("\"" + nunitconsole + "\" " + source) 
              for source in projectsOrAssemblies]
    commands = [command[1][0] for command in results]
    outputs = [output[1][1] for output in results]
    failures = Sigma(re.findall("Failures: (\d+)", str(outputs)))
    return [not bool(int(failures)), zip(commands, outputs)]

This function relies heavily on the use of lists and list comprehension, rather than imperative style loops (which, of course, is also possible in Python). List comprehension is a construct that allows you to easily create a new list as a result of some processing on the elements of another.

A highlevel walk-through of the function: first the NUnit executable is invoked one target at a time. The Run function itself returns a list of the form [exitcode, (command, output)]; hence with n targets to test, the commands and outputs variables end up containing n commands and their output. The Failures variable counts the total number of failures accumulated across all targets, extracted as a list of strings matching the regular expression and summarized by the Sigma function.

Python doesn’t allow summing over a list of strings using its build-in Sum function. However, you can easily implement a Sum function that does in something along the lines of:

def Sigma(list): 
    return reduce(lambda a, b: int(a) + int(b), list)

where reduce is a function present in most, if not all, FP languages, which reduces a list of values to a single value by applying a user-defined operation to pairwise values of the list.

Finally, the zip function is a build-in function, merging the elements of two lists into one.

Another important construct in FP is the lambda notation. It enables you to create an anonymous function, a function without an explicitly name, leading us to another key characteristic of FP: functions are treated as first-class citizens, in the sense that they may be passed around as just another variable, e.g., you can create a list of functions with parameters and then do the actual invocation at a later stage.

Using deferred invocation, the build tool manages the overall build process; in a config file, similar to a makefile, a list of lambdas are defined, each defining a function that executes a task. This provides us with a generic way of annotating a task and for each task to be run by a central runner, which takes care of logging description, command, and outcome to a html report. The prefs used as parameters to each task is an associated array of associated arrays of … well as many levels as you require.

In effect the config file defines a domain specific language, using regular Python syntax, which can easily be loaded into what corresponds to the make utility by an import statement:

prefs = {
    "cvs" :
    {
        "cvs" : "C:\\Program Files\\cvsnt\\cvs.exe", 
        "workingDirectory" : "C:\\Inetpub\\wwwroot",
        "root" : ":pserver:foo:bar@1.2.3.4:/foobar",
        "module" : "baz",
        "revision" : "HEAD"    
    },
    "nunit" :  
    {
        "nunitconsole" : "C:\\...\\nunit-console.exe",
        "projectsOrAssemblies" : [C:\\Inetpub\\...\\foobar.nunit" ]
    },
    ...
}

tasks = [ ("Checkout Foobar from cvs",
    lambda: Task.CvsCheckout(prefs["cvs"]["cvs"],
                             prefs["cvs"]["workingDirectory"],
                             prefs["cvs"]["root"],
                             prefs["cvs"]["module"],
                             prefs["cvs"]["revision"])),
    ...,
    ("Running NUnit",
    lambda: Task.NUnit(prefs["nunit"]["nunitconsole"],
                       prefs["nunit"]["projectsOrAssemblies"]))
]

This example, using the basic concepts of FP such as lists, list comprehension, tuples, build-in functions, and nice syntax, provided you with a glimpse of how a lot can be achieved in relatively few lines of Python code. Also, if you really want to explore FP from a Python perspective, I suggest you take a look at the Xoltar Toolkit (unfortunately, though, it isn’t under active development), described in David Mertz’s columns More Functional Programming in Python and Even More Functional Programming in Python

Bottom line is that the use of FP style in your programs, although it takes time getting use to, leads to fewer lines of code, which leads to fewer places where things can go wrong.

This is not to say that your entire program should be written using the FP style (as is the case with Haskell). However, learning the basic concepts, and how Python exposes these to the developer, you’ll know which patterns to look for in your code that may be better solved using FP constructs.