Functional Programming Fundamentals
Software applications are complicated beasts ( which over time evolve toward additional complexity). To keep everything manageable, we as software developers use a number of tricks:
- (Imperative Programming) We organize the code into re-usable components (functions), and try to structure them as logical libraries and modules to keep track of the bits and pieces.
- (Object Oriented Programming) For really complicated programs, we invent stories (object oriented classes) and tell ourselves lies (data models/abstraction) about how the program works and why.
Both of these strategies work, they are effective, and they are used widely. With that said, there are other approaches that can be used to create well-structured code that is easy to write, debug, reuse, and maintain. One approach that has become extremely popular, especially in the realm of working with data, is functional programming.
Functional programming has some things in common with imperative and object-oriented approaches, and extends some of their practices. At the same time, though, it also introduces many new ideas and opinions that provide a powerful paradigm for structuring computation, particularly for data processing, artificial intelligence, and machine learning.
In this article, we'll introduce functional programming, look at some of its tenants, and briefly explore how functional programming works in the Python programming language.
What is Functional Programming?
So, what is functional programming?
As noted earlier, functional programming is a method of writing code where an emphasis is placed on using functions as the building blocks of a larger program. There is typically a main
function which serves as the entry, which then delegates functionality to lower level functions until you arrive at a bottom layer where there are primitive types such as "strings", "containers", and "numbers" that get processed.
In some ways, a functional program is like any other type of program: a way to reuse code efficiently and control the flow of the logic. There's little difference from a function in an imperative program (or the class/static methods of an object oriented program). A function is a little code blob that takes a set of inputs and modifies data to produce a structured output.
In other ways, the program will look very different. For example:
- In a functional program, control structures like
if
andfor
are omitted and functional equivalents for sorting (filter
), transforming (map
), or iterating (each
) are used instead. The logical flow of the program is controlled by the inputs to later functions. Functional programming eschews branching in favor of consistent operations like filter as branch points can be a common source of bugs. - Data is passed through a set of operations (sometimes called a pipeline). Each step in the pipeline produces a new set of values as a result and data is immutable. Once it is declared, it doesn't change. A consequence of passing around immutable data is that it becomes easy to manage concurrency. Because the structures will not change, you can split the components of the calculation amongst multiple compute cores and aggregate the results at the end. Languages which incorporate components of functional programming into their design, such as Scala and Rust, use immutability explicitly as part of their concurrency model.
- Functions can themselves become inputs (parameters) for other functions and are usually implemented as full-blown objects. You can refer to them from variables and return them as results, meaning there is no meaningful difference between functions and data. This provides enormous flexibility in how you design the building blocks of your program. You take complicated problems, break them down into smaller tasks, and solve those tasks with succinct functions. With a little bit of planning, you can compose the simple functions together to solve very complex problems.
- Because there is an emphasis on solving problems in the most general fashion possible and composing low-level components into other components, testing becomes easier. All you need is a set of known inputs and an expected set of outputs. Because inputs are immutable and results will return new data, calling a function with the same inputs should always produce the same results (put another way, functional programs should be idempotent). The guarantee of a known output from known inputs, within the context of functional programming, is called referential transparency.
Example 1: Structure of a Functional Program
The easiest way to show the differences between functional programming and other styles is to see it in action. Consider the two programs below, which show how to split and count the lines of a multi-line string.
Imperative
This imperative example approaches the problem in the most straightforward manner possible: use the built-in method split
on the str
object to break the data apart and then count the resulting number of entries in the list (lines
).
# Example String example1 = '''The dog is black. The cat is gray. The horse is brown. ''' # Split the string and count the number of lines lines = example1.split('\n') line_count = len(lines) print('%s lines in str' % line_count)
Functional
This second program uses a more functional approach. It defines two functions, split
and count
, which execute the logic of the program. The two functions are then combined into a single statement (pipeline) to obtain the result. Output from the first function (which will be a list) is passed directly into the second (which will return an integer).
# Functional components of the program def split(s, sep='\n'): ''' Split the provided string into an iterable of new strings ''' return s.split(sep) def count(c): ''' Count the number of entries in the provided container ''' return len(c) # Example string example1 = '''The dog is black. The cat is gray. The horse is brown. ''' # Split the string and count the number of lines line_count = count(split(example1)) print('%s lines in str' % line_count)
Though simple, the examples show some of the tenets described above.
- The program logic (
count
andsplit
) is packaged into a set of self-contained, reusable, and stateless functions only dependent on their input. There is no global state and the two functions modify their inputs into a transformed type returned as a new object. Functions which meet these requirements are called pure functions. - The structure of the program is organized as a pipeline and the output of the first function
split
is piped directly to the secondcount
without declaring intermediate variables. Though the syntax may differ between programming languages, this style of writing code is very common in functional programs and may include dense statements where data is passed between many steps of a procedure before declaring a final result and assigning it to a variable or constant.
Example 2: Functions as Inputs
Now that we've seen some of the basics of a functional program, let's take a look at some more advanced techniques in the form of a second example. The example programs below show how you might sum all even values from 3 to 300.
Imperative
To be as efficient as possible, we use a variable c
to keep a running tally. We then iterate over each integer in the range from 1 to 100,. and for even values, scale the number and add it to the tally.
# Set initial count at zero c = 0 # Create iterable structure from 1 to 100 for i in range(1, 100): # Check for even values if i%2 == 0: # Scale and add to running tally c += i*3 print('Sum of all even values from 3 to 300: %d' % c)
Functional
The functional approach is more involved, but follows the same general structure. First, we create a range iterator which will provide values from 1 to 100. Then, we use filter
to locate even values and pass those to map
, where they are scaled. Finally, we use reduce
to aggregate the results by adding them together.
The signature of reduce
takes two parameters. The first a
is the running tally (the accumulated value) to which the "update" value b
is added. In comparison to map
and filter
which return new iterable collections, reduce
returns a single value.
from functools import reduce # Sum all values from 3 to 300. Pipeline: # 1. Use range to create an array of values from 1 to 100 # 2. Use filter to filter out all odd values (mod 2 == 0 is true) # 3. Scale values using map # 4. Use reduce to calculate a running sum s = reduce(lambda a, b: a+b, map(lambda y: y*3, filter(lambda x: x%2 == 0, range(1, 100)))) print('Sum of all even values from 3 to 300: %d' % s)
At each step in the functional program, we are passing a function as the first parameter. Functions that allow for this type of behavior are called first-class functions. First class functions in Python generally have a signature similar to function_a(function_b, values_iterable)
. The function that is being passed in serves as a special kind of operator that will be applied to each of the values within the iterable, while the first-class function collects the values into a new iterable.
In the example above, these operator functions are defined using a special syntax in Python called lambda
. Lambda functions are anonymous, and inherit the scope of where they were compiled which makes them a useful way to construct ad-hoc statements that may be part of a larger routine.
When the program executes, the logic will be applied from the inner set of methods to the outer set of methods. Range values will be generated first, which will then be filtered, transformed (map
), and the sum will be calculated. The filter/map/reduce flow is a very common pattern in working with data, and serves as the foundation of "Big Data."
As an implementation note, the iterable protocol in Python allows for the values to be lazily evaluated. This means that the entire chain of methods is invoked all the way through before moving to the next value in the range. This allows for the program to be more efficient, in that a full collection of values isn't necessary at each processing step.
Powerful Conventions
In summary, functional programming can be thought of as set of tenets, which can be formally expressed as:
- Functions are pure, the results are only dependent upon the inputs
- Functions use immutable data and return new values
- Functions guarantee referential integrity
- Functions are first class entities, which means that they can be treated like any other value in an expression
When used consistently, functional programming provides powerful conventions to help manage complexity. Programs written in a functional style are syntactically concise while at the same time highly expressive, strongly encourage code reuse, and provide a straightforward way to test the components. The examples we have looked at in this article show only the barest fundamentals.
The value of the approach really shines when working with data. Because functional programs are essentially organized as "processing pipelines" to manipulate, filter, and aggregate data; understanding the structure of a pipeline written as a functional program becomes easy and extending the logic straightforward. This is why data processing libraries like Pandas, Spark, and Dask use functional approaches (such as data immutability and method chaining) extensively.
In a future blog post, we'll take the ideas we introduced here and inspect how modules in Python's standard library such as functools
and itertools
can let you further adopt functional approaches in your code.
Comments
Loading
No results found