Understanding the Structure of Lists

Thanks for joining me on a deeper adventure into lists. I think that this information will give you a deeper, better understanding of how lists work in functional programming, and Elixir in particular.

If you've heard the saying “turtles all the way down,” that's a list, except for the pesky infinity aspect. Each tail of a list is, itself, also a list -- unless it’s an empty list, which is where the list terminates. To understand a list, you have to understand a single item of a list. When you understand single item of the list, you understand the list. The empty list, [ ], is terminator for all lists. An empty list has no head and no tail. Its only property is that it is an empty list.

Elixir list: [1 | [2 | [3 | [] ] ] ]     Haskell list: (1 : 2 : 3 : [] )

Historically, cons cells hearken back to the Lisp programming language. In Lisp, it was a function called cons, which was short for constructor. Cons is the basic building block of functional programming lists. Each list node is a cons cell. A cons cell is composed of two parts: the car and the cdr. Car can also be referred to as “first” and cdr, the “rest”. Car points to the value that the node represents. Cdr (pronounced “could-er”) points to the next cons cell in the list, or the empty list. In Lisp, the last list node points to nil.

Above, is the cons cell visualized. Looking at the cons cell diagram, one can see the arrangement of the car and the cdr.

Above is a visualization of a single node list from the cons perspective. We have the variable that is pointing to the cons. In this case, lst1. The car cell and is pointing to a value which is a in this case. The cdr cell is pointing to an empty list - [ ]. Looking at that in a REPL, you would see a single-item list displayed as: [“a”].

Next, we prepend another list item to the first list item. In this case, the list item is b, so we have car pointing to the value b. Our cdr is pointing to the next cons cell, which was the list item a. So we have the variable, lst2, pointing to the first cons cell, which has the value b; and have that cons cell pointing to the next cons cell with the value a. I have left lst1 pointing to the original list node.

It was just as possible to do this using a single list variable, as shown below.

Here we have lst2 bound to the cons of the first list item, b. Then that cons is bound to the next list item, a, with no variable pointing to that second one. Then finally, that cons is pointing to an empty list.

When building lists, consider the nature of cons based lists. If you think about how they're constructed, with the first item always points the next item and so on and so forth, then it makes sense why prepend is always preferred over append. You can watch a video presentation about this at: https://youtu.be/tDFPN18eM4I

Learn much more at my online course: Elixir 101 - Elixir for Everyone