Contents

  1. Source
  2. Map
  3. Filter
  4. Pipe DSL
  5. Artifacts

Important: Pyodide takes time to initialize. Initialization completion is indicated by a red border around Run all button.

Here is a how we write a normal for loop in Python.



What if you want to operate on the list, such as squaring each element, or perhaps only selecting values greater than 5? Python list comprehensions are the Pythonic solution, which is as below



But I have always found more complex list comprehensions a bit difficult to read. Is there a better solution? Here is an attempt to adapt a UNIX shell pipelines like solution to Python. Something like

[i1,2,3,4,5,6,7,8,9] | where(_ > 5) | map(_ * _)

Here is a possible solution. What we need is a way to stitch operations on a list together. So, we define a class Chains with the __or__ dunder method redefined. What it does is to connect the current object to the right object in the pipeline, and set the current object as the source of values..



Source

Next, we define the source. That is, an object that is at the start of the pipeline.



We can use it as follows:



Can we do better on the sink, by avoiding the parenthesis? Here is a possibility



Using



Map

Next, we define maps.



We use it as follows.



Filter

Finally, we implement filters as follows.



This is used as follows.



We can also have our original names



Pipe DSL

This is great, but can we do better? In particular, can we avoid having to specify the constructors? One way to do that is through introspection. We redefine Chains as below. (The other classes are simply redefined so that they inherit from the right Chain class)



What we are essentially saying here is that, a lambda within a list ([lambda s: ...]) is treated as a map, while within a set ({lambda s:, ..}) is treated as a filter.

It is used a follows



One final note here is that, the final iterator object that the for loop iterates on here is of kind M_ from the last object. This is a consequence of the precedence of the operator |. That is, when we have a | b | c, this is parenthesized as (a | b) | c, which is then taken as c(b(a())). This is also the reason why we have to wrap the initial value in S_, but not any others (Because we override __or__ only the right hand object in an | operation needs to be the type Chain). (We are also essentially pulling the values from previous pipe stages.)

If we want, we can make the last stage the required Chain type so that we can write a | b | c | Chain(). However, for that, we need to override the only right associative operator in python – **. That is, we have to write a ** b ** c ** Chain(), and have to override __rpow__(). We will then get the object corresponding to a, and we will then have to push the values to the later pipe stages.

Artifacts

The runnable Python source for this notebook is available here.