1. Operations

At a high level, an operation is a function in a computation pipeline, abstractly represented by the Operation class. This class specifies the dependencies of the operation in the pipeline.

The FunctionalOperation provides a concrete lightweight wrapper around any arbitrary function to define those dependencies. Instead of constructing it directly, prefer to instantiate it by calling the operation() factory.

Operations are just functions

At the heart of each operation is just a function, any arbitrary function. Indeed, you can wrap an existing function into an operation, and then call it just like the original function, e.g.:

>>> from operator import add
>>> from graphtik import operation
>>> add_op = operation(add,
...                    needs=['a', 'b'],
...                    provides=['a_plus_b'])
>>> add_op
FunctionalOperation(name='add', needs=['a', 'b'], provides=['a_plus_b'], fn='add')
>>> add_op(3, 4) == add(3, 4)
True

But __call__() exists only to facilitate quick experimentation: it does not perform any checks or matching of needs/provides to function arguments & results (which happen when pipelines compute).

Graphtik works by invoking the operations' Operation.compute() method, which, among other things, allows you to specify which results you want to receive back (read more on Running a computation graph).

Specifying graph structure: `provides` and `needs`

Each operation is a node in a computation graph: in order to compute, it depends on data from other nodes and supplies data to them (via the solution).

This graph structure is specified (mostly) via the provides and needs arguments to the operation() factory, specifically:

needs

this argument names the list of (positionally ordered) input data the operation requires to receive from the solution. The list corresponds, roughly, to the arguments of the underlying function (plus any sideffects).

It can be a single string, in which case a 1-element iterable is assumed.

seealso: needs, modifier, FunctionalOperation.needs, FunctionalOperation.op_needs, FunctionalOperation._fn_needs

provides

this argument names the list of (positionally ordered) output data the operation provides into the solution. The list corresponds, roughly, to the returned values of the fn (plus any sideffects & aliases).

It can be a single string, in which case a 1-element iterable is assumed.

If there is more than one, the underlying function must return an iterable with the same number of elements (unless it returns a dictionary).

seealso: provides, modifier, FunctionalOperation.provides, FunctionalOperation.op_provides, FunctionalOperation._fn_provides
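
The correspondence between needs/provides and a function's arguments and results can be illustrated in plain Python. The sketch below is not graphtik's implementation; run_operation() is a hypothetical helper that only assumes the rules stated above: a single string counts as a 1-element list, needs are fed positionally, and multiple provides are paired with a returned iterable of the same length (or looked up in a returned dictionary).

```python
from operator import add


def run_operation(fn, needs, provides, solution):
    """Hypothetical helper: call fn on the named inputs, name its results."""
    # A single string is treated as a 1-element list.
    if isinstance(needs, str):
        needs = [needs]
    if isinstance(provides, str):
        provides = [provides]

    # needs are positional: the i-th name feeds the i-th argument.
    args = [solution[name] for name in needs]
    result = fn(*args)

    if len(provides) == 1:
        # One output: the returned value is stored as-is.
        return {provides[0]: result}
    if isinstance(result, dict):
        # A returned dictionary is already keyed by output name.
        return {name: result[name] for name in provides}

    # Otherwise the returned iterable must match provides in length.
    values = list(result)
    if len(values) != len(provides):
        raise ValueError(
            f"expected {len(provides)} results, got {len(values)}"
        )
    return dict(zip(provides, values))


sum_out = run_operation(add, ["a", "b"], "a_plus_b", {"a": 3, "b": 4})
divmod_out = run_operation(
    divmod, ["x", "y"], ["quotient", "remainder"], {"x": 7, "y": 2}
)
dict_out = run_operation(
    lambda x, y: {"quotient": x // y, "remainder": x % y},
    ["x", "y"],
    ["quotient", "remainder"],
    {"x": 7, "y": 2},
)
```

Here sum_out holds the single named result, while divmod_out and dict_out end up identical even though one function returns a tuple and the other a dictionary.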

Aliased provides

Sometimes you need to interface operations that name the same dependency differently. This is doable without introducing a “pipe-through” interface operation, either by annotating needs with kw modifiers (see docs) or with aliases on the provides side:

>>> op = operation(str,
...                name="`provides` with `aliases`",
...                needs="anything",
...                provides="real thing",
...                aliases=("real thing", "phony"))
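
Conceptually, an alias makes a provided value available under a second name. A minimal plain-Python sketch of that idea follows; apply_aliases() is a hypothetical helper written for illustration, and it assumes aliases are given as a (source, alias) pair or a list of such pairs, as in the example above.

```python
def apply_aliases(outputs, aliases):
    """Hypothetical helper: copy provided values under their alias names."""
    # Accept a single (source, alias) pair as well as a list of pairs.
    if len(aliases) == 2 and all(isinstance(item, str) for item in aliases):
        aliases = [aliases]

    result = dict(outputs)  # leave the caller's dictionary untouched
    for source, alias in aliases:
        result[alias] = result[source]
    return result


outputs = {"real thing": str(1)}
aliased = apply_aliases(outputs, ("real thing", "phony"))
```

A downstream operation that needs "phony" can then be connected without renaming anything in the operation that provides "real thing".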

Considerations for when building pipelines

When many operations are composed into a computation graph, Graphtik matches up the values in their needs and provides to form the edges of that graph (see Graph Composition for more on that), like the operations from the script in Quick start:

>>> from operator import mul, sub
>>> from functools import partial
>>> from graphtik import compose, operation

>>> def abspow(a, p):
...   """Compute |a|^p. """
...   c = abs(a) ** p
...   return c

>>> # Compose the mul, sub, and abspow operations into a computation graph.
>>> graphop = compose("graphop",
...    operation(mul, needs=["a", "b"], provides=["ab"]),
...    operation(sub, needs=["a", "ab"], provides=["a_minus_ab"]),
...    operation(name="abspow1", needs=["a_minus_ab"], provides=["abs_a_minus_ab_cubed"])
...    (partial(abspow, p=3))
... )
>>> graphop
NetworkOperation('graphop',
                 needs=['a', 'b', 'ab', 'a_minus_ab'],
                 provides=['ab', 'a_minus_ab', 'abs_a_minus_ab_cubed'],
                 x3 ops: mul, sub, abspow1)

Notice that if name is not given, it is deduced from the function name.
Notice the use of functools.partial() to set parameter p to a constant value.
The partial is attached by calling once more the “decorator” that operation() returns when it is called without a function.

The needs and provides arguments to the operations in this script define a computation graph that looks like this:

Tip

See Plotting on how to make diagrams like this.
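
The edges implied by the needs and provides of the three operations above can also be listed with a few lines of plain Python. This is only a sketch of the name-matching idea, not how graphtik builds its network; the ops list and link_operations() are hypothetical names introduced for illustration.

```python
# Hypothetical plain-data description of the operations composed above:
# (name, needs, provides)
ops = [
    ("mul", ["a", "b"], ["ab"]),
    ("sub", ["a", "ab"], ["a_minus_ab"]),
    ("abspow1", ["a_minus_ab"], ["abs_a_minus_ab_cubed"]),
]


def link_operations(ops):
    """Return (producer, data, consumer) edges by matching names."""
    # Remember which operation provides each piece of data.
    producers = {}
    for name, _needs, provides in ops:
        for data in provides:
            producers[data] = name

    # An edge exists wherever one operation needs what another provides.
    edges = []
    for name, needs, _provides in ops:
        for data in needs:
            if data in producers:
                edges.append((producers[data], data, name))
    return edges


edges = link_operations(ops)

# Data that no operation provides must come from the caller.
all_needs = {data for _name, needs, _provides in ops for data in needs}
all_provides = {data for _name, _needs, provides in ops for data in provides}
external_inputs = sorted(all_needs - all_provides)
```

In this sketch mul feeds sub through "ab", sub feeds abspow1 through "a_minus_ab", and only "a" and "b" have to be supplied from outside.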

Builder pattern

There are 2 ways to instantiate a FunctionalOperation, each one suitable for different scenarios, and so far we have only seen the 1st one:

We’ve seen that manually calling operation() allows putting into a pipeline functions that are defined elsewhere (e.g. in another module, or system functions).

But that method is also useful if you want to create multiple operation instances with similar attributes, e.g. needs:

>>> op_factory = operation(needs=['a'])

Notice that no fn is given at this point; each concrete operation below receives its function through withset().

>>> from functools import partial

>>> def mypow(a, p=2):
...    return a ** p

>>> pow_op2 = op_factory.withset(fn=mypow, provides="^2")
>>> pow_op3 = op_factory.withset(fn=partial(mypow, p=3), name='pow_3', provides='^3')
>>> pow_op0 = op_factory.withset(fn=lambda a: 1, name='pow_0', provides='^0')

>>> graphop = compose('powers', pow_op2, pow_op3, pow_op0)
>>> graphop
NetworkOperation('powers', needs=['a'], provides=['^2', '^3', '^0'], x3 ops:
   mypow, pow_3, pow_0)

>>> graphop(a=2)
{'a': 2, '^2': 4, '^3': 8, '^0': 1}
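
The same "shared template, modified copies" idea can be sketched with the standard library alone. OpSpec below is a hypothetical stand-in, not graphtik's class; it only assumes what the example above shows, namely that withset() leaves the template untouched and returns a variant with the given attributes replaced.

```python
from dataclasses import dataclass, replace
from functools import partial
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class OpSpec:
    """Hypothetical stand-in for an operation description."""

    fn: Optional[Callable] = None
    name: Optional[str] = None
    needs: Tuple[str, ...] = ()
    provides: Tuple[str, ...] = ()

    def withset(self, **changes):
        # Return a modified copy; the frozen template cannot be mutated.
        return replace(self, **changes)


def mypow(a, p=2):
    return a ** p


# The template fixes what all variants share (here: the needs).
template = OpSpec(needs=("a",))
square = template.withset(fn=mypow, name="mypow", provides=("^2",))
cube = template.withset(
    fn=partial(mypow, p=3), name="pow_3", provides=("^3",)
)
```

Because every variant is a separate copy, the template can be reused any number of times without earlier variants changing underneath you.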

Decorator specification

If you are defining your computation graph and the functions that comprise it all in the same script, the decorator specification of operation instances might be particularly useful, as it allows you to assign computation graph structure to functions as they are defined. Here’s an example:

>>> from graphtik import operation, compose

>>> @operation(name='foo_op', needs=['a', 'b', 'c'], provides='foo')
... def foo(a, b, c):
...   return c * (a + b)

>>> graphop = compose('foo_graph', foo)
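
For readers unfamiliar with decorators that take arguments, here is a minimal plain-Python sketch of the mechanism. The describe() factory is hypothetical and much simpler than operation(): it merely attaches the graph metadata to the function as attributes and returns the function itself, whereas the decorator above yields an operation.

```python
def describe(name=None, needs=(), provides=()):
    """Hypothetical decorator factory: attach graph metadata to a function."""

    def decorator(fn):
        # Fall back to the function's own name when none is given.
        fn.op_name = name or fn.__name__
        # A single string is treated as a 1-element list.
        fn.needs = [needs] if isinstance(needs, str) else list(needs)
        fn.provides = (
            [provides] if isinstance(provides, str) else list(provides)
        )
        return fn

    return decorator


@describe(name="foo_op", needs=["a", "b", "c"], provides="foo")
def foo(a, b, c):
    return c * (a + b)


foo_result = foo(1, 2, 3)
```

The outer call receives the structure (name, needs, provides) and the inner call receives the function, which is the same two-step calling sequence used when operation() is given no function.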

Modifiers on operation needs and provides

Annotations on a dependency, such as optionals & sideffects, modify its behavior and, eventually, that of the pipeline.

Read the modifiers module documentation for more.