1. Operations¶
At a high level, an operation is a function in a computation pipeline,
abstractly represented by the Operation
class.
This class specifies the dependencies of the operation
in the pipeline.
You may inherit this class, access the declared values in needs from the solution,
and produce the declared provides when the Operation.compute()
method is called.
But there is an easier way: in fact, about half of this project's code exists to retrofit
existing functions into operations.
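To make that contract concrete, here is a minimal sketch in plain Python. It is not Graphtik's actual `Operation` class: `SketchOperation` and `AddOperation` are hypothetical names, and the solution is assumed to be an ordinary dict.

```python
from abc import ABC, abstractmethod


class SketchOperation(ABC):
    """Hypothetical stand-in for an operation: it declares needs and provides."""

    def __init__(self, name, needs, provides):
        self.name = name
        self.needs = list(needs)
        self.provides = list(provides)

    @abstractmethod
    def compute(self, solution):
        """Read the declared needs from `solution` and return the provides."""


class AddOperation(SketchOperation):
    def compute(self, solution):
        # Pull the declared inputs out of the solution (assumed to be a dict).
        a, b = (solution[n] for n in self.needs)
        # Return the result keyed by the declared output name.
        return {self.provides[0]: a + b}


op = AddOperation("add", needs=["a", "b"], provides=["a_plus_b"])
result = op.compute({"a": 3, "b": 4})
print(result)
```

Writing a subclass per function quickly becomes repetitive, which is the motivation for wrapping existing functions instead.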
Operations from existing functions¶
The FunctionalOperation
provides a concrete lightweight wrapper
around any arbitrary function to define those dependencies.
Instead of constructing it directly, prefer to instantiate it by calling
the operation()
factory:
>>> from operator import add
>>> from graphtik import operation
>>> add_op = operation(add,
... needs=['a', 'b'],
... provides=['a_plus_b'])
>>> add_op
FunctionalOperation(name='add', needs=['a', 'b'], provides=['a_plus_b'], fn='add')
You may still call the original function, by accessing the FunctionalOperation.fn
attribute:
>>> add_op.fn(3, 4) == add(3, 4)
True
But that is just for quick experimentation: it does not perform any checks or matching of needs/provides to function arguments & results (which happen when pipelines compute).
Graphtik itself works by invoking the operations'
Operation.compute()
method, which, among other things, allows you to specify which results you want to receive back (read more on Running a pipeline).
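The difference between a bare function call and a checked computation can be sketched as follows. `compute_sketch` is a hypothetical helper written for this illustration, not Graphtik's `compute()`; it assumes the inputs arrive as a dict and that the function returns one value per declared provide.

```python
def compute_sketch(fn, needs, provides, inputs, outputs=None):
    """Hypothetical helper: call `fn` with checks a bare call would skip."""
    # Refuse to run when a declared need is absent from the inputs.
    missing = [n for n in needs if n not in inputs]
    if missing:
        raise ValueError(f"Missing needs: {missing}")
    result = fn(*(inputs[n] for n in needs))
    # Name the results (assumes `fn` returns one value per provide).
    named = dict(zip(provides, result))
    # Optionally keep only the results the caller asked for.
    if outputs is not None:
        named = {k: v for k, v in named.items() if k in outputs}
    return named


inputs = {"a": 7, "b": 2}
everything = compute_sketch(divmod, ["a", "b"], ["quot", "rem"], inputs)
only_rem = compute_sketch(divmod, ["a", "b"], ["quot", "rem"], inputs, outputs=["rem"])

try:
    compute_sketch(divmod, ["a", "b"], ["quot", "rem"], {"a": 7})
    error = None
except ValueError as ex:
    error = str(ex)
```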
Builder pattern¶
There are two ways to instantiate a FunctionalOperation, each one suitable
for different scenarios.
We’ve seen that calling operation() manually
allows putting into a pipeline
functions that are defined elsewhere (e.g. in another module, or system functions).
But that method is also useful if you want to create multiple operation instances
with similar attributes, e.g. needs:
>>> op_factory = operation(needs=['a'])
Notice that no fn is given at this point; each operation receives its function
later, through withset():
>>> from graphtik import operation, compose
>>> from functools import partial
>>> def mypow(a, p=2):
...     return a ** p
>>> pow_op2 = op_factory.withset(fn=mypow, provides="^2")
>>> pow_op3 = op_factory.withset(fn=partial(mypow, p=3), name='pow_3', provides='^3')
>>> pow_op0 = op_factory.withset(fn=lambda a: 1, name='pow_0', provides='^0')
>>> graphop = compose('powers', pow_op2, pow_op3, pow_op0)
>>> graphop
NetworkOperation('powers', needs=['a'], provides=['^2', '^3', '^0'], x3 ops:
mypow, pow_3, pow_0)
>>> graphop(a=2)
{'a': 2, '^2': 4, '^3': 8, '^0': 1}
Tip
See Plotting on how to make diagrams like this.
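The idea behind a shared template plus `withset()` can be sketched with a frozen dataclass. `OpSpec` is a hypothetical class invented for this illustration; it only mimics the "copy with changes" behaviour and makes no claim about how Graphtik implements it.

```python
from dataclasses import dataclass, replace
from functools import partial
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class OpSpec:
    """Hypothetical immutable operation description (not Graphtik's class)."""

    needs: Tuple[str, ...] = ()
    provides: Tuple[str, ...] = ()
    name: Optional[str] = None
    fn: Optional[Callable] = None

    def withset(self, **changes):
        # Return a modified copy, leaving the template untouched.
        return replace(self, **changes)


def mypow(a, p=2):
    return a ** p


# One template carries the shared attribute; each copy adds its own details.
template = OpSpec(needs=("a",))
pow2 = template.withset(fn=mypow, name="pow_2", provides=("^2",))
pow3 = template.withset(fn=partial(mypow, p=3), name="pow_3", provides=("^3",))

results = {spec.provides[0]: spec.fn(2) for spec in (pow2, pow3)}
print(results)
```

Keeping the template immutable means that deriving one operation can never silently change another.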
Decorator specification¶
If you are defining your computation graph and the functions that comprise it all in the same script,
the decorator specification of operation
instances might be particularly useful,
as it allows you to assign computation graph structure to functions as they are defined.
Here’s an example:
>>> from graphtik import operation, compose
>>> @operation(needs=['b', 'a', 'r'], provides='bar')
... def foo(a, b, c):
...     return c * (a + b)
>>> graphop = compose('foo_graph', foo)
Notice that if name is not given, it is deduced from the function name.
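How one factory can serve both as a direct call and as a decorator, and how the name falls back to the function's own, can be sketched like this. `operation_sketch` is a hypothetical function that returns a plain dict; it is only meant to illustrate the calling convention.

```python
def operation_sketch(fn=None, *, name=None, needs=(), provides=()):
    """Hypothetical factory usable both directly and as a decorator."""

    def wrap(func):
        return {
            # Fall back to the function's own name when none is given.
            "name": name or func.__name__,
            "needs": list(needs),
            "provides": list(provides),
            "fn": func,
        }

    # With a function: build immediately.  Without one: hand back a decorator.
    return wrap(fn) if fn is not None else wrap


@operation_sketch(needs=["b", "a", "r"], provides=["bar"])
def foo(a, b, c):
    return c * (a + b)


direct = operation_sketch(abs, needs=["x"], provides=["abs_x"])
print(foo["name"], direct["name"])
```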
Specifying graph structure: provides and needs¶
Each operation is a node in a computation graph: in order to compute, it depends on data from other nodes and supplies data to them (via the solution).
This graph structure is specified (mostly) via the provides
and needs
arguments
to the operation()
factory, specifically:
needs
this argument names the list of (positionally ordered) input data the operation requires to receive from the solution. The list corresponds, roughly, to the arguments of the underlying function (plus any sideffects).
It can be a single string, in which case a 1-element iterable is assumed.
provides
this argument names the list of (positionally ordered) output data the operation provides into the solution. The list corresponds, roughly, to the returned values of the fn (plus any sideffects & aliases).
It can be a single string, in which case a 1-element iterable is assumed.
If there is more than one, the underlying function must return an iterable with the same number of elements (unless it returns a dictionary).
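The rules for provides can be sketched in a few lines. `as_list` and `name_results` are hypothetical helpers; in particular, the sketch assumes that any dict return value is already keyed by output name, which may differ from how Graphtik decides this.

```python
def as_list(names):
    """A single string stands for a 1-element list of names."""
    return [names] if isinstance(names, str) else list(names)


def name_results(provides, result):
    """Hypothetical helper: pair a function's return value with `provides`."""
    provides = as_list(provides)
    if isinstance(result, dict):
        # Assumption: dictionaries are already keyed by output name.
        return {k: result[k] for k in provides}
    if len(provides) == 1:
        return {provides[0]: result}
    values = list(result)
    if len(values) != len(provides):
        raise ValueError(f"Expected {len(provides)} results, got {len(values)}")
    return dict(zip(provides, values))


single = name_results("sum", 7)
several = name_results(["quot", "rem"], divmod(7, 2))
from_dict = name_results(["x", "y"], {"x": 1, "y": 2})

try:
    name_results(["x", "y"], [1, 2, 3])
    mismatch = None
except ValueError as ex:
    mismatch = str(ex)
```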
Declarations of needs and provides are affected by modifiers like mapped():
Map inputs to different function arguments¶
- graphtik.modifiers.mapped(name: str, fn_kwarg: str)
Annotate a needs that (optionally) maps inputs name --> argument-name.
- Parameters
fn_kwarg –
The argument-name corresponding to this named-input. If not given, a regular string is returned.
Example:
In case the name of the function arguments is different from the name in the inputs (or just because the name in the inputs is not a valid argument-name), you may map it with the 2nd argument of mapped():
>>> from graphtik import operation, compose, mapped
>>> @operation(needs=['a', mapped("name-in-inputs", "b")], provides="sum")
... def myadd(a, *, b):
...     return a + b
>>> myadd
FunctionalOperation(name='myadd', needs=['a', mapped('name-in-inputs'-->'b')], provides=['sum'], fn='myadd')
>>> graph = compose('mygraph', myadd)
>>> graph
NetworkOperation('mygraph', needs=['a', 'name-in-inputs'], provides=['sum'], x1 ops: myadd)
>>> sol = graph.compute({"a": 5, "name-in-inputs": 4})['sum']
>>> sol
9
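One way to picture what such a mapping does at call time is a string subclass that remembers its target keyword. `Mapped` and `call_with_needs` below are hypothetical names for this sketch and are not taken from Graphtik's sources.

```python
class Mapped(str):
    """Hypothetical annotated name: a str that remembers its target keyword."""

    def __new__(cls, name, fn_kwarg):
        obj = super().__new__(cls, name)
        obj.fn_kwarg = fn_kwarg
        return obj


def call_with_needs(fn, needs, inputs):
    args, kwargs = [], {}
    for need in needs:
        if isinstance(need, Mapped):
            # Mapped needs are passed by keyword, under the function's own name.
            kwargs[need.fn_kwarg] = inputs[need]
        else:
            # Plain needs are passed positionally, in declaration order.
            args.append(inputs[need])
    return fn(*args, **kwargs)


def myadd(a, *, b):
    return a + b


needs = ["a", Mapped("name-in-inputs", "b")]
total = call_with_needs(myadd, needs, {"a": 5, "name-in-inputs": 4})
print(total)
```

Because `Mapped` is still a `str`, it hashes and compares like the plain input name, so it can be used directly as a dictionary key.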
Execute operations with missing inputs¶
- graphtik.modifiers.optional(name: str, fn_kwarg: str = None)
Annotate optional needs corresponding to defaulted op-function arguments, received only if present in the inputs (when the operation is invoked). The value of an optional is passed as a keyword argument to the underlying function.
Example:
>>> from graphtik import operation, compose, optional
>>> @operation(name='myadd',
...            needs=["a", optional("b")],
...            provides="sum")
... def myadd(a, b=0):
...     return a + b
Notice the default value 0 of the b argument, which is annotated as optional:
>>> graph = compose('mygraph', myadd)
>>> graph
NetworkOperation('mygraph', needs=['a', optional('b')], provides=['sum'], x1 ops: myadd)
The graph works both with and without b provided in the inputs:
>>> graph(a=5, b=4)['sum']
9
>>> graph(a=5)
{'a': 5, 'sum': 5}
Like mapped(), you may map an input-name to a different function-argument:
>>> operation(needs=['a', optional("quasi-real", "b")],
...           provides="sum"
... )(myadd.fn)  # Cannot wrap an operation, its `fn` only.
FunctionalOperation(name='myadd', needs=['a', optional('quasi-real'-->'b')], provides=['sum'], fn='myadd')
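The "pass it only if present" behaviour can be sketched without any library support. `call_with_optionals` is a hypothetical helper in which optionals are given as a dict from input-name to keyword argument; that representation is an assumption of this sketch.

```python
def call_with_optionals(fn, needs, optionals, inputs):
    """Hypothetical helper: `optionals` maps input-name -> keyword argument."""
    args = [inputs[n] for n in needs]
    # Only optionals actually present in the inputs are forwarded (as keywords);
    # absent ones leave the function's default value in effect.
    kwargs = {kw: inputs[n] for n, kw in optionals.items() if n in inputs}
    return fn(*args, **kwargs)


def myadd(a, b=0):
    return a + b


optionals = {"quasi-real": "b"}
with_b = call_with_optionals(myadd, ["a"], optionals, {"a": 5, "quasi-real": 4})
without_b = call_with_optionals(myadd, ["a"], optionals, {"a": 5})
print(with_b, without_b)
```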
Calling functions with varargs (*args)¶
- graphtik.modifiers.vararg(name: str)
Annotate a varargish needs to be fed as function’s *args.
See also
Consult also the example test-case in: test/test_op.py:test_varargs(), in the full sources of the project.
Example:
We designate b & c as vararg arguments:
>>> from graphtik import operation, compose, vararg
>>> @operation(
...     needs=['a', vararg('b'), vararg('c')],
...     provides='sum'
... )
... def addall(a, *b):
...     return a + sum(b)
>>> addall
FunctionalOperation(name='addall', needs=['a', vararg('b'), vararg('c')], provides=['sum'], fn='addall')
>>> graph = compose('mygraph', addall)
The graph works with and without any of the b or c inputs:
>>> graph(a=5, b=2, c=4)['sum']
11
>>> graph(a=5, b=2)
{'a': 5, 'b': 2, 'sum': 7}
>>> graph(a=5)
{'a': 5, 'sum': 5}
- graphtik.modifiers.varargs(name: str)
A varargish vararg(), naming an iterable value in the inputs.
See also
Consult also the example test-case in: test/test_op.py:test_varargs(), in the full sources of the project.
Example:
>>> from graphtik import operation, compose, varargs
>>> def enlist(a, *b):
...     return [a] + list(b)
>>> graph = compose('mygraph',
...     operation(name='enlist', needs=['a', varargs('b')],
...               provides='sum')(enlist)
... )
>>> graph
NetworkOperation('mygraph', needs=['a', optional('b')], provides=['sum'], x1 ops: enlist)
The graph works with or without b in the inputs:
>>> graph(a=5, b=[2, 20])['sum']
[5, 2, 20]
>>> graph(a=5)
{'a': 5, 'sum': [5]}
>>> graph(a=5, b=0xBAD)
Traceback (most recent call last):
...
graphtik.base.MultiValueError: Failed preparing needs:
    1. Expected needs[varargs('b')] to be non-str iterables!
    +++inputs: ['a', 'b']
    +++FunctionalOperation(name='enlist', needs=['a', varargs('b')], provides=['sum'], fn='enlist')
Attention
To avoid user mistakes, varargs do not accept str inputs (even though strings are iterables):
>>> graph(a=5, b="mistake")
Traceback (most recent call last):
...
graphtik.base.MultiValueError: Failed preparing needs:
    1. Expected needs[varargs('b')] to be non-str iterables!
    +++inputs: ['a', 'b']
    +++FunctionalOperation(name='enlist', needs=['a', varargs('b')], provides=['sum'], fn='enlist')
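The difference between the two modifiers can be summarised in a short sketch: a vararg contributes one extra positional value when present, while a varargs contributes all the items of an iterable, with strings rejected. `build_args` is a hypothetical helper and its error message is made up for this illustration.

```python
from collections.abc import Iterable


def build_args(needs, vararg_names, varargs_names, inputs):
    """Hypothetical helper assembling the positional arguments for a call."""
    args = [inputs[n] for n in needs]
    for name in vararg_names:
        if name in inputs:
            # A vararg adds a single extra positional value.
            args.append(inputs[name])
    for name in varargs_names:
        if name in inputs:
            value = inputs[name]
            # A varargs must be an iterable, but never a plain string.
            if isinstance(value, str) or not isinstance(value, Iterable):
                raise ValueError(f"Expected {name!r} to be a non-str iterable")
            args.extend(value)
    return args


def addall(a, *b):
    return a + sum(b)


total = addall(*build_args(["a"], ["b", "c"], [], {"a": 5, "b": 2, "c": 4}))
partial_total = addall(*build_args(["a"], ["b", "c"], [], {"a": 5}))
flat = build_args(["a"], [], ["b"], {"a": 5, "b": [2, 20]})

try:
    build_args(["a"], [], ["b"], {"a": 5, "b": "mistake"})
    rejected = False
except ValueError:
    rejected = True
```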
Aliased provides¶
Sometimes, you need to interface functions & operations that name a
dependency differently.
This is doable without introducing a “pipe-through” interface operation, either
by annotating certain needs with mapped()
modifiers (above), or
by aliasing certain provides to different names:
>>> op = operation(str,
... name="`provides` with `aliases`",
... needs="anything",
... provides="real thing",
... aliases=("real thing", "phony"))
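Conceptually, an alias republishes a provided value under an additional name. The sketch below shows that idea with a hypothetical `apply_aliases` helper; representing the aliases as a list of (source, alias) pairs is an assumption of this illustration.

```python
def apply_aliases(results, aliases):
    """Hypothetical helper: republish provided values under extra names."""
    out = dict(results)
    for source, alias in aliases:
        # The aliased name points at the very same value as its source.
        out[alias] = out[source]
    return out


results = {"real thing": str(42)}
aliased = apply_aliases(results, [("real thing", "phony")])
print(aliased)
```

Downstream operations can then declare either name in their needs, with no intermediate "pipe-through" step.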
Considerations for when building pipelines¶
When many operations are composed into a computation graph, Graphtik matches up the values in their needs and provides to form the edges of that graph (see Pipelines for more on that), like the operations from the script in Quick start:
>>> from operator import mul, sub
>>> from functools import partial
>>> from graphtik import compose, operation
>>> def abspow(a, p):
...     """Compute |a|^p. """
...     c = abs(a) ** p
...     return c
>>> # Compose the mul, sub, and abspow operations into a computation graph.
>>> graphop = compose("graphop",
... operation(mul, needs=["a", "b"], provides=["ab"]),
... operation(sub, needs=["a", "ab"], provides=["a_minus_ab"]),
... operation(name="abspow1", needs=["a_minus_ab"], provides=["abs_a_minus_ab_cubed"])
... (partial(abspow, p=3))
... )
>>> graphop
NetworkOperation('graphop',
needs=['a', 'b', 'ab', 'a_minus_ab'],
provides=['ab', 'a_minus_ab', 'abs_a_minus_ab_cubed'],
x3 ops: mul, sub, abspow1)
Notice the use of functools.partial() to set the parameter p to a constant value.
The partial is wrapped by calling once more the “decorator” that operation() returns when it is called without a function.
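How matching names turn into graph edges, and how those edges dictate execution order, can be sketched with plain tuples standing in for operations. The `edges` list and the `run` function are hypothetical and deliberately naive; Graphtik's own planning and execution are more elaborate.

```python
from functools import partial
from operator import mul, sub


def abspow(a, p):
    """Compute |a|^p."""
    return abs(a) ** p


# (name, fn, needs, provides): plain tuples standing in for operations.
ops = [
    ("mul", mul, ["a", "b"], ["ab"]),
    ("sub", sub, ["a", "ab"], ["a_minus_ab"]),
    ("abspow1", partial(abspow, p=3), ["a_minus_ab"], ["abs_a_minus_ab_cubed"]),
]

# An edge runs from a producer to every consumer of one of its provides.
edges = [
    (producer, consumer, value)
    for producer, _, _, provides in ops
    for consumer, _, needs, _ in ops
    for value in provides
    if value in needs
]


def run(ops, inputs):
    """Run each operation once all of its needs are in the solution."""
    solution = dict(inputs)
    pending = list(ops)
    while pending:
        ready = [op for op in pending if all(n in solution for n in op[2])]
        if not ready:
            raise ValueError("Some needs can never be satisfied")
        for op in ready:
            _, fn, needs, provides = op
            # Each operation in this sketch provides exactly one value.
            solution[provides[0]] = fn(*(solution[n] for n in needs))
            pending.remove(op)
    return solution


solution = run(ops, {"a": 2, "b": 5})
print(edges)
print(solution)
```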
The needs
and provides
arguments to the operations in this script define
a computation graph that looks like this: