4. API Reference

Package: graphtik

Lightweight computation graphs for Python.

Module: base

Mostly utilities

class graphtik.base.Plotter[source]

Classes wishing to plot their graphs should inherit this and implement property plot to return a “partial” callable that somehow ends up calling plot.render_pydot() with the graph or any other args bound appropriately. The purpose is to avoid copying this function & its documentation around.
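
The “partial” idea can be sketched with functools.partial. This is only an illustration under stated assumptions: render() is a hypothetical stand-in for plot.render_pydot(), and MyNet is a made-up class, not part of graphtik.

```python
from functools import partial


def render(graph, filename=None, show=False, **kws):
    """Hypothetical stand-in for plot.render_pydot(); it only reports its args."""
    return f"render({graph!r}, filename={filename!r}, show={show!r})"


class MyNet:
    """Made-up class that owns a graph and exposes it through `plot`."""

    def __init__(self, graph):
        self.graph = graph

    @property
    def plot(self):
        # Bind this object's own graph; callers pass only the remaining args.
        return partial(render, self.graph)


text = MyNet("g1").plot("out.svg", show=True)
print(text)
```

Because the graph is already bound, callers never pass it themselves, which matches the note that the graph argument is absent from plot().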

plot(filename=None, show=False, jupyter_render: Union[None, Mapping[KT, VT_co], str] = None, **kws)[source]

Entry-point for plotting ready made operation graphs.

Parameters:
  • filename (str) – Write diagram into a file. Common extensions are .png, .dot, .jpg, .jpeg, .pdf and .svg; call plot.supported_plot_formats() for more.
  • show – If it evaluates to true, opens the diagram in a matplotlib window. If it equals -1, it plots but does not open the window.
  • inputs – an optional name list, any nodes in there are plotted as a “house”
  • outputs – an optional name list, any nodes in there are plotted as an “inverted-house”
  • solution – an optional dict with values to annotate nodes, drawn “filled” (currently content not shown, but node drawn as “filled”)
  • executed – an optional container with operations executed, drawn “filled”
  • title – an optional string to display at the bottom of the graph
  • node_props – an optional nested dict of Graphviz attributes for certain nodes
  • edge_props – an optional nested dict of Graphviz attributes for certain edges
  • clusters – an optional mapping of nodes –> cluster-names, to group them
  • jupyter_render – a nested dictionary controlling the rendering of graph-plots in Jupyter cells. If None, defaults to default_jupyter_render (you may modify it in place and it will apply for all future calls).
Returns:

a pydot.Dot instance (for API reference visit: https://pydotplus.readthedocs.io/reference.html#pydotplus.graphviz.Dot)

Tip

The pydot.Dot instance returned is rendered directly in Jupyter/IPython notebooks as SVG images.

You may increase the height of the SVG cell output with something like this:

netop.plot(svg_element_styles="height: 600px; width: 100%")

Check default_jupyter_render for defaults.

Note that the graph argument is absent: each Plotter provides its own graph internally. Use render_pydot() directly to provide a different graph.

Graphtik Legend

NODES:

  • oval – function
  • egg – subgraph operation
  • house – given input
  • inverted-house – asked output
  • polygon – given both as input & asked as output (what?)
  • square – intermediate data, neither given nor asked
  • red frame – evict-instruction, to free up memory
  • blue frame – pinned-instruction, not to overwrite intermediate inputs
  • filled – data node has a value in solution OR function has been executed
  • thick frame – function/data node in execution steps

ARROWS:

  • solid black arrows – dependencies (source-data needed by target-operations, source-operations provide target-data)
  • dashed black arrows – optional needs
  • blue arrows – sideffect needs/provides
  • wheat arrows – broken dependency (provide) during pruning
  • green-dotted arrows – execution steps labeled in succession

To generate the legend, see legend().

Sample code:

>>> from graphtik import compose, operation
>>> from graphtik.modifiers import optional
>>> from operator import add
>>> netop = compose("netop",
...     operation(name="add", needs=["a", "b1"], provides=["ab1"])(add),
...     operation(name="sub", needs=["a", optional("b2")], provides=["ab2"])(lambda a, b=1: a-b),
...     operation(name="abb", needs=["ab1", "ab2"], provides=["asked"])(add),
... )
>>> netop.plot(show=True);                 # plot just the graph in a matplotlib window # doctest: +SKIP
>>> inputs = {'a': 1, 'b1': 2}
>>> solution = netop(**inputs)             # now plots will include the execution-plan
>>> netop.plot('plot1.svg', inputs=inputs, outputs=['asked', 'b1'], solution=solution);           # doctest: +SKIP
>>> dot = netop.plot(solution=solution);   # just get the `pydot.Dot` object, renderable in Jupyter
>>> print(dot)
digraph G {
  fontname=italic;
  label=netop;
  a [fillcolor=wheat, shape=invhouse, style=filled, tooltip=1];
...
graphtik.base.aslist(i, argname, allowed_types=<class 'list'>)[source]

Utility to accept singular strings as lists, and None –> [].

graphtik.base.astuple(i, argname, allowed_types=<class 'tuple'>)[source]
graphtik.base.jetsam(ex, locs, *salvage_vars, annotation='jetsam', **salvage_mappings)[source]

Annotate exception with salvaged values from locals() and raise!

Parameters:
  • ex – the exception to annotate
  • locs

    locals() from the context-manager’s block containing vars to be salvaged in case of exception

    ATTENTION: wrapped function must finally call locals(), because locals dictionary only reflects local-var changes after call.

  • annotation – the name of the attribute to attach on the exception
  • salvage_vars – local variable names to save as is in the salvaged annotations dictionary.
  • salvage_mappings – a mapping of destination-annotation-keys –> source-locals-keys; if a source is callable, the value to salvage is retrieved by calling value(locs). They take precedence over salvage_vars.
Raises:

any exception raised by the wrapped function, annotated with values assigned as attributes on this context-manager

  • Any attributes attached on this manager are attached on the raised exception as a new jetsam attribute, with a dict as value.
  • If the exception is already annotated, any new items are inserted, but existing ones are preserved.

Example:

Call it with managed-block’s locals() and tell which of them to salvage in case of errors:

try:
    a = 1
    b = 2
    raise Exception()
except Exception as ex:
    jetsam(ex, locals(), "a", salvaged_b="b", c_var="c")

And then from a REPL:

import sys
sys.last_value.jetsam
{'a': 1, 'salvaged_b': 2, "c_var": None}

Reason:

Graphs may become arbitrarily deep. Debugging such graphs is notoriously hard.

The purpose is not to require a debugger-session to inspect the root-causes (without precluding one).

Naively salvaging values with a simple try/except block around each function blocks the debugger from landing on the real cause of the error: it would land on that block instead, and that could be many nested levels above the cause.
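
The annotation rules listed above (positional names salvaged as-is, keyword mappings of destination –> source, existing annotations preserved) can be sketched in plain Python. The salvage() helper below is a hypothetical stand-in written for illustration; it is not graphtik's jetsam() code and may differ from it in details such as how the exception is re-raised.

```python
def salvage(ex, locs, *salvage_vars, annotation="jetsam", **salvage_mappings):
    """Hypothetical stand-in for jetsam(): annotate `ex`, then re-raise it."""
    annotations = getattr(ex, annotation, None)
    if not isinstance(annotations, dict):
        annotations = {}
        setattr(ex, annotation, annotations)
    # Positional names map to themselves; explicit mappings take precedence.
    mappings = {name: name for name in salvage_vars}
    mappings.update(salvage_mappings)
    for dst_key, src in mappings.items():
        value = src(locs) if callable(src) else locs.get(src)
        # Existing annotations are preserved; only new keys are inserted.
        annotations.setdefault(dst_key, value)
    raise ex


def failing_step():
    try:
        a = 1
        b = 2
        raise ValueError("boom")
    except ValueError as ex:
        salvage(ex, locals(), "a", salvaged_b="b", c_var="c")


try:
    failing_step()
except ValueError as ex:
    salvaged = ex.jetsam  # the salvaged values travel with the exception

print(salvaged)
```

The missing local "c" is salvaged as None rather than raising, so a failed salvage never hides the original error.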

Module: op

About operation nodes (but not net-ops, to avoid a dependency cycle).

class graphtik.op.FunctionalOperation(fn: Callable, name, needs=None, provides=None, *, returns_dict=None)[source]

An Operation performing a callable (i.e. function, method, lambda).

Use operation() factory to build instances of this class instead.

compute(named_inputs, outputs=None) → dict[source]

Compute (optional) asked outputs for the given named_inputs.

It is called by Network. End-users should simply call the operation with named_inputs as kwargs.

Parameters:
  • named_inputs (list) – A list of Data objects on which to run the layer’s feed-forward computation.
Returns:

a list of values representing the results of running the feed-forward computation on inputs.

class graphtik.op.Operation(name, needs=None, provides=None)[source]

An abstract class representing a data transformation by compute().

compute(named_inputs, outputs=None)[source]

Compute (optional) asked outputs for the given named_inputs.

It is called by Network. End-users should simply call the operation with named_inputs as kwargs.

Parameters:
  • named_inputs (list) – A list of Data objects on which to run the layer’s feed-forward computation.
Returns:

a list of values representing the results of running the feed-forward computation on inputs.

class graphtik.op.operation(fn: Callable = None, *, name=None, needs=None, provides=None, returns_dict=None)[source]

A builder for graph-operations wrapping functions.

Parameters:
  • fn (function) – The function used by this operation. This does not need to be specified when the operation object is instantiated and can instead be set via __call__ later.
  • name (str) – The name of the operation in the computation graph.
  • needs (list) – Names of input data objects this operation requires. These should correspond to the args of fn.
  • provides (list) – Names of output data objects this operation provides. If more than one is given, those must be returned in an iterable, unless returns_dict is true, in which case a dictionary with as many elements must be returned.
  • returns_dict (bool) – if true, it means the fn returns a dictionary with all provides, and no further processing is done on them.
Returns:

when called, it returns a FunctionalOperation
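
To illustrate how provides and returns_dict interact, here is a minimal sketch of mapping a function's return value onto output names. zip_results() is a hypothetical helper, not a graphtik function, and treating a single provide as "the whole return value" is an assumption inferred from the parameter descriptions above.

```python
def zip_results(provides, result, returns_dict=False):
    """Map a function's return value onto the names in `provides`."""
    if returns_dict:
        # The function already returned {name: value}; use it unprocessed.
        return dict(result)
    if len(provides) == 1:
        # Assumption: a single provide takes the whole return value.
        return {provides[0]: result}
    # Several provides: the function must have returned an iterable.
    return dict(zip(provides, result))


single = zip_results(["sum"], 1 + 2)
several = zip_results(["quotient", "remainder"], divmod(7, 2))
as_dict = zip_results(["q", "r"], {"q": 3, "r": 1}, returns_dict=True)
print(single, several, as_dict)
```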

Example:

This is an example of its use, based on the “builder pattern”:

>>> from graphtik import operation

>>> opb = operation(name='add_op')
>>> opb.withset(needs=['a', 'b'])
operation(name='add_op', needs=['a', 'b'], provides=[], fn=None)
>>> opb.withset(provides='SUM', fn=sum)
operation(name='add_op', needs=['a', 'b'], provides=['SUM'], fn='sum')

You may keep calling withset() till you invoke a final __call__() on the builder; then you get the actual FunctionalOperation instance:

>>> # Create `Operation` and overwrite function at the last moment.
>>> opb(sum)
FunctionalOperation(name='add_op', needs=['a', 'b'], provides=['SUM'], fn='sum')
withset(*, fn=None, name=None, needs=None, provides=None, returns_dict=None) → graphtik.op.operation[source]
graphtik.op.reparse_operation_data(name, needs, provides)[source]

Validate & reparse operation data as lists.

Kept as a separate function so that client code can reuse it when building operations, and detect errors early.

Module: netop

About network-operations (those based on graphs)

class graphtik.netop.NetworkOperation(net, name, *, inputs=None, outputs=None, method=None, overwrites_collector=None)[source]

An Operation performing a network-graph of other operations.

Tip

Use compose() factory to prepare the net and build instances of this class.

compute(named_inputs, outputs=None, recompile=None) → dict[source]

Solve & execute the graph, sequentially or in parallel.

See also Operation.compute().

Parameters:
  • named_inputs (dict) – A mapping of names –> values that must contain at least the compulsory inputs that were specified when the plan was built (but cannot enforce that!). Cloned, not modified.
  • outputs – a string or a list of strings with all data asked to compute. If you set this variable to None, all data nodes will be kept and returned at runtime.
  • recompile
    • if False, uses fixed plan;
    • if true, recompiles a temporary plan from network;
    • if None, assumed true if outputs given (is not None).

    In all cases, last_plan is updated.

Returns:

a dictionary of output data objects, keyed by name.

Raises:

ValueError

  • If given inputs mismatched plan.needs, with msg:

    Plan needs more inputs…

  • If outputs asked do not exist in network, with msg:

    Unknown output nodes: …

  • If outputs asked cannot be produced by the graph, with msg:

    Impossible outputs…

  • If cannot produce any outputs from the given inputs, with msg:

    Unsolvable graph: …
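
The tri-state recompile parameter described above can be read as the following small decision function. It is a sketch of the documented rule only; should_recompile() is a hypothetical name, not part of graphtik.

```python
def should_recompile(recompile, outputs):
    """Resolve the tri-state `recompile` flag into a plain boolean."""
    if recompile is None:
        # Unset: recompile only when the caller asked for specific outputs.
        return outputs is not None
    return bool(recompile)


print(should_recompile(None, None))        # fixed plan is reused
print(should_recompile(None, ["asked"]))   # temporary plan is compiled
```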

inputs = None[source]

The inputs names (possibly None) used to compile the plan.

last_plan = None[source]

The execution_plan of the last call to compute(), stored as debugging aid.

method = None[source]

The execution method; by default (None), single-threaded sequential.

narrow(inputs: Collection[T_co] = None, outputs: Collection[T_co] = None, name=None) → graphtik.netop.NetworkOperation[source]

Return a copy with a network pruned for the given needs & provides.

Parameters:
  • inputs – a collection of inputs that must be given to compute(); a WARNing is issued for any irrelevant arguments. If None, they are collected from the net. They become the needs of the returned netop.
  • outputs – a collection of outputs that will be asked from compute(); RAISES if those cannot be satisfied. If None, they are collected from the net. They become the provides of the returned netop.
  • name

    the name for the new netop:

    • if None, the same name is kept;
    • if True, a distinct name is devised:
      <old-name>-<uid>
      
    • otherwise, the given name is applied.
Returns:

a cloned netop with a narrowed plan

Raises:

ValueError

  • If outputs asked do not exist in network, with msg:

    Unknown output nodes: …

  • If outputs asked cannot be produced by the graph, with msg:

    Impossible outputs…

  • If cannot produce any outputs from the given inputs, with msg:

    Unsolvable graph: …

outputs = None[source]

The outputs names (possibly None) used to compile the plan.

overwrites_collector = None[source]
plan = None[source]

The narrowed plan enforcing unvarying needs & provides when compute() is called with recompile=False (the default is recompile=None, which means: recompile only if outputs are given).

set_execution_method(method)[source]

Determine how the network will be executed.

Parameters:
  • method (str) – If “parallel”, execute graph operations concurrently using a threadpool.

set_overwrites_collector(collector)[source]

Asks to put all overwrites into the collector after computing.

An “overwrite” is an intermediate value calculated but NOT stored into the results, because it has also been given as an intermediate input value, and the operation that would overwrite it MUST run for its other results.

Parameters:
  • collector – a mutable dict to be filled with named values

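A minimal sketch of the “overwrites” idea, assuming a single operation with two outputs, one of which the user has also given as input. run_with_pins() and split() are hypothetical names made up for this illustration; they do not mirror graphtik's internals.

```python
def run_with_pins(fn, provides, solution, pinned, overwrites_collector=None):
    """Run `fn`, but keep the given values of `pinned` names in `solution`.

    Calculated values that would overwrite a pinned input go into
    `overwrites_collector` (if one is given) instead of the solution.
    """
    results = dict(zip(provides, fn(solution)))
    for name, value in results.items():
        if name in pinned:
            if overwrites_collector is not None:
                overwrites_collector[name] = value
        else:
            solution[name] = value
    return solution


def split(sol):
    # Produces two values; the second name is also given by the user below.
    return sol["a"] + 1, sol["a"] * 10


collector = {}
solution = run_with_pins(
    split, ["x", "y"], {"a": 2, "y": 99},
    pinned={"y"}, overwrites_collector=collector,
)
print(solution, collector)
```

The operation still runs, because "x" is needed, while the user-given "y" survives and the calculated "y" lands in the collector.
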
graphtik.netop.compose(name, op1, *operations, needs=None, provides=None, merge=False, method=None, overwrites_collector=None) → graphtik.netop.NetworkOperation[source]

Composes a collection of operations into a single computation graph, obeying the merge property, if set in the constructor.

Parameters:
  • name (str) – An optional name for the graph being composed by this object.
  • op1 – syntactically force at least 1 operation
  • operations – Each argument should be an operation instance created using operation.
  • merge (bool) – If True, this compose object will attempt to merge together operation instances that represent entire computation graphs. Specifically, if one of the operation instances passed to this compose object is itself a graph operation created by an earlier use of compose, the sub-operations in that graph are compared against other operations passed to this compose instance (as well as the sub-operations of other graphs passed to this compose instance). If any two operations are the same (based on name), then that operation is computed only once, instead of multiple times (one for each time the operation appears).
  • method – either parallel or None (default); if "parallel", launches multi-threading. Set when invoking a composed graph or by set_execution_method().
  • overwrites_collector – (optional) a mutable dict to be filled with named values. If missing, values are simply discarded.
Returns:

Returns a special type of operation class, which represents an entire computation graph as a single operation.

Raises:

ValueError – If the net cannot produce the asked outputs from the given inputs.
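
The name-based de-duplication that merge describes can be sketched as follows. This is an illustration only: merge_by_name() is a hypothetical helper, operations are modelled as plain (name, function) pairs, and keeping the first definition seen is an assumption the text above does not confirm.

```python
def merge_by_name(*operation_lists):
    """Flatten lists of (name, fn) pairs, keeping one entry per name."""
    merged = {}
    for operations in operation_lists:
        for name, fn in operations:
            # Assumption: the first definition wins; later duplicates are dropped.
            merged.setdefault(name, fn)
    return merged


graph1 = [("add", sum), ("scale", abs)]
graph2 = [("add", sum), ("negate", abs)]
merged = merge_by_name(graph1, graph2)
print(list(merged))
```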

Module: network

Network-based computation of operations & data.

The execution of network operations is split into 2 phases:

COMPILE:
prune unsatisfied nodes, sort dag topologically & solve it, and derive the execution steps (see below) based on the given inputs and asked outputs.
EXECUTE:
sequential or parallel invocation of the underlying functions of the operations with arguments from the solution.
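
The two phases can be sketched with the standard library alone (graphlib needs Python 3.9+). This is a simplified illustration, not graphtik's algorithm: the OPERATIONS table, compile_plan() and execute_plan() are hypothetical, each operation provides exactly one value, and no eviction or pinning is modelled.

```python
from graphlib import TopologicalSorter
from operator import add, sub

# Hypothetical operation table: name -> (function, needs, provides).
OPERATIONS = {
    "add": (add, ["a", "b1"], "ab1"),
    "sub": (sub, ["a", "b2"], "ab2"),
    "abb": (add, ["ab1", "ab2"], "asked"),
}


def compile_plan(operations, inputs, outputs):
    """COMPILE: keep only operations needed for `outputs`, then sort them."""
    producers = {prov: name for name, (_, _, prov) in operations.items()}
    needed, pending = {}, list(outputs)
    while pending:
        data = pending.pop()
        if data in inputs:
            continue  # given values need no producer
        if data not in producers:
            raise ValueError(f"Unsolvable graph: cannot produce {data!r}")
        op = producers[data]
        if op in needed:
            continue
        _, needs, _ = operations[op]
        # Predecessors of `op`: the producers of its needs that were not given.
        needed[op] = {producers[n] for n in needs
                      if n not in inputs and n in producers}
        pending.extend(needs)
    return list(TopologicalSorter(needed).static_order())


def execute_plan(operations, steps, inputs):
    """EXECUTE: call each function in order, feeding it from the solution."""
    solution = dict(inputs)
    for name in steps:
        fn, needs, provides = operations[name]
        solution[provides] = fn(*(solution[n] for n in needs))
    return solution


inputs = {"a": 1, "b1": 2, "b2": 3}
steps = compile_plan(OPERATIONS, inputs, ["ab1"])   # "sub" and "abb" are pruned
solution = execute_plan(OPERATIONS, steps, inputs)
print(steps, solution["ab1"])
```

Asking only for "ab1" prunes the other two operations at compile time, so they are never invoked during execution.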

Computations are based on 5 data-structures:

Network.graph

A networkx graph (yet a DAG) containing interchanging layers of Operation and _DataNode nodes. They are laid out and connected by repeated calls of add_OP().

The computation starts with prune() extracting a DAG subgraph by pruning its nodes based on given inputs and requested outputs in compute().

ExecutionPlan.dag

A directed-acyclic-graph containing the pruned nodes as built by prune(). This pruned subgraph is used to decide the ExecutionPlan.steps (below). The containing ExecutionPlan.steps instance is cached in _cached_plans across runs with inputs/outputs as key.

ExecutionPlan.steps

It is the list of the operation-nodes only from the dag (above), topologically sorted, and interspersed with instruction steps needed to complete the run. It is built by _build_execution_steps() based on the subgraph dag extracted above. The containing ExecutionPlan.steps instance is cached in _cached_plans across runs with inputs/outputs as key.

The instructions items achieve the following:

  • _EvictInstruction: evicts items from solution as soon as they are not needed further down the dag, to reduce memory footprint while computing.
  • _PinInstruction: avoids overwriting any given intermediate inputs, while still allowing their providing operations to run (because they are needed for their other outputs).

var solution

A local-var in compute(), initialized on each run to hold the values of the given inputs, generated (intermediate) data, and output values. It is returned as is if no specific outputs are requested; no data-eviction happens then.

arg overwrites

The optional argument given to compute() to collect the intermediate calculated values that are overwritten by intermediate (aka “pinned”) input-values.

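How evict-instructions might be interspersed among sorted operations can be sketched as below. This is a simplified illustration, not _build_execution_steps(): add_evictions() is a hypothetical helper, steps are plain tuples instead of instruction objects, and only data consumed by some operation is considered for eviction.

```python
def add_evictions(steps, needs_of, outputs):
    """Insert ("evict", name) right after the last operation that uses `name`.

    `steps` is a list of operation names, already topologically sorted.
    Asked `outputs` are never evicted.
    """
    last_use = {}
    for index, op in enumerate(steps):
        for data in needs_of[op]:
            last_use[data] = index
    plan = []
    for index, op in enumerate(steps):
        plan.append(("run", op))
        for data in dict.fromkeys(needs_of[op]):  # de-duplicate, keep order
            if last_use[data] == index and data not in outputs:
                plan.append(("evict", data))
    return plan


needs_of = {"add": ["a", "b1"], "sub": ["a", "b2"], "abb": ["ab1", "ab2"]}
plan = add_evictions(["add", "sub", "abb"], needs_of, outputs={"asked"})
for step in plan:
    print(step)
```

Here "a" is evicted only after "sub", its last consumer, while "b1" can go right after "add".
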
exception graphtik.network.AbortedException[source]

Raised from the Network code when abort_run() is called.

graphtik.network._execution_configs = <ContextVar name='execution_configs' default={'execution_pool': <multiprocessing.pool.ThreadPool object>, 'abort': False, 'skip_evictions': False}>[source]

Global configurations for all (nested) networks in a computation run.

class graphtik.network.Network(*operations)[source]

Assemble operations & data into a directed-acyclic-graph (DAG) to run them.

Variables:
  • needs – the “base”, all data-nodes that are not produced by some operation
  • provides – the “base”, all data-nodes produced by some operation
class graphtik.network.ExecutionPlan[source]

The result of the network’s compilation phase.

Note that the execution plan’s attributes are immutable tuples on purpose.

Variables:
  • net – The parent Network
  • needs – A tuple with the input names needed to exist in order to produce all provides.
  • provides – A tuple with the output names produced when all inputs are given.
  • dag – The regular (not broken) pruned subgraph of net-graph.
  • broken_edges – Tuple of broken incoming edges to given data.
  • steps – The tuple of operation-nodes & instructions needed to evaluate the given inputs & asked outputs, free memory and avoid overwriting any given intermediate inputs.
  • evict – when false, keep all inputs & outputs, and skip prefect-evictions check.

Module: plot

Plotting graphtik graphs.

graphtik.plot.build_pydot(graph, steps=None, inputs=None, outputs=None, solution=None, executed=None, title=None, node_props=None, edge_props=None, clusters=None) → pydot.Dot[source]

Build a Graphviz graph out of a Network graph/steps/inputs/outputs and return it.

See Plotter.plot() for the arguments, sample code, and the legend of the plots.

graphtik.plot.default_jupyter_render = {'svg_container_styles': '', 'svg_element_styles': 'width: 100%; height: 300px;', 'svg_pan_zoom_json': '{controlIconsEnabled: true, zoomScaleSensitivity: 0.4, fit: true}'}[source]

A nested dictionary controlling the rendering of graph-plots in Jupyter cells, as those returned from Plotter.plot() (currently as SVGs). Either modify it in place, or pass another one in the respective methods.

The following keys are supported.

Parameters:
  • svg_pan_zoom_json

    arguments controlling the rendering of a zoomable SVG in Jupyter notebooks, as defined in https://github.com/ariutta/svg-pan-zoom#how-to-use. If None, defaults to the string (maps are also supported):

    "{controlIconsEnabled: true, zoomScaleSensitivity: 0.4, fit: true}"
    
  • svg_element_styles

    mostly for sizing the zoomable SVG in Jupyter notebooks. Inspect & experiment on the html page of the notebook with browser tools. If None, defaults to the string (maps are also supported):

    "width: 100%; height: 300px;"
    
  • svg_container_styles – like svg_element_styles; if None, defaults to the empty string (maps are also supported).
graphtik.plot.legend(filename=None, show=None, jupyter_render: Optional[Mapping[KT, VT_co]] = None)[source]

Generate a legend for all plots (see Plotter.plot() for args)

graphtik.plot.render_pydot(dot: pydot.Dot, filename=None, show=False, jupyter_render: str = None)[source]

Plot a Graphviz dot in a matplotlib window, write it into a file, or return it for Jupyter.

Parameters:
  • dot – the pre-built Graphviz pydot.Dot instance
  • filename (str) – Write diagram into a file. Common extensions are .png, .dot, .jpg, .jpeg, .pdf and .svg; call plot.supported_plot_formats() for more.
  • show – If it evaluates to true, opens the diagram in a matplotlib window. If it equals -1, it returns the image but does not open the window.
  • jupyter_render – a nested dictionary controlling the rendering of graph-plots in Jupyter cells. If None, defaults to default_jupyter_render (you may modify those in place and they will apply for all future calls).
Returns:

the matplotlib image if show=-1, or the dot.

See Plotter.plot() for sample code.

graphtik.plot.supported_plot_formats() → List[str][source]

Return all the extensions supported by pydot, discovered automatically.