Development status: src 10.0.0, git v10.0.0 (Jul 19, 2020). License: Apache License, version 2.0.


It’s a DAG all the way down!

[Figure: sample solution graph. quarantine feeds the operation get_out_or_stay_home, which provides space and time; space feeds exercise, which provides fun and body; time feeds read_book, which provides fun and brain.]

Lightweight computation graphs for Python

Graphtik is an understandable and lightweight Python module for executing a graph of functions (a.k.a. a pipeline) on hierarchical data.

  • The API posits a fair compromise between features and complexity, without precluding any.

  • It can be used as is to build machine learning pipelines for data science projects.

  • It should be extendable to act as the core for a custom ETL engine, a workflow-processor for interdependent tasks & files like GNU Make, or an Excel-like spreadsheet.

Graphtik sprang from Graphkit (summer 2019, v1.2.2) to experiment with Python 3.6+ features, but has diverged significantly with enhancements ever since.



Quick start

Here’s how to install:

pip install graphtik

Or with the dependencies for plotting support (you also need to install the Graphviz program separately, with your OS tools):

pip install graphtik[plot]

Let’s build a Graphtik computation pipeline that produces 3 outputs from 2 inputs, a and b:

\[ \begin{aligned} a \times b \\ a - a \times b \\ |a - a \times b|^3 \end{aligned} \]

>>> from graphtik import compose, operation
>>> from operator import mul, sub
>>> @operation(name="abs qubed",
...            needs=["a_minus_ab"],
...            provides=["abs_a_minus_ab_cubed"])
... def abs_qubed(a):
...    return abs(a) ** 3

Compose the abs_qubed function along with the mul & sub built-ins into a computation graph:

>>> graphop = compose("graphop",
...    operation(mul, needs=["a", "b"], provides=["ab"]),
...    operation(sub, needs=["a", "ab"], provides=["a_minus_ab"]),
...    abs_qubed,
... )
>>> graphop
Pipeline('graphop', needs=['a', 'b', 'ab', 'a_minus_ab'],
                  provides=['ab', 'a_minus_ab', 'abs_a_minus_ab_cubed'],
                  x3 ops: mul, sub, abs qubed)

You may plot the function graph to a file like this (in Jupyter there is no need to specify the file; see Jupyter notebooks):

>>> graphop.plot('graphop.svg')      # doctest: +SKIP

As you can see, any function can be used as an operation in Graphtik, even ones imported from system modules.
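
To make the needs/provides idea concrete, here is a minimal pure-Python sketch of how such a graph of functions could be run in dependency order. It is not Graphtik's implementation: the `Op` tuple and the `run_ops` helper are hypothetical names invented for this illustration, and the ordering relies only on the standard library's `graphlib` (Python 3.9+).

```python
from collections import namedtuple
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from operator import mul, sub

# Hypothetical stand-in for an operation: a function plus the names it reads and writes.
Op = namedtuple("Op", "name fn needs provides")


def run_ops(ops, inputs):
    """Run each op after the ops that produce its needs; return all values."""
    producer = {op.provides: op for op in ops}  # value name -> op that makes it
    # An op depends on whichever ops produce its needs (caller inputs have no producer).
    deps = {
        op.name: {producer[n].name for n in op.needs if n in producer}
        for op in ops
    }
    by_name = {op.name: op for op in ops}
    values = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        op = by_name[name]
        values[op.provides] = op.fn(*(values[n] for n in op.needs))
    return values


# Deliberately listed out of order: the dependency sort decides the run order.
ops = [
    Op("abs qubed", lambda x: abs(x) ** 3, ["a_minus_ab"], "abs_a_minus_ab_cubed"),
    Op("sub", sub, ["a", "ab"], "a_minus_ab"),
    Op("mul", mul, ["a", "b"], "ab"),
]
result = run_ops(ops, {"a": 2, "b": 5})
```

The sketch handles only single-output operations and does no error checking; it is meant to show why plain functions such as `mul` and `sub` fit the model once their inputs and outputs are named.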

Run the graph-operation and request all of the outputs:

>>> sol = graphop(**{'a': 2, 'b': 5})
>>> sol
{'a': 2, 'b': 5, 'ab': 10, 'a_minus_ab': -8, 'abs_a_minus_ab_cubed': 512}
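
The values in this solution can be cross-checked with plain Python arithmetic, independent of Graphtik. The variable names below simply mirror the data names used in the pipeline.

```python
a, b = 2, 5

ab = a * b                                    # a × b
a_minus_ab = a - ab                           # a − a × b
abs_a_minus_ab_cubed = abs(a_minus_ab) ** 3   # |a − a × b|³

# Gather inputs and derived values under the same keys the pipeline uses.
expected = {
    "a": a,
    "b": b,
    "ab": ab,
    "a_minus_ab": a_minus_ab,
    "abs_a_minus_ab_cubed": abs_a_minus_ab_cubed,
}
```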

Solutions are plottable as well:

>>> sol.plot('solution.svg')      # doctest: +SKIP

Run the graph-operation and request a subset of the outputs:

>>> solution = graphop.compute({'a': 2, 'b': 5}, outputs=["a_minus_ab"])
>>> solution
{'a_minus_ab': -8}
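
Requesting a subset of the outputs implies that only the operations leading to them need to run. The following is a rough sketch of that pruning idea, walking backwards from the requested outputs. The names `OPS` and `ops_needed` are assumptions made for this illustration; it does not describe Graphtik's actual planning algorithm.

```python
# Hypothetical description of the example graph: op name -> (needs, provides).
OPS = {
    "mul": (["a", "b"], "ab"),
    "sub": (["a", "ab"], "a_minus_ab"),
    "abs qubed": (["a_minus_ab"], "abs_a_minus_ab_cubed"),
}


def ops_needed(ops, inputs, outputs):
    """Return the names of the ops required to derive `outputs` from `inputs`."""
    producer = {provides: name for name, (_, provides) in ops.items()}
    needed, pending = set(), list(outputs)
    while pending:
        value = pending.pop()
        if value in inputs or value not in producer:
            continue  # given by the caller, or nothing can produce it
        name = producer[value]
        if name not in needed:
            needed.add(name)
            pending.extend(ops[name][0])  # now chase this op's own needs
    return needed


needed = ops_needed(OPS, inputs={"a", "b"}, outputs=["a_minus_ab"])
```

For the request above, the sketch selects `mul` and `sub` and leaves out `abs qubed`, since nothing requested depends on it.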

The (interactive) legend for these plots is this:

>>> from graphtik.plot import legend
>>> l = legend()