Development status: src 4.4.0, git v4.4.0 (Dec 21, 2019). License: Apache License, version 2.0.

It’s a DAG all the way down!

[Figure: the "pipeline" graph: a, b -> mul1 -> ab; a, ab -> sub1 -> a_minus_ab]

Lightweight computation graphs for Python

Graphtik is an understandable and lightweight Python module for building and running ordered graphs of computations. Its API aims for a fair compromise between features and complexity, without precluding either. It can be used as is to build machine-learning pipelines for data science projects, and it should be extendable to act as the core of a custom ETL engine or a workflow processor for interdependent files and processes.

Graphtik sprang from Graphkit to experiment with Python 3.6+ features.

Quick start

Here’s how to install:

pip install graphtik

Or, with the dependencies for plotting support (you also need to install the Graphviz program separately, using your OS tools):

pip install graphtik[plot]

Here’s a Python script with an example Graphtik computation graph that produces multiple outputs (a * b, a - a * b, and abs(a - a * b) ** 3):

>>> from operator import mul, sub
>>> from functools import partial
>>> from graphtik import compose, operation

# Computes |a|^p.
>>> def abspow(a, p):
...    c = abs(a) ** p
...    return c

Compose the mul, sub, and abspow functions into a computation graph:

>>> graphop = compose("graphop",
...    operation(name="mul1", needs=["a", "b"], provides=["ab"])(mul),
...    operation(name="sub1", needs=["a", "ab"], provides=["a_minus_ab"])(sub),
...    operation(name="abspow1", needs=["a_minus_ab"], provides=["abs_a_minus_ab_cubed"])
...    (partial(abspow, p=3))
... )

Run the graph-operation and request all of the outputs:

>>> graphop(**{'a': 2, 'b': 5})
{'a': 2, 'b': 5, 'ab': 10, 'a_minus_ab': -8, 'abs_a_minus_ab_cubed': 512}
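
To make the result above concrete, the same values can be reproduced with plain Python. The sketch below is not how Graphtik is implemented; it only illustrates the idea of running functions in dependency order, using the standard library's `graphlib` (Python 3.9+). The `OPS` table and the `run_all` helper are hypothetical names introduced for this illustration.

```python
from functools import partial
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from operator import mul, sub


def abspow(a, p):
    """Compute |a|^p."""
    return abs(a) ** p


# Hypothetical table (not a Graphtik structure): name -> (function, needs, provides).
OPS = {
    "mul1": (mul, ["a", "b"], "ab"),
    "sub1": (sub, ["a", "ab"], "a_minus_ab"),
    "abspow1": (partial(abspow, p=3), ["a_minus_ab"], "abs_a_minus_ab_cubed"),
}


def run_all(ops, inputs):
    """Run every operation after the operations that produce its needs."""
    producer = {provides: name for name, (_, _, provides) in ops.items()}
    # Map each operation to the set of operations it depends on.
    deps = {
        name: {producer[n] for n in needs if n in producer}
        for name, (_, needs, _) in ops.items()
    }
    values = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        fn, needs, provides = ops[name]
        values[provides] = fn(*(values[n] for n in needs))
    return values


result = run_all(OPS, {"a": 2, "b": 5})
print(result)
```

With a = 2 and b = 5 this gives ab = 10, a_minus_ab = 2 - 10 = -8 and abs_a_minus_ab_cubed = |-8| ** 3 = 512, matching the Graphtik output shown above.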

Run the graph-operation and request a subset of the outputs:

>>> solution = graphop.compute({'a': 2, 'b': 5}, outputs=["a_minus_ab"])
>>> solution
{'a_minus_ab': -8}
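
Requesting a subset of outputs means that operations which do not contribute to them need not run. As a rough illustration only (an assumption about the general technique, not a description of Graphtik's planning code), the sketch below walks backwards from the requested outputs to collect the operations that are actually needed. `GRAPH` and `needed_operations` are hypothetical names.

```python
# Hypothetical description of the graph: operation -> (needs, provides).
GRAPH = {
    "mul1": (["a", "b"], ["ab"]),
    "sub1": (["a", "ab"], ["a_minus_ab"]),
    "abspow1": (["a_minus_ab"], ["abs_a_minus_ab_cubed"]),
}


def needed_operations(graph, inputs, outputs):
    """Return the set of operations required to produce `outputs` from `inputs`."""
    producer = {p: name for name, (_, provides) in graph.items() for p in provides}
    needed = set()
    pending = list(outputs)
    while pending:
        data = pending.pop()
        if data in inputs or data not in producer:
            continue  # already given, or no operation computes it
        op = producer[data]
        if op not in needed:
            needed.add(op)
            pending.extend(graph[op][0])  # its needs must be satisfied too
    return needed


kept = needed_operations(GRAPH, inputs={"a", "b"}, outputs=["a_minus_ab"])
print(sorted(kept))  # ['mul1', 'sub1']
```

In this sketch `abspow1` is left out, because nothing requested depends on `abs_a_minus_ab_cubed`.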

… and plot the results (in Jupyter, there is no need to create the file):

>>> solution.plot('graphop.svg')      

[Figure: plot of the "graphop" solution, with an "after pruning" cluster and numbered execution steps, alongside the Graphtik legend (operation, execution step, executed, failed, reschedule, canceled, data, input, output, inp+out, evicted, in solution, overwrite, requirement, optional, sideffect, execution sequence)]

As you can see, any function can be used as an operation in Graphtik, even ones imported from system modules!