Graphtik
(src: 5.7.1, git: v5.7.1, Apr 07, 2020)
It’s a DAG all the way down!
Lightweight computation graphs for Python
Graphtik is an understandable and lightweight Python module for building and running ordered graphs of computations. The API aims for a fair compromise between features and complexity, without precluding either. It can be used as is to build machine learning pipelines for data science projects. It should also be extendable to act as the core of a custom ETL engine or a workflow processor for interdependent files and processes.
Graphtik sprang from Graphkit to experiment with Python 3.6+ features.
- 1. Operations
- 2. Graph Composition
- 3. Plotting and Debugging
- 4. Architecture
- 5. API Reference
- 6. Changes
- TODOs
- GitHub Releases
- Changelog
- v5.7.1 (7 Apr 2020, @ankostis): Plot job, fix RTD deps
- v5.7.0 (6 Apr 2020, @ankostis): FIX +SphinxExt in Wheel
- v5.6.0 (6 Apr 2020, @ankostis): +check_if_incomplete
- v5.5.0 (1 Apr 2020, @ankostis): ortho plots
- v5.4.0 (29 Mar 2020, @ankostis): auto-name ops, dogfood quickstart
- v5.3.0 (28 Mar 2020, @ankostis): Sphinx plots, fail-early on bad op
- v5.2.2 (03 Mar 2020, @ankostis): stuck in PARALLEL, fix Impossible Outs, plot quoting, legend node
- v5.2.1 (28 Feb 2020, @ankostis): fix plan cache on skip-evictions, PY3.8 TCs, docs
- v5.2.0 (27 Feb 2020, @ankostis): Map needs inputs –> args, SPELLCHECK
- v5.1.0 (22 Jan 2020, @ankostis): accept named-tuples/objects provides
- v5.0.0 (31 Dec 2019, @ankostis): Method–>Parallel, all configs now per op flags; Screaming Solutions on fails/partials
- v4.4.1 (22 Dec 2019, @ankostis): bugfix debug print
- v4.4.0 (21 Dec 2019, @ankostis): RESCHEDULE for PARTIAL Outputs, on a per op basis
- v4.3.0 (16 Dec 2019, @ankostis): Aliases
- v4.2.0 (16 Dec 2019, @ankostis): ENDURED Execution
- v4.1.0 (13 Dec 2019, @ankostis): ChainMap Solution for Rewrites, stable TOPOLOGICAL sort
- v4.0.1 (12 Dec 2019, @ankostis): bugfix
- v4.0.0 (11 Dec 2019, @ankostis): NESTED merge, revert v3.x Unvarying, immutable OPs, “color” nodes
- v3.1.0 (6 Dec 2019, @ankostis): cooler prune()
- v3.0.0 (2 Dec 2019, @ankostis): UNVARYING NetOperations, narrowed, API refact
- v2.3.0 (24 Nov 2019, @ankostis): Zoomable SVGs & more op jobs
- v2.2.0 (20 Nov 2019, @ankostis): enhance OPERATIONS & restruct their modules
- v2.1.1 (12 Nov 2019, @ankostis): global configs
- v2.1.0 (20 Oct 2019, @ankostis): DROP BW-compatible, Restruct modules/API, Plan perfect evictions
- v2.0.0b1 (15 Oct 2019, @ankostis): Rebranded as Graphtik for Python 3.6+
- v1.3.0 (Oct 2019, @ankostis): NEVER RELEASED: new DAG solver, better plotting & “sideffect”
- v1.2.4 (Mar 7, 2018)
- 1.2.2 (Mar 7, 2018, @huyng): Fixed versioning
- 1.2.1 (Feb 23, 2018, @huyng): Fixed multi-threading bug and faster compute through caching of find_necessary_steps
- 1.2.0 (Feb 13, 2018, @huyng)
- 1.1.0 (Nov 9, 2017, @huyng)
- 1.0.4 (Nov 3, 2017, @huyng): Networkx 2.0 compatibility
- 1.0.3 (Jan 31, 2017, @huyng): Make plotting dependencies optional
- 1.0.2 (Sep 29, 2016, @pumpikano): Merge pull request yahoo#5 from yahoo/remove-packaging-dep
- 1.0.1 (Aug 24, 2016)
- 1.0 (Aug 2, 2016, @robwhess)
- 7. Index
Quick start
Here’s how to install:
pip install graphtik
Or, with the dependencies for plotting support (the Graphviz program must be installed separately with your OS tools):
pip install graphtik[plot]
Let’s build a Graphtik computation graph that produces 3 outputs from 2 inputs, a and b:
>>> from graphtik import compose, operation
>>> from operator import mul, sub
>>> @operation(name="abs qubed",
...            needs=["a_minus_ab"],
...            provides=["abs_a_minus_ab_cubed"])
... def abs_qubed(a):
...     return abs(a) ** 3
Compose the abs_qubed function along with the mul & sub built-ins into a computation graph:
>>> graphop = compose("graphop",
...     operation(needs=["a", "b"], provides=["ab"])(mul),
...     operation(sub, needs=["a", "ab"], provides=["a_minus_ab"])(),
...     abs_qubed,
... )
>>> graphop
NetworkOperation('graphop', needs=['a', 'b', 'ab', 'a_minus_ab'],
provides=['ab', 'a_minus_ab', 'abs_a_minus_ab_cubed'],
x3 ops: <built-in function mul>, <built-in function sub>, abs qubed)
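The needs and provides lists in the repr above are consistent with collecting, in declaration order, every name that any inner operation needs or provides. The following is a minimal plain-Python sketch of that bookkeeping; `op_specs` and `unique_in_order` are hypothetical names used for illustration only, and this is not claimed to be how Graphtik computes the lists internally.

```python
# Hypothetical (needs, provides) pairs mirroring the three composed operations.
op_specs = [
    (["a", "b"], ["ab"]),                            # mul
    (["a", "ab"], ["a_minus_ab"]),                   # sub
    (["a_minus_ab"], ["abs_a_minus_ab_cubed"]),      # abs qubed
]


def unique_in_order(groups):
    """Flatten groups of names, keeping only the first occurrence of each."""
    seen = {}  # dicts preserve insertion order in Python 3.7+
    for group in groups:
        for name in group:
            seen.setdefault(name, None)
    return list(seen)


all_needs = unique_in_order(needs for needs, _ in op_specs)
all_provides = unique_in_order(provides for _, provides in op_specs)

print(all_needs)     # ['a', 'b', 'ab', 'a_minus_ab']
print(all_provides)  # ['ab', 'a_minus_ab', 'abs_a_minus_ab_cubed']
```

Note that intermediate names such as ab appear in both lists: they can be computed by the graph, but they are also accepted as inputs.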
You may plot the function graph to a file like this (in Jupyter, there is no need to specify the file):
>>> graphop.plot('graphop.svg') # doctest: +SKIP
As you can see, any function can be used as an operation in Graphtik, even ones imported from system modules.
Run the graph-operation and request all of the outputs:
>>> sol = graphop(**{'a': 2, 'b': 5})
>>> sol
{'a': 2, 'b': 5, 'ab': 10, 'a_minus_ab': -8, 'abs_a_minus_ab_cubed': 512}
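The values in the solution above can be checked by hand. The snippet below repeats the same three steps in plain Python, without Graphtik, purely as an arithmetic cross-check of the printed result.

```python
from operator import mul, sub

a, b = 2, 5

ab = mul(a, b)                               # 2 * 5 = 10
a_minus_ab = sub(a, ab)                      # 2 - 10 = -8
abs_a_minus_ab_cubed = abs(a_minus_ab) ** 3  # 8 ** 3 = 512

by_hand = {
    "a": a,
    "b": b,
    "ab": ab,
    "a_minus_ab": a_minus_ab,
    "abs_a_minus_ab_cubed": abs_a_minus_ab_cubed,
}
print(by_hand)
```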
Run the graph-operation and request a subset of the outputs:
>>> solution = graphop.compute({'a': 2, 'b': 5}, outputs=["a_minus_ab"])
>>> solution
{'a_minus_ab': -8}
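Requesting a subset of the outputs means that only the operations leading to those outputs have to run; here, abs qubed is not needed. The sketch below illustrates that idea in plain Python by walking backwards from the requested outputs. It is a simplified illustration, not Graphtik's actual planner: `OPS`, `needed_ops` and `run` are hypothetical names, each operation provides a single value, and the declaration order is assumed to already be a valid execution order.

```python
from operator import mul, sub


def abs_cubed(x):
    return abs(x) ** 3


# Hypothetical stand-in for operations: (name, function, needs, provides).
OPS = [
    ("mul", mul, ["a", "b"], "ab"),
    ("sub", sub, ["a", "ab"], "a_minus_ab"),
    ("abs qubed", abs_cubed, ["a_minus_ab"], "abs_a_minus_ab_cubed"),
]


def needed_ops(ops, inputs, outputs):
    """Return the ops required to compute `outputs` from `inputs`."""
    producers = {provides: (name, needs) for name, _fn, needs, provides in ops}
    needed, stack = set(), list(outputs)
    while stack:
        value = stack.pop()
        if value in inputs or value not in producers:
            continue  # already given, or nothing can produce it
        name, needs = producers[value]
        if name not in needed:
            needed.add(name)
            stack.extend(needs)  # keep walking towards the inputs
    return [op for op in ops if op[0] in needed]  # keep declaration order


def run(ops, inputs, outputs):
    """Execute only the needed ops and return just the requested outputs."""
    values = dict(inputs)
    for _name, fn, needs, provides in needed_ops(ops, inputs, outputs):
        values[provides] = fn(*(values[n] for n in needs))
    return {name: values[name] for name in outputs}


print([op[0] for op in needed_ops(OPS, {"a": 2, "b": 5}, ["a_minus_ab"])])
print(run(OPS, {"a": 2, "b": 5}, ["a_minus_ab"]))  # {'a_minus_ab': -8}
```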
Solutions are plottable as well:
>>> solution.plot('solution.svg') # doctest: +SKIP
… where the (interactive) legend is this:
>>> from graphtik.plot import legend
>>> l = legend()