The definition & execution of a networked operation is split into 1+2 phases:
… it is constrained by these IO data-structures:
… populates these low-level data-structures:
network (COMPOSE time)
execution dag (COMPILE time)
execution steps (COMPILE time)
solution (EXECUTE time)
… and utilizes these main classes:
graphtik.fnop.FnOp(fn[, name, rescheduled, …])
An operation performing a callable (i.e. a function, a method, a lambda).
graphtik.pipeline.Pipeline(operations, name, *)
An operation that can compute a network-graph of operations.
A graph of operations that can compile an execution plan.
graphtik.execution.ExecutionPlan(net, needs, …)
A pre-compiled list of operation steps that can execute for the given inputs/outputs.
The solution chain-map and execution state.
… plus those for plotting:
graphtik.plot.Theme(*, _prototype, **kw)
The poor man’s css-like plot theme.
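The compose → compile → execute split above can be grounded with a toy, pure-Python sketch. This is an illustration of the idea only, not graphtik's actual API; all names below are made up:

```python
from collections import namedtuple

# A toy model of an operation: a named function with needs & provides.
Op = namedtuple("Op", "name needs provides fn")

def compose(*ops):
    """COMPOSE time: collect operations into a network (here, just a list)."""
    return list(ops)

def compile_plan(network, inputs):
    """COMPILE time: prune unsatisfiable operations and order the rest
    into execution steps (eviction of unasked outputs is omitted here)."""
    plan, available = [], set(inputs)
    pending = list(network)
    while pending:
        progressed = False
        for op in list(pending):
            if set(op.needs) <= available:   # needs satisfied -> schedule it
                plan.append(op)
                available |= set(op.provides)
                pending.remove(op)
                progressed = True
        if not progressed:                   # remaining ops are pruned
            break
    return plan

def execute(plan, inputs):
    """EXECUTE time: run the steps, accumulating results in the solution."""
    solution = dict(inputs)
    for op in plan:
        results = op.fn(*(solution[n] for n in op.needs))
        solution.update(zip(op.provides, results))
    return solution

net = compose(
    Op("add", ["a", "b"], ["ab"], lambda a, b: (a + b,)),
    Op("mul", ["ab", "c"], ["abc"], lambda ab, c: (ab * c,)),
)
plan = compile_plan(net, inputs=["a", "b", "c"])
sol = execute(plan, {"a": 1, "b": 2, "c": 3})
assert sol["abc"] == 9
```

Note how the "network" is built once, while compilation and execution depend on the concrete inputs given.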
- combine pipelines
They are selected by the
- operation merging
Any identically-named operations override each other, with the operations added earlier in the compose() call (further to the left) winning over those added later (further to the right).
- operation nesting
The elaborate method to combine pipelines forming clusters.
The original pipelines are preserved intact in “isolated” clusters, by prefixing the names of their operations (and optionally their data) with the name of the original pipeline that contained them (or with renames defined by the user).
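A toy sketch may help contrast the two policies; the functions below are illustrative inventions, not graphtik's compose() machinery:

```python
# Toy illustration of the two pipeline-combination policies.
def merge(*pipelines):
    """Identically-named operations override; earlier pipelines win."""
    combined = {}
    for pl in reversed(pipelines):   # later ones first, so earlier overwrite
        combined.update(pl)
    return combined

def nest(*named_pipelines):
    """Keep pipelines isolated by prefixing op names with the pipeline name."""
    combined = {}
    for pl_name, pl in named_pipelines:
        for op_name, fn in pl.items():
            combined[f"{pl_name}.{op_name}"] = fn
    return combined

p1 = {"scale": "p1-version"}
p2 = {"scale": "p2-version", "shift": "p2-shift"}

merged = merge(p1, p2)
# earlier pipeline p1 wins the "scale" name
assert merged == {"scale": "p1-version", "shift": "p2-shift"}

nested = nest(("p1", p1), ("p2", p2))
# both "scale" operations survive, under prefixed names
assert set(nested) == {"p1.scale", "p2.scale", "p2.shift"}
```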
Currently there are 2 ways to execute:
- sequential, and
- (unstable) parallel, with an execution pool.
Plans may abort their execution by setting the abort run global flag.
During planning, the graph is pruned based on the given inputs, outputs & node predicate to extract the dag; the dag is then ordered, to derive the execution steps, which are stored in a new plan, cached on the network.
- execution plan
The results of the last operation executed “win” in the outputs produced, and the base (least precedence) is the user-inputs given when the execution started.
Certain values may be extracted/populated with accessors.
- solution layer
This layering is disabled if a jsonp dependency exists in the network, assuming that the set_layered_solution() configuration has not been set to True/False, nor has the respective parameter been given to the methods.
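The layering idea maps naturally onto the stdlib's collections.ChainMap; the following is an illustration of the semantics only, not graphtik's Solution class:

```python
# Sketch of the layered solution: each executed operation adds a new layer
# on top, so the most recent results "win", with user-inputs at the base.
from collections import ChainMap

solution = ChainMap({"a": 1, "b": 2})      # base layer: user inputs

solution = solution.new_child({"x": 10})   # 1st operation's results
solution = solution.new_child({"x": 20})   # 2nd operation overwrites "x"

assert solution["x"] == 20                 # last operation executed wins
assert solution["a"] == 1                  # user input still visible
assert solution.maps[-1] == {"a": 1, "b": 2}   # inputs have least precedence
```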
- unsatisfied operation
- execution dag
- solution dag
There are 2 directed-acyclic-graph instances used:
- execution steps
The only instruction step other than an operation is for performing an eviction.
Those values are either:
returned to the user after the outer pipeline has finished computation.
An operation may return partial outputs.
This class is also an operation, so it specifies needs & provides, but these are not fixed, in the sense that Pipeline.compute() can potentially consume and provide different subsets of inputs/outputs.
Either the abstract notion of an action with specified needs and provides dependencies, or the concrete wrapper FnOp around any callable(), that feeds on inputs and updates outputs, from/to the solution, or given-by/returned-to the user by a pipeline.
The distinction between needs/provides and inputs/outputs is akin to function parameters and arguments during define-time and run-time, respectively.
Inputs & outputs in the solution are accessed by the needs & provides names of the operations; operation needs & provides are zipped against the underlying function’s arguments and results.
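The two matchings just described can be sketched in a few lines of plain Python (an illustration only; Operation.compute() itself is more elaborate):

```python
# Sketch: needs are matched to function arguments positionally, and
# provides are zipped against the function's results.
def compute(fn, needs, provides, solution):
    args = [solution[name] for name in needs]   # inputs fetched by needs names
    results = fn(*args)                         # call the underlying function
    if not isinstance(results, tuple):
        results = (results,)                    # single result -> 1-tuple
    return dict(zip(provides, results))         # outputs named by provides

out = compute(lambda a, b: (a + b, a * b),
              needs=["a", "b"], provides=["sum", "prod"],
              solution={"a": 2, "b": 3})
assert out == {"sum": 5, "prod": 6}
```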
Differences between various dependency operation attributes:
roughly corresponding to underlying function’s arguments (fn_needs).
Operation.compute() extracts input values from the solution by these names, and matches them against function arguments, mostly by their positional order. Whenever this matching is not 1-to-1, and function-arguments differ from the regular needs, modifiers must be used.
roughly corresponding to underlying function’s results (fn_provides).
Operation.compute() “zips” this list-of-names with the output values produced when the operation’s function is called. Whenever this “zipping” is not 1-to-1, and function-results differ from the regular operation provides (op_provides) (or results are not a list), it is possible to:
You cannot alias an alias. See Aliased provides.
- conveyor operation
- default identity function
Default conveyor operation &
- returns dictionary
When an operation is marked with the FnOp.returns_dict flag, the underlying function is not expected to return fn_provides as a sequence but as a dictionary; hence, no “zipping” of function-results –> fn_provides takes place.
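A minimal sketch of the difference, with a made-up compute() helper rather than FnOp itself:

```python
# Sketch: a returns-dictionary function names its outputs directly via
# the dict keys, so no positional zipping against provides is needed.
def compute(fn, needs, provides, solution, returns_dict=False):
    results = fn(*(solution[n] for n in needs))
    if returns_dict:
        # keep only the declared provides; keys name the outputs directly
        return {k: v for k, v in results.items() if k in provides}
    return dict(zip(provides, results))         # ordinary positional zipping

out = compute(lambda a: {"double": 2 * a, "square": a * a},
              needs=["a"], provides=["double", "square"],
              solution={"a": 4}, returns_dict=True)
assert out == {"double": 8, "square": 16}
```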
For instance, a needs may be annotated as keyword() and/or optionals function arguments; provides and needs can be annotated as “ghost” sideffects or assigned an accessor to work with hierarchical data.
The representation of modifier-annotated dependencies utilizes a combination of these diacritics:
In the underlying function it corresponds to either:
There are 2 kinds, both, by definition, optionals:
vararg() annotates any solution value to be appended once in the underlying function’s *args.
varargs() annotates iterable values, all of whose items are appended in the *args.
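How the two kinds feed the underlying function's *args can be sketched as follows (illustrative names, not graphtik's implementation):

```python
# Sketch: a "vararg" name contributes its solution value once, while a
# "varargs" name contributes every item of its iterable value.
def build_args(specs, solution):
    args = []
    for name, kind in specs:
        if kind == "varargs":
            args.extend(solution[name])   # all items appended
        else:                             # "vararg" or plain positional
            args.append(solution[name])   # appended once
    return args

solution = {"a": 5, "b": [1, 2, 3]}
args = build_args([("a", "vararg"), ("b", "varargs")], solution)
assert args == [5, 1, 2, 3]

sum_all = lambda *xs: sum(xs)
assert sum_all(*args) == 11
```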
>>> graph(a=5, b="mistake")
Traceback (most recent call last):
...
ValueError: Failed preparing needs:
    1. Expected needs['b'(+)] to be non-str iterables!
    +++inputs: ['a', 'b']
    +++FnOp(name='enlist', needs=['a', 'b'(+)], provides=['sum'], fn='enlist')
In printouts, it is denoted either with
See also the elaborate example in Hierarchical data and further tricks section.
There are actually 2 relevant modifiers:
Both kinds of sideffects participate in the planning of the graph, and both may be given or asked in the inputs & outputs of a pipeline, but they are never given to functions. A function of a returns dictionary operation can return a falsy value to declare it as canceled.
To be precise, the “sideffected dependency” is the name held in the _Modifier.sideffected attribute of a modifier created by
See also the elaborate example in Hierarchical data and further tricks section.
- doc chain
- hierarchical data
A subdoc is a dependency value nested further into another one (the superdoc), accessed with a json pointer path expression with respect to the solution, denoted with slashes like:
Note that if a nested output is asked, then all docs-in-chain are kept, i.e. all superdocs up to the root dependency plus all of its subdocs; as depicted below for a hypothetical dependency
For instance, if the root has been asked as output, no subdoc can be subsequently evicted.
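The path resolution itself can be sketched with a few lines of stdlib Python (an illustration; graphtik's accessors and jsonp handling are more elaborate):

```python
# Sketch: follow slash-separated steps of a json-pointer-like path
# into nested dicts held in the solution.
from functools import reduce

def resolve_jsonp(solution, path):
    """Resolve e.g. 'doc/sub/leaf' against nested dictionaries."""
    return reduce(lambda doc, step: doc[step], path.split("/"), solution)

solution = {"doc": {"sub": {"leaf": 42}}}
assert resolve_jsonp(solution, "doc/sub/leaf") == 42   # the subdoc value
assert resolve_jsonp(solution, "doc/sub") == {"leaf": 42}   # a superdoc
```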
Hierarchical data and further tricks (example)
- json pointer path
- partial outputs
- canceled operation
the solution must then reschedule the remaining operations downstream, and possibly cancel some of those (assigned in
Partial operations are usually declared with returns dictionary so that the underlying function can control which of the outputs are returned.
Keep executing as many operations as possible, even if some of them fail. Endurance for an operation is enabled if set_endure_operations() is true globally in the configurations or if
- node predicate
- abort run
- parallel execution
- execution pool
- process pool
When the multiprocessing.pool.Pool class is used for parallel execution, the tasks must be communicated to/from the worker processes, which requires pickling, and that may fail. With pickling failures you may try marshalling with the dill library, and see if that helps.
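The pickling constraint is easy to demonstrate with the stdlib alone; e.g. lambdas cannot be pickled, which is one reason the dill hint above exists:

```python
# Demonstrate why process-pool tasks may fail: they must cross process
# boundaries via pickling, and some callables are not picklable.
import pickle

def picklable(obj):
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

assert picklable(len)                # named, importable callables pickle fine
assert not picklable(lambda x: x)    # lambdas do not -- hence the dill hint
```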
- thread pool
Note that sideffects do not work when this is enabled.
Such objects may render as SVG in Jupyter notebooks (through their plot() method) and can render in a Sphinx site with the graphtik RsT directive. You may control the rendered image as explained in the tip of the Plotting section.
Zoom-and-pan does not work in Sphinx sites when viewed locally in Chrome; serve the HTML files through some HTTP server, e.g. launch this command to view the site of this project:
python -m http.server 8080 --directory build/sphinx/html/
Plotter is responsible for rendering plottables as images. It is the active plotter that does that, unless overridden in a Plottable.plot() call. Plotters can be customized by various means, such as the plot theme.
- active plotter
- default active plotter
The default active plotter is the plotter instance that this project comes pre-configured with, i.e. when no plot-customizations have yet happened.
It is recommended to use other means for plot customizations instead of directly modifying the theme’s class-attributes.
Theme class-attributes are deep-copied when constructing new instances, to avoid accidental modifications when attempting to update instance-attributes instead (hint: almost all of its attributes are containers, i.e. dicts). Therefore any class-attribute modification will be ignored, until a new Theme instance from the patched class is used.
- plot theme
- current theme
The current theme in-use is the Plotter.default_theme attribute of the active plotter, unless overridden with the theme parameter when calling Plottable.plot() (conveyed internally as the value of the
- style expansion
A style is an attribute of a plot theme, either a scalar value or a dictionary.
Ref instances are resolved first against the current nx_attrs and then against the attributes of the current theme.
Call any callables with the current plot_args and replace them by their result (even more flexible than templates).
Any None results above are discarded.
Workaround for pydot/pydot#228, the pydot constructor not supporting styles-as-lists.
When DEBUG is enabled, the provenance of all style values appears in the tooltips of plotted graphs.
- graphtik configuration
The functions controlling compile & execution globally are defined in the config module, plus one more in the graphtik.plot module; the underlying global data are stored in contextvars.ContextVar instances, to allow for nested control.
All boolean configuration flags are tri-state (None, False, True), allowing to “force” all operations when they are not set to None. All of them default to None.
When operations fail, the original exception gets annotated with salvaged values from locals() and is raised intact.
See Jetsam on exceptions.
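The annotate-and-reraise pattern can be sketched as follows (the jetsam attribute name follows the glossary term; the function and salvaged values are made up):

```python
# Sketch: on failure, salvage interesting locals onto the exception
# object, then re-raise the original exception intact.
def risky(plan, inputs):
    solution = dict(inputs)
    try:
        raise ValueError("boom")          # some operation failed
    except Exception as ex:
        # annotate the exception with salvaged locals, raise it unchanged
        ex.jetsam = {"plan": plan, "solution": solution}
        raise

try:
    risky(plan=["op1"], inputs={"a": 1})
except ValueError as ex:
    assert ex.jetsam["solution"] == {"a": 1}   # salvaged state survives
    assert ex.jetsam["plan"] == ["op1"]
```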