5. API Reference¶
Summaries of the package and its modules:
- computation graphs for Python & Pandas
- compose operation/dependency from functions, matching/zipping inputs/outputs during execution.
- modifiers (print-out with diacritics) change dependency behavior during planning & execution.
- compose network of operations & dependencies, compile the plan.
- plotting handled by the active plotter & current theme.
- configurations for network execution, and utilities on them.
- Generic utilities, exceptions and operation & plottable base classes.
- jetsam utility for annotating exceptions.
- Utility for json pointer path modifier.
- Extends Sphinx with …
Package: graphtik¶
computation graphs for Python & Pandas
Tip
The module import-time dependencies have been carefully optimized so that importing all from package takes the minimum time (e.g. <10ms on a 2019 laptop):
>>> %time from graphtik import * # doctest: +SKIP
CPU times: user 8.32 ms, sys: 34 µs, total: 8.35 ms
Wall time: 7.53 ms
Still, constructing your pipelines at import time would take considerably more time (e.g. ~300ms for the 1st pipeline), so prefer to construct them in “factory” module functions (remember to annotate them with typing hints to denote their return type).
See also
plot.active_plotter_plugged()
, plot.set_active_plotter()
&
plot.get_active_plotter()
configs (not imported unless plotting is needed).
Module: fnop¶
compose operation/dependency from functions, matching/zipping inputs/outputs during execution.
Note
This module (along with modifier & pipeline) is what client code needs to define pipelines at import time without incurring a heavy price (<5ms on a 2019 fast PC).
- class graphtik.fnop.FnOp(fn: Optional[Callable] = None, name=None, needs: Optional[Union[Collection, str]] = None, provides: Optional[Union[Collection, str]] = None, aliases: Optional[Mapping] = None, *, cwd=None, rescheduled=None, endured=None, parallel=None, marshalled=None, returns_dict=None, node_props: Optional[Mapping] = None)[source]¶
An operation performing a callable (i.e. a function, a method, a lambda).
Tip
Use operation() factory to build instances of this class instead.
Call withset() on existing instances to re-configure new clones.
See diacritics to understand printouts of this class.
Differences between various dependency operation attributes:

dependency attribute | dupes | sfx | alias | sfxed
needs                |   ✗   |  ✓  |       | SINGULAR
_user_needs          |   ✓   |  ✓  |       |
_fn_needs            |   ✓   |  ✗  |       | STRIPPED
provides             |   ✗   |  ✓  |   ✓   | SINGULAR
_user_provides       |   ✓   |  ✓  |   ✗   |
_fn_provides         |   ✓   |  ✗  |   ✗   | STRIPPED

where:
“dupes=no” means the collection drops any duplicated dependencies
“SINGULAR” means sfxed('A', 'a', 'b') ==> sfxed('A', 'a'), sfxed('A', 'b')
“STRIPPED” means sfxed('A', 'a', 'b') ==> sfx('a'), sfx('b')
- __init__(fn: Optional[Callable] = None, name=None, needs: Optional[Union[Collection, str]] = None, provides: Optional[Union[Collection, str]] = None, aliases: Optional[Mapping] = None, *, cwd=None, rescheduled=None, endured=None, parallel=None, marshalled=None, returns_dict=None, node_props: Optional[Mapping] = None)[source]¶
Build a new operation out of some function and its requirements.
See operation() for the full documentation of parameters; study the code for attributes (or read them from the rendered sphinx site).
- __module__ = 'graphtik.fnop'¶
- _abc_impl = <_abc_data object>¶
- _fn_needs[source]¶
Value names the underlying function requires (DUPES preserved, NO-SFX, STRIPPED sideffected).
- _fn_provides[source]¶
Value names the underlying function produces (DUPES, NO-ALIASES, NO_SFX, STRIPPED sideffected).
- _prepare_match_inputs_error(missing: List, varargs_bad: List, named_inputs: Mapping) ValueError [source]¶
- _zip_results_plain(results, is_rescheduled) dict [source]¶
Handle result sequence: no-result, single-item, or many.
- _zip_results_with_provides(results) dict [source]¶
Zip results with expected “real” (without sideffects) provides.
- aliases[source]¶
an optional mapping of fn_provides to additional ones, together comprising this operation's provides.
You cannot alias an alias.
- compute(named_inputs=None, outputs: Optional[Union[Collection, str]] = None, *args, **kw) dict [source]¶
- Parameters
named_inputs – a Solution instance
args – ignored – to comply with superclass contract
kw – ignored – to comply with superclass contract
- property deps: Mapping[str, Collection]¶
All dependency names, including internal _user_ & _fn_.
if not DEBUG, all deps are converted into lists, ready to be printed.
- endured[source]¶
If true, even if callable fails, solution will reschedule; ignored if endurance enabled globally.
- marshalled[source]¶
If true, operation will be marshalled while computed, along with its inputs & outputs (useful when run in the (deprecated) parallel mode with a process pool).
- name: str[source]¶
a name for the operation (e.g. ‘conv1’, ‘sum’, etc.); any “parents” are split by dots (.). :seealso: Nesting
- needs: Optional[Union[Collection, str]][source]¶
Dependencies ready to lay the graph for pruning (NO-DUPES, SFX, SINGULAR sideffecteds).
- node_props[source]¶
Added as-is into NetworkX graph, and you may filter operations by
Pipeline.withset()
. Also plot-rendering affected if they match Graphviz properties, if they start withUSER_STYLE_PREFFIX
, unless they start with underscore(_
).
- prepare_plot_args(plot_args: PlotArgs) PlotArgs [source]¶
Delegate to a provisional network with a single op.
- provides: Optional[Union[Collection, str]][source]¶
Value names ready to lay the graph for pruning (NO DUPES, ALIASES, SFX, SINGULAR sideffecteds, +alias destinations).
- rescheduled[source]¶
If true, underlying callable may produce a subset of provides, and the plan must then reschedule after the operation has executed. In that case, it makes more sense for the callable to returns_dict.
- returns_dict[source]¶
If true, it means the underlying function returns a dictionary, and no further processing is done on its results, i.e. the returned output-values are not zipped with provides.
It does not have to return any alias outputs.
Can be changed amidst execution by the operation’s function.
- validate_fn_name()[source]¶
Call it before enclosing it in a pipeline, or it will fail on compute().
- withset(fn: Callable = Ellipsis, name=Ellipsis, needs: Optional[Union[Collection, str]] = Ellipsis, provides: Optional[Union[Collection, str]] = Ellipsis, aliases: Mapping = Ellipsis, *, cwd=Ellipsis, rescheduled=Ellipsis, endured=Ellipsis, parallel=Ellipsis, marshalled=Ellipsis, returns_dict=Ellipsis, node_props: Mapping = Ellipsis, renamer=None) FnOp [source]¶
Make a clone with some values replaced, or the operation and its dependencies renamed.
If renamer is given, it is applied on top of (and after) any other changed values, for the operation-name, needs, provides & any aliases.
- Parameters
renamer –
if a dictionary, it renames any operations & data named as keys into the respective values, by feeding them into dep_renamed(), so values may be single-input callables themselves;
if it is a callable(), it is given a RenArgs instance to decide the node’s name.
The callable may return a str for the new-name, or any other false value to leave the node named as-is.
Attention
The callable SHOULD preserve any modifier on dependencies, using dep_renamed(), if a callable is given.
- Returns
a clone operation with the changed/renamed values asked
- Raise
(ValueError, TypeError): all cstor validation errors
ValueError: if a renamer dict contains a non-string and non-callable value
Examples
>>> from graphtik import operation, sfx
>>> op = operation(str, "foo", needs="a",
...                provides=["b", sfx("c")],
...                aliases={"b": "B-aliased"})
>>> op.withset(renamer={"foo": "BAR",
...                     'a': "A",
...                     'b': "B",
...                     sfx('c'): "cc",
...                     "B-aliased": "new.B-aliased"})
FnOp(name='BAR', needs=['A'], provides=['B', sfx('cc'), 'new.B-aliased'], aliases=[('B', 'new.B-aliased')], fn='str')
Notice that the 'c' rename changed the sideffect name, without the destination name being an sfx() modifier (but the source name must match the sfx-specifier).
Notice that the source of the alias b-->B is handled implicitly from the respective rename on the provides.
But usually a callable is more practical, like the one below renaming only data names:
>>> from graphtik.modifier import dep_renamed
>>> op.withset(renamer=lambda ren_args:
...            dep_renamed(ren_args.name, lambda n: f"parent.{n}")
...            if ren_args.typ != 'op' else
...            False)
FnOp(name='foo', needs=['parent.a'], provides=['parent.b', sfx('parent.c'), 'parent.B-aliased'], aliases=[('parent.b', 'parent.B-aliased')], fn='str')
Notice the double use of lambdas with dep_renamed() – an equivalent rename callback would be:
dep_renamed(ren_args.name, f"parent.{dependency(ren_args.name)}")
- graphtik.fnop.NO_RESULT = <NO_RESULT>¶
A special return value for the function of a rescheduled operation, signifying that it did not produce any result at all (including sideffects); otherwise, it would have been a single result, None. Useful for rescheduled operations that want to cancel their single result without being declared as returns dictionary.
- graphtik.fnop.NO_RESULT_BUT_SFX = <NO_RESULT_BUT_SFX>¶
Like NO_RESULT but does not cancel any sideffects declared as provides.
- graphtik.fnop._process_dependencies(deps: Collection[str]) Tuple[Collection[str], Collection[str]] [source]¶
Strip or singularize any implicit/sideffects and apply CWD.
- Parameters
cwd – The current-working-document, when given, all non-root dependencies (needs, provides & aliases) become jsonps, prefixed with this.
- Returns
a 2-tuple (op_deps, fn_deps), where any instances of sideffects in deps are processed like this:
- op_deps
any sfxed() is replaced by a sequence of “singularized” instances, one for each item in its sfx_list;
any duplicates are discarded;
order is irrelevant, since they don’t reach the function.
- fn_deps
- graphtik.fnop.as_renames(i, argname)[source]¶
Parses a list of (source-->destination) pairs from a dict, a list of 2-items, or a single 2-tuple.
- Returns
a (possibly empty) list-of-pairs
Note
The same source may be repeatedly renamed to multiple destinations.
- graphtik.fnop.identity_fn(*args, **kwargs)[source]¶
Act as the default function for the conveyor operation when no fn is given.
Adapted from https://stackoverflow.com/a/58524115/548792
- graphtik.fnop.jsonp_ize_all(deps, cwd: Sequence[str])[source]¶
Auto-convert deps with slashes as jsonp (unless no_jsonp).
- graphtik.fnop.operation(fn: ~typing.Callable = <UNSET>, name=<UNSET>, needs: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>, provides: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>, aliases: ~typing.Mapping = <UNSET>, *, cwd=<UNSET>, rescheduled=<UNSET>, endured=<UNSET>, parallel=<UNSET>, marshalled=<UNSET>, returns_dict=<UNSET>, node_props: ~typing.Mapping = <UNSET>) FnOp [source]¶
An operation factory that works like a “fancy decorator”.
- Parameters
fn –
The callable underlying this operation:
if not given, it returns the withset() method as the decorator, so it still supports all arguments, apart from fn;
if given, it builds the operation right away (along with any other arguments);
if given, but is None, it will assign the default identity function right before it is computed.
Hint
This is a twisted way for “fancy decorators”.
After all that, you can always call FnOp.withset() on an existing operation, to obtain a re-configured clone.
If the fn is still not given when calling FnOp.compute(), then the default identity function is implied, if name is given and the number of provides match the number of needs.
name (str) – The name of the operation in the computation graph. If not given, deduce from any fn given.
needs –
the list of (positionally ordered) names of the data needed by the operation to receive as inputs, roughly corresponding to the arguments of the underlying fn (plus any sideffects).
It can be a single string, in which case a 1-element iterable is assumed.
provides –
the list of (positionally ordered) output data this operation provides, which must, roughly, correspond to the returned values of the fn (plus any sideffects & aliases).
It can be a single string, in which case a 1-element iterable is assumed.
If they are more than one, the underlying function must return an iterable with the same number of elements, unless param returns_dict is true, in which case it must return a dictionary containing (at least) those named elements.
Note
When joining a pipeline this must not be empty, or it will scream! (an operation without provides would always be pruned)
aliases – an optional mapping of provides to additional ones
cwd – The current-working-document, when given, all non-root dependencies (needs, provides & aliases) become jsonps, prefixed with this.
rescheduled – If true, underlying callable may produce a subset of provides, and the plan must then reschedule after the operation has executed. In that case, it makes more sense for the callable to returns_dict.
endured – If true, even if callable fails, solution will reschedule; ignored if endurance enabled globally.
parallel – (deprecated) execute in parallel
marshalled – If true, operation will be marshalled while computed, along with its inputs & outputs (useful when run in the (deprecated) parallel mode with a process pool).
returns_dict – if true, it means the fn returns dictionary with all provides, and no further processing is done on them (i.e. the returned output-values are not zipped with provides)
node_props – Added as-is into NetworkX graph, and you may filter operations by
Pipeline.withset()
. Also plot-rendering is affected if they match Graphviz properties, unless they start with underscore (_).
- Returns
when called with fn, it returns a
FnOp
, otherwise it returns a decorator function that accepts fn as the 1st argument.
Note
Actually the returned decorator is the FnOp.withset() method and accepts all arguments, monkeypatched to support calling a virtual withset() method on it, not to interrupt the builder-pattern; besides that trick, it is just a bound method.
Example:
If no fn given, it returns the withset method, to act as a decorator:
>>> from graphtik import operation, varargs
>>> op = operation()
>>> op
<function FnOp.withset at ...
- But if fn is set to None:
>>> op = op(needs=['a', 'b'])
>>> op
FnOp(name=None, needs=['a', 'b'], fn=None)
If you call an operation without fn and no name, it will scream:
>>> op.compute({"a":1, "b": 2})
Traceback (most recent call last):
ValueError: Operation must have a callable `fn` and a non-empty `name`:
  FnOp(name=None, needs=['a', 'b'], fn=None)
  (tip: for defaulting `fn` to conveyor-identity, # of provides must equal needs)
But if you give just a name with None as fn, it will build a conveyor operation for some needs & provides:
>>> op = operation(None, name="copy", needs=["foo", "bar"], provides=["FOO", "BAZ"])
>>> op.compute({"foo":1, "bar": 2})
{'FOO': 1, 'BAZ': 2}
You may keep calling withset() on an operation, to build modified clones:
>>> op = op.withset(needs=['a', 'b'],
...                 provides='SUM', fn=lambda a, b: a + b)
>>> op
FnOp(name='copy', needs=['a', 'b'], provides=['SUM'], fn='<lambda>')
>>> op.compute({"a":1, "b": 2})
{'SUM': 3}
>>> op.withset(fn=lambda a, b: a * b).compute({'a': 2, 'b': 5}) {'SUM': 10}
- graphtik.fnop.prefixed(dep, cwd)[source]¶
Converts dep into a jsonp and prepends prefix (unless dep was rooted).
TODO: make prefixed a TOP_LEVEL modifier.
- graphtik.fnop.reparse_operation_data(name, needs, provides, aliases=(), cwd: Optional[Sequence[str]] = None) Tuple[str, Collection[str], Collection[str], Collection[Tuple[str, str]]] [source]¶
Validate & reparse operation data as lists.
- Returns
name, needs, provides, aliases
Kept as a separate function, to be reused by clients building operations, to detect errors early.
Module: pipeline¶
compose pipelines by combining operations into network.
Note
This module (along with op & modifier) is what client code needs to define pipelines at import time without incurring a heavy price (<5ms on a 2019 fast PC).
- class graphtik.pipeline.Pipeline(operations, name, *, outputs=None, predicate: NodePredicate = None, cwd: str = None, rescheduled=None, endured=None, parallel=None, marshalled=None, node_props=None, renamer=None, excludes=None)[source]¶
An operation that can compute a network-graph of operations.
Tip
Use compose() factory to build instances of this class instead.
- __call__(**input_kwargs) Solution [source]¶
Delegates to compute(), respecting any narrowed outputs.
- __init__(operations, name, *, outputs=None, predicate: NodePredicate = None, cwd: str = None, rescheduled=None, endured=None, parallel=None, marshalled=None, node_props=None, renamer=None, excludes=None)[source]¶
For arguments, see withset() & class attributes.
- Raises
if dupe operation, with msg: Operations may only be added once, …
- __module__ = 'graphtik.pipeline'¶
- __name__ = 'Pipeline'¶
Fake function attributes.
- __qualname__ = 'Pipeline'¶
Fake function attributes.
- _abc_impl = <_abc_data object>¶
- compile(inputs=None, outputs=<UNSET>, recompute_from=None, *, predicate: NodePredicate = <UNSET>) ExecutionPlan [source]¶
Produce a plan for the given args or outputs/predicate narrowed earlier.
- Parameters
named_inputs – a string or a list of strings that should be fed to the needs of all operations.
outputs – A string or a list of strings with all data asked to compute. If None, all possible intermediate outputs will be kept. If not given, those set by a previous call to withset() or cstor are used.
recompute_from – Described in Pipeline.compute().
predicate – Will be stored and applied on the next compute() or compile(). If not given, those set by a previous call to withset() or cstor are used.
- Returns
the execution plan satisfying the given inputs, outputs & predicate
- Raises
- Unknown output nodes…
if outputs asked do not exist in network.
- Unsolvable graph: …
if it cannot produce any outputs from the given inputs.
- Plan needs more inputs…
if given inputs mismatched plan’s needs.
- Unreachable outputs…
if net cannot produce asked outputs.
- compute(named_inputs: ~typing.Mapping = None, outputs: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>, recompute_from: ~typing.Optional[~typing.Union[~typing.Collection, str]] = None, *, predicate: NodePredicate = <UNSET>, callbacks=None, solution_class: Type[Solution] = None, layered_solution=None) Solution [source]¶
Compile & execute the plan, log jetsam & plot plottable on errors.
Attention
If intermediate planning is successful, the global abort run flag is reset before the execution starts.
- Parameters
named_inputs – A mapping of names –> values that will be fed to the needs of all operations. Cloned, not modified.
outputs – A string or a list of dependencies with all data asked to compute. If None, all possible intermediate outputs will be kept. If not given, those set by a previous call to withset() or cstor are used.
recompute_from –
recompute operations downstream from these (string or list) dependencies. In effect, before compiling, it marks all values strictly downstream (excluding themselves) from the dependencies listed here, as missing from named_inputs.
Traversing downstream stops when arriving at any dep in outputs.
Any dependencies here unreachable downstream from values in named_inputs are ignored, but logged.
Any dependencies here unreachable upstream from outputs (if given) are ignored, but logged.
Results may differ even if graph is unchanged, in the presence of overwrites.
predicate – filter-out nodes before compiling. If not given, those set by a previous call to withset() or cstor are used.
callbacks – If given, a 2-tuple with (optional) callbacks to call before/after computing an operation, with an OpTask as argument containing the op & solution. A single callable (scalar), fewer than 2, or no elements are also accepted.
solution_class – a custom solution factory to use
layered_solution –
whether to store operation results (or just keys) into separate solution layers.
Unless overridden by a True/False in set_layered_solution() of configurations, it accepts the following values:
When True (False), always keep results (just the keys) in a separate layer for each operation, regardless of any jsonp dependencies.
If None, layers are used only if there are NO jsonp dependencies in the network.
- Returns
The solution which contains the results of each operation executed +1 for inputs in separate dictionaries.
- Raises
If outputs asked do not exist in network, with msg:
Unknown output nodes: …
If plan does not contain any operations, with msg:
Unsolvable graph: …
If given inputs mismatched plan’s needs, with msg: Plan needs more inputs…
If net cannot produce asked outputs, with msg: Unreachable outputs…
See also
Operation.compute()
.
- property graph¶
- withset(outputs: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>, predicate: NodePredicate = <UNSET>, *, name=None, cwd=None, rescheduled=None, endured=None, parallel=None, marshalled=None, node_props=None, renamer=None) Pipeline [source]¶
Return a copy with a network pruned for the given needs & provides.
- Parameters
outputs – Will be stored and applied on the next compute() or compile(). If not given, the value of this instance is conveyed to the clone.
predicate – Will be stored and applied on the next compute() or compile(). If not given, the value of this instance is conveyed to the clone.
name –
the name for the new pipeline:
- if None, the same name is kept;
- if True, a distinct name is devised: <old-name>-<uid>;
- if ellipses (...), the name of the function where this function call happened is used;
- otherwise, the given name is applied.
cwd – The current-working-document, when given, all non-root dependencies (needs, provides & aliases) on all contained operations become jsonps, prefixed with this.
rescheduled – applies rescheduled to all contained operations
endured – applies endurance to all contained operations
parallel – (deprecated) mark all contained operations to be executed in parallel
marshalled – mark all contained operations to be marshalled (useful when run in the (deprecated) parallel mode with a process pool).
renamer – see respective parameter in
FnOp.withset()
.
- Returns
A narrowed pipeline clone, which MIGHT be empty!
- Raises
If outputs asked do not exist in network, with msg:
Unknown output nodes: …
- graphtik.pipeline.build_network(operations, cwd=None, rescheduled=None, endured=None, parallel=None, marshalled=None, node_props=None, renamer=None, excludes=None)[source]¶
The network factory that does operation merging before constructing it.
- Parameters
nest – see same-named param in
compose()
- graphtik.pipeline.compose(name: Optional[Union[str, ellipsis]], op1: Operation, *operations: Operation, excludes=None, outputs: Optional[Union[Collection, str]] = None, cwd: Optional[str] = None, rescheduled=None, endured=None, parallel=None, marshalled=None, nest: Optional[Union[Callable[[RenArgs], str], Mapping[str, str], bool, str]] = None, node_props=None) Pipeline [source]¶
Merge or nest operations & pipelines into a new pipeline.
Tip
The module import-time dependencies have been carefully optimized so that importing all from package takes the minimum time (e.g. <10ms in a 2019 laptop):
>>> %time from graphtik import *  # doctest: +SKIP
CPU times: user 8.32 ms, sys: 34 µs, total: 8.35 ms
Wall time: 7.53 ms
Still, constructing your pipelines at import time would take considerably more time (e.g. ~300ms for the 1st pipeline), so prefer to construct them in “factory” module functions (remember to annotate them with typing hints to denote their return type).
Operations given earlier (further to the left) override those following (further to the right), similar to set behavior (and contrary to dict).
- Parameters
name – An optional name for the graph being composed by this object. If ellipses (...), derived from the function name where the pipeline is defined.
op1 – syntactically force at least 1 operation
operations – each argument should be an operation or pipeline instance
excludes – A single string or list of operation-names to exclude from the final network (particularly useful when composing existing pipelines).
nest –
a dictionary or callable corresponding to the renamer parameter of Pipeline.withset(), but the callable receives a ren_args with RenArgs.parent set when merging a pipeline, and applies the default nesting behavior (nest_any_node()) on truthies.
Specifically:
if it is a dictionary, it renames any operations & data named as keys into the respective values, like that:
- if a value is callable or str, it is fed into dep_renamed() (hint: it can be a single-arg callable like: (str) -> str);
- it applies default all-nodes nesting if otherwise truthy.
Note that you cannot access the “parent” name with dictionaries; you can only apply default all-node nesting by returning a non-string truthy.
if it is a callable(), it is given a RenArgs instance to decide the node’s name.
The callable may return a str for the new-name, or any other true/false to apply default all-nodes nesting.
For example, to nest just operation’s names (but not their dependencies), call:
compose(..., nest=lambda ren_args: ren_args.typ == "op")
Attention
The callable SHOULD preserve any modifier on dependencies, using dep_renamed() for RenArgs.typ not ending in .jsonpart.
If false (default), applies operation merging, not nesting.
If true, applies default operation nesting to all types of nodes.
In all other cases, the names are preserved.
See also
Nesting for examples
Default nesting applied by
nest_any_node()
cwd – The current-working-document, when given, all non-root dependencies (needs, provides & aliases) on all contained operations become jsonps, prefixed with this.
rescheduled – applies rescheduled to all contained operations
endured – applies endurance to all contained operations
parallel – (deprecated) mark all contained operations to be executed in parallel
marshalled – mark all contained operations to be marshalled (useful when run in the (deprecated) parallel mode with a process pool).
node_props – Added as-is into NetworkX graph, to provide for filtering by
Pipeline.withset()
. Also plot-rendering is affected if they match Graphviz properties, unless they start with underscore (_).
- Returns
Returns a special type of operation class, which represents an entire computation graph as a single operation.
- Raises
If the net cannot produce the asked outputs from the given inputs.
If the nest callable/dictionary produced a non-string or empty name (see Pipeline).
Module: modifier¶
modifiers (print-out with diacritics) change dependency behavior during planning & execution.
Summaries of the module's factories, predicates & conversions:
- Annotate optionals needs corresponding to defaulted op-function arguments, …
- Annotate a dependency that maps to a different name in the underlying function.
- sideffects denoting modifications beyond the scope of the solution.
- Annotates a sideffected dependency in the solution sustaining side-effects.
- Annotate a varargish needs to be fed as function's …
- A varargish …
- Provides-only, see pandas concatenation & generic …
- Provides-only, see pandas concatenation & generic …
- Generic modifier for json pointer path & implicit dependencies.
- Returns the underlying dependency name (just str).
- Check if a dependency is keyword (and get it, last step if jsonp).
- Check if a dependency is optional.
- Check if an optionals dependency is vararg.
- Check if an optionals dependency is varargs.
- Parse dep as jsonp (unless modified with …).
- Check if dependency is a json pointer path and return its steps.
- Check if a dependency is sideffects or sideffected.
- Check if it is sideffects but not a sideffected.
- Check if it is sideffected.
- Return if it is an implicit dependency.
- Check if dependency has an accessor, and get it (if funcs below are unfit).
- Renames dep as ren, or calls ren (if callable) to decide its name.
- Return one sideffected for each sfx in …
- Return the …
- Make a new modifier with changes -- handle with care.
The needs and provides annotated with modifiers designate, for instance, optional function arguments, or “ghost” sideffects.
Note
This module (along with op & pipeline) is what client code needs to define pipelines at import time without incurring a heavy price (~7ms on a 2019 fast PC).
Diacritics
The representation of modifier-annotated dependencies utilizes a combination of these diacritics:
>  : keyword()
?  : optional()
*  : vararg()
+  : varargs()
$  : accessor (mostly for jsonp)
- class graphtik.modifier.Accessor(contains: Callable[[dict, str], Any], getitem: Callable[[dict, str], Any], setitem: Callable[[dict, str, Any], None], delitem: Callable[[dict, str], None], update: Optional[Callable[[dict, Collection[Tuple[str, Any]]], None]] = None)[source]¶
Getter/setter functions to extract/populate values from a solution layer.
Note
Don’t use its attributes directly; prefer instead the functions returned from acc_contains() etc on any dep (plain strings included).
TODO: drop accessors, push functionality into jsonp alone.
- static __new__(_cls, contains: Callable[[dict, str], Any], getitem: Callable[[dict, str], Any], setitem: Callable[[dict, str, Any], None], delitem: Callable[[dict, str], None], update: Optional[Callable[[dict, Collection[Tuple[str, Any]]], None]] = None)¶
Create new instance of Accessor(contains, getitem, setitem, delitem, update)
- property contains¶
the containment checker, like: dep in sol
- property delitem¶
the deleter, like:
delitem(sol, dep)
- property getitem¶
the getter, like:
getitem(sol, dep) -> value
- property setitem¶
the setter, like: setitem(sol, dep, val)
- property update¶
mass updater, like: update(sol, item_values)
- graphtik.modifier.HCatAcc()[source]¶
Read/write jsonp and concat columns (axis=1) if both doc & value are Pandas.
- graphtik.modifier.JsonpAcc()[source]¶
Read/write jsonp paths found on the modifier’s “extra” attribute jsonpath
- graphtik.modifier.VCatAcc()[source]¶
Read/write jsonp and concat rows (axis=0) if both doc & value are Pandas.
- class graphtik.modifier._Modifier(name, _repr, _func, keyword, optional: _Optionals, accessor, sideffected, sfx_list, **kw)[source]¶
Annotate a dependency with a combination of modifiers.
This class is private, because client code should not need to call its cstor, or check if a dependency isinstance(), but use these facilities instead:
- the factory functions like keyword(), optional() etc,
- the predicates like is_optional(), is_pure_sfx() etc,
- the conversion functions like dep_renamed(), dep_stripped() etc,
- and only rarely (and with care) call its modifier_withset() method or the _modifier() factory functions.
- Parameters
kw – any extra attributes not needed by the execution machinery, such as the jsonpath, which is used only by accessor.
Note
Factory function _modifier() may return a plain string, if no other arg but name is given.
- static __new__(cls, name, _repr, _func, keyword, optional: _Optionals, accessor, sideffected, sfx_list, **kw) _Modifier [source]¶
Warning, returns None!
- __weakref__¶
list of weak references to the object (if defined)
- _accessor: Accessor = None[source]¶
An accessor with getter/setter functions to read/write solution values. Any sequence of 2-callables will do.
- _keyword: str = None[source]¶
Map my name in needs into this kw-argument of the function.
get_keyword()
returns it.
- _optional: _Optionals = None[source]¶
Is it required (None), regular optional, or varargish? is_optional() returns it. All regulars are keyword.
- _sfx_list: Tuple[Optional[str]] = ()[source]¶
At least one name(s) denoting the sideffects modification(s) on the sideffected, performed/required by the operation.
- If it is an empty tuple, it is an abstract sideffect, and is_pure_sfx() returns True.
- If not empty, is_sfxed() returns true (the _sideffected).
- _sideffected: str = None[source]¶
Has value only for sideffects: the pure-sideffect string or the existing sideffected dependency.
- property cmd¶
the code to reproduce it
- graphtik.modifier._modifier(name, *, keyword=None, optional: Optional[_Optionals] = None, accessor=None, sideffected=None, sfx_list=(), jsonp=None, **kw) Union[str, _Modifier] [source]¶
A
_Modifier
factory that may return a plain str when no other args are given. It decides the final name and _repr for the new modifier by matching the given inputs with the
_modifier_cstor_matrix
.
- graphtik.modifier._modifier_cstor_matrix = {700000: None, 700010: ("sfx('%(dep)s')", "sfx('%(dep)s')", 'sfx'), 700011: ("sfxed('%(dep)s', %(sfx)s)", "sfxed(%(acs)s'%(dep)s', %(sfx)s)", 'sfxed'), 700100: ('%(dep)s', "'%(dep)s'($)", 'accessor'), 700111: ("sfxed('%(dep)s', %(sfx)s)", "sfxed('%(dep)s'($), %(sfx)s)", 'sfxed'), 701010: ("sfx('%(dep)s')", "sfx('%(dep)s'(?))", 'sfx'), 702000: ('%(dep)s', "'%(dep)s'(*)", 'vararg'), 702011: ("sfxed('%(dep)s', %(sfx)s)", "sfxed(%(acs)s'%(dep)s'(*), %(sfx)s)", 'sfxed_vararg'), 702100: ('%(dep)s', "'%(dep)s'($*)", 'vararg'), 702111: ("sfxed('%(dep)s', %(sfx)s)", "sfxed('%(dep)s'($*), %(sfx)s)", 'sfxed_vararg'), 703000: ('%(dep)s', "'%(dep)s'(+)", 'varargs'), 703011: ("sfxed('%(dep)s', %(sfx)s)", "sfxed(%(acs)s'%(dep)s'(+), %(sfx)s)", 'sfxed_varargs'), 703100: ('%(dep)s', "'%(dep)s'($+)", 'varargs'), 703111: ("sfxed('%(dep)s', %(sfx)s)", "sfxed('%(dep)s'($+), %(sfx)s)", 'sfxed_varargs'), 710000: ('%(dep)s', "'%(dep)s'(%(acs)s>%(kw)s)", 'keyword'), 710011: ("sfxed('%(dep)s', %(sfx)s)", "sfxed(%(acs)s'%(dep)s'(>%(kw)s), %(sfx)s)", 'sfxed'), 710100: ('%(dep)s', "'%(dep)s'($>%(kw)s)", 'keyword'), 710111: ("sfxed('%(dep)s', %(sfx)s)", "sfxed('%(dep)s'($>%(kw)s), %(sfx)s)", 'sfxed'), 711000: ('%(dep)s', "'%(dep)s'(%(acs)s?%(kw)s)", 'optional'), 711011: ("sfxed('%(dep)s', %(sfx)s)", "sfxed(%(acs)s'%(dep)s'(?%(kw)s), %(sfx)s)", 'sfxed'), 711100: ('%(dep)s', "'%(dep)s'($?%(kw)s)", 'optional'), 711111: ("sfxed('%(dep)s', %(sfx)s)", "sfxed('%(dep)s'($?%(kw)s), %(sfx)s)", 'sfxed')}¶
Arguments-presence patterns for
_Modifier
constructor. Combinations missing raise errors.
- graphtik.modifier.acc_contains(dep) Callable[[Collection, str], Any] [source]¶
A fn like
operator.contains()
for any dep (with-or-without accessor)
- graphtik.modifier.acc_delitem(dep) Callable[[Collection, str], None] [source]¶
A fn like
operator.delitem()
for any dep (with-or-without accessor)
- graphtik.modifier.acc_getitem(dep) Callable[[Collection, str], Any] [source]¶
A fn like
operator.getitem()
for any dep (with-or-without accessor)
- graphtik.modifier.acc_setitem(dep) Callable[[Collection, str, Any], None] [source]¶
A fn like
operator.setitem()
for any dep (with-or-without accessor)
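As noted in _accessor above, any sequence of 2 callables will do as getter/setter. A minimal illustration with the stdlib operator module, using a plain dict to stand in for a solution:

```python
import operator

# The simplest possible accessor: a plain (getter, setter) pair.
accessor = (operator.getitem, operator.setitem)

solution = {"a": 1}
getter, setter = accessor
# Read "a" through the getter, write "b" through the setter.
setter(solution, "b", getter(solution, "a") + 1)
print(solution)  # {'a': 1, 'b': 2}
```

The acc_* helpers above merely fall back to such operator-like functions when a dependency carries no accessor of its own.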
- graphtik.modifier.dep_renamed(dep, ren, jsonp=None) Union[_Modifier, str] [source]¶
Renames dep as ren, or calls ren (if callable) to decide its name,
preserving any
keyword()
to old-name.- Parameters
jsonp – None (derived from name),
False
, str, collection of str/callable (last one). See the generic modify()
modifier.
For sideffected it renames the dependency (not the sfx-list); you have to do that manually with a custom renamer-function, if ever the need arises.
- graphtik.modifier.dep_singularized(dep) Iterable[Union[str, _Modifier]] [source]¶
Return one sideffected for each sfx in
_sfx_list
, or iterate dep in other cases.
- graphtik.modifier.dep_stripped(dep) Union[str, _Modifier] [source]¶
Return the
_sideffected
if dep is sideffected, dep otherwise,conveying all other properties of the original modifier to the stripped dependency.
- graphtik.modifier.dependency(dep) str [source]¶
Returns the underlying dependency name (just str)
For non-sideffects, it coincides with str(), otherwise, the pure-sideffect string or the existing sideffected dependency stored in
_sideffected
.
- graphtik.modifier.get_accessor(dep) bool [source]¶
Check if dependency has an accessor, and get it (if funcs below are unfit)
- Returns
the
_accessor
- graphtik.modifier.get_jsonp(dep) Optional[List[str]] [source]¶
Check if dependency is json pointer path and return its steps.
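For a feel of those steps, here is a rough stand-in (not graphtik's actual parser) that splits a name on slashes, applying the usual json-pointer unescaping of ~1 and ~0:

```python
def jsonp_steps(name: str):
    """Return hypothetical json-pointer steps for a name, or None
    when the name contains no slash (i.e. it is not a jsonp)."""
    if "/" not in name:
        return None
    # Per json-pointer escaping, "~1" encodes "/" and "~0" encodes "~".
    return [part.replace("~1", "/").replace("~0", "~")
            for part in name.split("/")]


print(jsonp_steps("inputs/a"))   # ['inputs', 'a']
print(jsonp_steps("plain-dep"))  # None
```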
- graphtik.modifier.get_keyword(dep) Optional[str] [source]¶
Check if a dependency is keyword (and get it, last step if jsonp).
All non-varargish optionals are “keyword” (including sideffected ones).
- Returns
the
_keyword
- graphtik.modifier.hcat(name, *, keyword: Optional[str] = None, jsonp=None) _Modifier [source]¶
Provides-only, see pandas concatenation & generic
modify()
modifier.
- graphtik.modifier.is_optional(dep) Optional[_Optionals] [source]¶
Check if a dependency is optional.
Varargish & optional sideffects are included.
- Returns
the
_optional
- graphtik.modifier.is_pure_sfx(dep) bool [source]¶
Check if it is sideffects but not a sideffected.
- graphtik.modifier.is_sfx(dep) Optional[str] [source]¶
Check if a dependency is sideffects or sideffected.
- Returns
the
_sideffected
- graphtik.modifier.is_sfxed(dep) bool [source]¶
Check if it is sideffected.
- Returns
the
_sfx_list
if it is a sideffected dep, None/empty-tuple otherwise
- graphtik.modifier.jsonp_ize(dep)[source]¶
Parse dep as jsonp (unless modified with
jsonp=False
) or unless it is a pure sfx.
- graphtik.modifier.keyword(name: str, keyword: Optional[str] = None, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
Annotate a dependency that maps to a different name in the underlying function.
When used on needs dependencies:
The value of the
name
dependency is read from the solution, and thenthat value is passed in the function as a keyword-argument named
keyword
.
When used on provides dependencies:
The operation must be a returns dictionary.
The value keyed with
keyword
is read from function’s returned dictionary, and thenthat value is placed into solution named as
name
.
- Parameters
keyword –
The argument-name corresponding to this named-input. If None, it is assumed the same as name, so as to always behave like a kw-type arg, and to preserve its fn-name if ever renamed.
accessor – the functions to access values to/from solution (see
Accessor
) (actually a 2-tuple with functions is ok)
jsonp – None (derived from name),
False
, str, collection of str/callable (last one). See the generic modify()
modifier.
- Returns
a
_Modifier
instance, even if no keyword is given OR it is the same as name.
Example:
In case the name of a function input argument is different from the name in the graph (or just because the name in the inputs is not a valid argument-name), you may map it with the 2nd argument of
keyword():
>>> from graphtik import operation, compose, keyword
>>> @operation(needs=[keyword("name-in-inputs", "fn_name")], provides="result")
... def foo(*, fn_name):  # it works also with non-positional args
...     return fn_name
>>> foo
FnOp(name='foo', needs=['name-in-inputs'(>'fn_name')], provides=['result'], fn='foo')
>>> pipe = compose('map a need', foo)
>>> pipe
Pipeline('map a need', needs=['name-in-inputs'], provides=['result'], x1 ops: foo)
>>> sol = pipe.compute({"name-in-inputs": 4})
>>> sol['result']
4
You can do the same thing to the results of a returns dictionary operation:
>>> op = operation(lambda: {"fn key": 1},
...     name="renaming `provides` with a `keyword`",
...     provides=keyword("graph key", "fn key"),
...     returns_dict=True)
>>> op
FnOp(name='renaming `provides` with a `keyword`', provides=['graph key'(>'fn key')], fn{}='<lambda>')
Hint
Mapping provides names wouldn’t make sense for regular operations, since these are defined arbitrarily at the operation level. OTOH, the result names of returns dictionary operation are decided by the underlying function, which may lie beyond the control of the user (e.g. from a 3rd-party object).
- graphtik.modifier.modifier_withset(dep, name=Ellipsis, keyword=Ellipsis, optional: _Optionals = Ellipsis, accessor=Ellipsis, sideffected=Ellipsis, sfx_list=Ellipsis, **kw) Union[_Modifier, str] [source]¶
Make a new modifier with changes – handle with care.
- Returns
Delegates to
_modifier()
, so returns a plain string if no args left.
- graphtik.modifier.modify(name: str, *, keyword=None, jsonp=None, implicit=None, accessor: Optional[Accessor] = None) _Modifier [source]¶
Generic modifier for json pointer path & implicit dependencies.
- Parameters
jsonp –
If given, it may be some other json-pointer expression, or the pre-split parts of the jsonp dependency – in that case, the dependency name is irrelevant – or a falsy (but not
None
) value, to disable the automatic interpreting of the dependency name as a json pointer path, regardless of any contained slashes.
If accessing pandas, you may pass an already split path with its last part being a callable indexer (Selection by callable).
In addition to writing values, the
vcat()
or
hcat()
modifiers (& respective accessors) also support pandas concatenation for provides (see example in Concatenating Pandas).
implicit – implicit dependencies are not fed into/out of the function. You may use directly
implicit()
.
accessor – Annotate the dependency with accessor functions to read/write solution (actually a 2-tuple with functions is ok)
Example:
Let’s use json pointer dependencies along with the default conveyor operation to build an operation copying values around in the solution:
>>> from graphtik import operation, compose, modify
>>> copy_values = operation(
...     fn=None,  # ask for the "conveyor op"
...     name="copy a+b-->A+BB",
...     needs=["inputs/a", "inputs/b"],
...     provides=["RESULTS/A", "RESULTS/BB"]
... )
>>> results = copy_values.compute({"inputs": {"a": 1, "b": 2}})
Traceback (most recent call last):
ValueError: Failed matching inputs <=> needs for FnOp(name='copy a+b-->A+BB', needs=['inputs/a'($), 'inputs/b'($)], provides=['RESULTS/A'($), 'RESULTS/BB'($)], fn='identity_fn'):
    1. Missing compulsory needs['inputs/a'($), 'inputs/b'($)]!
    +++inputs: ['inputs']
>>> results = copy_values.compute({"inputs/a": 1, "inputs/b": 2})
>>> results
{'RESULTS/A'($): 1, 'RESULTS/BB'($): 2}
Notice that the hierarchical dependencies did not work yet, because jsonp modifiers work internally with accessors, and
FnOp
is unaware of them – it’s the Solution
class that supports accessors, and this requires the operation to be wrapped in a pipeline (see below). Note also that we see the “representation” of the key as
'RESULTS/A'($)
but the actual string value is simpler:
>>> str(next(iter(results)))
'RESULTS/A'
The results were not nested, because this modifier works with accessor functions, that act only on a real
Solution
, given to the operation only when wrapped in a pipeline (as done below).Now watch how these paths access deep into solution when the same operation is wrapped in a pipeline:
>>> pipe = compose("copy pipe", copy_values)
>>> sol = pipe.compute({"inputs": {"a": 1, "b": 2}}, outputs="RESULTS")
>>> sol
{'RESULTS': {'A': 1, 'BB': 2}}
- graphtik.modifier.optional(name: str, keyword: Optional[str] = None, accessor: Optional[Accessor] = None, jsonp=None, implicit=None) _Modifier [source]¶
Annotate optional needs corresponding to defaulted op-function arguments, …
received only if present in the inputs (when operation is invoked).
The value of an optional dependency is passed in as a keyword argument to the underlying function.
- Parameters
keyword – the name of the function argument it corresponds to; if falsy, the same as name is assumed, so as to always behave like a kw-type arg and to preserve its fn-name if ever renamed.
accessor – the functions to access values to/from solution (see
Accessor
) (actually a 2-tuple with functions is ok)
jsonp – None (derived from name),
False
, str, collection of str/callable (last one). See the generic modify()
modifier.
implicit – implicit dependencies are not fed into/out of the function. You may use directly
implicit()
.
Example:
>>> from graphtik import operation, compose, optional
>>> @operation(name='myadd',
...     needs=["a", optional("b")],
...     provides="sum")
... def myadd(a, b=0):
...     return a + b
Notice the default value 0 for the b argument, annotated as optional:
>>> graph = compose('mygraph', myadd)
>>> graph
Pipeline('mygraph', needs=['a', 'b'(?)], provides=['sum'], x1 ops: myadd)
The graph works both with and without b provided in the inputs:
>>> graph(a=5, b=4)['sum']
9
>>> graph(a=5)
{'a': 5, 'sum': 5}
Like
keyword()
you may map input-name to a different function-argument:
>>> operation(needs=['a', optional("quasi-real", "b")],
...            provides="sum"
... )(myadd.fn)  # Cannot wrap an operation, its `fn` only.
FnOp(name='myadd', needs=['a', 'quasi-real'(?'b')], provides=['sum'], fn='myadd')
- graphtik.modifier.sfx(name, optional: Optional[bool] = None) _Modifier [source]¶
sideffects denoting modifications beyond the scope of the solution.
Both needs & provides may be designated as sideffects using this modifier. They work as usual while solving the graph (planning) but they have a limited interaction with the operation’s underlying function; specifically:
input sideffects must exist in the solution as inputs for an operation depending on it to kick-in, when the computation starts - but this is not necessary for intermediate sideffects in the solution during execution;
input sideffects are NOT fed into underlying functions;
output sideffects are not expected from underlying functions, unless a rescheduled operation with partial outputs designates a sideffected as canceled by returning it with a falsy value (the operation must be a returns dictionary one).
Hint
If modifications involve some input/output, prefer the
sfxed()
modifier.You may still convey this relationships by including the dependency name in the string - in the end, it’s just a string - but no enforcement of any kind will happen from graphtik, like:
>>> from graphtik import sfx
>>> sfx("price[sales_df]")
sfx('price[sales_df]')
Example:
A typical use-case is to signify changes in some “global” context, outside solution:
>>> from graphtik import operation, compose, sfx
>>> @operation(provides=sfx("lights off"))  # sideffect names can be anything
... def close_the_lights():
...     pass
>>> graph = compose('strip ease',
...     close_the_lights,
...     operation(
...         name='undress',
...         needs=[sfx("lights off")],
...         provides="body")(lambda: "TaDa!")
... )
>>> graph
Pipeline('strip ease', needs=[sfx('lights off')], provides=[sfx('lights off'), 'body'], x2 ops: close_the_lights, undress)
>>> sol = graph()
>>> sol
{'body': 'TaDa!'}
Note
Something has to provide a sideffect for a function needing it to execute - this could be another operation, like above, or the user-inputs; just specify some truthy value for the sideffect:
>>> sol = graph.compute({sfx("lights off"): True})
- graphtik.modifier.sfxed(dependency: str, sfx0: str, *sfx_list: str, keyword: Optional[str] = None, optional: Optional[bool] = None, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
Annotates a sideffected dependency in the solution sustaining side-effects.
- Parameters
dependency – the actual dependency receiving the sideffect, which will be fed into/out of the function.
sfx0 – the 1st (arbitrary object) sideffect marked as “acting” on the dependency.
sfx_list – more (arbitrary object) sideffects (like the sfx0)
keyword – the name of the function argument it corresponds to. When optional, it becomes the same as name if falsy, so as to always behave like a kw-type arg, and to preserve the fn-name if ever renamed. When not optional, it may simply be omitted.
accessor – the functions to access values to/from solution (see
Accessor
) (actually a 2-tuple with functions is ok)
jsonp – None (derived from name),
False
, str, collection of str/callable (last one). See the generic modify()
modifier.
Like
sfx()
but annotating a real dependency in the solution, allowing that dependency to be present both in needs and provides of the same function.
Example:
A typical use-case is to signify columns required to produce new ones in pandas dataframes (emulated with dictionaries):
>>> from graphtik import operation, compose, sfxed
>>> @operation(needs="order_items",
...     provides=sfxed("ORDER", "Items", "Prices"))
... def new_order(items: list) -> "pd.DataFrame":
...     order = {"items": items}
...     # Pretend we get the prices from sales.
...     order['prices'] = list(range(1, len(order['items']) + 1))
...     return order
>>> @operation(
...     needs=[sfxed("ORDER", "Items"), "vat rate"],
...     provides=sfxed("ORDER", "VAT")
... )
... def fill_in_vat(order: "pd.DataFrame", vat: float):
...     order['VAT'] = [i * vat for i in order['prices']]
...     return order
>>> @operation(
...     needs=[sfxed("ORDER", "Prices", "VAT")],
...     provides=sfxed("ORDER", "Totals")
... )
... def finalize_prices(order: "pd.DataFrame"):
...     order['totals'] = [p + v for p, v in zip(order['prices'], order['VAT'])]
...     return order
To view all internal dependencies, enable DEBUG in configurations:
>>> from graphtik.config import debug_enabled
>>> with debug_enabled(True):
...     finalize_prices
FnOp(name='finalize_prices',
     needs=[sfxed('ORDER', 'Prices'), sfxed('ORDER', 'VAT')],
     _user_needs=[sfxed('ORDER', 'Prices', 'VAT')],
     _fn_needs=['ORDER'],
     provides=[sfxed('ORDER', 'Totals')],
     _user_provides=[sfxed('ORDER', 'Totals')],
     _fn_provides=['ORDER'],
     fn='finalize_prices')
Notice that declaring a single sideffected with many items in sfx_list expands into multiple “singular”
sideffected
dependencies in the network (checkneeds
vs_user_needs
above).>>> proc_order = compose('process order', new_order, fill_in_vat, finalize_prices) >>> sol = proc_order.compute({ ... "order_items": ["toilet-paper", "soap"], ... "vat rate": 0.18, ... }) >>> sol {'order_items': ['toilet-paper', 'soap'], 'vat rate': 0.18, 'ORDER': {'items': ['toilet-paper', 'soap'], 'prices': [1, 2], 'VAT': [0.18, 0.36], 'totals': [1.18, 2.36]}}
Notice that although many functions consume & produce the same
ORDER
dependency, something that would have formed cycles, the wrapping operations need and provide different sideffected instances, thus breaking the cycles.
See also
The elaborate example in Hierarchical data and further tricks section.
- graphtik.modifier.sfxed_vararg(dependency: str, sfx0: str, *sfx_list: str, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
Like
sideffected()
+vararg()
.
- graphtik.modifier.sfxed_varargs(dependency: str, sfx0: str, *sfx_list: str, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
Like
sideffected()
+varargs()
.
- graphtik.modifier.vararg(name: str, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
Annotate a varargish need to be fed as the function’s
*args
.- Parameters
See also
Consult also the example test-case in:
test/test_op.py:test_varargs()
, in the full sources of the project.Example:
We designate
b
&c
as vararg arguments:>>> from graphtik import operation, compose, vararg
>>> @operation(
...     needs=['a', vararg('b'), vararg('c')],
...     provides='sum'
... )
... def addall(a, *b):
...     return a + sum(b)
>>> addall
FnOp(name='addall', needs=['a', 'b'(*), 'c'(*)], provides=['sum'], fn='addall')
>>> graph = compose('mygraph', addall)
The graph works with and without any of
b
or
c
inputs:
>>> graph(a=5, b=2, c=4)['sum']
11
>>> graph(a=5, b=2)
{'a': 5, 'b': 2, 'sum': 7}
>>> graph(a=5)
{'a': 5, 'sum': 5}
- graphtik.modifier.varargs(name: str, accessor: Optional[Accessor] = None, jsonp=None) _Modifier [source]¶
A varargish
vararg()
, naming an iterable value in the inputs.
- Parameters
See also
Consult also the example test-case in:
test/test_op.py:test_varargs()
, in the full sources of the project.Example:
>>> from graphtik import operation, compose, varargs
>>> def enlist(a, *b):
...     return [a] + list(b)
>>> graph = compose('mygraph',
...     operation(name='enlist', needs=['a', varargs('b')],
...               provides='sum')(enlist)
... )
>>> graph
Pipeline('mygraph', needs=['a', 'b'(?)], provides=['sum'], x1 ops: enlist)
The graph works with or without b in the inputs:
>>> graph(a=5, b=[2, 20])['sum']
[5, 2, 20]
>>> graph(a=5)
{'a': 5, 'sum': [5]}
>>> graph(a=5, b=0xBAD)
Traceback (most recent call last):
ValueError: Failed matching inputs <=> needs for FnOp(name='enlist', needs=['a', 'b'(+)], provides=['sum'], fn='enlist'):
    1. Expected varargs inputs to be non-str iterables: {'b'(+): 2989}
    +++inputs: ['a', 'b']
Attention
To avoid user mistakes, varargs do not accept
str
inputs (although they are iterables):
>>> graph(a=5, b="mistake")
Traceback (most recent call last):
ValueError: Failed matching inputs <=> needs for FnOp(name='enlist', needs=['a', 'b'(+)], provides=['sum'], fn='enlist'):
    1. Expected varargs inputs to be non-str iterables: {'b'(+): 'mistake'}
    +++inputs: ['a', 'b']
See also
The elaborate example in Hierarchical data and further tricks section.
Module: planning¶
compose network of operations & dependencies, compile the plan.
- class graphtik.planning.Network(*operations, graph=None)[source]¶
A graph of operations that can compile an execution plan.
- needs[source]¶
the “base”, all data-nodes that are not produced by some operation, decided on construction.
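The idea can be sketched over plain (needs, provides) name-lists, ignoring optionals and sideffects: any data name that no operation produces must come from the user:

```python
def base_needs(ops):
    """ops: iterable of (needs, provides) name-lists.
    Return data-names not produced by any operation, preserving order."""
    produced = {p for _, provides in ops for p in provides}
    seen = set()
    base = []
    for needs, _ in ops:
        for n in needs:
            if n not in produced and n not in seen:
                seen.add(n)
                base.append(n)
    return base


ops = [(["a", "b"], ["c"]), (["c", "b"], ["d"])]
print(base_needs(ops))  # ['a', 'b']
```

This is a simplified sketch of the concept, not graphtik's actual code, which works over the networkx graph.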
- __init__(*operations, graph=None)[source]¶
- Parameters
operations – to be added in the graph
graph – if None, create a new.
- Raises
if dupe operation, with msg:
Operations may only be added once, …
- __module__ = 'graphtik.planning'¶
- _abc_impl = <_abc_data object>¶
- _append_operation(graph, operation: Operation)[source]¶
Adds the given operation and its data requirements to the network graph.
Invoked during constructor only (immutability).
Identities are based on the name of the operation, the names of the operation’s needs, and the names of the data it provides.
Adds needs, operation & provides, in that order.
- Parameters
graph – the networkx graph to append to
operation – operation instance to append
- _build_execution_steps(pruned_dag, sorted_nodes, inputs: Collection, outputs: Collection) List [source]¶
Create the list of operations and eviction steps, to execute given IOs.
- Parameters
pruned_dag – The original dag, pruned; not broken.
sorted_nodes – an
IndexedSet
) with all graph nodes topo-sorted (including pruned ones) by execution order & operation-insertion to break ties (see _topo_sort_nodes()
).inputs – Not used(!), useless inputs will be evicted when the solution is created.
outputs – output names, to decide whether (and which) evict-instructions to add
- Returns
the list of operation or dependencies to evict, in computation order
IMPLEMENTATION:
The operation steps are based on the topological sort of the DAG, therefore pruning must have eliminated any cycles.
Then the eviction steps are introduced between the operation nodes (if enabled, and outputs have been asked, or else all outputs are kept), to reduce asap solution’s memory footprint while the computation is running.
An evict-instruction is inserted on 2 occasions:
whenever a need of an executed op is not used by any other operation further down the DAG.
whenever a provide falls beyond the pruned_dag.
For doc chains, either the whole chain is evicted (from its root), or nothing at all.
For eviction purposes,
sfxed
dependencies are equivalent to their stripped sideffected ones, so these are also inserted in the graph (after sorting, to evade cycles).
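The first occasion can be sketched with a last-use scan over the topo-sorted operations (a simplification ignoring doc chains, sideffects, and pruned provides):

```python
def plan_evictions(op_steps, needs_of, keep=()):
    """Insert ('evict', dep) right after the last op consuming each dep,
    so the solution can drop it while the computation is still running."""
    last_use = {}  # dep -> index of the last op that needs it
    for i, op in enumerate(op_steps):
        for dep in needs_of[op]:
            last_use[dep] = i
    steps = []
    for i, op in enumerate(op_steps):
        steps.append(op)
        for dep, last in last_use.items():
            if last == i and dep not in keep:
                steps.append(("evict", dep))
    return steps


steps = plan_evictions(["a2b", "b2c"], {"a2b": ["a"], "b2c": ["b"]})
print(steps)  # ['a2b', ('evict', 'a'), 'b2c', ('evict', 'b')]
```

The keep argument stands in for asked outputs, which must survive until the end.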
- _cached_plans[source]¶
Speed up
compile()
call and avoid a multithreading issue(?) that occurs when accessing the dag in networkx.
- _deps_tuplized(deps, arg_name) Tuple[Optional[Tuple[str, ...]], Optional[Tuple[str, ...]]] [source]¶
Stabilize None or string/list-of-strings, drop names out of graph.
- Returns
a 2-tuple (stable-deps, deps-in-graph) or
(None, None)
- _prune_graph(inputs: Optional[Union[Collection, str]], outputs: Optional[Union[Collection, str]], predicate: Optional[Callable[[Any, Mapping], bool]] = None) Tuple[DiGraph, Tuple, Tuple, Mapping[Operation, Any]] [source]¶
Determines what graph steps need to run to get to the requested outputs from the provided inputs:
- Eliminate steps that are not on a path arriving to requested outputs;
- Eliminate unsatisfied operations: partial inputs or no outputs needed;
- Consolidate the list of needs & provides.
- Parameters
inputs – The names of all given inputs.
outputs – The desired output names. This can also be
None
, in which case the necessary steps are all graph nodes that are reachable from the provided inputs.predicate – the node predicate is a 2-argument callable(op, node-data) that should return true for nodes to include; if None, all nodes included.
- Returns
a 4-tuple:
the pruned execution dag,
net’s needs & outputs based on the given inputs/outputs and the net (may overlap, see
collect_requirements()
),an {op, prune-explanation} dictionary
Use the returned needs/provides to build a new plan.
- Raises
if outputs asked do not exist in network, with msg:
Unknown output nodes: …
- compile(inputs: Optional[Union[Collection, str]] = None, outputs: Optional[Union[Collection, str]] = None, recompute_from=None, *, predicate=None) ExecutionPlan [source]¶
Create or get from cache an execution-plan for the given inputs/outputs.
See
_prune_graph()
and_build_execution_steps()
for detailed description.- Parameters
inputs – A collection with the names of all the given inputs. If None, all inputs that lead to the given outputs are assumed. If string, it is converted to a single-element collection.
outputs – A collection of output names, or a single name. If None, all reachable nodes from the given inputs are assumed. If string, it is converted to a single-element collection.
recompute_from – Described in
Pipeline.compute()
.predicate – the node predicate is a 2-argument callable(op, node-data) that should return true for nodes to include; if None, all nodes included.
- Returns
the cached or fresh new execution plan
- Raises
- Unknown output nodes…
if outputs asked do not exist in network.
- Unsolvable graph: …
if it cannot produce any outputs from the given inputs.
- Plan needs more inputs…
if given inputs mismatched plan’s
needs
.- Unreachable outputs…
if net cannot produce asked outputs.
- graph: networkx.Graph[source]¶
The
networkx
(Di)Graph containing all operations and dependencies, prior to planning.
- prepare_plot_args(plot_args: PlotArgs) PlotArgs [source]¶
Called by
plot()
to create the nx-graph and other plot-args, e.g. solution.Clone the graph or merge it with the one in the plot_args (see
PlotArgs.clone_or_merge_graph()
.For the rest args, prefer
PlotArgs.with_defaults()
over_replace()
, not to override user args.
- graphtik.planning._optionalized(graph, data)[source]¶
Retain optionality of a data node based on all needs edges.
- graphtik.planning._topo_sort_nodes(graph) IndexedSet [source]¶
Topo-sort graph by execution order & operation-insertion order to break ties.
This means (probably!?) that the first inserted wins the needs, but the last one wins the provides (and the final solution).
Inform user in case of cycles.
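A comparable sort can be sketched with the stdlib graphlib, breaking ties among ready nodes by their insertion (dict) order; a rough analogue, not graphtik's implementation:

```python
from graphlib import TopologicalSorter


def topo_sort_stable(preds):
    """preds: {node: iterable-of-predecessors}; every node is a key.
    Topo-sort, breaking ties among ready nodes by insertion order."""
    rank = {node: i for i, node in enumerate(preds)}
    ts = TopologicalSorter(preds)
    ts.prepare()  # raises CycleError on cycles, informing the user
    order = []
    while ts.is_active():
        # All "ready" nodes are tied; resolve by original insertion order.
        ready = sorted(ts.get_ready(), key=rank.__getitem__)
        order.extend(ready)
        ts.done(*ready)
    return order


print(topo_sort_stable({"a": [], "c": ["a"], "b": ["a"]}))  # ['a', 'c', 'b']
```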
- graphtik.planning._yield_also_chained_docs(dig_dag: List[Tuple[str, int]], dag, doc: str, stop_set=()) Iterable[str] [source]¶
Dig the doc and its sub/super docs, not recursing in those already in stop_set.
- Parameters
dig_dag – a sequence of 2-tuples like
("in_edges", 0)
, with the name of a networkx method and which edge-node to pick, 0:= src, 1:= dststop_set – Stop traversing (and don’t return) doc if already contained in this set.
- Returns
the given doc, and any other docs discovered with dig_dag linked with a “subdoc” attribute on their edge, except those sub-trees with a root node already in stop_set. If doc is not in dag, returns empty.
- graphtik.planning._yield_chained_docs(dig_dag: Union[Tuple[str, int], List[Tuple[str, int]]], dag, docs: Iterable[str], stop_set=()) Iterable[str] [source]¶
Like
_yield_also_chained_docs()
but digging for many docs at once.- Returns
the given docs, and any other nodes discovered with dig_dag linked with a “subdoc” attribute on their edge, except those sub-trees with a root node already in stop_set.
- graphtik.planning.clone_graph_with_stripped_sfxed(graph)[source]¶
Clone graph including ALSO stripped sideffected deps, with original attrs.
- graphtik.planning.collect_requirements(graph) Tuple[IndexedSet, IndexedSet] [source]¶
Collect & split datanodes in (possibly overlapping) needs/provides.
- graphtik.planning.inputs_for_recompute(graph, inputs: Sequence[str], recompute_from: Sequence[str], recompute_till: Optional[Sequence[str]] = None) Tuple[IndexedSet, IndexedSet] [source]¶
Clear the inputs between recompute_from >–<= recompute_till.
- Parameters
graph – MODIFIED, at most 2 helper nodes inserted
inputs – a sequence
recompute_from – None or a sequence, including any out-of-graph deps (logged)
recompute_till – (optional) a sequence, only in-graph deps.
- Returns
a 2-tuple with the reduced inputs by the dependencies that must be removed from the graph to recompute (along with those dependencies).
It works by temporarily adding x2 nodes to find and remove the intersection of:
strict-descendants(recompute_from) & ancestors(recompute_till)
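Without the helper nodes, that intersection can be sketched over plain adjacency dicts of successors & predecessors, strict on the from side and inclusive on the till side:

```python
def recompute_span(succs, preds, recompute_from, recompute_till):
    """Return strict-descendants(recompute_from) & ancestors-or-self(recompute_till),
    over plain {node: neighbors} adjacency dicts (a sketch, not graphtik's code)."""
    def reachable(adj, seeds):
        seen, stack = set(), list(seeds)
        while stack:
            for nxt in adj.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    descendants = reachable(succs, recompute_from)  # strict: seeds excluded
    ancestors = reachable(preds, recompute_till) | set(recompute_till)
    return descendants & ancestors


succs = {"a": ["b"], "b": ["c"], "c": ["d"]}
preds = {"d": ["c"], "c": ["b"], "b": ["a"]}
print(sorted(recompute_span(succs, preds, ["a"], ["c"])))  # ['b', 'c']
```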
FIXME: merge recompute() with traversing unsatisfied (see
test_recompute_NEEDS_FIX
) because it clears inputs of unsatisfied ops (cannot be replaced later)
- graphtik.planning.log = <Logger graphtik.planning (WARNING)>¶
If this logger is eventually DEBUG-enabled, the string-representation of network-objects (network, plan, solution) is augmented with children’s details.
- graphtik.planning.root_doc(dag, doc: str) str [source]¶
Return the top-most superdoc, or the same doc if it is not in a chain, or raise if the node is unknown.
- graphtik.planning.unsatisfied_operations(dag, inputs: Iterable) Tuple[Mapping[Operation, Any], IndexedSet] [source]¶
Traverse topologically sorted dag to collect un-satisfied operations.
Unsatisfied operations are those suffering from ANY of the following:
- They are missing at least one compulsory need-input.
Since the dag is ordered, as soon as we’re on an operation, all its needs have been accounted for, so we can check its satisfaction.
- Their provided outputs are not linked to any data in the dag.
An operation might not have any output link when
_prune_graph()
has broken them, due to given intermediate inputs.
- Parameters
dag – a graph with broken edges (those arriving at existing inputs)
inputs – an iterable of the names of the input values
- Returns
a 2-tuple with ({pruned-op, unsatisfied-explanation}, topo-sorted-nodes)
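The first condition can be sketched as a single pass over the topo-sorted operations (a simplification treating all needs as compulsory and ignoring broken provide-links):

```python
def unsatisfied(topo_ops, needs_of, provides_of, inputs):
    """Collect ops whose needs are neither inputs nor produced upstream."""
    available = set(inputs)
    pruned = {}  # op -> list of missing needs (the "explanation")
    for op in topo_ops:
        missing = [n for n in needs_of[op] if n not in available]
        if missing:
            pruned[op] = missing
        else:
            available.update(provides_of[op])
    return pruned


ops = ["make_b", "make_c"]
needs = {"make_b": ["a"], "make_c": ["b"]}
provides = {"make_b": ["b"], "make_c": ["c"]}
print(unsatisfied(ops, needs, provides, inputs=[]))
# {'make_b': ['a'], 'make_c': ['b']}
print(unsatisfied(ops, needs, provides, inputs=["a"]))  # {}
```

Notice how dropping one unsatisfied op cascades: once make_b is pruned, make_c loses its only source of b.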
- graphtik.planning.yield_also_chaindocs(dag, doc: str, stop_set=()) Iterable[str] [source]¶
Calls
_yield_also_chained_docs()
for both subdocs & superdocs.
- graphtik.planning.yield_also_subdocs(dag, doc: str, stop_set=()) Iterable[str] [source]¶
Calls
_yield_also_chained_docs()
for subdocs.
- graphtik.planning.yield_also_superdocs(dag, doc: str, stop_set=()) Iterable[str] [source]¶
Calls
_yield_also_chained_docs()
for superdocs.
- graphtik.planning.yield_chaindocs(dag, docs: Iterable[str], stop_set=()) Iterable[str] [source]¶
Calls
_yield_chained_docs()
for both subdocs & superdocs.
- graphtik.planning.yield_ops(nodes) List[Operation] [source]¶
May scan (preferably)
plan.steps
or dag nodes.
Module: execution¶
execute the plan to derive the solution.
- class graphtik.execution.ExecutionPlan(net, needs, provides, dag, steps, asked_outs, comments)[source]¶
A pre-compiled list of operation steps that can execute for the given inputs/outputs.
It is the result of the network’s planning phase.
Note the execution plan’s attributes are on purpose immutable tuples.
- net¶
The parent
Network
- needs¶
An
IndexedSet
with the input names needed to exist in order to produce all provides.
- provides¶
An
IndexedSet
with the output names produced when all inputs are given.
- dag¶
The regular (not broken) pruned subgraph of net-graph.
- steps¶
The tuple of operation-nodes & instructions needed to evaluate the given inputs & asked outputs, free memory and avoid overwriting any given intermediate inputs.
- asked_outs¶
When true, evictions may kick in (unless disabled by configurations); otherwise, evictions (along with the perfect-evictions check) are skipped.
- comments¶
an {op, prune-explanation} dictionary
- __module__ = 'graphtik.execution'¶
- _abc_impl = <_abc_data object>¶
- _execute_sequential_method(solution: Solution)[source]¶
This method runs the graph one operation at a time in a single thread
- Parameters
solution – must contain the input values only, gets modified
- _execute_thread_pool_barrier_method(solution: Solution)[source]¶
(deprecated) This method runs the graph using a parallel pool of thread executors. You may achieve lower total latency with this method if your graph is sufficiently subdivided into operations.
- Parameters
solution – must contain the input values only, gets modified
- _handle_task(future: Union[OpTask, AsyncResult], op, solution) None [source]¶
Un-dill parallel task results (if marshalled), and update solution / handle failure.
- _prepare_tasks(operations, solution, pool, global_parallel, global_marshal) Union[Future, OpTask, bytes] [source]¶
Combine ops+inputs, apply marshalling, and submit to execution pool (or not) …
based on global/per-op configs.
- execute(named_inputs, outputs=None, *, name='', callbacks: Optional[Tuple[Callable[[OpTask], None], ...]] = None, solution_class=None, layered_solution=None) Solution [source]¶
- Parameters
named_inputs – A mapping of names –> values that must contain at least the compulsory inputs that were specified when the plan was built (but cannot enforce that!). Cloned, not modified.
outputs – If not None, they are just checked, if possible, against
provides
, screaming if unsatisfiable.
name – the name of the pipeline, used for logging.
callbacks – If given, a 2-tuple with (optional) callbacks to call before/after computing each operation, with an
OpTask
as argument, containing the op & solution. A single callable (scalar), fewer than 2, or no elements at all are also accepted.
solution_class – a custom solution factory to use.
layered_solution –
whether to store operation results into separate solution layers.
Unless overridden by a True/False in
set_layered_solution()
of configurations, it accepts the following values:
When True (False), results are always kept (not kept) in a separate layer for each operation, regardless of any jsonp dependencies.
If
None
, layers are used only if there are NO jsonp dependencies in the network.
- Returns
The solution which contains the results of each operation executed +1 for inputs in separate dictionaries.
- Raises
- Unsolvable graph…
if it cannot produce any outputs from the given inputs.
- Plan needs more inputs…
if given inputs mismatched plan’s
needs
.- Unreachable outputs…
if net cannot produce asked outputs.
- property graph¶
- prepare_plot_args(plot_args: PlotArgs) PlotArgs [source]¶
Called by
plot()
to create the nx-graph and other plot-args, e.g. solution.Clone the graph or merge it with the one in the plot_args (see
PlotArgs.clone_or_merge_graph()
.For the rest args, prefer
PlotArgs.with_defaults()
over_replace()
, not to override user args.
- validate(inputs: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>, outputs: ~typing.Optional[~typing.Union[~typing.Collection, str]] = <UNSET>)[source]¶
Scream on invalid inputs, outputs or no operations in graph.
- Parameters
- Raises
- Unsolvable graph…
if it cannot produce any outputs from the given inputs.
- Plan needs more inputs…
if given inputs mismatched plan’s
needs
.- Unreachable outputs…
if net cannot produce asked outputs.
- class graphtik.execution.OpTask(op, sol, solid, result=<UNSET>)[source]¶
Mimic
concurrent.futures.Future
for sequential execution.This intermediate class is needed to solve pickling issue with process executor.
- __slots__ = ('op', 'sol', 'solid', 'result')¶
- logname = 'graphtik.execution'¶
- op¶
the operation about to be computed.
- result¶
Initially
UNSET
; it will be set after execution to the operation’s outputs, or to its exception.
- sol¶
the solution (might be just a plain dict if it has been marshalled).
- solid¶
the operation identity, needed if sol is a plain dict.
- class graphtik.execution.Solution(plan, input_values: dict, callbacks: Optional[Tuple[Callable[[OpTask], None], Callable[[OpTask], None]]] = None, is_layered=None)[source]¶
The solution chain-map and execution state (e.g. overwrites or canceled operations).
It inherits
collections.ChainMap
, to keep a separate dictionary for each operation executed, +1 for the user inputs.
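Since it inherits collections.ChainMap, the layering idea can be sketched with the stdlib class alone (the layers and values below are made up for illustration):

```python
from collections import ChainMap

inputs = {"a": 2, "b": 5}        # the user-inputs dictionary
layer_mul = {"ab": 10}           # outputs of a 1st operation
layer_sub = {"a_minus_ab": -8}   # outputs of a 2nd operation

# Newest layer first, user inputs last -- lookups scan the layers in order.
sol = ChainMap(layer_sub, layer_mul, inputs)
assert sol["ab"] == 10 and sol["a"] == 2
```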
- __init__(plan, input_values: dict, callbacks: Optional[Tuple[Callable[[OpTask], None], Callable[[OpTask], None]]] = None, is_layered=None)[source]¶
Initialize a ChainMap by setting maps to the given mappings. If no mappings are provided, a single empty dictionary is used.
- _populate_op_layer_with_outputs(op, outputs) dict [source]¶
Installs & populates a new 1st chained-map if layered, or else uses named_inputs.
- _reschedule(dag, reason, op)[source]¶
Re-prune dag, and then update and return any newly-canceled ops.
- Parameters
dag – The dag to discover unsatisfied operations from.
reason – for logging
op – for logging
- broken: Mapping[Operation, Operation] = {}[source]¶
A map of {rescheduled operation -> dynamically pruned ops, downstream}.
- canceled: Mapping[Operation, Any] = {}[source]¶
A {op, prune-explanation} dictionary with canceled operations due to upstream failures.
- check_if_incomplete() Optional[IncompleteExecutionError] [source]¶
Return a
IncompleteExecutionError
if pipeline operations failed/canceled.
- dag: DiGraph[source]¶
Cloned from the plan and modified during execution, by removing the downstream edges of:
any partial outputs not provided, or
all provides of failed operations.
- executed: Mapping[Operation, Any] = {}[source]¶
A dictionary with keys the operations executed, and values their layer, or status:
no key: not executed yet
value == dict: execution ok, produced those outputs
value == Exception: execution failed
Keys are ordered as operations executed (last, most recently executed).
When
is_layered
, its value-dicts are inserted, in reverse order, into my maps
(inherited from chain-map).
- property graph¶
- is_layered: bool[source]¶
- Whether to keep a separate solution layer per operation; by
default, false if there are no jsonp dependencies.
See
executed
below
- property layers: List[Mapping[Operation, Any]]¶
Outputs by operation, in execution order (last, most recently executed).
- operation_executed(op, outputs)[source]¶
Invoked once per operation, with its results.
It will update
executed
with the operation status and if outputs were partials, it will updatecanceled
with the unsatisfied ops downstream of op.- Parameters
op – the operation that completed ok
outputs – The named values the op actually produced, which may be a subset of its provides. Sideffects are not considered.
- operation_failed(op, ex)[source]¶
Invoked once per operation that failed, with the raised exception.
It will update
executed
with the operation status and thecanceled
with the unsatisfied ops downstream of op.
- property overwrites: Mapping[Any, List]¶
The data in the solution that exist more than once (refreshed on every call).
A “virtual” property returning a dictionary with keys the names of values that exist more than once, and values all those values in a list, ordered in reverse compute order (1st is the most recently computed; last are any given inputs).
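The notion of overwrites can be sketched with a plain ChainMap: a name appearing in more than one layer yields multiple values, newest first (the keys and values below are made up for illustration):

```python
from collections import ChainMap

# "x" was computed by two layers -> it is an "overwrite".
sol = ChainMap({"x": 3}, {"x": 1, "y": 2})

overwrites = {
    key: [layer[key] for layer in sol.maps if key in layer]
    for key in sol
    if sum(key in layer for layer in sol.maps) > 1
}
assert overwrites == {"x": [3, 1]}  # newest value first
```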
- plan = 'ExecutionPlan'¶
the plan that produced this solution
- graphtik.execution._do_task(task)[source]¶
Un-dill the simpler
OpTask
& Dill the results, to pass through pool-processes.
- graphtik.execution.log = <Logger graphtik.execution (WARNING)>¶
If this logger is eventually DEBUG-enabled, the string-representation of network-objects (network, plan, solution) is augmented with children’s details.
- graphtik.execution.task_context: ContextVar = <ContextVar name='task_context'>¶
(unstable API) Populated with the
OpTask
for the currently executing operation. It does not work for (deprecated) parallel execution.See also
The elaborate example in Hierarchical data and further tricks section
Module: plot¶
plotting handled by the active plotter & current theme.
- class graphtik.plot.Plotter(theme: Optional[Theme] = None, **styles_kw)[source]¶
A plotter renders diagram images of plottables.
- default_theme[source]¶
The customizable
Theme
instance controlling theme values & dictionaries for plots.
- build_pydot(plot_args: PlotArgs) Dot [source]¶
Build a
pydot.Dot
out of a Network graph/steps/inputs/outputs and return it, to be fed into Graphviz to render.
See
Plottable.plot()
for the arguments, sample code, and the legend of the plots.
- legend(filename=None, jupyter_render: Optional[Mapping] = None, theme: Optional[Theme] = None)[source]¶
Generate a legend for all plots (see
Plottable.plot()
for args)See
Plotter.render_pydot()
for the remaining arguments.
- render_pydot(dot: Dot, filename=None, jupyter_render: Optional[str] = None)[source]¶
Render a
pydot.Dot
instance with Graphviz in a file and/or in a matplotlib window.- Parameters
dot – the pre-built
pydot.Dot
instance
filename (str) –
Write a file or open a matplotlib window:
If it is a string or file, the diagram is written into that file-path.
Common extensions are
.png .dot .jpg .jpeg .pdf .svg
; call
plot.supported_plot_formats()
for more.
If it IS True, opens the diagram in a matplotlib window (requires the matplotlib package to be installed).
If it equals -1, it mat-plots but does not open the window.
Otherwise, it just returns the
pydot.Dot
instance.
jupyter_render –
a nested dictionary controlling the rendering of graph-plots in Jupyter cells. If None, defaults to
default_jupyter_render
; you may modify those in place and they will apply for all future calls (see Jupyter notebooks).You may increase the height of the SVG cell output with something like this:
plottable.plot(jupyter_render={"svg_element_styles": "height: 600px; width: 100%"})
- Returns
the matplotlib image if
filename=-1
, or the given dot annotated with any jupyter-rendering configurations given in jupyter_render parameter.
See
Plottable.plot()
for sample code.
- with_styles(**kw) Plotter [source]¶
Returns a cloned plotter with a deep-copied theme modified as given.
See also
Theme.withset()
.
- class graphtik.plot.Ref(ref, default=Ellipsis)[source]¶
Deferred attribute reference
resolve()
d on some object(s).
- default¶
- ref¶
- class graphtik.plot.StylesStack(plot_args: PlotArgs, named_styles: List[Tuple[str, dict]], ignore_errors: bool = False)[source]¶
A mergeable stack of dicts preserving provenance and style expansion.
The
merge()
method joins the collected stack of styles into a single dictionary; if DEBUG (see
remerge()
), it inserts their provenance in a
'tooltip'
attribute. Any lists are merged (important for multi-valued Graphviz attributes like
style
). Then they are
expanded
.- add(name, kw=Ellipsis)[source]¶
Adds a style by name from style-attributes, or provenanced explicitly, or fail early.
- Parameters
name – Either the provenance name when the kw styles is given, OR just an existing attribute of
style
instance.kw – if given and is None/empty, ignored.
- expand(style: dict) dict [source]¶
Apply style expansions on an already merged style.
Call any callables found as keys, values or the whole style-dict, passing in the current
plot_args
, and replace those with the callable’s result (even more flexible than templates).Resolve any
Ref
instances, first against the current nx_attrs and then against the attributes of the current theme.Render jinja2 templates with template-arguments all attributes of
plot_args
instance in use, (hence much more flexible thanRef
).Any Nones results above are discarded.
Workaround pydot/pydot#228 pydot-cstor not supporting styles-as-lists.
Merge tooltip & tooltip lists.
- property ignore_errors¶
When true, keep merging despite expansion errors.
- merge(debug=None) dict [source]¶
Recursively merge
named_styles
andexpand()
the result style.- Parameters
debug – When not None, override DEBUG flag; when enabled, tooltips are overridden with provenance & nx_attrs.
- Returns
the merged styles
- property named_styles¶
A list of 2-tuples: (name, dict) containing the actual styles along with their provenance.
- property plot_args¶
current item’s plot data with at least
PlotArgs.theme
attribute.
- stack_user_style(nx_attrs: dict, skip=())[source]¶
Appends keys in nx_attrs starting with
USER_STYLE_PREFFIX
into the stack.
- class graphtik.plot.Theme(*, _prototype: Optional[Theme] = None, **kw)[source]¶
The poor man’s css-like plot theme (see also
StyleStack
).To use the values contained in theme-instances, stack them in a
StylesStack
, andStylesStack.merge()
them with style expansions (read it fromStyleStack.expand()
).Attention
It is recommended to use other means for Plot customizations instead of directly modifying the theme’s class-attributes.
All
Theme
class-attributes are deep-copied when constructing new instances, to avoid modifications by mistake while attempting to update instance-attributes instead (hint: almost all of its attributes are containers, i.e. dicts). Therefore any class-attribute modification will be ignored, until a new
Theme
instance from the patched class is used.
- arch_url = 'https://graphtik.readthedocs.io/en/latest/arch.html'¶
the url to the architecture section explaining graphtik glossary, linked by legend.
- broken_color = 'Red'¶
- canceled_color = '#a9a9a9'¶
- data_bad_html_label_keys = {'label'}¶
Keys to ignore from data styles & node-attrs, because they are handled internally by HTML-Label, and/or interact badly with that label.
- data_template = <Template memory:7f29cc513b10>¶
- edge_defaults = {}[source]¶
Attributes applying to all edges with
edge [...]
graphviz construct, appended in graph only if non-empty.
- evicted_color = '#006666'¶
- failed_color = 'LightCoral'¶
- fill_color = 'wheat'¶
- kw_data = {'fixedsize': 'shape', 'shape': 'rect'}¶
Reduce margins, since sideffects take a lot of space (default margin: x=0.11, y=0.055).
- kw_data_evicted = {'penwidth': '3', 'tooltip': ['(evicted)']}¶
- kw_data_in_solution = {'fillcolor': Ref('fill_color'), 'style': ['filled'], 'tooltip': [<function make_data_value_tooltip>]}¶
- kw_data_in_solution_null = {'fillcolor': Ref('null_color'), 'tooltip': ['(null-result)']}¶
- kw_data_inp_only = {'shape': 'invhouse', 'tooltip': ['(input)']}¶
- kw_data_io = {'shape': 'hexagon', 'tooltip': ['(input+output)']}¶
- kw_data_missing = {'color': Ref('canceled_color'), 'fontcolor': Ref('canceled_color'), 'tooltip': ['(missing-optional or canceled)']}¶
- kw_data_out_only = {'shape': 'house', 'tooltip': ['(output)']}¶
- kw_data_overwritten = {'fillcolor': Ref('overwrite_color'), 'style': ['filled'], 'tooltip': [<function make_overwrite_tooltip>]}¶
- kw_data_pruned = {'color': Ref('pruned_color'), 'fontcolor': Ref('pruned_color'), 'tooltip': ['(pruned)']}¶
- kw_data_sideffect = {'color': Ref('sideffect_color'), 'fontcolor': Ref('sideffect_color')}¶
- kw_data_to_evict = {'color': Ref('evicted_color'), 'style': ['filled', 'dashed'], 'tooltip': ['(to evict)']}¶
- kw_edge = {'headport': 'n', 'tailport': 's'}¶
- kw_edge_alias = {'fontsize': 11, 'label': <Template memory:7f29cc533210>}¶
Added conditionally if alias_of found in edge-attrs.
- kw_edge_broken = {'color': Ref('broken_color'), 'tooltip': ['(partial-broken)']}¶
- kw_edge_endured = {'style': ['dashed']}¶
- kw_edge_head_op = {'arrowtail': 'inv', 'dir': 'back'}¶
- kw_edge_implicit = {<function Theme.<lambda>>: 'obox', 'dir': 'both', 'tooltip': ['(implicit)'], 'fontcolor': Ref('sideffect_color')}¶
- kw_edge_mapping_keyword = {'fontname': 'italic', 'fontsize': 11, 'label': <Template memory:7f29d34e1d10>, 'tooltip': ['(mapped-fn-keyword)']}¶
Rendered if
keyword
exists in nx_attrs.
- kw_edge_null_result = {'color': Ref('null_color'), 'tooltip': ['(null-result)']}¶
- kw_edge_optional = {'style': ['dashed'], 'tooltip': ['(optional)']}¶
- kw_edge_pruned = {'color': Ref('pruned_color'), 'fontcolor': Ref('pruned_color')}¶
- kw_edge_rescheduled = {'style': ['dashed']}¶
- kw_edge_sideffect = {'color': Ref('sideffect_color')}¶
- kw_edge_subdoc = {'arrowtail': 'odot', 'color': Ref('subdoc_color'), 'dir': 'back', 'headport': 'nw', 'tailport': 'se', 'tooltip': ['(subdoc)']}¶
- kw_graph = {'fontname': 'italic', 'graph_type': 'digraph'}¶
- kw_graph_plottable_type = {'ExecutionPlan': {}, 'FnOp': {}, 'Network': {}, 'Pipeline': {}, 'Solution': {}}¶
styles per plot-type
- kw_graph_plottable_type_unknown = {}[source]¶
For when type-name of
PlotArgs.plottable
is not found inkw_plottable_type
(or missing altogether).
- kw_legend = {'URL': 'https://graphtik.readthedocs.io/en/latest/_images/GraphtikLegend.svg', 'fillcolor': 'yellow', 'name': 'legend', 'shape': 'component', 'style': 'filled', 'target': '_blank'}¶
If
'URL'
key missing/empty, no legend icon included in plots.
- kw_op = {'name': <function Theme.<lambda>>, 'shape': 'plain', 'tooltip': [<function Theme.<lambda>>]}¶
props for operation node (outside of label)
- kw_op_canceled = {'fillcolor': Ref('canceled_color'), 'tooltip': ['(canceled)']}¶
- kw_op_endured = {'badges': ['!'], 'penwidth': Ref('resched_thickness'), 'style': ['dashed'], 'tooltip': ['(endured)']}¶
- kw_op_executed = {'fillcolor': Ref('fill_color')}¶
- kw_op_failed = {'fillcolor': Ref('failed_color'), 'tooltip': [<Template memory:7f29cc768c90>]}¶
- kw_op_label = {'fn_link_target': '_top', 'fn_name': <function Theme.<lambda>>, 'fn_truncate': Ref('truncate_args'), 'fn_url': Ref('fn_url'), 'op_link_target': '_top', 'op_name': <function Theme.<lambda>>, 'op_truncate': Ref('truncate_args'), 'op_url': Ref('op_url')}¶
Jinja2 params for the HTML-Table label, applied 1ST.
- kw_op_label2 = {'fn_tooltip': [<function make_fn_tooltip>], 'op_tooltip': [<function make_op_tooltip>]}¶
Jinja2 params for the HTML-Table label applied AT THE END.
- kw_op_marshalled = {'badges': ['&']}¶
- kw_op_parallel = {'badges': ['|']}¶
- kw_op_prune_comment = {'op_tooltip': [<function make_op_prune_comment>]}¶
- kw_op_pruned = {'color': Ref('pruned_color'), 'fontcolor': Ref('pruned_color')}¶
- kw_op_rescheduled = {'badges': ['?'], 'penwidth': Ref('resched_thickness'), 'style': ['dashed'], 'tooltip': ['(rescheduled)']}¶
- kw_op_returns_dict = {'badges': ['}']}¶
- kw_step = {'arrowhead': 'vee', 'color': Ref('steps_color'), 'fontcolor': Ref('steps_color'), 'fontname': 'bold', 'fontsize': 18, 'splines': True, 'style': 'dotted'}¶
step edges
- kw_step_badge = {'step_bgcolor': Ref('steps_color'), 'step_color': 'white', 'step_target': '_top', 'step_tooltip': 'computation order', 'step_url': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-steps', 'vector_color': Ref('vector_color')}¶
Available as jinja2 params for both data & operation templates.
- node_defaults = {'fillcolor': 'white', 'style': ['filled']}¶
Attributes applying to all nodes with
node [...]
graphviz construct, append in graph only if non-empty.
- null_color = '#ffa9cd'¶
- op_bad_html_label_keys = {'label', 'shape', 'style'}¶
Keys to ignore from operation styles & node-attrs, because they are handled internally by HTML-Label, and/or interact badly with that label.
- op_badge_styles = {'badge_styles': {'!': {'URL': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-endured', 'bgcolor': '#04277d', 'color': 'white', 'target': '_top', 'tooltip': 'endured'}, '&': {'URL': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-marshalling', 'bgcolor': '#4e3165', 'color': 'white', 'target': '_top', 'tooltip': 'marshalled'}, '?': {'URL': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-partial-outputs', 'bgcolor': '#fc89ac', 'color': 'white', 'target': '_top', 'tooltip': 'rescheduled'}, '|': {'URL': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-parallel-execution', 'bgcolor': '#b1ce9a', 'color': 'white', 'target': '_top', 'tooltip': 'parallel'}, '}': {'URL': 'https://graphtik.readthedocs.io/en/latest/arch.html#term-returns-dictionary', 'bgcolor': '#cc5500', 'color': 'white', 'target': '_top', 'tooltip': 'returns_dict'}}}¶
Operation styles may specify one or more “letters” in a badges list item, as long as the “letter” is contained in the dictionary below.
- op_template = <Template memory:7f29cc533d10>¶
Try to mimic a regular Graphviz node attributes (see examples in
test.test_plot.test_op_template_full()
for params). TODO: fix jinja2 template is un-picklable!
- overwrite_color = 'SkyBlue'¶
- pruned_color = '#d3d3d3'¶
- resched_thickness = 4¶
- show_chaindocs = None[source]¶
- None:
hide any parent/subdoc not related directly to some operation;
- true:
plot also hierarchical data nodes not directly linked to operations;
- false:
hide also parent-subdoc relation edges.
- show_steps = None[source]¶
- None:
plot just a badge with the order (a number) of each op/data in steps (if contained);
- true:
plot also execution steps, linking operations and evictions with green dotted lines labeled with numbers denoting the execution order;
- false:
hide even op/data step order badges.
- sideffect_color = 'blue'¶
- steps_color = '#00bbbb'¶
- subdoc_color = '#8B4513'¶
- truncate_args = ((23, True), {'reverse': True})¶
args for jinja2 patched truncate filter, above.
- vector_color = '#7193ff'¶
- graphtik.plot.USER_STYLE_PREFFIX = 'graphviz.'¶
Any nx-attributes starting with this prefix are appended verbatim as graphviz attributes, by
stack_user_style()
.
- graphtik.plot.active_plotter_plugged(plotter: Plotter) None [source]¶
Like
set_active_plotter()
as a context-manager, resetting back to old value.
- graphtik.plot.as_identifier(s)[source]¶
Convert string into a valid ID, both for html & graphviz.
It must not rely on Graphviz’s HTML-like string, because it would not be a valid HTML-ID.
Adapted from https://stackoverflow.com/a/3303361/548792,
HTML rule from https://stackoverflow.com/a/79022/548792
Graphviz rules: https://www.graphviz.org/doc/info/lang.html
- graphtik.plot.default_jupyter_render = {'svg_container_styles': '', 'svg_element_styles': 'width: 100%; height: 300px;', 'svg_pan_zoom_json': '{controlIconsEnabled: true, fit: true}'}¶
A nested dictionary controlling the rendering of graph-plots in Jupyter cells,
as those returned from
Plottable.plot()
(currently as SVGs). Either modify it in place, or pass another one in the respective methods.The following keys are supported.
- Parameters
svg_pan_zoom_json –
arguments controlling the rendering of a zoomable SVG in Jupyter notebooks, as defined in https://github.com/ariutta/svg-pan-zoom#how-to-use if None, defaults to string (also maps supported):
"{controlIconsEnabled: true, fit: true}"
svg_element_styles –
mostly for sizing the zoomable SVG in Jupyter notebooks. Inspect & experiment on the html page of the notebook with browser tools. if None, defaults to string (also maps supported):
"width: 100%; height: 300px;"
svg_container_styles – like svg_element_styles, if None, defaults to empty string (also maps supported).
Note
referred also by
graphtik
’sgraphtik_zoomable_options
default configuration value.
- graphtik.plot.get_active_plotter() Plotter [source]¶
Get the previously active
plotter
instance or default one.
- graphtik.plot.graphviz_html_string(s, *, repl_nl=None, repl_colon=None, xmltext=None)[source]¶
Workaround pydot parsing of node-id & labels by encoding as HTML.
pydot library does not quote DOT-keywords anywhere (pydot#111).
Char
:
on node-names denote port/compass-points and break IDs (pydot#224).Non-strings are not quote_if_necessary by pydot.
NLs in tooltips of HTML-Table labels need substitution with the XML-entity.
HTML-Label attributes (
xmlattr=True
) need both html-escape & quote.
Attention
It does not correctly handle
ID:port:compass-point
format.
- graphtik.plot.is_nx_node_dependent(graph, nx_node)[source]¶
Return true if node’s edges are not subdoc only.
- graphtik.plot.legend(filename=None, show=None, jupyter_render: Optional[Mapping] = None, plotter: Optional[Plotter] = None)[source]¶
Generate a legend for all plots (see
Plottable.plot()
for args)- Parameters
plotter – override the active plotter
show –
Deprecated since version v6.1.1: Merged with filename param (filename takes precedence).
See
Plotter.render_pydot()
for the rest arguments.
- graphtik.plot.make_data_value_tooltip(plot_args: PlotArgs)[source]¶
Called on datanodes, when solution exists.
- graphtik.plot.make_op_tooltip(plot_args: PlotArgs)[source]¶
the string-representation of an operation (name, needs, provides)
- graphtik.plot.make_overwrite_tooltip(plot_args: PlotArgs)[source]¶
Called on datanodes, with multiple overwrite values.
- graphtik.plot.make_template(s)[source]¶
Makes dedented jinja2 templates supporting extra escape filters for Graphviz:
ee
Like default escape filter
e
, but Nones/empties evaluate to false. Needed because the default escape filter breaks xmlattr filter with Nones .eee
Escape for when writing inside HTML-strings. Collapses nones/empties (unlike default
e
).hrefer
Dubious escape for when writing URLs inside Graphviz attributes. Does NOT collapse nones/empties (like default
e
)ex
format exceptions
truncate
reversing truncate (keep tail) if truncate arg is true
sideffected
return the sideffected part of an sfxed or none
sfx_list
return the sfx_list part of an sfxed or none
jsonp
return the jsonp list of a dependency or none
- graphtik.plot.remerge(*containers, source_map: Optional[list] = None)[source]¶
Merge recursively dicts and extend lists with
boltons.iterutils.remap()
…screaming on type conflicts, i.e. a list needs a list, etc., unless one of them is None, which is ignored.
- Parameters
containers – a list of dicts or lists to merge; later ones take precedence (last-wins). If source_map is given, these must be 2-tuples of
(name: container)
.source_map –
If given, it must be a dictionary, and containers arg must be 2-tuples like
(name: container)
. The source_map will be populated with mappings between path and the name of the container it came from.Warning
if source_map is given, the order of input dictionaries is NOT preserved in the results (important if your code relies on PY3.7 stable dictionaries).
- Returns
returns a new, merged top-level container.
Adapted from https://gist.github.com/mahmoud/db02d16ac89fa401b968 but for lists and dicts only, ignoring Nones and screams on incompatible types.
Discussion in: https://gist.github.com/pleasantone/c99671172d95c3c18ed90dc5435ddd57
Example
>>> defaults = {
...     'subdict': {
...         'as_is': 'hi',
...         'overridden_key1': 'value_from_defaults',
...         'overridden_key1': 2222,
...         'merged_list': ['hi', {'untouched_subdict': 'v1'}],
...     }
... }
>>> overrides = {
...     'subdict': {
...         'overridden_key1': 'overridden value',
...         'overridden_key2': 5555,
...         'merged_list': ['there'],
...     }
... }
>>> from graphtik.plot import remerge
>>> source_map = {}
>>> remerge(
...     ("defaults", defaults),
...     ("overrides", overrides),
...     source_map=source_map)
{'subdict': {'as_is': 'hi', 'overridden_key1': 'overridden value', 'merged_list': ['hi', {'untouched_subdict': 'v1'}, 'there'], 'overridden_key2': 5555}}
>>> source_map
{('subdict', 'as_is'): 'defaults', ('subdict', 'overridden_key1'): 'overrides', ('subdict', 'merged_list'): ['defaults', 'overrides'], ('subdict',): 'overrides', ('subdict', 'overridden_key2'): 'overrides'}
- graphtik.plot.save_plot_file_by_sha1(plottable: Plottable, dir_prefix: Path)[source]¶
Save plottable in a file-path generated from the sha1 of the dot.
- graphtik.plot.set_active_plotter(plotter: Plotter)[source]¶
The default instance to render plottables,
unless overridden with a plotter argument in
Plottable.plot()
.- Parameters
plotter – the
plotter
instance to install
Module: config¶
configurations for network execution, and utilities on them.
See also
methods plot.active_plotter_plugged()
, plot.set_active_plotter()
,
plot.get_active_plotter()
Plot configurations are not defined here, so as not to pollute the import space early, until they are actually needed.
Note
The context-manager functions XXX_plugged()
or XXX_enabled()
do NOT launch
their code blocks using contextvars.Context.run()
in a separate “context”,
so any changes to these or other context-vars will persist
(unless they are also done within such context-managers).
- graphtik.config.abort_run()[source]¶
Sets the abort-run global flag, to halt all currently executing and any future plans.
This global flag is reset when any
Pipeline.compute()
is executed, or manually, by callingreset_abort()
.
- graphtik.config.debug_enabled(enabled=True)[source]¶
Like
set_debug()
as a context-manager, resetting back to old value.See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.evictions_skipped(enabled=True)[source]¶
Like
set_skip_evictions()
as a context-manager, resetting back to old value.See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.execution_pool_plugged(pool: Optional[Pool])[source]¶
Like
set_execution_pool()
as a context-manager, resetting back to old value.See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.get_execution_pool() Optional[Pool] [source]¶
(deprecated) Get the process-pool for parallel plan executions.
- graphtik.config.is_debug() Optional[bool] [source]¶
Return
set_debug()
or True if the
GRAPHTIK_DEBUG
environment variable is not one of
0 false off no
.
Affected behavior when the DEBUG flag is enabled:
on errors, plots the 1st errored solution/plan/pipeline/net (in that order) in an SVG file inside the temp-directory, and its path is logged in ERROR-level;
jetsam logs in ERROR (instead of in DEBUG) all annotations on all calls up the stack trace (logged from
graphtik.jetsam.err
logger);FnOp.compute()
prints out full given-inputs (not just their keys);net objects print more details recursively, like fields (not just op-names) and prune-comments;
plotted SVG diagrams include style-provenance as tooltips;
Sphinx extension also saves the original DOT file next to each image (see
graphtik_save_dot_files
).
Note
The default is controlled with
GRAPHTIK_DEBUG
environment variable.Note that enabling this flag is different from enabling logging in DEBUG, since it affects all code (eg interactive printing in debugger session, exceptions, doctests), not just debug statements (also affected by this flag).
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.is_marshal_tasks() Optional[bool] [source]¶
(deprecated) see
set_marshal_tasks()
- graphtik.config.operations_endured(enabled=True)[source]¶
Like
set_endure_operations()
as a context-manager, resetting back to old value.See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.operations_reschedullled(enabled=True)[source]¶
Like
set_reschedule_operations()
as a context-manager, resetting back to old value.See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.reset_abort()[source]¶
Reset the abort run global flag, to permit plan executions to proceed.
- graphtik.config.set_debug(enabled)[source]¶
Enable/disable debug-mode.
- Parameters
enabled –
None, False, string(0, false, off, no)
: Disabledanything else: Enable DEBUG
see
is_debug()
- graphtik.config.set_endure_operations(enabled)[source]¶
Enable/disable globally endurance to keep executing even if some operations fail.
- Parameters
enable –
If
None
(default), respect the flag on each operation;If true/false, force it for all operations.
- Returns
a “reset” token (see
ContextVar.set()
)
.
- graphtik.config.set_execution_pool(pool: Optional[Pool])[source]¶
(deprecated) Set the process-pool for parallel plan executions.
You may also have to call set_marshal_tasks() to resolve pickling issues.
- graphtik.config.set_layered_solution(enabled)[source]¶
Whether to store operation results into separate solution layers.
- Parameters
enable – If false/true, it overrides any param given when executing a pipeline or a plan. If None (default), results are layered only if there are NO jsonp dependencies in the network.
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.set_marshal_tasks(enabled)[source]¶
(deprecated) Enable/disable globally the marshalling of parallel operations'
inputs & outputs with
dill
, which might help for pickling problems.- Parameters
enable –
If
None
(default), respect the respective flag on each operation; if true/false, force it for all operations.
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.set_parallel_tasks(enabled)[source]¶
Enable/disable globally parallel execution of operations.
- Parameters
enable –
If
None
(default), respect the respective flag on each operation; if true/false, force it for all operations.
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.set_reschedule_operations(enabled)[source]¶
Enable/disable globally rescheduling for operations returning only partial outputs.
- Parameters
enable –
If
None
(default), respect the flag on each operation; if true/false, force it for all operations.
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.set_skip_evictions(enabled)[source]¶
When true, globally disable evictions, to keep all intermediate solution values,
regardless of asked outputs.
- Returns
a “reset” token (see
ContextVar.set()
)
- graphtik.config.solution_layered(enabled=True)[source]¶
Like
set_layered_solution()
as a context-manager, resetting back to the old value. See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.tasks_in_parallel(enabled=True)[source]¶
(deprecated) Like
set_parallel_tasks()
as a context-manager, resetting back to the old value. See also
disclaimer about context-managers at the top of this
config
module.
- graphtik.config.tasks_marshalled(enabled=True)[source]¶
(deprecated) Like
set_marshal_tasks()
as a context-manager, resetting back to the old value. See also
disclaimer about context-managers at the top of this
config
module.
Module: base¶
Generic utilities, exceptions and operation & plottable base classes.
- exception graphtik.base.AbortedException[source]¶
Raised from Network when
abort_run()
is called, and contains the solution with any values populated so far.
- exception graphtik.base.IncompleteExecutionError[source]¶
Reported when any endured/rescheduled operations have been canceled.
The exception contains 3 arguments:
the causal errors and conditions (1st arg),
the list of collected exceptions (2nd arg), and
the solution instance (3rd argument), to interrogate for more.
Returned by
check_if_incomplete()
or raised by scream_if_incomplete()
.
- class graphtik.base.Operation[source]¶
An abstract class representing an action with
compute()
.- abstract compute(named_inputs, outputs=None, recompute_from=None, *kw)[source]¶
Compute (optional) asked outputs for the given named_inputs.
It is called by
Network
. End-users should simply call the operation with named_inputs as kwargs.- Parameters
named_inputs – the input values with which to feed the computation.
outputs – what results to compute, see
Pipeline.compute()
.recompute_from – recompute all downstream from those dependencies, see
Pipeline.compute()
.
- Returns list
Should return a list of values representing the results of running the feed-forward computation on
inputs
.
- class graphtik.base.PlotArgs(plottable: Plottable = None, graph: nx.Graph = None, name: str = None, steps: Collection = None, inputs: Collection = None, outputs: Collection = None, solution: graphtik.planning.Solution = None, clusters: Mapping = None, plotter: graphtik.plot.Plotter = None, theme: graphtik.plot.Theme = None, dot: pydot.Dot = None, nx_item: Any = None, nx_attrs: dict = None, dot_item: Any = None, clustered: dict = None, jupyter_render: Mapping = None, filename: Union[str, bool, int] = None)[source]¶
All the args of a
Plottable.plot()
call; check this method for a more detailed explanation of its attributes.
- clone_or_merge_graph(base_graph) PlotArgs [source]¶
Overlay
graph
over base_graph, or clone base_graph, if no attribute.- Returns
the updated plot_args
- property clustered¶
Collect the actual clustered dot_nodes among the given nodes.
- property clusters¶
Either a mapping of node-names to dot(
.
)-separated cluster-names, or false/true to enable plotter’s default clustering of nodes based on their dot-separated name parts.Note that if it’s None (default), the plotter will cluster based on node-names, BUT the Plan may replace the None with a dictionary with the “pruned” cluster (when its dag differs from network’s graph); to suppress the pruned-cluster, pass a truthy, NON-dictionary value.
- property dot¶
Where to add graphviz nodes & stuff.
- property dot_item¶
The pydot-node/edge created
- property filename¶
where to write image or show in a matplotlib window
- property graph¶
what to plot (or the “overlay” when calling
Plottable.plot()
)
- property inputs¶
the list of input names.
- property jupyter_render¶
jupyter configuration overrides
- property name¶
The name of the graph in the dot-file (important for cmaps).
- property nx_attrs¶
Attributes gotten from nx-graph for the given graph/node/edge. They are NOT a clone, so any modifications affect the nx graph.
- property outputs¶
the list of output names.
- property plottable¶
who is the caller
- property plotter¶
If given, overrides the active plotter.
- property steps¶
the list of execution plan steps.
- property theme¶
If given, overrides the plot theme the plotter will use. It can be any mapping, in which case it overrides the current theme.
- class graphtik.base.Plottable[source]¶
plottable capabilities and graph props for all major classes of the project.
Classes wishing to plot their graphs should inherit this and implement property
plot
to return a “partial” callable that somehow ends up calling plot.render_pydot()
with the graph or any other args bound appropriately. The purpose is to avoid duplicating this function & its documentation all around.- find_op_by_name(name) Optional[Operation] [source]¶
Fetch the 1st operation named with the given name.
- find_ops(predicate) List[Operation] [source]¶
Scan operation nodes and fetch those satisfying predicate.
- Parameters
predicate – the node predicate is a 2-argument callable(op, node-data) that should return true for nodes to include.
- graph: networkx.Graph[source]¶
- plot(filename: Union[str, bool, int] = None, show=None, *, plotter: graphtik.plot.Plotter = None, theme: graphtik.plot.Theme = None, graph: networkx.Graph = None, name=None, steps=None, inputs=None, outputs=None, solution: graphtik.planning.Solution = None, clusters: Mapping = None, jupyter_render: Union[None, Mapping, str] = None) pydot.Dot [source]¶
Entry-point for plotting ready-made operation graphs.
- Parameters
filename (str) –
Write a file or open a matplotlib window.
If it is a string or file, the diagram is written into the file-path
Common extensions are
.png .dot .jpg .jpeg .pdf .svg
call plot.supported_plot_formats()
for more. If it IS True, opens the diagram in a matplotlib window (requires the matplotlib package to be installed).
If it equals -1, it plots with matplotlib but does not open the window.
Otherwise, just return the
pydot.Dot
instance.
plottable –
the plottable that ordered the plotting; automatically set downstream to one of:
op | pipeline | net | plan | solution | <missing>
plotter –
the plotter to handle plotting; if none, the active plotter is used by default.
theme –
Any plot theme or dictionary overrides; if none, the
Plotter.default_theme
of the active plotter is used.
name –
if not given, the dot-lang graph is named “G”; it must be unique when referring to generated CMAPs. No need to quote it, that is handled by the plotter, downstream.
graph (str) –
(optional) A
nx.DiGraph
with overrides to merge with the graph provided by underlying plottables (translated by the active plotter).It may contain graph, node & edge attributes for any usage, but these conventions apply:
'graphviz.xxx'
(graph/node/edge attributes) Any “user-overrides” with this prefix are sent verbatim as Graphviz attributes.
Note
Remember to escape those values as Graphviz HTML-Like strings (use
plot.graphviz_html_string()
).no_plot
(node/edge attribute) element skipped from plotting (see “Examples:” section, below)
inputs –
an optional name list, any nodes in there are plotted as a “house”
outputs –
an optional name list, any nodes in there are plotted as an “inverted-house”
solution –
an optional dict with values to annotate nodes, drawn “filled” (currently the content is not shown, but the node is drawn “filled”). It extracts more info from a
Solution
instance, such as, if solution has anexecuted
attribute, operations contained in it are drawn as “filled”.
clusters –
Either a mapping, or false/true to enable the plotter’s default clustering of nodes based on their dot-separated name parts.
Note that if it’s None (default), the plotter will cluster based on node-names, BUT the Plan may replace the None with a dictionary with the “pruned” cluster (when its dag differs from network’s graph); to suppress the pruned-cluster, pass a truthy, NON-dictionary value.
Practically, when it is a:
dictionary of node-names –> dot(
.
)-separated cluster-names, it is respected, even if empty; truthy: cluster based on dot(
.
)-separated node-name parts; falsy: don’t cluster at all.
jupyter_render –
a nested dictionary controlling the rendering of graph-plots in Jupyter cells, if None, defaults to
jupyter_render
; you may modify it in place and apply for all future calls (see Jupyter notebooks).
show –
Deprecated since version v6.1.1: Merged with filename param (filename takes precedence).
- Returns
a
pydot.Dot
instance (for reference to a similar API to the pydot.Dot
instance, visit: https://pydotplus.readthedocs.io/reference.html#pydotplus.graphviz.Dot). The
pydot.Dot
instance returned is rendered directly in Jupyter/IPython notebooks as SVG images (see Jupyter notebooks).
Note that the graph argument is absent: each Plottable provides its own graph internally; directly use
render_pydot()
to provide a different graph. NODES:
- oval
function
- egg
subgraph operation
- house
given input
- inversed-house
asked output
- polygon
given both as input & asked as output (what?)
- square
intermediate data, neither given nor asked.
- red frame
evict-instruction, to free up memory.
- filled
data node has a value in solution OR function has been executed.
- thick frame
function/data node in execution steps.
ARROWS
- solid black arrows
dependencies (source-data needed by target-operations, source-operations provide target-data)
- dashed black arrows
optional needs
- blue arrows
sideffect needs/provides
- wheat arrows
broken dependency (
provide
) during pruning- green-dotted arrows
execution steps labeled in succession
To generate the legend, see
legend()
.Examples:
>>> from graphtik import compose, operation
>>> from graphtik.modifier import optional
>>> from operator import add
>>> pipeline = compose("pipeline",
...     operation(name="add", needs=["a", "b1"], provides=["ab1"])(add),
...     operation(name="sub", needs=["a", optional("b2")], provides=["ab2"])(lambda a, b=1: a-b),
...     operation(name="abb", needs=["ab1", "ab2"], provides=["asked"])(add),
... )
>>> pipeline.plot(True);           # plot just the graph in a matplotlib window  # doctest: +SKIP
>>> inputs = {'a': 1, 'b1': 2}
>>> solution = pipeline(**inputs)  # now plots will include the execution-plan
The solution is also plottable:
>>> solution.plot('plot1.svg'); # doctest: +SKIP
or you may augment the pipeline with the requested inputs/outputs & solution:
>>> pipeline.plot('plot1.svg', inputs=inputs, outputs=['asked', 'b1'], solution=solution); # doctest: +SKIP
In any case you may get the pydot.Dot object (n.b. it is renderable in Jupyter as-is):
>>> dot = pipeline.plot(solution=solution);
>>> print(dot)
digraph pipeline { fontname=italic; label=<pipeline>; node [fillcolor=white, style=filled]; <a> [fillcolor=wheat, fixedsize=shape, label=<<TABLE CELLBORDER="0" CELLSPACING="0" BORDER="0"> ...
You may use the
PlotArgs.graph
overlay to skip certain nodes (or edges) from the plots:>>> import networkx as nx
>>> g = nx.DiGraph()  # the overlay
>>> to_hide = pipeline.net.find_op_by_name("sub")
>>> g.add_node(to_hide, no_plot=True)
>>> dot = pipeline.plot(graph=g)
>>> assert "<sub>" not in str(dot), str(dot)
- abstract prepare_plot_args(plot_args: PlotArgs) PlotArgs [source]¶
Called by
plot()
to create the nx-graph and other plot-args, e.g. solution. Clone the graph or merge it with the one in the plot_args (see
PlotArgs.clone_or_merge_graph()
). For the rest of the args, prefer
PlotArgs.with_defaults()
over_replace()
, not to override user args.
- class graphtik.base.RenArgs(typ: str, op: Operation, name: str, parent: Pipeline = None)[source]¶
Arguments received by callbacks in
rename()
and operation nesting.- property name¶
Alias for field number 2
- property op¶
the operation currently being processed
- property parent¶
The parent
Pipeline
of the operation currently being processed. It has a value only when doing operation nesting from compose()
.
- class graphtik.base.Token(s)[source]¶
Guarantee equality, not(!) identity, across processes.
- hashid¶
- graphtik.base.aslist(i, argname, allowed_types=<class 'list'>)[source]¶
Utility to accept singular strings as lists, and None –> [].
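A hedged sketch of that documented behavior (a simplification; `argname` and `allowed_types` are used for validation/error messages in the real utility):

```python
# Hedged sketch of aslist()'s documented behavior: a single string becomes a
# one-element list, None becomes [], other iterables pass through as lists.
def aslist(i, argname, allowed_types=list):
    if i is None:
        return []
    if isinstance(i, str):
        return [i]
    return list(i)

assert aslist(None, "needs") == []
assert aslist("a", "needs") == ["a"]
assert aslist(("a", "b"), "needs") == ["a", "b"]
```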
- graphtik.base.first_solid(*tristates, default=None)[source]¶
Utility combining multiple tri-state booleans.
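A minimal sketch of a tri-state combiner like this (an assumption about its semantics: return the first non-None, i.e. "solid", value, else the default):

```python
# Hedged sketch of a tri-state combiner like first_solid(): scan the
# tri-state booleans (None/False/True) and return the first "solid"
# (non-None) one, else the default.
def first_solid(*tristates, default=None):
    for v in tristates:
        if v is not None:
            return v
    return default

assert first_solid(None, None, False, True) is False
assert first_solid(None, None, default=True) is True
```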
- graphtik.base.func_name(fn, default=Ellipsis, mod=None, fqdn=None, human=None, partials=None) Optional[str] [source]¶
FQDN of fn, descending into partials to print their args.
- Parameters
default – What to return if it fails; by default it raises.
mod – when true, prepend module like
module.name.fn_name
fqdn – when true, use
__qualname__
(instead of__name__
) which differs mostly on methods, where it contains class(es), and locals, respectively (PEP 3155). Sphinx uses fqdn=True for generating IDs.human – when true, explain built-ins, and assume
partials=True
(if that was None)partials – when true (or omitted & human true), partials denote their args like
fn({"a": 1}, ...)
- Returns
a (possibly dot-separated) string, or default (unless this is
...
).- Raises
Only if default is
...
, otherwise, errors debug-logged.
Examples
>>> func_name(func_name)
'func_name'
>>> func_name(func_name, mod=1)
'graphtik.base.func_name'
>>> func_name(func_name.__format__, fqdn=0)
'__format__'
>>> func_name(func_name.__format__, fqdn=1)
'function.__format__'
Even functions defined in docstrings are reported:
>>> def f():
...     def inner():
...         pass
...     return inner
>>> func_name(f, mod=1, fqdn=1)
'graphtik.base.f'
>>> func_name(f(), fqdn=1)
'f.<locals>.inner'
On failures, arg default controls the outcomes:
TBD
Module: jetsam¶
jetsam utility for annotating exceptions from locals()
like PEP 678
PY3.11 exception-notes.
- class graphtik.jetsam.Jetsam(*args, **kwargs)[source]¶
The jetsam is a dict with items accessed also as attributes.
From https://stackoverflow.com/a/14620633/548792
- log_n_plot(plot=None) Path [source]¶
Log collected items, and plot the 1st plottable into a temp-file, if the DEBUG flag is enabled.
- Parameters
plot – override DEBUG-flag if given (true, plots, false not)
- Returns
the name of the temp-file, also ERROR-logged along with the rest of the jetsam
- graphtik.jetsam.save_jetsam(ex, locs, *salvage_vars: str, annotation='jetsam', **salvage_mappings)[source]¶
Annotate exception with salvaged values from locals(), log, (if DEBUG flag) plot.
- Parameters
ex – the exception to annotate
locs –
locals()
from the context-manager’s block, containing vars to be salvaged in case of exception. ATTENTION: the wrapped function must finally call
locals()
, because the locals dictionary only reflects local-var changes after the call. annotation – the name of the attribute to attach on the exception
salvage_vars – local variable names to save as is in the salvaged annotations dictionary.
salvage_mappings – a mapping of destination-annotation-keys –> source-locals-keys; if a source is callable, the value to salvage is retrieved by calling
value(locs)
. They take precedence over salvage_vars.
- Returns
the
Jetsam
annotation, also attached on the exception- Raises
any exception raised by the wrapped function, annotated with values assigned as attributes on this context-manager
Any attributes attached on this manager are attached as a new dict on the raised exception as new
jetsam
attribute with a dict as value.If the exception is already annotated, any new items are inserted, but existing ones are preserved.
If DEBUG flag is enabled, plots the 1st found errored in order solution/plan/pipeline/net, and log its path.
Example:
Call it with managed-block’s
locals()
and tell which of them to salvage in case of errors:
>>> try:
...     a = 1
...     b = 2
...     raise Exception("trouble!")
... except Exception as ex:
...     save_jetsam(ex, locals(), "a", b="salvaged_b", c_var="c")
...     raise
Traceback (most recent call last):
Exception: trouble!
And then from a REPL:
>>> import sys
>>> sys.exc_info()[1].jetsam  # doctest: +SKIP
{'a': 1, 'salvaged_b': 2, "c_var": None}
Note
In order not to obfuscate the landing position of post-mortem debuggers in the case of errors, use the
try-finally
withok
flag pattern:
>>> ok = False
>>> try:
...     pass  # do risky stuff
...     ok = True  # last statement in the try-body.
... except Exception as ex:
...     if not ok:
...         ex = sys.exc_info()[1]
...         save_jetsam(...)
Reason:
Graphs may become arbitrary deep. Debugging such graphs is notoriously hard.
The purpose is not to require a debugger-session to inspect the root-causes (without precluding one).
Module: jsonpointer¶
Utility for json pointer path modifier
Copied from pypi/pandalone.
- exception graphtik.jsonpointer.ResolveError[source]¶
A
KeyError
raised when a json-pointer does not resolve
.- property part¶
the part where the resolution broke
- property path¶
the json-pointer that failed to resolve
- graphtik.jsonpointer.collection_popper(doc: Collection, part, do_pop)[source]¶
Resolve part in doc, or pop it with default if do_pop.
- graphtik.jsonpointer.collection_scouter(doc: Doc, key, mother, overwrite, concat_axis) Tuple[Any, Optional[Doc]] [source]¶
Get item key from the doc collection, or create a new one from mother.
- Parameters
mother – factory producing the child containers to extend missing steps, or the “child” value (when overwrite is true).
- Returns
a 2-tuple (child, doc) where doc is not None if it needs to be replaced in its parent container (e.g. due to df-concat with value).
- graphtik.jsonpointer.contains_path(doc: Doc, path: Union[str, Iterable[str]], root: Doc = '%%UNSET%%', descend_objects=True) bool [source]¶
Test if doc has a value for json-pointer path by calling
resolve_path()
.
- graphtik.jsonpointer.escape_jsonpointer_part(part: str) str [source]¶
convert path-part according to the json-pointer standard
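The escaping follows RFC 6901; a minimal sketch of what the escape/unescape pair does (`escape_part`/`unescape_part` are illustrative names, not the module's API):

```python
# Minimal sketch of RFC 6901 part escaping, which escape_jsonpointer_part() /
# unescape_jsonpointer_part() implement: "~" -> "~0" and "/" -> "~1";
# when unescaping, "~1" must be decoded before "~0".
def escape_part(part: str) -> str:
    return part.replace("~", "~0").replace("/", "~1")

def unescape_part(part: str) -> str:
    return part.replace("~1", "/").replace("~0", "~")

assert escape_part("a/b~c") == "a~1b~0c"
assert unescape_part("a~1b~0c") == "a/b~c"
```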
- graphtik.jsonpointer.json_pointer(parts: Sequence[str]) str [source]¶
Escape & join parts into a jsonpointer path (inverse of
jsonp_path()
).Examples:
>>> json_pointer(["a", "b"])
'a/b'
>>> json_pointer(['', "a", "b"])
'/a/b'
>>> json_pointer([1, "a", 2])
'1/a/2'
>>> json_pointer([""])
''
>>> json_pointer(["a", ""])
''
>>> json_pointer(["", "a", "", "b"])
'/b'
>>> json_pointer([])
''
>>> json_pointer(["/", "~"])
'~1/~0'
- graphtik.jsonpointer.jsonp_path(jsonpointer: str) List[str] [source]¶
Generates the path parts according to jsonpointer spec.
- Parameters
path – a path to resolve within document
- Returns
The parts of the path (as a generator), without converting any step to int, and None if None. (The 1st step of absolute-paths is always
''
)
In order to support relative & absolute paths along with a sensible
set_path_value()
, it departs from the standard in these aspects:A double slash or a slash at the end of the path restarts from the root.
- Author
Julian Berman, ankostis
Examples:
>>> jsonp_path('a')
['a']
>>> jsonp_path('a/')
['']
>>> jsonp_path('a/b')
['a', 'b']
>>> jsonp_path('/a')
['', 'a']
>>> jsonp_path('/')
['']
>>> jsonp_path('')
[]
>>> jsonp_path('a/b//c')
['', 'c']
>>> jsonp_path('a//b////c')
['', 'c']
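A hedged sketch reproducing the documented outputs above (the real function also handles None, avoids int conversion, and returns a generator; this toy returns a list):

```python
# Sketch of jsonp_path()'s documented behavior: split on "/", unescape each
# part, and restart from the root whenever a double-slash or trailing slash
# is met (a '' first part marks an absolute path).
def jsonp_path(jsonpointer: str):
    if jsonpointer == "":
        return []
    parts = []
    for part in jsonpointer.split("/"):
        if part == "" and parts:
            parts = [""]          # '//' or trailing '/': restart from root
        else:
            parts.append(part.replace("~1", "/").replace("~0", "~"))
    return parts

assert jsonp_path("a/b//c") == ["", "c"]
assert jsonp_path("/a") == ["", "a"]
assert jsonp_path("a/") == [""]
assert jsonp_path("") == []
```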
- graphtik.jsonpointer.list_popper(doc: MutableSequence, part, do_pop)[source]¶
Call
collection_popper()
with integer part.
- graphtik.jsonpointer.list_scouter(doc: Doc, idx, mother, overwrite) Tuple[Any, Optional[Doc]] [source]¶
Get doc list item by (int) idx, or create a new one from mother.
- Parameters
mother – factory producing the child containers to extend missing steps, or the “child” value (when overwrite is true).
- Returns
a 2-tuple (child,
None
)
NOTE: must come after collection-scouter due to special
-
index collision.
- graphtik.jsonpointer.object_popper(doc: Collection, part, do_pop)[source]¶
Resolve part in doc attributes, or
delattr
it, returning its value or default.
- graphtik.jsonpointer.object_scouter(doc: Doc, attr, mother, overwrite) Tuple[Any, Optional[Doc]] [source]¶
Get attribute attr in doc object, or create a new one from mother.
- Parameters
mother – factory producing the child containers to extend missing steps, or the “child” value (when overwrite is true).
- Returns
a 2-tuple (child,
None
)
- graphtik.jsonpointer.pop_path(doc: Doc, path: Union[str, Iterable[str]], default='%%UNSET%%', root: Doc = '%%UNSET%%', descend_objects=True)[source]¶
Delete and return the item referenced by json-pointer path from the nested doc.
- Parameters
doc – the current document to start searching path (which may be different than root)
path –
An absolute or relative json-pointer expression to resolve within doc document (or just the unescaped steps).
Attention
Relative paths DO NOT support the json-pointer extension https://tools.ietf.org/id/draft-handrews-relative-json-pointer-00.html
default – the value to return if path does not resolve; by default, it raises.
root – From where to start resolving absolute paths or double-slashes(
//
), or final slashes. IfNone
, only relative paths allowed; by default, the given doc is assumed as root (so absolute paths are also accepted).descend_objects – If true, a last ditch effort is made for each part, whether it matches the name of an attribute of the parent item.
- Returns
the deleted item in doc, or default if given and path didn’t exist
- Raises
ResolveError – if path cannot resolve and no default given
ValueError – if path was an absolute path and a
None
root had been given.
See
resolve_path()
for departures from the json-pointer standardExamples:
>>> dt = {
...     'pi':3.14,
...     'foo':'bar',
...     'df': pd.DataFrame(np.ones((3,2)), columns=list('VN')),
...     'sub': {
...         'sr': pd.Series({'abc':'def'}),
...     }
... }
>>> resolve_path(dt, '/pi', default=UNSET)
3.14
>>> resolve_path(dt, 'df/V')
0    1.0
1    1.0
2    1.0
Name: V, dtype: float64
>>> resolve_path(dt, '/pi/BAD', 'Hi!')
'Hi!'
- Author
Julian Berman, ankostis
- graphtik.jsonpointer.prepend_parts(prefix_parts: Sequence[str], parts: Sequence[str]) Sequence[str] [source]¶
Prepend prefix_parts before given parts (unless they are rooted).
Both parts & prefix_parts must have been produced by
json_path()
so that any root(""
) must come first, and must not be empty (except prefix-parts).Examples:
>>> prepend_parts(["prefix"], ["b"])
['prefix', 'b']
>>> prepend_parts(("", "prefix"), ["b"])
['', 'prefix', 'b']
>>> prepend_parts(["prefix ignored due to rooted"], ("", "b"))
('', 'b')
>>> prepend_parts([], ["b"])
['b']
>>> prepend_parts(["prefix irrelevant"], [])
Traceback (most recent call last):
IndexError: list index out of range
- graphtik.jsonpointer.resolve_path(doc: Doc, path: Union[str, Iterable[str]], default='%%UNSET%%', root: Doc = '%%UNSET%%', descend_objects=True)[source]¶
Resolve roughly like a json-pointer path within the referenced doc.
- Parameters
doc – the current document to start searching path (which may be different than root)
path –
An absolute or relative json-pointer expression to resolve within doc document (or just the unescaped steps).
Attention
Relative paths DO NOT support the json-pointer extension https://tools.ietf.org/id/draft-handrews-relative-json-pointer-00.html
default – the value to return if path does not resolve; by default, it raises.
root – From where to start resolving absolute paths or double-slashes(
//
), or final slashes. IfNone
, only relative paths allowed; by default, the given doc is assumed as root (so absolute paths are also accepted).descend_objects – If true, a last ditch effort is made for each part, whether it matches the name of an attribute of the parent item.
- Returns
the resolved doc-item
- Raises
ResolveError – if path cannot resolve and no default given
ValueError – if path was an absolute path and a
None
root had been given.
In order to support relative & absolute paths along with a sensible
set_path_value()
, it departs from the standard in these aspects:Supports also relative paths (but not the official extension).
For arrays, it tries 1st as an integer, and then falls back to normal indexing (useful when accessing pandas).
A
/
path does not return the value of the empty ('') key but the whole document (aka the “root”). A double slash or a slash at the end of the path restarts from the root.
Examples:
>>> dt = {
...     'pi':3.14,
...     'foo':'bar',
...     'df': pd.DataFrame(np.ones((3,2)), columns=list('VN')),
...     'sub': {
...         'sr': pd.Series({'abc':'def'}),
...     }
... }
>>> resolve_path(dt, '/pi')
3.14
>>> resolve_path(dt, 'df/V')
0    1.0
1    1.0
2    1.0
Name: V, dtype: float64
>>> resolve_path(dt, '/pi/BAD')
Traceback (most recent call last):
graphtik.jsonpointer.ResolveError: Failed resolving step (#2) "BAD" of path '/pi/BAD'. Check debug logs.
>>> resolve_path(dt, '/pi/BAD', 'Hi!')
'Hi!'
- Author
Julian Berman, ankostis
- graphtik.jsonpointer.set_path_value(doc: ~graphtik.jsonpointer.Doc, path: ~typing.Union[str, ~typing.Iterable[str]], value, container_factory=<class 'dict'>, root: ~graphtik.jsonpointer.Doc = '%%UNSET%%', descend_objects=True, concat_axis: ~typing.Optional[int] = None)[source]¶
Set value into a jsonp path within the referenced doc.
Special treatment (i.e. concat) when a DataFrame must be inserted into a DataFrame, with steps
.
(vertical) and-
(horizontal) denoting concatenation axis.- Parameters
doc – the document to extend & insert value
path –
An absolute or relative json-pointer expression to resolve within doc document (or just the unescaped steps).
For sequences (arrays), it supports the special index dash(
-
) char, to refer to the position beyond the last item, as per the spec; BUT it does not raise, it always adds a new item. Attention
Relative paths DO NOT support the json-pointer extension https://tools.ietf.org/id/draft-handrews-relative-json-pointer-00.html
container_factory – a factory producing the container to extend missing steps (usually a mapping or a sequence).
root – From where to start resolving absolute paths, double-slashes(
//
) or final slashes. IfNone
, only relative paths allowed; by default, the given doc is assumed as root (so absolute paths are also accepted).descend_objects – If true, a last ditch effort is made for each part, whether it matches the name of an attribute of the parent item.
concat_axis – if 0 or 1, applies pandas concatenation vertically or horizontally, by clipping the last step when traversing it and doc & value are both Pandas objects.
- Raises
if the jsonpointer is empty, missing, or has invalid content
the changed given doc/root (e.g. due to being concat-ed with value)
See
resolve_path()
for departures from the json-pointer standard
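A simplified sketch of the path-extending behavior for plain nested dicts only (no pandas concat, no "-" array index, no escaping; the crude split below stands in for jsonp_path()):

```python
# Simplified sketch of set_path_value() for nested dicts: walk the path,
# creating missing containers with container_factory, and assign the value
# at the last step.
def set_path_value(doc: dict, path: str, value, container_factory=dict):
    parts = [p for p in path.split("/") if p]   # crude stand-in for jsonp_path()
    *steps, last = parts
    for step in steps:
        doc = doc.setdefault(step, container_factory())
    doc[last] = value

d = {}
set_path_value(d, "/a/b/c", 1)
assert d == {"a": {"b": {"c": 1}}}
set_path_value(d, "/a/b/d", 2)
assert d["a"]["b"] == {"c": 1, "d": 2}
```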
- graphtik.jsonpointer.unescape_jsonpointer_part(part: str) str [source]¶
convert path-part according to the json-pointer standard
- graphtik.jsonpointer.update_paths(doc: ~graphtik.jsonpointer.Doc, paths_vals: ~typing.Collection[~typing.Tuple[str, ~typing.Any]], container_factory=<class 'dict'>, root: ~graphtik.jsonpointer.Doc = '%%UNSET%%', descend_objects=True, concat_axis: ~typing.Optional[int] = None) None [source]¶
Mass-update path_vals (jsonp, value) pairs into doc.
Group jsonp-keys by nesting level, to optimize.
- Parameters
concat_axis – None, 0 or 1, see
set_path_value()
.- Returns
the updated doc (if it was a dataframe and
pd.concat
needed)
Package: sphinxext¶
Extends Sphinx with graphtik
directive for plotting from doctest code.
- class graphtik.sphinxext.DocFilesPurgatory[source]¶
Keeps 2-way associations of docs <–> abs-files, to purge them.
- class graphtik.sphinxext.GraphtikDoctestDirective(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶
Embeds plots from doctest code (see
graphtik
).- option_spec = {'align': <function Figure.align>, 'alt': <function unchanged>, 'caption': <function unchanged>, 'class': <function class_option>, 'figclass': <function class_option>, 'figwidth': <function Figure.figwidth_value>, 'graph-format': <function _valid_format_option>, 'graphvar': <function unchanged_required>, 'height': <function length_or_unitless>, 'hide': <function flag>, 'name': <function unchanged>, 'no-trim-doctest-flags': <function flag>, 'options': <function unchanged>, 'pyversion': <function unchanged_required>, 'scale': <function percentage>, 'skipif': <function unchanged_required>, 'target': <function unchanged_required>, 'trim-doctest-flags': <function flag>, 'width': <function length_or_percentage_or_unitless>, 'zoomable': <function _tristate_bool_option>, 'zoomable-opts': <function unchanged>}¶
Mapping of option names to validator functions.
- class graphtik.sphinxext.GraphtikTestoutputDirective(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]¶
Like
graphtik
directive, but emulates doctesttestoutput
blocks.- option_spec = {'align': <function Figure.align>, 'alt': <function unchanged>, 'caption': <function unchanged>, 'class': <function class_option>, 'figclass': <function class_option>, 'figwidth': <function Figure.figwidth_value>, 'graph-format': <function _valid_format_option>, 'graphvar': <function unchanged_required>, 'height': <function length_or_unitless>, 'hide': <function flag>, 'name': <function unchanged>, 'no-trim-doctest-flags': <function flag>, 'options': <function unchanged>, 'pyversion': <function unchanged_required>, 'scale': <function percentage>, 'skipif': <function unchanged_required>, 'target': <function unchanged_required>, 'trim-doctest-flags': <function flag>, 'width': <function length_or_percentage_or_unitless>, 'zoomable': <function _tristate_bool_option>, 'zoomable-opts': <function unchanged>}¶
Mapping of option names to validator functions.