# Version: 2022.10.0
properties:
temporary-directory:
type:
- string
- "null"
description: |
Temporary directory for local disk storage, such as /tmp, /scratch,
or /local. This directory is used during dask spill-to-disk operations.
When the value is "null" (default), dask creates a directory under the
directory from which it was launched: `cwd/dask-worker-space`
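# Example (a minimal sketch, not part of the schema): an entry in a user
# config file such as ~/.config/dask/dask.yaml. The path /scratch/dask is
# hypothetical.
#
#   temporary-directory: /scratch/dask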
visualization:
type: object
properties:
engine:
type:
- string
- 'null'
description: |
Visualization engine to use when calling ``.visualize()`` on a Dask collection.
Currently supports ``'graphviz'``, ``'ipycytoscape'``, and ``'cytoscape'``
(alias for ``'ipycytoscape'``)
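# Example (a minimal sketch, not part of the schema): selecting one of the
# engines listed above in a user config file.
#
#   visualization:
#     engine: graphviz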
tokenize:
type: object
properties:
ensure-deterministic:
type:
- boolean
description: |
If ``true``, tokenize raises an error instead of falling back to UUIDs
when a deterministic token cannot be generated. Defaults to
``false``.
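# Example (a minimal sketch, not part of the schema): making tokenization
# fail loudly rather than fall back to a random token.
#
#   tokenize:
#     ensure-deterministic: true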
dataframe:
type: object
properties:
backend:
type:
- string
- "null"
description: |
Backend to use for supported dataframe-creation functions.
Default is "pandas".
shuffle-compression:
type:
- string
- "null"
description: |
Compression algorithm used for on-disk shuffling. Partd, the library used
for compression, supports ZLib, BZ2, SNAPPY, and BLOSC.
parquet:
type: object
properties:
metadata-task-size-local:
type: integer
description: |
The number of files to handle within each metadata-processing
task when reading a parquet dataset from a LOCAL file system.
Specifying 0 will result in serial execution on the client.
metadata-task-size-remote:
type: integer
description: |
The number of files to handle within each metadata-processing
task when reading a parquet dataset from a REMOTE file system.
Specifying 0 will result in serial execution on the client.
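# Example (a minimal sketch, not part of the schema). It assumes that
# ``parquet`` is nested under ``dataframe``, as the key order here suggests;
# the numbers are illustrative, not the defaults.
#
#   dataframe:
#     parquet:
#       metadata-task-size-local: 256   # files per metadata task, local file system
#       metadata-task-size-remote: 0    # 0 = process metadata serially on the client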
array:
type: object
properties:
backend:
type:
- string
- "null"
description: |
Backend to use for supported array-creation functions.
Default is "numpy".
svg:
type: object
properties:
size:
type: integer
description: |
The size in pixels used when displaying a dask array as an SVG image,
for example when rendering the array in a Jupyter notebook.
slicing:
type: object
properties:
split-large-chunks:
type: [boolean, 'null']
description: |
How to handle large chunks created when slicing Arrays. By default a
warning is produced. Set to ``False`` to silence the warning
and allow large output chunks. Set to ``True`` to silence the
warning and avoid large output chunks.
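# Example (a minimal sketch, not part of the schema). It assumes that
# ``slicing`` is nested under ``array``, as the key order here suggests.
#
#   array:
#     slicing:
#       split-large-chunks: true   # silence the warning and avoid large output chunks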
optimization:
type: object
properties:
annotations:
type: object
properties:
fuse:
type: boolean
description: |
If adjacent blockwise layers have different annotations (e.g., one has
retries=3 and another has retries=4), Dask can attempt to merge those
annotations according to some simple rules: ``retries`` is set to
the max of the layers, ``priority`` is set to the max of the layers,
``resources`` are set to the max of all the resources, and ``workers`` is
set to the intersection of the requested workers. If this setting is
disabled, then adjacent blockwise layers with different annotations
will *not* be fused.
fuse:
type: object
description: Options for Dask's task fusion optimizations
properties:
active:
type: [boolean, 'null']
description: |
Turn task fusion on/off. This option refers to the fusion of a
fully-materialized task graph (not a high-level graph). By default
(None), the active task-fusion option will be treated as ``False``
for Dask-Dataframe collections, and as ``True`` for all other graphs
(including Dask-Array collections).
ave-width:
type: number
minimum: 0
description:
Upper limit for width, where width = num_nodes / height, a good measure
of parallelizability
max-width:
type: [number, 'null']
minimum: 0
description:
Don't fuse if total width is greater than this. Set to null to dynamically
adjust to 1.5 + ave_width * log(ave_width + 1)
max-height:
type: number
minimum: 0
description: Don't fuse more than this many levels
max-depth-new-edges:
type: [number, 'null']
minimum: 0
description:
Don't fuse if new dependencies are added after this many levels.
Set to null to dynamically adjust to ave_width * 1.5.
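# Worked example (a sketch, not part of the schema) of the two dynamic
# defaults described above, assuming ``log`` is the natural logarithm and
# taking a hypothetical ave-width of 2:
#
#   max-width           = 1.5 + 2 * log(2 + 1) = 1.5 + 2 * 1.0986 ≈ 3.70
#   max-depth-new-edges = 2 * 1.5              = 3.0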
subgraphs:
type: [boolean, 'null']
description: |
Set to True to fuse multiple tasks into SubgraphCallable objects. Set to
None to let the default optimizer of individual dask collections decide.
If no collection-specific default exists, None defaults to False.
rename-keys:
type: boolean
description:
Set to true to rename the fused keys with `default_fused_keys_renamer`.
Renaming fused keys can make the graph easier to understand, but it
comes at the cost of additional processing. If false, then the top-most
key will be used. For advanced usage, a function to create the new name
is also accepted.