# dask-schema.yaml
properties:

  temporary-directory:
    type:
    - string
    - "null"
    description: |
      Temporary directory for local disk storage, such as /tmp, /scratch,
      or /local.  This directory is used during dask spill-to-disk operations.

      When the value is "null" (default), dask will create a directory in the
      location where dask was launched: `cwd/dask-worker-space`

  visualization:
    type: object
    properties:
      engine:
        type:
          - string
          - 'null'
        description: |
          Visualization engine to use when calling ``.visualize()`` on a Dask collection.
          Currently supports ``'graphviz'``, ``'ipycytoscape'``, and ``'cytoscape'``
          (alias for ``'ipycytoscape'``).

  tokenize:
    type: object
    properties:
      ensure-deterministic:
        type:
        - boolean
        description: |
          If ``true``, tokenize will error instead of falling back to uuids
          when a deterministic token cannot be generated. Defaults to
          ``false``.

  dataframe:
    type: object
    properties:

      backend:
        type:
        - string
        - "null"
        description: |
          Backend to use for supported dataframe-creation functions.
          Default is "pandas".

      shuffle-compression:
        type:
        - string
        - "null"
        description: |
          Compression algorithm used for on-disk shuffling. Partd, the library used
          for compression, supports ZLib, BZ2, SNAPPY, and BLOSC.

      parquet:
        type: object
        properties:

          metadata-task-size-local:
            type: integer
            description: |
              The number of files to handle within each metadata-processing
              task when reading a parquet dataset from a LOCAL file system.
              Specifying 0 will result in serial execution on the client.

          metadata-task-size-remote:
            type: integer
            description: |
              The number of files to handle within each metadata-processing
              task when reading a parquet dataset from a REMOTE file system.
              Specifying 0 will result in serial execution on the client.

  array:
    type: object
    properties:

      backend:
        type:
        - string
        - "null"
        description: |
          Backend to use for supported array-creation functions.
          Default is "numpy".

      svg:
        type: object
        properties:

          size:
            type: integer
            description: |
              The size in pixels used when displaying a dask array as an SVG image.
              This is used, for example, for nice rendering in a Jupyter notebook.

      slicing:
        type: object
        properties:
          split-large-chunks:
            type: [boolean, 'null']
            description: |
              How to handle large chunks created when slicing Arrays. By default a
              warning is produced. Set to ``False`` to silence the warning
              and allow large output chunks. Set to ``True`` to silence the
              warning and avoid large output chunks.

  optimization:
    type: object
    properties:

      annotations:
        type: object
        properties:
          fuse:
            type: boolean
            description: |
              If adjacent blockwise layers have different annotations (e.g., one has
              retries=3 and another has retries=4), Dask can make an attempt to merge
              those annotations according to some simple rules. ``retries`` is set to
              the max of the layers, ``priority`` is set to the max of the layers,
              ``resources`` are set to the max of all the resources, ``workers`` is
              set to the intersection of the requested workers. If this setting is
              disabled, then adjacent blockwise layers with different annotations
              will *not* be fused.

      fuse:
        type: object
        description: Options for Dask's task fusion optimizations
        properties:

          active:
            type: [boolean, 'null']
            description: |
              Turn task fusion on/off. This option refers to the fusion of a
              fully-materialized task graph (not a high-level graph). By default
              (None), the active task-fusion option will be treated as ``False``
              for Dask-Dataframe collections, and as ``True`` for all other graphs
              (including Dask-Array collections).

          ave-width:
            type: number
            minimum: 0
            description:
              Upper limit for width, where width = num_nodes / height, a good measure
              of parallelizability

          max-width:
            type: [number, 'null']
            minimum: 0
            description:
              Don't fuse if total width is greater than this. Set to null to dynamically
              adjust to 1.5 + ave_width * log(ave_width + 1)

          max-height:
            type: number
            minimum: 0
            description: Don't fuse more than this many levels

          max-depth-new-edges:
            type: [number, 'null']
            minimum: 0
            description:
              Don't fuse if new dependencies are added after this many levels.
              Set to null to dynamically adjust to ave_width * 1.5.

          subgraphs:
            type: [boolean, 'null']
            description: |
              Set to True to fuse multiple tasks into SubgraphCallable objects. Set to
              None to let the default optimizer of individual dask collections decide.
              If no collection-specific default exists, None defaults to False.

          rename-keys:
            type: boolean
            description:
              Set to true to rename the fused keys with `default_fused_keys_renamer`.
              Renaming fused keys can keep the graph more understandable, but it
              comes at the cost of additional processing. If False, then the
              top-most key will be used. For advanced usage, a function
              to create the new name is also accepted.
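
# Example (illustrative only): a sketch of a user configuration fragment
# that should validate against the properties above. The values shown are
# assumptions chosen for demonstration, not recommended settings, and the
# location of the configuration file is not specified by this schema.
#
#   temporary-directory: /scratch
#   tokenize:
#     ensure-deterministic: true
#   dataframe:
#     backend: pandas
#     parquet:
#       metadata-task-size-local: 512
#   array:
#     svg:
#       size: 120
#   optimization:
#     fuse:
#       active: null
#       max-height: 10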