DistArray API Reference

distarray Package

DistArray: Distributed NumPy-like arrays

Documentation is available in docstrings and online at http://distarray.readthedocs.org.

Check out the examples directory in the source distribution for several example modules and IPython notebooks using DistArray.

__version__ Module

Version information for the DistArray package.

error Module

Exception classes for DistArray errors.

exception distarray.error.ContextError[source]

Bases: distarray.error.DistArrayError

Exception class when a unique Context cannot be found.

exception distarray.error.DistArrayError[source]

Bases: exceptions.Exception

Base exception class for DistArray errors.

exception distarray.error.DistributionError[source]

Bases: distarray.error.DistArrayError

Exception class when inconsistent distributions are used.

exception distarray.error.InvalidCommSizeError[source]

Bases: distarray.error.MPIDistArrayError

Exception class when a requested communicator is too large.

exception distarray.error.InvalidRankError[source]

Bases: distarray.error.MPIDistArrayError

Exception class when an invalid rank is used in a communicator.

exception distarray.error.MPIDistArrayError[source]

Bases: distarray.error.DistArrayError

Base exception class for MPI distribution errors.

metadata_utils Module

Utility functions for dealing with DistArray metadata.

exception distarray.metadata_utils.GridShapeError[source]

Bases: exceptions.Exception

Exception class when it is not possible to distribute the processes over the number of dimensions.

exception distarray.metadata_utils.InvalidGridShapeError[source]

Bases: exceptions.Exception

Exception class when the grid shape is incompatible with the distribution or communicator.

distarray.metadata_utils.arg_kwarg_proxy_converter(args, kwargs, module_name='__main__')[source]
distarray.metadata_utils.block_cyclic_size(dim_data)[source]

Get a size from a block-cyclic dim_data.

distarray.metadata_utils.block_size(dim_data)[source]

Get a size from a block distributed dim_data.

distarray.metadata_utils.c_or_bc_chooser(dim_data)[source]

Get a size from a cyclic or block-cyclic dim_data.

distarray.metadata_utils.check_grid_shape_postconditions(grid_shape, shape, dist, comm_size)[source]

Check grid_shape for reasonableness after creating it.

distarray.metadata_utils.check_grid_shape_preconditions(shape, dist, comm_size)[source]

Verify various distarray parameters are correct before making a grid_shape.

distarray.metadata_utils.condense(intervals)[source]
distarray.metadata_utils.cyclic_size(dim_data)[source]

Get a size from a cyclic dim_data.

distarray.metadata_utils.distribute_block_indices(dd)[source]

Fill in start and stop in dim dict dd.

distarray.metadata_utils.distribute_cyclic_indices(dd)[source]

Fill in start in dim dict dd.

distarray.metadata_utils.distribute_indices(dd)[source]

Fill in index related keys in dim dict dd.

distarray.metadata_utils.make_grid_shape(shape, dist, comm_size)[source]

Generate a grid_shape from shape tuple and dist tuple.

Does not assume that dim_data has proc_grid_size set for each dimension.

Attempts to allocate processes optimally for distributed dimensions.

Parameters:
  • shape (tuple of int) – The global shape of the array.
  • dist (tuple of str) – dist_type character per dimension.
  • comm_size (int) – Total number of processes to distribute.
Returns:

dist_grid_shape

Return type:

tuple of int

Raises:

GridShapeError – if not possible to distribute comm_size processes over number of dimensions.

distarray.metadata_utils.ndim_from_flat(flat, strides)[source]
distarray.metadata_utils.non_dist_size(dim_data)[source]

Get a size from a nondistributed dim_data.

distarray.metadata_utils.normalize_dim_dict(dd)[source]

Fill out some degenerate dim_dicts.

distarray.metadata_utils.normalize_dist(dist, ndim)[source]

Return a tuple containing dist-type for each dimension.

Parameters:
  • dist (str, list, tuple, or dict) –
  • ndim (int) –
Returns:

Contains string distribution type for each dim.

Return type:

tuple of str

Examples

>>> normalize_dist({0: 'b', 3: 'c'}, 4)
('b', 'n', 'n', 'c')
distarray.metadata_utils.normalize_grid_shape(grid_shape, shape, dist, comm_size)[source]

Adds 1s to grid_shape so it has ndims dimensions. Validates grid_shape tuple against the dist tuple and comm_size.

distarray.metadata_utils.normalize_reduction_axes(axes, ndim)[source]
distarray.metadata_utils.positivify(index, size)[source]

Check that an index is within bounds and return a positive version.

Parameters:
  • index (Integral or slice) –
  • size (Integral) –
Raises:

IndexError – for out-of-bounds indices

distarray.metadata_utils.sanitize_indices(indices, ndim=None, shape=None)[source]

Classify and sanitize indices.

  • Wrap naked Integral, slice, or Ellipsis indices into tuples
  • Classify result as ‘value’ or ‘view’
  • Expand Ellipsis objects to slices
  • If the length of the tuple-ized indices is < ndim (and it’s provided), add slice(None)’s to indices until indices is ndim long
  • If shape is provided, call positivify on the indices
Raises:
  • TypeError – If indices is other than Integral, slice or a Sequence of these
  • IndexError – If len(indices) > ndim
Returns:

Return type:

2-tuple of (str, n-tuple of slices and Integral values)

distarray.metadata_utils.shapes_from_dim_data_per_rank(ddpr)[source]

Given a dim_data_per_rank object, return the shapes of the localarrays. This requires no communication.

distarray.metadata_utils.size_chooser(dist_type)[source]

Get a function from a dist_type.

distarray.metadata_utils.size_from_dim_data(dim_data)[source]

Get a size from a dim_data.

distarray.metadata_utils.strides_from_shape(shape)[source]
distarray.metadata_utils.tuple_intersection(t0, t1)[source]

Compute intersection of a (start, stop, step) and a (start, stop) tuple.

Assumes all values are positive.

Parameters:
  • t0 (2-tuple or 3-tuple) – Tuple of (start, stop, [step]) representing an index range
  • t1 (2-tuple) – Tuple of (start, stop) representing an index range
Returns:

A tightly bounded interval.

Return type:

3-tuple or None

distarray.metadata_utils.unstructured_size(dim_data)[source]

Get a size from an unstructured dim_data.

mpi_engine Module

The engine_loop function and utilities necessary for it.

class distarray.mpi_engine.Engine[source]

Bases: object

INTERCOMM = None
builtin_call(msg)[source]
delete(msg)[source]
engine_make_targets_comm(msg)[source]
execute(msg)[source]
free_comm(msg)[source]
func_call(msg)[source]
is_engine()[source]
kill(msg)[source]

Break out of the engine loop.

parse_msg(msg)[source]
pull(msg)[source]
push(msg)[source]

mpionly_utils Module

Utilities for running Distarray in MPI mode.

distarray.mpionly_utils.get_comm_world()[source]
distarray.mpionly_utils.get_world_rank()[source]
distarray.mpionly_utils.initial_comm_setup()[source]

Setup client and engine intracomm, and intercomm.

distarray.mpionly_utils.is_solo_mpi_process()[source]
distarray.mpionly_utils.make_targets_comm(targets)[source]
distarray.mpionly_utils.push_function(context, key, func, targets=None)[source]

run_tests Module

Functions for running DistArray tests.

distarray.run_tests.test()[source]

Run all DistArray tests.

testing Module

Functions used for tests.

class distarray.testing.BaseContextTestCase(methodName='runTest')[source]

Bases: unittest.case.TestCase

Base test class for test cases that use a Context.

Overload the ntargets class attribute to change the default number of engines required. A cls.context object will be created with targets=range(cls.ntargets). Tests will be skipped if there are too few targets.

ntargets

int or ‘any’, default=4

If an int, indicates how many engines are required for this test to run. If the string ‘any’, indicates that any number of engines may be used with this test.

ntargets = 4
classmethod setUpClass()[source]
classmethod tearDownClass()[source]
class distarray.testing.CommNullPasser[source]

Bases: type

Metaclass.

Applies the comm_null_passes decorator to every method on a generated class.

class distarray.testing.DefaultContextTestCase(methodName='runTest')[source]

Bases: distarray.testing.BaseContextTestCase

classmethod make_context(targets=None)[source]
class distarray.testing.IPythonContextTestCase(methodName='runTest')[source]

Bases: distarray.testing.BaseContextTestCase

classmethod make_context(targets=None)[source]
classmethod setUpClass()[source]
class distarray.testing.MPIContextTestCase(methodName='runTest')[source]

Bases: distarray.testing.BaseContextTestCase

classmethod make_context(targets=None)[source]
class distarray.testing.ParallelTestCase(methodName='runTest')[source]

Bases: unittest.case.TestCase

Base test class for fully distributed and client-less test cases.

Overload the comm_size class attribute to change the default number of processes required.

comm_size

int, default=4

Indicates how many MPI processes are required for this test to run. If fewer than comm_size are available, the test will be skipped.

comm_size = 4
classmethod setUpClass()[source]
classmethod tearDownClass()[source]
distarray.testing.assert_localarrays_allclose(l0, l1, check_dtype=False, rtol=1e-07, atol=0)[source]

Call np.testing.assert_allclose on l0 and l1.

Also, check that LocalArray properties are equal.

distarray.testing.assert_localarrays_equal(l0, l1, check_dtype=False)[source]

Call np.testing.assert_equal on l0 and l1.

Also, check that LocalArray properties are equal.

distarray.testing.check_targets(required, available)[source]

If available < required, raise a SkipTest with a nice error message.

distarray.testing.comm_null_passes(fn)[source]

Decorator. If self.comm is COMM_NULL, pass.

This allows our tests to pass on processes that have nothing to do.

distarray.testing.import_or_skip(name)[source]

Try importing name, raise SkipTest on failure.

Parameters:name (str) – Module name to try to import.
Returns:module – Module object imported by importlib.
Return type:module object
Raises:unittest.SkipTest – If the attempted import raises an ImportError.

Examples

>>> h5py = import_or_skip('h5py')
>>> h5py.get_config()
<h5py.h5.H5PYConfig at 0x103dd5a78>
distarray.testing.raise_typeerror(fn)[source]

Decorator for protocol validator functions.

These functions return (success, err_msg), but sometimes we would rather have an exception.

distarray.testing.temp_filepath(extension='')[source]

Return a randomly generated filename.

This filename is appended to the directory path returned by tempfile.gettempdir() and has extension appended to it.

utils Module

Utilities.

distarray.utils.all_equal(iterable)[source]

Return True if all elements in iterable are equal.

Also returns True if iterable is empty.

class distarray.utils.count_round_trips(client)[source]

Bases: object

Context manager for counting the number of roundtrips between a IPython client and controller.

Usage:
>>> with count_round_trips(client) as r:
...     send_42_messages()
>>> r.count
42
update_count()[source]
distarray.utils.distarray_random_getstate()[source]

Get the state of the global random number generator.

distarray.utils.distarray_random_setstate(state)[source]

Set the state of the global random number generator.

distarray.utils.divisors_minmax(n, dmin, dmax)[source]

Find the divisors of n in the interval (dmin,dmax].

distarray.utils.flatten(seq, to_expand=<function list_or_tuple>)[source]

Flatten a nested sequence.

distarray.utils.get_from_dotted_name(dotted_name)[source]
distarray.utils.has_exactly_one(iterable)[source]

Does iterable have exactly one non-None element?

distarray.utils.list_or_tuple(seq)[source]

Is the object either a list or a tuple?

distarray.utils.mirror_sort(seq, ref_seq)[source]

Sort seq into the order that ref_seq is in.

>>> mirror_sort(range(5),[1,5,2,4,3])
[0, 4, 1, 3, 2]
distarray.utils.mult_partitions(n, s)[source]

Compute the multiplicative partitions of n of size s

>>> mult_partitions(52,3)
[(2, 2, 13)]
>>> mult_partitions(52,2)
[(2, 26), (4, 13)]
distarray.utils.mult_partitions_recurs(n, s, pd=0)[source]
distarray.utils.multi_for(iterables)[source]
distarray.utils.nonce()[source]
distarray.utils.remove_elements(to_remove, seq)[source]

Return a list, with the elements with specified indices removed.

Parameters:
  • to_remove (iterable) – Indices of elements in list to remove
  • seq (iterable) – Elements in the list.
Returns:

Return type:

List with the specified indices removed.

distarray.utils.set_from_dotted_name(name, val)[source]
distarray.utils.slice_intersection(s1, s2)[source]

Compute a slice that represents the intersection of two slices.

Currently only implemented for steps of size 1.

Parameters:s2 (s1,) –
Returns:
Return type:slice object
distarray.utils.uid()[source]

Get a unique name for a distarray object.