local Package¶
local Package¶
construct Module¶
- distarray.local.construct.init_base_comm(comm)¶
Sanitize an MPI.comm instance or create one.
- distarray.local.construct.init_comm(base_comm, grid_shape)¶
Create an MPI communicator with a cartesian topology.
error Module¶
- exception distarray.local.error.DistError¶
- exception distarray.local.error.DistMatrixError¶
- exception distarray.local.error.IncompatibleArrayError¶
- exception distarray.local.error.InvalidBaseCommError¶
- exception distarray.local.error.InvalidDimensionError¶
- exception distarray.local.error.InvalidMapCodeError¶
- exception distarray.local.error.NullCommError¶
format Module¶
Define a simple format for saving LocalArrays to disk with full information about them. This format, .dnpy, draws heavily from the .npy format specification from NumPy and from the data structure defined in the Distributed Array Protocol.
Version numbering¶
The version numbering of this format is independent of DistArray’s and the Distributed Array Protocol’s version numberings.
Format Version 1.0¶
The first 6 bytes are a magic string: exactly \x93DARRY.
The next 1 byte is an unsigned byte: the major version number of the file format, e.g. \x01.
The next 1 byte is an unsigned byte: the minor version number of the file format, e.g. \x00. Note: the version of the file format is not tied to the version of the DistArray package.
The next 2 bytes form a little-endian unsigned short int: the length of the header data HEADER_LEN.
The next HEADER_LEN bytes form the header data describing the distribution of this chunk of the LocalArray. It is an ASCII string which contains a Python literal expression of a dictionary. It is terminated by a newline (\n) and padded with spaces (\x20) to make the total length of magic string + 4 + HEADER_LEN be evenly divisible by 16 for alignment purposes.
The dictionary contains two keys, both described in the Distributed Array Protocol:
- “__version__” : str
- Version of the Distributed Array Protocol used in this header.
- “dim_data” : tuple of dict
- One dictionary per array dimension; see the Distributed Array Protocol for the details of this data structure.
For repeatability and readability, the dictionary keys are sorted in alphabetic order. This is for convenience only. A writer SHOULD implement this if possible. A reader MUST NOT depend on this.
Following this header is the output of numpy.save for the underlying data buffer. This contains the full output of save, beginning with the magic number for .npy files, followed by the .npy header and array data.
The .npy format, including reasons for creating it and a comparison of alternatives, is described fully in the “npy-format” NEP and in the module docstring for numpy.lib.format.
- distarray.local.format.magic(major, minor, prefix=<MagicMock name='mock.asbytes()' id='139659141105168'>)¶
Return the magic string for the given file format version.
Parameters: major (int in [0, 255]) – Returns: magic Return type: str Raises: ValueError – if the version cannot be formatted.
- distarray.local.format.read_array_header_1_0(fp)¶
Read an array header from a filelike object using the 1.0 file format version.
This will leave the file object located just after the header.
Parameters: fp (filelike object) – A file object or something with a .read() method like a file. Returns: - __version__ (str) – Version of the Distributed Array Protocol used.
- dim_data (tuple) – A tuple containing a dictionary for each dimension of the underlying array, as described in the Distributed Array Protocol.
Raises: ValueError – If the data is invalid.
- distarray.local.format.read_localarray(fp)¶
Read a LocalArray from an .dnpy file.
Parameters: fp (file_like object) – If this is not a real file object, then this may take extra memory and time. Returns: distbuffer – The Distributed Array Protocol structure created from the data on disk. Return type: dict Raises: ValueError – If the data is invalid.
- distarray.local.format.read_magic(fp)¶
Read the magic string to get the version of the file format.
Returns: - major (int)
- minor (int)
- distarray.local.format.write_localarray(fp, arr, version=(1, 0))¶
Write a LocalArray to a .dnpy file, including a header.
The __version__ and dim_data keys from the Distributed Array Protocol are written to a header, then numpy.save is used to write the value of the buffer key.
Parameters: - fp (file_like object) – An open, writable file object, or similar object with a .write() method.
- arr (LocalArray) – The array to write to disk.
- version ((int, int), optional) – The version number of the file format. Default: (1, 0)
Raises: - ValueError – If the array cannot be persisted.
- Various other errors – If the underlying numpy array contains Python objects as part of its dtype, the process of pickling them may raise various errors if the objects are not picklable.
localarray Module¶
- class distarray.local.localarray.GlobalIndex(distribution, ndarray)¶
Bases: object
Object which provides access to global indexing on LocalArrays.
- checked_getitem(global_inds)¶
- checked_setitem(global_inds, value)¶
- get_slice(global_inds, new_distribution)¶
- class distarray.local.localarray.GlobalIterator(arr)¶
Bases: distarray.externals.six.Iterator
- class distarray.local.localarray.LocalArray(distribution, dtype=None, buf=None)¶
Bases: object
Distributed memory Python arrays.
- __array_wrap__(obj, context=None)¶
Return a LocalArray based on obj.
This method constructs a new LocalArray object using the distribution from self and the buffer from obj.
This is used to construct return arrays for ufuncs.
- __distarray__()¶
Returns the data structure required by the DAP.
DAP = Distributed Array Protocol
See the project’s documentation for the Protocol’s specification.
- __getitem__(index)¶
Get a local item.
- __setitem__(index, value)¶
Set a local item.
- asdist_like(other)¶
Return a version of self that has shape, dist and grid_shape like other.
- astype(newdtype)¶
Return a copy of this LocalArray with a new underlying dtype.
- cart_coords¶
- comm¶
- comm_rank¶
- comm_size¶
- compatibility_hash()¶
- coords_from_rank(rank)¶
- copy()¶
Return a copy of this LocalArray.
- dim_data¶
- dist¶
- dtype¶
- fill(scalar)¶
- classmethod from_distarray(comm, obj)¶
Make a LocalArray from Distributed Array Protocol data structure.
An object that supports the Distributed Array Protocol will have a __distarray__ method that returns the data structure described here:
https://github.com/enthought/distributed-array-protocol
Parameters: obj (an object with a __distarray__ method or a dict) – If a dict, it must conform to the structure defined by the distributed array protocol. Returns: A LocalArray encapsulating the buffer of the original data. No copy is made. Return type: LocalArray
- global_from_local(local_ind)¶
- global_limits(dim)¶
- global_shape¶
- global_size¶
- grid_shape¶
- itemsize¶
- local_data¶
- local_from_global(global_ind)¶
- local_shape¶
- local_size¶
- local_view(dtype=None)¶
- nbytes¶
- ndarray¶
- ndim¶
- pack_index(inds)¶
- rank_from_coords(coords)¶
- sync()¶
- unpack_index(packed_ind)¶
- view(distribution, dtype)¶
Return a new LocalArray whose underlying ndarray is a view on self.ndarray.
- class distarray.local.localarray.LocalArrayBinaryOperation(numpy_ufunc)¶
Bases: object
- class distarray.local.localarray.LocalArrayUnaryOperation(numpy_ufunc)¶
Bases: object
- distarray.local.localarray.arecompatible(a, b)¶
Do these arrays have the same compatibility hash?
- distarray.local.localarray.compact_indices(dim_data)¶
Given a dim_data structure, return a tuple of compact indices.
For every dimension in dim_data, return a representation of the indices indicated by that dim_dict; return a slice if possible, else, return the list of global indices.
Parameters: dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol we use only the indexing related keys from this structure here. Returns: index – Efficient structure usable for indexing into a numpy-array-like data structure. Return type: tuple of slices and/or lists of int
- distarray.local.localarray.empty(distribution, dtype=<type 'float'>)¶
Create an empty LocalArray.
- distarray.local.localarray.empty_like(arr, dtype=None)¶
Create an empty LocalArray with a distribution like arr.
- distarray.local.localarray.fromfunction(function, distribution, **kwargs)¶
- distarray.local.localarray.fromndarray_like(ndarray, like_arr)¶
Create a new LocalArray like like_arr with buffer set to ndarray.
- distarray.local.localarray.get_printoptions()¶
- distarray.local.localarray.load_dnpy(comm, file)¶
Load a LocalArray from a .dnpy file.
Parameters: file (file-like object or str) – The file to read. It must support seek() and read() methods. Returns: result – A LocalArray encapsulating the data loaded. Return type: LocalArray
- distarray.local.localarray.load_hdf5(comm, filename, dim_data, key='buffer')¶
Load a LocalArray from an .hdf5 file.
Parameters: - filename (str) – The filename to read.
- dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol, describing which portions of the HDF5 file to load into this LocalArray, and with what metadata.
- comm (MPI comm object) –
- key (str, optional) – The identifier for the group to load the LocalArray from (the default is ‘buffer’).
Returns: result – A LocalArray encapsulating the data loaded.
Return type: LocalArray
Note
For dim_data dimension dictionaries containing unstructured (‘u’) distribution types, the indices selected by the ‘indices’ key must be in increasing order. This is a limitation of h5py / hdf5.
- distarray.local.localarray.load_npy(comm, filename, dim_data)¶
Load a LocalArray from a .npy file.
Parameters: - filename (str) – The file to read.
- dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol, describing which portions of the HDF5 file to load into this LocalArray, and with what metadata.
Returns: result – A LocalArray encapsulating the data loaded.
Return type: LocalArray
- distarray.local.localarray.local_reduction(out_comm, reducer, larr, ddpr, dtype, axes)¶
Entry point for reductions on local arrays.
Parameters: - reducer (callable) – Performs the core reduction operation.
- out_comm (MPI Comm instance.) – The MPI communicator for the result of the reduction. Is equal to MPI.COMM_NULL when this rank is not part of the output communicator.
- larr (LocalArray) – Input. Defined for all ranks.
Returns: When out_comm == MPI.COMM_NULL, returns None. Otherwise, returns the LocalArray section of the reduction result.
Return type: LocalArray or None
- distarray.local.localarray.max_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for max.
- distarray.local.localarray.mean_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for mean.
- distarray.local.localarray.min_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for min.
- distarray.local.localarray.ndenumerate(arr)¶
- distarray.local.localarray.ones(distribution, dtype=<type 'float'>)¶
Create a LocalArray filled with ones.
- distarray.local.localarray.save_dnpy(file, arr)¶
Save a LocalArray to a .dnpy file.
Parameters: - file (file-like object or str) – The file or filename to which the data is to be saved.
- arr (LocalArray) – Array to save to a file.
- distarray.local.localarray.save_hdf5(filename, arr, key='buffer', mode='a')¶
Save a LocalArray to a dataset in an .hdf5 file.
Parameters: - filename (str) – Name of file to write to.
- arr (LocalArray) – Array to save to a file.
- key – The identifier for the group to save the LocalArray to (the default is ‘buffer’).
- distarray.local.localarray.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, suppress=None)¶
- distarray.local.localarray.std_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for std.
- distarray.local.localarray.sum_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for sum.
- distarray.local.localarray.var_reducer(reduce_comm, larr, out, axes, dtype)¶
Core reduction function for var.
- distarray.local.localarray.zeros(distribution, dtype=<type 'float'>)¶
Create a LocalArray filled with zeros.
- distarray.local.localarray.zeros_like(arr, dtype=<type 'float'>)¶
Create a LocalArray of zeros with a distribution like arr.
maps Module¶
Classes to manage the distribution-specific aspects of a LocalArray.
The Distribution class is the main entry point and is meant to be used by LocalArrays to help translate between local and global index spaces. It manages ndim one-dimensional map objects.
The one-dimensional map classes BlockMap, CyclicMap, BlockCyclicMap, and UnstructuredMap all manage the mapping tasks for their particular dimension. All are subclasses of MapBase. The reason for the several subclasses is to allow more compact and efficient operations.
- class distarray.local.maps.BlockCyclicMap(global_size, grid_size, grid_rank, start, block_size)¶
Bases: distarray.local.maps.MapBase
One-dimensional block cyclic map class.
- dim_dict¶
- dist = 'c'¶
- global_from_local_index(lidx)¶
- global_iter¶
- global_slice¶
Return a slice representing the global index space of this dimension; only possible for block_size == 1.
- local_from_global_index(gidx)¶
- size¶
- class distarray.local.maps.BlockMap(global_size, grid_size, grid_rank, start, stop)¶
Bases: distarray.local.maps.MapBase
One-dimensional block map class.
- dim_dict¶
- dist = 'b'¶
- global_from_local_index(lidx)¶
- global_from_local_slice(lidx)¶
- global_iter¶
- global_slice¶
Return a slice representing the global index space of this dimension.
- local_from_global_index(gidx)¶
- local_from_global_slice(gidx)¶
- size¶
- class distarray.local.maps.CyclicMap(global_size, grid_size, grid_rank, start)¶
Bases: distarray.local.maps.MapBase
One-dimensional cyclic map class.
- dim_dict¶
- dist = 'c'¶
- global_from_local_index(lidx)¶
- global_iter¶
- global_slice¶
Return a slice representing the global index space of this dimension.
- local_from_global_index(gidx)¶
- size¶
- class distarray.local.maps.Distribution(comm, dim_data)¶
Bases: object
Multi-dimensional Map class.
Manages one or more one-dimensional map classes.
- cart_coords¶
- comm_rank¶
- comm_size¶
- coords_from_rank(rank)¶
- dim_data¶
- dist¶
- classmethod from_shape(comm, shape, dist=None, grid_shape=None)¶
Create a Distribution from a shape and optional arguments.
- global_from_local(local_ind)¶
Given local_ind indices, translate into global indices.
- global_shape¶
- global_size¶
- global_slice¶
Return a slice representing the global index space of this dimension.
- grid_shape¶
- local_from_global(global_ind)¶
Given global_ind indices, translate into local indices.
- local_shape¶
- local_size¶
- ndim¶
- rank_from_coords(coords)¶
- class distarray.local.maps.MapBase¶
Bases: object
Base class for all one dimensional Map classes.
- class distarray.local.maps.UnstructuredMap(global_size, grid_size, grid_rank, indices)¶
Bases: distarray.local.maps.MapBase
One-dimensional unstructured map class.
- dim_dict¶
- dist = 'u'¶
- global_from_local_index(lidx)¶
- global_iter¶
- local_from_global_index(gidx)¶
- size¶
- distarray.local.maps.map_from_dim_dict(dd)¶
Factory function that returns a 1D map for a given dimension dictionary.
mpiutils Module¶
Entry point for MPI.
- distarray.local.mpiutils.create_comm_of_size(size=4)¶
Create a subcommunicator of COMM_PRIVATE of given size.
- distarray.local.mpiutils.create_comm_with_list(nodes, base_comm=None)¶
Create a subcommunicator of base_comm with a list of ranks.
If base_comm is not specified, defaults to COMM_PRIVATE.
- distarray.local.mpiutils.mpi_type_for_ndarray(a)¶
proxyize Module¶
random Module¶
- distarray.local.random.beta(a, b, distribution=None)¶
- distarray.local.random.label_state(comm)¶
Label/personalize the random generator state for the local rank.
- distarray.local.random.normal(loc=0.0, scale=1.0, distribution=None)¶
- distarray.local.random.rand(distribution=None)¶
- distarray.local.random.randint(low, high=None, distribution=None)¶
- distarray.local.random.randn(distribution=None)¶