local Package¶
construct Module¶
- distarray.local.construct.init_base_comm(comm)¶
Sanitize an MPI.comm instance or create one.
- distarray.local.construct.init_comm(base_comm, grid_shape)¶
Create an MPI communicator with a cartesian topology.
error Module¶
- exception distarray.local.error.DistError¶
- exception distarray.local.error.DistMatrixError¶
- exception distarray.local.error.IncompatibleArrayError¶
- exception distarray.local.error.InvalidBaseCommError¶
- exception distarray.local.error.InvalidDimensionError¶
- exception distarray.local.error.InvalidMapCodeError¶
- exception distarray.local.error.NullCommError¶
format Module¶
Define a simple format for saving LocalArrays to disk with full information about them. This format, .dnpy, draws heavily from the .npy format specification from NumPy and from the data structure defined in the Distributed Array Protocol.
Version numbering¶
The version numbering of this format is independent of DistArray’s and the Distributed Array Protocol’s version numberings.
Format Version 1.0¶
The first 6 bytes are a magic string: exactly \x93DARRY.
The next 1 byte is an unsigned byte: the major version number of the file format, e.g. \x01.
The next 1 byte is an unsigned byte: the minor version number of the file format, e.g. \x00. Note: the version of the file format is not tied to the version of the DistArray package.
The next 2 bytes form a little-endian unsigned short int: the length of the header data HEADER_LEN.
The next HEADER_LEN bytes form the header data describing the distribution of this chunk of the LocalArray. It is an ASCII string which contains a Python literal expression of a dictionary. It is terminated by a newline (\n) and padded with spaces (\x20) to make the total length of magic string + 4 + HEADER_LEN be evenly divisible by 16 for alignment purposes.
The dictionary contains two keys, both described in the Distributed Array Protocol:
- “__version__” : str
- Version of the Distributed Array Protocol used in this header.
- “dim_data” : tuple of dict
- One dictionary per array dimension; see the Distributed Array Protocol for the details of this data structure.
For repeatability and readability, the dictionary keys are sorted in alphabetic order. This is for convenience only. A writer SHOULD implement this if possible. A reader MUST NOT depend on this.
Following this header is the output of numpy.save for the underlying data buffer. This contains the full output of save, beginning with the magic number for .npy files, followed by the .npy header and array data.
The .npy format, including reasons for creating it and a comparison of alternatives, is described fully in the “npy-format” NEP and in the module docstring for numpy.lib.format.
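The byte layout above can be sketched in a few lines of Python. This is a minimal illustration of the 1.0 layout as described here, not DistArray's own writer: `build_preamble` is a hypothetical helper, the protocol version string is illustrative, and placing the space padding before the terminating newline follows the .npy convention, which is an assumption about .dnpy.

```python
import struct

MAGIC_PREFIX = b"\x93DARRY"  # the 6-byte magic string given in the specification


def build_preamble(header, version=(1, 0)):
    """Return magic string + version + HEADER_LEN + padded header, as bytes.

    `build_preamble` is a hypothetical helper, not part of distarray.
    """
    major, minor = version
    # Emit the top-level keys in sorted order, as writers SHOULD do.
    items = ", ".join("%r: %r" % (key, header[key]) for key in sorted(header))
    text = ("{" + items + "}").encode("ascii")
    # Fixed-size part: 6 (magic) + 1 (major) + 1 (minor) + 2 (HEADER_LEN).
    fixed = len(MAGIC_PREFIX) + 4
    # Space-pad so that fixed + HEADER_LEN is a multiple of 16; the extra 1
    # accounts for the terminating newline (padding-before-newline is assumed).
    padding = -(fixed + len(text) + 1) % 16
    text += b" " * padding + b"\n"
    if len(text) > 0xFFFF:
        raise ValueError("header does not fit in a 2-byte length field")
    # "<BBH": two unsigned bytes, then a little-endian unsigned short.
    return MAGIC_PREFIX + struct.pack("<BBH", major, minor, len(text)) + text


preamble = build_preamble({
    "__version__": "0.10.0",  # illustrative protocol version string
    "dim_data": ({"dist_type": "b", "size": 16},),
})
```

In a complete file, the output of numpy.save for the data buffer would follow these bytes directly.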
- distarray.local.format.magic(major, minor, prefix=b'\x93DARRY')¶
Return the magic string for the given file format version.
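Parameters: - major (int in [0, 255]) – Major version number of the file format.
- minor (int in [0, 255]) – Minor version number of the file format.
Returns: magic
Return type: str
Raises: ValueError – If the version cannot be formatted.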
- distarray.local.format.read_array_header_1_0(fp)¶
Read an array header from a filelike object using the 1.0 file format version.
This will leave the file object located just after the header.
Parameters: fp (filelike object) – A file object or something with a .read() method like a file.
Returns: - __version__ (str) – Version of the Distributed Array Protocol used.
- dim_data (tuple) – A tuple containing a dictionary for each dimension of the underlying array, as described in the Distributed Array Protocol.
Raises: ValueError – If the data is invalid.
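As a rough illustration of what reading a 1.0 header involves, the sketch below parses the magic string, version bytes, and header dictionary from a file-like object, and leaves the file position just after the header. `read_header` is a hypothetical stand-in written from the format description above, not the library's implementation, and the header contents in the demo are made up.

```python
import ast
import io
import struct


def read_header(fp):
    """Return (__version__, dim_data) from a 1.0 header (hypothetical helper)."""
    magic = fp.read(6)
    if magic != b"\x93DARRY":
        raise ValueError("bad magic string: %r" % (magic,))
    fixed = fp.read(4)
    if len(fixed) != 4:
        raise ValueError("file ends before the header length")
    # Two unsigned bytes (major, minor), then a little-endian unsigned short.
    major, minor, header_len = struct.unpack("<BBH", fixed)
    if (major, minor) != (1, 0):
        raise ValueError("unsupported format version %d.%d" % (major, minor))
    raw = fp.read(header_len)
    if len(raw) != header_len:
        raise ValueError("file ends inside the header")
    try:
        # literal_eval accepts only Python literals, so no code is executed.
        header = ast.literal_eval(raw.decode("ascii").strip())
    except (SyntaxError, ValueError) as exc:
        raise ValueError("cannot parse header: %s" % (exc,))
    if not isinstance(header, dict):
        raise ValueError("header is not a dictionary")
    missing = {"__version__", "dim_data"} - set(header)
    if missing:
        raise ValueError("header lacks keys: %s" % sorted(missing))
    return header["__version__"], header["dim_data"]


# Demo on an in-memory file; the trailing bytes stand in for the .npy data.
text = b"{'__version__': '0.10.0', 'dim_data': ({'size': 4},)}"
text += b" " * (-(10 + len(text) + 1) % 16) + b"\n"
fp = io.BytesIO(b"\x93DARRY" + struct.pack("<BBH", 1, 0, len(text)) + text
                + b"<npy bytes>")
version, dim_data = read_header(fp)
rest = fp.read()  # everything after the header is still unread
```

Because the reader looks keys up by name, it does not depend on the order in which the writer emitted them.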
- distarray.local.format.read_localarray(fp)¶
Read a LocalArray from a .dnpy file.
Parameters: fp (file_like object) – If this is not a real file object, then this may take extra memory and time.
Returns: distbuffer – The Distributed Array Protocol structure created from the data on disk.
Return type: dict
Raises: ValueError – If the data is invalid.
- distarray.local.format.read_magic(fp)¶
Read the magic string to get the version of the file format.
Returns: - major (int)
- minor (int)
- distarray.local.format.write_localarray(fp, arr, version=(1, 0))¶
Write a LocalArray to a .dnpy file, including a header.
The __version__ and dim_data keys from the Distributed Array Protocol are written to a header, then numpy.save is used to write the value of the buffer key.
Parameters: - fp (file_like object) – An open, writable file object, or similar object with a .write() method.
- arr (LocalArray) – The array to write to disk.
- version ((int, int), optional) – The version number of the file format. Default: (1, 0)
Raises: - ValueError – If the array cannot be persisted.
- Various other errors – If the underlying numpy array contains Python objects as part of its dtype, the process of pickling them may raise various errors if the objects are not picklable.
localarray Module¶
- class distarray.local.localarray.GlobalIndex(maps, ndarray)¶
Bases: object
Object which provides access to global indexing on LocalArrays.
- checked_getitem(global_inds)¶
- checked_setitem(global_inds, value)¶
- global_to_local(*global_ind)¶
- local_to_global(*local_ind)¶
- class distarray.local.localarray.GlobalIterator(arr)¶
Bases: distarray.externals.six.Iterator
- class distarray.local.localarray.LocalArray(shape, dtype=None, dist=None, grid_shape=None, comm=None, buf=None)¶
Bases: object
Distributed memory Python arrays.
- __array_wrap__(obj, context=None)¶
Return a LocalArray based on obj.
This method constructs a new LocalArray using shape, dist, grid_shape, and base_comm from self, and dtype and buffer from obj.
This is used to construct return arrays for ufuncs.
- __distarray__()¶
Return the data structure required by the Distributed Array Protocol (DAP).
See the project’s documentation for the Protocol’s specification.
- __getitem__(index)¶
Get a local item.
- __setitem__(index, value)¶
Set a local item.
- all(axis=None, out=None)¶
- any(axis=None, out=None)¶
- argmax(axis=None, out=None)¶
- argmin(axis=None, out=None)¶
- argsort(axis=-1, kind='quick')¶
- asdist(shape, dist={0: 'b'}, grid_shape=None)¶
- asdist_like(other)¶
Return a version of self that has shape, dist and grid_shape like other.
- astype(newdtype)¶
Return a copy of this LocalArray with a new underlying dtype.
- cart_coords¶
- checked_getitem(global_inds)¶
- checked_setitem(global_inds, value)¶
- choose(choices, out=None, mode='raise')¶
- clip(min, max, out=None)¶
- comm_rank¶
- comm_size¶
- compatibility_hash()¶
- compress(condition, axis=None, out=None)¶
- conj(out=None)¶
- conjugate(out=None)¶
- coords_from_rank(rank)¶
- copy()¶
Return a copy of this LocalArray.
- cumprod(axis=None, dtype=None, out=None)¶
- cumsum(axis=None, dtype=None, out=None)¶
- diagonal(offset=0, axis1=0, axis2=1)¶
- dist¶
- dtype¶
- fill(scalar)¶
- flatten(order='C')¶
- classmethod from_dim_data(dim_data, dtype=None, buf=None, comm=None)¶
Make a LocalArray from a dim_data tuple.
Parameters: - dim_data (tuple of dictionaries) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol
- dtype (numpy dtype, optional) – If both dtype and buf are provided, buf will be encapsulated and interpreted with the given dtype. If neither is, an empty array will be created with a dtype of ‘float’. If only dtype is given, an empty array of that dtype will be created.
- buf (buffer object, optional) – If both dtype and buf are provided, buf will be encapsulated and interpreted with the given dtype. If neither is, an empty array will be created with a dtype of ‘float’. If only buf is given, self.dtype will be set to its dtype.
Returns: A LocalArray encapsulating buf, or else an empty (uninitialized) LocalArray.
Return type: LocalArray
- classmethod from_distarray(obj, comm=None)¶
Make a LocalArray from a Distributed Array Protocol data structure.
An object that supports the Distributed Array Protocol will have a __distarray__ method that returns the data structure described here:
https://github.com/enthought/distributed-array-protocol
Parameters: obj (an object with a __distarray__ method or a dict) – If a dict, it must conform to the structure defined by the Distributed Array Protocol.
Returns: A LocalArray encapsulating the buffer of the original data. No copy is made.
Return type: LocalArray
- get_localarray()¶
- global_from_local(*local_ind)¶
- global_limits(dim)¶
- global_shape¶
- global_size¶
- grid_shape¶
- itemsize¶
- local_data¶
- local_from_global(*global_ind)¶
- local_shape¶
- local_size¶
- local_view(dtype=None)¶
- max(axis=None, out=None)¶
- mean(axis=None, dtype=float, out=None)¶
- min(axis=None, out=None)¶
- nbytes¶
- ndim¶
- nonzero()¶
- pack_index(inds)¶
- prod(axis=None, dtype=None, out=None)¶
- ptp(axis=None, out=None)¶
- put(indices, values, mode='raise')¶
- putmask(mask, values)¶
- rank_from_coords(coords)¶
- ravel(order='C')¶
- redist(newshape, newdist={0: 'b'}, newgrid_shape=None)¶
- repeat(repeats, axis=None)¶
- reshape(newshape)¶
- resize(newshape, refcheck=1, order='C')¶
- round(decimals=0, out=None)¶
- searchsorted(values)¶
- set_localarray(a)¶
- sort(axis=-1, kind='quick')¶
- squeeze()¶
- std(axis=None, dtype=None, out=None)¶
- sum(axis=None, dtype=None, out=None)¶
- swapaxes(axis1, axis2)¶
- sync()¶
- take(indices, axis=None, out=None, mode='raise')¶
- trace(offset=0, axis1=0, axis2=1, dtype=None, out=None)¶
- transpose(axes=None)¶
- unpack_index(packed_ind)¶
- var(axis=None, dtype=None, out=None)¶
- view(dtype=None)¶
Return a new LocalArray whose underlying local_array is a view on self.local_array.
Note
Currently unimplemented when dtype is not None.
- class distarray.local.localarray.LocalArrayBinaryOperation(numpy_ufunc)¶
Bases: object
- class distarray.local.localarray.LocalArrayUnaryOperation(numpy_ufunc)¶
Bases: object
- distarray.local.localarray.allclose(a, b, rtol=0.0001, atom=1e-07)¶
- distarray.local.localarray.arange(start, stop=None, step=1, dtype=None, dist={0: 'b'}, grid_shape=None, comm=None)¶
- distarray.local.localarray.arecompatible(a, b)¶
Do these arrays have the same compatibility hash?
- distarray.local.localarray.aslocalarray(object, dtype=None, order=None)¶
- distarray.local.localarray.average(a, axis=None, weights=None, returned=0)¶
- distarray.local.localarray.compact_indices(dim_data)¶
Given a dim_data structure, return a tuple of compact indices.
For every dimension in dim_data, return a representation of the indices indicated by that dim_dict: a slice if possible, otherwise the list of global indices.
Parameters: dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol. Only the indexing-related keys of this structure are used here.
Returns: index – Efficient structure usable for indexing into a numpy-array-like data structure.
Return type: tuple of slices and/or lists of int
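The slice-or-list idea can be sketched as follows. This is an illustration only: `compact_index` and `compact_indices_sketch` are hypothetical names, and the dimension-dictionary keys used (`dist_type`, `size`, `start`, `stop`, `proc_grid_size`, `block_size`, `indices`) are assumed from the Distributed Array Protocol rather than taken from this page.

```python
def compact_index(dim_dict):
    """Return a slice where the indices are regular, else a list of ints."""
    dist_type = dim_dict.get("dist_type", "b")
    if dist_type in ("b", "n"):
        # Block and undistributed dimensions own one contiguous range.
        return slice(dim_dict.get("start", 0),
                     dim_dict.get("stop", dim_dict["size"]))
    if dist_type == "c":
        start, size = dim_dict["start"], dim_dict["size"]
        grid_size = dim_dict["proc_grid_size"]
        block_size = dim_dict.get("block_size", 1)
        if block_size == 1:
            # Plain cyclic: every grid_size-th element, still a slice.
            return slice(start, size, grid_size)
        # Block-cyclic: runs of block_size elements, repeated with a
        # stride of block_size * grid_size; not expressible as one slice.
        stride = block_size * grid_size
        return [i for first in range(start, size, stride)
                for i in range(first, min(first + block_size, size))]
    if dist_type == "u":
        # Unstructured: the global indices are listed explicitly.
        return list(dim_dict["indices"])
    raise ValueError("unsupported dist_type: %r" % (dist_type,))


def compact_indices_sketch(dim_data):
    """Apply compact_index to every dimension dictionary."""
    return tuple(compact_index(dd) for dd in dim_data)


index = compact_indices_sketch((
    {"dist_type": "b", "size": 10, "start": 0, "stop": 5},
    {"dist_type": "c", "size": 10, "proc_grid_size": 2, "start": 1},
))
```

A tuple of slices like this can index a NumPy array directly; once a dimension falls back to a list, NumPy's fancy-indexing rules apply instead.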
- distarray.local.localarray.concatenate(seq, axis=0)¶
- distarray.local.localarray.convolve(x, y, mode='valid')¶
- distarray.local.localarray.corrcoef(x, y=None, rowvar=1, bias=0)¶
- distarray.local.localarray.correlate(x, y, mode='valid')¶
- distarray.local.localarray.cov(x, y=None, rowvar=1, bias=0)¶
- distarray.local.localarray.cross(a, b, axisa=-1, axisb=-1, axisc=-1, axis=None)¶
- distarray.local.localarray.diag(v, k=0)¶
- distarray.local.localarray.digitize(x, bins)¶
- distarray.local.localarray.distarray2string(a)¶
- distarray.local.localarray.distribute_block_indices(dd)¶
Fill in start and stop in dimdict dd.
- distarray.local.localarray.distribute_cyclic_indices(dd)¶
Fill in start given dimdict dd.
- distarray.local.localarray.distribute_indices(dim_data)¶
Fill in missing index-related keys for supported dist_types.
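A minimal sketch of how such index filling might work is given below. The function names are hypothetical, the key names (`size`, `proc_grid_size`, `proc_grid_rank`, `start`, `stop`) are assumed from the Distributed Array Protocol, and the even-split rule for block dimensions is one common choice, not necessarily the one DistArray uses.

```python
def fill_block(dd):
    """Add start and stop to a block dimension dictionary, in place."""
    if "start" in dd and "stop" in dd:
        return dd
    # Ceiling division: the most elements any one process receives.
    per_proc = -(-dd["size"] // dd["proc_grid_size"])
    start = min(dd["proc_grid_rank"] * per_proc, dd["size"])
    dd["start"] = start
    dd["stop"] = min(start + per_proc, dd["size"])
    return dd


def fill_cyclic(dd):
    """Add start to a plain cyclic dimension dictionary, in place."""
    if "start" not in dd:
        # With a block size of 1, each rank starts at its own rank.
        dd["start"] = dd["proc_grid_rank"]
    return dd


def fill_indices(dim_data):
    """Dispatch on dist_type; other types are left untouched."""
    fillers = {"b": fill_block, "c": fill_cyclic}
    for dd in dim_data:
        filler = fillers.get(dd.get("dist_type"))
        if filler is not None:
            filler(dd)
    return dim_data


# Ten elements over four processes.
blocks = [fill_block({"dist_type": "b", "size": 10,
                      "proc_grid_size": 4, "proc_grid_rank": rank})
          for rank in range(4)]
ranges = [(dd["start"], dd["stop"]) for dd in blocks]
```

Under this rule the last process may receive fewer elements than the others, or none at all when the array is small relative to the process grid.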
- distarray.local.localarray.dot(a, b)¶
- distarray.local.localarray.empty(shape, dtype=float, dist=None, grid_shape=None, comm=None)¶
- distarray.local.localarray.empty_like(arr, dtype=None)¶
- distarray.local.localarray.eye(n, m=None, k=0, dtype=float)¶
- distarray.local.localarray.fromfunction(function, shape, **kwargs)¶
- distarray.local.localarray.fromlocalarray_like(local_arr, like_arr)¶
Create a new LocalArray using a given local array (and its dtype).
- distarray.local.localarray.get_printoptions()¶
- distarray.local.localarray.histogram(x, bins=None, range=None, normed=False)¶
- distarray.local.localarray.histogram2d(x, y, bins, normed=False)¶
- distarray.local.localarray.identity(n, dtype=numpy.intp)¶
- distarray.local.localarray.inner(a, b)¶
- distarray.local.localarray.linspace(start, stop, num=50, endpoint=True, retstep=False)¶
- distarray.local.localarray.load_dnpy(file, comm=None)¶
Load a LocalArray from a .dnpy file.
Parameters: file (file-like object or str) – The file to read. It must support seek() and read() methods.
Returns: result – A LocalArray encapsulating the data loaded.
Return type: LocalArray
- distarray.local.localarray.load_hdf5(filename, dim_data, key='buffer', comm=None)¶
Load a LocalArray from an .hdf5 file.
Parameters: - filename (str) – The filename to read.
- dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol, describing which portions of the HDF5 file to load into this LocalArray, and with what metadata.
- key (str, optional) – The identifier for the group to load the LocalArray from (the default is ‘buffer’).
Returns: result – A LocalArray encapsulating the data loaded.
Return type: LocalArray
Note
For dim_data dimension dictionaries containing unstructured (‘u’) distribution types, the indices selected by the ‘indices’ key must be in increasing order. This is a limitation of h5py / hdf5.
- distarray.local.localarray.load_npy(filename, dim_data, comm=None)¶
Load a LocalArray from a .npy file.
Parameters: - filename (str) – The file to read.
- dim_data (tuple of dict) – A dict for each dimension, with the data described here: https://github.com/enthought/distributed-array-protocol, describing which portions of the .npy file to load into this LocalArray, and with what metadata.
Returns: result – A LocalArray encapsulating the data loaded.
Return type: LocalArray
- distarray.local.localarray.logspace(start, stop, num=50, endpoint=True, base=10.0)¶
- distarray.local.localarray.make_partial_dim_data(shape, dist=None, grid_shape=None)¶
Create an (incomplete) dim_data structure from simple parameters.
Parameters: - shape (tuple of int) – Number of elements in each dimension.
- dist (dict mapping int -> str, default is {0: ‘b’}) – Keys are dimension numbers, values are dist_types, e.g. ‘b’, ‘c’, or ‘n’.
- grid_shape (tuple of int, optional) – Size of the process grid in each dimension.
Returns: dim_data – Partial dim_data structure as outlined in the Distributed Array Protocol.
Return type: tuple of dict
- distarray.local.localarray.median(m)¶
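One way the simple parameters might expand into a partial dim_data tuple is sketched below. The helper name is hypothetical, and two points are assumptions rather than documented behaviour: dimensions not listed in dist are treated as undistributed (‘n’), and grid_shape is taken to hold one entry per array dimension. The result is "partial" because index keys such as start and stop are still missing.

```python
def partial_dim_data_sketch(shape, dist=None, grid_shape=None):
    """Build one dimension dictionary per entry of shape (hypothetical helper)."""
    if dist is None:
        dist = {0: "b"}  # documented default: block-distribute dimension 0
    dim_data = []
    for dim, size in enumerate(shape):
        # Assumption: dimensions absent from dist are undistributed.
        dist_type = dist.get(dim, "n")
        if dist_type not in ("b", "c", "n"):
            raise ValueError("unsupported dist_type: %r" % (dist_type,))
        dim_dict = {"dist_type": dist_type, "size": size}
        if grid_shape is not None:
            # Assumption: grid_shape has one entry per array dimension.
            dim_dict["proc_grid_size"] = grid_shape[dim]
        dim_data.append(dim_dict)
    return tuple(dim_data)


partial = partial_dim_data_sketch((16, 8), dist={0: "b"}, grid_shape=(4, 1))
```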
- distarray.local.localarray.ndenumerate(arr)¶
- distarray.local.localarray.ones(shape, dtype=float, dist=None, grid_shape=None, comm=None)¶
- distarray.local.localarray.outer(a, b)¶
- distarray.local.localarray.save_dnpy(file, arr)¶
Save a LocalArray to a .dnpy file.
Parameters: - file (file-like object or str) – The file or filename to which the data is to be saved.
- arr (LocalArray) – Array to save to a file.
- distarray.local.localarray.save_hdf5(filename, arr, key='buffer', mode='a')¶
Save a LocalArray to a dataset in an .hdf5 file.
Parameters: - filename (str) – Name of file to write to.
- arr (LocalArray) – Array to save to a file.
- key (str, optional) – The identifier for the group to save the LocalArray to (the default is ‘buffer’).
- distarray.local.localarray.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, suppress=None)¶
- distarray.local.localarray.sum(a, dtype=None)¶
- distarray.local.localarray.tensordot(a, b, axes=(-1, 0))¶
- distarray.local.localarray.vdot(a, b)¶
- distarray.local.localarray.where(condition, x=None, y=None)¶
- distarray.local.localarray.zeros(shape, dtype=float, dist=None, grid_shape=None, comm=None)¶
- distarray.local.localarray.zeros_like(arr)¶
maps Module¶
Classes to manage the distribution-specific aspects of a LocalArray.
The MDMap class is the main entry point and is meant to be used by LocalArrays to help translate between local and global index spaces. It manages ndim one-dimensional map objects.
The one-dimensional map classes BlockMap, CyclicMap, BlockCyclicMap, and UnstructuredMap all manage the mapping tasks for their particular dimension. All are subclasses of MapBase. The reason for the several subclasses is to allow more compact and efficient operations.
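To make the local/global translation concrete, here is a small sketch of two one-dimensional maps and a multi-dimensional wrapper function. The class and function names are hypothetical and the constructors are simplified; they mirror the idea of the map classes documented below, not their actual implementation.

```python
class BlockSketch(object):
    """Owns the contiguous global range [start, stop)."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop
        self.size = stop - start

    def local_from_global(self, gidx):
        if not self.start <= gidx < self.stop:
            raise IndexError("global index %d is not local" % gidx)
        return gidx - self.start

    def global_from_local(self, lidx):
        if not 0 <= lidx < self.size:
            raise IndexError("local index %d out of range" % lidx)
        return lidx + self.start


class CyclicSketch(object):
    """Owns every grid_size-th global index, beginning at start."""

    def __init__(self, global_size, grid_size, start):
        self.start, self.grid_size = start, grid_size
        self.size = len(range(start, global_size, grid_size))

    def local_from_global(self, gidx):
        lidx, remainder = divmod(gidx - self.start, self.grid_size)
        if remainder or not 0 <= lidx < self.size:
            raise IndexError("global index %d is not local" % gidx)
        return lidx

    def global_from_local(self, lidx):
        if not 0 <= lidx < self.size:
            raise IndexError("local index %d out of range" % lidx)
        return self.start + lidx * self.grid_size


def md_local_from_global(maps, global_ind):
    """Translate one index per dimension, as a multi-dimensional map would."""
    return tuple(m.local_from_global(g) for m, g in zip(maps, global_ind))


maps = [BlockSketch(3, 6), CyclicSketch(10, 4, 1)]  # cyclic owns 1, 5, 9
local = md_local_from_global(maps, (4, 9))
```

Keeping a separate class per distribution type lets each translation stay a constant-time arithmetic expression; an unstructured map would instead need a lookup table from global to local indices.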
- class distarray.local.maps.BlockCyclicMap(global_size, grid_size, grid_rank, start, block_size)¶
Bases: distarray.local.maps.MapBase
One-dimensional block cyclic map class.
- dim_dict¶
- dist = 'c'¶
- global_from_local(lidx)¶
- global_iter¶
- local_from_global(gidx)¶
- size¶
- class distarray.local.maps.BlockMap(global_size, grid_size, grid_rank, start, stop)¶
Bases: distarray.local.maps.MapBase
One-dimensional block map class.
- dim_dict¶
- dist = 'b'¶
- global_from_local(lidx)¶
- global_iter¶
- local_from_global(gidx)¶
- size¶
- class distarray.local.maps.CyclicMap(global_size, grid_size, grid_rank, start)¶
Bases: distarray.local.maps.MapBase
One-dimensional cyclic map class.
- dim_dict¶
- dist = 'c'¶
- global_from_local(lidx)¶
- global_iter¶
- local_from_global(gidx)¶
- size¶
- class distarray.local.maps.MDMap¶
Bases: object
Multi-dimensional Map class.
Manages one or more one-dimensional map classes.
- classmethod from_dim_data(dim_data)¶
- global_from_local(*local_ind)¶
Given local_ind indices, translate into global indices.
- local_from_global(*global_ind)¶
Given global_ind indices, translate into local indices.
- local_shape¶
- class distarray.local.maps.MapBase¶
Bases: object
Base class for all one dimensional Map classes.
- class distarray.local.maps.UnstructuredMap(global_size, grid_size, grid_rank, indices)¶
Bases: distarray.local.maps.MapBase
One-dimensional unstructured map class.
- dim_dict¶
- dist = 'u'¶
- global_from_local(lidx)¶
- global_iter¶
- local_from_global(gidx)¶
- size¶
- distarray.local.maps.map_from_dim_dict(dd)¶
Factory function that returns a 1D map for a given dimension dictionary.
random Module¶
- distarray.local.random.beta(a, b, size=None, dist=None, grid_shape=None, comm=None)¶
- distarray.local.random.label_state(comm)¶
Label/personalize the random generator state for the local rank.
- distarray.local.random.normal(loc=0.0, scale=1.0, size=None, dist=None, grid_shape=None, comm=None)¶
- distarray.local.random.rand(size=None, dist=None, grid_shape=None, comm=None)¶
- distarray.local.random.randint(low, high=None, size=None, dist=None, grid_shape=None, comm=None)¶
- distarray.local.random.randn(size=None, dist=None, grid_shape=None, comm=None)¶