globalapi Package¶
globalapi
Package¶
Modules dealing with the global indexspace view of DistArrays.
In other words, the view from the client.
context
Module¶
Context objects contain the information required for DistArrays to communicate with LocalArrays.

class
distarray.globalapi.context.
BaseContext
[source]¶ Bases:
object
Context objects manage the setup and communication of the worker processes for DistArray objects. A DistArray object has a context, and contexts have an MPI intracommunicator that they use to communicate with worker processes.
Typically there is just one context object that uses all processes, although it is possible to have more than one context with a different selection of engines.

empty
(shape_or_dist, dtype=<type 'float'>)[source]¶ Create an empty Distarray.
Parameters:  shape_or_dist (shape tuple or Distribution object) –
 dtype (NumPy dtype, optional (default float)) –
Returns: A DistArray distributed as specified, with uninitialized values.
Return type:

fromarray
(arr, distribution=None)¶ Create a DistArray from an ndarray.
Parameters: distribution (Distribution object, optional) – If a Distribution object is not provided, one is created with Distribution(arr.shape). Returns: A DistArray distributed as specified, using the values and dtype from arr. Return type: DistArray

fromfunction
(function, shape, **kwargs)[source]¶ Create a DistArray from a function over global indices.
Unlike numpy’s fromfunction, the result of distarray’s fromfunction is restricted to the same Distribution as the index array generated from shape.
See numpy.fromfunction for more details.

fromndarray
(arr, distribution=None)[source]¶ Create a DistArray from an ndarray.
Parameters: distribution (Distribution object, optional) – If a Distribution object is not provided, one is created with Distribution(arr.shape). Returns: A DistArray distributed as specified, using the values and dtype from arr. Return type: DistArray

load_dnpy
(name)[source]¶ Load a distributed array from
.dnpy
files.The
.dnpy
file format is a binary format inspired by NumPy’s.npy
format. The header of a particular.dnpy
file contains information about which portion of a DistArray is saved in it (using the metadata outlined in the Distributed Array Protocol), and the data portion contains the output of NumPy’s save function for the local array data. See the module docstring for distarray.localapi.format for full details.Parameters: name (str or list of str) – If a str, this is used as the prefix for the filename used by each engine. Each engine will load a file named <name>_<rank>.dnpy
. If a list of str, each engine will use the name at the index corresponding to its rank. An exception is raised if the length of this list is not the same as the context’s communicator’s size.Returns: result – A DistArray encapsulating the file loaded on each engine. Return type: DistArray Raises: TypeError
– If name is an iterable whose length is different from the context’s communicator’s size.See also
save_dnpy()
 Saving files to load with with load_dnpy.

load_hdf5
(filename, distribution, key='buffer')[source]¶ Load a DistArray from a dataset in an
.hdf5
file.Parameters:  filename (str) – Filename to load.
 distribution (Distribution object) –
 key (str, optional) – The identifier for the group to load the DistArray from (the default is ‘buffer’).
Returns: result – A DistArray encapsulating the file loaded.
Return type:

load_npy
(filename, distribution)[source]¶ Load a DistArray from a dataset in a
.npy
file.Parameters:  filename (str) – Filename to load.
 distribution (Distribution object) –
Returns: result – A DistArray encapsulating the file loaded.
Return type:

ones
(shape_or_dist, dtype=<type 'float'>)[source]¶ Create a Distarray filled with ones.
Parameters:  shape_or_dist (shape tuple or Distribution object) –
 dtype (NumPy dtype, optional (default float)) –
Returns: A DistArray distributed as specified, filled with ones.
Return type:

register
(func)[source]¶ Associate a function with this Context. Allows access to the local process and local data associated with each DistArray.
After registering a function with a context, the function can be called as
context.func(...)
. Doing so will call the function locally on target processes determined from the arguments passed in usingContext.apply(...)
. The function can take nonproxied Python objects, DistArrays, or other proxied objects as arguments. Nonproxied Python objects will be broadcasted to all local processes; proxied objects will be dereferenced before calling the function on the local process.

save_dnpy
(name, da)[source]¶ Save a distributed array to files in the
.dnpy
format.The
.dnpy
file format is a binary format inspired by NumPy’s.npy
format. The header of a particular.dnpy
file contains information about which portion of a DistArray is saved in it (using the metadata outlined in the Distributed Array Protocol), and the data portion contains the output of NumPy’s save function for the local array data. See the module docstring for distarray.localapi.format for full details.Parameters:  name (str or list of str) – If a str, this is used as the prefix for the filename used by each
engine. Each engine will save a file named
<name>_<rank>.dnpy
. If a list of str, each engine will use the name at the index corresponding to its rank. An exception is raised if the length of this list is not the same as the context’s communicator’s size.  da (DistArray) – Array to save to files.
Raises: TypeError
– If name is an sequence whose length is different from the context’s communicator’s size.See also
load_dnpy()
 Loading files saved with save_dnpy.
 name (str or list of str) – If a str, this is used as the prefix for the filename used by each
engine. Each engine will save a file named

save_hdf5
(filename, da, key='buffer', mode='a')[source]¶ Save a DistArray to a dataset in an
.hdf5
file.Parameters:  filename (str) – Name of file to write to.
 da (DistArray) – Array to save to a file.
 key (str, optional) – The identifier for the group to save the DistArray to (the default is ‘buffer’).
 mode (optional, {‘w’, ‘w‘, ‘a’}, default ‘a’) –
'w'
 Create file, truncate if exists
'w'
 Create file, fail if exists
'a'
 Read/write if exists, create otherwise (default)


class
distarray.globalapi.context.
IPythonContext
(client=None, targets=None)[source]¶ Bases:
distarray.globalapi.context.BaseContext
Context class that uses IPython.parallel.
See the docstring for BaseContext for more information about Contexts.
See also

apply
(func, args=None, kwargs=None, targets=None, autoproxyize=False)[source]¶ Analogous to IPython.parallel.view.apply_sync
Parameters:  func (function) –
 args (tuple) – positional arguments to func
 kwargs (dict) – key word arguments to func
 targets (sequence of integers) – engines func is to be run on.
 autoproxyize (bool, default False) – If True, implicitly return a Proxy object from the function.
Returns: Return type: return a list of the results on the each engine.


class
distarray.globalapi.context.
MPIContext
(targets=None)[source]¶ Bases:
distarray.globalapi.context.BaseContext
Context class that uses MPI only (no IPython.parallel).
See the docstring for BaseContext for more information about Contexts.
See also

INTERCOMM
= None¶

apply
(func, args=None, kwargs=None, targets=None, autoproxyize=False)[source]¶ Analogous to IPython.parallel.view.apply_sync
Parameters:  func (function) –
 args (tuple) – positional arguments to func
 kwargs (dict) – keyword arguments to func
 targets (sequence of integers) – engines func is to be run on.
 autoproxyize (bool, default False) – If True, implicitly return a Proxy object from the function.
Returns: result from each engine.
Return type: list

distarray
Module¶
The DistArray data structure.
DistArray objects are proxies for collections of LocalArray objects. They are meant to roughly emulate NumPy ndarrays.

class
distarray.globalapi.distarray.
DistArray
(distribution, dtype=<type 'float'>)[source]¶ Bases:
object

context
¶

dist
¶

distribute_as
(shape_or_dist)[source]¶ Redistributes this DistArray, returning a new DistArray with the same data and corresponding distribution.
Parameters: shape_or_dist (shape tuple or Distribution object.) – Distribution for the new DistArray. The new distribution must have the same number of items as this distarray. The global shape and targets may be different. If shape tuple, immediately converted to a Distribution object with default parameters. Returns: A new DistArray distributed according to dist. Return type: DistArray Note
Currently implemented for block and nondistributed maps only.

dtype
¶

classmethod
from_localarrays
(key, context=None, targets=None, distribution=None, dtype=None)[source]¶ The caller has already created the LocalArray objects. key is their name on the engines. This classmethod creates a DistArray that refers to these LocalArrays.
Either a context or a distribution must also be provided. If context is provided, a
dim_data_per_rank
will be pulled from the existingLocalArray
s and aDistribution
will be created from it. If distribution is provided, it should accurately reflect the distribution of the existingLocalArray
s.If dtype is not provided, it will be fetched from the engines.

get_localarrays
()[source]¶ Pull the LocalArray objects from the engines.
Returns: one localarray per process Return type: list of localarrays

get_ndarrays
()[source]¶ Pull the local ndarrays from the engines.
Returns: one ndarray per process Return type: list of ndarrays

global_size
¶

grid_shape
¶

itemsize
¶

max
(axis=None, dtype=None, out=None)[source]¶ Return the maximum of array elements over the given axis.

mean
(axis=None, dtype=<type 'float'>, out=None)[source]¶ Return the mean of array elements over the given axis.

min
(axis=None, dtype=None, out=None)[source]¶ Return the minimum of array elements over the given axis.

nbytes
¶

ndim
¶

shape
¶

std
(axis=None, dtype=<type 'float'>, out=None)[source]¶ Return the standard deviation of array elements over the given axis.

targets
¶

toarray
()¶ Returns the distributed array as an ndarray.

var
(axis=None, dtype=<type 'float'>, out=None)[source]¶ Return the variance of array elements over the given axis.

view
(dtype=None)[source]¶ New view of array with the same data.
Parameters: dtype (numpy dtype, optional) – Datatype descriptor of the returned view, e.g., float32 or int16. The default, None, results in the view having the same datatype as the original array. Returns: A view on the original DistArray, optionally with the underlying memory interpreted as a different dtype. Return type: DistArray Note
No redistribution is done. The sizes of all LocalArrays must be compatible with the new view.

functions
Module¶
Distributed ufuncs for DistArrays.

distarray.globalapi.functions.
absolute
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arccos
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arccosh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arcsin
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arcsinh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arctan
(a, *args, **kwargs)¶

distarray.globalapi.functions.
arctanh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
conjugate
(a, *args, **kwargs)¶

distarray.globalapi.functions.
cos
(a, *args, **kwargs)¶

distarray.globalapi.functions.
cosh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
exp
(a, *args, **kwargs)¶

distarray.globalapi.functions.
expm1
(a, *args, **kwargs)¶

distarray.globalapi.functions.
invert
(a, *args, **kwargs)¶

distarray.globalapi.functions.
log
(a, *args, **kwargs)¶

distarray.globalapi.functions.
log10
(a, *args, **kwargs)¶

distarray.globalapi.functions.
log1p
(a, *args, **kwargs)¶

distarray.globalapi.functions.
negative
(a, *args, **kwargs)¶

distarray.globalapi.functions.
reciprocal
(a, *args, **kwargs)¶

distarray.globalapi.functions.
rint
(a, *args, **kwargs)¶

distarray.globalapi.functions.
sign
(a, *args, **kwargs)¶

distarray.globalapi.functions.
sin
(a, *args, **kwargs)¶

distarray.globalapi.functions.
sinh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
sqrt
(a, *args, **kwargs)¶

distarray.globalapi.functions.
square
(a, *args, **kwargs)¶

distarray.globalapi.functions.
tan
(a, *args, **kwargs)¶

distarray.globalapi.functions.
tanh
(a, *args, **kwargs)¶

distarray.globalapi.functions.
add
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
arctan2
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
bitwise_and
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
bitwise_or
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
bitwise_xor
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
divide
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
floor_divide
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
fmod
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
hypot
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
left_shift
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
mod
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
multiply
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
power
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
remainder
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
right_shift
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
subtract
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
true_divide
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
less
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
less_equal
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
equal
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
not_equal
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
greater
(a, b, *args, **kwargs)¶

distarray.globalapi.functions.
greater_equal
(a, b, *args, **kwargs)¶
ipython_cleanup
Module¶
Functions for cleaning up DistArray objects from IPython parallel engines.

distarray.globalapi.ipython_cleanup.
cleanup
(view, module_name, prefix)[source]¶ Delete Context object with the given name from the given module

distarray.globalapi.ipython_cleanup.
cleanup_all
(module_name, prefix)[source]¶ Connects to all engines and runs
cleanup()
on them.
ipython_utils
Module¶
The single IPython entry point.
maps
Module¶
Distribution class and auxiliary ClientMap classes.
The Distribution is a multidimensional map class that manages the onedimensional maps for each DistArray dimension. The Distribution class represents the distribution information for a distributed array, independent of the distributed array’s data. Distributions allow DistArrays to reduce overall communication when indexing and slicing by determining which processes own (or may possibly own) the indices in question. Two DistArray objects can share the same Distribution if they have the exact same distribution.
The onedimensional ClientMap classes keep track of which process owns which index in that dimension. This class has several subclasses for specific distribution types, including BlockMap, CyclicMap, NoDistMap, and UnstructuredMap.

class
distarray.globalapi.maps.
BlockCyclicMap
(size, grid_size, block_size=1)[source]¶ Bases:
distarray.globalapi.maps.MapBase

dist
= 'c'¶


class
distarray.globalapi.maps.
BlockMap
(size, grid_size, bounds=None, comm_padding=None, boundary_padding=None)[source]¶ Bases:
distarray.globalapi.maps.MapBase

dist
= 'b'¶


class
distarray.globalapi.maps.
Distribution
[source]¶ Bases:
object
Governs the mapping between global indices and process ranks for multidimensional objects.

comm
¶

comm_union
(*dists)[source]¶ Make a communicator that includes the union of all targets in dists.
Parameters: dists (sequence of distribution objects.) – Returns: First element is encompassing communicator proxy; second is a sequence of all targets in dists. Return type: tuple

classmethod
from_dim_data_per_rank
(context, dim_data_per_rank, targets=None)[source]¶ Create a Distribution from a sequence of dim_data tuples.
Parameters:  context (Context object) –
 dim_data_per_rank (Sequence of dim_data tuples, one per rank) – See the “Distributed Array Protocol” for a description of dim_data tuples.
 targets (Sequence of int, optional) – Sequence of engine target numbers. Default: all available
Returns: Return type:

classmethod
from_global_dim_data
(context, global_dim_data, targets=None)[source]¶ Make a Distribution from a global_dim_data structure.
Parameters:  context (Context object) –
 global_dim_data (tuple of dict) – A global dimension dictionary per dimension. See following Note section.
 targets (Sequence of int, optional) – Sequence of engine target numbers. Default: all available
Returns: Return type: Note
The global_dim_data tuple is a simple, straightforward data structure that allows full control over all aspects of a DistArray’s distribution information. It does not contain any of the array’s data, only the metadata needed to specify how the array is to be distributed. Each dimension of the array is represented by corresponding dictionary in the tuple, one per dimension. All dictionaries have a dist_type key that specifies whether the array is block, cyclic, or unstructured. The other keys in the dictionary are dependent on the dist_type key.
Block
dist_type
is'b'
.bounds
is a sequence of integers, at least two elements.The
bounds
sequence always starts with 0 and ends with the globalsize
of the array. The other elements indicate the local array global index boundaries, such that successive pairs of elements frombounds
indicates thestart
andstop
indices of the corresponding local array.comm_padding
integer, greater than or equal to zero.boundary_padding
integer, greater than or equal to zero.
These integer values indicate the communication or boundary padding, respectively, for the local arrays. Currently only a single value for both
boundary_padding
andcomm_padding
is allowed for the entire dimension.Cyclic
dist_type
is'c'
proc_grid_size
integer, greater than or equal to one.
The size of the process grid in this dimension. Equivalent to the number of local arrays in this dimension and determines the number of array sections.
size
integer, greater than or equal to zero.
The global size of the array in this dimension.
block_size
integer, optional. Greater than or equal to one.
If not present, equivalent to being present with value of one.
Unstructured
dist_type
is'u'
indices
sequence of onedimensional numpy integer arrays or buffers.The
len(indices)
is the number of local unstructured arrays in this dimension.To compute the global size of the array in this dimension, compute
sum(len(ii) for ii in indices)
.
Notdistributed
The
'n'
distribution type is a convenience to specify that an array is not distributed along this dimension.dist_type
is'n'
size
integer, greater than or equal to zero.
The global size of the array in this dimension.

classmethod
from_maps
(context, maps, targets=None)[source]¶ Create a Distribution from a sequence of Maps.
Parameters:  context (Context object) –
 maps (Sequence of Map objects) –
 targets (Sequence of int, optional) – Sequence of engine target numbers. Default: all available
Returns: Return type:

has_precise_index
¶ Does the clientside Distribution know precisely who owns all indices?
This can be used to determine whether one needs to use the checked version of __getitem__ or __setitem__ on LocalArrays.

owning_ranks
(idxs)[source]¶ Returns a list of ranks that may possibly own the location in the idxs tuple.
For many distribution types, the owning rank is precisely known; for others, it is only probably known. When the rank is precisely known, owning_ranks() returns a list of exactly one rank. Otherwise, returns a list of more than one rank.
If the idxs tuple is out of bounds, raises IndexError.

owning_targets
(idxs)[source]¶ Like owning_ranks() but returns a list of targets rather than ranks.
Convenience method meant for IPython parallel usage.


class
distarray.globalapi.maps.
MapBase
[source]¶ Bases:
object
Base class for onedimensional clientside maps.
Maps keep track of the relevant distribution information for a single dimension of a distributed array. Maps allow distributed arrays to keep track of which process to talk to when indexing and slicing.
Classes that inherit from MapBase must implement the index_owners() abstractmethod.

classmethod
from_axis_dim_dicts
(axis_dim_dicts)[source]¶ Make a Map from a sequence of processlocal dimension dictionaries.
There should be one such dimension dictionary per process.

classmethod
from_global_dim_dict
(glb_dim_dict)[source]¶ Make a Map from a global dimension dictionary.

classmethod

class
distarray.globalapi.maps.
NoDistMap
(size, grid_size)[source]¶ Bases:
distarray.globalapi.maps.MapBase

dist
= 'n'¶


class
distarray.globalapi.maps.
UnstructuredMap
(size, grid_size, indices=None)[source]¶ Bases:
distarray.globalapi.maps.MapBase

dist
= 'u'¶


distarray.globalapi.maps.
asdistribution
(context, shape_or_dist, dist=None, grid_shape=None, targets=None)[source]¶

distarray.globalapi.maps.
choose_map
(dist_type)[source]¶ Choose a map class given one of the distribution types.

distarray.globalapi.maps.
global_flat_indices
(dim_data)[source]¶ Return a list of tuples of indices into the flattened global array.
Parameters: dim_data (dimension dictionary.) – Returns: Each tuple is a (start, stop) interval into the flattened global array. All selected ranges comprise the indices for this dim_data’s subarray. Return type: list of 2tuples of ints.
random
Module¶
Contains the Random class that emulates numpy.random for DistArray.

class
distarray.globalapi.random.
Random
(context)[source]¶ Bases:
object

normal
(shape_or_dist, loc=0.0, scale=1.0)[source]¶ Draw random samples from a normal (Gaussian) distribution.
The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [2], is often called the bell curve because of its characteristic shape (see the example below).
The normal distributions occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [2].
Parameters:  loc (float) – Mean (“centre”) of the distribution.
 scale (float) – Standard deviation (spread or “width”) of the distribution.
 shape_or_dist (shape tuple or Distribution object) –
Notes
The probability density for the Gaussian distribution is
\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{  \frac{ (x  \mu)^2 } {2 \sigma^2} },\]where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.
The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at \(x + \sigma\) and \(x  \sigma\) [2]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
References
[1] Wikipedia, “Normal distribution”, http://en.wikipedia.org/wiki/Normal_distribution [2] (1, 2, 3) P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.

rand
(shape_or_dist)[source]¶ Random values over a given distribution.
Create a distarray of the given shape and propagate it with random samples from a uniform distribution over
[0, 1)
.Parameters: shape_or_dist (shape tuple or Distribution object) – Returns: out – Random values. Return type: DistArray

randint
(shape_or_dist, low, high=None)[source]¶ Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution in the “halfopen” interval [low, high). If high is None (the default), then results are from [0, low).
Parameters:  shape_or_dist (shape tuple or Distribution object) –
 low (int) – Lowest (signed) integer to be drawn from the distribution (unless
high=None
, in which case this parameter is the highest such integer).  high (int, optional) – if provided, one above the largest (signed) integer to be drawn
from the distribution (see above for behavior if
high=None
).
Returns: out – DistArray of random integers from the appropriate distribution.
Return type: DistArray of ints

randn
(shape_or_dist)[source]¶ Return samples from the “standard normal” distribution.
Parameters: shape_or_dist (shape tuple or Distribution object) – Returns: out – A DistArray of floatingpoint samples from the standard normal distribution. Return type: DistArray

seed
(seed=None)[source]¶ Seed the random number generators on each engine.
Parameters: seed (None, int, or array of integers) – Base random number seed to use on each engine. If None, then a nondeterministic seed is obtained from the operating system. Otherwise, the seed is used as passed, and the sequence of random numbers will be deterministic.
Each individual engine has its state adjusted so that it is different from each other engine. Thus, each engine will compute a different sequence of random numbers.
