dist Package¶

`cleanup` Module¶

distarray.dist.cleanup.cleanup(view, module_name, prefix)¶: Delete the Context object with the given name from the given module.

distarray.dist.cleanup.cleanup_all(module_name, prefix)¶: Connects to all engines and runs cleanup() on them.

distarray.dist.cleanup.clear(view)¶: Removes all distarray-related modules from engines’ sys.modules.

distarray.dist.cleanup.clear_all()¶

distarray.dist.cleanup.get_local_keys(view, prefix)¶: Returns a dictionary mapping each key name to a list of targets, for all names that start with prefix on the engines in view.
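
The shape of the mapping returned by get_local_keys() can be illustrated without a cluster. The sketch below is not distarray's implementation: plain dictionaries stand in for the engine namespaces, and `local_keys_by_prefix`, `engine_namespaces`, and the `__distarray__` prefix are hypothetical names chosen for the example.

```python
def local_keys_by_prefix(engine_namespaces, prefix):
    """Map each key name starting with `prefix` to the targets that hold it.

    `engine_namespaces` is a hypothetical stand-in for the engines in a view:
    a dict of {target_id: namespace_dict}.
    """
    keys_to_targets = {}
    for target in sorted(engine_namespaces):
        for name in engine_namespaces[target]:
            if name.startswith(prefix):
                # Record every target on which this name exists.
                keys_to_targets.setdefault(name, []).append(target)
    return keys_to_targets


# Two simulated engines; the prefix is an assumption made for illustration.
engine_namespaces = {
    0: {"__distarray__a1": 1, "unrelated": 2},
    1: {"__distarray__a1": 3, "__distarray__b2": 4},
}
result = local_keys_by_prefix(engine_namespaces, "__distarray__")
```

Here `result` maps a name present on both engines to both targets, and a name present on one engine to that target only.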

`context` Module¶

Context objects contain the information required for DistArrays to communicate with LocalArrays.

class distarray.dist.context.Context(client=None, targets=None)¶

Bases: object

Context objects manage the setup and communication of the worker processes for DistArray objects. A DistArray object has a context, and contexts have an MPI intracommunicator that they use to communicate with worker processes.

Typically there is just one context object that uses all processes, although it is possible to have more than one context with a different selection of engines.

apply(func, args=None, kwargs=None, targets=None)¶

Analogous to IPython.parallel.view.apply_sync.

Parameters:	func (function) –
	args (tuple) – positional arguments to func
	kwargs (dict) – keyword arguments to func
	targets (sequence of integers) – engines func is to be run on
Returns:	A list of the results, one from each engine.

cleanup()¶: Delete keys that this context created from all the engines.

close()¶

delete_key(key, targets=None)¶: Delete the specific key from all the engines.

empty(distribution, dtype=<type 'float'>)¶

Create an empty DistArray.

Parameters:	distribution (Distribution object) –
Returns:	A DistArray distributed as specified, with uninitialized values.
Return type:	DistArray

fromarray(arr, distribution=None)¶

Create a DistArray from an ndarray.

Parameters:	distribution (Distribution object, optional) – If a Distribution object is not provided, one is created with Distribution.from_shape(arr.shape).
Returns:	A DistArray distributed as specified, using the values and dtype from arr.
Return type:	DistArray

fromfunction(function, shape, **kwargs)¶

Create a DistArray from a function over global indices.

Unlike numpy’s fromfunction, the result of distarray’s fromfunction is restricted to the same Distribution as the index array generated from shape.

See numpy.fromfunction for more details.

fromndarray(arr, distribution=None)¶

Create a DistArray from an ndarray.

Parameters:	distribution (Distribution object, optional) – If a Distribution object is not provided, one is created with Distribution.from_shape(arr.shape).
Returns:	A DistArray distributed as specified, using the values and dtype from arr.
Return type:	DistArray

load_dnpy(name)¶

Load a distributed array from .dnpy files.

The .dnpy file format is a binary format inspired by NumPy’s .npy format. The header of a particular .dnpy file contains information about which portion of a DistArray is saved in it (using the metadata outlined in the Distributed Array Protocol), and the data portion contains the output of NumPy’s save function for the local array data. See the module docstring for distarray.local.format for full details.

Parameters:	name (str or list of str) – If a str, this is used as the prefix for the filename used by each engine. Each engine will load a file named `<name>_<rank>.dnpy`. If a list of str, each engine will use the name at the index corresponding to its rank. An exception is raised if the length of this list is not the same as the context’s communicator’s size.
Returns:	result – A DistArray encapsulating the file loaded on each engine.
Return type:	DistArray
Raises:	`TypeError` – If name is an iterable whose length is different from the context’s communicator’s size.
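
The file-naming rule for the name parameter can be sketched in plain Python. This is an illustration of the rule as documented here, not the library's code: `dnpy_filenames` and `comm_size` are hypothetical names, and the assumption that list entries are used exactly as given is ours.

```python
def dnpy_filenames(name, comm_size):
    """Return the file each rank would load, following the documented rule."""
    if isinstance(name, str):
        # A single string is a prefix: rank r loads "<name>_<r>.dnpy".
        return ["{}_{}.dnpy".format(name, rank) for rank in range(comm_size)]
    names = list(name)
    if len(names) != comm_size:
        # The documentation states that a length mismatch raises TypeError.
        raise TypeError(
            "expected {} names, got {}".format(comm_size, len(names))
        )
    # Assumption: each rank uses the entry at its own index, unchanged.
    return names


by_prefix = dnpy_filenames("output", 3)
by_list = dnpy_filenames(["a.dnpy", "b.dnpy"], 2)
```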

`decorators` Module¶

Decorators for defining functions that use DistArrays.

class distarray.dist.decorators.DecoratorBase(fn)¶

Bases: object

Base class for decorators, handles name wrapping and allows the decorator to take an optional kwarg.

determine_context(args, kwargs)¶: Determine a context from a function’s arguments.

key_and_push_args(args, kwargs, context=None, da_handler=None)¶

Push a tuple of args and a dict of kwargs to the engines. Return a tuple of the keys under which the args values are stored on the engines, and a dictionary with the same keys as kwargs whose values are the engine-side keys for the corresponding kwargs values.

This allows us to use the following interface to execute code on the engines:

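>>> def foo(*args, **kwargs):
...     # Replace the values with the keys they were pushed under.
...     args, kwargs = _key_and_push_args(args, kwargs)
...     # Interpolate the keys into the code string run on the engines.
...     exec_str = "remote_foo(*%s, **%s)"
...     exec_str %= (args, kwargs)
...     context.execute(exec_str)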
process_return_value(context, result_key)¶

Figure out what to return on the Client.

Parameters:	result_key (string) – Key corresponding to wrapped function’s return value.
Returns:	A DistArray if all the local values are DistArrays, or None if all the local values are None. Otherwise, the result is pulled back to the client and returned; if all but one of the pulled values are None, only the non-None value is returned.
Return type:	Varied
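
The rule for pulled values can be written down as a small function. The sketch below covers only the None handling described above and leaves out the DistArray case; `collapse_pulled_results` is a hypothetical name, and returning the full list in the remaining case is an assumption.

```python
def collapse_pulled_results(values):
    """Pick what to hand back to the client from per-engine return values."""
    if all(v is None for v in values):
        # Every engine returned None, so the client gets None.
        return None
    non_none = [v for v in values if v is not None]
    if len(non_none) == 1:
        # All but one engine returned None: return the single real value.
        return non_none[0]
    # Assumption: otherwise the pulled values are returned as they are.
    return values
```

For example, `collapse_pulled_results([None, 7, None])` gives back `7` rather than a three-element list.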

push_fn(context, fn_key, fn)¶: Push function to the engines.

class distarray.dist.decorators.local(fn)¶

Bases: distarray.dist.decorators.DecoratorBase

Decorator to run a function locally on the engines.

class distarray.dist.decorators.vectorize(fn)¶

Bases: distarray.dist.decorators.DecoratorBase

Analogous to numpy.vectorize. Input DistArrays must all have the same shape, which is also the shape of the output DistArray.

get_ndarray(da, arg_keys)¶

`distarray` Module¶

The DistArray data structure. `DistArray` objects are proxies for collections of LocalArray objects. They are meant to roughly emulate NumPy ndarrays.

class distarray.dist.distarray.DistArray(distribution, dtype=<type 'float'>)¶

Bases: object

context¶

dist¶

dtype¶

fill(value)¶

classmethod from_localarrays(key, context=None, targets=None, distribution=None, dtype=None)¶

The caller has already created the LocalArray objects. key is their name on the engines. This classmethod creates a DistArray that refers to these LocalArrays.

Either a context or a distribution must also be provided. If context is provided, a dim_data_per_rank will be pulled from the existing LocalArrays and a Distribution will be created from it. If distribution is provided, it should accurately reflect the distribution of the existing LocalArrays.

If dtype is not provided, it will be fetched from the engines.

get_dist_matrix()¶

get_localarrays()¶

Pull the LocalArray objects from the engines.

Returns:	one localarray per process
Return type:	list of localarrays

get_localshapes()¶

get_ndarrays()¶

Pull the local ndarrays from the engines.

Returns:	one ndarray per process
Return type:	list of ndarrays

global_size¶

grid_shape¶

itemsize¶

max(axis=None, dtype=None, out=None)¶: Return the maximum of array elements over the given axis.

mean(axis=None, dtype=<type 'float'>, out=None)¶: Return the mean of array elements over the given axis.

min(axis=None, dtype=None, out=None)¶: Return the minimum of array elements over the given axis.

nbytes¶

ndim¶

shape¶

std(axis=None, dtype=<type 'float'>, out=None)¶: Return the standard deviation of array elements over the given axis.

sum(axis=None, dtype=None, out=None)¶: Return the sum of array elements over the given axis.

targets¶

toarray()¶: Returns the distributed array as an ndarray.

tondarray()¶: Returns the distributed array as an ndarray.

var(axis=None, dtype=<type 'float'>, out=None)¶: Return the variance of array elements over the given axis.

`functions` Module¶

Distributed ufuncs for distributed arrays.

distarray.dist.functions.absolute(a, *args, **kwargs)¶

distarray.dist.functions.arccos(a, *args, **kwargs)¶

distarray.dist.functions.arccosh(a, *args, **kwargs)¶

distarray.dist.functions.arcsin(a, *args, **kwargs)¶

distarray.dist.functions.arcsinh(a, *args, **kwargs)¶

distarray.dist.functions.arctan(a, *args, **kwargs)¶

distarray.dist.functions.arctanh(a, *args, **kwargs)¶

distarray.dist.functions.conjugate(a, *args, **kwargs)¶

distarray.dist.functions.cos(a, *args, **kwargs)¶

distarray.dist.functions.cosh(a, *args, **kwargs)¶

distarray.dist.functions.exp(a, *args, **kwargs)¶

distarray.dist.functions.expm1(a, *args, **kwargs)¶

distarray.dist.functions.invert(a, *args, **kwargs)¶

distarray.dist.functions.log(a, *args, **kwargs)¶

distarray.dist.functions.log10(a, *args, **kwargs)¶

distarray.dist.functions.log1p(a, *args, **kwargs)¶

distarray.dist.functions.negative(a, *args, **kwargs)¶

distarray.dist.functions.reciprocal(a, *args, **kwargs)¶

distarray.dist.functions.rint(a, *args, **kwargs)¶

distarray.dist.functions.sign(a, *args, **kwargs)¶

distarray.dist.functions.sin(a, *args, **kwargs)¶

distarray.dist.functions.sinh(a, *args, **kwargs)¶

distarray.dist.functions.sqrt(a, *args, **kwargs)¶

distarray.dist.functions.square(a, *args, **kwargs)¶

distarray.dist.functions.tan(a, *args, **kwargs)¶

distarray.dist.functions.tanh(a, *args, **kwargs)¶

distarray.dist.functions.add(a, b, *args, **kwargs)¶

distarray.dist.functions.arctan2(a, b, *args, **kwargs)¶

distarray.dist.functions.bitwise_and(a, b, *args, **kwargs)¶

distarray.dist.functions.bitwise_or(a, b, *args, **kwargs)¶

distarray.dist.functions.bitwise_xor(a, b, *args, **kwargs)¶

distarray.dist.functions.divide(a, b, *args, **kwargs)¶

distarray.dist.functions.floor_divide(a, b, *args, **kwargs)¶

distarray.dist.functions.fmod(a, b, *args, **kwargs)¶

distarray.dist.functions.hypot(a, b, *args, **kwargs)¶

distarray.dist.functions.left_shift(a, b, *args, **kwargs)¶

distarray.dist.functions.mod(a, b, *args, **kwargs)¶

distarray.dist.functions.multiply(a, b, *args, **kwargs)¶

distarray.dist.functions.power(a, b, *args, **kwargs)¶

distarray.dist.functions.remainder(a, b, *args, **kwargs)¶

distarray.dist.functions.right_shift(a, b, *args, **kwargs)¶

distarray.dist.functions.subtract(a, b, *args, **kwargs)¶

distarray.dist.functions.true_divide(a, b, *args, **kwargs)¶

distarray.dist.functions.less(a, b, *args, **kwargs)¶

distarray.dist.functions.less_equal(a, b, *args, **kwargs)¶

distarray.dist.functions.equal(a, b, *args, **kwargs)¶

distarray.dist.functions.not_equal(a, b, *args, **kwargs)¶

distarray.dist.functions.greater(a, b, *args, **kwargs)¶

distarray.dist.functions.greater_equal(a, b, *args, **kwargs)¶

`ipython_utils` Module¶

The single IPython entry point.

`maps` Module¶

Distribution class and auxiliary ClientMap classes.

The Distribution is a multi-dimensional map class that manages the one-dimensional maps for each DistArray dimension. The Distribution class represents the distribution information for a distributed array, independent of the distributed array’s data. Distributions allow DistArrays to reduce overall communication when indexing and slicing by determining which processes own (or may possibly own) the indices in question. Two DistArray objects can share the same Distribution if they have the exact same distribution.

The one-dimensional ClientMap classes keep track of which process owns which index in that dimension. This class has several subclasses for specific distribution types, including BlockMap, CyclicMap, NoDistMap, and UnstructuredMap.
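
To make the idea of index ownership concrete, here is a minimal sketch of how a block map and a cyclic map might assign global indices to processes in one dimension. The functions `block_owner` and `cyclic_owner` are hypothetical, and the ceiling-division block size is an assumption; the library's maps may partition indices differently.

```python
def block_owner(idx, size, grid_size):
    """Owner of `idx` when `size` indices are split into contiguous blocks."""
    if not 0 <= idx < size:
        raise IndexError("index {} out of bounds for size {}".format(idx, size))
    block = -(-size // grid_size)  # ceiling division: indices per process
    return idx // block


def cyclic_owner(idx, size, grid_size):
    """Owner of `idx` when indices are dealt out to processes round-robin."""
    if not 0 <= idx < size:
        raise IndexError("index {} out of bounds for size {}".format(idx, size))
    return idx % grid_size


# Ten indices over four processes.
block_owners = [block_owner(i, 10, 4) for i in range(10)]
cyclic_owners = [cyclic_owner(i, 10, 4) for i in range(10)]
```

With a block layout neighbouring indices tend to share an owner, while a cyclic layout spreads them across all processes.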

class distarray.dist.maps.BlockCyclicMap(size, grid_size, block_size=1)¶

Bases: distarray.dist.maps.MapBase

dist = 'c'¶

classmethod from_axis_dim_dicts(axis_dim_dicts)¶

classmethod from_global_dim_dict(glb_dim_dict)¶

get_dimdicts()¶

owners(idx)¶

class distarray.dist.maps.BlockMap(size, grid_size)¶

Bases: distarray.dist.maps.MapBase

dist = 'b'¶

classmethod from_axis_dim_dicts(axis_dim_dicts)¶

classmethod from_global_dim_dict(glb_dim_dict)¶

get_dimdicts()¶

owners(idx)¶

class distarray.dist.maps.Distribution(context, global_dim_data, targets=None)¶

Bases: object

Governs the mapping between global indices and process ranks for multi-dimensional objects.

classmethod from_dim_data_per_rank(context, dim_data_per_rank, targets=None)¶: Create a Distribution from a sequence of dim_data tuples.

classmethod from_shape(context, shape, dist=None, grid_shape=None, targets=None)¶

get_dim_data_per_rank()¶

has_precise_index¶

Does the client-side Distribution know precisely who owns all indices?

This can be used to determine whether one needs to use the checked version of __getitem__ or __setitem__ on LocalArrays.

is_compatible(o)¶

owning_ranks(idxs)¶

Returns a list of ranks that may possibly own the location in the idxs tuple.

For many distribution types, the owning rank is precisely known; for others, it is only probably known. When the rank is precisely known, owning_ranks() returns a list of exactly one rank. Otherwise, returns a list of more than one rank.

If the idxs tuple is out of bounds, raises IndexError.
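
One way to picture owning_ranks() is as a combination of the per-dimension owner lists. The sketch below assumes ranks are numbered in row-major order over the process grid, which the documentation above does not state; `ranks_from_owners` and its arguments are hypothetical names.

```python
import itertools


def ranks_from_owners(per_dim_owners, grid_shape):
    """Combine per-dimension candidate owners into flat ranks (row-major)."""
    ranks = []
    # Every combination of per-dimension owners is a candidate grid position.
    for coords in itertools.product(*per_dim_owners):
        rank = 0
        for coord, extent in zip(coords, grid_shape):
            rank = rank * extent + coord
        ranks.append(rank)
    return ranks


# Precisely known owner in both dimensions: exactly one rank.
precise = ranks_from_owners([[1], [0]], (2, 3))
# Two possible owners in the first dimension: more than one rank.
imprecise = ranks_from_owners([[0, 1], [2]], (2, 3))
```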

owning_targets(idxs)¶

Like owning_ranks() but returns a list of targets rather than ranks.

Convenience method meant for IPython parallel usage.

reduce(axes)¶: Returns a new Distribution reduced along axis, i.e., the new distribution has one fewer dimension than self.

class distarray.dist.maps.MapBase¶

Bases: object

Base class for one-dimensional client-side maps.

Maps keep track of the relevant distribution information for a single dimension of a distributed array. Maps allow distributed arrays to keep track of which process to talk to when indexing and slicing.

Classes that inherit from MapBase must implement the owners() abstractmethod.

is_compatible(map)¶

owners(idx)¶

Returns a list of process IDs in this dimension that might possibly own idx.

Raises IndexError if idx is out of bounds.

class distarray.dist.maps.NoDistMap(size, grid_size)¶

Bases: distarray.dist.maps.MapBase

dist = 'n'¶

classmethod from_axis_dim_dicts(axis_dim_dicts)¶

classmethod from_global_dim_dict(glb_dim_dict)¶

get_dimdicts()¶

owners(idx)¶

class distarray.dist.maps.UnstructuredMap(size, grid_size, indices=None)¶

Bases: distarray.dist.maps.MapBase

dist = 'u'¶

classmethod from_axis_dim_dicts(axis_dim_dicts)¶

classmethod from_global_dim_dict(glb_dim_dict)¶

get_dimdicts()¶

owners(idx)¶

distarray.dist.maps.choose_map(dist_type)¶: Choose a map class given one of the distribution types.

distarray.dist.maps.map_from_global_dim_dict(global_dim_dict)¶: Given a global_dim_dict, return the corresponding map.

distarray.dist.maps.map_from_sizes(size, dist_type, grid_size)¶: Returns an instance of the appropriate subclass of MapBase.

`random` Module¶

Emulate numpy.random

class distarray.dist.random.Random(context)¶

Bases: object

normal(distribution, loc=0.0, scale=1.0)¶

Draw random samples from a normal (Gaussian) distribution.

The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [2], is often called the bell curve because of its characteristic shape.

The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [2].

Parameters:	loc (float) – Mean (“centre”) of the distribution.
	scale (float) – Standard deviation (spread or “width”) of the distribution.

Notes

The probability density for the Gaussian distribution is

\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]

where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.

The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at \(\mu + \sigma\) and \(\mu - \sigma\) [2]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
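
The 0.607 figure can be checked numerically from the density formula. The snippet below is a small standalone check using only the standard library; `normal_pdf` is a helper written for this example, and the values of `mu` and `sigma` are arbitrary.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian probability density with mean `mu` and std. deviation `sigma`."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


mu, sigma = 2.0, 0.5
peak = normal_pdf(mu, mu, sigma)
# Ratio of the density one standard deviation from the mean to the peak.
ratio_above = normal_pdf(mu + sigma, mu, sigma) / peak
ratio_below = normal_pdf(mu - sigma, mu, sigma) / peak
```

Both ratios equal exp(-1/2), which rounds to 0.607 regardless of the chosen mean and standard deviation.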

References

[1]	Wikipedia, “Normal distribution”, http://en.wikipedia.org/wiki/Normal_distribution

[2]	P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.

rand(distribution)¶

Random values over a given distribution.

Create a DistArray with the given distribution and populate it with random samples from a uniform distribution over [0, 1).

Returns:	out – Random values.
Return type:	DistArray

randint(distribution, low, high=None)¶

Return random integers from low (inclusive) to high (exclusive).

Return random integers from the “discrete uniform” distribution in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).

Parameters:	distribution (Distribution object) –
	low (int) – Lowest (signed) integer to be drawn from the distribution (unless `high=None`, in which case this parameter is the highest such integer).
	high (int, optional) – if provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if `high=None`).
Returns:	out – DistArray of random integers from the appropriate distribution.
Return type:	DistArray of ints
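
The interplay between low and high is easy to get wrong, so here is a minimal sketch of the interval rule using the standard library's generator. `randint_bounds` is a hypothetical helper; it models only the bounds logic, not distributed sampling.

```python
import random


def randint_bounds(low, high=None):
    """Return the half-open interval [start, stop) that samples come from."""
    if high is None:
        # With no upper bound given, `low` becomes the exclusive upper bound.
        return 0, low
    return low, high


rng = random.Random(0)
start, stop = randint_bounds(5)
# randrange also samples from a half-open interval, like the rule above.
samples = [rng.randrange(start, stop) for _ in range(1000)]
```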

randn(distribution)¶

Return samples from the “standard normal” distribution.

Returns:	out – A DistArray of floating-point samples from the standard normal distribution.
Return type:	DistArray

seed(seed=None)¶

Seed the random number generators on each engine.

Parameters:

seed (None, int, or array of integers) –

Base random number seed to use on each engine. If None, then a non-deterministic seed is obtained from the operating system. Otherwise, the seed is used as passed, and the sequence of random numbers will be deterministic.

Each individual engine has its state adjusted so that it is different from each other engine. Thus, each engine will compute a different sequence of random numbers.
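
The combination of a shared base seed and distinct per-engine sequences can be sketched as follows. How distarray adjusts each engine's state is not specified here, so mixing the engine's rank into the seed is purely an assumption for illustration; `engine_generators` is a hypothetical name.

```python
import random


def engine_generators(seed, n_engines):
    """One generator per engine, each seeded differently from a base seed."""
    if seed is None:
        # Non-deterministic: each generator seeds itself from the OS.
        return [random.Random() for _ in range(n_engines)]
    # Hypothetical adjustment: mix the engine's rank into the base seed.
    return [
        random.Random("{}-{}".format(seed, rank)) for rank in range(n_engines)
    ]


# The same base seed reproduces the same per-engine draws.
first = [g.random() for g in engine_generators(42, 4)]
again = [g.random() for g in engine_generators(42, 4)]
```

Running with the same seed twice gives identical draws engine by engine, while the engines differ from one another within a run.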
