On Wednesday 14 May 2008 02:18:06 Christopher Burns wrote:

> I'm finding it difficult to tell which methods/operations respect the

> mask and which do not, in masked arrays.

Christopher,

Unfortunately, there's no tutorial yet. Perhaps you could get one started on

the scipy wiki? I'm afraid I won't have time to do it myself, but I'd be

more than happy to fill in the gaps.

To answer some of your questions:

>>>import numpy as np, numpy.ma as ma

>>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0])

* If you want to access the underlying data directly, these two commands are

(almost) equivalent [1]:

>>>mydata._data

>>>mydata.view(np.ndarray)

Note that you lose the mask information, and that the values that were masked

can be bogus.
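
As a rough illustration of the point above, here is a small sketch. It assumes a current NumPy release, where the public .data attribute can be used as an alias for ._data; the names raw and plain are only for the example. Both forms return a plain ndarray that is a view on the same buffer, so the mask is dropped but writes still reach mydata:

```python
import numpy as np
import numpy.ma as ma

mydata = ma.array([0, 1, 2, 3, 4, 5], mask=[1, 0, 1, 0, 1, 0])

# Assumption: .data is the public spelling of ._data in current NumPy.
raw = mydata.data
plain = mydata.view(np.ndarray)

# Both are plain ndarrays: the mask information is gone.
print(type(raw).__name__, type(plain).__name__)

# Both are views, not copies, so writing through them changes mydata too.
raw[1] = 10
print(mydata[1], plain[1])
```

If you need an independent array, take an explicit copy (or use .filled, described next).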

* If you want to get a copy of the underlying data with masked values set

to "myvalue", use .filled(myvalue).

>>>mydata.filled(-999)

array([-999, 1, -999, 3, -999, 5])

If you don't use any argument, ".filled" uses the "fill_value" attribute,

whose value depends on the dtype:

>>>mydata.fill_value

999999

>>>mydata.filled()

array([999999, 1, 999999, 3, 999999, 5])
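
If the default is not what you want, one option is to choose the fill value when the array is created. The sketch below assumes a current NumPy release and uses the fill_value keyword of ma.array; default_fill and explicit_fill are just illustrative names. It also shows that .filled returns a plain ndarray and leaves the original mask alone:

```python
import numpy as np
import numpy.ma as ma

# fill_value is chosen at construction time instead of relying on the default.
mydata = ma.array([0, 1, 2, 3, 4, 5], mask=[1, 0, 1, 0, 1, 0], fill_value=-1)

default_fill = mydata.filled()        # uses mydata.fill_value
explicit_fill = mydata.filled(-999)   # an explicit argument takes precedence

print(default_fill)
print(explicit_fill)

# .filled returns a new plain ndarray; the mask of mydata is unchanged.
print(type(default_fill).__name__, mydata.mask)
```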

Note that the argument of ".filled" is cast to the dtype of mydata.

>>>mydata.dtype

dtype('int64')

>>>mydata.filled(np.pi)

array([3, 1, 3, 3, 3, 5])

That can be a problem if you wanted to use NaNs as filling values (a bad idea

in itself):

>>>mydata.filled(np.nan)

array([0, 1, 0, 3, 0, 5])

Here, you don't have the NaNs you expected because NaNs are for floats, not

integers.
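
If you really do need NaNs (say, for a library that knows nothing about masks), one possible workaround is to convert to a float dtype before filling. This is only a sketch of that idea, not a recommendation; as_float and with_nans are names made up for the example:

```python
import numpy as np
import numpy.ma as ma

mydata = ma.array([0, 1, 2, 3, 4, 5], mask=[1, 0, 1, 0, 1, 0])

# Convert to float first: astype keeps the mask, and floats can hold NaN.
as_float = mydata.astype(float)
with_nans = as_float.filled(np.nan)

print(with_nans.dtype)
print(np.isnan(with_nans))
```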

* Because masked arrays inherit from ndarrays, there's also a "fill" method

available: this one acts directly on the ._data part, but setting all the

values at once. The mask is preserved.

>>>mydata.fill(-999)

>>>print mydata

[-- -999 -- -999 -- -999]

You could achieve the same result with this command

>>>mydata.flat = -999

* Assigning a value to a slice of mydata will modify the mask:

>>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0])

>>>mydata[:2] = -999

>>>print mydata

[-999 -999 -- 3 -- 5]

>>>mydata[-2:] = ma.masked

>>>print mydata

[-999 -999 -- 3 -- --]

* If you want to make sure you don't unmask data by mistake with slice

assignments, set the ._hardmask attribute to True (it is set to False by

default)

>>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0], hard_mask=True)

>>>mydata[:2] = -999

>>>print mydata

[-- -999 -- 3 -- 5]

You can change the value of ._hardmask either directly, or with the

soften_mask() and harden_mask() methods
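
Here is a short sketch of toggling between the two modes with those methods, assuming a current NumPy release. The names hard_result and soft_result are only for illustration, and ma.getmaskarray is used to read the mask back as a full boolean array:

```python
import numpy.ma as ma

mydata = ma.array([0, 1, 2, 3, 4, 5], mask=[1, 0, 1, 0, 1, 0])

# Hard mask: slice assignment cannot unmask anything.
mydata.harden_mask()
mydata[:2] = -999   # element 0 stays masked, element 1 is overwritten
hard_result = ma.getmaskarray(mydata).tolist()

# Soft mask (the default): the same assignment also unmasks element 0.
mydata.soften_mask()
mydata[:2] = -999
soft_result = ma.getmaskarray(mydata).tolist()

print(hard_result)
print(soft_result)
```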

*

> Basic methods respect the mask, like mydata.mean(), but np.asarray

> ignores the mask.

Yes, np.asarray(x) is equivalent to np.array(x, copy=False, subok=False). If

you want to keep the mask, use np.asanyarray, which is equivalent to

np.array(x, copy=False, subok=True) [2]

>>>mydata = ma.array([0,1,2,3,4,5], mask=[1,0,1,0,1,0])

>>>print mydata.mean()

3.0

>>>print np.asarray(mydata).mean()

2.5

>>>print np.asanyarray(mydata).mean()

3.0

>>>print np.mean(mydata)

3.0

On the last command, np.mean(mydata) first tries to access the .mean method of

mydata: if mydata had no such method, it would be equivalent to

np.asarray(mydata).mean()
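
To make the difference concrete, here is a small sketch comparing the conversions directly (dropped and kept are illustrative names; it assumes a current NumPy release):

```python
import numpy as np
import numpy.ma as ma

mydata = ma.array([0, 1, 2, 3, 4, 5], mask=[1, 0, 1, 0, 1, 0])

dropped = np.asarray(mydata)    # plain ndarray: the mask is discarded
kept = np.asanyarray(mydata)    # subclasses pass through: still masked

print(type(dropped).__name__, dropped.mean())   # mean over all six values
print(type(kept).__name__, kept.mean())         # mean over unmasked values only
print(np.mean(mydata))                          # delegates to mydata.mean()
```

So a function that calls np.asarray on its input will silently ignore the mask, while one that calls np.asanyarray will respect it.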

Hope it helps, don't hesitate to ask for more details/explanations. Specific

examples are always easier.

I'm looking forward to your wiki page ;)

P.

[1] Almost: mydata._data is in fact a shortcut to

mydata.view(mydata._baseclass), where ._baseclass is the class of the

underlying data. For example

>>>mxdata=ma.array(np.matrix([[1,2,],[3,4,]]),mask=[[1,0],[0,0]])

>>>print mxdata._baseclass

<class 'numpy.core.defmatrix.matrix'>

>>>print type(mxdata._data)

<class 'numpy.core.defmatrix.matrix'>

>>>print type(mxdata.view(np.ndarray))

<type 'numpy.ndarray'>

[2] Note that np.asanyarray returns a masked array in numpy.ma only, not in

previous implementations.
