Wednesday 15 September 2010

Python - reassigning expanded recarray field




I am loading a file into a numpy recarray and subsequently filling in known gaps with NaNs. However, I cannot find a way to increase the size of a field in the recarray in order to reassign the array with the filled gaps. The illustration of the problem (given below) throws a ValueError about broadcasting a larger shape to a smaller one.

Using Python 2.7.6.1, numpy 1.8.1-6.

Thanks, Rob

    import numpy as np
    import numpy.ma as ma

    a1 = np.arange(0, 20, 1)
    a2 = np.arange(100, 120, 1)
    x = np.recarray((20,), dtype=[('g', float), ('h', int)])
    x['g'][:] = a1
    x['h'][:] = a2

    for afield in x.dtype.names:
        y = x[afield].copy(order='K')
        for icnt in range(0, 3):
            # insert three NaNs at position 5, so y grows from 20 to 23 items
            y = np.insert(y, 5, np.nan, axis=0)
        ma.resize(x[afield], (len(y),))
        x[afield][:] = y[:]  # raises the ValueError: 23 items into 20 slots

You did not "expand" the recarray x. Recarrays cannot be expanded per label (name/column), as you were hoping to do with ma.resize. Note that ma.resize returns a new (masked) array with the new shape without altering the arrays passed to it, and in your code you are not using the return value, so that line doesn't do anything. To clarify:

x[afield] = ma.resize(x[afield], (len(y),) )

would not work either, because record arrays cannot be expanded per label ('column'). If you want to expand a recarray, you'll need to do it in one go (with functions from np.lib.recfunctions): either add an entirely new column, or add several new records to the existing columns.
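To make the two points above concrete, here is a minimal sketch, not the asker's actual data pipeline. It uses a plain structured array instead of np.recarray (view it with .view(np.recarray) if attribute access is wanted). The sentinel value -1 for the int column and the new column name 'k' are assumptions made only for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Structured array with the same two fields as in the question.
x = np.zeros(20, dtype=[('g', float), ('h', int)])
x['g'] = np.arange(0, 20)
x['h'] = np.arange(100, 120)

# ma.resize returns a NEW array; the field it was given keeps its length.
resized = np.ma.resize(x['g'], (23,))

# Adding records "in one go": build the gap rows with the full dtype,
# then concatenate whole records rather than resizing a single field.
gap = np.zeros(3, dtype=x.dtype)
gap['g'] = np.nan
gap['h'] = -1  # hypothetical sentinel: an int field cannot hold NaN
expanded = np.concatenate([x[:5], gap, x[5:]])

# Adding an entirely new column with recfunctions ('k' is a made-up name).
with_k = rfn.append_fields(x, 'k', np.arange(20.0), usemask=False)
```

Both concatenate and append_fields return new arrays, so the result has to be bound to a name. Neither changes x in place.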

That being said, why not try this:

    >>> y = np.arange(20, dtype=float)
    >>> ynan = np.insert(y, (5,)*3, (np.nan,)*3)
    >>> x = np.rec.fromarrays([ynan, ynan+100], names='g,h')
    >>> x
    rec.array([(0.0, 100.0), (1.0, 101.0), (2.0, 102.0), (3.0, 103.0),
           (4.0, 104.0), (nan, nan), (nan, nan), (nan, nan), (5.0, 105.0),
           (6.0, 106.0), (7.0, 107.0), (8.0, 108.0), (9.0, 109.0),
           (10.0, 110.0), (11.0, 111.0), (12.0, 112.0), (13.0, 113.0),
           (14.0, 114.0), (15.0, 115.0), (16.0, 116.0), (17.0, 117.0),
           (18.0, 118.0), (19.0, 119.0)],
          dtype=[('g', '<f8'), ('h', '<f8')])

Note that you cannot convert the second column (label 'h') to int, because np.nan is a floating point value. If you tried, you'd get garbage:

    >>> x['h'].astype(int)
    array([                 100,                  101,                  102,
                            103,                  104, -9223372036854775808,
           -9223372036854775808, -9223372036854775808,                  105,
                            106,                  107,                  108,
                            109,                  110,                  111,
                            112,                  113,                  114,
                            115,                  116,                  117,
                            118,                  119])
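If an integer column is needed without masking, one option is to replace the NaNs with an explicit placeholder before casting. This is only a sketch: the SENTINEL value is a hypothetical choice, and it must not collide with real data.

```python
import numpy as np

h = np.array([100.0, np.nan, 102.0])

SENTINEL = -1  # hypothetical placeholder for "missing"; pick one that suits the data
gaps = np.isnan(h)

# Replace NaN while the data is still float, then cast; no NaN reaches astype.
h_int = np.where(gaps, SENTINEL, h).astype(int)
```

The gaps array keeps track of which entries were missing, which is essentially what a masked array does for you.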

I think what you're after is masked record arrays:

    >>> import numpy.ma.mrecords as mrecords
    >>>
    >>> x = np.rec.fromarrays([ynan, (ynan+100).astype(int)], names='g,h')
    >>> z = np.ma.array(x, mask=np.isnan(ynan))
    >>> z2 = z.view(mrecords.mrecarray)
    >>>
    >>> z2
    masked_records(
        g : [0.0 1.0 2.0 3.0 4.0 -- -- -- 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0
     14.0 15.0 16.0 17.0 18.0 19.0]
        h : [100 101 102 103 104 -- -- -- 105 106 107 108 109 110 111 112 113 114
     115 116 117 118 119]
        fill_value : (1e+20, 999999)
                  )
    >>>
    >>> z2['h']
    masked_array(data = [100 101 102 103 104 -- -- -- 105 106 107 108 109 110 111
     112 113 114 115 116 117 118 119],
                 mask = [False False False False False  True  True  True False False
     False False False False False False False False False False False False False],
           fill_value = 999999)

As you can see, the "columns" of z2 have the desired dtypes (float and int), are accessible by their column names, and have some of their data masked.
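A short sketch of how such masked columns behave in practice. It stays with a plain structured masked array rather than the mrecords view, and it fills the int column's gap slots with a placeholder 0 up front (an assumption made here so that no NaN is ever cast to int); the mask is what marks those slots as missing.

```python
import numpy as np
import numpy.ma as ma

# Float column with three NaN gaps inserted at position 5.
g = np.insert(np.arange(20, dtype=float), (5,) * 3, [np.nan] * 3)
# Int column with placeholder zeros in the same slots (assumption, see above).
h = np.insert(np.arange(100, 120), (5,) * 3, [0] * 3)

x = np.empty(g.shape, dtype=[('g', float), ('h', int)])
x['g'] = g
x['h'] = h

# Mask every record whose 'g' value is NaN.
z = ma.array(x, mask=np.isnan(g))

n_valid = z['h'].count()        # number of unmasked entries
h_filled = z['h'].filled(-1)    # masked slots replaced by -1, dtype stays int
h_valid = z['h'].compressed()   # only the unmasked values
g_mean = z['g'].mean()          # masked entries are ignored
```

Reductions such as mean skip the masked entries, so the gaps do not turn the result into NaN.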

python numpy recarray
