python - Pick TxK numpy array from TxN numpy array using TxK column index array
This is an indirect indexing problem.
It can be solved with a list comprehension.
The question is whether, and if so how, it can be solved within numpy, when data.shape is (T, N), c.shape is (T, K), and each element of c is an int between 0 and N-1 inclusive; that is, each element of c is intended to refer to a column number of data.
The goal is to obtain out, where out.shape = (T, K) and, for each i in 0..(T-1), the row is

    out[i] = [ data[i, c[i,0]], ..., data[i, c[i,K-1]] ]
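To pin down the specification above, here is a minimal loop-based sketch. The helper name `pick_columns_reference` is hypothetical (it is not part of numpy), and the sample arrays are the ones from the example in this question. It is only meant as a slow reference against which faster approaches can be checked.

```python
import numpy as np

def pick_columns_reference(data, c):
    """Hypothetical reference helper: out[i, j] = data[i, c[i, j]]."""
    T, K = c.shape
    out = np.empty((T, K), dtype=data.dtype)
    for i in range(T):          # one pass per row of data
        for j in range(K):      # one pass per requested column
            out[i, j] = data[i, c[i, j]]
    return out

# Sample input: a 5x3 data array and a 5x2 column index array.
data = np.arange(15).reshape(5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])

out = pick_columns_reference(data, c)
print(out)
```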
Concrete example:

    data = np.array([
        [ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]])

    c = np.array([
        [0, 2],
        [1, 2],
        [0, 0],
        [1, 1],
        [2, 2]])

out should be:

    out = [[0, 2], [4, 5], [6, 6], [10, 10], [14, 14]]

The first row of out is [0, 2] because the columns chosen are given by c's row 0, which is 0 and 2, and data[0] at columns 0 and 2 is 0 and 2.
The second row of out is [4, 5] because the columns chosen are given by c's row 1, which is 1 and 2, and data[1] at columns 1 and 2 is 4 and 5.
Numpy fancy indexing doesn't seem to solve this in an obvious way, because indexing data with c directly (e.g. data[c] or np.take(data, c, axis=1)) produces a 3-dimensional array.
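A short check of the shapes involved, using the sample arrays from this question, may make the problem clearer. The variable names `by_rows`, `by_cols` and `diag` are just illustrative. Note that the wanted values are in fact present in the np.take result, but only on its "diagonal" over the first two axes, so most of the computed array is wasted.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])

# c is interpreted as ROW indices here: result shape is c.shape + (N,)
by_rows = data[c]
print(by_rows.shape)   # (5, 2, 3)

# Every row of data is indexed with every row of c: shape (T, T, K)
by_cols = np.take(data, c, axis=1)
print(by_cols.shape)   # (5, 5, 2)

# The desired rows are by_cols[i, i, :], i.e. the diagonal of the first two axes.
idx = np.arange(data.shape[0])
diag = by_cols[idx, idx]
print(diag)
```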
A list comprehension can solve it:

    out = [ [data[rowidx,i1], data[rowidx,i2]] for (rowidx, (i1,i2)) in enumerate(c) ]

If K is 2, I suppose this is marginally OK. If K is variable, this is not so good.
The list comprehension has to be rewritten for each value of K, because it unrolls the columns picked out of data by each row of c. This violates DRY.
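For what it's worth, the unrolling can be avoided while staying in pure Python by nesting a second comprehension over each row of c. This is only a sketch using the sample arrays from the question; it works for any K but still loops in Python, so it does not answer the numpy-only question.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])

# The inner comprehension walks the K column indices of one row of c,
# so nothing needs to be rewritten when K changes.
out = [[data[rowidx, col] for col in cols] for rowidx, cols in enumerate(c)]
print(np.array(out))
```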
Is there a solution based exclusively on numpy?
You can avoid loops with np.choose:

    In [1]: %cpaste
    Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
    data = np.array([
        [ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]])

    c = np.array([
        [0, 2],
        [1, 2],
        [0, 0],
        [1, 1],
        [2, 2]])
    --

    In [2]: np.choose(c, data.T[:,:,np.newaxis])
    Out[2]:
    array([[ 0,  2],
           [ 4,  5],
           [ 6,  6],
           [10, 10],
           [14, 14]])
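As a possible alternative to np.choose, here is a sketch of two other loop-free approaches, again on the sample arrays from the question. The first pairs c with an explicit column vector of row indices so that both index arrays broadcast to shape (T, K). The second uses np.take_along_axis, which assumes NumPy 1.15 or newer. The names `rows`, `out_fancy` and `out_along` are illustrative only. As far as I know, np.choose is also limited in how many choice arrays it accepts (about 32 in many NumPy versions), which could matter when N is large; these two approaches should not have that restriction.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])

# Row indices as a (T, 1) column; broadcasts against c of shape (T, K),
# so element [i, j] of the result is data[i, c[i, j]].
rows = np.arange(data.shape[0])[:, np.newaxis]
out_fancy = data[rows, c]

# Same selection via take_along_axis (assumes NumPy >= 1.15).
out_along = np.take_along_axis(data, c, axis=1)

print(out_fancy)
print(np.array_equal(out_fancy, out_along))
```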
python numpy indexing