MATLAB - Multiplication of corresponding 2D slices of two arrays and inversion of array slices
I have two arrays, a and b, of the same dimension 1000 x 3 x 20 x 20. I want to generate a third array c of dimension 3 x 3 x 20 x 20 that is the outcome of matrix multiplication of the corresponding slices of a and b, i.e. c(:,:,i,j) = a(:,:,i,j)'*b(:,:,i,j). I then need to transform array c into a new array d by inverting the corresponding 3 x 3 matrices, i.e. d(:,:,i,j) = inv(c(:,:,i,j)). Again, it's clear how to do this with loops. Is there a way to avoid looping over the 400 items?
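For reference, the loop version mentioned in the question can be sketched as follows. The sizes n1 and m are hypothetical and much smaller than in the question, chosen only so the sketch runs quickly; the variable names follow the question.

```matlab
% Hypothetical small sizes for illustration (the question uses 1000 x 3 x 20 x 20)
rng(0);
n1 = 10; m = 4;
a = rand(n1,3,m,m);
b = rand(n1,3,m,m);

c = zeros(3,3,m,m);
d = zeros(3,3,m,m);
for ii = 1:m
    for jj = 1:m
        c(:,:,ii,jj) = a(:,:,ii,jj)' * b(:,:,ii,jj);   % 3 x 3 product of the two slices
        d(:,:,ii,jj) = inv(c(:,:,ii,jj));              % inverse of that 3 x 3 slice
    end
end
```

Any vectorized solution should reproduce these c and d up to floating-point round-off, so this loop also serves as a correctness check.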
Edit: Benchmarking code to compare the performance of the different solutions -
%// inputs
n1 = 50;
n2 = 200;
a = rand(n1,3,n2,n2);
b = rand(n1,3,n2,n2);

%// a. cpu loopy code
tic
c = zeros(3,3,n2,n2);
for ii = 1:n2
    for jj = 1:n2
        c(:,:,ii,jj) = a(:,:,ii,jj)'*b(:,:,ii,jj); %//'
    end
end
toc

%// b. vectorized code (using squeeze)
tic
c1 = squeeze(sum(bsxfun(@times,permute(a,[2 1 5 3 4]),permute(b,[5 1 2 3 4])),2));
toc

%// c. vectorized code (avoiding squeeze)
tic
c2 = sum(bsxfun(@times,permute(a,[2 5 3 4 1]),permute(b,[5 2 3 4 1])),5);
toc

%// d. gpu vectorized code
tic
a = gpuArray(a);
b = gpuArray(b);
c3 = sum(bsxfun(@times,permute(a,[2 5 3 4 1]),permute(b,[5 2 3 4 1])),5);
c3 = gather(c3);
toc
Runtime results -
Elapsed time is 0.287511 seconds.
Elapsed time is 0.250663 seconds.
Elapsed time is 0.337628 seconds.
Elapsed time is 1.259207 seconds.
Code
%// part - 1
c = sum(bsxfun(@times,permute(a,[2 5 3 4 1]),permute(b,[5 2 3 4 1])),5);

%// part - 2: use the MATLAB File Exchange tool multinv
d = multinv(c);
The function code for multinv is available here, and it claims to be pretty efficient.
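If pulling in a File Exchange function is not an option, the 3 x 3 inverses can also be written out in closed form using the adjugate and the determinant. This is a swapped-in technique, not how multinv works internally, and the input below is a hypothetical, well-conditioned stack made up for illustration. For badly conditioned slices the closed form may be less accurate than inv, so treat it as a sketch.

```matlab
% Hypothetical input: a small stack of diagonally dominant (hence invertible) 3x3 matrices
rng(0);
c = rand(3,3,4,5) + 3*repmat(eye(3), [1 1 4 5]);

sz = size(c);
c3 = reshape(c, 3, 3, []);     % flatten the trailing dimensions: 3 x 3 x K
e = @(i,j) c3(i,j,:);          % entry (i,j) of every slice, as a 1 x 1 x K array

% Entries of the adjugate (transposed cofactor matrix) of every slice
d11 = e(2,2).*e(3,3) - e(2,3).*e(3,2);
d12 = e(1,3).*e(3,2) - e(1,2).*e(3,3);
d13 = e(1,2).*e(2,3) - e(1,3).*e(2,2);
d21 = e(2,3).*e(3,1) - e(2,1).*e(3,3);
d22 = e(1,1).*e(3,3) - e(1,3).*e(3,1);
d23 = e(1,3).*e(2,1) - e(1,1).*e(2,3);
d31 = e(2,1).*e(3,2) - e(2,2).*e(3,1);
d32 = e(1,2).*e(3,1) - e(1,1).*e(3,2);
d33 = e(1,1).*e(2,2) - e(1,2).*e(2,1);

% Determinant of every slice, by cofactor expansion along the first row
dt = e(1,1).*d11 + e(1,2).*d21 + e(1,3).*d31;

% Assemble the 3 x 3 x K adjugate, divide by the determinants, restore the shape
adj = cat(1, cat(2,d11,d12,d13), cat(2,d21,d22,d23), cat(2,d31,d32,d33));
d = reshape(bsxfun(@rdivide, adj, dt), sz);
```

The result can be compared slice by slice against inv(c(:,:,i,j)) to confirm that it agrees up to round-off on your own data.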
For the first part, you can also try this out -
c = squeeze(sum(bsxfun(@times,permute(a,[2 1 5 3 4]),permute(b,[5 1 2 3 4])),2));
This one seems to re-arrange the elements not as "disruptively" as the one mentioned in the code above, but the downside is that it needs squeeze, which might slow things down a bit. I will leave it at that, and encourage you to benchmark both and select the better one.
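One way to run that comparison is with timeit, which tends to give steadier numbers than a single tic/toc pair. The sizes below are hypothetical and smaller than in the benchmark, and the handle names f_sq and f_nosq are made up for this sketch.

```matlab
% Hypothetical sizes, chosen so the comparison runs quickly
n1 = 50; m = 20;
a = rand(n1,3,m,m);
b = rand(n1,3,m,m);

% Variant that uses squeeze
f_sq = @() squeeze(sum(bsxfun(@times, permute(a,[2 1 5 3 4]), permute(b,[5 1 2 3 4])), 2));
% Variant that avoids squeeze
f_nosq = @() sum(bsxfun(@times, permute(a,[2 5 3 4 1]), permute(b,[5 2 3 4 1])), 5);

% Both variants should agree up to floating-point round-off
max_diff = max(abs(reshape(f_sq() - f_nosq(), [], 1)));

% timeit calls each handle several times and returns a typical run time in seconds
t_sq = timeit(f_sq);
t_nosq = timeit(f_nosq);
fprintf('squeeze: %.6f s, no squeeze: %.6f s, max diff: %.3g\n', t_sq, t_nosq, max_diff);
```

Which variant wins is likely to depend on the array sizes and the MATLAB release, so it is worth rerunning this with your real dimensions.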
bsxfun + GPU?
I have increased the loop limits to get a real test between the loopy code and the vectorized code. So, here is the modified code for part 1 -
%// inputs
n1 = 50;
n2 = 200;
a = rand(n1,3,n2,n2);
b = rand(n1,3,n2,n2);

%// a. cpu loopy code
tic
c = zeros(3,3,n2,n2);
for ii = 1:n2
    for jj = 1:n2
        c(:,:,ii,jj) = a(:,:,ii,jj)'*b(:,:,ii,jj); %//'
    end
end
toc

%// b. gpu vectorized code
tic
a = gpuArray(a);
b = gpuArray(b);
c1 = sum(bsxfun(@times,permute(a,[2 5 3 4 1]),permute(b,[5 2 3 4 1])),5);
c1 = gather(c1);
toc
The runtime results on my system -
Elapsed time is 0.310056 seconds.
Elapsed time is 0.172499 seconds.
So, you see!
arrays matlab vectorization matrix-multiplication