x86 - Accessing an array of structs in assembly
I have a 455-byte array that contains 13 info structures of 35 bytes each.
turntreebuff: resb 455 ; turnnum (1 byte) + mt (16 bytes) + pm (16 bytes)
                       ; + boardstate (2 bytes) = 35 bytes * 13 turns = 455 bytes
I initially thought I could access a particular info structure by taking the index, multiplying it by 35, and adding it to turntreebuff. But the only valid scale factors are 1, 2, 4, and 8.
So this:
mov word [turntreebuff+ebx*35],ax ; doesn't work
I can do it by copying the index to another register, shifting the original index left by 5 bits (times 32), and adding the copy to the shifted value 3 times (32 + 3 = 35). But that seems cumbersome. Using the mul instruction seems cumbersome too. Would I be better off padding the structure to 64 bytes, so its size is a power of 2? That seems wasteful as well.
Is there a better way?
(Using NASM 2.10 to assemble a 32-bit binary.)
That * 35 will indeed not assemble:
mov word [turntreebuff+ebx*35],ax ; doesn't work
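The processor only supports powers of 2 as scale factors, and only the first few: x1, x2, x4, and x8. For any other structure size you have to fall back on a multiplication.
Note that mul/imul are fast on modern processors (a few cycles of latency, though still more than a single add or sub), so you shouldn't worry too much about using them. If a simple shift would work for your structure (i.e. 64 bytes, as you mentioned), then using a shift is better. (The mul/imul may cost more if you use the result in the very next instruction, because that instruction has to wait for it.)
Finally, a multi-byte mov at an odd address in a 32-bit process is not a good idea. The size of your structure should be at least a multiple of 4 bytes, so 36 bytes in your case.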
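The multiplication route can be sketched as below. This is a minimal sketch, not the only way to do it: the routine name store_first_word and the register assignment (index in ebx, value in ax, offset returned in edx) are assumptions made for illustration. It uses the three-operand form of imul, which takes an immediate multiplier and leaves the index register untouched.

```nasm
; Sketch only: 32-bit NASM. store_first_word is a hypothetical name.
section .bss
turntreebuff: resb 455                  ; 13 structures of 35 bytes each

section .text
; In:  ebx = structure index (0..12), ax = value to store
; Out: edx = byte offset of that structure (index * 35)
store_first_word:
    imul edx, ebx, 35                   ; edx = ebx * 35, ebx is preserved
    mov  word [turntreebuff+edx], ax    ; same store as in the question
    ret
```

Because the multiplier appears in one place, a change in structure size only needs that constant updated.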
P.S. You can use both features: the index (ebx) can be set to 0, 9, 18, etc., and you use x4 in the instruction, which gives a stride of 36 bytes. However, using the scale factor in the address field can slow things down a bit... Yet, if you want to have fun, you can do as Jester proposed: take the bare index (0, 1, 2, etc.), use lea ecx, [ebx * 8 + ebx] to multiply it by 9, and then use x4 in the other address. The big problem with such cool tricks is: if your structure changes size... you have to rewrite a lot of code.
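The lea trick can be sketched as follows. It assumes the structure has been padded to 36 bytes (9 * 4), which is not the 35-byte layout from the question. The routine name store_first_word_padded and the choice of registers are hypothetical.

```nasm
; Sketch only: 32-bit NASM, structure size assumed padded to 36 bytes.
section .bss
turntreebuff: resb 13*36                ; 13 structures of 36 bytes each

section .text
; In:  ebx = structure index (0..12), ax = value to store
; Out: ecx = index * 9
store_first_word_padded:
    lea ecx, [ebx*8+ebx]                ; ecx = index * 9, no memory access
    mov word [turntreebuff+ecx*4], ax   ; address = base + index * 36
    ret
```

The factor 36 is split into 9 (done by lea) and 4 (done by the scale field), so both constants must be revisited if the structure size changes.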
Now, what I generally do, assuming you are looping over your array of structures, is add the size of the structure to the pointer on each iteration. For example:
    mov ebx, turntreebuff   ; address of the first structure
    mov ecx, 13             ; number of structures (loop counts ecx down)
.l1:
    ...
    mov al, [ebx+0]         ; turnnum
    mov eax, [ebx+1]        ; mt, 1st dword (should be aligned...)
    mov eax, [ebx+5]        ; mt, 2nd dword
    ..
    mov ax, [ebx+33]        ; boardstate
    ...
    add ebx, 35             ; again, better a multiple of the architecture word: 16, 32, 64 bits...
    loop .l1
Now the mov instructions are efficient because they do not need a complicated addressing mode, which slows things down (assuming you are doing millions of accesses, it will show!).
Note that your structure should be reorganized so that its fields are aligned:
turnnum (1 byte)
pad (1 byte)
boardstate (2 bytes)
mt (16 bytes)
pm (16 bytes)
Otherwise you will hit memory at unaligned positions all the time, and that definitely slows things down.
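One way to write the reorganized layout down is with NASM's struc/endstruc macros, which generate the field offsets and a turn_size constant (36 here). This is a sketch under assumptions: the structure name turn, the field names, and the routine read_turn are hypothetical and chosen to mirror the field list above.

```nasm
; Sketch only: 32-bit NASM. Names are hypothetical.
struc turn
    .turnnum:    resb 1                 ; offset 0
    .pad:        resb 1                 ; offset 1, keeps boardstate 2-byte aligned
    .boardstate: resw 1                 ; offset 2
    .mt:         resb 16                ; offset 4, 4-byte aligned
    .pm:         resb 16                ; offset 20, 4-byte aligned
endstruc                                ; also defines turn_size = 36

section .bss
turntreebuff: resb 13*turn_size         ; 13 structures of 36 bytes

section .text
; In:  ebx = address of one structure
; Out: al = turnnum, dx = boardstate, esi = first dword of mt
read_turn:
    mov al,  [ebx+turn.turnnum]
    mov dx,  [ebx+turn.boardstate]
    mov esi, [ebx+turn.mt]
    ret
```

A loop over the array would then advance the pointer with add ebx, turn_size, so the code follows the layout if a field is added or resized.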
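P.S. The x2, x4, and x8 scale factors were primarily added to processors so one could access arrays of pointers, with the added benefit of being able to access structures of those sizes. The scale only uses 2 bits in the instruction encoding, hence the limited range: 1 << n, where n is 0, 1, 2, or 3.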
assembly x86 nasm