OpenCL - Difference between two ways of measuring kernel execution times
I have n separate OpenCL kernels that run synchronously, one after another. The second kernel uses the results of the first kernel, the third kernel uses the results of the second, and so on.
I measured the total "execution" time of the kernels in two different ways:
1) I associate each kernel with its own event and measure the time of each kernel separately by subtracting the start time from the end time of its event. The total time is computed by adding up the times from the events of all kernels. In this case, the event's wait method is called right after executing each kernel.
2) As in 1), I associate each kernel with an event (n kernels and n events). Then I call the event wait methods only at the end, after executing all the kernels (that is, the wait calls come after the whole series of kernel calls). The total execution time is obtained by subtracting the start time of the first event from the end time of the last event.
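To make the two measurements concrete, here is a minimal sketch of the arithmetic, assuming PyOpenCL events whose `profile.start` and `profile.end` attributes hold nanosecond timestamps (this requires a command queue created with profiling enabled). The function names, the `fake_event` helper, and the timestamps are hypothetical stand-ins, not values from my actual runs.

```python
from types import SimpleNamespace


def summed_kernel_time_ns(events):
    """Method 1: add up (end - start) over every kernel's event."""
    return sum(evt.profile.end - evt.profile.start for evt in events)


def span_time_ns(events):
    """Method 2: end of the last event minus start of the first event."""
    return events[-1].profile.end - events[0].profile.start


def fake_event(start, end):
    """Hypothetical stand-in exposing the same .profile.start/.profile.end shape."""
    return SimpleNamespace(profile=SimpleNamespace(start=start, end=end))


# Three made-up kernels of 400 ns each, separated by 100 ns of idle time.
events = [fake_event(0, 400), fake_event(500, 900), fake_event(1000, 1400)]

print(summed_kernel_time_ns(events))  # 1200
print(span_time_ns(events))           # 1400
```

With real PyOpenCL events in place of the stand-ins, the same two functions should reproduce the totals from 1) and 2).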
The difference between the total times from 1) and 2) is quite significant: ~20% of the total execution time. Where is that 20% lost? Is it associated with (en)queuing multiple kernels?
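One way to see where the time goes, assuming the span measured in 2) is the larger number and the kernels execute strictly in order without overlapping: the difference between the two totals is then exactly the sum of the gaps between the end of one kernel and the start of the next. A minimal sketch, again with hypothetical names and made-up timestamps standing in for PyOpenCL events:

```python
from types import SimpleNamespace


def idle_gaps_ns(events):
    """Gap between each event's end and the next event's start, in order."""
    return [
        nxt.profile.start - prev.profile.end
        for prev, nxt in zip(events, events[1:])
    ]


def fake_event(start, end):
    """Hypothetical stand-in exposing .profile.start/.profile.end in ns."""
    return SimpleNamespace(profile=SimpleNamespace(start=start, end=end))


# Made-up timestamps: three 400 ns kernels with 100 ns between them.
timeline = [fake_event(0, 400), fake_event(500, 900), fake_event(1000, 1400)]

gaps = idle_gaps_ns(timeline)
print(gaps)       # [100, 100]
print(sum(gaps))  # 200
```

Printing the individual gaps for the real events would show whether the missing time is spread evenly between kernels or concentrated in a few places.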
How do I go about reducing this "overhead"? Can I cut it down by writing a larger kernel, either by cramming all the kernels into a single big kernel or by restructuring the kernels?
Thanks!
opencl pyopencl