c - Does Intel array notation and elementary functions vectorize well with Xeon Phi ISA?


I am trying to find proper material that explains the different ways to write C/C++ source code that can be vectorized by the Intel compiler using array notation and elementary functions. All the materials online take trivial examples: saxpy, reduction, etc. But there is a lack of explanation of how to vectorize code that has conditional branching or contains a loop with a loop-carried dependence.

For example, say there is sequential code that I want to run on different arrays. A matrix is stored in row-major format. The columns of the matrix are computed by the compute_seq() function:

    #include <stdlib.h>

    #define n      256
    #define stride 256

    __attribute__((vector))
    inline void compute_seq(float *sum, float* a) {
      int i;
      *sum = 0.0f;
      for(i=0; i<n; i++)
        *sum += a[i*stride];
    }

    int main() {
      // initialize
      float *a = malloc(n*n*sizeof(float));
      float sums[n];
      // the following line is not going to be valid, but I need to do something like this:
      compute_seq(sums[:],*(a[0:n:1]));
    }

Any comments are appreciated.

Here is a corrected version of the example.

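    #include <stdlib.h>

    #define n      256
    #define stride 256

    __attribute__((vector(linear(sum),linear(a))))
    inline void compute_seq(float *sum, float* a) {
      int i;
      *sum = 0.0f;
      for(i=0; i<n; i++)
        *sum += a[i*stride];
    }

    int main() {
      // initialize
      float *a = malloc(n*n*sizeof(float));
      float sums[n];
      // one call per column: element k receives &sums[k] and &a[k]
      compute_seq(&sums[:],&a[0:n:1]);
    }

The important change is at the call site. The expression &sums[:] creates an array section consisting of &sums[0], &sums[1], &sums[2], ... &sums[n-1]. The expression &a[0:n:1] creates an array section consisting of &a[0], &a[1], &a[2], ... &a[n-1], which are the starts of the n columns. Inside compute_seq, a[i*stride] then steps down one column of the row-major matrix. (A stride of n in the section would instead hand each call the start of a row, and the loop would read past the end of a for every element after the first.)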
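To make the call-site semantics concrete, here is a minimal sketch in standard C (no array notation) of what a call over two address sections amounts to: the k-th call receives the k-th address from each section. The names column_sum and call_over_sections, and the parameters rows, row_stride, start, count and step, are hypothetical and introduced only for illustration. A step of 1 between starting addresses selects consecutive columns of a row-major matrix.

```c
#include <stddef.h>

/* Scalar body with the same shape as compute_seq, but with the sizes
   passed in (rows and row_stride are illustrative parameters). */
void column_sum(float *sum, const float *a, size_t rows, size_t row_stride) {
    *sum = 0.0f;
    for (size_t i = 0; i < rows; i++)
        *sum += a[i * row_stride];
}

/* Plain-C reading of a call over the sections &sums[0:count:1] and
   &a[start:count:step]: call k gets &sums[k] and &a[start + k*step]. */
void call_over_sections(float *sums, const float *a,
                        size_t start, size_t count, size_t step,
                        size_t rows, size_t row_stride) {
    for (size_t k = 0; k < count; k++)
        column_sum(&sums[k], &a[start + k * step], rows, row_stride);
}
```

For the 256 x 256 matrix in the example, the equivalent call would be call_over_sections(sums, a, 0, n, 1, n, stride). The compiler is free to execute these calls in vector lanes rather than one after another; the sketch only shows which addresses each call sees.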

I added two linear clauses to the vector attribute to tell the compiler to generate a clone optimized for the case where the arguments are arithmetic sequences, as they are in this example. For this example, the clauses (and the vector attribute itself) are redundant, since the compiler can see both the callee and the call site in the same translation unit and figure out the particulars itself. But if compute_seq were defined in another translation unit, the attribute might help.
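One way to picture what a clone specialized for linear arguments can exploit is the hand-written model below, in standard C. It is a conceptual sketch under stated assumptions, not what icc literally emits: the name column_sum_linear_clone, the LANES width of 4, and the rows and row_stride parameters are all hypothetical. The point is that when both pointers advance by one element per lane, the clone needs only the first lane's pointers, and the loads across lanes land on adjacent addresses.

```c
#include <stddef.h>

enum { LANES = 4 };  /* hypothetical vector width, chosen for illustration */

/* Model of a clone for pointers that advance by one element per lane:
   lane l works on sum_base + l and a_base + l. */
void column_sum_linear_clone(float *sum_base, const float *a_base,
                             size_t rows, size_t row_stride) {
    float acc[LANES] = {0.0f};
    for (size_t i = 0; i < rows; i++) {
        /* The LANES loads below are adjacent in memory, so they can
           become a single unit-stride vector load instead of a gather. */
        for (size_t l = 0; l < LANES; l++)
            acc[l] += a_base[i * row_stride + l];
    }
    for (size_t l = 0; l < LANES; l++)
        sum_base[l] = acc[l];
}
```

Without the linear information, a clone has to assume each lane's pointer is unrelated to the others, which generally means gathering the loads lane by lane.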

Array notation is a work in progress. icc 14.0 beta compiled my example for Intel(R) Xeon Phi(TM) without complaint. icc 13.0 update 3 reported that it couldn't vectorize the function ("dereference too complex"). Perversely, leaving the vector attribute off shut up the report, probably because the compiler can vectorize the code after inlining.

I use the compiler option "-opt-assume-safe-padding" when compiling for Intel(R) Xeon Phi(TM). It may improve vector code quality. It lets the compiler assume that the page beyond any accessed address is safe to touch, thus enabling certain instruction sequences that would otherwise be disallowed.

