1 year ago
#347491

HLI
How to use the _mm_mask_add_ps intrinsic correctly?
I wrote the test code below. With the mask set to 0b1111 or 0b0000, it works fine. With a mask that mixes 0 and 1 bits, such as 0b1101 or 0b1001, the program crashes. When I debug it, a SIGILL signal (illegal instruction) is raised in _mm_mask_add_ps.
Any help is appreciated. Thanks.
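#include <immintrin.h>  // _mm_mask_add_ps needs AVX-512F and AVX-512VL
#include <iostream>
using namespace std;

int main() {
    __m128 vec0 = _mm_setr_ps(1, 2, 3, 10);
    __m128 vec1 = _mm_setr_ps(4, 5, 6, 10);
    __m128 src = _mm_setr_ps(14, 15, 16, 110);
    __mmask8 mask = 0b1101;  // bit i controls lane i
    // Lanes whose mask bit is 1 get vec0 + vec1; the others are copied from src.
    __m128 res = _mm_mask_add_ps(src, mask, vec0, vec1);
    alignas(16) float arr[4];
    _mm_store_ps(arr, res);
    // Print the four lanes in order.
    float *p = arr;
    cout << *p++ << endl;
    cout << *p++ << endl;
    cout << *p++ << endl;
    cout << *p << endl;
    return 0;
}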
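A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the 128-bit masked form _mm_mask_add_ps is documented as requiring both AVX-512F and AVX-512VL, so a SIGILL would be consistent with running on a CPU that lacks one of them. It is only an assumption that the 0b1111 and 0b0000 cases work because the compiler folds the constant mask into a plain add or a copy of src. The helper names below (`mask_add_reference`, `cpu_has_avx512vl`) are hypothetical and not part of any library; `__builtin_cpu_supports` is a GCC/Clang builtin, and on other compilers or non-x86 targets the check simply reports false. The scalar function models the documented per-lane behavior, so the expected result can be checked without executing any AVX-512 instruction.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of the documented _mm_mask_add_ps semantics:
// lane i receives a[i] + b[i] when bit i of mask is set, otherwise src[i].
std::array<float, 4> mask_add_reference(const std::array<float, 4>& src,
                                        std::uint8_t mask,
                                        const std::array<float, 4>& a,
                                        const std::array<float, 4>& b) {
    std::array<float, 4> out{};
    for (int i = 0; i < 4; ++i) {
        const bool lane_enabled = ((mask >> i) & 1u) != 0;
        out[i] = lane_enabled ? a[i] + b[i] : src[i];
    }
    return out;
}

// Hypothetical helper: true only when the running CPU reports both
// AVX-512F and AVX-512VL. Assumes GCC or Clang on x86; elsewhere it
// conservatively returns false.
bool cpu_has_avx512vl() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx512f") &&
           __builtin_cpu_supports("avx512vl");
#else
    return false;
#endif
}
```

With the values from the question (src = 14, 15, 16, 110 and mask 0b1101), the scalar model yields 5, 15, 9, 20. If `cpu_has_avx512vl()` returns false on the machine where the crash occurs, the illegal-instruction signal would be expected for any mask that the compiler cannot fold away.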
c++
sse
intrinsics
avx512
0 Answers