Auto-vectorization for hand-unrolled initialized tiled-computation versus simple loop with no initialization
While optimizing the innermost 4-versus-4 comparison of an AABB collision detection algorithm, I am stuck trying to simplify the code while gaining (or at least retaining) performance.
Here is the v...
huseyin tugrul buyukisik
Votes: 0
Answers: 0
Does C++ support run-time query for natural width of SIMD units of a core?
In C++, is there a way to query the number of lanes of the SIMD units, like this:
// 4 for bulldozer,
// 8 for skylake,
// 16 for cascadelake
int width = std::this_thread::SIMD_WIDTH;
or does it have to be...
huseyin tugrul buyukisik
Votes: 0
Answers: 0
error: invalid static_cast from type ‘__m256i’ {aka ‘__vector(4) long long int’} to type ‘void*’
I'm trying to compile a piece of code that uses static_cast to do something like the following:
__m256i values;
int64_t i = 1;
static_cast<void*>(values + i);
but this results i...
David
Votes: 0
Answers: 0
I can't get Non-Temporal stores/loads to work without -fsanitize=address
Non-temporal stores fail every time for me, but if I replace them with temporal stores they never fail.
The only way I got non-temporal loads/stores not to fail is by using the -fsanitize=address option, a...
Nieważne Nieważne
Votes: 0
Answers: 0