#simd

Hacker News<p>Towards fearless SIMD, 7 years later</p><p><a href="https://linebender.org/blog/towards-fearless-simd/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">linebender.org/blog/towards-fe</span><span class="invisible">arless-simd/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Towards" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Towards</span></a> <a href="https://mastodon.social/tags/fearless" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fearless</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> #7 <a href="https://mastodon.social/tags/years" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>years</span></a> <a href="https://mastodon.social/tags/later" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>later</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/Programming" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Programming</span></a> <a href="https://mastodon.social/tags/Performance" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Performance</span></a> <a href="https://mastodon.social/tags/Optimization" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Optimization</span></a> <a href="https://mastodon.social/tags/Technology" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Technology</span></a> <a 
href="https://mastodon.social/tags/Blog" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Blog</span></a></p>
Jiří Činčura ↹<p>(Not) Vectorizing the .NET Dictionary class</p><p><a href="https://gist.github.com/kg/f5bfe4c095f66d2dcda5f1e43e015cf1" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gist.github.com/kg/f5bfe4c095f</span><span class="invisible">66d2dcda5f1e43e015cf1</span></a></p><p><a href="https://mas.to/tags/dotnet" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>dotnet</span></a> <a href="https://mas.to/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a></p>
N-gated Hacker News<p>Ah, the classic tale of a coder thinking <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> would make their code fly 🚀, only to discover it trips over its own feet 👟. Our hero's memory seems as patchy as their <a href="https://mastodon.social/tags/benchmarks" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>benchmarks</span></a>, but fear not, the valuable lesson here is clear: <a href="https://mastodon.social/tags/optimization" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>optimization</span></a> is just a synonym for <a href="https://mastodon.social/tags/headache" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>headache</span></a>. 🤦‍♂️<br><a href="https://genna.win/blog/convolution-simd/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">genna.win/blog/convolution-sim</span><span class="invisible">d/</span></a> <a href="https://mastodon.social/tags/coding" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>coding</span></a> <a href="https://mastodon.social/tags/woes" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>woes</span></a> <a href="https://mastodon.social/tags/lessons" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>lessons</span></a> <a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/ngated" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ngated</span></a></p>
Hacker News<p>Performance optimization, and how to do it wrong — <a href="https://genna.win/blog/convolution-simd/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">genna.win/blog/convolution-sim</span><span class="invisible">d/</span></a><br><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/PerformanceOptimization" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>PerformanceOptimization</span></a> <a href="https://mastodon.social/tags/HowToDoItWrong" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HowToDoItWrong</span></a> <a href="https://mastodon.social/tags/Convolution" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Convolution</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Blog" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Blog</span></a></p>
IT News<p>Faster Integer Division with Floating Point - Multiplication on a common microcontroller is easy. But division is much more diff... - <a href="https://hackaday.com/2024/12/22/faster-integer-division-with-floating-point/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackaday.com/2024/12/22/faster</span><span class="invisible">-integer-division-with-floating-point/</span></a> <a href="https://schleuss.online/tags/softwaredevelopment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>softwaredevelopment</span></a> <a href="https://schleuss.online/tags/softwarehacks" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>softwarehacks</span></a> <a href="https://schleuss.online/tags/optimization" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>optimization</span></a> <a href="https://schleuss.online/tags/assembly" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>assembly</span></a> <a href="https://schleuss.online/tags/avx" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>avx</span></a>-512 <a href="https://schleuss.online/tags/x86_64" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>x86_64</span></a> <a href="https://schleuss.online/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a> <a href="https://schleuss.online/tags/x86" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>x86</span></a></p>
mattst88 :gentoo:<p>I landed some improvements and small optimizations to <a href="https://fosstodon.org/tags/pixman" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>pixman</span></a>'s AltiVec code. See <a href="https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/136" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gitlab.freedesktop.org/pixman/</span><span class="invisible">pixman/-/merge_requests/136</span></a></p><p>It was fun working with a new (to me) instruction set and trying to figure out how to puzzle together the pieces into something that improved the `pix_multiply()` function (which is kind of the core primitive of most fast paths).</p><p>I couldn't figure out a way to use the `vec_mradds`/`vmhraddshs` instruction. Maybe you can? (see <a href="https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/136#note_2699795" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gitlab.freedesktop.org/pixman/</span><span class="invisible">pixman/-/merge_requests/136#note_2699795</span></a>)</p><p><a href="https://fosstodon.org/tags/altivec" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>altivec</span></a> <a href="https://fosstodon.org/tags/powerpc" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>powerpc</span></a> <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a></p>
mattst88 :gentoo:<p>I fixed an issue in pixman's Altivec code the other day -- <a href="https://cgit.freedesktop.org/pixman/commit/?id=207626180d0282bb14a50f2e494174f54ac8a6ce" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">cgit.freedesktop.org/pixman/co</span><span class="invisible">mmit/?id=207626180d0282bb14a50f2e494174f54ac8a6ce</span></a></p><p>And in the process, I read through the Altivec docs and discovered that there are vector instructions that pack and unpack between a8r8g8b8 and a1r5g5b5 formats (but nothing for r5g6b5).</p><p>Any clues why? Was a1r5g5b5 really common on Mac OS or something? I don't think I've seen a1r5g5b5 used anywhere.</p><p><a href="https://fosstodon.org/tags/powerpc" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>powerpc</span></a> <a href="https://fosstodon.org/tags/altivec" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>altivec</span></a> <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a> <a href="https://fosstodon.org/tags/macos9" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>macos9</span></a> <a href="https://fosstodon.org/tags/pixman" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>pixman</span></a></p>
sarah quiñones<p><a href="https://eldritch.cafe/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a> </p><p>there's this trick i randomly found a few years ago and i've been wondering if there's a name for it or if other people have done this before</p><p>```<br>for enforcing floating point determinism with realigned buffers</p><p>if we have<br>x x x 0 1 2 3 4 5 6 7 x x x</p><p>where x is the identity for my operation, and our operation is commutative (not necessarily associative)</p><p>then adding x padding doesn't affect the result as long as we do a tree reduction at the end</p><p>e.g.</p><p>accumulate in register: v = 0+4 1+5 2+6 3+7</p><p>tree reduction step 0: (0+4)+(2+6) (1+5)+(3+7)<br>tree reduction step 1: ((0+4)+(2+6)) + ((1+5)+(3+7))</p><p>if we add padding (e.g., by realigning the buffer and using a masked load)</p><p>accumulate in register: v = x+1+5 x+2+6 x+3+7 0+4+x</p><p>tree reduction step 0: (1+5)+(3+7) (0+4)+(2+6)<br>tree reduction step 1: ((1+5)+(3+7)) + ((0+4)+(2+6))</p><p>commuting the elements shows us that this is the exact same result as the previous one, so the bit pattern of the final result is unaffected (modulo signed zero, nan, etc)<br>```</p>
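The trick in the post above can be checked with a small scalar simulation. This is only a sketch under stated assumptions, not the poster's code: the operation is floating-point addition, the identity `x` is `0.0`, the lane count is a power of two, the input is non-empty, and the names `lane_sum` and `bits` are hypothetical helpers made up for this illustration. Shifting the data by `offset` identity slots rotates which lane each element lands in, and the tree reduction then combines the same pairs with their operands swapped.

```python
import struct

def lane_sum(values, width=4, offset=0, identity=0.0):
    """Sum `values` the way a `width`-lane SIMD loop would, after shifting
    the data right by `offset` slots filled with `identity` (this stands in
    for realigning the buffer and using a masked load)."""
    padded = [identity] * offset + list(values)
    padded += [identity] * (-len(padded) % width)  # mask the tail as well

    # Vertical accumulation: lane i sums padded[i], padded[i + width], ...
    acc = padded[:width]
    for start in range(width, len(padded), width):
        acc = [a + b for a, b in zip(acc, padded[start:start + width])]

    # Tree reduction: fold the upper half onto the lower half at each step.
    while len(acc) > 1:
        half = len(acc) // 2
        acc = [acc[i] + acc[i + half] for i in range(half)]
    return acc[0]

def bits(x):
    """Hex dump of the IEEE 754 double, so results are compared bit for bit."""
    return struct.pack("<d", x).hex()

# Values chosen so that the order of additions matters in floating point.
data = [1e16, 1.0, -1e16, 1.0, 0.1, 0.2, 0.3, 0.7]
results = {bits(lane_sum(data, offset=k)) for k in range(4)}
print(len(results))  # number of distinct bit patterns across the four alignments
```

The argument only needs commutativity: a rotation of the lanes maps each pair `(i, i + width/2)` onto itself, possibly swapped, and the same holds at every later level of the tree. As the post notes, signed zeros and NaNs are outside this guarantee; with `0.0` as the padding value, a lane holding `-0.0` would come out as `+0.0`.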
FCLC<p>HW implementation people: is <a href="https://mast.hpc.social/tags/RVV" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RVV</span></a> as bad to implement if it’s purely in order? </p><p>My intuition is you’re still going to have trouble around LMUL + gather, but that seems much easier without tracking implied state/register renaming across OoO</p><p><a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> <a href="https://mast.hpc.social/tags/riscv" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>riscv</span></a> <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HPC</span></a></p>
Karsten Schmidt<p>Yesterday, one year ago... (Still wondering how many people actually have read or tried out any of these)</p><p><a href="https://mastodon.thi.ng/@toxi/111348591236791838" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.thi.ng/@toxi/11134859</span><span class="invisible">1236791838</span></a></p><p><a href="https://mastodon.thi.ng/tags/ThingUmbrella" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ThingUmbrella</span></a> <a href="https://mastodon.thi.ng/tags/HowToThing" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HowToThing</span></a> <a href="https://mastodon.thi.ng/tags/TypeScript" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TypeScript</span></a> <a href="https://mastodon.thi.ng/tags/Tutorial" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Tutorial</span></a> <a href="https://mastodon.thi.ng/tags/Shader" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Shader</span></a> <a href="https://mastodon.thi.ng/tags/GIS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GIS</span></a> <a href="https://mastodon.thi.ng/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.thi.ng/tags/Forth" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Forth</span></a> <a href="https://mastodon.thi.ng/tags/ProcGen" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ProcGen</span></a></p>
Sikorski Arkadiusz<p>Polly - LLVM Framework for High-Level Loop and Data-Locality Optimizations<br><a href="https://polly.llvm.org/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">polly.llvm.org/</span><span class="invisible"></span></a><br>"Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program."</p><p><a href="https://floss.social/tags/llvm" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>llvm</span></a> <a href="https://floss.social/tags/polly" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>polly</span></a> <a href="https://floss.social/tags/dev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>dev</span></a> <a href="https://floss.social/tags/code" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>code</span></a> <a href="https://floss.social/tags/programming" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>programming</span></a> <a href="https://floss.social/tags/optimization" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>optimization</span></a> <a href="https://floss.social/tags/clang" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>clang</span></a> <a href="https://floss.social/tags/optimizer" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>optimizer</span></a> <a href="https://floss.social/tags/polyhedra" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>polyhedra</span></a> <a href="https://floss.social/tags/OpenMP" class="mention hashtag" rel="nofollow noopener noreferrer" 
target="_blank">#<span>OpenMP</span></a> <a href="https://floss.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a></p>
Astra Kernel :verified:<p>✨Verifying Rust Zeroize with Assembly...including portable SIMD</p><p><a href="https://cipherstash.com/blog/verifying-rust-zeroize-with-assembly-including-portable-simd" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">cipherstash.com/blog/verifying</span><span class="invisible">-rust-zeroize-with-assembly-including-portable-simd</span></a></p><p><a href="https://infosec.exchange/tags/rustlang" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rustlang</span></a> <a href="https://infosec.exchange/tags/simd" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>simd</span></a> <a href="https://infosec.exchange/tags/programming" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>programming</span></a></p>
FCLC<p>Hi friends! Very excited to announce that I'll be giving an <span class="h-card" translate="no"><a href="https://mast.hpc.social/@easybuild" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>easybuild</span></a></span> Tech Talk on the 13th of October on <a href="https://mast.hpc.social/tags/AVX10" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AVX10</span></a>!</p><p>The Talk is titled "AVX10 for HPC:<br>A reasonable solution to the 7 levels of AVX-512 folly" </p><p>Registration is free, all <a href="https://mast.hpc.social/tags/x86" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>x86</span></a>, <a href="https://mast.hpc.social/tags/AVX" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AVX</span></a>, <a href="https://mast.hpc.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AVX512</span></a>, <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a>, and <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HPC</span></a> experience levels welcome!</p><p>The page is here: <a href="https://easybuild.io/tech-talks/008_avx10.html" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">easybuild.io/tech-talks/008_av</span><span class="invisible">x10.html</span></a></p><p>And you can register here! <a href="https://event.ugent.be/registration/ebtechtalk008avx10" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">event.ugent.be/registration/eb</span><span class="invisible">techtalk008avx10</span></a></p>
Nick Doyle<p>An <a href="https://hachyderm.io/tags/introduction" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>introduction</span></a>,</p><p>I'm Nick, a principal software engineer at <a href="https://hachyderm.io/tags/Akamai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Akamai</span></a>. I've been working on Image &amp; Video Manager to make working with images on the <a href="https://hachyderm.io/tags/web" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>web</span></a> fast and easy. I've worked on bits as high as frontend UI and as low as hand written <a href="https://hachyderm.io/tags/ASM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ASM</span></a> and <a href="https://hachyderm.io/tags/SIMD" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SIMD</span></a>. <a href="https://hachyderm.io/tags/webperf" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>webperf</span></a></p><p>I'm enthusiastic about synthesizers, particularly modular synths. I use a large <a href="https://hachyderm.io/tags/Buchla" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Buchla</span></a> clone system in the studio and a <a href="https://hachyderm.io/tags/Eurorack" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Eurorack</span></a> system when I perform live. <a href="https://hachyderm.io/tags/synth" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>synth</span></a> </p><p>I also enjoy ergonomic mechanical keyboards.</p><p>Nice to meet you!</p>