#fluidx3d

Dr. Moritz Lehmann:
Battle of the giants: Nvidia #Blackwell B200 takes the lead in FluidX3D CFD performance

#Nvidia #B200 just launched, and I'm one of the first people to benchmark 8x B200 via Shadeform, in a WhiteFiber server with 2x #Intel #Xeon6 6960P 72-core CPUs. 🖖😋

8x Nvidia B200 go head-to-head with 8x #AMD #MI300X in the #FluidX3D #CFD benchmark, winning overall (with FP16S storage) at 219300 MLUPs/s (~17TB/s combined VRAM bandwidth), but losing in FP32 & FP16C storage. 8x MI300X achieve 204924 MLUPs/s.

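The FP16S/FP16C storage modes mentioned here compress the LBM density distribution functions to 16 bits in VRAM while all arithmetic stays in FP32 registers, roughly halving memory traffic. FluidX3D's actual 16-bit formats are custom; as a minimal sketch of the idea, assuming plain IEEE FP16 and the standard OpenCL C vload_half/vstore_half built-ins:

```c
// Hypothetical sketch, not FluidX3D's exact code: DDFs stored as FP16 in
// VRAM, decompressed to FP32 in registers for the collision arithmetic.
kernel void stream_collide(global half* ddf, const uint n) {
    const uint i = get_global_id(0);
    if (i >= n) return;
    float f = vload_half(i, ddf); // FP16 -> FP32 on load (half the bandwidth)
    f *= 0.99f;                   // placeholder for the actual collision step
    vstore_half_rte(f, i, ddf);   // FP32 -> FP16 on store, round-to-nearest
}
```

Since the LBM is bandwidth-bound, halving the bytes per cell nearly doubles throughput, which is why the FP16S numbers above are the headline figures.
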
Dr. Moritz Lehmann:
My #IWOCL 2025 keynote presentation is online! 🖖🧐
Scaling up #FluidX3D #CFD beyond 100 Billion cells on a single computer - a story about the true cross-compatibility of #OpenCL
https://www.youtube.com/watch?v=Sb3ibfoOi0c&list=PLA-vfTt7YHI2HEFrpzPhhQ8PhiztKhHU8&index=1
Slides: https://www.iwocl.org/wp-content/uploads/iwocl-2025-moritz-lehmann-keynote.pdf

Dr. Moritz Lehmann:
What an honor to start the #IWOCL conference with my keynote talk! Nowhere else do you get to talk to so many #OpenCL and #SYCL experts in one room! I shared some updates on my #FluidX3D #CFD solver: how I optimized it at the smallest level of a single grid cell, and how I scaled it up on the largest #Intel #Xeon6 #HPC systems, which provide more memory capacity than any #GPU server. 🖖😃

Dr. Moritz Lehmann:
Just arrived in wonderful Heidelberg, looking forward to presenting the keynote talk at #IWOCL tomorrow!! See you there! 🖖😁
https://www.iwocl.org/ #OpenCL #SYCL #FluidX3D #GPU #HPC

Dr. Moritz Lehmann:
I made this #FluidX3D #CFD simulation run on a Frankenstein zoo of 🟥AMD + 🟩Nvidia + 🟦Intel #GPUs! 🖖🤪
https://www.youtube.com/watch?v=_8Ed8ET9gBU

The ultimate SLI abomination setup:
- 1x Nvidia A100 40GB
- 1x Nvidia Tesla P100 16GB
- 2x Nvidia A2 15GB
- 3x AMD Instinct MI50
- 1x Intel Arc A770 16GB

I split the 2.5B cells into 9 domains of 15GB each - the A100 takes 2 domains, the other GPUs 1 domain each. The GPUs communicate over PCIe via #OpenCL.

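Mixing vendors works here because every device only sees standard OpenCL buffers; boundary layers between domains can be staged through host memory over PCIe. A hedged sketch of such a halo exchange, with illustrative names and offsets (not FluidX3D's actual communication code):

```c
#include <CL/cl.h>

// Copy one domain's boundary layer to its neighbor on another GPU by
// round-tripping through a host staging buffer. Blocking calls keep the
// sketch simple; a real implementation would overlap transfers with compute.
void exchange_halo(cl_command_queue q_src, cl_mem buf_src, size_t src_off,
                   cl_command_queue q_dst, cl_mem buf_dst, size_t dst_off,
                   void* staging, size_t halo_bytes) {
    clEnqueueReadBuffer(q_src, buf_src, CL_TRUE, src_off, halo_bytes,
                        staging, 0, NULL, NULL);  // GPU A VRAM -> host RAM
    clEnqueueWriteBuffer(q_dst, buf_dst, CL_TRUE, dst_off, halo_bytes,
                         staging, 0, NULL, NULL); // host RAM -> GPU B VRAM
}
```
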
Dr. Moritz Lehmann:
I got access to @LRZ_DE's new coma-cluster for #OpenCL benchmarking and experimentation 🖖😋💻🥨🍻
I've added a ton of new #FluidX3D #CFD #GPU/#CPU benchmarks:
https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#single-gpucpu-benchmarks

Notable hardware configurations include:
- 4x H100 NVL 94GB
- 2x Nvidia L40S 48GB
- 2x Nvidia A2 15GB datacenter toaster
- 2x Intel Arc A770 16GB
- AMD+Nvidia SLI abomination consisting of 3x Instinct MI50 32GB + 1x A100 40GB
- AMD Radeon 8060S (chonky Ryzen AI Max+ 395 iGPU with quad-channel RAM) thanks to @cheese

Dr. Moritz Lehmann:
#FluidX3D #CFD v3.2 is out! I've implemented the much-requested #GPU summation for object force/torque; it's ~20x faster than #CPU #multithreading. 🖖😋
The horizontal sum in #OpenCL was a nice exercise: first a local-memory reduction, then a hardware-supported atomic floating-point add in VRAM, all in a single-stage kernel. Hammering atomics isn't too bad, as each of the ~10-340 workgroups dispatched at a time does only a single atomic add.
Also improved volumetric #raytracing!
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.2

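A minimal OpenCL C sketch of that pattern (my illustration, not the v3.2 kernel): a tree reduction in local memory, then a single global atomic add per workgroup. Hardware FP32 atomic add needs the cl_ext_float_atomics extension, so this sketch uses the portable compare-exchange fallback:

```c
// Portable atomic float add via 32-bit compare-exchange (OpenCL 1.1+).
void atomic_add_f(volatile global float* addr, const float val) {
    union { float f; uint u; } old_v, new_v;
    do {
        old_v.f = *addr;
        new_v.f = old_v.f + val;
    } while (atomic_cmpxchg((volatile global uint*)addr,
                            old_v.u, new_v.u) != old_v.u);
}

kernel void sum_force(const global float* cell_force, global float* total,
                      local float* scratch, const uint n) {
    const uint gid = get_global_id(0), lid = get_local_id(0);
    scratch[lid] = (gid < n) ? cell_force[gid] : 0.0f;
    barrier(CLK_LOCAL_MEM_FENCE);
    // local-memory tree reduction within the workgroup
    for (uint s = get_local_size(0) / 2u; s > 0u; s >>= 1u) {
        if (lid < s) scratch[lid] += scratch[lid + s];
        barrier(CLK_LOCAL_MEM_FENCE);
    }
    if (lid == 0u) atomic_add_f(total, scratch[0]); // one atomic per workgroup
}
```

Because only one work-item per workgroup touches the global accumulator, atomic contention stays tiny relative to the reduction work, which is the point made above.
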
Dr. Moritz Lehmann:
Hot Aisle's 8x AMD #MI300X server is the fastest computer I've ever tested in #FluidX3D #CFD, achieving a peak #LBM performance of 205 GLUPs/s and a combined VRAM bandwidth of 23 TB/s. 🖖🤯
The #RTX 5090 looks like a toy in comparison.

MI300X beats even Nvidia's GH200 94GB. This marks a very fascinating inflection point in #GPGPU: #CUDA is not the performance leader anymore. 🖖😛
You need a cross-vendor language like #OpenCL to leverage its power.

FluidX3D on #GitHub: https://github.com/ProjectPhysX/FluidX3D

Dr. Moritz Lehmann:
I'm doing a podcast about #FluidX3D today with Improbable Matter, going live in 30 minutes! 🖖🤠
https://youtu.be/csGLVZqr0SE

Dr. Moritz Lehmann:
The 4x #Nvidia #H100 SXM5 server in the new Festus cluster at Uni Bayreuth is the fastest system I've ever tested in #FluidX3D #CFD, achieving 78 GLUPs/s #LBM performance at ~1650W #GPU power draw. 🖖😋🖥️🔥
https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#multi-gpu-benchmarks
https://www.hpc.uni-bayreuth.de/clusters/festus/#__tabbed_1_3

Dr. Moritz Lehmann:
#FluidX3D #CFD v3.1 is out! I have updated the #OpenCL headers for better device-specs detection via device ID and #Nvidia compute capability, fixed broken voxelization on some #GPUs, and added a workaround for a CPU compiler bug that corrupted rendering. Also, #AMD GPUs will now show up with their correct name (no idea why AMD can't report it as CL_DEVICE_NAME like every other sane vendor and instead needs the CL_DEVICE_BOARD_NAME_AMD extension...)
Have fun! 🖖😉
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.1

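The AMD naming workaround boils down to one extra clGetDeviceInfo query. A sketch of that idea, assuming the cl_amd_device_attribute_query extension's CL_DEVICE_BOARD_NAME_AMD token (0x4038 in cl_ext.h); on non-AMD devices the call fails cleanly and the standard name is used:

```c
#include <CL/cl.h>
#include <stdio.h>

#ifndef CL_DEVICE_BOARD_NAME_AMD
#define CL_DEVICE_BOARD_NAME_AMD 0x4038 // from cl_amd_device_attribute_query
#endif

void print_device_name(cl_device_id dev) {
    char name[256] = {0};
    // Prefer the board/marketing name on AMD; fall back to CL_DEVICE_NAME,
    // which on AMD GPUs may only report the chip codename (e.g. "gfx1030").
    if (clGetDeviceInfo(dev, CL_DEVICE_BOARD_NAME_AMD,
                        sizeof(name), name, NULL) != CL_SUCCESS)
        clGetDeviceInfo(dev, CL_DEVICE_NAME, sizeof(name), name, NULL);
    printf("%s\n", name);
}
```
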
Dr. Moritz Lehmann:
RTX 5090 performance numbers for #FluidX3D are in - thanks to @phoronix! And I finally found a way to format the performance chart on the #FluidX3D #GitHub page a bit better - especially with a larger font size. Hacking the mermaid gantt chart is currently the only way to embed a compact bar chart directly into markdown, without an extra image file.
The mermaid language is still horrific - inconsistent, and half the styling commands don't even work. No way yet to color the bars blue.
https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#single-gpucpu-benchmarks

Dr. Moritz Lehmann:
@phoronix nice, the 512-bit memory bus doing its thing in #FluidX3D #CFD! 🖖😋
Thanks for benchmarking!

Dr. Moritz Lehmann:
3 different #GPUs, 1 #CFD simulation - #FluidX3D "SLI"-ing (Intel A770 + Intel B580 + Nvidia Titan Xp) for 678 Million grid cells in 36GB combined VRAM
https://www.youtube.com/watch?v=9VP3fruwnXc

Dr. Moritz Lehmann:
Finally 2¹² ⭐ for #FluidX3D on #GitHub! 🖖🤓

Dr. Moritz Lehmann:
@st01014 I've added B580 #FluidX3D benchmarks with a non-zero-initialized box: https://github.com/ProjectPhysX/FluidX3D?tab=readme-ov-file#single-gpucpu-benchmarks
(scroll down below the bar chart and expand the section with the full table there)

Dr. Moritz Lehmann:
@st01014 it's wrong, unfortunately. The B580 has a hardware optimization that detects if a kernel writes all 0's to VRAM, and in this case skips the write completely, which saves a lot of bandwidth. The #FluidX3D benchmark is a 0-initialized box, where the B580 applies this. In a non-0-initialized simulation, performance is more in line with what you'd expect from 456GB/s.
It's a bit of an edge case (no other #GPU does that), so I have not yet made adjustments on the app side. Will post good benchmarks on the weekend.
Cc @phoronix

Dr. Moritz Lehmann:
The #FluidX3D x #Intel t-shirt is the coolest thing ever!! #SC24

Dr. Moritz Lehmann:
This is the largest #CFD simulation ever run on a single computer: the #NASA X-59 at 117 Billion grid cells. This video visualizes 7.6 PetaByte of volumetric data.

I did this simulation on 2x #Intel Xeon 6980P #HPC CPUs with 6TB MRDIMM memory at a massive 1.7TB/s bandwidth. No #GPUs required! 🖖😋🟦

https://www.youtube.com/watch?v=K5eKxzklXDA

As a little gift to you all: #FluidX3D v3.0 is out now, enabling 31% larger resolution on CPUs/iGPUs with #OpenCL zero-copy buffers:
https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.0

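Zero-copy here means the CPU/iGPU device works directly in host RAM rather than keeping a separate device-side copy of the grid, which is where the extra resolution headroom comes from. A minimal host-side sketch, assuming a unified-memory device; note that CL_MEM_ALLOC_HOST_PTR behavior is implementation-defined:

```c
#include <CL/cl.h>
#include <stddef.h>

// Request a buffer the runtime may back directly with host RAM; on CPUs and
// iGPUs this typically avoids allocating a second "device" copy of the grid.
cl_mem create_zero_copy_buffer(cl_context ctx, size_t bytes) {
    cl_int err;
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                bytes, NULL, &err);
    return (err == CL_SUCCESS) ? buf : NULL;
}
```

Host-side access then goes through clEnqueueMapBuffer/clEnqueueUnmapMemObject rather than explicit read/write copies, keeping a single allocation shared by host and device.
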
Dr. Moritz Lehmann:
@giuseppebilotta yes, #FluidX3D can real-time render in #ASCII mode over SSH! We'll have that at #SC24 as a live demo!