cynicalsecurity :cm_2:<p>A. Olgun et al., "PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM"¹</p><p>Commodity DRAM-based processing-using-memory (PuM) techniques that are supported by off-the-shelf DRAM chips present an opportunity for alleviating the data movement bottleneck at low cost. However, system integration of these techniques imposes non-trivial challenges that are yet to be solved. Potential solutions to the integration challenges require appropriate tools to develop any necessary hardware and software components. Unfortunately, current proprietary computing systems, specialized DRAM-testing platforms, or system simulators do not provide the flexibility and/or the holistic system view that is necessary to properly evaluate and deal with the integration challenges of commodity DRAM-based PuM techniques.<br>We design and develop Processing-in-DRAM (PiDRAM), the first flexible end-to-end framework that enables system integration studies and evaluation of real, commodity DRAM-based PuM techniques. PiDRAM provides software and hardware components to rapidly integrate PuM techniques across the whole system software and hardware stack. We implement PiDRAM on an FPGA-based RISC-V system. To demonstrate the flexibility and ease of use of PiDRAM, we implement and evaluate two state-of-the-art commodity DRAM-based PuM techniques: (i) in-DRAM copy and initialization (RowClone) and (ii) in-DRAM true random number generation (D-RaNGe). We describe how we solve key integration challenges to make such techniques work and be effective on a real-system prototype, including memory allocation, alignment, and coherence. We observe that end-to-end RowClone speeds up bulk copy and initialization operations by 14.6× and 12.6×, respectively, over conventional CPU copy, even when coherence is supported with inefficient cache flush operations.
Over PiDRAM’s extensible codebase, integrating both RowClone and D-RaNGe end-to-end on a real RISC-V system prototype takes only 388 lines of Verilog code and 643 lines of C++ code.</p><p><a href="https://bsd.network/tags/ResearchPapers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ResearchPapers</span></a> <a href="https://bsd.network/tags/ProcessingInMemory" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ProcessingInMemory</span></a> <a href="https://bsd.network/tags/ProcessingInDRAM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ProcessingInDRAM</span></a> <a href="https://bsd.network/tags/FPGA" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>FPGA</span></a> <a href="https://bsd.network/tags/RISCV" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RISCV</span></a> <a href="https://bsd.network/tags/MemoryControllers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MemoryControllers</span></a><br>__<br>¹ <a href="https://dl.acm.org/doi/pdf/10.1145/3563697" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="ellipsis">dl.acm.org/doi/pdf/10.1145/356</span><span class="invisible">3697</span></a> (<a href="https://bsd.network/tags/PDF" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>PDF</span></a>)</p>