WNCG - Wireless Networking and Communications Group - Linear Algebra
http://wncg.org/tags/linear-algebra
Linear Algebra Accelerators for Low-Power, High-Performance Multi-Core Computing
http://wncg.org/research/briefs/linear-algebra-accelerators-low-power-high-performance-multi-core-computing
<div class="field field-name-field-publish-date field-type-datetime field-label-hidden"><div class="field-items"><div class="field-item even"><span class="date-display-single">Tuesday, July 1, 2014</span></div></div></div><div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"> <p>With semiconductor technology scaling reaching physical limits, overcoming power limitations is one of the major challenges on the path to increased performance. It is well accepted that specialization and heterogeneity at the hardware level are key to achieving orders-of-magnitude improvements in both power consumption and performance. However, full-custom hardware design is expensive in many ways. The question is whether multi-core processors can be designed to achieve the efficiency of custom hardware while remaining flexible enough to run a broad class of applications.</p>
<p>WNCG Prof. Andreas Gerstlauer and his students, in collaboration with UT Austin computer science Prof. Robert A. Van de Geijn, are studying this question for several domains, including linear algebra computations, which are at the core of many high-performance, embedded, signal-processing, and big-data applications. By co-designing algorithms and architectures for a dedicated Linear Algebra Processor (LAP), the team's previous results indicate that a prototypical LAP in 45 nm technology is expected to sustain 600 double-precision GFLOPS at less than 25 W, with enough flexibility to support the full range of Basic Linear Algebra Subprograms (BLAS). This is orders of magnitude more energy efficient (as measured in energy per operation) than existing CPUs or GPUs.</p>
<p>In recent work, the UT Austin research team has shown that, with minimal modifications to the LAP base architecture, similar efficiencies are achievable across a wider range of applications, including complete matrix factorizations and Fast Fourier Transforms (FFTs). Ongoing work investigates the system integration of one or more LAPs into larger, heterogeneous multi-core host architectures, including the associated programming models and the optimized mapping and compilation of parallelized applications onto such platforms. The researchers are also investigating LAP implementation and prototyping on FPGAs or ASICs.</p>
<p>This research is funded by the National Science Foundation.</p>
<p><strong>Paper 1: </strong><a href="http://dx.doi.org/10.1007/s11265-014-0896-x">A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores</a></p>
<p><strong>Paper 2: </strong><a href="http://dx.doi.org/10.1109/TC.2014.2315627">Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator</a></p>
<p><strong>Paper 3: </strong><a href="http://users.ece.utexas.edu/~gerstl/publications/TC12.LAP.pdf">Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures</a></p>
</div></div></div><div class="field field-name-field-related-faculty field-type-node-reference field-label-inline clearfix"><div class="field-label">Related Faculty: </div><div class="field-items"><div class="field-item even"><a href="/people/faculty/andreas-gerstlauer">Andreas Gerstlauer</a></div></div></div><div class="field field-name-field-related-students field-type-node-reference field-label-inline clearfix"><div class="field-label">Related Researchers: </div><div class="field-items"><div class="field-item even"><a href="/people/students/ardavan-pedram">Ardavan Pedram</a></div></div></div><div class="field field-name-field-tags field-type-taxonomy-term-reference field-label-inline clearfix"><div class="field-label">Keywords: </div><div class="field-items"><div class="field-item even"><a href="/tags/domain-specific-multi-core-architectures">Domain-specific multi-core architectures</a>, <a href="/tags/heterogenenous-computing">heterogeneous computing</a>, <a href="/tags/low-power-high-performance-computing">low-power high-performance computing</a>, <a href="/tags/linear-algebra">Linear Algebra</a>, <a href="/tags/fft">FFT</a></div></div></div><div class="field-collection-container clearfix"><div class="field field-name-field-affiliates-only-files field-type-field-collection field-label-above"><div class="field-label">Affiliates Only Files: </div><div class="field-items"><div class="field-item even"><div class="field-collection-view clearfix view-mode-full"><div class="entity entity-field-collection-item field-collection-item-field-affiliates-only-files clearfix">
<div class="content">
<div class="field field-name-field-title-of-file field-type-text field-label-hidden affiliates_only_file_title"><div class="field-items"><div class="field-item even">A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores (preprint)</div></div></div> </div>
</div>
</div></div><div class="field-item odd"><div class="field-collection-view clearfix view-mode-full"><div class="entity entity-field-collection-item field-collection-item-field-affiliates-only-files clearfix">
<div class="content">
<div class="field field-name-field-title-of-file field-type-text field-label-hidden affiliates_only_file_title"><div class="field-items"><div class="field-item even">Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator (preprint)</div></div></div> </div>
</div>
</div></div><div class="field-item even"><div class="field-collection-view clearfix view-mode-full field-collection-view-final"><div class="entity entity-field-collection-item field-collection-item-field-affiliates-only-files clearfix">
<div class="content">
<div class="field field-name-field-title-of-file field-type-text field-label-hidden affiliates_only_file_title"><div class="field-items"><div class="field-item even">Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures (preprint)</div></div></div> </div>
</div>
</div></div></div></div></div>