Achtung:

Sie haben Javascript deaktiviert!
Sie haben versucht eine Funktion zu nutzen, die nur mit Javascript möglich ist. Um sämtliche Funktionalitäten unserer Internetseite zu nutzen, aktivieren Sie bitte Javascript in Ihrem Browser.

High Performance Computing

We co-operate with the group of Prof. Dr. Christian Plessl and the PC2 to develop tools for efficient use of parallel hardware as FPGAs and GP-GPUs for scientific computing .

Next topic: Semiconductor Theory

Related publications by the TET group


Open list in Research Information System

Solving Maxwell's Equations with Modern C++ and SYCL: A Case Study

A. Afzal, C. Schmitt, S. Alhaddad, Y. Grynko, J. Teich, J. Förstner, F. Hannig, in: Proceedings of the 29th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2018, pp. 49-56

In scientific computing, unstructured meshes are a crucial foundation for the simulation of real-world physical phenomena. Compared to regular grids, they allow resembling the computational domain with a much higher accuracy, which in turn leads to more efficient computations.<br />There exists a wealth of supporting libraries and frameworks that aid programmers with the implementation of applications working on such grids, each built on top of existing parallelization technologies. However, many approaches require the programmer to introduce a different programming paradigm into their application or provide different variants of the code. SYCL is a new programming standard providing a remedy to this dilemma by building on standard C ++17 with its so-called single-source approach: Programmers write standard C ++ code and expose parallelism using C++17 keywords. The application is<br />then transformed into a concrete implementation by the SYCL implementation. By encapsulating the OpenCL ecosystem, different SYCL implementations enable not only the programming of CPUs but also of heterogeneous platforms such as GPUs or other devices. For the first time, this paper showcases a SYCL-<br />based solver for the nodal Discontinuous Galerkin method for Maxwell’s equations on unstructured meshes. We compare our solution to a previous C-based implementation with respect to programmability and performance on heterogeneous platforms.<br


OpenCL-based FPGA Design to Accelerate the Nodal Discontinuous Galerkin Method for Unstructured Meshes

T. Kenter, G. Mahale, S. Alhaddad, Y. Grynko, C. Schmitt, A. Afzal, F. Hannig, J. Förstner, C. Plessl, in: Proc. Int. Symp. on Field-Programmable Custom Computing Machines (FCCM), IEEE, 2018

The exploration of FPGAs as accelerators for scientific simulations has so far mostly been focused on small kernels of methods working on regular data structures, for example in the form of stencil computations for finite difference methods. In computational sciences, often more advanced methods are employed that promise better stability, convergence, locality and scaling. Unstructured meshes are shown to be more effective and more accurate, compared to regular grids, in representing computation domains of various shapes. Using unstructured meshes, the discontinuous Galerkin method preserves the ability to perform explicit local update operations for simulations in the time domain. In this work, we investigate FPGAs as target platform for an implementation of the nodal discontinuous Galerkin method to find time-domain solutions of Maxwell's equations in an unstructured mesh. When maximizing data reuse and fitting constant coefficients into suitably partitioned on-chip memory, high computational intensity allows us to implement and feed wide data paths with hundreds of floating point operators. By decoupling off-chip memory accesses from the computations, high memory bandwidth can be sustained, even for the irregular access pattern required by parts of the application. Using the Intel/Altera OpenCL SDK for FPGAs, we present different implementation variants for different polynomial orders of the method. In different phases of the algorithm, either computational or bandwidth limits of the Arria 10 platform are almost reached, thus outperforming a highly multithreaded CPU implementation by around 2x.


Flexible FPGA design for FDTD using OpenCL

T. Kenter, J. Förstner, C. Plessl, in: Proc. Int. Conf. on Field Programmable Logic and Applications (FPL), IEEE, 2017

DOI


Accelerating Finite Difference Time Domain Simulations with Reconfigurable Dataflow Computers

H. Giefers, C. Plessl, J. Förstner, ACM SIGARCH Computer Architecture News (2014), 41(5), pp. 65-70

DOI


Convey Vector Personalities – FPGA Acceleration with an OpenMP-like Effort?

B. Meyer, J. Schumacher, C. Plessl, J. Förstner, in: Proc. Int. Conf. on Field Programmable Logic and Applications (FPL), IEEE, 2012, pp. 189-196

Although the benefits of FPGAs for accelerating scientific codes are widely acknowledged, the use of FPGA accelerators in scientific computing is not widespread because reaping these benefits requires knowledge of hardware design methods and tools that is typically not available with domain scientists. A promising but hardly investigated approach is to develop tool flows that keep the common languages for scientific code (C,C++, and Fortran) and allow the developer to augment the source code with OpenMPlike directives for instructing the compiler which parts of the application shall be offloaded the FPGA accelerator. In this work we study whether the promise of effective FPGA acceleration with an OpenMP-like programming effort can actually be held. Our target system is the Convey HC-1 reconfigurable computer for which an OpenMP-like programming environment exists. As case study we use an application from computational nanophotonics. Our results show that a developer without previous FPGA experience could create an FPGA-accelerated application that is competitive to an optimized OpenMP-parallelized CPU version running on a two socket quad-core server. Finally, we discuss our experiences with this tool flow and the Convey HC-1 from a productivity and economic point of view.


Transformation of scientific algorithms to parallel computing code: subdomain support in a MPI-multi-GPU backend

B. Meyer, C. Plessl, J. Förstner, in: Symp. on Application Accelerators in High Performance Computing (SAAHPC), IEEE Computer Society, 2011, pp. 60-63

DOI


Open list in Research Information System

Funding, co-operations

Contact

Dr. Yevgen Grynko

Theoretical Electrical Engineering

Yevgen Grynko
Phone:
+49 5251 60-3815
Fax:
+49 5251 60-3524
Office:
P 1.5.02.2

Samer Alhaddad

Theoretical Electrical Engineering

Samer Alhaddad
Phone:
+49 5251 60-3819
Office:
P1.5.02.2

Head of the group

Prof. Dr. Jens Förstner

Theoretical Electrical Engineering

Jens Förstner
Phone:
+49 5251 60-3013
Fax:
+49 5251 60-3524
Office:
P1.5.01.1
Web:

Office hours:

on request (during lecture break)

The University for the Information Society