Florian Mayer
Florian Mayer, M. Sc.
Projects
OpenMP for reconfigurable heterogeneous architectures
(Third Party Funds Group – Sub project)
Overall project: OpenMP für rekonfigurierbare heterogene Architekturen (OpenMP for reconfigurable heterogeneous architectures)
Term: 01.11.2017 - 31.12.2023
Funding source: Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF)
URL: https://www2.cs.fau.de/research/ORKA/
High-Performance Computing (HPC) is an important component of Europe's capacity for innovation and it is also seen as a building block of the digitization of the European industry. Reconfigurable technologies such as Field Programmable Gate Array (FPGA) modules are gaining in importance due to their energy efficiency, performance, and flexibility.
There is also a trend towards heterogeneous systems with accelerators utilizing FPGAs. The great flexibility of FPGAs allows for a large class of HPC applications to be realized with FPGAs. However, FPGA programming has mainly been reserved for specialists as it is very time consuming. For that reason, the use of FPGAs in areas of scientific HPC is still rare today.
In the HPC environment, there are various programming models for heterogeneous systems offering certain types of accelerators. Common models include OpenCL (http://www.opencl.org), OpenACC (https://www.openacc.org) and OpenMP (https://www.OpenMP.org). These standards, however, are not yet available for use with FPGAs.
Goals of the ORKA project are:
- Development of an OpenMP 4.0 compiler targeting heterogeneous computing platforms with FPGA accelerators in order to simplify the usage of such systems.
- Design and implementation of a source-to-source framework transforming C/C++ code with OpenMP 4.0 directives into executable programs utilizing both the host CPU and an FPGA.
- Utilization (and improvement) of existing algorithms mapping program code to FPGA hardware.
- Development of new (possibly heuristic) methods to optimize programs for inherently parallel architectures.
In 2018, the following important contributions were made:
- Development of a source-to-source compiler prototype for the rewriting of OpenMP C source code (cf. goal 2).
- Development of an HLS compiler prototype capable of translating C code into hardware. This prototype later served as the starting point for the work towards goals 3 and 4.
- Development of several experimental FPGA infrastructures for the execution of accelerator cores (necessary for goals 1 and 2).
In 2019, the following significant contributions were achieved:
- Publication of two peer-reviewed papers: "OpenMP on FPGAs - A Survey" and "OpenMP to FPGA Offloading Prototype using OpenCL SDK".
- Improvement of the source-to-source compiler in order to properly support OpenMP-target-outlining for FPGA targets (incl. smoke tests).
- Completion of the first working ORKA-HPC prototype supporting a complete OpenMP-to-FPGA flow.
- Formulation of a genome for the pragma-based genetic optimization of the high-level synthesis step during the ORKA-HPC flow.
- Extension of the TaPaSCo composer to allow for hardware synchronization primitives inside of TaPaSCo systems.
In 2020, the following significant contributions were achieved:
- Improvement of the Genetic Optimization.
- Engineering of a Docker container for reliable reproduction of results.
- Integration of software components from project partners.
- Development of a plugin architecture for Low-Level-Platforms.
- Implementation and integration of two LLP plugin components.
- Broadening of the accepted subset of OpenMP.
- Enhancement of the test suite.
In 2021, the following significant contributions were achieved:
- Enhancement of the benchmark suite.
- Enhancement of the test suite.
- Successful project completion with live demo for the project sponsor.
- Publication of the paper "ORKA-HPC - Practical OpenMP for FPGAs".
- Release of the source code and the reproduction package.
- Enhancement of the accepted OpenMP subset with new clauses to control the FPGA related transformations.
- Improvement of the Genetic Optimization.
- Comparison of the estimated performance data given by the HLS and the real performance.
- Synthesis of a linear regression model for performance prediction based on that comparison.
- Implementation of an infrastructure for the translation of OpenMP reduction clauses.
- Automated translation of the OpenMP pragma `parallel for` into a parallel FPGA system.
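The idea of predicting real performance from HLS estimates with a linear regression model can be sketched as follows. This is only a minimal illustration using ordinary least squares on a single feature; the function names `fit_line` and `predict` and all numbers are made up for the example and are not taken from ORKA-HPC or its published data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new estimate."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical illustration data (NOT measurements from the project):
# runtime estimated by the HLS tool vs. runtime measured on hardware.
hls_estimate_ms = [1.0, 2.0, 3.0, 4.0]
measured_ms = [2.5, 4.5, 6.5, 8.5]

model = fit_line(hls_estimate_ms, measured_ms)
```

Once such a model is fitted, `predict(model, estimate)` gives a performance guess for a new design without running a full synthesis, which is the kind of shortcut a prediction model is meant to provide.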
In 2022, the following significant contributions were achieved:
- Generation and publication of an extensive dataset on HLS area estimates and actual performance.
- Creation and comparative evaluation of different regression models to predict actual system performance from early (area) estimates.
- Evaluation of the area estimates generated by the HLS.
- Publication of the paper “Reducing OpenMP to FPGA Round-trip Times with Predictive Modelling”.
- Development of a method to detect and remove redundant read operations in FPGA stencil codes based on the polyhedral model.
- Implementation of the method for ORKA-HPC.
- Quantitative evaluation of that method to show its strengths and the situations in which it should be used.
- Publication of the paper “Employing Polyhedral Methods to Reduce Data Movement in FPGA Stencil Codes”.
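To illustrate what a redundant read in a stencil code is, the following sketch compares a naive 3-point stencil with a hand-written sliding-window variant. It does not implement the polyhedral method named above; it is a manually optimized toy example, and the function names and the read counters are assumptions made for illustration only.

```python
def stencil_naive(a):
    """3-point sum stencil; every output re-reads all three of its inputs."""
    reads = 0
    out = []
    for i in range(1, len(a) - 1):
        out.append(a[i - 1] + a[i] + a[i + 1])
        reads += 3  # three memory reads per output element
    return out, reads


def stencil_window(a):
    """Same stencil, but a sliding window keeps the last two values
    in local variables so that each input element is read only once."""
    if len(a) < 3:
        return [], 0
    left, mid = a[0], a[1]
    reads = 2
    out = []
    for i in range(2, len(a)):
        right = a[i]
        reads += 1  # only the newly needed element is read
        out.append(left + mid + right)
        left, mid = mid, right
    return out, reads
```

Both functions compute the same result; the second one trades a small amount of local storage for fewer accesses to the input array, which is the kind of data movement reduction the method targets automatically.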
Current courses
Ausgewählte Kapitel aus dem Übersetzerbau
Basic data
Title | Ausgewählte Kapitel aus dem Übersetzerbau |
---|---|
Short text | inf2-ueb3 |
Module frequency | winter semester only |
Semester hours per week | 2 |
No registration is required.
Parallel groups / dates
The lecture covers aspects of compiler construction that go beyond the lectures "Grundlagen des Übersetzerbaus" and "Optimierungen in Übersetzern".
Prospective topics are:
- Compilers and optimizations for functional programming languages
- Compilation of aspect-oriented programming languages
- Detection of race conditions
- Software watermarking
- Static analysis and symbolic execution
- Linking of object code and support for dynamic libraries
- Strategies for exception handling
- Just-in-time compilers
- Memory management and garbage collection
- LLVM
The course materials are provided via StudOn: https://www.studon.fau.de/crs4533480.html
1. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Florian Mayer, Julian Brandner, Tobias Heineken |
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
weekly, Wed, 08:15 - 09:45 | 18.10.2023 - 07.02.2024 | 01.11.2023, 27.12.2023, 03.01.2024 | | | 11302.02.133 |
Grundlagen des Übersetzerbaus
Basic data
Title | Grundlagen des Übersetzerbaus |
---|---|
Short text | inf2-ueb |
Module frequency | winter semester only |
Semester hours per week | 2 |
Successful completion of the exercise assignments is a prerequisite for taking the module examination.
Parallel groups / dates
*Motivation:*
At first glance, it may appear less important to focus on compiler construction. Other areas seem to be much more applicable to current tasks in industrial practice. But appearances are deceptive:
- Compilers are among the most thoroughly studied middle-sized sequential software systems. Hence, there is a lot to learn from the experience gained in the past.
- In the exercises that accompany this lecture, you will construct your own (small) compiler.
- For many participants, this project will be their first bigger software project.
- Normally, you expect every compiler you use to generate correct code. In the lecture, you will learn how one can achieve the required degree of correctness and reliability.
- You will gain an understanding of the concepts of programming languages and of how high-level language features are translated into machine code. Keeping this knowledge at the back of your mind, you will improve your capability to write good and efficient programs.
- Compilers are used not only for programming languages. Special compilers are needed in many areas of everyday life in computer science, e.g. for text formatting, program transformations, aspect-oriented programming, XML processing etc.
- Every engineer should be able to build the tools he/she is using. For computer scientists, this requires an in-depth understanding of the guts of compilers.
*Main Focus of this Lecture:*
The lecture teaches concepts and techniques of compiler construction from a compiler developer's point of view, following the structure of the compiler frontend, middle end, and backend. Exercise sessions and practical assignments complement the lecture; the students implement their own compiler (based on a framework) for the e2 programming language, which is designed for this series of compiler construction lectures.
*Topics covered in the lecture:*
- Principles of compiling imperative programming languages
- Structure of a compiler
- Scanner and parser
- Abstract syntax trees (ASTs)
- Visitor design pattern
- AST transformations, desugaring
- Symbol tables and scopes
- Semantic analysis: name analysis, type checking
- Compilation of arithmetic expressions and control flow structures to register-based and stack-based intermediate languages
- Compilation of functions and function calls, activation records
- Compilation of object-oriented languages with single inheritance, interfaces, and multiple inheritance
- Method resolution in Java (overloaded and overridden methods)
- Code generation with Sethi-Ullman algorithm, Graham-Glanville algorithm, tree transformations, and dynamic programming
- Register allocation with local techniques and graph coloring
- Instruction scheduling with the list scheduling technique
- Debuggers
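As a small illustration of the Sethi-Ullman idea from the topic list, the sketch below computes for each node of an expression tree how many registers are needed to evaluate it without spilling. It uses the Ershov-number formulation (every leaf counts as 1), which is one common textbook variant; the tuple-based tree encoding and the function name `ershov` are assumptions made for this example and are not part of the course framework.

```python
def ershov(node):
    """Return the number of registers needed to evaluate an expression tree.

    A tree is either a string (a leaf, e.g. a variable name) or a tuple
    (operator, left_subtree, right_subtree).
    """
    if isinstance(node, str):
        return 1  # a leaf has to be loaded into one register
    _, left, right = node
    need_left = ershov(left)
    need_right = ershov(right)
    if need_left == need_right:
        # Both subtrees need the same amount: one extra register holds
        # the first result while the second subtree is evaluated.
        return need_left + 1
    # Otherwise evaluate the more demanding subtree first.
    return max(need_left, need_right)


# (a + b) * (c + d)
balanced = ("*", ("+", "a", "b"), ("+", "c", "d"))
# a + (b * (c + d))
skewed = ("+", "a", ("*", "b", ("+", "c", "d")))
```

The labels also determine the evaluation order: a code generator visits the child with the larger label first, so the skewed tree above needs fewer registers than the balanced one although it contains the same number of operators.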
*Lecture Topics:*
1. Introduction (Class overview, modular structure of compilers (front-, middle-, and backend), compilation bootstrapping)
2. Lexer and Parser (Tokens, literals, symbol table, grammar classes (LR(k), LL(k), ...), concrete syntax tree, shift-reduce parser)
3. ASTs and semantic analysis (Abstract syntax tree, visitor pattern, double dispatch, scopes, definition table)
4. Type consistency (Type safety, type system, type checks, type inference, type conversions, attributed grammars)
5. AST transformations (Transformation patterns (arithmetics), transformation of nested and generic classes)
6. Intermediate representations (Types of IRs, arithmetic operations, assignments, multidimensional array access, structs, control flow instructions, short-circuit evaluation)
7. Activation record and stack frame (Relative addresses, call by value/reference/name, nested functions, function pointers, stack pointer and frame pointer, function calls: prolog and epilog)
8. Object-oriented languages: single inheritance (Symbol and type analysis, method selection with method overloading and overriding, virtual method calls, class descriptors, dynamic type checks and casts)
9. Object-oriented languages II: interfaces, multiple inheritance (Interface v-tables, dynamic type checks and casts with interfaces, interfaces with default implementations and state, diamond problem, virtual inheritance)
10. Basic code generation (Code selection, register allocation, instruction order, basic blocks, optimal code generation for expression trees)
11. Optimized code selection (Code selection as tree transformation, Graham-Glanville code generators, dynamic programming)
12. Optimized register allocation (Performance approximations, liveness analysis, collision and interference graph, register spilling, coloring heuristics, optimistic extension, live range splitting, register coalescing, data structures)
13. Instruction-level parallelism, instruction order, debugger (Data, structural, and control conflicts in CPU pipelines, list scheduling, delay slots, branch prediction, superscalar and VLIW architectures, ptrace, breakpoints and watchpoints, DWARF)
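Register allocation by graph coloring (topic 12) can be sketched in a few lines. The following is a simplified simplify/select scheme in the style of Chaitin's allocator, without coalescing, live range splitting, or optimistic coloring; the function name `color_graph`, the dictionary-based graph encoding, and the highest-degree spill choice are assumptions made for this illustration.

```python
def color_graph(interference, k):
    """Assign one of k registers to each variable of an interference graph.

    interference maps each variable to the set of variables it conflicts
    with (the relation must be symmetric). Returns (assignment, spilled).
    """
    graph = {v: set(ns) for v, ns in interference.items()}
    work = {v: set(ns) for v, ns in graph.items()}
    stack = []
    spilled = []
    # Simplify phase: repeatedly remove nodes from the working graph.
    while work:
        candidate = next((v for v in sorted(work) if len(work[v]) < k), None)
        if candidate is None:
            # No node is trivially colorable: spill a highest-degree node.
            candidate = max(sorted(work), key=lambda v: len(work[v]))
            spilled.append(candidate)
        else:
            stack.append(candidate)
        for neighbor in work.pop(candidate):
            work[neighbor].discard(candidate)
    # Select phase: color nodes in reverse removal order.
    assignment = {}
    while stack:
        v = stack.pop()
        used = {assignment[n] for n in graph[v] if n in assignment}
        assignment[v] = next(c for c in range(k) if c not in used)
    return assignment, spilled


# Hypothetical interference graph: a, b, c are pairwise live at the same
# time; d only conflicts with c.
example = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

A node that is removed with fewer than `k` remaining neighbors is guaranteed to find a free color later, which is why the select phase never fails for the nodes on the stack.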
*Assignment Milestones:*
For the assignments of this course, the students put the concepts and techniques presented in the lecture for implementing a compiler into practice. The goal of the assignments is to implement a functional compiler for the e2 programming language by the end of the semester. The e2 language is specifically designed for educational purposes; the students obtain a description of the language.
A framework for the implementation is provided to the students. The students implement the core components of the compiler in five milestones.
All milestones need to be fulfilled to pass the module; the last milestone contains two tasks. In particular, the milestones are:
- Milestone 1: Grammar definition and construction of the AST: ANTLR productions, AST visitor interface, and generic AST visitor for array accesses and return and loop statements; AST visitor for AST visualization.
- Milestone 2: Name analysis: symbol table; declaring standard functions; AST visitor for name analysis.
- Milestone 3: Constant folding and type analysis: AST transformations for constant folding; AST visitor for bottom-up type analysis, adding AST nodes for implicit casts.
- Milestone 4: AST translation to intermediate representation: AST visitor to generate IR; translation of arithmetic, return, and assign statements, logical expressions, conditions, loops.
- Milestone 5.0: Memory assignment: definition and implementation of the ABI calling convention; memory assignment of variables; stack frame allocation; caller-save and callee-save registers.
- Milestone 5.1: Code generation: implementation of the e2 standard library; IR visitor to generate assembly code.
For milestones one through three, the compiler needs to support both integer and floating-point arithmetic. For the last two milestones, only integer arithmetic is required.
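The constant folding transformation mentioned in milestone 3 can be illustrated with a minimal sketch. It works on a toy AST made of nested tuples and handles integer operands only; this encoding, the `OPS` table, and the function name `fold` are assumptions for illustration and do not reflect the e2 framework handed out in the course.

```python
import operator

# Operators that this sketch is able to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Bottom-up constant folding on a tuple-based expression AST.

    A node is an int literal, a variable name (str), or a tuple
    (operator, left, right).
    """
    if not isinstance(node, tuple):
        return node  # literals and variables stay as they are
    op, left, right = node
    left = fold(left)
    right = fold(right)
    if op in OPS and isinstance(left, int) and isinstance(right, int):
        # Both operands are known constants: replace the node by its value.
        return OPS[op](left, right)
    return (op, left, right)


# (2 * 3) + x  becomes  6 + x
partly_constant = ("+", ("*", 2, 3), "x")
# (1 + 2) * (10 - 4)  becomes  18
fully_constant = ("*", ("+", 1, 2), ("-", 10, 4))
```

Folding the children before inspecting the parent is what lets constants propagate upwards through the tree in a single traversal.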
1. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Tobias Heineken, Florian Mayer |
Literature references:
- "Modern Compiler Implementation in Java", A. Appel, Cambridge University Press, 1998
- "Compilers - Principles, Techniques and Tools", A. Aho, M. Lam, R. Sethi, J. Ullman, Addison-Wesley, 2006
- "Modern Compiler Design", D. Grune, H. Bal, C. Jacobs, K. Langendoen, Wiley, 2002
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
weekly, Thu, 08:15 - 09:45 | 19.10.2023 - 08.02.2024 | 28.12.2023, 04.01.2024 | | | 11301.00.005 |
Übungen zu Ausgewählte Kapitel aus dem Übersetzerbau
Basic data
Title | Übungen zu Ausgewählte Kapitel aus dem Übersetzerbau |
---|---|
Short text | inf2-ueb3-ex |
Module frequency | winter semester only |
Semester hours per week | 2 |
Block course by arrangement after the lecture period.
Parallel groups / dates
The exercises for Compiler Construction 3 complement the lecture. Among other things, the lecture examines the architecture and operation of a virtual machine. The exercises put this into practice: in a block course, the students implement a small virtual machine themselves. The work starts with reading in the bytecode and ends with a working optimizing just-in-time compiler.
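To give an impression of the first step of such a virtual machine, the sketch below interprets a tiny stack-based bytecode. The instruction set (`PUSH`, `ADD`, `MUL`, `JMPZ`, `HALT`) and the tuple encoding are invented for this example; they are not the bytecode format used in the course, and the sketch contains no just-in-time compilation.

```python
def run(program):
    """Interpret a list of (opcode, *operands) tuples on an operand stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "JMPZ":
            # Jump to the given instruction index if the popped value is 0.
            if stack.pop() == 0:
                pc = args[0]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# (2 + 3) * 4
arithmetic = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("HALT",)]
# The conditional jump skips the instruction that would push 99.
branching = [("PUSH", 0), ("JMPZ", 3), ("PUSH", 99), ("PUSH", 7), ("HALT",)]
```

An interpreter loop of this shape is the usual baseline that a just-in-time compiler later replaces for frequently executed code.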
The course materials are provided via StudOn: https://www.studon.fau.de/crs4533480.html
1. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Tobias Heineken, Florian Mayer, Julian Brandner |
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
Block course, Mon, 09:00 - 18:00 | 04.03.2024 - 08.03.2024 | | | | 11302.02.135 |
Übungen zu Grundlagen des Übersetzerbaus
Basic data
Title | Übungen zu Grundlagen des Übersetzerbaus |
---|---|
Short text | inf2-ueb-ex |
Module frequency | winter semester only |
Semester hours per week | 2 |
Parallel groups / dates
In the exercises, the concepts and techniques for implementing a compiler that are presented in the lecture are put into practice. The goal of the exercises is to implement a working compiler for the example programming language e2 by the end of the semester.
The additional knowledge required for this (e.g. basics of x86-64 assembly) is taught in the classroom exercise sessions.
The milestones to be reached during the semester are listed in the UnivIS entry of the lecture.
The course materials are provided via StudOn: https://www.studon.fau.de/crs4533479.html
1. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Tobias Heineken, Florian Mayer |
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
weekly, Mon, 14:15 - 15:45 | 16.10.2023 - 05.02.2024 | 25.12.2023, 01.01.2024 | | | 11302.02.133 |
2. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Tobias Heineken, Florian Mayer |
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
weekly, Fri, 08:15 - 09:45 | 20.10.2023 - 09.02.2024 | 29.12.2023, 05.01.2024 | | | 11302.02.133 |
3. Parallel group
Semester hours per week | 2 |
---|---|
Teaching language | German |
Responsible | Prof. Dr. Michael Philippsen, Tobias Heineken, Florian Mayer |
Date and Time | Start date - End date | Cancellation date | Lecturer(s) | Comment | Room |
---|---|---|---|---|---|
weekly, Fri, 10:15 - 11:45 | 20.10.2023 - 09.02.2024 | 29.12.2023, 05.01.2024 | | | 11302.02.133 |
Publications
2023
Multipurpose Cacheing to Accelerate OpenMP Target Regions on FPGAs (Best Paper Award)
Proceedings of the 19th International Workshop on OpenMP, IWOMP 2023 (Bristol, GBR, 13.09.2023 - 15.09.2023)
In: Simon McIntosh-Smith, Tom Deakin, Michael Klemm, Bronis R. de Supinski, Jannis Klinkenberg (ed.): OpenMP: Advanced Task-Based, Device and Compiler Programming 2023
DOI: 10.1007/978-3-031-40744-4_10
Multipurpose Cacheing to Accelerate OpenMP Target Regions on FPGAs [Data set]
14114 (2023), p. 147 - 162
ISSN: 0302-9743
DOI: 10.5281/zenodo.8055889
(online publication)
Employing Polyhedral Methods to Reduce Data Movement in FPGA Stencil Codes
Languages and Compilers for Parallel Computing (LCPC 2022) (Chicago, IL, 12.10.2022 - 14.10.2022)
In: Charith Mendis, Lawrence Rauchwerger (ed.): Proc. of the 35th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC 2022), Cham: 2023
DOI: 10.1007/978-3-031-31445-2_4
2022
Reducing OpenMP to FPGA Round-trip Times with Predictive Modelling
18th International Workshop on OpenMP (IWOMP 2022) (Chattanooga, TN, 27.09.2022 - 30.09.2022)
In: Michael Klemm, Bronis R. de Supinski, Jannis Klinkenberg, Brandon Neth (ed.): OpenMP in a Modern World: From Multi-device Support to Meta Programming 2022
DOI: 10.1007/978-3-031-15922-0
Reducing OpenMP to FPGA Round-trip Times with Predictive Modelling [Data set]
Zenodo (2022)
DOI: 10.5281/zenodo.7534795
(online publication)
The ORKA-HPC Compiler — Practical OpenMP for FPGAs
34th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2021) (Newark, DE, 13.10.2021 - 14.10.2021)
In: Xiaoming Li, Sunita Chandrasekaran (ed.): Proceedings of the 34th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2021), Cham: 2022
DOI: 10.1007/978-3-030-99372-6
URL: https://lcpc2021.github.io/pre_workshop_papers/Mayer_lcpc21.pdf
2019
OpenMP to FPGA Offloading Prototype using OpenCL SDK
IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (Rio de Janeiro, Brazil, 20.05.2019 - 24.05.2019)
In: IEEE (ed.): 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2019
DOI: 10.1109/IPDPSW.2019.00072
URL: https://ieeexplore.ieee.org/abstract/document/8778393
OpenMP on FPGAs - A Survey
15th International Workshop on OpenMP (IWOMP 2019) (Auckland, 11.09.2019 - 13.09.2019)
In: Xing Fan, Bronis R. de Supinski, Oliver Sinnen, Nasser Giacaman (ed.): OpenMP: Conquering the Full Hardware Spectrum - Proceedings of the 15th International Workshop on OpenMP (IWOMP 2019), Cham: 2019
DOI: 10.1007/978-3-030-28596-8_7
URL: https://link.springer.com/content/pdf/10.1007/978-3-030-28596-8_7.pdf