Parallel Programming Workshop

Training details
Location
Barcelona (On-site)
Date
20/10/2025
Time
09:30
Target Audience
Scientists
Teaching language(s)
English
Organizing institution
BSC
Delivery mode
On-site
Level
Intermediate
Format
Hands-on session, Lecture, Workshop
Topics / Keywords
Parallelization, programming models, architecture
What You Will Learn
Students who complete this course will be able to develop benchmarks and applications with the MPI, OpenMP and hybrid MPI/OpenMP programming models, as well as analyze their execution and tune their behavior on parallel architectures.
Agenda
Day 1 (Monday October 20th)
Session 1 / 9:30 – 13:00 (20-minute break at 11:00)
1. Introduction to parallel architectures, algorithms design and performance parameters
2. Introduction to the MPI programming model
3. Practical: How to compile and run MPI applications
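As a taste of this first practical, here is a minimal MPI "hello world" in C. The file name, compiler wrapper and launcher (hello.c, mpicc, mpirun) are assumptions; the exact names depend on the local installation and batch system.

    /* hello.c - minimal MPI program (illustrative file name) */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();                        /* shut the runtime down */
        return 0;
    }

A typical build-and-run sequence is mpicc hello.c -o hello followed by mpirun -np 4 ./hello; on the course machines the scheduler's own launcher may replace mpirun.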
13:00 – 14:30 Lunch Break
Session 2 / 14:30 – 17:30 (20-minute break in between)
1. MPI: Non-blocking communication, collective communication, datatypes
2. Practical: Simple stencil (a non-blocking halo-exchange sketch follows this list)
3. MPI: One-sided communication
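To give a flavour of the non-blocking calls used in the stencil practical, below is a minimal sketch of a one-dimensional halo exchange on a periodic ring of ranks; the array size and neighbour logic are illustrative only.

    #include <mpi.h>

    #define N 1024                        /* local interior points (illustrative) */

    int main(int argc, char **argv)
    {
        double u[N + 2];                  /* local data plus one halo cell per side */
        MPI_Request reqs[4];
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        for (int i = 0; i < N + 2; i++)
            u[i] = rank;                  /* dummy initialization */

        int left  = (rank - 1 + size) % size;   /* periodic neighbours */
        int right = (rank + 1) % size;

        /* post all four transfers without blocking */
        MPI_Irecv(&u[0],     1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&u[N + 1], 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &reqs[1]);
        MPI_Isend(&u[1],     1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &reqs[2]);
        MPI_Isend(&u[N],     1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[3]);

        /* computation on interior points could overlap with communication here */

        MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
        MPI_Finalize();
        return 0;
    }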
Day 2 (Tuesday October 21st)
Session 1 / 9:30 – 13:00 (20-minute break at 11:00)
1. MPI: Hybrid Programming with Shared Memory and Accelerators; Non-blocking Collectives, Topologies, and Neighborhood Collectives
2. Practical: One-sided, shared-memory, topologies
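A small sketch of the topology material covered in this session: building a 2D Cartesian communicator, querying its neighbours, and splitting off a per-node shared-memory communicator. The grid dimensions are whatever MPI_Dims_create chooses; nothing here is specific to the course machines.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int size, dims[2] = {0, 0}, periods[2] = {1, 1};
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Dims_create(size, 2, dims);            /* factor the ranks into a 2D grid */

        MPI_Comm cart;
        MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

        int up, down, left, right;
        MPI_Cart_shift(cart, 0, 1, &up, &down);    /* neighbours along dimension 0 */
        MPI_Cart_shift(cart, 1, 1, &left, &right); /* neighbours along dimension 1 */
        printf("neighbours: up=%d down=%d left=%d right=%d\n", up, down, left, right);

        /* one communicator per shared-memory node, as used in the hybrid part */
        MPI_Comm node;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node);

        MPI_Comm_free(&node);
        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }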
13:00 – 14:30 Lunch Break
Session 2 / 14:30 – 17:30 (20-minute break in between)
1. Parallel debugging on MareNostrum III, with options ranging from print statements to Totalview
2. Practical: GDB and IDB
3. Practical: Totalview
4. Practical: Valgrind for memory leaks
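As background for the debugging practicals, the snippet below contains a deliberate memory leak; the comments show the kind of command lines used with GDB and Valgrind. The file name is illustrative, and the modules and paths on the course machines will be given in the session.

    /* leak.c - deliberately leaks memory (illustrative file name)
     *
     * Typical usage:
     *   gcc -g -O0 leak.c -o leak            # -g keeps symbols for the debuggers
     *   gdb ./leak                           # step through interactively
     *   valgrind --leak-check=full ./leak    # report the lost allocation
     */
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *buf = malloc(1024);     /* allocated here ... */
        memset(buf, 0, 1024);
        buf = NULL;                   /* ... but never freed: a definite leak */
        return 0;
    }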
Day 3 (Wednesday October 22nd)
Session 1 / 9:30 – 13:00 (20-minute break at 11:00)
1. Introduction to Paraver: tool to analyze and understand performance
2. Practical: Trace generation and trace analysis
13:00 – 14:30 Lunch Break
Session 2 / 14:30 – 17:30 (20-minute break in between)
1. Shared-memory programming models, OpenMP fundamentals
2. Parallel regions and work sharing constructs
3. Synchronization mechanisms in OpenMP
4. Practical: sample parallel and worksharing exercises
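A minimal sketch of the constructs covered in this session: a parallel region, a worksharing loop with a reduction, and a critical section. The array size is illustrative.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    static double a[N];

    int main(void)
    {
        double sum = 0.0;
        int last_thread = -1;

        #pragma omp parallel
        {
            /* worksharing loop; the reduction clause provides the
             * synchronization needed to accumulate sum safely */
            #pragma omp for reduction(+:sum)
            for (int i = 0; i < N; i++) {
                a[i] = 1.0 / (i + 1);
                sum += a[i];
            }

            /* explicit synchronization: one thread at a time */
            #pragma omp critical
            last_thread = omp_get_thread_num();
        }

        printf("sum = %f, last thread through the critical = %d\n",
               sum, last_thread);
        return 0;
    }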
Day 4 (Thursday October 23rd)
Session 1 / 9:30 – 13:00 (20-minute break at 11:00)
1. Worksharing constructs (continued)
2. OpenMP Tasking
3. Practical: Worksharing and Tasking
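A short tasking sketch for this session: a naive recursive Fibonacci in which each recursive call becomes a task, synchronized with taskwait. The cutoff value is illustrative.

    #include <stdio.h>

    static long fib(int n)
    {
        if (n < 2)
            return n;
        if (n < 20)                   /* illustrative cutoff: too small to taskify */
            return fib(n - 1) + fib(n - 2);

        long x, y;
        #pragma omp task shared(x)
        x = fib(n - 1);
        #pragma omp task shared(y)
        y = fib(n - 2);
        #pragma omp taskwait          /* wait for the two child tasks */
        return x + y;
    }

    int main(void)
    {
        long result;
        #pragma omp parallel
        #pragma omp single            /* one thread creates the initial tasks */
        result = fib(35);
        printf("fib(35) = %ld\n", result);
        return 0;
    }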
13:00 – 14:30 Lunch Break
Session 2 / 14:30 – 17:30 (20-minute break in between)
1. Codee: static code analysis for performance and Fortran modernization
2. Practical session: Codee for guided parallelization and Fortran modernization through auto-fixes
Day 5 (Friday October 24th)
Session 1 / 9:30 – 13:00 (20-minute break at 11:00)
1. Introduction to the OmpSs-2 programming model
2. Practical: cholesky, matrix multiplication, axpy, dot-product
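A rough sketch of what a taskified axpy (y = a*x + y) can look like in OmpSs-2, under the assumption that the oss pragmas and the start;length array-section dependence notation from the OmpSs-2 documentation apply; the block size and problem size are illustrative, and the exact compiler invocation is shown in the session.

    #include <stdlib.h>

    #define N  (1 << 20)
    #define BS (1 << 14)               /* illustrative block size */

    /* each block becomes a task with data dependences on its own region */
    void axpy(double a, const double *x, double *y, long n)
    {
        for (long i = 0; i < n; i += BS) {
            long len = (i + BS < n) ? BS : n - i;
            #pragma oss task in(x[i;len]) inout(y[i;len])
            for (long j = i; j < i + len; j++)
                y[j] += a * x[j];
        }
        #pragma oss taskwait           /* wait for all blocks */
    }

    int main(void)
    {
        double *x = malloc(N * sizeof(double));
        double *y = malloc(N * sizeof(double));
        for (long i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }
        axpy(3.0, x, y, N);
        free(x);
        free(y);
        return 0;
    }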
13:00 – 14:30 Lunch Break
Session 2 / 14:30 – 17:30 (20-minute break in between)
1. Programming using a hybrid MPI/OmpSs approach and TAMPI (a minimal sketch follows this list)
2. Practical: heat equation example and n-body examples
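A very rough sketch of the hybrid pattern behind these practicals, reusing the OmpSs-2 task syntax assumed above: communication for each block is issued from inside a task, and linking against TAMPI lets those blocking MPI calls be handled without stalling the worker threads. The pairwise exchange, block size and threading level requested here are illustrative placeholders; the session covers the exact TAMPI setup.

    #include <mpi.h>

    #define BS 1024                      /* illustrative block size */

    /* exchange one block with a partner rank from inside tasks */
    void exchange(double *halo_in, const double *border_out, int partner)
    {
        #pragma oss task in(border_out[0;BS])
        MPI_Send(border_out, BS, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD);

        #pragma oss task out(halo_in[0;BS])
        MPI_Recv(halo_in, BS, MPI_DOUBLE, partner, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        /* later compute tasks would declare in(halo_in[0;BS]) so they run
         * only after the receive has completed */
    }

    int main(int argc, char **argv)
    {
        /* TAMPI's documentation specifies the threading level to request;
         * MPI_THREAD_MULTIPLE is used here only as a placeholder */
        int provided, rank;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double halo[BS], border[BS];
        for (int i = 0; i < BS; i++) border[i] = rank;

        int partner = rank ^ 1;          /* pairwise exchange; assumes an even rank count */
        exchange(halo, border, partner);
        #pragma oss taskwait

        MPI_Finalize();
        return 0;
    }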
Instructor name(s)
Course Convener: Xavier Martorell, CS/Programming Models
Lecturers:
BSC – Computer Sciences department
Judit Giménez – Performance Tools – Group Manager
German Llort – Performance Tools – Senior Researcher
Marc Jordà – Accelerators and Communications for High Performance Computing – Research Engineer
Antonio Peña – Accelerators and Communications for High Performance Computing – Senior Researcher
Xavier Teruel – Best Practices for Performance and Programmability – Group Coordinator
Xavier Martorell – Programming Models – Group Manager
Course Description
The objective of this course is to understand the fundamental concepts underpinning the message-passing and shared-memory programming models. The course covers the two most widely used programming models: MPI for distributed-memory environments and OpenMP for shared-memory architectures. It also presents the main tools developed at BSC for collecting information on and analyzing the execution of parallel applications: Extrae and Paraver.
The course also presents the Codee tool (formerly Parallware), which can automatically parallelize a large number of program structures and provide the programmer with hints on how to change the code to improve parallelization. It addresses debugging alternatives, including GDB and Totalview, and considers the use of OpenMP in conjunction with MPI to better exploit the shared-memory capabilities of the compute nodes in clustered architectures. Paraver is used throughout the course as the tool for understanding the behavior and performance of parallelized codes. The course combines formal lectures with practical/programming sessions that reinforce the key concepts and set up the compilation/execution environment.
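To make the hybrid MPI/OpenMP idea concrete, here is a minimal sketch: each MPI process runs an OpenMP thread team, the threads reduce a value locally, and MPI combines the per-process results. The quantity being summed is meaningless and only illustrates the two levels of parallelism.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        /* request a threading level that lets OpenMP threads coexist with MPI;
         * only the main thread calls MPI in this sketch */
        int provided, rank;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double local = 0.0;
        #pragma omp parallel reduction(+:local)
        local += omp_get_thread_num() + 1;   /* each thread contributes a share */

        /* combine the per-process results across the cluster */
        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("global sum = %f\n", global);
        MPI_Finalize();
        return 0;
    }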
Prerequisites
Fortran, C or C++ programming. All examples in the course will be done in C.
Attendees can bring their own applications and work with them during the course for parallelization and analysis.
Technical setup
Software requirements: an SSH client (to connect to the HPC systems) and an X server (for remote visual tools).
