This repository contains educational materials and programming exercises for the 2025 NCSA CI Pathway Parallel Computing Course. The course provides hands-on experience with parallel programming paradigms including OpenMP, MPI, and OpenACC, focusing on high-performance computing concepts and practical implementation.
This course is sponsored by NSF Award #2417789 and led by the Pittsburgh Supercomputing Center (PSC) and the National Center for Supercomputing Applications (NCSA).
The repository is organized into comprehensive modules covering different aspects of parallel computing:
- Exercises/: Practical programming exercises with multiple paradigms
- HW/: Three comprehensive homework assignments with detailed analysis
- Lecture/: Course lecture materials and presentations
- Setup: Environment configuration and setup instructions
Platform: NCSA Delta HPC Cluster
- Login Node: dt-login04.delta.ncsa.illinois.edu
- Compute Nodes: AMD EPYC 7763 64-Core processors
- Architecture: x86_64 with 128 hardware threads per node
- Interconnect: HPE Slingshot high-speed network
Quick Start: NCSA OnDemand Portal
Training Platform: HPC Moodle
OpenMP (shared memory):
- Thread-based parallelization for multicore systems
- Compiler directives for parallel loops and reductions
- Performance scaling from 1 to 32 threads
- Example: laplace_omp.c, a 2D heat equation solver
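
To make the "parallel loops and reductions" point concrete, here is a minimal sketch of one Jacobi sweep for a 2D Laplace/heat problem. It is not taken from laplace_omp.c; the function name `jacobi_sweep` and the flattened row-major layout are assumptions made for illustration. Without an OpenMP flag the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep over the interior of an n x n grid stored row-major.
 * Reads from old_grid, writes to new_grid, and returns the largest
 * absolute change at any interior point (a common convergence measure).
 * Hypothetical helper, not the course's implementation. */
double jacobi_sweep(const double *old_grid, double *new_grid, int n)
{
    assert(n >= 3);
    double max_change = 0.0;

    /* Rows are independent within a sweep, so they can be divided among
     * threads; reduction(max:...) merges each thread's private maximum. */
    #pragma omp parallel for reduction(max:max_change)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            size_t k = (size_t)i * n + j;
            new_grid[k] = 0.25 * (old_grid[k - n] + old_grid[k + n] +
                                  old_grid[k - 1] + old_grid[k + 1]);
            double change = new_grid[k] - old_grid[k];
            if (change < 0.0)
                change = -change;
            if (change > max_change)
                max_change = change;
        }
    }
    return max_change;
}
```

A solver would typically call this in a loop, swapping the two grids after each sweep until the returned change drops below a tolerance.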
MPI (distributed memory):
- Message passing for cluster computing
- Domain decomposition techniques
- Inter-process communication patterns
- Example: laplace_mpi.c, a distributed Laplace solver
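
As a small illustration of domain decomposition, the sketch below splits the rows of a grid into contiguous blocks, one per rank. The names `RowRange` and `block_rows` are hypothetical, and this is not necessarily how laplace_mpi.c partitions its grid. In an MPI program, `rank` and `n_ranks` would normally come from `MPI_Comm_rank` and `MPI_Comm_size`; the arithmetic is kept free of MPI calls so it compiles on its own.

```c
#include <assert.h>

/* Half-open row range [start, end) owned by one rank. */
typedef struct {
    int start;
    int end;
} RowRange;

/* Split n_rows as evenly as possible across n_ranks. The first
 * (n_rows % n_ranks) ranks each receive one extra row. */
RowRange block_rows(int n_rows, int n_ranks, int rank)
{
    assert(n_rows >= 0 && n_ranks > 0 && rank >= 0 && rank < n_ranks);
    int base = n_rows / n_ranks;
    int extra = n_rows % n_ranks;
    RowRange r;
    r.start = rank * base + (rank < extra ? rank : extra);
    r.end = r.start + base + (rank < extra ? 1 : 0);
    return r;
}
```

For example, 10 rows over 4 ranks gives the ranges [0, 3), [3, 6), [6, 8), and [8, 10).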
OpenACC (GPU computing):
- Accelerator-based parallel computing
- GPU optimization strategies
- Performance portability across architectures
- Example: laplace_acc.c, a GPU-accelerated solver
Students learn to:
- Measure parallel speedup and efficiency
- Analyze scalability characteristics
- Compare different parallelization strategies
- Optimize for specific hardware architectures
- Compile and run parallel applications
- Use job scheduling systems (Slurm)
- Debug parallel code issues
- Apply algorithmic optimizations (e.g., Red-Black iterative methods)
- Conduct systematic performance experiments
- Document results with statistical analysis
- Create professional technical reports
- Visualize performance data effectively
| Threads | Time (s) | Speedup | Efficiency |
|---|---|---|---|
| 1 | 21.7 | 1.00× | 100% |
| 8 | 4.2 | 5.21× | 65.1% |
| 32 | 2.0 | 10.91× | 34.1% |
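
The Speedup and Efficiency columns follow the usual definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the thread count. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>

/* Speedup S = T_serial / T_parallel. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Efficiency E = S / p, returned as a fraction (multiply by 100 for %). */
double efficiency(double t_serial, double t_parallel, int threads)
{
    assert(threads > 0);
    return speedup(t_serial, t_parallel) / threads;
}
```

Applied to the rounded times in the table (21.7 s and 4.2 s), `speedup` gives about 5.17 rather than 5.21. The likely reason is that the table was computed from unrounded timings, but that is an assumption.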
- Execution Time: 6.4 seconds
- Speedup: 7.30× (vs serial baseline)
- Communication Overhead: Optimized ghost cell exchanges
hw1 (OpenMP performance study):
- Parallel scaling analysis with thread count variation
- Comparison of serial, OpenMP, and enhanced parallel algorithms
- Red-Black checkerboard optimization implementation
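
For readers unfamiliar with the Red-Black checkerboard idea, here is a minimal sketch of one red-black Gauss-Seidel iteration on a row-major n x n grid. The names `sweep_color` and `red_black_iteration` are hypothetical, and this is not the homework solution. With a 5-point stencil, every "red" point (i + j even) depends only on "black" neighbors and vice versa, so each half-sweep can update the grid in place in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* Update, in place, every interior point whose (i + j) parity equals
 * color (0 = red, 1 = black). Points of one color read only points of
 * the other color, so iterations of the loop do not conflict. */
static void sweep_color(double *grid, int n, int color)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        /* First interior column in row i with the requested parity. */
        int j0 = 1 + ((i + 1 + color) % 2);
        for (int j = j0; j < n - 1; j += 2) {
            size_t k = (size_t)i * n + j;
            grid[k] = 0.25 * (grid[k - n] + grid[k + n] +
                              grid[k - 1] + grid[k + 1]);
        }
    }
}

/* One full iteration: red half-sweep, then black half-sweep. */
void red_black_iteration(double *grid, int n)
{
    assert(n >= 3);
    sweep_color(grid, n, 0);
    sweep_color(grid, n, 1);
}
```

Unlike a Jacobi sweep, no second grid is needed, and the black half-sweep already sees the freshly updated red values.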
hw2 (race conditions and optimization):
- Prime number calculation with parallel optimization
- Race condition identification and resolution
- Thread synchronization techniques
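
A typical race condition in a prime-counting loop is several threads incrementing one shared counter. The sketch below shows one common fix, an OpenMP `reduction` clause. The function names, the trial-division test, and the `schedule(dynamic, 64)` choice are assumptions for illustration, not the assignment's reference solution.

```c
#include <assert.h>

/* Trial-division primality test; adequate for small illustrative inputs. */
static int is_prime(int n)
{
    if (n < 2)
        return 0;
    for (int d = 2; (long long)d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Count primes in [2, limit]. A plain count++ on a shared variable inside
 * the parallel loop would be a data race; reduction(+:count) gives each
 * thread a private counter and sums the counters when the loop ends. */
int count_primes(int limit)
{
    assert(limit >= 0);
    int count = 0;
    #pragma omp parallel for reduction(+:count) schedule(dynamic, 64)
    for (int n = 2; n <= limit; n++)
        count += is_prime(n);
    return count;
}
```

The dynamic schedule is a guess at load balancing, since larger candidates take longer to test. An `atomic` or `critical` update would also remove the race, usually at a higher synchronization cost.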
hw3 (advanced MPI techniques):
- 2D domain decomposition strategies
- Communication pattern optimization
- Performance analysis with scaling studies
- Jupyter notebook visualization of results
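
In a 2D domain decomposition, each rank needs to know which ranks hold the adjacent subdomains before it can exchange ghost cells. The sketch below computes those neighbors by hand for a row-major, non-periodic process grid. The `Neighbors` struct and `grid_neighbors` function are hypothetical. MPI itself provides `MPI_Cart_create` and `MPI_Cart_shift` for this purpose and uses `MPI_PROC_NULL` where this sketch uses -1; the manual arithmetic is used here only so the example compiles without MPI.

```c
#include <assert.h>

/* Neighbor ranks of one process in a process grid with prows rows and
 * pcols columns (row-major rank order, non-periodic). -1 marks a physical
 * boundary: no neighbor, so no ghost-cell exchange in that direction. */
typedef struct {
    int north;
    int south;
    int west;
    int east;
} Neighbors;

Neighbors grid_neighbors(int rank, int prows, int pcols)
{
    assert(prows > 0 && pcols > 0 && rank >= 0 && rank < prows * pcols);
    int row = rank / pcols;   /* process row, 0 .. prows-1 */
    int col = rank % pcols;   /* process column, 0 .. pcols-1 */
    Neighbors nb;
    nb.north = (row > 0)         ? rank - pcols : -1;
    nb.south = (row < prows - 1) ? rank + pcols : -1;
    nb.west  = (col > 0)         ? rank - 1     : -1;
    nb.east  = (col < pcols - 1) ? rank + 1     : -1;
    return nb;
}
```

In a 2 x 3 process grid, rank 4 sits in the bottom row, middle column, so its neighbors are north = 1, west = 3, east = 5, with no southern neighbor.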
# OpenMP
nvc -mp laplace_omp.c
# MPI
mpicc laplace_mpi.c
# OpenACC
nvc -acc -Minfo=accel laplace_acc.c

# Interactive OpenMP job
srun --account=becs-delta-cpu --partition=cpu-interactive \
--nodes=1 --cpus-per-task=32 --pty bash
# MPI job submission
srun --account=becs-delta-cpu --partition=cpu-interactive \
--nodes=1 --tasks=4 --tasks-per-node=4 --pty bash

Upon completion, students will be able to:
- Implement parallel algorithms using multiple programming models
- Analyze parallel performance characteristics and bottlenecks
- Optimize code for different hardware architectures
- Evaluate trade-offs between programming paradigms
- Communicate technical results through professional documentation
This course prepares students for careers in:
- High-Performance Computing: National laboratories, research institutions
- Scientific Computing: Weather modeling, computational physics, bioinformatics
- Industry Applications: Financial modeling, machine learning, data analytics
- Research Computing: Academic research support and development
CI-Pathway-exercise/
├── README.md              # This file
└── parallel_computing/
    ├── Exercises/         # Practice implementations
    │   ├── MPI/           # Message passing examples
    │   ├── OpenMP/        # Shared memory examples
    │   ├── OpenACC/       # GPU computing examples
    │   └── Test/          # Validation programs
    ├── HW/                # Graded assignments
    │   ├── hw1/           # OpenMP performance study
    │   ├── hw2/           # Race conditions & optimization
    │   └── hw3/           # Advanced MPI techniques
    ├── Lecture/           # Course materials
    └── Setup              # Environment configuration
- Access the computing environment: Connect to NCSA Delta
- Load required modules: Set up compilers and MPI libraries
- Clone this repository: Download course materials
- Start with exercises: Begin with the Exercises/ directory
- Progress to assignments: Complete HW/ in sequence
This project is licensed under a modified Apache License 2.0 with academic use restrictions. All rights reserved to Hochan Son (ohsono@gmail.com or hochanson@g.ucla.edu).
Important: This software is for academic and educational use only. Commercial use is prohibited and prior authorization from the author is required for any use. See LICENSE.md for complete terms.
This educational program advances the national cyberinfrastructure workforce through hands-on parallel computing training, preparing the next generation of computational scientists and HPC practitioners.