HPC Specialist

Arcadion

ottawa, on, Canada
Full-time
Posted May 20, 2026

Job Description

HPC Specialist – Role Overview

Arca dion is seeking an HPC (High-Performance Computing) Specialist responsible for the design, deployment, optimization, and management of high-performance computing systems, often used in scientific research, engineering simulations, AI/ML workloads, and large-scale data analytics.

Core Responsibilities

  1. Architecture & System Design
    • Design scalable HPC clusters (on-prem, cloud, or hybrid)
    • Choose appropriate CPUs, GPUs, interconnects (e.g., InfiniBand), and storage
    • Configure Slurm, PBS, or OpenHPC job schedulers
  2. Cluster Deployment & Maintenance
    • Install and manage Linux-based compute nodes
    • Maintain job schedulers and resource managers
    • Integrate monitoring tools (Prometheus, Grafana, Nagios)
  3. Performance Tuning & Optimization
    • Benchmark workloads and tune for performance (e.g., MPI, CUDA, OpenMP)
    • ...