Sr. HPC Systems Engineer (IT@JH Research Computing)
Johns Hopkins University
Details
Posted: 20-Apr-26
Location: Baltimore, Maryland
Internal Number: 120772-en_US 1
IT@JH Research Computing is seeking a Sr. HPC Systems Engineer who will design, build, and maintain advanced high-performance computing environments supporting Johns Hopkins University’s research mission. This position focuses on the reliable operation, configuration, and optimization of HPC and AI systems, including multi-node CPU and GPU clusters, high-speed InfiniBand and Ethernet networks, and large-scale parallel and object storage. The engineer implements and automates secure, efficient, and reproducible computing platforms used by faculty, researchers, and students across diverse scientific disciplines. Assignments include both ticket-based support and project-based deployments. The role operates with moderate independence, collaborating closely with the IT Architect, Research Computing, and reporting to the IT Manager for Research Computing to ensure scalable, sustainable, and high-performance systems that enable cutting-edge scientific discovery.
Specific Duties & Responsibilities
Support and administer production systems used by researchers and Research Centers.
Provide technical leadership/project management for system configuration, implementation, management, and user support for both new and existing systems.
Research and recommend new functionality for HPC management and administration tools by exploring system-wide impacts, working with functional users to define current and future processes.
Architect, operate, and debug large-scale HPC network and storage infrastructure, including MPI, NCCL, RDMA, InfiniBand, and parallel file systems.
Work with scientific support specialists, assigning tasks and providing oversight to the HPC engineering team as appropriate, to support scientific researchers who use a broad spectrum of applications from diverse fields.
Analyze results of server monitoring and implement changes to improve performance, processing, and utilization.
Propose, maintain, and enforce policies, practices and security procedures.
Provide break/fix support, setup/installation support, escalation support, and solutions support.
Collaborate closely with a variety of stakeholders, both internal and external, on all aspects of projects.
Other duties as assigned.
In Addition to the Duties Described Above
Deploy, configure, and maintain large-scale Linux-based HPC clusters comprising CPU and GPU nodes, high-speed interconnects, and parallel file systems.
Implement and optimize workload schedulers (Slurm) and job submission policies to maximize system throughput and fair-share usage.
Administer and monitor distributed storage systems (GPFS, Lustre, WekaFS, Ceph, MinIO) to ensure reliability and performance across multi-petabyte environments.
Maintain high-speed fabric and network infrastructure (InfiniBand, Ethernet) to support low-latency data transfer and MPI workloads.
Support research groups in deploying, testing, and optimizing scientific applications and AI/ML workflows on shared computing resources.
Develop and maintain automation and monitoring frameworks for system provisioning, metrics collection, and alerting (Prometheus, Grafana, ELK).
Participate in capacity planning, hardware lifecycle management, and evaluation of new technologies in collaboration with architects and management.
Ensure security and compliance through configuration hardening, patch management, and integration with campus identity and access control systems.
Document system designs, procedures, and troubleshooting guides to support knowledge transfer and team continuity.
Contribute to a collaborative engineering culture that emphasizes service quality, innovation, and continuous improvement in research computing operations.
Minimum Qualifications
Bachelor’s degree.
Six years of related experience.
Additional education may substitute for required experience and additional related experience may substitute for required education beyond a high school diploma/graduation equivalent, to the extent permitted by the JHU equivalency formula.
Preferred Qualifications
Eight plus years of experience in high-performance computing systems administration or engineering, including experience with cluster management, workload scheduling (e.g., Slurm), and distributed or parallel storage.
Deep proficiency in Linux systems administration, configuration management (Ansible, Puppet, or Salt), performance monitoring, and tuning for HPC workloads.
Experience with high-speed interconnects (InfiniBand, 100/400 Gb Ethernet) and parallel file systems (e.g., GPFS, Lustre, BeeGFS, or WekaFS).
Working knowledge of containerization and orchestration (Singularity, Docker, Kubernetes for HPC).
Ability to automate deployments and routine operations through scripting (Bash, Python).
Familiarity with data-center operations, GPU acceleration, and research software environments (e.g., CUDA, MPI, AI/ML frameworks).
Strong analytical and troubleshooting skills, with proven ability to support complex research workloads in multi-user, multi-tenant environments.
Experience collaborating with faculty and research groups to translate scientific requirements into practical and performant computing solutions.
Classified Title: Sr. HPC Systems Engineer
Role/Level/Range: ATP/04/PF
Starting Salary Range: $85,500 - $149,800 Annually (Commensurate w/exp.)
Employee group: Full Time
Schedule: Mon-Fri, 8:30am-5pm
FLSA Status: Exempt
Location: Johns Hopkins Bayview
Department name: IT@JH Research Computing
Personnel area: University Administration
The Johns Hopkins University (JHU) was founded in 1876 as the nation's first research university, dedicated to bringing the benefits of discovery to the world. JHU is the largest private employer in Baltimore and Maryland, making a large economic impact in the city and state. We enroll more than 30,000 full- and part-time students across ten academic divisions, offering in-person and remote learning in over 400 programs. Not only are we located in Baltimore, but we also have a presence in Washington, D.C.