• HPC SYSTEMS ENGINEER

    University of Washington, Seattle, WA 98194

    Job #2695404900

  • Req #: 234406

    Department: UW INFORMATION TECHNOLOGY

    Appointing Department Web Address: ~~~

    Posting Date: 05/15/2024

    Closing Info

    Open Until Filled

    Salary: $8,090 - $11,916 per month

    Shift: First Shift

    Notes

    As a UW employee, you will enjoy generous benefits and work/life programs. For a complete description of our benefits for this position, please visit our website. (~~~)

    As a UW employee, you have a unique opportunity to change lives on our campuses, in our state, and around the world. UW employees offer their boundless energy, creative problem-solving skills, and dedication to build stronger minds and a healthier world. By being deeply invested in our work, showing compassion in our interactions, and embodying the spirit of a team player, each member contributes to a thriving community. UW is committed to attracting and retaining a diverse staff; your experiences, perspectives, and unique identities will be honored at the University of Washington. Together, our community strives to create and maintain working and learning environments that are inclusive, equitable, and welcoming.

    UW Information Technology (UW-IT) is the central information technology organization for the University of Washington, responsible for strategic planning, oversight, and direction of the UW's IT infrastructure, resources, and services. UW-IT provides critical technology support to all three campuses, UW Medicine, and research operations around the world, partnering with the UW community to enable innovation, learning, discovery, and service.

    The Research Cyberinfrastructure group within the University's central IT organization designs, implements, and operates Linux-based High-Performance Computing (HPC) clusters and storage systems to offer UW researchers cost-effective yet powerful high-performance computing options that unlock new possibilities for their research.

    This position focuses on supporting HPC efforts for the research computing group, while providing expertise and support to other endeavors when needed. This position requires a team-oriented professional, experienced in managing large IT systems for research using automated and software-defined approaches. This position regularly interfaces with other UW-IT teams as well as research customers across campus.

    Expertise in process automation, software development, and Linux system administration, together with attention to detail, eagerness to evaluate and push new technologies, an optimistic "can do" disposition, a collaborative growth mindset, and a customer-centric perspective, are all key for success in this role.

    REQUIREMENTS:

    • Bachelor's degree in computer science, information technology, a scientific or engineering discipline, or a related field, or equivalent experience.

    • Minimum of four years' experience in Linux system administration or substantial experience working with Linux.

    • Basic knowledge of networking hardware, software, protocols, and concepts.

    • Demonstrated excellent written/oral communication skills, technical documentation skills, user liaison skills, and personal interaction abilities.

    • Demonstrated ability to work with minimal supervision, both independently and as part of a team.

    DESIRED:

    • Knowledge of containerization platforms such as Docker or Singularity.

    • Progressively responsible experience as an engineer, architect, or role with comparable technical responsibilities in a large Linux HPC environment.

    • Experience with administration of Linux operating systems in a production environment, including experience with Red Hat Enterprise Linux or derivatives such as CentOS or Rocky Linux.

    • Familiarity with SLURM or another HPC scheduler (PBS Pro, PBS/Torque, SGE/UGE, LSF, etc.).

    • Experience designing, configuring, and troubleshooting networks using both Ethernet and high-performance interconnects such as InfiniBand.

    • Proficiency in programming/scripting languages in the context of systems engineering or administration, preferably including Bash, Golang, or Python.

    • Experience in the configuration and use of mass deployment tools such as MAAS, Foreman, xCAT, Cobbler, Warewulf, or similar.

    • Experience with configuration management tools such as Ansible, Chef, SaltStack, or Puppet.

    • Proficiency with the use of Git for source control in collaboration with a team with multiple contributors.

    • Ability to administer and troubleshoot large high-performance parallel filesystems such as IBM Storage Scale (GPFS), Lustre, BeeGFS, or Ceph.

    • Experience with use of containers, preferably including use of Apptainer in an HPC environment.

    • Experience in a data center environment (e.g., racking equipment, running cables, labeling, asset tracking).

    • Scientific background, research experience, and/or experience in a University setting.

    CONDITIONS OF EMPLOYMENT:

    Requires monitoring of e-mail and trouble ticket system for questions needing immediate response during business ~~~-call responsibilities for after-hours system outages. Server management will include both production (24x7x365) and development systems. This is an essential position and is required to work remotely when the University suspends operations.

    Application Process:

    The application process may include completion of a variety of online assessments to obtain additional information that will be used in the evaluation process. These assessments may include Work Authorization, Cover Letter, and/or others. Any assessments that you need to complete will appear on your screen as soon as you select "Apply to this position". Once you begin an assessment, it must be completed at that time; if you do not complete the assessment, you will be prompted to do so the next time you access your "My Jobs" page. If you select to take it later, it will appear on your "My Jobs" page to take when you are ready. Please note that your application will not be reviewed, and you will not be considered for this position, until all required assessments have been completed.

    University of Washington is an affirmative action and equal opportunity employer. All qualified applicants will receive consideration for employment without regard to, among other things, race, religion, color, national origin, sexual orientation, gender identity, sex, age, protected veteran or disabled status, or genetic information.