Want to be part of a team that's revolutionizing the field of AI with data-center-scale solutions?
We are looking for a hardworking Solution Architect with experience in designing, building, and maintaining large-scale HPC and AI hybrid computing solutions to join our team at NVIDIA.
As Solution Architects on the NVIDIA Partner Network team, we help bring the benefits of large-scale AI to customers through our partners, using NVIDIA DGX and DGX SuperPOD solutions.
We work closely with customers and partners to address unsolved problems in the industry and help to deploy and operationalize AI solutions at scale.
What you'll be doing:
Our day-to-day work involves guiding partners in their adoption of end-to-end Machine Learning and Deep Learning solutions, using NVIDIA's compute, networking, and software stacks.
This is not a high-level slideshow job: we are the voice of experience, using Kubernetes, SaaS, infrastructure-as-code tools, network debugging, and problem-solving skills to help build modern AI factories.
We also excel at sharing knowledge with others, whether it's delivering demos, assisting with proof-of-concepts, or writing papers and developer blogs.
By collaborating with executives and engineering, we solve complex problems and help bring NVIDIA's premier technologies to life in the cloud and in the data center.
Our mission is to solve the problems that nobody else has solved yet, and we need someone to be an instrumental part of that!
What we need to see:
Strong foundational expertise and a BS, MS, or PhD in Engineering, Computer Science, or a related field, or equivalent experience.
Established track record working with AI and HPC clusters, both on-premises and cloud-based.
12+ years of proven experience with cluster management and related tools, including Docker containers, Slurm, Kubernetes, and Ansible.
Hands-on experience with network, storage, and cluster configuration and debugging.
Strong analytical and problem-solving skills, along with an ability to articulate what you know to others.
Ability to multitask efficiently in a dynamic environment.
Ways to stand out from the crowd:
Strong coding and debugging skills, including experience with Python, C/C++, Bash, and Linux utilities.
Demonstrated expertise through projects or open-source contributions involving GPU workloads, Kubernetes, InfiniBand, Ethernet, or other areas related to high-performance clusters and hybrid cloud solutions.
Hands-on experience with NVIDIA AI Enterprise, Base Command Manager, and the NeMo cloud-native framework.
Willingness and ability to learn quickly and solve advanced problems.
NVIDIA is widely considered to be one of the technology world’s most desirable employers.
We have some of the most forward-thinking and hardworking people in the world working for us.
If you're creative and autonomous, we want to hear from you!
You will also be eligible for equity and benefits.
NVIDIA accepts applications on an ongoing basis.