Job description
 
NVIDIA’s Data Center MODS organization is looking for an Engineering Manager to help Cloud Service Providers (CSPs) and OEMs scale out current and next-generation data center products.
You will be responsible for validating and scaling NVIDIA’s GPU products at the system level, pushing hardware to its limits to ensure adaptability and reliability across diverse environments — from internal validation labs to hyperscale data centers.
Our organization partners closely with architecture, ASIC, operations, and data center teams to build methodologies that stress every subsystem of the GPU and server platform.
The team also supports diagnostics for customer deployments, tailoring stress workloads to specific configurations and use cases.    
What you'll be doing:
+ Lead and mentor a high-performing engineering team, fostering technical growth and leadership.
+ Collaborate with architecture and hardware teams to drive development of stress and diagnostic software targeting GPUs, CPUs, memory, storage, and interconnects.
+ Lead multiple concurrent projects, balancing long-term strategy with short-term execution.
+ Work with Cloud Service Providers (CSPs), OEMs, and data center operators to support deployment and customization of diagnostics.
+ Champion continuous improvement in product quality, debug efficiency, and operational scalability.      
What we need to see:
+ Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
+ 10+ years of overall experience in system software development, including 4+ years in engineering management.
+ Experience with C, C++, and Python.
+ Deep understanding of operating systems, kernel drivers, and hardware-software interaction.
+ Experience with PC/server architecture, including PCIe, NVLink, InfiniBand, or Ethernet.
+ Consistent track record of leading feature development and multi-team debugging efforts.      
Ways to Stand Out from the Crowd:
+ Experience with diagnostics or stress testing in large-scale data center environments.
+ Familiarity with GPU compute, graphics, memory subsystems, or high-speed interfaces.
+ Prior experience working with CSPs or OEMs on system-level validation and deployment.
+ Strong communication and cross-functional leadership skills.
+ Passion for building tools that ensure product excellence and customer success.      
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
The base salary range is 272,000 USD - 425,500 USD for Level 4, and 308,000 USD - 471,500 USD for Level 5.
You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/).
Applications for this job will be accepted at least until October 10, 2025.    
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.
As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.      
 
                    
                    