
Urgent! Deep Learning Solutions Architect – Inference Optimization Job Opening In Remote – Now Hiring NVIDIA

Deep Learning Solutions Architect – Inference Optimization



Job description

NVIDIA’s Worldwide Field Operations (WWFO) team is seeking a Solutions Architect with a deep understanding of neural network inference.

As our customers adopt increasingly complex inference pipelines on state-of-the-art infrastructure, there is a growing need for experts who can guide the integration of advanced inference techniques such as speculative decoding, request scheduler optimizations, or FP4 quantization.

The ideal candidate will be proficient with tools such as TensorRT-LLM, vLLM, SGLang or similar, and have strong systems knowledge, enabling customers to make full use of the new GB300 NVL72 systems (for example, working on efficient KV cache offloading, helping with inference of new architectures such as hybrid or diffusion models, or architecting the pre- and post-processing pipelines).




Solutions Architects work with the most exciting computing hardware and software, driving the latest breakthroughs in artificial intelligence! We need individuals who can enable customer productivity and develop lasting relationships with our technology partners, making NVIDIA an integral part of end-user solutions.

We are looking for someone who is passionate about artificial intelligence, who can keep up with a fast-paced field, and who can coordinate efforts between corporate marketing, industry business development, and engineering.

Solutions Architects are the first line of technical expertise between NVIDIA and our customers.

Your duties will vary from working on proof-of-concept demonstrations to driving relationships with key executives and managers in order to promote adoption of NVIDIA-based AI technology.

Engaging with developers, scientific researchers, data scientists, IT managers and senior leaders is a significant part of the Solutions Architect role.




What you will be doing:
+ Work directly with key customers to understand their technology and provide the best AI solutions.
+ Perform in-depth analysis and optimization to ensure the best performance on GPU architecture systems (in particular Grace/ARM-based systems). This includes support in optimizing large-scale inference pipelines.
+ Partner with Engineering, Product and Sales teams to develop and plan the most suitable solutions for customers. Enable development and growth of product features through customer feedback and proof-of-concept evaluations.





What we need to see:
+ Excellent verbal and written communication and technical presentation skills in English.
+ MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.
+ 5+ years of work or research experience with Python, C++, or other software development.
+ Work experience with and knowledge of modern NLP, including a good understanding of transformer, state space, diffusion, and MoE model architectures. This can include expertise in either training or optimization/compression/operation of DNNs.
+ Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, DeepSpeed etc.) and/or deployment (e.g. TensorRT-LLM, vLLM, Triton Inference Server).
+ Enthusiastic about collaborating with various teams and departments—such as Engineering, Product, Sales, and Marketing—this person thrives in dynamic environments and stays focused amid constant change.
+ Self-starter with a growth mindset, a passion for continuous learning, and a habit of sharing findings across the team.





Ways to Stand Out from The Crowd:
+ Demonstrated experience in running and debugging large-scale distributed deep learning training or inference processes.
+ Experience working with larger transformer-based architectures for NLP, CV, ASR or other domains.
+ Applied NLP technology in production environments.
+ Proficient with DevOps tools including Docker, Kubernetes, and Singularity.
+ Understanding of HPC systems: data center design, high-speed interconnects such as InfiniBand, and cluster storage and scheduling-related design and/or management experience.





Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.

As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/


NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.

As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.









Required Skill Profession

Other General





    Unlock Your Deep Learning Potential: Insight & Career Growth Guide


  • Real-time Deep Learning Jobs Trends in Remote, United States (Graphical Representation)

    Expertini's real-time analysis tracks job market trends for Deep Learning in Remote, United States, using a bar chart for the number of jobs available and a trend line for the change over time. The graph shows 6392 jobs in the United States and 29 jobs in Remote. These trends highlight market share and opportunities for professionals in Deep Learning roles and give a clearer picture of the job market in these regions.

  • Are You Looking for Deep Learning Solutions Architect – Inference Optimization Job?

    Great news! NVIDIA is currently hiring and seeking a Deep Learning Solutions Architect – Inference Optimization to join their team. Feel free to download the job details.


  • The Work Culture

    An organization's rules and standards set how people should be treated in the office and how different situations should be handled. The work culture at NVIDIA adheres to the cultural norms as outlined by Expertini.

    The fundamental ethical values are:
    • 1. Independence
    • 2. Loyalty
    • 3. Impartiality
    • 4. Integrity
    • 5. Accountability
    • 6. Respect for human rights
    • 7. Obeying United States laws and regulations
  • What Is the Average Salary Range for Deep Learning Solutions Architect – Inference Optimization Positions?

    The average salary range for a Deep Learning Solutions Architect – Inference Optimization varies, but the pay scale is rated "Standard" in Remote. Salary levels may vary depending on your industry, experience, and skills. It's essential to research and negotiate effectively. We advise reading the full job specification before proceeding with the application to understand the salary package.

  • What Are the Key Qualifications for Deep Learning Solutions Architect – Inference Optimization?

    Key qualifications for Deep Learning Solutions Architect – Inference Optimization typically include Other General and a list of qualifications and expertise as mentioned in the job specification. Be sure to check the specific job listing for detailed requirements and qualifications.

  • How Can I Improve My Chances of Getting Hired for Deep Learning Solutions Architect – Inference Optimization?

    To improve your chances of getting hired for Deep Learning Solutions Architect – Inference Optimization, consider enhancing your skills. Check your CV/Résumé Score with our free Tool. We have an in-built Resume Scoring tool that gives you the matching score for each job based on your CV/Résumé once it is uploaded. This can help you align your CV/Résumé according to the job requirements and enhance your skills if needed.

  • Interview Tips for Deep Learning Solutions Architect – Inference Optimization Job Success
    NVIDIA interview tips for Deep Learning Solutions Architect – Inference Optimization

    Here are some tips to help you prepare for and ace your job interview:

    Before the Interview:
    • Research: Learn about NVIDIA's mission, values, products, and the specific job requirements, and look into the company's other openings.
    • Practice: Prepare answers to common interview questions and rehearse using the STAR method (Situation, Task, Action, Result) to showcase your skills and experiences.
    • Dress Professionally: Choose attire appropriate for the company culture.
    • Prepare Questions: Show your interest by having thoughtful questions for the interviewer.
    • Plan Your Commute: Allow ample time to arrive on time and avoid feeling rushed.
    During the Interview:
    • Be Punctual: Arrive on time to demonstrate professionalism and respect.
    • Make a Great First Impression: Greet the interviewer with a handshake, smile, and eye contact.
    • Confidence and Enthusiasm: Project a positive attitude and show your genuine interest in the opportunity.
    • Answer Thoughtfully: Listen carefully and take a moment to formulate clear and concise responses. Highlight relevant skills and experiences using the STAR method.
    • Ask Prepared Questions: Demonstrate curiosity and engagement with the role and company.
    • Follow Up: Send a thank-you email to the interviewer within 24 hours.
    Additional Tips:
    • Be Yourself: Let your personality shine through while maintaining professionalism.
    • Be Honest: Don't exaggerate your skills or experience.
    • Be Positive: Focus on your strengths and accomplishments.
    • Body Language: Maintain good posture, avoid fidgeting, and make eye contact.
    • Turn Off Phone: Avoid distractions during the interview.
    Final Thought:

    To prepare for your Deep Learning Solutions Architect – Inference Optimization interview at NVIDIA, research the company, understand the job requirements, and practice common interview questions.

    Highlight your leadership skills, achievements, and strategic thinking abilities. Be prepared to discuss your experience with HR, including your approach to meeting targets as a team player. Additionally, review NVIDIA's products or services and be prepared to discuss how you can contribute to their success.

    By following these tips, you can increase your chances of making a positive impression and landing the job!

  • How to Set Up Job Alerts for Deep Learning Solutions Architect – Inference Optimization Positions

    Setting up job alerts for Deep Learning Solutions Architect – Inference Optimization is easy with United States Jobs Expertini. Simply visit our job alerts page, enter your preferred job title and location, and choose how often you want to receive notifications. You'll get the latest job openings sent directly to your email for FREE!