Job description
Applications are accepted until further notice
Who We Are
Cisco’s AI team consists of AI researchers and software developers who collaborate to build innovative products and platforms for Cisco.
We are motivated by t
About the Role
We are seeking a highly experienced Senior Engineering Manager to lead teams building, deploying, and optimizing Large Language Model (LLM)-based applications, with a strong emphasis on LLMOps (LLM operations), Retrieval-Augmented Generation (RAG) pipelines, and scalable production systems.
This role involves managing cross-functional engineering teams, collaborating closely with product, ML research, and infrastructure teams, and ensuring the successful delivery of robust, secure, and efficient AI-powered systems.
Key Responsibilities
Team Leadership & Management
Lead and grow a high-performing engineering team focused on LLM applications and infrastructure.
Foster a culture of engineering excellence, continuous learning, and innovation.
Drive team performance through mentoring, goal-setting, and technical guidance.
LLMOps & Platform Engineering
Design and oversee scalable LLMOps pipelines including fine-tuning, evaluation, deployment, monitoring, and optimization of large language models.
Work closely with ML researchers to transition experimental models into production.
Manage model lifecycle tooling (e.g., LangChain, MLflow, Weights & Biases, Hugging Face, Ray).
Retrieval-Augmented Generation (RAG)
Oversee the design and implementation of RAG pipelines including vector database management, chunking strategies, embedding selection, retrieval tuning, and relevance evaluation.
Optimize latency, accuracy, and context window handling for high-traffic LLM services.
Architecture & Scalability
Own architectural decisions for high-availability, low-latency systems powering generative AI applications.
Collaborate with infrastructure and DevOps teams on scaling inference workloads (e.g., GPU clusters, model quantization, caching, and sharding).
Cross-Functional Collaboration
Work with product, design, and data science to define requirements, translate business needs into engineering tasks, and prioritize effectively.
Maintain high communication standards across teams, ensuring alignment and transparency.
Quality, Security, and Governance
Champion model observability, incident response, prompt versioning, and feedback loops.
Ensure responsible AI practices and data governance are followed.
Qualifications
Required
8+ years of software engineering experience, with 3+ years in engineering management or technical leadership roles.
Proven track record of shipping production-grade ML/LLM systems.
Strong understanding of LLMs, fine-tuning, prompt engineering, vector databases (e.g., Pinecone, Weaviate, FAISS), and RAG patterns.
Experience with cloud-native architectures (AWS, GCP, or Azure) and container orchestration (Kubernetes).
Proficiency in Python and familiarity with AI/ML frameworks such as PyTorch, Transformers, LangChain, or similar.
Preferred
Experience managing or working with multi-modal or multi-agent systems.
Exposure to regulatory or compliance frameworks for ML systems (e.g., GDPR, SOC 2).
Hands-on experience with observability and evaluation tools for LLMs.