Job Description
Please Note:
- This is a 100% onsite position, 5 days a week.
- Selected candidate must be willing to work on-site in Woodlawn, MD
Key Required Skills:
- Machine Learning, ML model deployment, Python, CI/CD for ML (Jenkins, SageMaker), cloud platforms (AWS) and related ML services, and data pipeline management.
Position Description:
- Ensure that ML models can be effectively developed, deployed, managed, and monitored in production environments.
- Productionize ML models – integrate trained ML models with production systems.
- Build and manage ML pipelines – design, build, and maintain automated pipelines including data ingestion, data preprocessing, model training, validation, and deployment utilizing CI/CD practices.
- Infrastructure management – set up and manage infrastructure for ML workloads utilizing cloud platforms and containerization technologies.
- Monitoring and alerting – implement monitoring systems to track the performance of ML models in production.
- Automation – automate tasks within the ML workflow to improve efficiency and reproducibility.
- Performance optimization – identify ways to optimize the performance, efficiency, and scalability of ML models and their supporting infrastructure.
- All other duties as assigned or directed.
Requirements
Skills Requirements:
Basic Qualifications
- Bachelor's degree and 12+ years of experience in Computer Science, Mathematics, Engineering, or a related field.
- A Master's or Doctorate degree may substitute for required experience.
- Minimum 5 years of hands-on experience designing, developing, implementing, and maintaining ML workflows and data pipelines.
- Must be able to obtain and maintain a Public Trust (contract requirement).
Required Skills
- Strong foundation in AI, ML, and LLMs, including an understanding of concepts, algorithms, model training, and frameworks (TensorFlow, PyTorch, scikit-learn).
- Strong programming skills, especially in Python and relevant libraries (scikit-learn, TensorFlow, PyTorch, NumPy, Pandas).
- Strong understanding of MLOps principles and experience with MLOps platforms and tools (e.g., AWS SageMaker, MLflow, Kubeflow, DataRobot).
- Experience with CI/CD tools (Jenkins required), containerization (Docker), and orchestration (Kubernetes) for managing and scaling applications.
- Proficiency with cloud platforms (AWS preferred) including ML services, Infrastructure as Code (CloudFormation, Terraform), compute, storage (S3, EFS), and networking.
- Knowledge of data engineering fundamentals including understanding of data pipelines, data storage (PostgreSQL, MySQL, MongoDB), and data processing frameworks (Apache Spark).
- Strong communication, collaboration, problem-solving, analytical, and critical thinking skills.
Desired Skills
- Prior experience with federal or state government IT projects.
- Ability to design scalable, reliable, and efficient ML systems.
- Willingness to continuously learn new technologies and best practices.
- Familiarity with other programming languages such as Java and Scala.
- Experience with Natural Language Processing (NLP) for text and language generation.