Software Engineer – Cloud Storage Solutions
Walmart Global Tech
Reston, VA or Sunnyvale, CA / Remote
Responsibilities
· Design, implement, and optimize private cloud storage solutions using Ceph
· Build, tune, and troubleshoot block and object storage systems in high-availability Ceph storage clusters
· Design automation for storage optimization and other processes on private cloud platforms
· Enable application teams to follow best practices when deploying on cloud platforms, optimizing storage spend and improving efficiency and performance based on Ceph utilization
· Create and maintain technical documentation for operational readiness
· Design and maintain cloud storage best practices
· Become a solid contributor on our team, and build, extend and maintain some of the key infrastructure that powers our Private Cloud platform
· Provide troubleshooting expertise for storage performance and other issues
· Develop and integrate provisioning and lifecycle tools for storage service components
Qualifications
· 2+ years of experience supporting large-scale, highly available production cloud storage deployments with Ceph
· Strong familiarity with any combination of the following: OpenStack/Kubernetes/Rook with Ceph as an underlying storage solution
· Experience in programming, scripting and development
o Python or Shell Scripting
· Experience with Linux Administration and System Troubleshooting
· Knowledge of networking concepts and administration
o TCP/IP, routing, switching, VLANs, load balancing
· Experience using source control systems (git)
· Experience with configuration management tools such as Ansible, Puppet, or Chef
· Experience with Containers (Kubernetes, Docker, etc.)
· Experience architecting infrastructure solutions for applications
· Experience with monitoring, reporting tools and data analytics
· Good understanding of clustered/distributed systems
· Ability to apply best practices and team standards while meeting service level objectives
· Experience working with cloud deployments (scaling, resiliency, load balancing, etc.) and a solid understanding of service monitoring, KPIs, SLAs, and disaster recovery
· Deep experience with the Linux ecosystem, automation of common tasks, and configuration of systems monitoring tools
· Experience with capacity/performance management, monitoring and tuning
· Experience with firewalls, VPNs, routing, switching, load balancers, monitoring, security, and DNS
· Bachelor's or Master's degree in CS or a similar field of study, or equivalent work experience