We are seeking a Web Application Technical Lead with strong expertise in managing and supporting large-scale, customer-facing applications.
This role requires hands-on incident handling experience and deep knowledge of observability tools to ensure application stability, scalability, and performance.
Responsibilities:
·        Lead technical operations and incident handling for high-traffic web applications.
·        Troubleshoot and resolve critical production issues, ensuring minimal downtime.
·        Conduct root cause analysis and drive preventive measures.
·        Monitor system performance and reliability using Splunk, Prometheus, Grafana, and Datadog.
·        Collaborate with engineering and product teams to implement enhancements and scalability improvements.
·        Mentor junior engineers and establish best practices for observability and incident response.
Qualifications:
·        10+ years of technical leadership in web applications, operations, or site reliability.
·        Strong incident management and troubleshooting expertise.
·        Hands-on experience with Splunk, Prometheus, Grafana, and Datadog.
·        Proven ability to manage large-scale, customer-facing applications.
·        Strong communication and cross-functional collaboration skills.
·        Ability to thrive in a hybrid work environment.