Urgent! Site Reliability Engineer II Job Opening In Redmond – Now Hiring Microsoft Corporation

Site Reliability Engineer II



Job description

Microsoft is a company where passionate innovators come to collaborate, envision what can be and take their careers further.

This is a world of more possibilities, more innovation, more openness, and sky's-the-limit thinking in a cloud-enabled world.

Microsoft’s Azure Data engineering team is leading the transformation of analytics in the world of data with products like databases, data integration, big data analytics, messaging & real-time analytics, and business intelligence.

The products in our portfolio include Microsoft Fabric, Azure SQL DB, Azure Cosmos DB, Azure PostgreSQL, Azure Data Factory, Azure Synapse Analytics, Azure Service Bus, Azure Event Grid, and Power BI.

Our mission is to build the data platform for the age of AI, powering a new class of data-first applications and driving a data culture.

Within Azure Data, the databases team builds and maintains Microsoft's operational Database systems.

We store and manage data in a structured way to enable a multitude of applications across various industries.

We are on a journey to enable developer-friendly, mission-critical, AI-enabled operational Databases across relational, non-relational, and OSS offerings.

We believe in making the day in the life of the On-Call Engineer boring while living up to the expectations of a massive cloud service with stringent Service Level Objectives (SLOs).

We do this by thinking differently, stretching ourselves to go all the way to the root of the problem, keeping data front and center in all our decisions, and taking a systems approach to generating outcomes that far exceed expectations.

Helping attain the aspirational Service Level Objectives (SLOs) through pragmatic innovation is what sets the SREs in Cosmos DB apart.

If you share the same purpose, cause, and belief and have the passion to follow this pursuit, please read the rest of the job description on what we do, and we would love to have you join us!

Azure Cosmos DB is Microsoft’s next generation of globally distributed, massively scalable, multi-model cloud database service.

It is designed to enable developers to build planet-scale applications.

Azure Cosmos DB is one of the fastest growing Azure services.

Joining the Azure Cosmos DB team is a fantastic opportunity to work with incredibly talented engineers operating like a startup, to be at the forefront of building and shaping the Livesite Automation and AI Ops stack in Cosmos DB, and to lead the path for broader adoption across Microsoft Azure.

Cosmos DB is a database of choice across the spectrum, from hobbyist developers to the largest Fortune 500 companies.

The database provides the data backbone of many critical systems in Health Care, Retail, Telecommunications, IoT, and many more where Service Availability and Latency are paramount.

Cosmos DB provides financially backed SLAs (service level agreements) around 99.99% availability and < 10 ms latency, and we are responsible for holding ourselves to even more stringent Service Level Objectives (SLOs) that delight our customers.

Other than a resilient and fault-tolerant architecture, a key to attaining the SLOs is automating the root cause analysis and mitigation of issues, often proactively addressing issues before any customer impact.

This team builds systems where a vast majority of Livesite issues are automatically mitigated without the need for human intervention.
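
As a rough illustration of what a 99.99% availability figure means in practice, the sketch below converts an availability percentage into a downtime budget. It assumes a simple model in which every minute of the period counts equally; the function name `downtime_budget_minutes` and the 30-day and 365-day periods are illustrative choices, not terms taken from the Cosmos DB SLA, which defines its own measurement rules.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Return the minutes of downtime allowed in a period at a given availability.

    Assumption: downtime is measured uniformly over the whole period.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Illustrative periods (assumptions, not SLA definitions).
monthly = downtime_budget_minutes(99.99, 30)
yearly = downtime_budget_minutes(99.99, 365)

print(f"{monthly:.2f} minutes per 30-day month")
print(f"{yearly:.2f} minutes per 365-day year")
```

Under these assumptions, 99.99% availability leaves roughly 4.32 minutes of downtime in a 30-day month and about 52.56 minutes in a year, which is why tighter internal SLOs depend on automated mitigation rather than manual response.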

We are looking for a self-driven Site Reliability Engineer (SRE) who likes taking a data-driven and systems-based approach to solving Service Reliability problems.

You will be responsible for building and optimizing solutions that can analyze massive amounts of telemetry and other Service Health indicators in near real time and perform automated root cause analysis and the necessary mitigations to restore SLOs.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more.

As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.

Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.


**Responsibilities**

+ Collaborating closely with engineering teams on building and enhancing tooling and automation solutions for faster resolution of issues impacting SLOs and averting incidents altogether when possible.
+ Collaborating with customers to understand their pain points around Supportability and SLO attainment and formulating strategies for addressing recurring issues in a sustainable way.
+ Communicating on a technical level and being the single point of contact for interfacing with enterprise customers, handling service escalations, and driving issues to resolution.
+ Designing and implementing changes to service telemetry for the automation to consume when it is not already available.
+ Enhancing the customer-facing experience through proactive alerting based on utilization, trends, resource health, etc.
+ Analyzing data and providing operational insights into customer experience to design and product teams, so that we can design features with Supportability in mind.
+ Embody our culture (https://careers.microsoft.com/v2/global/en/culture) and values (https://www.microsoft.com/en-us/about/corporate-values).


**Qualifications**

**Required/Minimum Qualifications**


+ 4+ years technical experience in software engineering, network engineering, or systems administration
+ OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
+ OR Master's Degree in Computer Science, Information Technology, or related field.
+ 3+ years of experience running large scale cloud services.


**Other Requirements**

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.

These requirements include, but are not limited to, the following specialized security screenings:


+ Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.


**Preferred/Additional Qualifications**


+ 2+ years of operational experience in improving Service Reliability, Availability and Performance.
+ Understanding of Observability and MELT implementation patterns for large-scale services.
+ Experience in Logic Apps and authoring Jupyter Notebooks.
+ Experience in analyzing, troubleshooting, and automating root cause analysis and mitigation of incidents impacting large-scale distributed systems.
+ Systematic problem-solving approach, coupled with communication skills and a sense of curiosity.
+ Ability to deal with the ambiguity associated with working in a fast-paced environment.
+ Influencing the product architecture and roadmap to make sure the customer-experienced supportability is always a key consideration when evolving the product.


**Site Reliability Engineering IC3** - The typical base pay range for this role across the U.S. is USD $100,600 - $199,000 per year.

There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $131,400 - $215,400 per year.



Certain roles may be eligible for benefits and other compensation.

Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until October 30, 2025.


#azdat #azuredata #SRE

Microsoft is an equal opportunity employer.

Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances.

If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).



Required Skill Profession: Other General




    Unlock Your Site Reliability Potential: Insight & Career Growth Guide


  • Real-time Site Reliability Job Trends in Redmond, United States

    Expertini's real-time analysis of the job market for Site Reliability roles lists 9370 jobs in the United States and 38 jobs in Redmond. These figures give a sense of market share and opportunities for professionals in Site Reliability roles in these regions.

  • Are You Looking for Site Reliability Engineer II Job?

    Great news! Microsoft Corporation is currently hiring and seeking a Site Reliability Engineer II to join their team.

  • The Work Culture

    An organization's rules and standards set how people should be treated in the office and how different situations should be handled. The work culture at Microsoft Corporation adheres to the cultural norms as outlined by Expertini.

    The fundamental ethical values are:
    • Independence
    • Loyalty
    • Impartiality
    • Integrity
    • Accountability
    • Respect for human rights
    • Obeying United States laws and regulations
  • What Is the Average Salary Range for Site Reliability Engineer II Positions?

    The average salary range for a Site Reliability Engineer II varies, but the pay scale is rated "Standard" in Redmond. Salary levels may vary depending on your industry, experience, and skills. It's essential to research and negotiate effectively. We advise reading the full job specification before proceeding with the application to understand the salary package.

  • What Are the Key Qualifications for Site Reliability Engineer II?

    Key qualifications for Site Reliability Engineer II typically fall under the Other General skill category, along with the qualifications and expertise listed in the job specification. Be sure to check the specific job listing for detailed requirements and qualifications.

  • How Can I Improve My Chances of Getting Hired for Site Reliability Engineer II?

    To improve your chances of getting hired for Site Reliability Engineer II, consider enhancing your skills. Check your CV/Résumé Score with our free tool. Our built-in Resume Scoring tool gives you a matching score for each job based on your CV/Résumé once it is uploaded. This can help you align your CV/Résumé with the job requirements and identify skills to strengthen.

  • Interview Tips for Site Reliability Engineer II Job Success
    Microsoft Corporation interview tips for Site Reliability Engineer II

    Here are some tips to help you prepare for and ace your job interview:

    Before the Interview:
    • Research: Learn about Microsoft Corporation's mission, values, products, and the specific job requirements.
    • Practice: Prepare answers to common interview questions and rehearse using the STAR method (Situation, Task, Action, Result) to showcase your skills and experiences.
    • Dress Professionally: Choose attire appropriate for the company culture.
    • Prepare Questions: Show your interest by having thoughtful questions for the interviewer.
    • Plan Your Commute: Allow ample time to arrive on time and avoid feeling rushed.
    During the Interview:
    • Be Punctual: Arrive on time to demonstrate professionalism and respect.
    • Make a Great First Impression: Greet the interviewer with a handshake, smile, and eye contact.
    • Confidence and Enthusiasm: Project a positive attitude and show your genuine interest in the opportunity.
    • Answer Thoughtfully: Listen carefully and take a moment to formulate clear and concise responses. Highlight relevant skills and experiences using the STAR method.
    • Ask Prepared Questions: Demonstrate curiosity and engagement with the role and company.
    • Follow Up: Send a thank-you email to the interviewer within 24 hours.
    Additional Tips:
    • Be Yourself: Let your personality shine through while maintaining professionalism.
    • Be Honest: Don't exaggerate your skills or experience.
    • Be Positive: Focus on your strengths and accomplishments.
    • Body Language: Maintain good posture, avoid fidgeting, and make eye contact.
    • Turn Off Phone: Avoid distractions during the interview.
    Final Thought:

    To prepare for your Site Reliability Engineer II interview at Microsoft Corporation, research the company, understand the job requirements, and practice common interview questions.

    Highlight your leadership skills, achievements, and strategic thinking abilities. Be prepared to discuss your experience with HR, including your approach to meeting targets as a team player. Additionally, review Microsoft Corporation's products and services and be prepared to discuss how you can contribute to their success.

    By following these tips, you can increase your chances of making a positive impression and landing the job!

  • How to Set Up Job Alerts for Site Reliability Engineer II Positions

    Setting up job alerts for Site Reliability Engineer II is easy with United States Jobs Expertini. Simply visit our job alerts page, enter your preferred job title and location, and choose how often you want to receive notifications. You'll get the latest job openings sent directly to your email for free.