Our Technology team is the backbone of our company: constantly creating, testing, learning and iterating to better meet the needs of our customers. If you thrive in a fast-paced, ideas-led environment, you’re in the right place.
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SREs are responsible for building infrastructure and eliminating manual work through automation. Practices such as limiting time spent on operational work and proactively identifying potential outages drive the iterative improvement that is key to both product quality and interesting, dynamic day-to-day work.
Key Responsibilities: As a Senior Site Reliability Engineer, you will be responsible for handling Linux-based infrastructure in a hybrid cloud environment. You will support a team in creating highly robust infrastructure architectures with systems that support development and enhance the performance of highly scalable web-based services catering to the global market. You will also help us manage our site availability and scalability needs. You will actively participate in deploying and supporting applications in our private and public cloud environments.
Our website serves hundreds of thousands of customers a day and is one of the best-known brands on the internet. At the same time, we're in the middle of transforming to a cloud architecture and providing resources "as a service" to developers. To keep up with rapid growth, you will work with our development teams to support the current environment while helping us with this transition to the cloud.
Your other key responsibilities include:
- Analysis & Problem Solving: You will need to understand our application architecture and systems and the business requirements they implement so you can effectively make changes to our applications and investigate issues.
- Communication: Whether via face-to-face discussion, phone, email, chat, white-boarding, or other collaboration platforms, you must be an effective communicator who can inform, explain, enable, teach, persuade, coordinate, etc.
- Team Collaboration: You must be able to effectively collaborate and share ownership of your team’s codebase and applications. You must be willing to fully engage in team efforts, speak up for what you think are the best solutions, and be able to converse respectfully and compromise when necessary.
- Automation: Promote best practices to identify and solve complex infrastructure and monitoring issues by introducing automation initiatives. You should be excited about reducing manual intervention and turnaround time for repetitive problems.
- Managing thousands of servers and hundreds of applications in multiple geographic locations.
- Building and tuning the production environments and improving their operational efficiency.
- Monitoring the health of the Priceline.com website and automating that monitoring.
- Incident management, root cause analysis, and production support.
Qualifications:
- Experience with Linux infrastructure.
- Proficiency in one or more of these scripting languages: Shell, Perl, Python, or PowerShell.
- Hands-on experience working with Red Hat Enterprise Linux 6 and 7 or CentOS.
- Familiarity with large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, and configuration management.
- Experience implementing and supporting user-facing, large-scale tech stacks.
- Experience with configuration management tools such as Salt, Chef, Puppet, or Ansible. Docker is a plus.
- Deep understanding of infrastructure scalability issues.
- Passion for automating and improving processes.
- A 4-year degree in Computer Science (or related field). Graduate degree helpful.
- At least 3 years of working experience as a site reliability or DevOps engineer.
- Demonstrated history of living the values important to Priceline: Customer, Innovation, Team, Accountability and Trust. Unquestionable integrity and ethics.
- 5+ years of overall experience, or equivalent proficiency with the skills outlined above.