At Odido, we believe in optimism, forward-thinking, and a human-centered approach to technology. As a Site Reliability Engineer (SRE), you will be at the heart of our digital operations, ensuring that millions of customers experience smooth and uninterrupted service across mobile, fiber, and TV platforms.
Imagine this: a high-traffic event is happening, and thousands of customers are streaming, making calls, and using data-intensive applications simultaneously. Behind the scenes, your work ensures that Odido's infrastructure remains resilient, preventing bottlenecks before they occur. You've automated key operational processes, set up proactive monitoring, and designed self-healing systems that minimize downtime.
At 3 AM, an alert comes in: an anomaly has been detected in system performance. Thanks to your predictive monitoring setup, your automation script kicks in and reroutes traffic seamlessly, preventing an outage before customers even notice. Your team follows up with a post-incident analysis, identifying root causes and refining resilience strategies for the future.
You'll be part of a high-impact engineering team, collaborating with developers, platform engineers, and operations specialists to continuously improve Odido's service reliability, scalability, and efficiency. Your work will drive the automation, instrumentation, and observability that power Odido's digital services.
Key Responsibilities
Functional Application Management: Understand the critical flows and ensure 24/7 availability for care and sales flows.
Continuous Improvement: Oversee and enhance incident-response processes, ensuring lessons learned translate into structural improvements.
Automation & Application as Code: Develop reusable patterns for automation, configuration management, and deployment across teams and products.
Service Ownership: Take full responsibility for several critical services, ensuring high availability and reliability.
Incident Management: Lead or participate in outage response calls, quickly resolving incidents and minimizing downtime.
Monitoring & Observability: Design and implement proactive monitoring strategies using tools like Prometheus, Grafana, and Kibana to improve system performance.
Troubleshooting & Debugging: Analyze and fix system issues in a complex distributed environment and application stack.
On-Call Rotation: Participate in on-call schedules, ensuring 24/7 coverage for mission-critical systems.
Engineering Best Practices: Advocate for DevOps and SRE principles, mentoring junior engineers on automation and operational excellence.
Together we are
We are Odido, the new provider of mobile, fiber optic and TV. And with almost 2,000 colleagues, we show that telecom can be improved. Because technology is for everyone. Wherever you come from, wherever you go. With Odido everyone participates in the digital world. That is our ambition. Everyone at Odido helps to build a brand that is human, optimistic and progressive.
Is that really something for you? Then we might fit well together.
This is what we stand for
Our name - you can also read it from back to front - consists of different shapes. Which together are one. Because that's how we look at the world around us. As a place where people, no matter how different, move forward together. We're there for each other. We always look at opportunities. We celebrate diversity and are committed to an inclusive work environment with equal opportunities for all. That sounds good of course. But we don't stop at fine words: at Odido we are a recognized Top Employer. A confirmation that we are proud of.
What we offer
- Good salary and variable bonus scheme;
- Hybrid working;
- A progressive pension scheme;
- 30 vacation days (if you work for us full-time) and an extra day off after Ascension Day;
- Redeemable holidays;
- An Odido subscription;
- Real growth opportunities;
- Personal annual learning budget and over 200 digital training courses;
- Workshops, learning weeks, annual ski trip, fun outings and parties.
Some things you cannot learn in a book or school, but you possess them by nature. Your core personality is empathic, decisive, humorous, people-focused, authentic, and honest. You know the difference between managing and coaching and understand when to apply which. You embrace challenges and are eager to grow. These are traits we are looking for not only in managerial positions but in every member at every level of Odido!
Must-Have Skills and Qualifications
· Experience with Node.js, Python, and REST services.
· Experience with public cloud platforms (AWS, Azure) and related technologies (Docker, Kubernetes, CloudFormation).
· Strong understanding of storage, database systems, caching, queueing, and networking.
· Experience in leading technical recoveries and troubleshooting distributed systems.
· Ability to debug, optimize code, and automate routine operational tasks.
· Solid foundation in Linux or Windows administration and troubleshooting.
· Strong knowledge of monitoring/observability tools (Prometheus, Grafana, Kibana, Elasticsearch).
· Understanding of Service Level Agreements (SLAs) and Service Level Objectives (SLOs).
· Proficiency in at least one programming language for automation and scripting.
· Excellent command of English, both written and spoken.
Nice-to-Have Skills
· Experience in telecom or large-scale cloud environments.
· Knowledge of AI-driven operational solutions for predictive monitoring.
· Background in security practices and compliance for cloud environments.
Learn every day
At Odido we learn every day. All of us. You are responsible for your own development. That is why you decide how, what and when you learn. We have more than 200 digital training courses with which you can work on professional and personal goals. We don't do old-fashioned performance reviews and assessments. You keep your manager and colleagues informed of your goals and progress. You are in control.
Press the button
Are you as excited about Odido as we are? Then we are probably a good match. We look forward to meeting you! You can apply via the application button. Done in a minute!