This is Adyen
Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition.
For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. We are motivated individuals who tackle unique technical challenges at scale and solve them as a team. Together, we deliver innovative and ethical solutions that help businesses achieve their ambitions faster.
Site Reliability Engineer - Data Platform
At Adyen, our Data Platform Site Reliability Engineers (SREs) sit at the crossroads of Data Engineering, Backend Engineering, and Systems Engineering. This role blends software and systems engineering to develop the essential tools and processes that power our on-premise Big Data Platforms. These platforms support dozens of products, hundreds of developers, and thousands of daily jobs, reinforcing Adyen's industry-leading capabilities.
As a Data Platform SRE, you will be responsible for managing one of the largest data platforms in the world. Your focus will be on ensuring that data, data services, and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective. Beyond operations, you'll have the opportunity to design, build, and deliver scalable systems as a software engineer. If you thrive on automation and reducing manual toil, you'll play a strategic role in shaping the future of automation for our Big Data Platforms.
You will collaborate with data and ML scientists and engineers to develop and roll out tools that enhance platform performance while operating and scaling multiple big data platforms. This includes managing a fleet of a few thousand nodes, tens of thousands of cores, hundreds of terabytes of RAM, and tens of petabytes of storage. This is a unique opportunity to work at scale and influence the infrastructure behind Adyen's cutting-edge data capabilities.
What you'll do
- Design, develop, operate, and maintain scalable, reliable, fault-tolerant, and high-performance big data platforms.
- Work with distributed systems in all shapes and flavors (databases, file systems, compute, etc.).
- Shape and maintain continuous release and deployment environments (CI/CD).
- Establish best engineering practices for data professionals by measuring and monitoring availability, latency, and system health of big data products.
- Implement observability solutions (logging, monitoring, alerting) to ensure system reliability and performance.
- Optimize performance of large-scale distributed data platforms, including tuning queries, cluster configurations, and resource management.
- Practice sustainable incident response and blameless postmortems, with a focus on automation and root cause analysis.
- Improve security and compliance by implementing access controls, encryption, and best practices for data governance.
- Design and build automation tools to reduce toil and enhance the reliability of data workflows.
- Design, develop, and maintain:
  - Tools to enhance data discoverability, data quality, and data monitoring.
  - Streaming and batch data processing frameworks and applications.
  - Self-service infrastructure to empower data teams while maintaining platform stability.
- Fluent in Python. Java, Golang, or Rust are also appreciated.
- Strong software engineering background with experience in large-scale distributed systems.
- A team player with strong communication skills, able to work closely with diverse stakeholders (analysts, data scientists, data engineers, infrastructure, and security teams).
- Experience developing and maintaining:
  - Distributed data and compute systems (Spark, Trino, Druid, Delta, etc.).
  - CI/CD pipelines and DevOps ecosystems.
  - Real-time and batch data pipelines (Kafka, Spark Streaming, Flink), with an emphasis on scalability and usability.
  - Kubernetes (k8s, Docker) and/or Hadoop ecosystems (Hive, YARN, HDFS, Kerberos).
  - Observability & monitoring tools (Prometheus, Grafana, OpenTelemetry).
- Experience as a Data Platform Engineer, Site Reliability Engineer or a Software Engineer for Data Platforms.
- Experience managing large-scale private cloud or on-premise infrastructure.
- Background in performance optimization for distributed computing.
- Security-focused mindset, including best practices for access control and data governance.
- Previous experience in an on-call rotation for incident response in a production environment.
- Familiarity with networking, automation and configuration management tools (e.g., Ansible, Terraform, Puppet).
Our Diversity, Equity and Inclusion commitments
Our unique approach is a product of our diverse perspectives. This diversity of backgrounds and cultures is essential in helping us maintain our momentum. Our business and technical challenges are unique, and we need as many different voices as possible to join us in solving them - voices like yours. No matter who you are or where you're from, we welcome you to be your true self at Adyen.
Studies show that women and members of underrepresented communities apply for jobs only if they meet 100% of the qualifications. Does this sound like you? If so, Adyen encourages you to reconsider and apply. We look forward to your application!
What's next?
Ensuring a smooth and enjoyable candidate experience is critical for us. We aim to get back to you regarding your application within 5 business days. Our interview process tends to take about 4 weeks to complete, but may fluctuate depending on the role. Don't be afraid to let us know if you need more flexibility.