Senior Java Engineer

As a Senior SRE, you’ll balance your passion for both software development and reliability engineering, applying engineering discipline to solve operational challenges at scale. You’ll collaborate closely with development teams as a trusted advisor, influencing system design, establishing reliability standards, and driving quality improvements across the platform. Your role dynamically shifts between hands-on coding—building tools, automation, and infrastructure—and incident response, performance optimisation, and operational excellence.

What You’ll Do

System Reliability & Performance

Implement comprehensive monitoring and observability using OpenTelemetry standards
Identify single points of failure in distributed systems
Analyse system performance across OS and network layers, identifying resource utilisation patterns and bottlenecks to optimise efficiency
Define and maintain Service Level Objectives (SLOs) for critical trading services

Technical Leadership

Partner with development teams on system design, capacity planning, and architectural reviews
Provide technical guidance and hands-on support to help development teams transition their applications from traditional deployment models to containerised infrastructure
Lead incident response efforts and conduct blameless postmortems

Infrastructure & Messaging

Optimise message-driven systems by ensuring reliable event streaming and asynchronous communication patterns
Scale systems through automation and infrastructure-as-code practices

Software Development Fundamentals

Write clean, maintainable code following industry best practices and design patterns
Apply software engineering best practices, including version control, code reviews, and testing strategies

What You’ll Need for This Role

Essential Technical Skills

Strong Java development experience with a deep understanding of JVM internals and performance tuning
Hands-on expertise with message brokers (ActiveMQ, Kafka or similar) in production environments
Proven experience with containerisation and orchestration (Nomad experience would be an advantage)
Practical knowledge of OpenTelemetry and distributed tracing concepts
Solid understanding of fault tolerance and reliability patterns such as circuit breakers

Experience Requirements

Experience in high-throughput, low-latency production environments
Track record of improving system reliability and performance at scale
Experience with continuous delivery and DevOps practices
Strong troubleshooting skills in distributed systems
Background in financial services or similar mission-critical domains (preferred)

Are you interested in this position?

Apply by clicking on the “Apply Now” button below!
