Site Reliability Engineer / Cloud Specialist

ESSENTIAL SKILLS REQUIREMENTS:
Experience with container orchestration platforms, preferably Kubernetes
Working knowledge of software development in one or more of the following languages: C#, JavaScript/TypeScript
Experience with Unix/Linux operating system internals and administration, or in-depth knowledge of the Unix networking stack
In-depth network know-how: subnetting, routing, firewalling, DNS, (reverse) proxies and an understanding of the OSI layers
In-depth, low-level understanding of the TCP stack and the ability to debug problems at the “socket/driver” level
Understanding of modern microservices communication architecture (gRPC, Protobufs, service meshes, etc.)

ADVANTAGEOUS SKILLS REQUIREMENTS:
Ability to build highly automated CI/CD pipelines with customized automation scripts using Bash or Python
Experience with standard network debugging tools: traceroute, MTR and Wireshark/tcpdump
IaaS knowledge, e.g., deployment and maintenance of Linux VMs (Ansible)
Experience with cloud technologies:
  Confluent Kafka
  Managed Kubernetes engines, like Azure Kubernetes Service (AKS)
  Proxies, like Azure Application Gateway
  Managed databases, such as PostgreSQL
Working with CI/CD pipeline systems from GitLab and/or GitHub Actions
Git and repository hosting platforms such as GitLab, Nexus and GitHub (self-hosted)
Highly confident and structured approach to work, even during high-impact, high-stress incidents
Excellent interpersonal and organizational skills, with the ability to communicate effectively (both verbally and in writing) with colleagues from all types of educational/intellectual backgrounds
Flexibility to take up different tasks in the project and willingness to move between a subset of project components
Ability and willingness to coach and train fellow colleagues in deep-dive workshops and pair-programming sessions
Willing and able to travel internationally (twice a year)
Above-board work ethic

QUALIFICATIONS/EXPERIENCE:
Minimum 3 years IT working experience or IT Diploma / Degree
Minimum 2 years operations working experience

ROLE AND RESPONSIBILITIES:
You are a member of a larger product team that focuses on the development and support of several mission-critical components.
You work on large-scale fault-tolerant systems and strive to always improve the resiliency of our systems.
You work closely with the development team and the product owner and are responsible for the planning and coordination of all “Design for Run” activities.
You strive to improve the whole lifecycle of our services, from inception and architecture design through deployment and operation of these mission-critical systems.
As an SRE with a deep understanding of the underlying systems, you take part in 24/7 on-call rotations with teams around the world and are able to restore systems in an efficient manner.

Apply online or send CV to thelma.greylinglimpersonnel.com