Site Reliability Engineering Manager Cape Town - Cape Town Region

Tasiso Consulting Ltd

Reference: PTA000101-SK-1

The Site Reliability Engineering (SRE) Manager is responsible for building and leading the Site Reliability Engineering team for the telescope. This role will use Site Reliability Engineering and other leading principles to support the planning, monitoring, and controlling of the day-to-day operations and delivery aspects of the global IT and Networks of the Observatory, with a particular focus on the systems in South Africa.

The construction of the software and computing systems adheres to large-scale agile principles, using a tailored version of the Scaled Agile Framework (SAFe); this role will be a key stakeholder within this framework as it evolves from construction to operations. This role is also an active participant in implementing all aspects of Site Reliability Engineering across the Global Observatory, including technical vision, observability, automation strategy, solution delivery, and platform incident and problem management.

This is a leadership role with both technical and people leadership responsibilities. As such, this role participates in short- and long-term system and capability planning, as well as team and organisational planning. This position reports directly to the Head of Computing and Software.

Key Responsibilities:
Build, lead and manage the SRE and IT Telescope Operations Team.
Operations and Service management - Work with stakeholders within the organisation to develop and detail the Computing and Software operations and service framework, processes and tools required to operate the telescope as intended.
Service delivery and support - Continuously assess and recommend improvements to our platform and processes to enhance the effectiveness of our services.
Infrastructure, network and platform management.
Support telescope construction and deployment.

Key Requirements:

Qualification: BTech/Degree/Master's/PhD in Computer Science, Information Technology, Information Systems, Computer Engineering or related fields.

Experience:
BTech in Computer Science, Information Technology, Information Systems, Computer Engineering or related fields coupled with 13 years' relevant working experience; or
Degree in Computer Science, Information Technology, Information Systems, Computer Engineering or related fields coupled with 9 years' relevant working experience; or
Master's Degree in Computer Science, Information Technology, Information Systems, Computer Engineering or related fields coupled with 7 years' relevant working experience; or
PhD in Computer Science, Information Technology, Information Systems, Computer Engineering or related fields coupled with 5 years' relevant working experience.

Computer and network infrastructure implementation
IT service, operations and management, including significant responsibility over Service Level Agreements
IT infrastructure or software team leadership
IT Architecture and Governance
Project management
IT systems engineering, application support, and user management
IT governance and security
Data governance and security
IT availability, resilience and redundancy
Systems analysis, design and engineering
Experience in supporting distributed software systems in a production environment such as Cloud and/or Data Centres
Procurement and IT asset management

Knowledge:
Track record of building and managing high-performance teams in a Software, IT or Technology related industry or organisation.
Experience in asset lifecycle management and software asset management.
Experience in managing resources and prioritisation.
Knowledge of and background in IT Service Management disciplines and frameworks such as ITIL and Change Management.
Experience of Lean Agile project management.
Experience of working in a globally diverse team.
Programming/scripting experience and capability across multiple platforms.

Additional Notes:

SKILLS/ABILITIES/COMPETENCIES:

Essential:
Experience working with Linux and within the Open Source Software Ecosystem.
Experience with DevOps tools, processes and culture.
Experience and/or certification and knowledge in SRE, ITIL or related IT Management processes.
Experience supporting and maintaining large-scale High-Performance Computing (HPC) and storage systems.
Advanced experience with programming and/or scripting languages such as Python.

Desirable:
Certification in project management.
Experience in agile project management, e.g. SAFe, Scrum.
Demonstrated interest in astronomy and understanding of the challenges of controlling telescopes.

Strong Leadership Quality
Strategic thinker
Problem solving skills
Planning and Time Management
Team building and collaboration
Resource Management
Planning and Design
Communication and Interpersonal skills

Skills:
Teamwork and Collaboration: Cooperates with others to achieve organisational objectives and may share team resources in order to do this. Collaborates with other teams as well as industry colleagues.
Influence and Communication: Identifies critical stakeholders and influences them via an influential third party, for example through an established network, to gain support for sometimes contentious proposals/ideas.
Resource Management/Leadership: Provides leadership that fosters an environment that encourages new ideas and provides support for the development of emerging skills. Creates trust by displaying consistency, understanding, integrity and patience. Plans, seeks, allocates and monitors resources to achieve outcomes.
Judgement and Problem Solving: Anticipates and manages problems in ambiguous situations. Develops and selects an appropriate course of action and provides for contingencies. Evaluates, interprets and integrates complex bodies of information and draws logical conclusions, synthesises proposals and defends options with reasoned arguments.
Independence: Assesses the risk and opportunity of identified strategies, options and actions. Overcomes problems and setbacks in achieving goals. Invariably includes consideration of value-added future impact on the bottom line when determining the optimal and efficient use of resources.
Adaptability: Demonstrates flexibility in thinking and adapts to and manages the increasing rate of organisational change by adjusting strategies, goals and priorities.