
47137
Make an amazing climb in your career in an international team of experts. Our company provides technological services for the whole Schwarz Group in more than 30 countries in Europe and the US. Our vision is to be the leading ecosystem for a better life. We built the European sovereign cloud STACKIT. With XM Cyber we set new standards in deterring cybercrime. We run AI better than anyone. With us you will find a variety of opportunities to grow and do your best at your calling – IT. We exist to improve life with our products and services, for today's generation and future generations. We act future-proof!
The impact you will create:
- You will be a part of the Infrastructure-as-a-Service Site Reliability Engineering (IaaS SRE) team, helping to build a high-performance cloud platform based on OpenStack, with a focus on scaling and expansion across data centers and national borders
- You will continuously operate and optimize technical processes through efficient automation and further development using Golang and/or Python
- You will own and optimize the provisioning of bare-metal resources from various manufacturers using OpenStack Ironic and internal and/or open-source tools
- You will operate and manage the surrounding Linux-based system landscapes (e.g., Kubernetes, Proxmox) and ensure the high availability of our cloud infrastructure
- You will create and maintain documentation and be responsible for the implementation and maintenance of monitoring and logging (e.g., Prometheus, Grafana, ELK Stack) for stable platform operation
- You will be part of a motivated team that constantly strives for improvement and continuously develops itself and its products
Experience and skills you will need:
- You are passionate about new technologies and topics related to Linux, automation, virtualization, and networking
- You are proactive in driving improvements in availability and scaling and are eager to automate processes
- You are capable of analyzing and solving technical problems and have experience in conducting root cause analyses
- You have several years of experience in implementing and managing Kubernetes environments, including deployment and scaling
- You have experience in programming and scripting in Python and/or Golang
- You have experience in build and release management
STACKIT CLOUD INFRASTRUCTURE SITE RELIABILITY ENGINEER (SRE) (m/f/d)