CloudLinux is looking for a brilliant Senior Site Reliability Engineer (SRE) to join the Release Engineering Department, a team that plays a critical role in maintaining both external and internal infrastructure related to package repositories, with a strong focus on delivering and managing repository distribution to users.
This role offers a unique opportunity to collaborate with multiple development teams, accelerate progress, and provide enterprise-level solutions globally. Responsibilities include Linux OS administration, designing system solutions at an architectural level, advancing cloud technologies, system programming, Python/Linux scripting, and working with virtualization. This is a remote position best suited for professionals located in Europe and CIS, as the team primarily operates within European time zones.
As our Senior Site Reliability Engineer, you will:
- As a first assignment, design, implement, and manage scalable, resilient, and secure company-wide repository infrastructure for CloudLinux products.
- Automate software operations for re-usability and consistency across private and public clouds, taking into consideration the complexities of distributed systems.
- Monitor system performance and troubleshoot issues proactively to ensure optimal uptime and reliability.
- Automate deployment processes using Infrastructure as Code (IaC) principles.
- Share your experience, know-how, and best practices with other team members in design sessions, system architecture discussions, mentorship, and "doing work together".
Requirements
To be successful, you should have:
- Strong background in development: the ideal candidate started their career as a developer, then moved on to large-scale infrastructure projects.
- Proven experience as a leading SRE or in a similar role, with a strong focus on Linux environments.
- Proficiency in modern agile SDLC practices and principles, orchestration, and CI/CD tooling, e.g. Python, Java, Terraform, Ansible, CloudFormation, Puppet, Chef, or similar.
- Knowledge of the Grafana ecosystem or similar: building dashboards, alert rules, and PromQL queries, as well as frontend observability.
- Excellent technical knowledge of IT Infrastructure, including network and application load balancers, switches, routers, and IP addressing.
- Strong analytical and problem-solving skills with a focus on root cause analysis and mitigation.
- Excellent communication and teamwork skills with the ability to collaborate effectively across engineering teams.
- English: at least Intermediate level required.
Benefits
What's in it for you?
- A focus on professional development.
- Interesting and challenging projects.
- Fully remote work with flexible working hours, which allows you to schedule your day and work from any location worldwide.
- 24 days of paid vacation per year, 10 days of national holidays, and unlimited sick leave.
- Compensation for private medical insurance.
- Co-working and gym/sports reimbursement.
- Budget for education.
- The opportunity to receive a reward for the most innovative idea that the company can patent.
Key skills
- English: B2 (Upper-Intermediate)
Vacancy published on March 27, 2025 in Tbilisi