SITE RELIABILITY ENGINEERING MANAGER

Job description

Company

Onum is a data optimization and analytics company based in Madrid. We specialize in real-time data analysis to enable rapid decision-making regarding cybersecurity, network performance, and infrastructure management. Onum helps you optimize your data analytics costs by reducing data, avoiding vendor lock-in, and aligning the value of each dataset with actions taken.

About the Role

As an SRE Manager, you will be responsible for leading and managing a team of Site Reliability Engineers while staying actively involved in day-to-day technical operations. This is a hands-on leadership role where you will help your team solve complex problems, drive operational excellence, and ensure that our platform remains highly reliable, scalable, and efficient. You will work closely with software engineers and DevOps teams to identify opportunities to improve infrastructure reliability and automation.

Responsibilities

Team Leadership & Development:
  • Manage and mentor a small team of SREs, helping them to grow their skills through coaching, feedback, and development plans.
  • Foster a collaborative team environment where knowledge sharing, continuous learning, and innovation are encouraged.
  • Assist in recruiting and onboarding new SRE team members, ensuring they are set up for success.
  • Conduct regular one-on-ones with team members, set clear performance goals, and provide ongoing support.

Hands-on Technical Guidance:
  • Lead by example by participating in technical discussions, incident resolution, and troubleshooting critical system issues.
  • Provide guidance on best practices for system reliability, automation, and performance optimization.
  • Support the team in designing and implementing reliable, scalable cloud infrastructure, ensuring smooth deployment pipelines and reducing manual toil.

Incident & Operations Management:
  • Help the team manage the on-call rotation and be available to support incident response when necessary.
  • Ensure timely resolution of incidents, participate in post-mortems, and track follow-up actions to prevent recurrence.
  • Establish effective processes for monitoring, alerting, and improving system health, working with your team to ensure high availability.

Collaboration & Cross-functional Partnership:
  • Collaborate closely with software engineering, DevOps, and product teams to define reliability standards and improve the overall stability of our platform.
  • Communicate technical issues, resolutions, and improvements clearly to non-technical stakeholders.
  • Work with teams to set Service Level Objectives (SLOs) and improve performance based on data-driven decisions.

Automation & Process Improvement:
  • Identify opportunities for automation in daily operations, helping to improve deployment speed, incident response, and reliability of the platform.
  • Ensure the team is leveraging infrastructure-as-code (e.g., Terraform) and other automation tools to reduce manual processes and increase scalability.

Operational Metrics & Monitoring:
  • Work with your team to ensure systems are well-monitored and metrics are effectively captured using tools like Prometheus, Grafana, or Datadog.
  • Track key performance indicators (KPIs) for system uptime, reliability, and team performance, identifying areas for continuous improvement.

Qualifications:
  • 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role, with at least 1 year of experience leading a small team or mentoring junior engineers.
  • Strong understanding of cloud platforms (AWS, GCP, or Azure) and modern infrastructure practices (e.g., containerization with Docker/Kubernetes, CI/CD pipelines).
  • Hands-on experience with infrastructure-as-code tools (Terraform, Ansible, etc.) and cloud automation.
  • Proven ability to troubleshoot complex infrastructure issues, perform root cause analysis, and implement system improvements.
  • Experience with monitoring and alerting systems like Prometheus, Grafana, Datadog, or equivalent.
  • Excellent communication and collaboration skills, with the ability to work cross-functionally and explain technical concepts to non-technical stakeholders.

Job details

Company
  • Onum
Location
  • Anywhere in Spain
Address
  • Not specified
Publication date
  • 03/04/2025
Expiration date
  • 02/07/2025