Observability Specialist

Indeed
Full-time
Onsite
No experience limit
No degree limit
R. Benedita Guerra Zendron, 21 - Vila Sao Joao, Barueri - SP, 06401-190, Brazil

Description

Job Summary: We are seeking an Observability Specialist to lead initiatives focused on reliability, automation, and continuous improvement of the monitoring ecosystem.

Key Highlights:

1. Lead reliability, observability, and automation initiatives
2. Collaborate with SRE, DevOps, Infrastructure, and Development teams
3. Develop Python automations for monitoring data

**About Alelo**

Here, we **evolve through exchange**, **care deeply about every customer**, and take great **pride in making a difference** in the world. That’s why we constantly **take risks, correct course, and surprise**, actively **ECOing our culture** from the inside out and shaping our unique Alelo way of being!

We respect differences and value diversity and equity across gender, race, color, religion, disabilities, sexual orientation, ancestry, or age, because together, we are stronger and uniquely ourselves!

**So, does this match you?!**

**Learn more at our careers website.**

**What will your day-to-day look like?**

We seek an Observability Specialist with strong strategic capability and hands-on expertise to lead initiatives focused on reliability, automation, and continuous improvement of the monitoring ecosystem. This person will work cross-functionally with SRE, DevOps, Infrastructure, and Development teams, directly contributing to incident reduction, increased availability, operational efficiency, and advancement of a reliability-focused culture. The professional will be responsible for designing, evolving, and operating observability tools, as well as disseminating best practices, metrics, and automations that support technical and business decisions.

* Lead reliability, observability, and automation initiatives, ensuring end-to-end visibility across environments and services.
* Design, evolve, and maintain dashboards, alerts, executive dashboards, and infrastructure and business performance metrics.
* Optimize observability costs (FinOps), defining retention and sampling policies (licensing).
* Implement and promote DORA metrics, availability indicators, SLOs/SLIs, and operational KPIs.
* Collaborate with technical teams to reduce incidents, perform root cause analysis (RCA), and drive continuous improvement.
* Develop Python automations for monitoring data collection, enrichment, and standardization.
* Manage and optimize tools such as Datadog, Zabbix, and Grafana, including integrations across multiple cloud platforms.
* Support the evolution of SRE and DevOps culture by spreading practices and frameworks.
* Lead observability initiatives in strategic projects and new technology implementations.
* Ensure governance of monitoring platforms and promote standardization across teams.

**Requirements and Qualifications:**

* Solid experience with observability tools, especially Datadog.
* Familiarity with tools such as Zabbix and Grafana.
* Knowledge of DORA metrics, including deployment frequency, lead time, MTTR, and change failure rate.
* Experience with Site Reliability Engineering (SRE) and DevOps practices.
* Proficiency in Python for automations, integrations, and data processing.
* Experience in multicloud environments (e.g., AWS, Azure, GCP).

**Nice-to-have:**

* Certifications in SRE, DevOps, or monitoring platforms.

**What we offer:**

At Alelo, we value health and well-being, offering an extensive portfolio of benefits to all our employees:

* PLR or Bonus: Based on position eligibility;
* Health Insurance;
* Dental Insurance;
* Pharmacy Allowance;
* Alelo Culture (optional enrollment);
* Life Insurance;
* Alelo Food;
* Alelo Meal;
* Alelo Year-End Bonus;
* Wellhub or Totalpass;
* Commuter Benefit (optional enrollment), Fuel Allowance, or Shuttle Service;
* Home Office Stipend;
* Home Office Setup Allowance;
* Livelo Points and Partner Discounts;
* Private Pension Plan (optional enrollment, with financial contribution from Bradesco Organization);
* Fique bem Auster: Guidance program for health, well-being, and quality of life;
* Unialelo: Alelo Corporate University;
* Unico Skill;
* Childcare or Babysitter Assistance;
* Extended Paternity Leave (20 days);
* Maternity Leave (180 days), pregnancy-to-postpartum support, and maternity kit.

Source: Indeed
João Silva
Indeed · HR

© 2025 Servanan International Pte. Ltd.