




Job Summary: A monitoring professional responsible for ensuring the integrity and availability of systems and applications, with a focus on analysis, incident resolution, and continuous improvement.

Key Highlights:

1. Monitor systems, networks, and distributed applications.
2. Proactively resolve problems and manage incidents.
3. Propose and implement improvements to monitoring processes.

Description:

1\. Technical:

* Applications and systems: Understanding of internal Hiper distributed applications and of data ingestion, transformation, and output processes.
* Basic infrastructure knowledge: Servers, databases, Google Cloud, and connectivity.
* Monitoring: Experience with infrastructure monitoring tools such as Nagios and Grafana.
* Analysis and troubleshooting: Ability to diagnose issues and respond to failures (logs, alerts, incidents, requests).
* Information security: Information security fundamentals and LGPD compliance.
* Data processing: Knowledge of data processing workflows and the ability to monitor them.

2\. Behavioral:

* Attention to detail: Ability to identify failures and inconsistencies in monitoring processes.
* Critical and analytical mindset: Ability to identify improvements, optimize workflows, and analyze logs, alerts, and metrics to interpret potential failures and predict impacts.
* Investigative mindset: Ability to identify possible root causes and suggest corrective solutions.
* Commitment and accountability: Strict adherence to procedures and SLAs, remaining attentive and available during 24x7 operational shifts.
* Strong communication skills: Ability to collaborate effectively with Engineering, Development, Infrastructure, and Customer Support teams.
* Resilience and emotional control: Handling incidents and tight deadlines without compromising quality, and maintaining composure and emotional intelligence to make sound decisions during crises.
* Proactivity and adaptability: Anticipating problems and suggesting process improvements, with the curiosity and readiness to learn new technologies and keep advancing.
* Organization and documentation: Recording incidents, managing tickets, and documenting best practices for the team.

Responsibilities and Duties

* Perform real-time monitoring of servers, networks, applications, and Google Cloud, using Nagios and Grafana.
* Analyze incidents and classify their impact and urgency.
* Proactively resolve issues while preserving SLA compliance.
* Log incidents clearly in Jira and ServiceNow, maintaining a detailed history that includes applied solutions and root causes.
* Keep the Knowledge Base updated with recurring procedures and solutions.
* Prepare performance and availability reports for monitored resources.
* Propose and implement improvements to monitoring processes.
* Stay current with industry best practices in monitoring.
* Ensure the integrity and security of monitored information.
* Understand and follow the entire change management process (GMUD).

Requirements and Qualifications

* Experience monitoring systems and networks.
* Experience interpreting logs and analyzing failures.
* Familiarity with Jira and ServiceNow.
* Cloud knowledge, specifically Google Cloud.
* Degree in Information Technology (or a related field).


