





**About Kruzer**

We are a technology company that simplifies unified commerce. We deliver robust solutions in Order Management System (OMS), Product Information Management (PIM), and DevTools (iPaaS and API Management). We help major brands build omnichannel customer journeys, accelerate operations, and scale efficiently.

We are now seeking an Infrastructure Tech Lead to lead our performance, automation, and observability initiatives in distributed environments.

**The Challenge**

You will be responsible for ensuring our critical systems operate with high availability, low latency, and maximum reliability, leading initiatives that directly impact the experience of millions of consumers.

**Responsibilities**

* Lead technical triage and routing of tickets (Incidents, Problems, Requests, and Changes), ensuring quality, traceability, and prioritization;
* Respond to critical incidents (P1), coordinating mitigation, technical communication, and containment actions;
* Conduct **RCA** (Root Cause Analysis) and define action plans to reduce recurrence and improve system stability;
* Design and implement fixes and enhancements in **Node.js (Koa) / TypeScript**, adhering to architectural standards, testing practices, and engineering best practices;
* Enhance integrations and asynchronous workflows using **Kafka**, and optimize processing and reprocessing pipelines as needed;
* Ensure performance and reliability of data and search layers (**MongoDB, Redis, Elasticsearch/Kibana**), collaborating with Infrastructure/DevOps teams where applicable;
* Support observability and debugging best practices (logs, metrics, tracing) to accelerate diagnosis and reduce MTTR;
* Perform code reviews, guide the team daily, and promote technical consistency (patterns, guidelines, Definition of Done, quality gates);
* Support change and release governance (maintenance windows, rollback, validation) to reduce operational risk.

**Required Skills**

Hard Skills

* Experience developing software in production environments;
* Proficiency in **Node.js** with **TypeScript** (backend) and experience with **Koa** (or equivalent frameworks);
* Hands-on experience with high-demand databases and components: **MongoDB**, **Redis**, and **Elasticsearch** (querying, modeling, performance tuning, troubleshooting);
* Experience with event-driven architecture and/or messaging systems (**Kafka**);
* Familiarity with **Docker** and deployment/operational routines in containerized environments;
* Ability to operate in a governed support environment (SLA, incident management, RCA, changes).

Soft Skills

* Proven experience as a **Tech Lead** or Senior Engineer with strong technical leadership responsibilities;
* Strong communication skills to manage incidents, align priorities, and influence technical decisions across stakeholders;
* Ability to perform effectively under pressure during critical incidents;
* SRE (Site Reliability Engineering) mindset, focused on reliability and automation;
* Team collaboration and mentorship, fostering technical excellence.

**Nice-to-Have**

* Experience with **React** (BFF/Frontend, production troubleshooting, API integration);
* Experience with **Kibana** and advanced observability practices (dashboards, event correlation, alerting);
* Prior experience operating critical systems / retail / omnichannel platforms;
* Knowledge of resilience patterns (retry, circuit breaker, idempotency, DLQ, backpressure) applied to integrations/events;
* Participation in architecture design and reliability improvement initiatives (SRE mindset).

**What We Offer**

* Autonomy to lead and scale the company’s infrastructure;
* Direct interaction with technical teams and high-impact projects;
* A collaborative environment that encourages innovation and technical excellence;
* Involvement in challenging projects with leading retail and healthcare brands.


