Observability Architect

Posted 106 weeks ago

Job Description

Job Summary

We are seeking a seasoned Observability Architect to define and lead our end-to-end observability strategy across highly distributed, cloud-native, and hybrid environments. This role requires a visionary leader with deep hands-on experience in New Relic and a strong working knowledge of other modern observability platforms like Datadog, Prometheus/Grafana, Splunk, OpenTelemetry, and more. You will design scalable, resilient, and intelligent observability solutions that empower engineering, SRE, and DevOps teams to proactively detect issues, optimize performance, and ensure system reliability. This is a senior leadership role with significant influence over platform architecture, monitoring practices, and cultural transformation across global teams.

Key Responsibilities

  • Architect and implement full-stack observability platforms, covering metrics, logs, traces, synthetics, real user monitoring (RUM), and business-level telemetry using New Relic and other tools like Datadog, Prometheus, ELK, or AppDynamics.
  • Design and enforce observability standards and instrumentation guidelines for microservices, APIs, front-end applications, and legacy systems across hybrid cloud environments.
  • Lead OpenTelemetry adoption, ensuring vendor-neutral, portable observability implementations where appropriate.
  • Build multi-tool dashboards, health scorecards, SLOs/SLIs, and integrated alerting systems tailored for engineering, operations, and executive consumption.
  • Collaborate with engineering and DevOps teams to integrate observability into CI/CD pipelines, GitOps, and progressive delivery workflows.
  • Partner with platform, cloud, and security teams to provide end-to-end visibility across AWS, Azure, GCP, and on-prem systems.
  • Lead root cause analysis, system-wide incident reviews, and reliability engineering initiatives to reduce MTTR and improve MTBF.
  • Evaluate, pilot, and implement new observability tools/technologies aligned with enterprise architecture and scalability requirements.
  • Deliver technical mentorship and enablement, evangelizing observability best practices and nurturing a culture of ownership and data-driven decision-making.
  • Drive observability governance and maturity models, ensuring compliance, consistency, and alignment with business SLAs and customer experience goals.

 Required Qualifications

  • 15+ years of overall IT experience, including hands-on application development, system architecture, and operations in complex distributed environments, as well as troubleshooting and integrating applications and cloud technologies with observability tools.
  • 5+ years of hands-on experience with observability tools such as New Relic, Datadog, and Prometheus, including APM, infrastructure monitoring, logs, synthetics, alerting, and dashboard creation.
  • Proven experience and willingness to work with multiple observability stacks, such as:
      • Datadog, Dynatrace, AppDynamics
      • Prometheus, Grafana, etc.
      • Elasticsearch, Fluentd, Kibana (EFK/ELK)
      • Splunk, OpenTelemetry
  • Solid knowledge of Kubernetes, service mesh (e.g., Istio), containerization (Docker) and orchestration strategies.
  • Strong experience with DevOps and SRE disciplines, including CI/CD, IaC (Terraform, Ansible), and incident response workflows.
  • Fluency in one or more programming/scripting languages: Java, Python, Go, Node.js, Bash.
  • Hands-on expertise in cloud-native observability services (e.g., CloudWatch, Azure Monitor, GCP Operations Suite).
  • Excellent communication and stakeholder management skills, with the ability to align technical strategies with business goals.

  Preferred Qualifications

  • Architect-level certifications in New Relic, Datadog, Kubernetes, AWS/Azure/GCP, or SRE/DevOps practices.
  • Experience with enterprise observability rollouts, including organizational change management.
  • Understanding of ITIL, TOGAF, or COBIT frameworks as they relate to monitoring and service management.
  • Familiarity with AI/ML-driven observability, anomaly detection, and predictive alerting.

 Why Join Us?

  • Lead enterprise-scale observability transformations impacting customer experience, reliability, and operational excellence.
  • Work in a tool-diverse environment, solving complex monitoring challenges across multiple platforms.
  • Collaborate with high-performing teams across development, SRE, platform engineering, and security.
  • Influence strategy, tooling, and architecture decisions at the intersection of engineering, operations, and business.

Job Summary

Location: Chandigarh
Job type: Full Time Permanent
Experience: 15+ years
Openings: 1

Contact

Unit #E1J, First Floor, Tower B, Godrej Eternia, Plot #70, Industrial Area, Phase 1, Chandigarh
Chandigarh, Chandigarh, 160002
Phone: +91 - 7814302836