SaaS Operations Team Lead

CSC
Full-time
Remote
United States
Description

Buffalo Grove, IL

Monday – Friday, 8:00 am – 5:00 pm

Remote

We’re seeking a talented, motivated, hands-on Team Lead to run our SaaS Operations team with a strong Site Reliability Engineering (SRE) mindset. You’ll own and improve the reliability of our SaaS platform, treating availability, performance, and operational excellence as core product features. This role is Azure-first and cloud-forward, and it operates in a hybrid environment (Microsoft Azure plus private infrastructure).

Some of the things you’ll be doing:

  • Lead the SaaS Operations/SRE team: prioritize work, mentor engineers, set standards, and act as the primary escalation point
  • Own reliability outcomes: define and improve service health, availability, latency, and operational readiness
  • Operate and optimize Azure services including Azure Front Door, Azure Container Apps, virtual networking, PaaS databases, and Key Vault
  • Lead incident response end-to-end: triage, coordination, clear communications, and follow-through
  • Drive root cause analysis and postmortems; ensure corrective actions are implemented and tracked
  • Reduce operational toil through automation, self-service, and repeatable runbooks
  • Build and refine observability: monitoring, logging, dashboards, and actionable alerting
  • Manage day-to-day operational tickets and change activity following defined controls (incident/problem/change)
  • Partner with Engineering, Infrastructure, and Security to improve operability and safe delivery (release readiness, rollout/rollback planning)
  • Participate in an on-call rotation and, when needed, in planned maintenance windows after hours or on weekends

What technical skills, experience, and qualifications do you need?

  • 5+ years in production operations (SRE, platform engineering, DevOps, SaaS operations, systems engineering, or similar)
  • Demonstrated technical leadership (team lead responsibilities, mentoring, ownership of operational standards)
  • Strong troubleshooting across distributed systems: web platform, networking, containers, identity, certificates/secrets, and performance bottlenecks
  • Azure production experience with:
    • Azure Front Door
    • Azure Container Apps
    • Azure virtual networking (VNets, private endpoints, DNS patterns, hybrid connectivity concepts)
    • Azure Key Vault
    • PaaS databases
  • Automation and scripting: PowerShell, Bash, Azure CLI, and YAML-based pipelines/workflows
  • DevOps toolchain experience (GitHub and/or Azure DevOps); automation/config tooling such as Ansible (or equivalent)
  • ITSM process discipline and tools (e.g., ServiceNow): incident, problem, and change management

Hybrid environment requirements

This position supports a hybrid platform. You must be able to operate and troubleshoot components running in private infrastructure, including:

  • Enterprise identity systems (e.g., Active Directory, Group Policy)
  • Web platform (IIS)
  • Microsoft server-based platforms and related operational practices (patching/maintenance, certificate lifecycle, file services such as DFS)
  • Virtualization/hypervisor platforms (Nutanix AHV, VMware, or similar)

Nice to have

  • Infrastructure as Code experience (Bicep preferred; Terraform/ARM also valuable)
  • Experience implementing SLOs and improving alerting hygiene (noise reduction, paging policies)
  • Experience improving incident response practices (runbooks, escalation paths, reliability reviews)
