Gridware

Senior DevOps Engineer

Join Gridware in San Francisco as a Senior DevOps Engineer. Lead AWS and Kubernetes infrastructure for wildfire detection systems. Enjoy paid parental leave and a unique two-week paid break called 'Off the Grid'.
San Francisco, California, United States | On-site | Full time | USD 190k–215k yearly | UTC-07:00

Company Overview

Gridware

Headquarters: California, United States

Founded: 2020

Size: Approximately 30 employees (source: businessinsider.com).

What They Do

Gridware is a startup focused on improving the resilience of the electric grid. Its core offering, Active Grid Response (AGR), uses pole-mounted Gridscope sensors that monitor conditions on distribution power lines in real time. The sensors are solar-powered and communicate over device-to-device, cellular, and satellite networks, so they keep operating regardless of grid voltage (source: gridware.io). The technology is designed to detect issues such as vegetation contact, downed lines, and equipment failures, enabling utilities to perform predictive maintenance and dynamic de-energization to prevent wildfires and outages (source: cbsnews.com). Target markets include U.S. electric utilities, particularly in wildfire-prone areas such as California and the Midwest, with plans for international expansion (source: promptloop.com). The products integrate with utility operations management systems; the company cites more than 90 million field hours and partnerships serving 40% of U.S. customers (source: gridware.io).

Projects & Track Record

Gridware has successfully deployed approximately 13,000 sensors across eight states, particularly in California's high-risk wildfire areas, covering around 1,000 miles of power lines. Their technology has been credited with preventing wildfires, as evidenced by reports of alerts that stopped smoldering vegetation from igniting (source: cbsnews.com). Notable integrations include partnerships with PG&E for high impedance fault detection and a pilot project with Puget Sound Energy aimed at improving storm and wildfire repair prioritization. In one instance, a Northern California circuit experienced a 70% reduction in outage patrol times, while a Midwest utility saved 400,000 outage minutes across four circuits (source: gridware.io). Ongoing projects include a 2024 pilot with NorthWestern Energy in Montana City, focusing on real-time asset monitoring to mitigate wildfire risks (source: northwesternenergy.com).

Recent Developments

In the past two years, Gridware has raised funding to support its growth and expansion. In 2024, it announced a $26.4 million Series A round led by Sequoia Capital, following a $10.5 million seed extension in 2023 co-led by Lowercarbon Capital and Fifty Years (source: gridware.io). The funding is aimed at enhancing its U.S. operations and preparing for international deployments. Gridware has also received recognition for its technology: the Gridscope sensor was named one of Time magazine's Best Inventions in 2022, and its founders were featured in Forbes 30 Under 30 in 2023 (source: engineering.berkeley.edu). The company continues to deepen its partnerships with utilities, including a pilot program with NorthWestern Energy set for late 2024 (source: northwesternenergy.com).

Working There

Gridware offers a variety of engineering-focused roles, including positions in software engineering, electrical design engineering, data engineering, and technical recruiting. The company is built by linemen and engineers, reflecting a culture that emphasizes collaboration with field workers and a mission-driven approach to wildfire prevention (source: climatepeople.com). Hiring is concentrated at their Bay Area headquarters, where they are rapidly expanding their team to support sensor production and deployment efforts. The culture at Gridware is described as rigorous and mission-oriented, with a strong focus on achieving real-world results, such as the significant field hours logged by their technology (source: gridware.io). While specific employee benefits are not detailed in the sources, the venture-backed nature of the company suggests competitive startup perks may be available (source: cbsnews.com).


Last updated on Feb 23, 2026

Job Description

We're scaling the deployment of critical infrastructure monitoring devices to detect real-world fault events that lead to wildfires. The platform you'll build and operate ingests millions of events per day from devices in the field, powers customer-facing dashboards and alerting, and supports the data science work that turns raw signals into grid intelligence.

You will own AWS infrastructure, Kubernetes (EKS), CI/CD, and observability end-to-end, partnering with our Cloud Security team to keep the platform safe and compliant, and with backend, firmware, and data teams to keep them shipping fast. As an early member of the DevOps team, you'll have a direct hand in shaping how Gridware builds, deploys, and runs production systems for years to come.

Responsibilities

  • Design, build, and maintain scalable, secure, and highly available infrastructure on AWS (EKS, EC2, RDS / Aurora Postgres, MSK, S3, VPC, IAM).
  • Manage and optimize Kubernetes clusters (EKS) across multiple environments, and deploy applications using Argo CD with GitOps best practices.
  • Implement and maintain CI/CD pipelines using GitHub Actions, including reusable workflows, build/push/scan flows for ECR, and frontend deployment pipelines.
  • Operate and tune Kafka-based event streaming on Amazon MSK for high-throughput, low-latency device data pipelines.
  • Define and manage Infrastructure as Code with Terraform and Terragrunt, with reusable modules, sensible environment separation, and review-friendly plans.
  • Manage identity and access across platforms with Auth0 / Entra ID integrations, IAM roles for service accounts (IRSA), and short-lived credentials.
  • Build and maintain observability with Grafana, Loki, Prometheus / Mimir, and related tooling so on-call engineers can quickly find and fix issues.
  • Monitor and optimize infrastructure cost across environments, partnering with engineering teams on right-sizing, capacity planning, and waste reduction.
  • Partner with our Cloud Security team to enforce security standards, integrate with SIEM tooling, and respond to vulnerabilities and incidents.
  • Debug complex production issues across infrastructure, deployment, and networking layers, and turn the lessons learned into automation and runbooks.

Required Skills

  • 5+ years in DevOps, SRE, or Platform Engineering with production experience operating AWS infrastructure.
  • Deep hands-on experience administering Kubernetes (EKS or equivalent) and deploying via GitOps (Argo CD or Flux).
  • Proficiency with Infrastructure as Code using Terraform; comfort with Terragrunt or a similar wrapper.
  • Hands-on experience designing and maintaining CI/CD pipelines, preferably with GitHub Actions and reusable workflows.
  • Production experience operating distributed systems such as Kafka (MSK).
  • Strong understanding of networking, DNS, TLS, and security best practices, including IdP-driven access control (Auth0, Entra ID, or similar).
  • Solid experience with monitoring and logging stacks such as Grafana, Loki, Prometheus, Mimir, or equivalents.
  • Ability to debug complex production issues across infrastructure, deployment, and networking layers.
  • Comfortable working in Linux environments with strong scripting skills (Python or Bash preferred for automation).
  • Knowledge of version control workflows, automated testing, and release management.

Bonus Skills

  • Experience operating Apollo Router / GraphQL federation gateways in production.
  • Experience operating Argo Workflows or similar Kubernetes-native job / pipeline runners in production.
  • Familiarity with Databricks or ML Ops pipelines for data and model deployment.
  • Experience designing, operating, and exercising Disaster Recovery (DR) environments, including cross-region replication, backups, and tested failover runbooks.
  • Experience with Tailscale or other zero-trust networking tools.
  • Experience supporting IoT / embedded fleets at scale, including secure device-to-cloud connectivity.
  • Experience in high-growth startup environments where you must wear many hats.

$190,000 - $215,000 a year

This describes the ideal candidate; many of us have picked up this expertise along the way. Even if you meet only part of this list, we encourage you to apply!

Benefits

  • Health, dental, and vision insurance (Gold and Platinum plans, with some providers' plans fully covered)
  • Paid parental leave
  • Alternating day off (every other Monday)
  • "Off the Grid," a two-week paid break each year for all employees
  • Commuter allowance
  • Company-paid training

About the role

May 14, 2026

Full time | On-site | USD 190k–215k yearly

Smart Grid

Gridware | gridware.io

San Francisco, California, United States | UTC-07:00

5+ years in DevOps, SRE, or Platform Engineering