PagerDuty x Cloudsmith
See how we've saved PagerDuty from pipeline disruption, support bottlenecks and more with first-class performance and service.
PagerDuty is a global leader in digital operations management, enabling customers to achieve operational efficiency at scale with the PagerDuty Operations Cloud. Founded in 2009 and reaching IPO in 2019 (NYSE: PD), the platform revolutionizes how companies run their digital offerings by bringing issues and opportunities to light and automatically rallying the right people to respond. By blending AI with human intelligence, it’s helping more than 27K clients to bolster operational excellence across business-critical work like incident management and customer support by proactively preventing problems, resolving critical incidents faster, and building resilience.
Service instability + support apathy threaten production
While PagerDuty’s performance and service to clients went from strength to strength, behind the scenes in 2020 its engineering teams began encountering their own business-critical incidents—with their artifact management tool. According to Kalpesh Patel, Director of Infrastructure Services at PagerDuty, the most acute of these was recurring downtime, which was disrupting their CI/CD pipeline, causing build failures, and ultimately jeopardizing their ability to reliably roll out product releases, updates, and fixes on schedule. For a cloud-native SaaS solution like PagerDuty that’s constantly innovating and continually releasing, instability was a major concern.
Cumbersome customer service made a bad situation worse. Kalpesh’s teams never found their vendor’s customer service model or quality outstanding. “We have a full service ownership model here where every engineering team goes on call for the services that they own,” explains Dave Bresci, PagerDuty’s Senior Manager of Site Reliability Engineering. “They deploy directly to production, and we give them as much autonomy as possible. [But with our former provider], you could only have two people contact customer support. So everything having to go through us was a real burden when something would go wrong.”
Crisis moments like outages highlighted just how poor their support was.
We weren’t very happy with the customer service we were getting. I’d contact them saying, ‘We’re having a problem with your service,’ and they’d reply, ‘Well, here are the logs.’ And I’d say, ‘I don’t want to see your logs. I want you to tell me what the problem is with the service!’
Finally, security and compliance were becoming more important as PagerDuty set its sights on achieving FedRAMP certification. Knowing that success would require them to implement stricter controls over integration of public packages—and keen to resolve unpredictable downtime disruptions—they started searching for alternatives.
The search for a modern solution
PagerDuty conducted a comprehensive search and evaluation. Mindful that their existing tool had been in situ for seven years, they went looking for a replacement platform that could satisfy their requirements long term. The team compared seven solutions, scoring them across a range of criteria.
Topping the list were platform stability and a cloud-native offering. “Because we’re a cloud-native company, we like to use cloud-native solutions,” says Dave. “The way those platforms are architected is similar to the way we architect our own product, they’re optimized for cloud-native operations, and those teams understand the velocity we need to work at in a way that traditional solutions that only release every six months can’t.”
Selecting a cloud-native solution would also make it easy for PagerDuty to immediately benefit from new functionality in future—by eliminating the cost, risk, and hassle of system upgrades. Other essential criteria included:
customer support speed and quality
supply chain security
range of package formats supported
experience with large enterprise clients
After narrowing the field, PagerDuty selected three vendors for POC exercises. Cloudsmith’s team impressed the PagerDuty employees most, blowing them away with friendliness and a willingness to engage and collaborate like a true partner.
“Out of all of the vendors we talked to, Cloudsmith was the one that seemed to want to work with us the most. Right away they were saying, ‘Here are all the things we want to do for you. Here are the things we can build…’ That won us over. And everything that’s happened since then has only confirmed that initial impression.”
Stability restored, assistance unlocked + risk resolved
Since completing their migration to Cloudsmith in summer 2023, PagerDuty’s engineering teams now enjoy a level of operational resilience, support, incident prevention, and incident management that mirrors the benefits their own product provides clients globally.
Thanks to our robust uptime rates, production disruptions are in the past, leaving PagerDuty’s teams better able to deliver on their ambitions for their product and their promise to customers. “We’ve had no platform-impacting downtime for our artifact repository since migrating to Cloudsmith,” says Dave.
Likewise, our speed and agility mean there’s never long to wait for platform improvements. “When I’ve said, ‘I think I’m going to need X in the next few months,’ Cloudsmith is moving at cloud-native speed, which is short, fast iterations,” he continues. “And that’s how we move. Working with a partner like that has made things very smooth.”
After nearly a decade of enduring their previous platform’s vendor-centric support model, the switch has also introduced a new era of customer service—expediting problem-solving, minimizing disruptions, and boosting team morale.
With Cloudsmith, getting help and answers is no longer a burden or a disappointment. PagerDuty gets our best-in-market technical support for accurate fixes along with unlimited access. “The teams now are able to self-service directly with Cloudsmith’s experts,” says Dave. “Between our shared Slack channel and our entire team’s ability to directly open support tickets, my team isn’t a bottleneck.”
Furthermore, with Cloudsmith’s artifact management nestled at the center of their build pipeline, PagerDuty now has the enhanced security and control required to confidently pursue their FedRAMP certification, including:
private, centralized repositories for better observability
OIDC and policy management for greater publishing control
upstream proxying and caching of third-party dependencies to protect against upstream outages or security breaches
key signing, metadata signing, and checksums to detect package tampering
“Cloudsmith is one of our favorite partners,” says Dave. “Our technically-minded engineers love the capabilities of the platform. For other folks, it’s their customer service. But for me, it’s their responsiveness, flexibility, and willingness to partner with us. That’s really important to me, because we know what our requirements are today, but we don’t necessarily know what our requirements will be tomorrow as we evolve.
“If you’re looking for someone who’s not just going to be a vendor - but a long-term partner that’s invested in you - you’re invested in them, and you’re going to figure out how to make each other better going forward, then that would be my high recommendation on why you should go with Cloudsmith.”