Jun 9, 2026

What Institutional Clients Should Ask About Uptime SLAs and Disaster Recovery Before Signing a WaaS Contract

Cregis


Before signing any Wallet-as-a-Service agreement, institutional clients need to interrogate uptime SLAs and disaster recovery terms with the same rigor they apply to capital adequacy or counterparty risk. A WaaS contract that looks solid on paper can leave a bank, payment processor, or exchange exposed the moment infrastructure fails. The right questions reveal whether a provider is genuinely built for institutional continuity or simply offers standard cloud commitments dressed up in financial language.

TL;DR

  • SLA uptime percentages look impressive until you calculate what the permitted downtime actually costs your operations.
  • Disaster recovery terms, including RTO and RPO, are as important as the uptime figure itself.
  • Compliance-sensitive institutions need SLAs that address custody integrity, not just availability.
  • A provider's track record, certifications, and architecture matter more than contractual language alone.
  • The right WaaS partner treats continuity as infrastructure, not a contractual obligation to manage around.

About the Author: This article is written by the Cregis team, drawing on nine years of operating enterprise-grade digital asset infrastructure for 3,500+ institutional clients across 50+ countries.

What Does an Uptime SLA Actually Mean in a WaaS Context?

An uptime SLA (Service Level Agreement) is a contractual commitment from a provider specifying the minimum percentage of time a service will be operational within a defined period [atlassian.com]. In a WaaS context, this covers wallet creation, transaction signing, key management, and API availability.

The catch is that headline percentages obscure the real operational impact. Consider the math:

Uptime Commitment    Permitted Downtime Per Year
99%                  ~3.65 days
99.9%                ~8.8 hours
99.99%               ~52 minutes
99.999%              ~5 minutes

Suppliers regularly cite high availability figures, but the permitted downtime at even 99.9% uptime can represent hours of inaccessible custody, blocked settlements, or failed client-facing transactions [computerweekly.com]. For an exchange processing continuous order flow or a payment service provider running 24/7 remittance rails, even 52 minutes of annual downtime is a material operational event.
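The downtime figures above follow from simple arithmetic that can be reproduced for any commitment level or measurement period. The sketch below is illustrative only: the function name `permitted_downtime_minutes` and the 365-day year are assumptions made for this example, not part of any provider's SLA tooling.

```python
def permitted_downtime_minutes(uptime_percent: float, period_days: float = 365.0) -> float:
    """Minutes of downtime allowed over the period at a given uptime commitment."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    period_minutes = period_days * 24 * 60
    return period_minutes * (100.0 - uptime_percent) / 100.0


# Print the annual downtime allowance for common commitment levels.
for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = permitted_downtime_minutes(pct)
    print(f"{pct}% uptime -> {minutes:.1f} minutes per year ({minutes / 60:.2f} hours)")
```

Passing a different `period_days` value (for example 30) shows the allowance for a monthly measurement period instead of an annual one.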

Questions to ask:

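  • Is the SLA measured per month or per year? Monthly measurement caps the downtime permitted in any single month (roughly 43 minutes at 99.9% in a 30-day month), whereas annual measurement allows the full yearly allowance to fall in one incident.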
  • Does the SLA cover all components, or just the primary API gateway?
  • Are scheduled maintenance windows excluded from uptime calculations?
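To see why the measurement period and maintenance exclusions matter, it can help to run the same outage record through different contract terms. The following is a minimal sketch under stated assumptions: the outage figures, the 30-day month, and the 99.9% target are hypothetical, and the helper names are invented for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def uptime_percent(downtime_minutes: float, period_days: float) -> float:
    """Measured uptime as a percentage of the measurement period."""
    period_minutes = period_days * MINUTES_PER_DAY
    return 100.0 * (1.0 - downtime_minutes / period_minutes)


def counted_downtime(outages, exclude_scheduled: bool) -> float:
    """Sum outage minutes, optionally ignoring scheduled maintenance.

    `outages` is a list of (minutes, is_scheduled) tuples.
    """
    return sum(
        minutes
        for minutes, is_scheduled in outages
        if not (exclude_scheduled and is_scheduled)
    )


TARGET = 99.9  # hypothetical contractual commitment, in percent

# Case 1: one 6-hour outage, judged against an annual and a monthly period.
six_hours = 6 * 60
annual = uptime_percent(six_hours, period_days=365)
monthly = uptime_percent(six_hours, period_days=30)
print(f"annual:  {annual:.3f}%  meets target: {annual >= TARGET}")
print(f"monthly: {monthly:.3f}%  meets target: {monthly >= TARGET}")

# Case 2: one month with a 40-minute unplanned outage and a
# 120-minute scheduled maintenance window.
month_outages = [(40, False), (120, True)]
for exclude in (False, True):
    down = counted_downtime(month_outages, exclude_scheduled=exclude)
    pct = uptime_percent(down, period_days=30)
    print(f"exclude maintenance={exclude}: {pct:.3f}%  meets target: {pct >= TARGET}")
```

Under these assumed figures, the same six-hour outage satisfies a 99.9% annual commitment but breaches a 99.9% monthly one, and excluding the maintenance window turns a missed month into a passing one.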

What Is the Difference Between RTO and RPO, and Why Does It Matter?

What happens after a failure occurs is a separate but equally important concern. Uptime commitments cap how much downtime is permitted in total; recovery terms govern how long a single outage may last and how much data it may cost.

Two metrics define recovery capability [ussignal.com]:

  • RTO (Recovery Time Objective): The maximum time a provider commits to restoring service after an incident. A 4-hour RTO means the system could be down for up to 4 hours before the provider is in breach.
  • RPO (Recovery Point Objective): The maximum window of recent data, measured in time, that may be lost in a recovery. An RPO of 1 hour means up to 1 hour of transaction records, wallet states, or key activity logs could be unrecoverable.

For custody-grade WaaS infrastructure, these numbers carry direct financial and regulatory weight. A long RPO in a custody context is not just a data loss problem. It is a reconciliation problem, a regulatory reporting problem, and potentially a client asset integrity problem.
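The two metrics can be checked against the timeline of any past incident. The sketch below is a hedged illustration, not a provider's reporting format: the `Incident` fields, the `recovery_report` helper, and the example timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    failed_at: datetime            # when service became unavailable
    restored_at: datetime          # when service was restored
    last_recoverable_at: datetime  # timestamp of the newest data that survived recovery


def recovery_report(incident: Incident, rto: timedelta, rpo: timedelta) -> dict:
    """Compare one incident's downtime and data-loss window with the RTO and RPO."""
    downtime = incident.restored_at - incident.failed_at
    data_loss_window = incident.failed_at - incident.last_recoverable_at
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical incident: 3.5 hours of downtime, 75 minutes of lost records.
incident = Incident(
    failed_at=datetime(2026, 3, 2, 10, 0),
    restored_at=datetime(2026, 3, 2, 13, 30),
    last_recoverable_at=datetime(2026, 3, 2, 8, 45),
)
report = recovery_report(incident, rto=timedelta(hours=4), rpo=timedelta(hours=1))
print(report)
```

In this assumed incident the 4-hour RTO is met while the 1-hour RPO is breached, which is the case an availability figure alone would not reveal.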

Questions to ask:

  • What is the provider's stated RTO and RPO for each service tier?
  • Are RTO and RPO backed by contractual penalties or only best-effort commitments?
  • How are key shards or HSM-backed credentials protected across recovery scenarios?

What Should Be in the Disaster Recovery Protocol Itself?

A related but distinct question is whether a provider's disaster recovery plan is documented, tested, and auditable, or simply a section of legal boilerplate. SLAs establish what a provider commits to; disaster recovery protocols describe how they will deliver on that commitment when it is tested [ittoolkit.com].

A credible disaster recovery framework for WaaS should include:

  • Geographic redundancy: Data and key material replicated across physically separate locations, not just across racks in a single data center or availability zones in a single region.
  • Failover architecture: Automated switching to standby systems with documented trigger conditions and switchover times.
  • Key management continuity: Clear documentation of how MPC key shards or HSM-backed credentials are preserved, restored, or reconstituted after an infrastructure event.
  • Tested recovery procedures: Evidence that disaster recovery drills are conducted regularly, with documented results available to institutional clients on request.
  • Incident communication protocols: Defined timelines for notifying clients during outages, including escalation paths and named contacts.

Providers that cannot produce evidence of tested recovery procedures should be treated as carrying unquantified operational risk, regardless of what the contract states [riskandresiliencehub.com].
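A documented failover trigger condition is often expressed as a threshold of consecutive failed health checks. The sketch below assumes such a rule purely for illustration: the `FailoverTrigger` class, the threshold of three, and the decision to make failback a manual step are hypothetical choices, not a description of any provider's architecture.

```python
class FailoverTrigger:
    """Switches to standby after `threshold` consecutive failed health checks."""

    def __init__(self, threshold: int = 3):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_failures = 0
        self.active = "primary"

    def record(self, healthy: bool) -> str:
        """Record one health check result and return the active system."""
        if self.active == "standby":
            # In this sketch, failing back to primary is a manual, audited step.
            return self.active
        if healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.active = "standby"
        return self.active


trigger = FailoverTrigger(threshold=3)
checks = [True, False, False, True, False, False, False, True]
states = [trigger.record(result) for result in checks]
print(states)
```

A rule written this precisely gives a procurement team something to test against: the number of failed checks, the check interval, and the resulting switchover time can all be compared with what a disaster recovery drill actually recorded.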

How Do SLA Terms Interact with Regulatory and Compliance Obligations?

Regulated institutions face an additional concern. Banks, licensed payment service providers, and regulated exchanges operate under frameworks that require them to demonstrate operational resilience, not just contractual coverage.

Regulators increasingly scrutinize vendor arrangements. A bank that outsources custody or payment infrastructure to a WaaS provider does not outsource its regulatory accountability. If a provider goes down and client assets are inaccessible, the regulated institution answers to its regulator, not the WaaS provider [ncontracts.com].

This means SLA terms need to be reviewed against the institution's own regulatory obligations, including:

  • Business continuity requirements under applicable financial regulation
  • Data residency and sovereignty rules that affect where recovery infrastructure can be located
  • Audit rights that allow the institution to inspect provider continuity capabilities directly
  • Incident reporting timelines that the provider must support to allow the institution to meet its own disclosure requirements

A provider holding PCI DSS validation, a SOC 2 Type II attestation report, and ISO 27001 certification offers institutional clients independent, audited assurance that operational controls meet recognized standards. These credentials do not replace contractual SLA terms, but they provide documented evidence that a provider's continuity practices have been externally validated.

What Questions Should You Ask Before Signing?

A concise pre-signature checklist for institutional procurement teams:

On uptime:

  • What is the exact uptime commitment per service component, per measurement period?
  • What is excluded from the uptime calculation?
  • What are the financial remedies for SLA breaches, and are they capped?

On recovery:

  • What are the published RTO and RPO for each service tier?
  • Can you provide documented evidence of a disaster recovery test conducted within the last 12 months?
  • How is key material protected during and after a recovery event?

On compliance:

  • Does the provider hold SOC 2 Type II, ISO 27001, or equivalent certifications?
  • Are audit rights contractually available to the institution?
  • How does the provider support your incident notification obligations?

On architecture:

  • Is the infrastructure geographically redundant across separate physical locations?
  • Does the provider operate an on-premise custody option for institutions requiring data sovereignty?
  • What is the provider's track record, and can they show a documented history of critical incidents, and how each was resolved, over a material operating period?
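The checklist question about financial remedies and caps can be made concrete with a small calculation. The sketch below is an assumption-laden illustration: the credit tiers, the 20% cap, the monthly fee, and the `service_credit` helper are all hypothetical and do not reflect any specific provider's contract.

```python
def service_credit(monthly_fee: float, measured_uptime: float, tiers, cap_percent: float) -> float:
    """Credit owed for a month, given tiered credits and a contractual cap.

    `tiers` is a list of (uptime_threshold, credit_percent) pairs; a tier
    applies when measured uptime falls below its threshold, and the largest
    applicable credit is used before the cap is applied.
    """
    credit_percent = max(
        (credit for threshold, credit in tiers if measured_uptime < threshold),
        default=0,
    )
    return monthly_fee * min(credit_percent, cap_percent) / 100


# Hypothetical schedule: below 99.9% -> 10%, below 99.0% -> 25%, below 95.0% -> 50%.
TIERS = [(99.9, 10), (99.0, 25), (95.0, 50)]
MONTHLY_FEE = 20_000  # hypothetical fee, in any currency
CAP_PERCENT = 20      # hypothetical cap on credits, as a percent of the monthly fee

for uptime in (99.95, 99.5, 98.5):
    credit = service_credit(MONTHLY_FEE, uptime, TIERS, CAP_PERCENT)
    print(f"measured uptime {uptime}% -> credit {credit:.2f}")
```

Setting the capped credit beside the institution's own estimate of what the same outage would cost in lost transactions or regulatory exposure shows quickly whether the remedy is meaningful.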

Frequently Asked Questions

What is a realistic uptime SLA for institutional WaaS infrastructure?

For institution-grade WaaS, clients should expect commitments of 99.9% or higher. However, the measurement period, exclusions, and remedies matter as much as the headline figure [computerweekly.com].

Can I negotiate SLA terms in a WaaS contract?

Yes. Institutional clients typically have standing to negotiate measurement periods, remedy caps, audit rights, and incident notification timelines. Treat the initial contract as a starting position.

What happens to my assets if a WaaS provider goes offline?

This depends entirely on the custody architecture. In self-custodial MPC models, key material remains with the client, and recovery does not depend on provider availability. In fully custodial models, asset access may be suspended during outages.

Are SLA penalties sufficient protection against downtime?

Contractual penalties compensate for breaches but do not recover lost transaction revenue, regulatory penalties, or reputational damage. Architecture and track record matter more than penalty clauses [riskandresiliencehub.com].

How often should disaster recovery procedures be tested?

Industry practice points to at least annual testing, with evidence available on request. Critical infrastructure providers should be able to demonstrate more frequent testing.

What certifications should I look for in a WaaS provider?

SOC 2 Type II, ISO 27001, and PCI DSS are the benchmarks for institutional-grade providers. These certifications require independent audits of operational controls.

Does a high uptime SLA mean a provider is secure?

No. Uptime measures availability, not security integrity. A provider can maintain availability while having inadequate controls around key management or transaction authorization. Evaluate security architecture and certifications separately.

About Cregis

Cregis is an enterprise-grade crypto financial infrastructure provider serving 3,500+ institutional clients across 50+ countries. Its infrastructure, built on MPC key management, HSM and TEE integration, and Zero Trust Architecture, is certified under SOC 2 Type II, ISO 27001, and PCI DSS. Cregis operates as infrastructure for the digital asset economy, delivering three core pillars: Secure. Efficient. Compliant. It serves banks, payment service providers, exchanges, and regulated institutions managing digital assets at scale.

If your institution is evaluating WaaS providers and wants to understand how Cregis structures its SLAs, disaster recovery commitments, and security architecture, visit https://www.cregis.com/ to speak with the team directly.


About Cregis

Founded in 2017, Cregis is a global leader in enterprise-grade digital asset infrastructure, providing secure, scalable and efficient management solutions for institutional clients.

Built to solve the challenges of fragmented blockchain systems and asset security risks, Cregis delivers MPC-based self-custody wallets, WaaS solutions, and Payment Engine, featuring collaborative asset control and a compliance-ready ecosystem.

To date, Cregis has served over 3,500 institutional clients globally. Our solutions empower exchanges, fintech platforms, and Web3 enterprises to adopt blockchain technology with confidence. Backed by years of proven expertise in blockchain and security, Cregis helps businesses accelerate their Web3 transformation and unlock global digital asset opportunities.