Towards a Hybrid & Multi-Cloud World — A Database Viewpoint

Background:

Trushar Borse
9 min read · May 31, 2022

Cloud has become the de facto standard for hosting most of the workloads in different industries and geographies across the world.

According to Gartner, global end-user spending on Public Cloud Services is expected to exceed $480 Billion in 2022.

According to another research report, the global market for Cloud Databases and DBaaS is projected to reach $399.5 Billion by 2027, growing at a CAGR of 62% over the period 2020–2027.

The intent of this article is to give a quick overview of Hybrid and Multi-Cloud architectures, with a focus on the homogeneous and heterogeneous database engines that form the database tiers.

It covers, at a very high level, why cloud, why Multi-Cloud architectures are becoming mainstream, and when a Hybrid architecture is also a consideration.

Active-Active and Active-Passive ways of running applications have their own benefits and challenges. I have tried to cover some design tips with respect to Active-Active database tiers.

Towards the end, I have covered the Google Cloud Platform’s managed database services that can readily become part of multi-cloud and even hybrid architectures.

Let’s get going….

Some Basics

Hybrid Cloud: The generally accepted definition of a Hybrid cloud architecture is a setup wherein common and/or integrated workloads are deployed across multiple infrastructure setups, one or more based in the public cloud and at least one being on-premise/private.

Multi-Cloud: This is usually an architectural pattern wherein two or more hyperscalers host the common and/or integrated workloads of a given organisation.

Benefits of Public Cloud over conventional On-Premise

  • Move from Capex to Opex (..as a Service) model
  • Focus on core business rather than infrastructure management
  • Standardisation of IT Architecture
  • Improved Observability and Manageability
  • No scalability worries and freedom from capacity planning exercises
  • Multi-layered security
  • Agility in development and deployment of applications
  • Simplified and automated change management
  • Freedom from vendor and supplier management
  • Leverage the cutting edge technology from Cloud provider and other marketplace partners

Reasons for the rise of Multi-cloud trend

  • De-risk cloud investments: Ensure a successful cloud journey irrespective of a given cloud provider's existence in the long run
  • Avoid any kind of vendor lock-in: Remain independent and de-coupled from proprietary technologies as much as possible
  • Business Continuity Plan: Availability and scale insurance for the customer
  • Mergers and Acquisitions: The company being acquired/merged already runs on another cloud, and joint operations now need to run across both public clouds
  • Business expansions into new markets: Tap the strong points and presence of an existing cloud in new markets
  • Data residency regulations: Not every public cloud is present everywhere. Leverage the presence of the incumbent public cloud in the geography of interest.

A few instances where a Hybrid cloud architecture is a good fit

  • Legacy applications cannot be modernised and moved to cloud but others can
  • Applications with latency sensitivity and tightly coupled to other on-premise components
  • Disaster Recovery for on-premise workloads
  • Cloud Bursting for on-premise workloads meant to handle traffic spikes
  • Improve the availability of primarily on-premise workloads through cloud extension
  • Backups of on-premise workloads for potential cost optimisation
  • Less critical and Dev/Test workloads in the cloud, with more critical ones still on-premise

Topologies to run databases in multi-cloud architectures

[1] Active-Passive: This is an arrangement wherein the database instance in one of the identified public clouds acts as the primary database and hence is responsible for accepting the WRITE traffic. READ traffic can be served by the primary instance itself or, if the design allows for it, by the other instance running in another public cloud.

1a. HOT Standby: The database on the DR/Secondary cloud accepts READ traffic. Replication is usually in ASYNC mode, so there is always the potential for stale reads from the Secondary site database. Plan to send read-your-own-writes traffic to the Primary database itself. There is also an option to keep the application tier active on the secondary cloud, with read-only modules that tolerate limited staleness accessing the database tier on that cloud.

1b. WARM Standby: The database on the DR/Secondary site keeps receiving and applying changes from the Primary database in ASYNC mode, either as transaction logs or through a CDC tool, but might not cater to any READ traffic. This standby database could be running in a down-scaled mode. When required, it can switch to the primary role (Write & Read) and scale up at that time.

1c. COLD Standby: The Secondary site does not run any database instance. It simply keeps receiving backups (base and incremental, or logical ones, as applicable) and a database instance is created when required. This leads to lower operational cost but the worst RPO & RTO of the three options.
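The read-routing advice for a HOT standby can be illustrated with a minimal Python sketch. It is not tied to any specific driver or proxy: the `Router` class, the session identifiers and the `max_lag_seconds` value are all hypothetical, and the lag bound is an assumption rather than something ASYNC replication guarantees.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Router:
    """Decides whether a statement goes to the primary or the hot standby."""

    # Assumed upper bound on replication lag; real ASYNC lag can exceed it.
    max_lag_seconds: float = 5.0
    # session id -> time of that session's most recent write
    _last_write: Dict[str, float] = field(default_factory=dict)

    def record_write(self, session_id: str, now: Optional[float] = None) -> str:
        """All writes go to the primary; remember when the session last wrote."""
        self._last_write[session_id] = time.monotonic() if now is None else now
        return "primary"

    def route_read(self, session_id: str, now: Optional[float] = None) -> str:
        """Send reads that may need the session's own recent write to the primary."""
        now = time.monotonic() if now is None else now
        last = self._last_write.get(session_id)
        if last is not None and now - last < self.max_lag_seconds:
            return "primary"   # read-your-own-writes window still open
        return "standby"       # staleness-tolerant read


router = Router(max_lag_seconds=5.0)
router.record_write("session-a", now=100.0)
print(router.route_read("session-a", now=102.0))  # shortly after the write
print(router.route_read("session-b", now=102.0))  # session that never wrote
```

A time window is only one way to approximate read-your-own-writes. Where the engine exposes a replication position, comparing that position on the standby against the position of the session's last write would be a stricter check.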

[2] Active-Active: This topology of the database tier allows Write + Read traffic to the database instances on both sites. Bi-directional replication copies the transactional data at each site to the other, so that the complete business data eventually becomes available at each site. This replication is also in ASYNC mode (natively or through CDC tools), hence there is an element of delay before data is available in the remote site database. One needs absolute clarity on the potential data conflicts that arise when the same data is modified at the same time on different databases in the Active-Active setup, and on how to handle them whenever they happen.
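As an illustration of conflict handling, below is a minimal Python sketch of one common policy, last-writer-wins, with the site name as a tie-breaker. It is only one of several possible policies and not a recommendation from any specific replication tool; the `RowVersion` fields and the `resolve` function are hypothetical names, and the sketch assumes commit timestamps from both sites are comparable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowVersion:
    key: str
    value: str
    updated_at: float  # commit timestamp at the originating site (assumed comparable)
    site: str          # originating site, e.g. "site1" or "site2"


def resolve(local: Optional[RowVersion], incoming: RowVersion) -> RowVersion:
    """Return the version to keep when a replicated change arrives."""
    if local is None:
        return incoming  # no local copy, so nothing to conflict with
    # Compare (timestamp, site) so both sites break ties the same way
    # and converge on the same winner regardless of arrival order.
    if (incoming.updated_at, incoming.site) > (local.updated_at, local.site):
        return incoming
    return local


at_site1 = RowVersion("order-42", "shipped", updated_at=10.0, site="site1")
at_site2 = RowVersion("order-42", "cancelled", updated_at=12.0, site="site2")

# Each site applies the other's change and ends up with the same row.
print(resolve(at_site1, at_site2) == resolve(at_site2, at_site1))
```

Last-writer-wins silently discards the losing update and is sensitive to clock skew between sites, which is why the application-level design choices matter as much as the resolution policy.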

Tips to avoid design complications in Active-Active:

  • If data is written to the same functional schema/tables of the databases on both sites, the application design should avoid data conflicts as much as possible and, if they still happen, handle them gracefully.
  • Another way is to simplify the Active-Active replication design through approaches like:
      • Use a different nomenclature for each site so as to avoid possible conflicts. E.g. the primary key in the SALES table on the site1 database could be prefixed as site1_<xyz>, or a UUID could be used if feasible.
      • Route module-wise traffic to databases at different sites. E.g. SALES module writes go to the database at Site1 and SERVICE (After Sales) module traffic goes to the database at Site2, while bi-directional replication in async mode keeps running in the backend.

Likewise, there could be other approaches taken at the design phase of application and data model to make the Active-Active design simpler and easier to manage.
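The two simplification approaches above (site-specific keys and module-wise routing) can be sketched in a few lines of Python. The site names, the `MODULE_HOME_SITE` mapping and the function names are hypothetical and exist only to make the idea concrete.

```python
import uuid

# Hypothetical mapping of application modules to the site that owns their writes
MODULE_HOME_SITE = {"SALES": "site1", "SERVICE": "site2"}


def site_prefixed_key(site: str, local_sequence: int) -> str:
    """Build a primary key that cannot collide with keys generated at another site."""
    return f"{site}_{local_sequence}"


def uuid_key() -> str:
    """Alternative: a random UUID needs no coordination between sites."""
    return str(uuid.uuid4())


def write_site_for(module: str) -> str:
    """Return the site whose database should receive writes for a module."""
    try:
        return MODULE_HOME_SITE[module]
    except KeyError:
        raise ValueError(f"no home site configured for module {module!r}")


print(site_prefixed_key("site1", 1001))  # key generated at site1
print(site_prefixed_key("site2", 1001))  # same sequence number, different key
print(write_site_for("SALES"))
print(write_site_for("SERVICE"))
```

With either approach, rows created at different sites never share a primary key, so the bi-directional replication only has to deal with concurrent updates to existing rows, not with colliding inserts.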

Considerations at the level of database engines in multi-cloud

There are other design considerations that one needs to be mindful of when designing a multi-cloud application. The choice of databases can certainly become non-trivial because different public clouds offer different types of database engines and may not offer common ground. Thus, it could become a decision between the best solution from a manageability standpoint versus the best fit from a feature standpoint.

Homogeneous: The underlying core database engine is the same. Features could vary in terms of additional extensions, the platforms where they can run and manageability features. E.g. Cloud SQL for PostgreSQL and AWS RDS for PostgreSQL run on GCP and AWS respectively, but both are based on OSS Postgres.

Heterogeneous: The underlying core database engine implementation is altogether different. E.g. Postgres and MySQL.

Fully managed offerings of open source database engines as PaaS: Data replication between homogeneous databases tends to be more manageable, even for the fully managed databases of different clouds, using native replication capabilities where possible or external CDC tools otherwise. In the case of heterogeneous databases, replication is usually implemented using external tools and the CDC enablement of the respective engines.

Open source databases leveraged as-is by running them on Virtual Machines: This is one of the simplest options for leveraging the native replication capabilities of the database engines in the case of homogeneous engines.

E.g. OSS MySQL running on GCP GCE VM is pretty much the same as that running on AWS EC2 VM.

Marketplace offerings of vendors available on multiple public clouds as fully managed offerings: Should a particular database engine be the right fit from a data model and use-case standpoint, and be available in a similar form factor on the public clouds concerned, it becomes a manageable construct.

E.g. MongoDB Atlas is available on both GCP and AWS and can thus lead to a multi-cloud application with fewer hassles.

Cloud Native databases available as highly available, distributed and fully managed offerings: Such options should also be considered for the simple reason that cost of ownership and manageability become quite palatable compared to managing databases on different public clouds. There are also options to capture data changes in those engines and make them available for downstream consumption so as to avoid lock-in.

E.g. Cloud Spanner is a geo-distributed, scalable, ACID compliant relational database with a PostgreSQL interface and a built-in CDC option called change streams.

Options to replicate data between homogeneous and heterogeneous database engines

  1. Native replication capabilities of the database engines (relevant for some homogeneous source & target combinations)
  2. Cloud Native tools, such as DMS & Datastream in GCP
  3. Commercial tools, such as Striim and Fivetran/HVR
  4. Open source tools, such as Debezium

There could be more than one option available for addressing a given data replication requirement.
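Whichever option is chosen, CDC-based replication boils down to reading an ordered stream of change events from the source and applying them to the target. The Python sketch below illustrates that idea with an in-memory dictionary standing in for the target table. The event shape (`op`, `key`, `row`) is a hypothetical simplification and not the format of any of the tools listed above.

```python
from typing import Any, Dict, Iterable

# Hypothetical event shape: {"op": "insert" | "update" | "delete", "key": ..., "row": {...}}
ChangeEvent = Dict[str, Any]


def apply_changes(target: Dict[str, dict], events: Iterable[ChangeEvent]) -> Dict[str, dict]:
    """Apply change events, in order, to an in-memory stand-in for the target table."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            # Treating both as an upsert keeps a replayed batch idempotent.
            target[key] = dict(event["row"])
        elif op == "delete":
            target.pop(key, None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


events = [
    {"op": "insert", "key": "1", "row": {"status": "new"}},
    {"op": "update", "key": "1", "row": {"status": "paid"}},
    {"op": "insert", "key": "2", "row": {"status": "new"}},
    {"op": "delete", "key": "2"},
]
print(apply_changes({}, events))
```

Real tools add ordering guarantees, checkpointing, schema mapping between engines and error handling on top of this loop, which is where the selection criteria for a replication tool come in.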

The choice of CDC/replication tool is made based on the following dimensions:

  • Cost of Ownership of Data Replication Tool
  • Ease of use, support availability and manageability of replication tool
  • Monitoring and Governance capabilities of replication tool
  • Features related to handling of evolving schemas and DDL/DML
  • Bi-directional replication capabilities

A Google Cloud Perspective of Multi-Cloud from Database Standpoint

Google Cloud Platform (GCP) is a big proponent of Openness and happens to be one of the top contributors to the CNCF Open Source projects.

With that in mind, GCP offers managed open source services operated by its partners that are tightly integrated into Google Cloud.

E.g. MongoDB Atlas, Redis Labs, Neo4j, Datastax Astra DB, Couchbase etc.

This makes it easier for GCP customers to build on open source technologies.

Below is an overview of the different databases that one can choose and operate in a fully managed fashion on GCP as part of a potential multi-cloud database tier.

Cloud Native Transformational database from Google: Cloud Spanner

Spanner supports change streams to propagate inserts, updates and deletes to the downstream ecosystem.

Traditional databases: Cloud SQL

Replication and change data capture tools mentioned in the earlier section can be leveraged with Cloud SQL to become part of a multi-cloud database tier arrangement.

Relational database with a lot of differentiators: AlloyDB for PostgreSQL

Needless to say, databases can certainly be installed on virtual machines or containers/pods in Google cloud and other public clouds to form a multi-cloud database tier with an understanding that it will be a self-managed setup. However, it takes away a considerable chunk of value add that migration to cloud has to offer.

Summary:

Multi-cloud and, to an extent, Hybrid are now mainstream constructs for IT ecosystems. They bring value by providing scale and availability insurance where an organisation implements multi-cloud as a strategy. One also needs to be aware of the challenges that could be faced in building and managing such solutions; I alluded to a few while discussing the database tier design. A well thought out system will deliver these benefits by remaining performant, resilient and secure.

Disclaimer: All the views presented in this article are my own, expressed in my individual capacity, and in no way reflect the views of my employer or of any other organisation or individual.
