Learn about the challenges and opportunities in building a hybrid or multicloud solution with Google Cloud
Companies are increasingly choosing hybrid and multicloud solutions that use Google Cloud. Some are startups actively building their client base by meeting customers where they are; others are large enterprises looking for the best tool for each problem, aiming for high availability or diversifying their providers to reduce risk. Before setting out to build a hybrid or multicloud solution (usually part of a larger application modernization process), companies should know the potential challenges and opportunities.
In this post, I’ll dive into these topics to help shed light on who is a suitable candidate for this option, the challenges they should be aware of, the available solutions and the situations in which this option should be avoided (at least for the near future).
Definitions
Let’s start with some definitions:
- Public cloud – on-demand computing services and infrastructure are managed by a third-party provider and shared with multiple organizations via the public Internet
- Private cloud – infrastructure is dedicated to a single user organization; can be hosted either at an organization’s own data center or at a third-party colocation facility
- Hybrid cloud – combines on-premises, private cloud and third-party public cloud services and orchestrates between the platforms
- Multicloud – combines multiple cloud computing and storage services in a single heterogeneous architecture; also refers to the distribution of cloud assets, software, applications, etc. across several cloud-hosting environments
- Legacy application – information system possibly based on outdated technologies but critical to day-to-day operations; usually three-tier monolithic applications, fragile and hard to maintain and upgrade
- Modern application – usually an app comprising microservices in an N-tier architecture. It is as reliable as necessary and can be upgraded multiple times a day without impacting production (a broader definition is the Twelve-Factor App).
Use cases
This is not a comprehensive list:
- An intermediate stage when moving IT infrastructure that is no more than moderately difficult to migrate: For example, a customer has a significant number of on-prem servers (usually Linux or Windows). A connection is set up over VPN or Interconnect, and groups of VMs (split along application workload boundaries) are migrated to the cloud. Until all workloads are migrated to the public cloud, the company's customers are served by a hybrid solution, with some applications running on-prem and some in the cloud (Hybrid cloud).
- Modernizing a solution whose applications and databases are very difficult to move to the cloud: For example, a customer has a mainframe or an Oracle DB (the latter is no longer as difficult to move) in their private-cloud environment and would like to move the application to a modern architecture such as Kubernetes (Hybrid cloud).
- Regulatory requirements: For example, data regionality requirements might mean certain pieces of information are not allowed to leave the country’s border.
- Cloud bursting: In this scenario, additional capacity is automatically provisioned in the cloud and scaled up when the on-prem environment is overloaded. This might happen in retail during Black Friday/Cyber Monday events (Hybrid cloud).
- High availability and disaster recovery (Hybrid-cloud and multicloud)
- Using the right cloud for the right use case: For example, having Kubernetes and data analytics environments in Google Cloud Platform (GCP) and Microsoft-backed applications on Azure.
- Avoiding vendor lock-in
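To make the cloud-bursting use case above more concrete, here is a minimal sketch of the decision behind it: keep traffic on-prem up to a chosen utilization limit and send the overflow to cloud instances. Everything in it is a hypothetical illustration rather than part of any Google Cloud API: the `BurstPlan` type, the `plan_burst` function, the `headroom` policy knob and the numbers are all assumptions. A real setup would feed such a decision from monitoring data and act on it through an autoscaler.

```python
import math
from dataclasses import dataclass


@dataclass
class BurstPlan:
    on_prem_load: float    # requests/sec served on-prem
    cloud_load: float      # requests/sec overflowed to the cloud
    cloud_instances: int   # cloud instances needed to absorb the overflow


def plan_burst(demand_rps: float,
               on_prem_capacity_rps: float,
               per_instance_rps: float,
               headroom: float = 0.8) -> BurstPlan:
    """Split demand between on-prem and cloud capacity.

    headroom is the fraction of on-prem capacity we allow ourselves to use
    before bursting (a hypothetical policy setting, not a vendor default).
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    usable = on_prem_capacity_rps * headroom
    on_prem = min(demand_rps, usable)
    overflow = max(0.0, demand_rps - usable)
    # Round up: a partially used instance still has to be provisioned.
    instances = math.ceil(overflow / per_instance_rps)
    return BurstPlan(on_prem, overflow, instances)


# A normal day fits on-prem; a Black Friday-style spike bursts to the cloud.
normal_day = plan_burst(demand_rps=500, on_prem_capacity_rps=1000, per_instance_rps=250)
peak_day = plan_burst(demand_rps=2000, on_prem_capacity_rps=1000, per_instance_rps=250)
```

The `headroom` parameter reflects a design choice: bursting before the on-prem environment is fully saturated leaves time for cloud capacity to come online.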
Tech stack and required capabilities
Now let’s dig into the tech stack and capabilities you need to have/build and the challenges this could impose on companies in various stages of their journey:
Compute:
- Understanding the private cloud environment – VMs and the various technologies at the hypervisor level: For companies that currently run in a private cloud, this is obviously not an issue. For digital-native companies that have never had to deal with the tech stack and the politics of implementing a cloud solution in a private cloud environment, this could be a significant challenge.
- Understanding the public cloud environment: This is not an issue for digital-native companies, but it can present significant security and internal political issues for companies currently running on-premises. They will need to learn a new stack with new ways of managing VMs, autoscaling, etc.
Databases and Storage:
- For companies currently in the private cloud, the challenge is to learn about and select a relevant cloud-native database. The easy option is to move to managed SQL Server, PostgreSQL or MySQL, but, depending on the business and technical goals, some companies might want or need to move to technologies like Spanner, Bigtable and BigQuery. This could be challenging.
- Riding two tech stacks at the same time can be difficult: A company that wants to move an application backed by an on-prem Oracle database to PostgreSQL has two options, depending on the complexity and the features currently in use. The first is a lift and shift to a bare-metal option followed by modernization. The second is to leave the database in place, modernize in the private cloud and then migrate to the cloud. In either scenario, the company will need to operate two types of DBs simultaneously until the modernization is complete and one of the DBs can be turned off.
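While two databases run side by side, one recurring task is checking that they still hold the same data. The following is a minimal sketch of such a parity check, using two in-memory SQLite databases as stand-ins for the legacy and the modernized database. The helper names (`table_fingerprint`, `in_sync`, `make_demo_db`) and the `orders` table are hypothetical. A real Oracle-to-PostgreSQL comparison would also need to normalize types such as decimals, dates and NULL handling before hashing, which this sketch does not attempt.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) of a table read in key order."""
    # Identifiers are interpolated directly, so pass only trusted names here.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def in_sync(source, target, table, key_column):
    """True when both databases hold identical rows for the given table."""
    return (table_fingerprint(source, table, key_column)
            == table_fingerprint(target, table, key_column))


def make_demo_db(rows):
    """Build an in-memory stand-in database with a small orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


legacy = make_demo_db([(1, 9.5), (2, 20.0)])
modern = make_demo_db([(1, 9.5), (2, 20.0)])
before_drift = in_sync(legacy, modern, "orders", "id")

# Simulate drift: a write that reached only one of the two databases.
modern.execute("UPDATE orders SET total = 21.0 WHERE id = 2")
after_drift = in_sync(legacy, modern, "orders", "id")
```

Comparing a row count plus a digest keeps the check cheap to transfer between environments, at the cost of not telling you which row differs.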
Networking:
- There are significant differences between networking in the public and private clouds, and even among the various public clouds. For example, the global VPC on Google Cloud is a massive benefit for simplifying your networking. Many private-cloud companies need to learn how these networking models work. They also have a hard time consciously deciding not to replicate their private-cloud practices in the public cloud.
- Setting up VPNs or Cloud Interconnect connections (including Partner Interconnect) can be simple in some use cases, but it can take weeks or months, depending on the location of the customer’s private cloud.
Security:
- Implementing unified, production-ready security measures across hybrid and multicloud environments can get tricky. Each cloud vendor has different security measures implemented in different ways. How you deploy software and test it for security vulnerabilities during the deployment process could be a project in itself.
- Implementing security monitoring systems across multiple clouds is a challenge.
CI/CD:
- Building a reliable Continuous Integration and Continuous Deployment (CI/CD) pipeline to build, test, secure and deploy both infrastructure (Infrastructure as Code, or IaC, with tools like Terraform or Pulumi) and applications across multiple environments is a massive challenge. It also requires highly sophisticated, capable engineering teams and tooling.
Modernization:
- Deciding what and how to modernize applications and DBs is no small task and requires significant assessment and planning phases. It can take months or even years, as well as significant organizational change management, for private-cloud customers to learn K8s and the best way to work with the system.
Authentication:
- The challenge is handling SSO Federation, AD Federation, etc. across multiple platforms through a single source of truth.
Team:
- It's hard to find engineers/architects skilled on multiple platforms, so you need to invest in training your teams.
- The alternative is learning and staying up to speed on multiple environments in a rapidly changing cloud world, while building separate teams with different skill sets and trying to figure out a business process for them to work together efficiently.
Cost:
- As with setting up a multi-region, high-availability (HA) environment, setting up a multicloud or hybrid cloud solution can be up to ten times more expensive than running in a single environment. It is not just about setting up another VM or database and paying extra for that: Day 2 operations, including orchestration, replication, network egress cost, and supporting the people and mechanisms for these systems, all add up.
- Companies need to think about FinOps and building a cultural practice in which teams manage their cloud costs, everyone takes ownership of their cloud usage, a central best-practices group provides support and a 360° view of hybrid/multicloud cost emerges.
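As a small illustration of the single cost view described above, the sketch below rolls up cost records from several environments by provider, team or category. The `CostRecord` shape, the `rollup` helper and all figures are hypothetical and only illustrative. In practice the records would come from each provider's billing export and from internal accounting for on-prem resources.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CostRecord:
    provider: str   # e.g. "gcp", "aws", "on-prem"
    team: str       # owning team, so usage has a clear owner
    category: str   # e.g. "compute", "egress", "replication"
    usd: float


def rollup(records, key):
    """Sum cost in USD grouped by one CostRecord field name."""
    totals = defaultdict(float)
    for record in records:
        totals[getattr(record, key)] += record.usd
    return dict(totals)


# Illustrative numbers only.
records = [
    CostRecord("gcp", "payments", "compute", 100.0),
    CostRecord("gcp", "payments", "egress", 40.0),
    CostRecord("aws", "search", "compute", 80.0),
    CostRecord("on-prem", "search", "replication", 25.0),
]

by_provider = rollup(records, "provider")
by_team = rollup(records, "team")
by_category = rollup(records, "category")
```

Grouping the same records along different fields is what lets individual teams see their own usage while a central group sees the cross-environment total, including Day 2 items such as egress and replication.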
Defining a solution
At this point, you’re probably thinking that multicloud and hybrid cloud are really difficult, but because this is the direction the industry is heading, you need to find a solution.
It’s important to accept that these types of projects take time and to work according to the following methodology:
- Assess your company culture, your DevOps practices, your tech stack, etc.
- Plan based on your company's business and technological goals and the results of the assessment. Understand that both your company and your customers face significant learning curves when you change your system. This could take time, and you need to allow people to adapt to the new working methodologies and technology.
- Migrate/refactor: Start slow and small, allow people to fail fast, learn from the failure and improve. If you start slow, the speed will increase incrementally. Don’t try to optimize while changing platform, codebase, etc.
- Optimize once the initial change has been made and you have a new working system. Work on optimization both from a performance and cost perspective.
Building the optimum hybrid tech stack
None of the technology solutions on the market will solve everything. You need to evaluate your stack and combine solutions to create a complete hybrid tech stack.
Application layer:
GCP Anthos is a great solution that encompasses multiple parts of what you need to operate the code/application layer of your solution. It includes the following layers, which allow you to orchestrate and run Kubernetes clusters and even VMs (in private preview) on GCP, AWS, Azure, private cloud and bare metal.
- Anthos clusters, the managed Kubernetes layer at the bottom of the stack, make Day 2 operations much easier than they are when running your own open source Kubernetes clusters.
- Anthos Ingress is a cloud-hosted multi-cluster Ingress controller for GKE clusters.
- Anthos Service Mesh is a suite of tools that helps you monitor and manage a reliable service mesh on-premises or on Google Cloud.
- Anthos Config Management is a service for configuration and policy management that combines three components: Policy Controller, Config Sync and Config Controller. Together, these components enable Anthos Config Management to continuously protect and configure your Google Cloud and Kubernetes resources.
- Anthos fleets (formerly known as environs) are a Google Cloud concept for logically organizing clusters and other resources, letting you use and manage multi-cluster capabilities and apply consistent policies across your systems.
Limitations: It is important to note that the service is limited to controlling and monitoring only the parts of your solution that run within the Anthos environment.
Database and storage layer:
- If you operate your own storage and DB layers outside of the scope of the cluster, you may be able to leverage managed DB services like Cloud SQL or Amazon RDS.
- If you operate a DB within your Kubernetes cluster (e.g., MySQL or MongoDB), you add a significant amount of operational overhead compared with using the managed versions from the cloud provider.
Security + monitoring:
- SIEM + monitoring: Look for tools that provide a multicloud/hybrid cloud solution, such as Splunk, Datadog or PagerDuty.
- Security & Secrets Management: Examples include HashiCorp Vault and AWS Secrets Manager.
CI/CD:
- Solutions on the market, including GitLab and Jenkins, work across different environments and allow you to build and deploy both containers and VMs.
- If you don’t need to build and deploy VMs, look at Spinnaker.
Infrastructure Provisioning:
- Options include HashiCorp Terraform, Red Hat Ansible and Pulumi.
Wrapping up:
- Building a hybrid/multicloud solution is very difficult and very expensive, both from a people and process perspective and from a technological perspective.
- Technologies such as GCP Anthos support a multicloud solution, but no single technology provides an end-to-end solution. This needs to be taken into consideration when designing your solution architecture.
- This process will be easier for companies already running Kubernetes in production at scale, whether it is on-prem or in the cloud.
- This process will be most difficult for:
  - Companies running legacy monolithic applications on-prem. Building a hybrid solution and the transformation required for business processes, people enablement and change in the tech stack can take anywhere between two and ten years, depending on the complexity and scale of the change.
  - Startups with small teams: Kubernetes is interesting and exciting, and your startup wants to serve every customer in every environment, but resources are limited, and you need to have a functioning product with the lowest possible operational and cost overhead. Based on the inherent challenges, focus first on having a viable product (keeping workflow portability in mind) and client base on a single cloud using as many fully managed services as possible. Once your team is big enough and the solution is sufficiently stable, begin exploring how to refactor and operate a multicloud solution.
This is where the market is going, and companies who undergo this process see a significant ROI. However, when embarking on such a journey, companies should temper their expectations with an understanding of the scale of the challenge.
If you want to hear more on this topic and also see a demo of how Anthos works in a multicloud environment, please sign up for our event on February 22, 2022, at 1 p.m. GMT.
Feel free to contact us if you have any questions or would like to discuss your journey to a modern application in a hybrid or multicloud environment.