Connection Pooling for Databases
In this article, I intend to provide a high level overview about connection management in databases, quick comparison of implementation approaches for connection pooling and towards the end provide an example of connection pooling in Cloud SQL MySQL in Google Cloud Platform (GCP).
Applications these days tend to become busy with user traffic in unpredictable ways. While this is good from a business standpoint, where it can really get challenging is how to keep it running and available for all the increased set of users and that too with acceptable levels of performance.
Modern architectural patterns with microservices and containerisation allow for the application stack to scale horizontally and thus keep catering to increased volume but how do we ensure that the backend database is also equipped to handle the connections?
That’s where we need to talk about handling the application to database tier connections in a wise manner. Let’s talk about Connection Pooling.
What & Why of Connection Pooling:
Each time a client attempts to access a backend service, it requires OS resources to create, maintain, and close connections to the datastore. This creates a large amount of overhead causing database performance to deteriorate.
A connection pool is a cache of database connections that are shared and reused to improve connection latency and performance. When your application needs a database connection, it borrows one from its pool temporarily; when the application is finished with the connection, it returns the connection to the pool, where it can be reused the next time the application needs a database connection.
Connection pooling mechanism lets the database scale effectively as the number of clients accessing it grows.
Different Implementations of Connection Pools with Pros and Cons:
In general, the concept of connection pooling can be applied at multiple different levels.
- (Application Programming) Language Level (e.g. JDBC for Java)
- Client Layer
- Middleware (e.g. pgbouncer for Postgres, ProxySQL for MySQL)
There are pros and cons associated with the aforementioned connection pool implementations.
But as you will realize, Middleware based implementation of connection pooling works out best in most scenarios and is hence widely used. The potential shortcomings of it like single point of failure etc. can be easily overcome. Similarly, connections can be secured to address the security aspect.
Implementation & Best Practices:
Option A: Implement connection pooling in the application tier
Major programming languages (Java, Python, C#, Go, Node.js, PHP, Ruby) have utilities that can be leveraged to achieve the below best practices. Refer the Google Cloud Github repository for samples at Google Cloud Platform · GitHub. Herewith, providing small code snippets for Java language.
- Opening and closing connections
When you use a connection pool, you must open and close connections properly, so that your connections are always returned to the pool when you are done with them. Unreturned or “leaked” connections are not reused, which wastes resources and can cause performance bottlenecks for your application.
- Exponential backoff
If an application attempts to connect to the database and does not succeed, the database could be temporarily unavailable and sending too many simultaneous connection requests might waste additional database resources and increase the time needed to recover. Using exponential backoff prevents your application from sending an unresponsive number of connection requests when it can’t connect to the database.
- Connection timeout
There are many reasons why a connection attempt might not succeed. Network communication is never guaranteed, and the database might be temporarily unable to respond. Ensure the application handles broken or unsuccessful connections gracefully.
- Connection duration
Limiting a connection’s lifetime can help prevent abandoned connections from accumulating. One can use the connection pool to limit connection lifetimes.
Option B: MIDDLEWARE — a dedicated connection management layer between application and database
NOTE: Cloud SQL MySQL is based on Open Source MySQL and supports versions 5.6, 5.7 & 8.0 at the time of writing this article. Read more about GCP Cloud SQL MySQL here.
While there are quite a few connection pooling utilities available from different sources including commercial ones, I would take the example of ProxySQL which is battle-tested and quite reliable. The usual alternate option is HAProxy.
Some outstanding features of ProxySQL are-
- Efficient Connection Pooling/Multiplexing
- Ability to be deployed on VM, BM, containerized
- Very low consumption of resources
- Highly configurable in terms of traffic routing and also its deployment for High Availability
Major Benefits of ProxySQL Connection Pooling:
- Scale Insurance
- Horizontal scalability of application
- Optimize sizing of Cloud SQL MySQL instance by tuning the memory requirements
Please see the resources in the references section below to get more detailed information for implementation.
Implementing connection pooling makes a vast difference in the scalability, manageability and control of application connectivity towards database. We took an example of GCP Cloud SQL MySQL database in this article. One would prefer middleware kind of connection pooling for the benefits it provides for the latest patterns of application based on microservices and horizontal scaling requirements.
Horizontally scale a MySQL database backend with Cloud SQL and ProxySQL | Google Cloud Platform…
Chen Yu | Developer Programs Engineer | Google Contributed by Google employees. This tutorial explains how to…
Disclaimer: All the views presented in this article are my own in individual capacity.