YugaByte – a new world of SQL distributed databases
At Pebble we are always on the lookout for something that is exciting and will meet demanding customer requirements. For the past couple of months we have been working on something that we think is worth sharing.
Pebble IT started in Australia concentrating on the Oracle database. For the past ten years we have also included Microsoft’s SQL Server in the mix. Along the way we have encountered Sybase, DB2, PostgreSQL, MySQL and, most recently, the Korean database Tibero. All of these are relational databases that have SQL APIs to process and/or query data.
Some of these are capable of clustering for high availability. Oracle refers to this as RAC (Real Application Clusters), Tibero as TAC (Tibero Active Clusters), and Microsoft SQL Server and PostgreSQL can be clustered as well. Clustering enables high availability but, to be clear, it is quite different from a distributed database.
I first became aware of distributed databases when I witnessed a demonstration of FoundationDB in 2013. It showed true distributed ACID transactions continuing while nodes were turned on and off, which made the distributed nature of this new database architecture visible. It was amazing to watch. What I believe is powerful about distributed databases is that they bring the database close to users across multiple locations, which reflects a common reality, and they potentially remove the need for backup and hence restore.
Whilst the concept of distributed databases has been known and demonstrated for many years, they often come in the guise of a NoSQL database, as FoundationDB did. What has piqued our interest is a product titled ‘YugaByte DB’ by YugaByte, a private company from California, USA. YugaByte offers a powerful SQL API with built-in SQL stored procedures and functions, combined with the fault tolerance of a distributed database architecture. Add in a raft of other leading-edge capabilities that are too numerous to discuss here, and you have a low-latency distributed data replication capability that enables the distributed database to operate with multiple nodes across the public internet. This combination is a game changer for business.
Distributed + SQL + Low Latency = WOW. Returning to what we normally work with today, Oracle and SQL Server, their offerings seem to pale in comparison. Oracle 12c has a feature titled ‘Far Sync’ that is part of the Active Data Guard feature set. However, it does not enable distributed computing, as you cannot transact on the Far Sync destination. It merely allows a standby replica that can be seconds away from the primary database. Whilst this can be used with RAC, it is not a distributed database solution.
Oracle’s GoldenGate allows transactions to be performed on multiple different active databases. Oracle refers to this as bi-directional replication as opposed to enabling truly distributed databases. The difference is the replication protocols and their underlying robustness and capability. Microsoft SQL Server is also not capable of being a truly distributed database. The YugaByte offering enables new architectures and hence new capabilities that should be explored by companies of all sizes.
Below is a diagram of an example architecture. I will explain it and then discuss the benefits that this architecture could realise.
This example architecture allows NSW users to transact and/or read from the database in their Sydney data centre. The Sydney DC has 3 nodes, so if one of the nodes were to become unavailable, there would be no impact on the users. This is a high availability aspect similar to what you can experience with Oracle RAC, Tibero TAC, SQL Server clustering and PostgreSQL clustering.
Users in the Melbourne data centre have the same architecture and hence the same high availability. What is different here is that any transaction (insert, update or delete) is then present in all other nodes, so an update in Melbourne would be reflected in the Sydney data within milliseconds. The same applies to data operations in Sydney flowing through to Melbourne.
Then we have two separate read replicas. Users, applications or data warehouses potentially could read from these systems. They too are kept in sync with all operations.
Note that the diagram above is simplified for readability. It does not suggest that a transaction within the Melbourne DC needs to route via the Sydney DC to update the Google cloud read replica in Sydney. This is just a simple representation within my diagram.
Let’s throw some scenarios at this architecture:
Scenario: A node in the Sydney DC becomes unavailable due to server failure or a human error.
Impact: To users – no impact. To data – no impact. The remaining nodes serve the NSW users.
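To make the arithmetic behind this first scenario concrete, here is a minimal Python sketch. It assumes a simple majority quorum over the three Sydney nodes, which is our simplification for illustration. The function names are our own and are not part of any YugaByte API.

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    return replicas - majority_quorum(replicas)


def can_commit(replicas: int, failed: int) -> bool:
    """True if the surviving replicas can still form a majority quorum."""
    return replicas - failed >= majority_quorum(replicas)


if __name__ == "__main__":
    nodes = 3  # the three Sydney nodes in the example architecture
    print("quorum size:", majority_quorum(nodes))
    print("failures tolerated:", tolerated_failures(nodes))
    print("one node down, can still commit:", can_commit(nodes, 1))
```

Under that assumption, three nodes need two to agree, so losing a single node leaves the remaining two able to keep serving users.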
Scenario: The Sydney DC experiences a major network failure and all network traffic in and out is stopped.
Impact: To users – the company’s failover policies need to be invoked to re-route application traffic for NSW users to the Melbourne DC. To data – none, due to the distributed nature of the database. If a quorum was formed for a transaction in Sydney but that replication never reached the Melbourne DC, then it is possible that a transaction will be missing in Melbourne. This is unlikely, and when the Sydney nodes come back online, that transaction will be shared outside of Sydney, all new transactions from Melbourne will flow into Sydney, and the entire system will automatically re-synchronise.
Scenario: Both the Sydney & Melbourne Data Centres are destroyed with all physical servers lost.
Impact: To users – they will not be able to perform any transactions. To data – either no impact or an extremely small chance of losing a transaction that achieved a quorum but did not yet reach the read replicas. Using the read replicas as a source, the organisation would be able to build a new universe of distributed databases on the public cloud. Assuming that the application servers were also available in the public cloud, the business would then be able to resume transacting.
As you can see, the resilience that is built into this architecture is what delivers the game-changing capabilities that will make such a positive difference to your organisation.
This capability is accessible to you. The perception of high-end performance and capabilities has previously been dictated by Oracle, and these are often out of reach of many organisations due to the large sums required. That is not the case with YugaByte. Similar to Oracle, licensing is by core. What is different is that it is an all-inclusive subscription model: there are no additional options such as RAC, Partitioning, Advanced Compression or Advanced Security, because it’s all part of the base product of the Enterprise Edition of YugaByte DB. At around AUD 2k per core per year, we are talking about a very affordable proposition. As an example, 3 nodes of 3 cores each, replicated across 2 different data centres, would total 18 cores at approximately AUD 36k per year. No capex, just subscription opex. By comparison, Oracle’s RAC is approximately AUD 16k per core for the licence and in excess of AUD 3k per year in maintenance per physical core, assuming Intel Xeons for Oracle’s licensing purposes. GoldenGate is another 75% of these costs.
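The pricing arithmetic above can be reproduced with a short Python sketch. All figures are the approximate AUD amounts quoted in this article, not official price lists, and the constant and function names are our own for illustration. The Oracle maintenance figure is quoted as "in excess of" 3k, so the result is a lower bound.

```python
# Approximate AUD figures quoted in this article (assumptions, not price lists).
YUGABYTE_PER_CORE_PER_YEAR = 2_000
ORACLE_RAC_LICENCE_PER_CORE = 16_000
ORACLE_MAINTENANCE_PER_CORE_PER_YEAR = 3_000  # "in excess of", so a lower bound
GOLDENGATE_UPLIFT_PERCENT = 75


def total_cores(nodes_per_dc: int, cores_per_node: int, data_centres: int) -> int:
    """Total cores across every node in every data centre."""
    return nodes_per_dc * cores_per_node * data_centres


def yugabyte_annual_cost(cores: int) -> int:
    """All-inclusive yearly subscription for the given core count."""
    return cores * YUGABYTE_PER_CORE_PER_YEAR


def with_goldengate(cost: int) -> int:
    """Add the article's 'another 75%' GoldenGate uplift to a cost."""
    return cost + cost * GOLDENGATE_UPLIFT_PERCENT // 100


def oracle_costs(cores: int) -> tuple[int, int]:
    """Return (upfront licence, yearly maintenance) for RAC plus GoldenGate."""
    licence = with_goldengate(cores * ORACLE_RAC_LICENCE_PER_CORE)
    maintenance = with_goldengate(cores * ORACLE_MAINTENANCE_PER_CORE_PER_YEAR)
    return licence, maintenance


if __name__ == "__main__":
    cores = total_cores(nodes_per_dc=3, cores_per_node=3, data_centres=2)
    print("cores:", cores)
    print("YugaByte per year (AUD):", yugabyte_annual_cost(cores))
    print("Oracle licence, maintenance per year (AUD):", oracle_costs(cores))
```

Changing the node, core or data centre counts gives a quick first estimate for other layouts, on the same assumed per-core figures.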
There are other differences too. YugaByte DB also has a free open source edition. It is quite feasible that all development, POCs and testing are done on the open source “Community Edition” while you run production workloads on the “Enterprise Edition”.
Something else to think about. What is the role of backup in the world of distributed databases? Good question, and that is something we are still exploring. The read replica is in effect a backup of the latest data. However, this does not address rolling back to an earlier point in time. Snapshots are unlikely to be granular enough, so if there is a definite requirement to roll back to earlier times, then a sophisticated backup solution that targets a single node would be required. However, we would challenge the need to do this and ask for a justification of why it would be required. Data corruption used to be such a justification, but we believe that it no longer needs to be addressed in this situation. Bad transactions can be reversed with SQL. The role of backup is being challenged by distributed databases with the right architecture.
Do you currently have to license standby/DR/failover databases too? Data protection can be very expensive.
What would you typically run on YugaByte? Any SQL workload that would benefit from the distributed nature. Having nodes spread across cities, across data centres within a city and across the public cloud can apply to department-level applications as well as global trading platforms. The pricing model supports small use cases as well as large.
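As a rough illustration of what reversing a bad transaction with SQL can look like, here is a minimal Python sketch of a compensating transaction. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs anywhere; it does not connect to YugaByte DB, and the accounts table and amounts are hypothetical.

```python
import sqlite3


def reverse_bad_transfer() -> dict[int, int]:
    """Apply a mistaken transfer, then undo it with a compensating transaction."""
    conn = sqlite3.connect(":memory:")  # stand-in database, not YugaByte DB
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()

    # The bad transaction: 30 is moved from account 1 to account 2 by mistake.
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")

    # The compensating transaction: the same amounts applied in reverse.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 2")

    balances = dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
    conn.close()
    return balances


if __name__ == "__main__":
    print(reverse_bad_transfer())
```

The point of the sketch is that the correction is itself an ordinary transaction, so it would replicate like any other write rather than requiring a restore from backup.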
At Pebble IT, we are referring to this as ‘the New World’. If you want to be part of this new world, reach out to us and we can work together to understand how we can revolutionise your data platform.