In today’s technology-driven world, databases play a crucial role in storing and managing vast amounts of data. However, not all databases are created equal when it comes to performance.
Our new Credenza project requires a reliable, fast, and economical big data storage system, because it is planned to store the transaction metadata of millions (if not billions) of e-commerce users. Conventional relational databases run into a number of problems at this scale, and a RAM cache (such as Redis) is impractical because of the sheer volume of data. That is why we decided to develop a specially optimized data model and an engine for working with it: TreeTalk Shaped Data Modeling (SDM) Technology.
In this article, we will explore the results of a performance test conducted on two different databases to test the potential of our idea.
By understanding the strengths and weaknesses of each database, you can make an informed decision when choosing the right option for your specific needs.
1. Test Setup and Methodology
Before diving into the test results, it is essential to establish the test setup and methodology used. A standardized approach was followed to ensure fairness and accuracy. The following aspects were considered:
a. Hardware and Software Configuration: Both databases were tested on the same hardware (a PC with an Intel Core i3-7020U CPU @ 2.30 GHz, a 1 TB 5400 rpm HDD, and Windows 10) to eliminate any disparities caused by variations in resources. The software versions were fixed for all runs: TreeTalk SDM 0.2 and MySQL Server 8.0.
b. Workload Generation: A realistic workload was generated, resembling the anticipated usage patterns. The tests write and then read, as fast as possible, specially generated 11-digit phone numbers, each with its corresponding metadata stored as a 24-byte array.
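The workload described in item b can be sketched roughly as follows. This is a minimal illustration, not the generator used in the test: the function names `generate_records` and `timed`, the fixed seed, and the choice of random bytes for the metadata are all assumptions. Only the 11-digit phone numbers and the 24-byte metadata size come from the description above.

```python
import random
import time

PHONE_DIGITS = 11     # from the article: 11-digit phone numbers
METADATA_BYTES = 24   # from the article: metadata stored as a 24-byte array


def generate_records(count, seed=0):
    """Return `count` (phone, metadata) pairs with unique phone numbers.

    Hypothetical helper: a fixed seed lets both engines receive identical data.
    """
    rng = random.Random(seed)
    low, high = 10 ** (PHONE_DIGITS - 1), 10 ** PHONE_DIGITS
    # Sampling from a range yields unique 11-digit integers without
    # materializing the whole range in memory.
    phones = rng.sample(range(low, high), count)
    return [
        (
            str(phone),
            rng.getrandbits(8 * METADATA_BYTES).to_bytes(METADATA_BYTES, "big"),
        )
        for phone in phones
    ]


def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


records, elapsed = timed(generate_records, 1000)
print(f"generated {len(records)} records in {elapsed:.4f} s")
```

The same `timed` wrapper could be put around whatever insert and lookup calls each engine exposes, so that both are measured with the same clock.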
2. TreeTalk SDM Performance Test Results
Let’s start by examining the performance test results for our own development. TreeTalk SDM is designed to excel at handling large datasets and complex queries. Here are some key findings:
a. Throughput: TreeTalk SDM demonstrated impressive throughput, allowing it to handle a significant number of transactions per second (TPS). Inserting new records was two to three times faster than in MySQL. As the graph shows, it took less than 2 seconds to create 1,000 new records.
b. Latency: TreeTalk SDM showed low latency, ensuring fast response times for individual queries. This characteristic makes it suitable for applications that require real-time data retrieval. It took our system only 0.005 seconds to find the 10,000th (last) item written to the non-indexed database, while MySQL took about five times as long (0.027 s).
c. Scalability: TreeTalk SDM scaled well, handling an increased workload without a significant loss of performance. As the data volume grew, its response times stayed nearly constant. With a tenfold increase in the number of records, the average TreeTalk SDM lookup time (0.005 s) increased by only 1 ms (about 20%), while MySQL took five times longer than before (0.027 s).
d. Disk consumption: Here TreeTalk SDM showed its full potential. It took only 273 KB to store 10,000 records, while the MySQL DATA folder took 209,408 KB, that is, roughly 770 times more.
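The ratios behind items a, b, and d can be recomputed directly from the reported figures. The snippet below is only a sanity check; the variable names are illustrative, and the numbers are copied from the findings above.

```python
# Figures quoted in the findings above
sdm_lookup_s, mysql_lookup_s = 0.005, 0.027   # lookup of the last item
sdm_disk_kb, mysql_disk_kb = 273, 209408      # disk usage for 10,000 records
insert_count, insert_seconds_max = 1000, 2.0  # "less than 2 seconds" for 1,000 inserts

lookup_ratio = mysql_lookup_s / sdm_lookup_s
disk_ratio = mysql_disk_kb / sdm_disk_kb
# "Less than 2 seconds" gives a lower bound on the insert rate.
min_insert_rate = insert_count / insert_seconds_max

print(f"lookup: MySQL / SDM = {lookup_ratio:.1f}x")
print(f"disk:   MySQL / SDM = {disk_ratio:.0f}x")
print(f"insert: at least {min_insert_rate:.0f} records/s")
```

With these inputs the lookup ratio comes out at about 5.4 and the disk ratio at about 767.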
Suppose we need to store the phone numbers and data of 1 billion customers. At the measured rate of about 28 bytes per record, and assuming storage grows linearly, the TreeTalk Shaped Data Model (SDM) needs only about 30 GB of hard disk space. That’s a couple of 4K UHD movies. Even the least expensive servers have this much disk space, including those with faster SSD drives.
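The estimate for 1 billion customers follows from a simple linear extrapolation of the 10,000-record measurement. The sketch below makes the arithmetic explicit. It assumes that the reported "KB" are 1024-byte units and that per-record cost stays constant as the dataset grows, which a 10,000-record test does not by itself establish. The helper names are hypothetical.

```python
KIB = 1024  # assumption: the reported "KB" are 1024-byte units


def bytes_per_record(total_kb, records):
    """Average on-disk bytes per record for a measured dataset."""
    return total_kb * KIB / records


def projected_gb(total_kb, records, target_records):
    """Linear extrapolation to target_records, in decimal GB (10**9 bytes)."""
    return bytes_per_record(total_kb, records) * target_records / 10**9


sdm_bytes = bytes_per_record(273, 10_000)
sdm_gb = projected_gb(273, 10_000, 1_000_000_000)
print(f"TreeTalk SDM: {sdm_bytes:.1f} bytes/record, {sdm_gb:.1f} GB for 1e9 records")
```

The same extrapolation is deliberately not applied to the MySQL figure: a freshly initialized data folder carries fixed overhead that does not grow with the number of rows, so scaling it linearly would overstate MySQL's footprint.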
3. MySQL Performance Test Results
Now, let’s shift our focus to MySQL and dive into its test results. Here are the key observations:
As expected, MySQL lost on every performance parameter. The point, however, is not that it is worse, but that it was designed as a universal tool for solving many typical tasks. It is a great relational database management system (RDBMS) with exceptional stability and a reliable user experience. It proved highly resistant to data loss or corruption during sudden outages or failures, and it has built-in replication features that allow for seamless failover and ensure high availability.
In the performance test, TreeTalk SDM and MySQL each showed their own strengths, and they shine in distinct domains.
Our test indicates that TreeTalk SDM is a strong option for Big Data processing and real-time applications due to its high throughput, low latency, and scalability. In our particular situation (storing enormous volumes of data with a very simple structure in a non-indexed database), it is just what we need. It is daunting to think about how much space a typical relational database would consume in this scenario (and how much it would slow down).
We are ready to pay for this advantage with more difficult development (all service elements are written in C++) and “custom-made” system and data structure engineering.