The fundraising for a new history solution hasn’t happened yet, but I have been testing proofs of concept in my free time nevertheless.
ScyllaDB is a re-implementation of the well-known Apache Cassandra database. While Cassandra is implemented in Java, ScyllaDB is completely rewritten in C++. This gives it a significant performance gain, while keeping full API compatibility and allowing all existing client software to be reused.
ScyllaDB is not compatible with ZFS, so I had to set up a separate dedicated server with XFS, which is the recommended filesystem. The server had 2x Intel Xeon E5620 CPUs, 32GB RAM, and 6x3TB hard disks in a RAID5 array.
ScyllaDB worked beautifully: it performs only append writes on disk, so the seek time on hard disks was never a problem. It is heavily parallelized and utilizes all available CPU cores for maximum throughput.
A single client connection, even with the client residing on the same server, was able to perform up to ~4000 updates per second, limited mostly by API processing delays. But I could launch 5 client processes inserting concurrently at about 3000 updates per second each, and the database engine was able to process them, utilizing all server CPU cores to the maximum.
The first POC test performed the following: the whole EOS history was indexed, so that for every action, all involved accounts were recorded, along with the contract and action names, block number, block time, and transaction ID. Only a minimal set of indexes was used, just enough to fetch all actions by transaction ID and to list the whole history of any EOS account. Also, the database writer stored JSON transaction traces for the latest 7 days, deleting traces older than that. The idea was that older traces are rarely needed and can be retrieved directly from the state history archive, by block ID, using Chronicle in interactive mode. The whole database occupied less than 1.5TB on disk. With 5 parallel indexers, indexing the whole EOS history took about two weeks.
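To make the indexing scheme concrete, here is a minimal in-memory sketch of the same idea, not the actual POC code (the real writer stored these records in ScyllaDB tables, and all names below are made up for illustration): every action is indexed per account and per transaction ID, while the bulky JSON traces are kept only for 7 days.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SEVEN_DAYS = timedelta(days=7)

class ActionIndex:
    """In-memory stand-in for the ScyllaDB tables used in the POC."""

    def __init__(self):
        # account -> list of action records (contract, action, block num, block time, trx id)
        self.by_account = defaultdict(list)
        # transaction ID -> list of action records
        self.by_trx = defaultdict(list)
        # transaction ID -> (block time, JSON trace); kept for 7 days only
        self.traces = {}

    def add_action(self, accounts, contract, action,
                   block_num, block_time, trx_id, trace_json):
        rec = (contract, action, block_num, block_time, trx_id)
        for acc in accounts:
            self.by_account[acc].append(rec)
        self.by_trx[trx_id].append(rec)
        self.traces[trx_id] = (block_time, trace_json)

    def prune_traces(self, now):
        # drop JSON traces older than 7 days; the small index rows stay forever
        expired = [t for t, (bt, _) in self.traces.items()
                   if now - bt > SEVEN_DAYS]
        for t in expired:
            del self.traces[t]

# usage
idx = ActionIndex()
t0 = datetime(2019, 6, 1)
idx.add_action(["alice", "bob"], "eosio.token", "transfer",
               100, t0, "trx1", '{"dummy": 1}')
idx.add_action(["alice"], "eosio.token", "transfer",
               200, t0 + timedelta(days=8), "trx2", '{"dummy": 2}')
idx.prune_traces(now=t0 + timedelta(days=8))
```

After pruning, the trace of `trx1` is gone while its index rows remain, so the action is still listed in account history and can be re-fetched from the state history archive if needed.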
The second POC did the same as the Couchbase test, in order to compare their performance: the database contained the actions index and JSON traces for 10 million blocks. The total data occupied about 1.2TB on disk. As storing JSON traces involved bigger requests, the maximum throughput was about 2500 updates per second from a single client.
One of the challenges with ScyllaDB or Cassandra is the very restrictive data model and set of queries that one can run. You work with tables and views, but there is no SQL JOIN. The structure of indexes is also restrictive. Also, you cannot DELETE a range of rows; each row needs to be deleted individually.
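The row-by-row deletion constraint is what makes the 7-day trace expiry awkward: with traces keyed by transaction ID, there is no single statement to expire a whole time range across partitions, so the writer has to enumerate the affected keys and issue one DELETE per row. A hedged sketch (table and column names are hypothetical):

```python
def expiry_statements(expired_trx_ids):
    """Build one parameterized CQL DELETE per expired primary key.

    Cassandra/ScyllaDB cannot delete an arbitrary range of rows across
    partitions, so expiry becomes a list of individual deletions.
    """
    return [("DELETE FROM traces WHERE trx_id = %s", (trx,))
            for trx in expired_trx_ids]

stmts = expiry_statements(["trx1", "trx2", "trx3"])
```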
Couchbase is a JSON database, very versatile and flexible. It can index any kind of JSON structure, and partial indexes are a great and powerful data-management feature. Couchbase also allows keeping the data and indexes on different servers within a cluster.
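As an illustration of why partial indexes matter here: in Couchbase’s N1QL, an index can carry a WHERE clause so that only matching documents are indexed, keeping a special-purpose index small even when the bucket holds billions of traces. The bucket and field names below are made up for the example:

```python
# Hypothetical N1QL partial index: only token transfer documents are indexed,
# so a query for transfers does not pay for the rest of the dataset.
partial_index = (
    "CREATE INDEX idx_token_transfers ON eos_history(block_num) "
    "WHERE contract = 'eosio.token' AND action = 'transfer'"
)
```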
However, managing hundreds of millions or a few billion records is a big challenge. First of all, it turned out that indexes perform very poorly when placed on HDD. The database engine reads and updates them randomly, so the indexes must reside on SSD media if we want to use it for EOS volumes.
Couchbase indexers are asynchronous, so you can insert a few million records and then realize it will take another day to index them. So I had to implement a pacifier which pauses the insertions and waits for the index queue size to go down while the data is still in the RAM cache. If the indexer has to retrieve the data from disk, indexing performance degrades significantly.
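The pacifier logic could be sketched roughly like this (a minimal illustration, not the actual code; `get_queue_size` and `insert_batch` stand in for the real Couchbase stats and SDK calls, and the water marks are arbitrary):

```python
import time

def paced_insert(batches, insert_batch, get_queue_size,
                 high_water=200_000, low_water=50_000, poll_interval=1.0):
    """Pause insertions whenever the async indexer's queue grows too large,
    and resume only after it drains below the low-water mark."""
    for batch in batches:
        if get_queue_size() > high_water:
            while get_queue_size() > low_water:
                time.sleep(poll_interval)   # let the indexer catch up
        insert_batch(batch)

# toy simulation: each batch adds 100k items to the queue, and every size
# check drains 30k (standing in for indexer progress between polls)
queue = {"size": 0}
def get_size():
    s = queue["size"]
    queue["size"] = max(0, s - 30_000)
    return s

inserted = []
def insert(batch):
    inserted.append(batch)
    queue["size"] += 100_000

paced_insert(range(5), insert, get_size, poll_interval=0)
```

The point of the back-off is exactly what the paragraph above describes: the indexer keeps working against documents that are still in the RAM cache instead of falling so far behind that it has to fetch them from disk.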
The POC stored action indexes and JSON traces for the last 10 million blocks. The main bottleneck was the performance of the indexer, and the top throughput was around 2000 updates per second on my very modest hardware. Obviously, the database engine requires much higher-grade hardware for maximum performance.
ScyllaDB showed much better performance on large datasets, especially in handling storage I/O on a low-cost server. It is very limited in the queries you can execute, but it should be a very good match for EOSIO history solutions, which need to answer only a limited set of queries.
Whenever deeper analysis of transactions is needed, Couchbase can help a lot. It stores and indexes JSON data natively and allows building sophisticated indexes for special-purpose queries.
Both database engines showed good performance when querying the history of an account.
If I ever get a funded request to implement a new-generation history solution, I would do the following:
- Implement the action indexer as a Chronicle plugin. Most of Chronicle’s work is spent on composing JSON output, and this part of the job could be skipped: the transaction data is available within Chronicle in native format, so the plugin would be able to write the history index directly into a database such as ScyllaDB. This would reduce the indexing time significantly.
- Transaction traces in JSON can be stored in the database in the traditional Chronicle way, but the database writer needs to be asynchronous, and probably multi-threaded, for better performance.
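The asynchronous, multi-threaded writer from the last point could be sketched as a bounded queue drained by worker threads (a minimal illustration, not Chronicle code; `write_to_db` stands in for the real ScyllaDB or Couchbase client call):

```python
import queue
import threading

def start_writers(write_to_db, num_threads=4, maxsize=10_000):
    """Start worker threads that drain a bounded queue into the database.

    The bounded queue back-pressures the producer (the Chronicle consumer)
    when the database temporarily cannot keep up.
    """
    q = queue.Queue(maxsize=maxsize)

    def worker():
        while True:
            item = q.get()
            if item is None:          # shutdown sentinel
                q.task_done()
                return
            write_to_db(item)
            q.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    return q, threads

def stop_writers(q, threads):
    # one sentinel per worker, then wait for all of them to exit
    for _ in threads:
        q.put(None)
    for t in threads:
        t.join()
```

A producer would simply `q.put(trace)` for each trace; the workers write concurrently, which is where the throughput gain over a single synchronous writer would come from.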
Block.one is developing their history-tools, which also implement history queries. It may happen that their solution will already satisfy all the needs of block explorers and data analyzers. However, Chronicle is at the moment the most compact and lightweight solution for processing the EOSIO state history, so it will keep its niche in data processing.