Apache Accumulo is a robust, scalable, high-performance data storage and retrieval system.
It builds on several existing technologies: its design is based on Google's BigTable, and it runs on top of Apache Hadoop, Thrift and ZooKeeper.
Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.
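To make the idea of cell-based access control concrete, here is a minimal sketch in plain Java. It is a simplified model and not the Accumulo client API: the class name, the `Cell` type, and the label format (tokens joined by `&`, with an empty label meaning "visible to everyone") are assumptions made for illustration. Real Accumulo visibility labels are richer than this; the sketch only handles the "all tokens required" case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of cell-level access control; NOT the Accumulo client API.
final class CellVisibilitySketch {

    // One key/value pair with a visibility label attached to the cell itself.
    static final class Cell {
        final String row;
        final String column;
        final String visibility; // hypothetical format: tokens joined by '&', "" = public
        final String value;

        Cell(String row, String column, String visibility, String value) {
            this.row = row;
            this.column = column;
            this.visibility = visibility;
            this.value = value;
        }
    }

    // A reader may see a cell only if it holds every token named in the label.
    static boolean canSee(String visibility, Set<String> authorizations) {
        if (visibility.isEmpty()) {
            return true; // unlabeled cells are visible to everyone
        }
        for (String token : visibility.split("&")) {
            if (!authorizations.contains(token.trim())) {
                return false;
            }
        }
        return true;
    }

    // Filtering is applied per cell, so a single row can be partly visible.
    static List<String> scan(List<Cell> cells, Set<String> authorizations) {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (canSee(c.visibility, authorizations)) {
                visible.add(c.row + " " + c.column + " " + c.value);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Cell> table = Arrays.asList(
            new Cell("user1", "name", "", "Alice"),
            new Cell("user1", "salary", "hr&finance", "90000"),
            new Cell("user1", "badge", "hr", "B-17"));

        // A reader holding only the "hr" authorization.
        Set<String> hrOnly = new HashSet<>(Arrays.asList("hr"));
        System.out.println(scan(table, hrOnly));
    }
}
```

In this sketch a reader holding only `hr` sees the name and badge cells of `user1` but not the salary cell, which shows why the control is described as cell-based and not row- or table-based.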
Here are some key features of "Apache Accumulo":
Table Design and Configuration:
· Iterators
· Cell labels
· Constraints
· Sharding
· Large Rows
Integrity/Availability:
· Master failover
· Write-ahead log
· Logical time
· Logical Time for bulk import
· FATE (Fault Tolerant Executor)
· Scalable master
· Isolation
Performance:
· Relative encoding
· Native In-Memory Map
· Scan pipeline
· Caching
· Multi-level RFile Index
Testing:
· Mock
· Functional Test
· Scale Test
· Random Walk Test
Extensible Behaviors:
· Pluggable balancer
· Pluggable memory manager
· Pluggable logger assignment strategy
General Administration:
· Monitor page
· Tracing
· Online reconfiguration
· Table renaming
Internal Data Management:
· Locality groups
· Smart compaction algorithm
· Merging Minor Compaction
On-demand Data Management:
· Compactions
· Split points
· Tablet Merging
· Table Cloning
· Compact Range
· Delete Range
What's New in This Release:
· Optionally monitor swappiness on every server.
· Support running on top of Kerberos-enabled HDFS.
· Provide an API method for gathering system stats.