Data Deduplication
 
ExaGrid looked at the first-generation, traditional approaches and saw that all vendors had used block-level deduplication.
This traditional method splits data into 4KB to 10KB groups of bytes called blocks. Block length is variable to
improve matches. The challenge is that for every 10TB of backup data, the tracking table, or hash table, must track about one billion blocks.
The hash table grows so large that it needs to be housed in a single front-end controller with additional disk shelves,
an architecture often called “scale-up.” As a result, only capacity is added as data grows. Because no additional bandwidth or processing
resources are added, the backup window grows in length as data volumes increase. At some point, the backup window
becomes too long and a new front-end controller is required, a replacement called a “forklift upgrade.” This is disruptive and expensive.
ExaGrid also saw approaches that used byte-level deduplication. Although this method allows for GRID scalability,
called “scale-out,” it requires an understanding of the format of every backup application, which limits the list of supported
backup applications.
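
The block count quoted above for block-level deduplication can be checked with simple arithmetic. The following is a minimal sketch, assuming decimal units (1TB = 10^12 bytes, 1KB = 10^3 bytes) and fixed block sizes at the two ends of the 4KB to 10KB range; the function name `tracking_entries` is hypothetical and used only for illustration.

```python
def tracking_entries(data_bytes: int, unit_bytes: int) -> int:
    """Return how many table entries are needed to track data_bytes
    when the data is split into pieces of unit_bytes each."""
    # Ceiling division: a partial trailing piece still needs an entry.
    return -(-data_bytes // unit_bytes)


# Assumption: decimal units, as is common in storage sizing.
KB = 10**3
TB = 10**12

backup_bytes = 10 * TB

# The two ends of the 4KB to 10KB block range described in the text.
entries_10kb = tracking_entries(backup_bytes, 10 * KB)
entries_4kb = tracking_entries(backup_bytes, 4 * KB)

print(f"10KB blocks: {entries_10kb:,} entries")
print(f" 4KB blocks: {entries_4kb:,} entries")
```

At the 10KB end of the range, 10TB of data yields one billion entries; smaller blocks yield proportionally more.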

ExaGrid took a third and more innovative path. ExaGrid uses zone-level deduplication, which breaks data into larger zones
and then compares at the byte level. This approach allows for the best of both worlds. First, the tracking table is one-thousandth
the size of the block-level table, which allows for full appliances in a GRID, called “scale-out.” As data grows, all resources
are added: processor, memory, and bandwidth as well as disk. If data doubles, triples, or quadruples, then ExaGrid
doubles, triples, or quadruples the processor, memory, bandwidth, and disk. As data grows, the backup window stays
at a fixed length. Second, the zone approach is backup application agnostic, allowing ExaGrid to support virtually any backup
application.
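
The effect of the two architectures on the backup window can be illustrated with a simple model. This is a minimal sketch, not a description of any product's internals: it assumes the backup window equals data volume divided by total ingest throughput, and the throughput and per-appliance capacity figures are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def scale_up_window_hours(data_tb: float, controller_tb_per_hour: float) -> float:
    """Scale-up: only disk is added, so ingest throughput stays fixed
    at what the single front-end controller can deliver."""
    return data_tb / controller_tb_per_hour


def scale_out_window_hours(data_tb: float,
                           appliance_capacity_tb: float,
                           appliance_tb_per_hour: float) -> float:
    """Scale-out: each added appliance brings processor, memory, and
    bandwidth along with disk, so throughput grows with capacity."""
    appliances = math.ceil(data_tb / appliance_capacity_tb)
    return data_tb / (appliances * appliance_tb_per_hour)


# Hypothetical figures, for illustration only.
THROUGHPUT_TB_PER_HOUR = 2.0   # one controller or one appliance
APPLIANCE_CAPACITY_TB = 10.0   # data handled per appliance

for data_tb in (10.0, 20.0, 40.0):
    up = scale_up_window_hours(data_tb, THROUGHPUT_TB_PER_HOUR)
    out = scale_out_window_hours(data_tb, APPLIANCE_CAPACITY_TB,
                                 THROUGHPUT_TB_PER_HOUR)
    print(f"{data_tb:>5.0f}TB  scale-up: {up:4.1f}h  scale-out: {out:4.1f}h")

# Tracking table size, taking the one-thousandth ratio from the text
# and the one-billion-entry figure for 10TB of block-level data.
block_entries = 1_000_000_000
zone_entries = block_entries // 1000
print(f"zone-level entries for 10TB: {zone_entries:,}")
```

Under these assumptions, the scale-up window doubles each time data doubles, while the scale-out window stays constant because throughput is added in step with capacity.
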
In summary, block-level deduplication drives a scale-up architecture that only adds disk as data grows, which expands the
backup window; byte-level deduplication allows for scalability but limits backup application support; ExaGrid
zone-level deduplication, by contrast, allows for full server appliances in a GRID (a scale-out approach) as well as a wide range of
backup application support.

ExaGrid continues to innovate to fix backup…forever!