dedupsqlfs Common block/hash/names tables across many backups - clustering

Allows storing a single copy of the hash, block, and names tables for many node backups.


Cleanup and defragmentation are complicated

This setup will NOT clean these tables, because it is not possible to determine on the fly which entries are no longer used.

Cleaning up such a setup requires a two-pass tool:

  • collect all hash IDs referenced by each node backup
  • find the hash IDs in the shared tables that are not referenced by any node backup
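
The two passes above could be sketched roughly as follows. This is only an illustration using SQLite: the table and column names (`hash.id`, `block.hash_id`, `inode_hash_block.hash_id`) and the function names are assumptions made for the example, not the confirmed dedupsqlfs schema or API.

```python
import sqlite3


def collect_used_hash_ids(node_dbs):
    """Pass 1: build the union of hash IDs referenced by every node backup.

    Assumes each node database has a table inode_hash_block(hash_id).
    """
    used = set()
    for conn in node_dbs:
        rows = conn.execute("SELECT DISTINCT hash_id FROM inode_hash_block")
        used.update(row[0] for row in rows)
    return used


def find_orphan_hash_ids(shared_db, used):
    """Pass 2: return shared hash IDs that no node backup references.

    Assumes the shared database has a table hash(id).
    """
    rows = shared_db.execute("SELECT id FROM hash")
    return sorted(row[0] for row in rows if row[0] not in used)


def delete_orphans(shared_db, orphan_ids):
    """Remove orphaned rows from the shared block and hash tables."""
    with shared_db:  # commits on success, rolls back on error
        shared_db.executemany(
            "DELETE FROM block WHERE hash_id = ?", ((i,) for i in orphan_ids)
        )
        shared_db.executemany(
            "DELETE FROM hash WHERE id = ?", ((i,) for i in orphan_ids)
        )
```

Note that the result is only safe to act on if no node backup writes new references between pass 1 and the delete, so such a tool would most likely have to run while all nodes are unmounted.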


Vacuuming the very large shared block/hash tables will take an extremely long time.

Hash/block lookup

To speed up lookups, the hash/block tables need to be split into partitions, preferably of about 2 GB (1950 MB) per file. If a new large partition file is created, the data must be resynced between all partitions. Alternatively, the resync can happen on the fly.
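
To show why adding a partition forces a resync, here is a minimal sketch that assumes rows are routed to a partition by hash value modulo the partition count. That routing rule and the function names `partition_for` and `rows_to_move` are assumptions for illustration only; the notes above do not specify how rows are assigned to partitions.

```python
def partition_for(hash_value: bytes, partition_count: int) -> int:
    """Pick a partition index from the first 8 bytes of the hash value."""
    return int.from_bytes(hash_value[:8], "big") % partition_count


def rows_to_move(hash_values, old_count: int, new_count: int):
    """Return the hash values whose partition changes when the count changes."""
    return [
        h for h in hash_values
        if partition_for(h, old_count) != partition_for(h, new_count)
    ]
```

With plain modulo routing, most rows change partition when the count grows, which is why the resync touches all partitions. Doing it on the fly would mean checking the old location on a lookup miss and moving the row when it is found there.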

For MySQL/PgSQL, two instances need to be started: one for the block/hash/names tables and one for the node backup.