EC library interface
A wrapper for various EC libraries so that different ones can be used
Whiteboard
I figured I would get the discussion going on the EC interfaces and basic mechanics. This is all off the top of my head, so please make suggestions and/or point-out anything that looks wrong or unclear. Feel free to add others to this thread, if you want...
I assume we can use Python abstract base classes to enforce any EC lib used by Swift.
Let's assume that an EC lib must implement encoding a string, decoding a set of fragments into a string, verifying the consistency of a set of fragments, reconstructing a set of missing fragments and returning fragments needed to reconstruct a set of missing fragments.
Here are the basic functions (bastardizing Python syntax for clarity):
def encode(bytes): return List[String]
def decode(
def reconstruct(
def fragments_
def get_metadata(
def verify_
encode() and decode() are pretty self explanatory and will be the main functions used in the data path. We encode a string into a list of strings (fragments). We decode a list of fragments into a string.
fragments_needed() and reconstruct() will be used by the reconstructor to effectively reconstruct corrupted or lost fragments. In the case of Reed-Somolon, fragments_needed() can return a list of choose(n-f, k) fragment lists containing all possible combinations of fragments that can be used to reconstruct the 'f' lost fragments. This function is mostly useful for LRC, Pyramid codes and irregular XOR-based codes, since they will typically require less than 'k' fragments to reconstruct a lost fragment.
For example, suppose we have LRC with 10 data fragments where the first and last 5 data fragments contribute to local parities, P_1 and P_2, respectively. Assume there are two regular Reed-Solomon parities: P_3 and P_4. If the reconstructor detects the loss of fragment 1, then
fragments_
The reconstructor can now try to fetch the appropriate fragments (it will try [2, 3, 4, 5, P_1] first) and call reconstruct on their payload: e.g., reconstruct(
Stripe-level verifications can be done by having each node send their fragment's metadata to some initiator (via get_metadata()). The metadata is opaque to the auditor, but has meaning in the individual EC instance. The initiator is another storage node auditing a fragment in the particular stripe. The initiator will then use verify_
The metadata can be algebraic signatures, checking other types of checksums, checking fragment indices, etc, or some combination of those things. In other words, anything the underlying library deems useful.
Work Items
Dependency tree
* Blueprints in grey have been implemented.