OpenStack Swift and the hash_path_suffix - what can go wrong?
OpenStack Swift uses hash values to store objects. Hashing uses a mathematical algorithm to transform data, for instance a string, into a fixed-length numeric representation. If the underlying data changes, the hash changes, so hashing can be used to detect changes in the data.
Swift uses the well-known MD5 hashing algorithm to transform the path of a Swift object into a hash value. A segment of the hash generated from the path of the inbound or requested object selects the partition in which the object is stored; the complete hash then positions the object inside that partition.
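To make that concrete, here is a simplified sketch in Python of the path-to-partition mapping. It is not Swift's actual code (the secret suffix is deliberately omitted for now, and PART_POWER is just an example value), but it follows the same scheme Swift's ring uses: the first four bytes of the MD5 digest, shifted down by the ring's partition power, select the partition.

    import hashlib
    import struct

    PART_POWER = 10  # example ring size: 2 ** 10 = 1024 partitions

    def path_to_partition(account, container, obj):
        # Hash the full object path with MD5.
        path = '/%s/%s/%s' % (account, container, obj)
        digest = hashlib.md5(path.encode('utf-8')).digest()
        # The first four bytes of the digest, shifted down by the ring's
        # partition power, select the partition; the full hex digest
        # positions the object inside that partition.
        partition = struct.unpack_from('>I', digest)[0] >> (32 - PART_POWER)
        return partition, digest.hex()

    print(path_to_partition('AUTH_test', 'photos', 'cat.jpg'))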
In a perfect hash, every possible input string would be represented by a unique hash value, but the hash function used in Swift cannot be perfect. The MD5 hash Swift uses is only 16 bytes long yet represents strings of arbitrary length, so by the pigeonhole principle there is no guarantee that two different strings have different hash representations.
Malicious use of hash collisions
Using non-unique hash values to place objects carries a risk: two objects with different object paths can translate into the same hash, and thus be stored in the same place in Swift. In that case, the second object stored overwrites the first.
This flaw means that a malicious attacker can hand-craft an object path whose hash matches the hash of the path of the object they want to replace. The object they insert -- even though it has a totally different object path from the original -- would then overwrite the original object. A user retrieving the original object would not be able to tell that it has been replaced.
Protecting the hash
Early on, the Swift developers realized that this attack vector posed a substantial risk and implemented protection against this form of attack. The hash_path_suffix is defined in the configuration files for Swift on each storage node. It must be the same across the whole cluster and, just as importantly, must be kept secret.
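In a typical deployment the suffix lives in /etc/swift/swift.conf on every node, along these lines (the value shown is only a placeholder; recent Swift releases additionally support a swift_hash_path_prefix that works the same way):

    [swift-hash]
    # Must be identical on every node, kept secret, and never changed.
    swift_hash_path_suffix = REPLACE_WITH_RANDOM_SECRET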
To calculate the hash with the hash_path_suffix, the suffix is appended to the object path of the requested object, and the hash is then computed from the resulting string. Even if an attacker crafts a path that would collide with the hash of an existing object's path, the suffixed strings will hash to different values, because producing a matching hash requires knowing the secret hash_path_suffix.
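Extending the earlier sketch, the protected variant looks roughly like this (again a simplification of Swift's hash_path, with a placeholder suffix):

    import hashlib

    HASH_PATH_SUFFIX = b'REPLACE_WITH_RANDOM_SECRET'  # placeholder value

    def hash_path(account, container, obj):
        # Append the cluster-wide secret suffix before hashing, so an
        # attacker who does not know it cannot construct a colliding path.
        path = ('/%s/%s/%s' % (account, container, obj)).encode('utf-8')
        return hashlib.md5(path + HASH_PATH_SUFFIX).hexdigest()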
This means that without knowing the hash_path_suffix, it is no longer possible for an external attacker to knowingly create an alternate URI that produces the same hash as the original.
(There is still a tiny risk that the hashes will match anyway, because of the general possibility of MD5 collisions. This risk, though, is exceedingly small and does not provide a viable attack vector for a malicious attacker.)
The risk of misconfiguration
Unlike the rings, the hash_path_suffix cannot change over the life of the cluster. New nodes that are added must use the same hash_path_suffix as the pre-existing nodes.
Now what would happen if a node was added with a different hash_path_suffix?
For an explanation we must look at the auditors. Auditors are processes that constantly scour the data space of a Swift cluster, comparing hash values and MD5 sums with their expected values to detect both corrupted copies and objects that are written in the wrong place.
For our purposes, the hash comparison is the interesting part. The auditor compares the hash of the object path plus the hash_path_suffix with the hash encoded in the object's on-disk location. Because the hash_path_suffix on our newly added server is wrong, every one of these comparisons fails, so the auditor will remove each "broken" copy from its location and store it in a quarantine space.
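In outline, the check looks something like this (a rough sketch with hypothetical helper names; the real auditors in swift/obj/auditor.py do considerably more, such as verifying MD5 checksums of the data itself):

    import os
    import shutil

    def audit_location(datafile, account, container, obj, quarantine_dir):
        # On disk, the object's parent directory is named after its
        # path hash.
        on_disk_hash = os.path.basename(os.path.dirname(datafile))
        # Recompute the hash from the object path plus the *local*
        # hash_path_suffix (hash_path as sketched above).
        if on_disk_hash != hash_path(account, container, obj):
            # With a wrong local suffix, every object fails this check
            # and gets moved to quarantine.
            shutil.move(os.path.dirname(datafile), quarantine_dir)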
How to fix what is broken
Once the fault is detected by the cluster operator -- usually through log messages -- the hash_path_suffix must be corrected and all Swift services on the defective node restarted. Once that's done, internal replication can sync a good copy of the affected objects into their correct positions from another node.
It's important to note that while this process can even work if multiple hosts have objects quarantined with the incorrect hash_path_suffix, it does require that there's at least one good copy of the object remaining to be synced once the configuration has been corrected.
If the number of defective storage nodes is equal to or larger than the replication factor, some objects will have all of their copies quarantined. In this case, Swift cannot automatically replace the missing objects, because no good copies are left to replicate from.
If that happens, you have two options for recovery:
It is possible to reinsert the objects manually into at least one correct location and then let normal internal replication sync the rest. This requires a significant amount of calculation to determine the correct positions, and because internal replication is not designed to be fast, it will also take a significant amount of time until consistency is reached.
Often it will be easier to extract the objects from the quarantine and upload them again with the original object path. Not only does this hand the burden of location calculation back to the cluster, it also quickly writes at least a quorum of copies and thus better protects you from losing the object again, for instance to a hardware failure that strikes before internal replication completes.
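A rough sketch of that second approach, assuming python-swiftclient is available and that each quarantined .data file still carries its original path in Swift's 'user.swift.metadata' extended attribute (swift-object-info will display the same information):

    import os
    import pickle

    from swiftclient.client import Connection  # python-swiftclient

    def read_object_name(datafile):
        # Swift stores object metadata as a pickled dict in xattrs,
        # continued in user.swift.metadata1, user.swift.metadata2, ...
        # when it grows large.
        raw, i = b'', 0
        while True:
            key = 'user.swift.metadata' + (str(i) if i else '')
            try:
                raw += os.getxattr(datafile, key)
            except OSError:
                break
            i += 1
        return pickle.loads(raw)['name']  # e.g. '/AUTH_test/photos/cat.jpg'

    def reupload(datafile, conn):
        # The connection must be authenticated against the object's account.
        _, account, container, obj = read_object_name(datafile).split('/', 3)
        with open(datafile, 'rb') as f:
            # A normal PUT lets the proxy write a full quorum of copies.
            conn.put_object(container, obj, contents=f)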
The easiest method, though, is prevention. Scripting the install, or at least copying the configuration files instead of manually editing them, helps eliminate this kind of mishap, and keeps valuable data safe.