GLEP:59
Title:Manifest2 hash policies and security implications
Version:1.3
Last-Modified:2008/10/28 07:45:44
Author:Robin Hugh Johnson <robbat2 at gentoo.org>
Status:Draft
Type:Standards Track
Content-Type:text/x-rst
Requires:44
Created:October 2006
Updated:November 2007, June 2008, July 2008, October 2008
Updates:44
Post-History:

Abstract

While the Manifest2 format allows multiple hashes, the question of which checksums should be present, why, and what the security implications are has never been resolved. This GLEP covers all of these issues, and makes recommendations as to how to handle checksums both now and in the future.

Motivation

This GLEP is being written as part of the work on signing the Portage tree, but is only tangentially related to the actual signing of Manifests. Checksums present one possible weak point in the overall security of the tree - and a comprehensive security plan is needed.

Specification

The bad news

First of all, I'd like to cover the bad news in checksum security. A much-discussed point has been the simple question: what is the security of multiple independent checksums on the same data? The most common position (and indeed the one I previously held myself) is that multiple checksums increase security, although the amount of added security could not be provably quantified. The really bad news is that this position is completely and utterly wrong. Many of you will be aghast at this. There is extremely little added security in multiple checksums [J04]. For any set of checksums, the actual strength lies in that of the strongest checksum.
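As a rough illustration of the result in [J04], consider the concatenation of an iterated hash F with an n_f-bit output and any hash G with an n_g-bit output. The symbols F, G, n_f and n_g are notation introduced here for illustration, and the costs are generic ones that assume neither function has a shortcut attack. A collision for the pair costs approximately:

```latex
\mathrm{cost}\bigl(F \,\|\, G\bigr) \;\approx\; \frac{n_g}{2}\cdot 2^{n_f/2} \;+\; 2^{n_g/2}
\;\ll\; 2^{(n_f + n_g)/2}
```

The first term is the price of n_g/2 successive collisions in F, which together yield 2^{n_g/2} messages sharing one F value; the second is a birthday search among those messages for a pair that also collides under G. With a 128-bit F and a 160-bit G this gives about 80 * 2^64 + 2^80, which is roughly 2^80, rather than the 2^144 one might naively expect from a 288-bit combined output.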

How fast can MD5 be broken?

For a general collision (not a pre-image attack), the time required to break MD5 has been massively reduced since the original announcement by Wang et al. [W04]. The attack originally took 1 hour on a near-supercomputer (IBM P690) and was estimated at 64 hours on a Pentium-3 1.7GHz. In less than two years, this has gone down to 17 seconds [K06a]!

  08/2004 - 1 hour, IBM pSeries 690 (32x 1.7GHz POWER4+) = 54.4 GHz-Hours
  03/2005 - 8 hours, Pentium-M 1.6GHz = 12.8 GHz-Hours
  11/2005 - 5 hours, Pentium-4 1.7GHz = 8.5 GHz-Hours
  03/2006 - 1 minute, Pentium-4 3.2GHz = .05 GHz-Hours
  04/2006 - 17 seconds, Pentium-4 3.2GHz = .01 GHz-Hours

If we accept a factor of 800x as a sample of how much faster a checksum may be broken over the course of 2 years (MD5 using the above data is >2000x), then existing checksums do not stand a significant chance of survival in the future. We should thus accept that whatever checksums we are using today, will be broken in the near future, and plan as best as possible. (A brief review [H04] of the present SHA1 attacks indicates an improvement of ~600x in the same timespan).
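The GHz-Hours figures and the MD5 speedup factor can be reproduced with a short calculation. This is only a sketch: it assumes GHz-Hours means wall time multiplied by clock speed and processor count, it ignores architectural differences between the machines, and the names `ATTACKS` and `ghz_hours` are illustrative.

```python
# (date, wall time in seconds, clock in GHz, number of processors)
# Values are taken from the MD5 timeline above.
ATTACKS = [
    ("08/2004", 1 * 3600, 1.7, 32),
    ("03/2005", 8 * 3600, 1.6, 1),
    ("11/2005", 5 * 3600, 1.7, 1),
    ("03/2006", 60, 3.2, 1),
    ("04/2006", 17, 3.2, 1),
]


def ghz_hours(seconds, ghz, processors=1):
    """Crude effort metric: hours of wall time x clock speed x processors."""
    return seconds / 3600 * ghz * processors


costs = [(date, ghz_hours(s, g, p)) for date, s, g, p in ATTACKS]

# Ratio between the first and the last published attack.
speedup = costs[0][1] / costs[-1][1]

for date, cost in costs:
    print(f"{date}: {cost:.3f} GHz-Hours")
print(f"speedup over the period: about {speedup:.0f}x")
```

Using the unrounded figure for the last entry gives a somewhat smaller ratio than dividing 54.4 by the rounded .01, but either way the factor is well above 2000x.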

And for those that claim implementation of these procedures is not yet feasible, see [K06b] for an application that can produce two self-extracting .exe files, with identical MD5s, and whatever payload you want.

The good news

Of the checksums presently used by Manifest2, one stands close to being completely broken: SHA1. The SHA2 series has suffered some attacks, but still remains reasonably solid [G07], [K08]. No attacks against RIPEMD160 have been published; however, it is constructed in the same manner as MD5, SHA1 and SHA2, so it is also vulnerable to the new methods of cryptanalysis [H04].

To reduce the potential for future problems, and to prevent any single checksum break from causing a rapid decrease in security, we should incorporate the strongest hash available from each family of checksums and be prepared to retire old checksums actively, unless there is an overriding reason to keep a specific checksum.

What should be done

Portage should always try to verify all supported hashes that are available in a Manifest2, starting with the strongest ones as maintained by a preference list. Over time, the weaker checksums should be removed from Manifest2 files, once all old Portage installations have had sufficient time to upgrade. We should be prepared to add stronger checksums wherever possible, and to remove those that have been defeated.

An unsupported hash is not considered to be a failure unless no supported hashes are available.
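The verification rule described above can be sketched as follows. This is not Portage's actual code: the preference order, the mapping from Manifest2 hash names to `hashlib` names, and the function names are all assumptions made for illustration, and the example uses the `hashlib` module of a current Python.

```python
import hashlib

# Hypothetical preference list, strongest first. The GLEP does not fix an order.
PREFERENCE = ["SHA512", "WHIRLPOOL", "SHA256", "RMD160", "SHA1"]

# Assumed mapping from Manifest2 hash names to hashlib algorithm names.
HASHLIB_NAMES = {
    "SHA512": "sha512",
    "WHIRLPOOL": "whirlpool",
    "SHA256": "sha256",
    "RMD160": "ripemd160",
    "SHA1": "sha1",
}


def _digest(name, data):
    """Return the hex digest of data, or None if the hash is unsupported."""
    algo = HASHLIB_NAMES.get(name)
    if algo is None:
        return None
    try:
        return hashlib.new(algo, data).hexdigest()
    except ValueError:
        # hashlib raises ValueError for algorithms this build does not offer.
        return None


def verify(data, recorded):
    """Check data against every supported hash in recorded.

    recorded maps Manifest2 hash names to hex digests. Unsupported hashes
    are skipped; the check fails if any supported hash mismatches or if
    no supported hash is present at all.
    """
    rank = lambda n: PREFERENCE.index(n) if n in PREFERENCE else len(PREFERENCE)
    checked = 0
    for name in sorted(recorded, key=rank):
        actual = _digest(name, data)
        if actual is None:
            continue  # unsupported hash: not a failure by itself
        checked += 1
        if actual != recorded[name].lower():
            return False
    return checked > 0
```

Checking the strongest hashes first lets a mismatch be reported against the most trustworthy checksum, while the final `checked > 0` test implements the rule that unsupported hashes only cause a failure when nothing else can be verified.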

Checksum deprecation

For the current Portage, SHA1 should be gradually removed, as it presents no advantages over SHA256. Beyond one specific problem (see the next paragraph), we should add SHA512 (SHA2, 512-bit size) and the Whirlpool checksum (a standardized checksum with no known weaknesses). In future, as stream-based checksums are developed (in response to the development by NIST [AHS]), they should be considered and used.

There is one temporary stumbling block at hand: the existing Portage infrastructure does not support SHA384/512 or Whirlpool, which hampers their immediate acceptance. SHA512 is available in Python 2.5, while SHA1 is already available in Python 2.4. Once Python 2.5 is established in a Gentoo media release, that would be a suitable time to remove SHA1 from Manifest2 files.
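Whether a given installation can compute a particular checksum can be probed at run time. The sketch below uses the `hashlib` module of a current Python rather than the Python 2.4/2.5 modules discussed above, and the function name `supported` is illustrative. Algorithms such as Whirlpool or RIPEMD160 depend on the underlying OpenSSL build, so their result varies between systems.

```python
import hashlib


def supported(names):
    """Map each hashlib algorithm name to True if this Python can compute it."""
    result = {}
    for name in names:
        try:
            hashlib.new(name)
            result[name] = True
        except ValueError:
            # Raised when the algorithm is unknown to this build.
            result[name] = False
    return result


if __name__ == "__main__":
    for name, ok in supported(["sha1", "sha256", "sha512", "ripemd160", "whirlpool"]).items():
        print(f"{name}: {'available' if ok else 'missing'}")
```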

Backwards Compatibility

Old versions of Portage may support and expect only specific checksums. This is accounted for in the checksum deprecation discussion.

References

[AHS] NIST (2007). "NIST's Plan for New Cryptographic Hash Functions",
(Advanced Hash Standard). http://csrc.nist.gov/pki/HashWorkshop/
[BOBO06] Boneh, D. and Boyen, X. (2006). "On the Impossibility of
Efficiently Combining Collision Resistant Hash Functions"; Proceedings of CRYPTO 2006, Dwork, C. (Ed.); Lecture Notes in Computer Science 4117, pp. 570-583. Available online from: http://crypto.stanford.edu/~dabo/abstracts/hashing.html
[H04] Hawkes, P. and Paddon, M. and Rose, G. (2004). "On Corrective
Patterns for the SHA-2 Family". CRYPTO 2004 Cryptology ePrint Archive, Report 2004/207. Available online from: http://eprint.iacr.org/2004/207.pdf
[J04] Joux, Antoine. (2004). "Multicollisions in Iterated Hash
Functions - Application to Cascaded Constructions"; Proceedings of CRYPTO 2004, Franklin, M. (Ed); Lecture Notes in Computer Science 3152, pp. 306-316. Available online from: http://web.cecs.pdx.edu/~teshrim/spring06/papers/general-attacks/multi-joux.pdf
[K06a] Klima, V. (2006). "Tunnels in Hash Functions: MD5 Collisions
Within a Minute". Cryptology ePrint Archive, Report 2006/105. Available online from: http://eprint.iacr.org/2006/105.pdf
[K06b] Klima, V. (2006). "Note and links to high-speed MD5 collision
proof of concept tools". Available online from: http://cryptography.hyperlink.cz/2006/trick.txt
[K08] Klima, V. (2008). "On Collisions of Hash Functions Turbo SHA-2".
Cryptology ePrint Archive, Report 2008/003. Available online from: http://eprint.iacr.org/2008/003.pdf
[G07] Gligoroski, D. and Knapskog, S.J. (2007). "Turbo SHA-2".
Cryptology ePrint Archive, Report 2007/403. Available online from: http://eprint.iacr.org/2007/403.pdf
[W04] Wang, X. et al: "Collisions for Hash Functions MD4, MD5,
HAVAL-128 and RIPEMD", rump session, CRYPTO 2004, Cryptology ePrint Archive, Report 2004/199, first version (August 16, 2004), second version (August 17, 2004). Available online from: http://eprint.iacr.org/2004/199.pdf

Thanks to

I'd like to thank the following folks, in no specific order:
  • Ciaran McCreesh (ciaranm) - for pointing out the Joux (2004) paper, and also for being stubborn enough not to accept a partial solution.
  • Marius Mauch (genone), Zac Medico (zmedico) and Brian Harring (ferringb): for being knowledgeable about the Portage Manifest2 codebase.