Preimage attack: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
note existing citation to fix citation needed. a bit dated, perhaps....
 
(45 intermediate revisions by 29 users not shown)
Line 1: Line 1:
{{short description|Attack model against cryptographic hash functions}}
In [[cryptography]], a '''preimage attack''' on [[cryptographic hash function]]s tries to find a [[message]] that has a specific hash value. A cryptographic hash function should resist attacks on its [[Preimage#Inverse image|preimage]].
In [[cryptography]], a '''preimage attack''' on [[cryptographic hash function]]s tries to find a [[Message#In computer science|message]] that has a specific hash value. A cryptographic hash function should resist attacks on its [[Preimage#Inverse image|preimage]] (set of possible inputs).


In the context of attack, there are two types of preimage resistance:
In the context of attack, there are two types of preimage resistance:


* ''preimage resistance'': for essentially all pre-specified outputs, it is computationally infeasible to find any input that hashes to that output, i.e., given ''y'', it is difficult to find an ''x'' such that {{nowrap|1=''h''(''x'') = ''y''}}.<ref name="crypto-hash-def">
* ''preimage resistance'': for essentially all pre-specified outputs, it is computationally infeasible to find any input that hashes to that output; i.e., given {{math|{{var|y}}}}, it is difficult to find an {{math|{{var|x}}}} such that {{math|1={{var|h}}({{var|x}}) = {{var|y}}}}.<ref name="crypto-hash-def">{{cite book
| last1= Rogaway
{{cite paper |last1= Rogaway |first1= P. |last2= Shrimpton |first2= T. |title= Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance|journal= Fast Software Encryption (2004) |publisher= Springer-Verlag|url= http://www.cs.ucdavis.edu/~rogaway/papers/relates.pdf|accessdate=17 November 2012}}
| first1= P.
</ref>
| last2= Shrimpton
* ''second-preimage resistance'': it is computationally infeasible to find any second input which has the same output as that of a specified input, i.e., given ''x'', it is difficult to find a second preimage {{nowrap|''x''′ ≠ ''x''}} such that {{nowrap|1=''h''(''x'') = ''h''(''x''′)}}.<ref name="crypto-hash-def" />
| first2= T.
| title= Fast Software Encryption
| chapter= Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance
| series= Lecture Notes in Computer Science
| year=2004
| volume= 3017
| pages= 371–388
| publisher= Springer-Verlag
| doi= 10.1007/978-3-540-25937-4_24
| isbn= 978-3-540-22171-5
| chapter-url= https://web.cs.ucdavis.edu/~rogaway/papers/relates.pdf
| access-date=17 November 2012}}</ref>
* ''second-preimage resistance'': for a specified input, it is computationally infeasible to find another input which produces the same output; i.e., given {{math|{{var|x}}}}, it is difficult to find a second input {{math|{{var|x}}′ ≠ {{var|x}}}} such that {{math|1={{var|h}}({{var|x}}) = {{var|h}}({{var|x}}′)}}.<ref name="crypto-hash-def" />


These can be compared with a [[collision resistance]], in which it is computationally infeasible to find any two distinct inputs ''x'', ''x''′ that hash to the same output, i.e., such that {{nowrap|1=''h''(''x'') = ''h''(''x''′)}}.<ref name="crypto-hash-def" />
These can be compared with a [[collision resistance]], in which it is computationally infeasible to find any two distinct inputs {{math|{{var|x}}}}, {{math|{{var|x}}}} that hash to the same output; i.e., such that {{math|1={{var|h}}({{var|x}}) = {{var|h}}({{var|x}}′)}}.<ref name="crypto-hash-def" />


Collision resistance implies second-preimage resistance,<ref name="crypto-hash-def" /> but does not guarantee preimage resistance.<ref name="crypto-hash-def" /> Conversely, a second-preimage attack implies a collision attack (trivially, since, in addition to ''x''′, ''x'' is already known right from the start).
Collision resistance implies second-preimage resistance. Second-preimage resistance implies preimage resistance only if the size of the hash function's inputs can be substantially (e.g., factor 2) larger than the size of the hash function's outputs.<ref name="crypto-hash-def" /> Conversely, a second-preimage attack implies a collision attack (trivially, since, in addition to {{math|{{var|x}}}}, {{math|{{var|x}}}} is already known right from the start).


== Applied preimage attacks ==
== Applied preimage attacks ==
By definition, an ideal hash function is such that the fastest way to compute a first or second preimage is through a [[brute-force attack]]. For an ''n''-bit hash, this attack has a [[time complexity]] 2<sup>''n''</sup>, which is considered too high for a typical output size of ''n'' = 128 bits. If such complexity is the best that can be achieved by an adversary, then the hash function is considered preimage-resistant. However, there is a general result that quantum computers perform a structured preimage attack in {{sqrt|2<sup>''n''</sup>}} = 2<sup>''n''/2</sup>, which also implies second preimage<ref name=quantumsha3>http://cr.yp.to/hash/quantumsha3-20101112.pdf</ref> and thus a collision attack.
By definition, an ideal hash function is such that the fastest way to compute a first or second preimage is through a [[brute-force attack]]. For an {{math|{{var|n}}}}-bit hash, this attack has a [[time complexity]] {{math|1=2{{sup|{{var|n}}}}}}, which is considered too high for a typical output size of {{math|{{var|n}}}} = 128 bits. If such complexity is the best that can be achieved by an adversary, then the hash function is considered preimage-resistant. However, there is a general result that [[quantum computer]]s perform a structured preimage attack in <math>\sqrt{2^{n}} = 2^{\frac{n}{2}}</math>, which also implies second preimage<ref>{{cite web
| url=https://cr.yp.to/hash/quantumsha3-20101112.pdf
| title=Quantum attacks against Blue Midnight Wish, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Shabal, SHAvite-3, SIMD, and Skein
| author=Daniel J. Bernstein
| work=[[University of Illinois at Chicago]]
| date=2010-11-12
| access-date=2020-03-29}}</ref> and thus a collision attack.


Faster preimage attacks can be found by [[cryptanalysis|cryptanalysing]] certain hash functions, and are specific to that function. Some significant preimage attacks have already been discovered, but they are not yet practical. If a practical preimage attack is discovered, it would drastically affect many Internet protocols. In this case, "practical" means that it could be executed by an attacker with a reasonable amount of resources. For example, a preimaging attack that costs trillions of dollars and takes decades to preimage one desired hash value or one message is not practical; one that costs a few thousand dollars and takes a few weeks might be very practical.
Faster preimage attacks can be found by [[cryptanalysis|cryptanalysing]] certain hash functions, and are specific to that function. Some significant preimage attacks have already been discovered, but they are not yet practical. If a practical preimage attack is discovered, it would drastically affect many Internet protocols. In this case, "practical" means that it could be executed by an attacker with a reasonable amount of resources. For example, a preimaging attack that costs trillions of dollars and takes decades to preimage one desired hash value or one message is not practical; one that costs a few thousand dollars and takes a few weeks might be very practical.


All currently known practical or almost-practical attacks<ref>{{cite web
All currently known practical or almost-practical attacks<ref>{{cite web |authors=Bruce Morton, Clayton Smith |date={{date|2014-01-30}} |title=Why We Need to Move to SHA-2 |publisher=CA Security Council |url=https://casecurity.org/2014/01/30/why-we-need-to-move-to-sha-2/}}</ref><ref>{{cite web |date={{date|2009-01-01}} |title=MD5 and Perspectives |url=https://www.cs.cmu.edu/~perspectives/md5.html }}</ref><ref>{{cite web | url=https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html | title=Google Online Security Blog: Announcing the first SHA1 collision | publisher=Google | accessdate=2017-02-23}}</ref> on [[MD5]] and [[SHA-1]] are [[collision attack]]s{{Citation needed|date=March 2017}}. In general, a collision attack is easier to mount than a preimage attack, as it is not restricted by any set value (any two values can be used to collide). The time complexity of the collision attack, in contrast, is 2<sup>''n''/2</sup>.
| url=https://casecurity.org/2014/01/30/why-we-need-to-move-to-sha-2/
| title=Why We Need to Move to SHA-2
|author=Bruce Morton |author2=Clayton Smith
| publisher=[[Certificate Authority Security Council]]
| date=2014-01-30}}</ref><ref>{{cite web
| url=https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
| title=Google Online Security Blog: Announcing the first SHA1 collision
| access-date=2017-02-23}}</ref> on [[MD5]] and [[SHA-1]] are [[collision attack]]s.<ref>{{cite web |date=2009-01-01 |title=MD5 and Perspectives |url=https://www.cs.cmu.edu/~perspectives/md5.html}}</ref> In general, a collision attack is easier to mount than a preimage attack, as it is not restricted by any set value (any two values can be used to collide). The time complexity of a brute-force collision attack, in contrast to the preimage attack, is only <math>2^{\frac{n}{2}}</math>.


===Restricted preimage space attacks===
== Usual solutions ==
The computational infeasibility of a first preimage attack on an ideal hash function assumes that the set of possible hash inputs is too large for a brute force search. However if a given hash value is known to have been produced from a set of inputs that is relatively small or is ordered by likelihood in some way, then a brute force search may be effective. Practicality depends on the input set size and the speed or cost of computing the hash function.


A common example is the use of hashes to store [[password]] validation data for authentication. Rather than store the plaintext of user passwords, an access control system stores a hash of the password. When a user requests access, the password they submit is hashed and compared with the stored value. If the stored validation data is stolen, the thief will only have the hash values, not the passwords. However most users choose passwords in predictable ways and many passwords are short enough that all possible combinations can be tested if fast hashes are used, even if the hash is rated secure against preimage attacks.<ref>{{cite web
To improve resistance to collision attacks, [[double hashing]] is a good solution {{Citation needed|reason=the pre-image resistance of any (iteratively applied) secure hash function should be 2^b tests of that (iteratively applied) hash function; double hashing only doubles the time required by the attack -- to actually slow things down, one should use a password hashing algorithm such as [[Lyra2]]|date=October 2018}} in case someone discovers a preimage attack on the first hash. The [[Bitcoin|Bitcoin system]] uses double hashed [[SHA256]]<ref>https://en.bitcoin.it/wiki/Block_hashing_algorithm</ref> that was a common way to slow down hashing searches in the 2000's.
| url=https://arstechnica.com/information-technology/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/
| title=25-GPU cluster cracks every standard Windows password in <6 hours
| date=2012-12-10
| first=Dan
| last=Goodin
| publisher=[[Ars Technica]]
| access-date=2020-11-23}}</ref> Special hashes called [[key derivation function]]s have been created to slow searches. ''See'' [[Password cracking]]. For a method to prevent the testing of short passwords see [[salt (cryptography)]].


== See also ==
== See also ==
* [[Birthday attack]]
* [[Birthday attack]]
* [[Cryptographic hash function]]
* [[Cryptographic hash function]]
* [[Hash function security summary]]
* [[Puzzle friendliness]]
* [[Rainbow table]]
* [[Rainbow table]]
* [[Random oracle]]
* [[Random oracle]]
* {{IETF RFC|4270}}: ''Attacks on Cryptographic Hashes in Internet Protocols''


== References ==
== References ==
* IETF RFC 4270: Attacks on Cryptographic Hashes in Internet Protocols
<references />
<references />


Line 37: Line 75:
{{DEFAULTSORT:Preimage Attack}}
{{DEFAULTSORT:Preimage Attack}}
[[Category:Cryptographic attacks]]
[[Category:Cryptographic attacks]]

{{crypto-stub}}

Latest revision as of 15:44, 13 April 2024

In cryptography, a preimage attack on cryptographic hash functions tries to find a message that has a specific hash value. A cryptographic hash function should resist attacks on its preimage (set of possible inputs).

In the context of attack, there are two types of preimage resistance:

  • preimage resistance: for essentially all pre-specified outputs, it is computationally infeasible to find any input that hashes to that output; i.e., given y, it is difficult to find an x such that h(x) = y.[1]
  • second-preimage resistance: for a specified input, it is computationally infeasible to find another input which produces the same output; i.e., given x, it is difficult to find a second input x′ ≠ x such that h(x) = h(x′).[1]

These can be compared with a collision resistance, in which it is computationally infeasible to find any two distinct inputs x, x that hash to the same output; i.e., such that h(x) = h(x′).[1]

Collision resistance implies second-preimage resistance. Second-preimage resistance implies preimage resistance only if the size of the hash function's inputs can be substantially (e.g., factor 2) larger than the size of the hash function's outputs.[1] Conversely, a second-preimage attack implies a collision attack (trivially, since, in addition to x, x is already known right from the start).

Applied preimage attacks[edit]

By definition, an ideal hash function is such that the fastest way to compute a first or second preimage is through a brute-force attack. For an n-bit hash, this attack has a time complexity 2n, which is considered too high for a typical output size of n = 128 bits. If such complexity is the best that can be achieved by an adversary, then the hash function is considered preimage-resistant. However, there is a general result that quantum computers perform a structured preimage attack in , which also implies second preimage[2] and thus a collision attack.

Faster preimage attacks can be found by cryptanalysing certain hash functions, and are specific to that function. Some significant preimage attacks have already been discovered, but they are not yet practical. If a practical preimage attack is discovered, it would drastically affect many Internet protocols. In this case, "practical" means that it could be executed by an attacker with a reasonable amount of resources. For example, a preimaging attack that costs trillions of dollars and takes decades to preimage one desired hash value or one message is not practical; one that costs a few thousand dollars and takes a few weeks might be very practical.

All currently known practical or almost-practical attacks[3][4] on MD5 and SHA-1 are collision attacks.[5] In general, a collision attack is easier to mount than a preimage attack, as it is not restricted by any set value (any two values can be used to collide). The time complexity of a brute-force collision attack, in contrast to the preimage attack, is only .

Restricted preimage space attacks[edit]

The computational infeasibility of a first preimage attack on an ideal hash function assumes that the set of possible hash inputs is too large for a brute force search. However if a given hash value is known to have been produced from a set of inputs that is relatively small or is ordered by likelihood in some way, then a brute force search may be effective. Practicality depends on the input set size and the speed or cost of computing the hash function.

A common example is the use of hashes to store password validation data for authentication. Rather than store the plaintext of user passwords, an access control system stores a hash of the password. When a user requests access, the password they submit is hashed and compared with the stored value. If the stored validation data is stolen, the thief will only have the hash values, not the passwords. However most users choose passwords in predictable ways and many passwords are short enough that all possible combinations can be tested if fast hashes are used, even if the hash is rated secure against preimage attacks.[6] Special hashes called key derivation functions have been created to slow searches. See Password cracking. For a method to prevent the testing of short passwords see salt (cryptography).

See also[edit]

  • Birthday attack
  • Cryptographic hash function
  • Hash function security summary
  • Puzzle friendliness
  • Rainbow table
  • Random oracle
  • RFC 4270: Attacks on Cryptographic Hashes in Internet Protocols

References[edit]