There is one cryptosystem that uses an NP-hard problem for its security: the McEliece cryptosystem. The problem used here is decoding a general error-correcting code, which is NP-hard in theory and appears to stay hard in practice if you use the right kind of code. It relies on very specific algorithmic parameters to stay hard, though. However, the versions that are not broken are likely post-quantum ready.
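For intuition, the underlying hard problem (generic syndrome decoding) is very concrete: given a binary parity-check matrix H and a syndrome s, find an error vector e of weight at most t with H·e = s over GF(2). A minimal brute-force sketch in Python, with toy parameters of my own choosing (nothing like Classic McEliece's real sizes):

    from itertools import combinations
    import random

    def syndrome_decode(H, s, t):
        """Brute-force generic decoding: find e with weight <= t and H.e = s (mod 2).
        The search space grows as sum_{w<=t} C(n, w), exponential for random codes."""
        n = len(H[0])
        for w in range(t + 1):
            for support in combinations(range(n), w):
                e = [1 if j in support else 0 for j in range(n)]
                if all(sum(H[r][j] * e[j] for j in range(n)) % 2 == s[r] for r in range(len(H))):
                    return e
        return None

    # Toy instance: random 8x16 parity-check matrix, planted error of weight 2.
    random.seed(1)
    n, k, t = 16, 8, 2
    H = [[random.randint(0, 1) for _ in range(n)] for _ in range(n - k)]
    e_true = [1 if j in (3, 11) else 0 for j in range(n)]
    s = [sum(H[r][j] * e_true[j] for j in range(n)) % 2 for r in range(n - k)]
    print(syndrome_decode(H, s, t))  # some error vector of weight <= 2 consistent with s

The McEliece trick is that the private key is a structured (Goppa) code with a fast decoder, published in scrambled form so that, hopefully, an attacker faces something that looks like the generic problem above.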
honzaik 5 hours ago [-]
afaik the "right kind of code" does a lot of heavy lifting for practical implementations, such as Classic McEliece.
correct me if I am wrong as I haven't spent much time looking into it, but the security analysis essentially says "we assume the Goppa code is indistinguishable from a random code, so the best attack is generic decoding of a random code (an NP-hard problem)". but there is no reduction showing that it is NP-hard to distinguish the Goppa code (the specific code used in Classic McEliece) from a random code.
the assumption is reasonable, as nobody has been able to find a distinguisher for decades. also, if a distinguisher exists, it doesn't translate into a direct attack against the system; it just means you cannot rule out "structural attacks" and appeal directly to the NP-hard problem.
less_less 4 hours ago [-]
Yeah that's right, there are no known cryptosystems whose security is based on the difficulty of solving an NP-hard problem. It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but that you can't sample the problems and their solution at the same time.
(And this is even with the simplification that polytime = practical and not-polytime = infeasible.)
aleph_minus_one 2 hours ago [-]
> It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but that you can't sample the problems and their solution at the same time.
Relevant paper:
Impagliazzo, R. A personal view of average-case complexity theory. In Proceedings of the 10th Annual Conference on Structure in Complexity Theory. IEEE Computer Society Press (1995), 134–147.
https://ieeexplore.ieee.org/document/514853
Non-paywall links:
- https://www.karlin.mff.cuni.cz/~krajicek/ri5svetu.pdf
- https://gwern.net/doc/cs/cryptography/1995-impagliazzo.pdf
Popular scientific articles on this topic:
- https://www.quantamagazine.org/which-computational-universe-...
- https://cacm.acm.org/research/fifty-years-of-p-vs-np-and-the...
The article is a bit confusing (to me as a layman). Basically, as I understand it, it's saying that the reason we can't form cryptosystems out of arbitrary NP-complete problems is that there exist randomized algorithms that work for 99% of inputs (average- vs. worst-case complexity, e.g. why the simplex algorithm works in practice). But there are some things missing:
* We know that randomized algorithms exist for NP-complete problems. But couldn't we have some problem in NP that, while reducible to e.g. 3SAT, always reduces to "very hard" instances?
* Conversely, there's no guarantee that there doesn't exist a randomized algorithm for something that's NP-hard (and not in NP), right?
The argument along the lines of NP-complete vs NP-hard seems to be a non-sequitur, since the actual distinction is average- vs. worst-case hardness, right? It may just so happen that all currently known problems in NP are easy "on average", but I don't think that's formally guaranteed.
Edit: https://theory.stanford.edu/~trevisan/average/slides.pdf seems to imply that (1) is an open question.
There has been quite some research on finding, in layman's terms, where the difficult instances of an NP-complete problem are.
What it comes down to is that there are phase transitions in computational systems that depend on how constrained the problem is.
Consider a sudoku puzzle. If the board is almost empty of numbers then it's easy to solve. If the board is almost full of numbers it's easy to solve. But there's some critical amount of numbers filled in (say half of them) that makes the sudoku most difficult to solve. That's where the phase transition is.
The notion of a phase transition comes from thermodynamics, where ice becomes water becomes gas.
Here's an early paper on phase transitions in AI.
https://www.sciencedirect.com/science/article/pii/0004370287...
Here's a later one on phase transitions for the satisfiability problem (SAT).
https://homepages.inf.ed.ac.uk/rbf/MY_DAI_OLD_FTP/rp679.pdf
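You can see the transition for yourself by generating random 3-SAT instances at varying clause-to-variable ratios and checking how often they are satisfiable. A rough sketch (the solver is a deliberately naive DPLL and the sizes are my own; the classic hard region for random 3-SAT sits around a ratio of roughly 4.27):

    import random

    def simplify(clauses, lit):
        """Assume literal lit is true: drop satisfied clauses, strip -lit from the rest.
        Returns None if some clause becomes empty (a conflict)."""
        out = []
        for c in clauses:
            if lit in c:
                continue
            reduced = [l for l in c if l != -lit]
            if not reduced:
                return None
            out.append(reduced)
        return out

    def dpll(clauses):
        """Naive DPLL: unit propagation plus branching on the first literal."""
        if not clauses:
            return True
        for c in clauses:
            if len(c) == 1:                     # unit clause: its literal is forced
                rest = simplify(clauses, c[0])
                return rest is not None and dpll(rest)
        lit = clauses[0][0]
        for choice in (lit, -lit):
            rest = simplify(clauses, choice)
            if rest is not None and dpll(rest):
                return True
        return False

    def random_3sat(n_vars, n_clauses):
        """Random 3-SAT: each clause picks 3 distinct variables with random signs."""
        return [[random.choice((1, -1)) * v
                 for v in random.sample(range(1, n_vars + 1), 3)]
                for _ in range(n_clauses)]

    n = 30
    for ratio in (3.0, 4.0, 4.3, 5.0, 6.0):
        sat = sum(dpll(random_3sat(n, int(ratio * n))) for _ in range(30))
        print(f"clauses/vars = {ratio:.1f}: {sat}/30 satisfiable")

Near the threshold the same naive solver also takes noticeably longer per instance, which is the "hard instances live at the phase transition" observation in miniature.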
In general, it's very hard, and a constantly moving target, to make 'hard' NP-complete problems.
For example, it's really nice to have a problem generator which makes a series of hard problems, for things like benchmarking. It's very common that someone comes up with such a system, and then someone else finds a way of solving "almost everything" the random generator makes, by finding a good way of finding solutions to the solvable problems and/or finding a way to easily prove that unsolvable problems have no solutions.
The second category tends to be more interesting: it ends up being really hard to make large random unsolvable problems which don't, with high probability, contain some kind of "flaw", where the flaw ends up being easy to find and easily proves the problem is unsolvable.
Vervious 8 hours ago [-]
1) It'd be really, really awesome to show average-case hardness of a natural problem (i.e. 3SAT on a natural distribution) from the worst-case hardness of that problem (i.e. 3SAT). I believe it's wide open in the context of building one-way functions.
2) Randomized polynomial-time algorithms don't exist for NP-hard problems unless P = NP (iirc).
I think you have the right intuition. The issue is that right now cryptography is built on top of relatively "fragile" "average-case hard" problems -- discrete log, lattice stuff, etc. They are hard by assumption and we really don't know how to study them. (I would bet on some of this stuff being broken in our lifetimes).
It'd be really nice to instead base cryptography on an assumption that we are much more confident in being true, i.e. worst-case hardness of 3SAT. (I mean, who is going to prove P=NP). There are two steps:
1) First, show cryptography from average-case hardness of a "more believable" problem.
2) Second, to show average-case hardness of that problem is implied by the worst-case hardness of an NP-complete problem.
(1) is sort of better studied now, with a line of work on meta-complexity. (2) I believe is wide open, as previously mentioned.
Anyways I'm not really an expert in this area, but it's really quite fascinating.
strulovich 9 hours ago [-]
Generally, all problems used for crypto are in NP, since given the secret you need to be able to compute the answer efficiently.
It's just that being NP-hard or NP-complete is not enough.
The theoretical definition of one-way functions is what's used to define cryptography, so reading up on that may clarify this some more.
cubefox 3 minutes ago [-]
Does this mean there is a relationship between NP completeness and worst case time complexity?
stncls 4 hours ago [-]
> Instead, cryptography needs problems that are hard in the average case, like the RSA problem, the discrete logarithm for elliptic curves, and the shortest vector problem for lattices. We don’t technically know whether these are NP-complete
But we do know that the shortest vector problem is NP-hard (in Linf norm), and so is its decision version [1]. We don't have a reduction in the other direction, but we know that SVP is as-hard-as-NP-complete-or-harder.
This goes against the general argument of the article, but only weakly so, because I think lattice-based systems are less broadly deployed than elliptic curves or factoring (not a crypto person, though).
[1] H. Bennett, "The Complexity of the Shortest Vector Problem.", 2022 https://simons.berkeley.edu/sites/default/files/docs/21271/s...
> For example, it could be possible that, if we generated a random map in some way, 99% percent of the time, the three-color map coloring problem would be easy, but 1% of the time, it would be very difficult.
> This would still be NP-complete, but would make for a terrible cryptosystem - a scheme based on this problem would be easily broken 99% of the time!
I don't think this explanation is great. We could just generate instances until we found a hard one and then use it. So the real answer is more complex than what's written here.
cobbal 12 hours ago [-]
The point is not that all NP-hard problems make bad cryptosystems, but rather that the NP-hardness guarantee fails to translate into a useful guarantee for a cryptosystem.
It's about the proof, not the computation.
https://en.wikipedia.org/wiki/NP-hardness
ethanwillis 11 hours ago [-]
Can you explain your usage of NP hard in your comment?
saghm 11 hours ago [-]
A problem being NP-hard is one of the two requirements for being NP-complete (the other being that it's in NP). If I remember correctly, problems that are NP-hard without being in NP are undecidable, so in contexts like "what problems can we base cryptography on?", it's basically the same as saying NP-complete, since we aren't able to base cryptography on undecidable problems.
Not undecidable; it's just not possible to verify whether an answer is correct in polynomial time. There are problems known to be decidable, known to be NP-hard, and outside of NP (e.g. in NEXPTIME).
ulrikrasmussen 5 hours ago [-]
You don't know if the one you generated is hard. Graph coloring is easy for large but trivially connected graphs, and many, many problem instances turn out to have structure that can be exploited to solve them efficiently. You would basically need a procedure for determining a lower bound on the runtime of all known algorithms for the problem instance, which is most likely very difficult; and even if that were possible, you don't know whether your particular class of generated problems turns out to be easily solvable by a yet-to-be-discovered algorithm.
Amateur observation, take with a grain of salt: On the other hand, I guess we are doing something similar when generating a new public/private key pair on an elliptic curve. It just turns out that finite fields have very little (known) structure to exploit compared to e.g. colored graphs, so we have more confidence in the assumption that the generated discrete logarithm problem is indeed hard.
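For comparison, "sampling an instance" for elliptic-curve discrete log really is just picking a random scalar and multiplying the standard base point by it. A minimal, unhardened sketch over secp256k1 (illustration only, not for real keys):

    import secrets

    # secp256k1 domain parameters: y^2 = x^3 + 7 over the prime field F_P
    P = 2**256 - 2**32 - 977
    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
    G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
         0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

    def ec_add(a, b):
        """Affine point addition; None stands for the point at infinity."""
        if a is None: return b
        if b is None: return a
        (x1, y1), (x2, y2) = a, b
        if x1 == x2 and (y1 + y2) % P == 0:
            return None
        if a == b:
            m = (3 * x1 * x1) * pow(2 * y1, -1, P) % P
        else:
            m = (y2 - y1) * pow(x2 - x1, -1, P) % P
        x3 = (m * m - x1 - x2) % P
        return (x3, (m * (x1 - x3) - y1) % P)

    def ec_mul(k, point):
        """Double-and-add scalar multiplication."""
        result = None
        while k:
            if k & 1:
                result = ec_add(result, point)
            point = ec_add(point, point)
            k >>= 1
        return result

    # Key generation: the private key is a uniformly random scalar, the public key is d*G.
    d = secrets.randbelow(N - 1) + 1
    Q = ec_mul(d, G)
    print("private d:", hex(d))
    print("public Q :", tuple(map(hex, Q)))
    # Recovering d from (G, Q) is the elliptic-curve discrete logarithm problem.

The hope is exactly what the comment describes: the only structure an attacker seems able to exploit in (G, Q) is the generic group structure, for which the best known attacks take time around the square root of the group order.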
Vervious 11 hours ago [-]
how do you know if an instance you picked is hard?
Another problem with your suggestion: Do we even know that a 1/polynomial sized fraction of 3-coloring instances are hard? (Else you'll never sample one). (NP-Hardness certainly does not give such a guarantee, even though it sounds weak)
goalieca 11 hours ago [-]
RSA is kind of shaky like this. You generate a key and then run some tests hoping it's "good". Most of the time it isn't, so you try again. Many products have failed because their keys were not.
adgjlsfhk1 9 hours ago [-]
This mostly isn't true. For much smaller keys there were risks of "weak primes", but as general-purpose factoring algorithms got better and pushed keys bigger, a higher and higher percentage of primes ends up safe. With 2048-bit primes, the odds of your prime being significantly weaker than average drop to something like 1 in 2^1024.
aleph_minus_one 5 hours ago [-]
What is a "weak prime" here (i.e. by which properties is such a prime number characterized)?
less_less 4 hours ago [-]
Suppose that p-1 has no large prime factors, e.g. suppose p-1 divides 10000!. Then you can factor N by calculating something like x = random() ^ (10000!) % N. Then x will be 1 mod p (or rarely 0, if the random base was), but probably not mod q, unless q has the same property. Then gcd(x-1, N) = p, and you've factored N. A similar trick works with Lucas sequences if p+1 has no large prime factors. So instead of attacking your key with a very expensive algorithm that doesn't care about special properties of p and q, an attacker might gamble that your key is weak and try a cheaper algorithm first. So folks would avoid keys where p+1 or p-1 was "smooth" in this way.
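A minimal sketch of that p-1 idea (Pollard's p-1 method), with base 2 and a tiny smoothness bound B standing in for the 10000! above; the toy primes are my own, chosen so that one of them is deliberately weak:

    from math import gcd

    def pollard_p_minus_1(n, B=20):
        """If some prime p | n has a B-smooth p-1, then 2^(B!) = 1 (mod p),
        so gcd(2^(B!) - 1, n) exposes p without any general-purpose factoring."""
        a = 2
        for k in range(2, B + 1):
            a = pow(a, k, n)              # after the loop, a = 2^(B!) mod n
        g = gcd(a - 1, n)
        return g if 1 < g < n else None   # None: no luck (or every prime factor was smooth)

    p = 8191     # weak for this bound: p - 1 = 2 * 3^2 * 5 * 7 * 13, entirely 13-smooth
    q = 1000003  # strong for this bound: q - 1 = 2 * 3 * 166667, and 166667 is prime
    print(pollard_p_minus_1(p * q))       # prints 8191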
However, we don't care much about this attack anymore for two reasons. One is that N must now be big enough to resist attack by the Number Field Sieve (and not just the Quadratic Sieve). At least if there are only two primes in the RSA key, this means that p,q must be big enough for these attacks to be very unlikely to work. But just as importantly, it turns out that you can do the above attack with exponentiation on a random elliptic curve instead of regular exponentiation / Lucas sequences. This is how most math libraries implement factor(). It's slightly slower but since the random elliptic curve will have a random order instead of exactly p+1 or p-1, it is equally likely to work with any p,q of a given size, regardless of special properties they might have.
In other words, an attacker can re-randomize how weak or strong a key is by using elliptic curves. So there's not much point in avoiding ones that look weak at first glance.
https://en.wikipedia.org/wiki/RSA_(cryptosystem)
asimpletune 4 hours ago [-]
A prime pair whose product is easily factorable
alexey-salmin 12 hours ago [-]
I'd also note that zero-knowledge proofs (which are considered a branch of cryptography) often do rely on NP-complete problems, like the three-color map problem.
Vervious 11 hours ago [-]
That's completely unrelated: the reliance there is because you want to be able to prove "any statement in NP" in zero knowledge. It then suffices to give a protocol for just 3-colorings, because "any statement in NP" can then be reduced to a 3-coloring.
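For concreteness, here is a toy sketch of one round of the classic commit-challenge-reveal protocol for 3-coloring, using hash commitments; the graph, coloring, and parameters are my own, and a real protocol needs proper commitments and many repeated rounds:

    import hashlib, secrets, random

    def commit(value):
        """Hash commitment to a small integer: hiding via a random nonce, binding via SHA-256."""
        nonce = secrets.token_bytes(16)
        return hashlib.sha256(nonce + bytes([value])).hexdigest(), nonce

    def opens_to(com, nonce, value):
        return hashlib.sha256(nonce + bytes([value])).hexdigest() == com

    # Prover's secret: a proper 3-coloring of a small graph.
    edges    = [(0, 1), (1, 2), (2, 0), (2, 3)]
    coloring = {0: 0, 1: 1, 2: 2, 3: 0}

    def one_round(edges, coloring):
        # Prover: randomly relabel the three colors, then commit to every vertex's color.
        perm = random.sample(range(3), 3)
        hidden = {v: perm[c] for v, c in coloring.items()}
        commitments = {v: commit(c) for v, c in hidden.items()}
        # Verifier: challenge one uniformly random edge.
        u, v = random.choice(edges)
        # Prover: open only those two commitments.
        (cu, nu), (cv, nv) = commitments[u], commitments[v]
        # Verifier: accept iff both openings are valid and the endpoint colors differ.
        return (opens_to(cu, nu, hidden[u]) and opens_to(cv, nv, hidden[v])
                and hidden[u] != hidden[v])

    # An honest prover passes every round.
    print(all(one_round(edges, coloring) for _ in range(100)))  # True

A prover who doesn't actually have a valid coloring gets caught with probability at least 1/len(edges) per round, so repeating the round enough times makes cheating overwhelmingly unlikely, while the verifier only ever sees two freshly permuted colors.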
ars 11 hours ago [-]
How do you know which ones are the hard ones? I don't think it's easy to figure that out.
ccppurcell 7 hours ago [-]
The keyword not mentioned here is one-way function. Cryptography is for the most part based on candidates for one-way functions: functions that are easy to perform on any input but difficult to undo on average. The existence of such a function would imply that P is not NP but the converse doesn't hold.
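A toy sketch of what "easy to perform, difficult to undo" means, using modular exponentiation (whose inverse is the discrete logarithm) as the candidate; the tiny prime is my own choice, picked so that the brute-force inversion actually finishes:

    import secrets

    def f(x, g, p):
        """Candidate one-way function: modular exponentiation.
        Forward needs only about log2(x) multiplications; undoing it is a discrete log."""
        return pow(g, x, p)

    def invert_by_brute_force(y, g, p):
        """Exponential-time inversion: try every exponent in turn."""
        acc = 1
        for x in range(p):
            if acc == y:
                return x
            acc = acc * g % p
        return None

    p, g = 1000003, 5                 # toy prime; real systems use ~2048-bit p or elliptic curves
    x = secrets.randbelow(p - 1)
    y = f(x, g, p)                    # instant, even for enormous p
    x_found = invert_by_brute_force(y, g, p)
    print(f(x_found, g, p) == y)      # True, but only feasible because p is tiny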
delifue 12 hours ago [-]
The RSA problem in the article doesn't mention that RSA's difficulty is based on factoring the MODULUS (a product of two large primes), not simple prime factorization.
There's a footnote along these lines that links to the actual algorithm.
LegionMammal978 11 hours ago [-]
Are there any known attack methods that don't involve factoring the public key into two primes?
bmenrigh 11 hours ago [-]
Yeah but they usually require RSA to be used in some rather unusual and bad way.
For example, encrypting one message with many different public keys can be broken with the Chinese remainder theorem and Nth roots. This reveals the message without factoring any key. This is why randomized padding (among other things) is a must with RSA.
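A minimal sketch of that broadcast attack (the Håstad-style e = 3 case), with toy moduli and a toy message of my own choosing:

    from math import prod

    def crt(residues, moduli):
        """Chinese remainder theorem for pairwise-coprime moduli."""
        M = prod(moduli)
        x = 0
        for r, m in zip(residues, moduli):
            Mi = M // m
            x += r * Mi * pow(Mi, -1, m)
        return x % M

    def icbrt(x):
        """Exact integer cube root (x is assumed to be a perfect cube here)."""
        lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
        while lo < hi:
            mid = (lo + hi) // 2
            if mid ** 3 < x:
                lo = mid + 1
            else:
                hi = mid
        return lo

    # The same unpadded message, e = 3, sent to three recipients with different (toy) moduli.
    e = 3
    moduli = [101 * 113, 107 * 131, 137 * 149]
    m = 9999
    ciphertexts = [pow(m, e, n) for n in moduli]

    # Attack: CRT the ciphertexts together. Because m^3 < n1*n2*n3, the combined value
    # is m^3 as an ordinary integer, so an integer cube root recovers m. No factoring needed.
    recovered = icbrt(crt(ciphertexts, moduli))
    print(recovered == m)   # True

With proper randomized padding the three recipients never see the same plaintext, so the CRT trick has nothing to combine.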
saurik 7 hours ago [-]
This seems like a definition issue, with respect to the set from which we are randomly sampling... what constitutes the problem? If you randomly select numbers and try to factor them--which to me seems like a fair way to define the problem: factoring, not two-primes factoring--the vast majority of numbers are, in fact, very easy to factor ;P. Instead, we say we only want to randomly select a number which is only two primes multiplied together in order to find the juicy hard versions of the problem. (And, even then, finding those numbers isn't precise! We rely on probabilistic primality tests to weed out numbers that are probably not prime.)
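That sampling procedure is worth seeing spelled out: we don't search for hard instances, we construct them, using a probabilistic primality test (Miller-Rabin) in a retry loop. A rough sketch (bit sizes and round counts are just illustrative):

    import secrets

    def is_probable_prime(n, rounds=40):
        """Miller-Rabin: declares 'probably prime' with error probability at most 4^-rounds."""
        if n < 2:
            return False
        for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
            if n % small == 0:
                return n == small
        d, s = n - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        for _ in range(rounds):
            a = 2 + secrets.randbelow(n - 3)          # random base in [2, n-2]
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False                          # a witnesses that n is composite
        return True

    def random_prime(bits):
        """Sample random odd candidates of the right size until one passes the test."""
        while True:
            c = secrets.randbits(bits) | (1 << (bits - 1)) | 1
            if is_probable_prime(c):
                return c

    # The hard instance is constructed, not stumbled upon: N is (we believe) hard to factor
    # precisely because we built it as a product of two large random primes.
    p, q = random_prime(512), random_prime(512)
    N = p * q
    print(N.bit_length())   # about 1024

Each Miller-Rabin round that a composite survives happens with probability at most 1/4, so 40 rounds make a false "prime" astronomically unlikely, but never impossible -- that's the "isn't precise" part.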
Similarly, who is to say that there isn't some way to define what the hard instances of NP-complete problems are, and we just should be sampling from that difficult region of the problem space instead of all possible problems, the same way we sample only from the difficult-to-factor coprimes instead of from all possible numbers? There is definitely something interesting going on with the harder instances of these problems, and researchers have managed to discover a kind of "phase transition" among instances of some of the problems between ones with so many solutions they are easy and ones with so few solutions they are also once again easy (and my knowledge on this is about 20 years rusty and outdated... maybe this is better understood and characterized now).
praptak 6 hours ago [-]
Another problem with NP-hard is that it only covers deterministic algorithms.
Even if most instances of a problem are deterministically hard, they might still have a randomized algorithm which solves them feasibly.
tyilo 2 hours ago [-]
It might also be the case that P=NP
userbinator 12 hours ago [-]
Aren't hash functions a counterexample? There have been attempts at using SAT solvers to find preimage and collision attacks on them.
less_less 4 hours ago [-]
Breaking hash functions is in NP, but isn't expected to be NP-complete, meaning that you can't easily encode e.g. a SAT problem into a hash function such that breaking the hash solves the SAT problem.
You can do the reverse (encode the hash into a SAT problem), but that's possible for any NP problem, because SAT is NP-complete.
memkit 1 hours ago [-]
Your last statement is misleading. A problem being NP-complete isn't exactly the property that allows you to reduce any NP problem to it. Suppose there was a complexity class MP-hard that has no efficient reduction to an NP-complete problem. Then a problem being MP isn't what allows me to write a reduction to the MP-hard problem; I could just as easily write a reduction from a problem in P to the MP-hard problem. Your statement is misleading but incidentally correct because the true condition (NP-hard or easier) happens to be equivalent to your stated condition (NP) for this particular complexity class. It would be clearer to simply state that you can always reduce an easy problem to a harder one.
elchananHaas 10 hours ago [-]
Yes, this article is only talking about asymmetric cryptography. Symmetric cryptography and hashes are based on NP-complete problems. This is only when generalized appropriately; most individual hashes/ciphers are fixed, so there isn't really a way to talk about their complexity. The broader class of block-cipher-like problems is NP-complete, and very hard.
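To make the "verifying is easy, searching is (hopefully) hard" point concrete for hashes: checking a claimed preimage is one evaluation, while the only generic attack is exhaustive search. A toy sketch against a deliberately truncated SHA-256 (the truncation, message, and search space are my own):

    import hashlib
    from itertools import product

    TARGET_BITS = 20   # truncate the hash so the toy search actually finishes

    def h(msg: bytes) -> int:
        """First TARGET_BITS bits of SHA-256 -- a stand-in for a weakened hash."""
        return int.from_bytes(hashlib.sha256(msg).digest(), "big") >> (256 - TARGET_BITS)

    target = h(b"secret")

    # NP-style verification: given a candidate preimage (the witness), checking is one hash call.
    print(h(b"secret") == target)   # True, and instant

    # Generic attack: no structure to exploit (we hope), so try candidates one by one.
    for candidate in product(range(256), repeat=3):
        if h(bytes(candidate)) == target:
            print("found a preimage of the truncated hash:", bytes(candidate))
            break

Halving TARGET_BITS makes the search trivial and adding a few bits makes it hopeless on a laptop, which is the design goal: verification stays one hash call while the generic search cost scales as 2^bits.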
westurner 10 hours ago [-]
SAT Solvers, GA, LLMs, brute force
Birthday paradox probability changes with memoization.
Rainbow tables trade precomputation (CPU) and storage for fast lookups; salting, double hashing, and key derivation functions with many rounds (like PBKDF2 and Argon2) are the countermeasures.
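For what it's worth, a minimal sketch of the salt-plus-many-rounds idea using the standard library's PBKDF2 (the iteration count here is just an illustrative choice):

    import hashlib, hmac, secrets

    def hash_password(password: str, iterations: int = 600_000):
        """Salted PBKDF2-HMAC-SHA256: the per-user salt defeats precomputed tables,
        and the iteration count makes every guess expensive."""
        salt = secrets.token_bytes(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        return salt, iterations, digest

    def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
        return hmac.compare_digest(candidate, digest)

    salt, rounds, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, rounds, stored))  # True
    print(verify_password("hunter2", salt, rounds, stored))                       # False

The per-user random salt means a precomputed table would have to be rebuilt for every salt value, and the iteration count multiplies the cost of every guess an attacker makes.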
tines 12 hours ago [-]
Some of the graphics have black text on a transparent background and so are hard to read with a dark theme.
tofof 11 hours ago [-]
It's unfortunate that he doesn't understand complexity analysis and has taken it upon himself to write an article that requires it as an underpinning.
In particular, I stopped reading at
> the discrete logarithm for elliptic curves, ... We don’t technically know whether these are NP-complete
This is wildly misleading. We do know with absolute certainty that the decision problem for DLC is NP-Hard, and therefore that EPDLC inherits at least NP-Hardness. While it is true that their presence in NP-Hard does not require their presence in NP (and thus are not proved NP Complete), it is misleading to characterize them, as the author has here, as weaker than NP, instead of acknowledging the known certainty that NP is the absolute minimum bound for their hardness.
cevi 10 hours ago [-]
If the discrete logarithm problem is NP-hard, then I will eat my hat. The discrete logarithm problem can be solved by Shor's algorithm on a quantum computer, placing it in the complexity class BQP. Anyone who claims that BQP contains an NP-hard problem is selling something - I would bet at a trillion-to-one odds against such a claim (if there were any hope of definitively settling the problem).
memkit 56 minutes ago [-]
> While it is true that their presence in NP-Hard does not require their presence in NP (and thus are not proved NP Complete)
You're confused here. The two conditions for a problem being NP-complete are (1) it being NP-hard and (2) it being in NP.
You suggest (2) is the issue, but usually it's harder to prove (1) than (2). In the context of factorization problems, the factors are simply the certificate that satisfies condition (2).
Vervious 11 hours ago [-]
Not sure it's misleading; he did write the word "technically", and anyone who knows what NP-complete is knows that NP-hard does not necessarily mean NP-complete. I am a cryptographer and the article is fine.
Also, do you have a citation for "We do know with absolute certainty that the decision problem for DLC is NP-Hard"?
See e.g. https://cs.stackexchange.com/a/2765
JohnKemeny 8 hours ago [-]
DLC is in NP and co-NP. Very unlikely to be NP-hard. It is usually listed as one of the candidates for problems that are NP-intermediate, i.e., problems in between P and NP-hard (should they be different).