Why cryptography is not based on NP-complete problems (blintzbase.com)
120 points by blintz 143 days ago | 56 comments
pclmulqdq 143 days ago [-]
There is one cryptosystem that uses an NP-hard problem for its security: the McEliece cryptosystem. The problem used here is decoding a general linear error-correcting code, which is NP-hard in theory and can be kept hard in practice if you use the right kind of code. It relies on very specific parameter choices to stay hard, though. However, the versions that are not broken are likely post-quantum ready.
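
To make that structure concrete, here is a toy sketch in Python, with a [7,4] Hamming code standing in for the binary Goppa code (absurdly small and insecure, purely to show the scramble-and-permute trick):

    import numpy as np

    rng = np.random.default_rng(1)

    def gf2_inv(M):
        # invert a square matrix over GF(2) by Gauss-Jordan; None if singular
        n = len(M)
        A = np.concatenate([M.copy(), np.eye(n, dtype=np.uint8)], axis=1)
        for col in range(n):
            pivot = next((r for r in range(col, n) if A[r, col]), None)
            if pivot is None:
                return None
            A[[col, pivot]] = A[[pivot, col]]
            for r in range(n):
                if r != col and A[r, col]:
                    A[r] ^= A[col]
        return A[:, n:]

    # Secret code: [7,4] Hamming, corrects 1 error (real McEliece: Goppa codes).
    G = np.array([[1,0,0,0,1,1,0],
                  [0,1,0,0,1,0,1],
                  [0,0,1,0,0,1,1],
                  [0,0,0,1,1,1,1]], dtype=np.uint8)
    H = np.array([[1,1,0,1,1,0,0],
                  [1,0,1,1,0,1,0],
                  [0,1,1,1,0,0,1]], dtype=np.uint8)  # H @ G.T == 0 (mod 2)

    # Key generation: hide G behind a random scrambler S and permutation P.
    while True:
        S = rng.integers(0, 2, (4, 4), dtype=np.uint8)
        S_inv = gf2_inv(S)
        if S_inv is not None:
            break
    P = np.eye(7, dtype=np.uint8)[:, rng.permutation(7)]
    G_pub = (S @ G @ P) % 2   # public key: meant to look like a random code

    # Encrypt: a codeword of the public code plus a deliberate weight-1 error.
    m = np.array([1, 0, 1, 1], dtype=np.uint8)
    e = np.zeros(7, dtype=np.uint8)
    e[rng.integers(7)] = 1
    c = (m @ G_pub + e) % 2

    # Decrypt: undo P, fix the error with the secret code's decoder, undo S.
    c1 = (c @ P.T) % 2    # P^-1 = P.T for a permutation matrix
    s = (H @ c1) % 2      # syndrome = the column of H at the error position
    if s.any():
        c1[next(j for j in range(7) if (H[:, j] == s).all())] ^= 1
    assert ((c1[:4] @ S_inv) % 2 == m).all()  # G is systematic: first 4 bits = m @ S
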
honzaik 143 days ago [-]
afaik the "right kind of code" does a lot of heavy lifting for practical implementations, such as Classic McEliece.

correct me if I am wrong, as I haven't spent much time looking into it, but the security analysis essentially says "we assume the Goppa code is indistinguishable from a random code, so the best attack is generic decoding of a random code (an NP-hard problem)". but there is no reduction to some NP-hard problem showing that the Goppa code (the specific code used in Classic McEliece) actually is indistinguishable.

the assumption is reasonable, as nobody has been able to find a distinguisher for decades. also, if a distinguisher exists, it doesn't translate into a direct attack against the system; it just means you cannot rule out "structural attacks" and appeal directly to the NP-hard problem.

less_less 143 days ago [-]
Yeah, that's right: there are no known cryptosystems whose security is based on the difficulty of solving an NP-hard problem. It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but for which you can't sample a problem together with its solution.

(And this is even with the simplification that polytime = practical and not-polytime = infeasible.)

aleph_minus_one 143 days ago [-]
> It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but for which you can't sample a problem together with its solution.

Relevant paper:

Impagliazzo, R. A personal view of average-case complexity theory. In Proceedings of the 10th Annual Conference on Structure in Complexity Theory. IEEE Computer Society Press (1995), 134–147.

https://ieeexplore.ieee.org/document/514853

Non-paywall links:

- https://www.karlin.mff.cuni.cz/~krajicek/ri5svetu.pdf

- https://gwern.net/doc/cs/cryptography/1995-impagliazzo.pdf

Popular scientific articles on this topic:

- https://www.quantamagazine.org/which-computational-universe-...

- https://cacm.acm.org/research/fifty-years-of-p-vs-np-and-the...

krackers 143 days ago [-]
The article is a bit confusing (to me as a layman). Basically, as I understand it, it's saying that the reason we can't build cryptosystems out of arbitrary NP-complete problems is that there exist randomized algorithms that work for 99% of inputs (average- vs. worst-case complexity, e.g. why the simplex algorithm works in practice). But there are some things missing:

* We know that algorithms exist which solve most instances of NP-complete problems. But couldn't we have some problem in NP that, while reducible to e.g. 3SAT, always reduces to "very hard" instances?

* Conversely, there's no guarantee that there doesn't exist a randomized algorithm for something that's NP-hard (and not in NP), right?

The argument along the lines of NP-complete vs. NP-hard seems to be a non sequitur, since the actual distinction is average- vs. worst-case hardness, right? It may just so happen that all currently known problems in NP are easy "on average", but I don't think that's formally guaranteed.

Edit: https://theory.stanford.edu/~trevisan/average/slides.pdf seems to imply that (1) is an open question.

nemoniac 143 days ago [-]
There has been quite some research on finding, in layman's terms, where the difficult instances of an NP-complete problem are.

What it comes down to is that there are phase transitions in computational systems that depend on how constrained the problem is.

Consider a sudoku puzzle. If the board is almost empty of numbers then it's easy to solve. If the board is almost full of numbers it's easy to solve. But there's some critical amount of numbers filled in (say half of them) that makes the sudoku most difficult to solve. That's where the phase transition is.

The notion of a phase transition comes from thermodynamics where ice becomes water becomes gas.
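
You can see the transition with a quick experiment (a rough sketch; brute force, so only toy sizes):

    import random
    from itertools import product

    def random_3sat(n_vars, n_clauses, rng):
        # clause = 3 distinct variables with random signs, e.g. [2, -5, 9]
        return [[(v + 1) * rng.choice([-1, 1])
                 for v in rng.sample(range(n_vars), 3)]
                for _ in range(n_clauses)]

    def satisfiable(clauses, n_vars):
        # brute force over all 2^n assignments -- viable only for tiny n
        return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c)
                       for c in clauses)
                   for bits in product([False, True], repeat=n_vars))

    rng = random.Random(0)
    n = 12
    for ratio in (2.0, 3.0, 4.0, 4.3, 5.0, 6.0):
        sat = sum(satisfiable(random_3sat(n, round(ratio * n), rng), n)
                  for _ in range(20))
        print(f"clauses/vars = {ratio}: {sat}/20 satisfiable")

Under-constrained instances (low ratio) are almost all satisfiable and over-constrained ones almost never are; the hard instances cluster near the crossover, which for random 3-SAT sits around 4.27 clauses per variable.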

Here's an early paper on phase transitions in AI.

https://www.sciencedirect.com/science/article/pii/0004370287...

Here's a later one on phase transitions for the satisfiability problem (SAT).

https://homepages.inf.ed.ac.uk/rbf/MY_DAI_OLD_FTP/rp679.pdf

Vervious 143 days ago [-]
1) It'd be really, really awesome to show average-case hardness of a natural problem (i.e. 3SAT on a natural distribution) from the worst-case hardness of that problem (i.e. 3SAT). I believe it's wide open in the context of building one-way functions.

2) Randomized polynomial-time algorithms don't exist for NP-hard problems unless NP is contained in BPP, which is believed to be about as unlikely as P = NP (iirc).

I think you have the right intuition. The issue is that right now cryptography is built on top of relatively "fragile" "average-case hard" problems -- discrete log, lattice stuff, etc. They are hard by assumption and we really don't know how to study them. (I would bet on some of this stuff being broken in our lifetimes).

It'd be really nice to instead base cryptography on an assumption that we are much more confident is true, i.e. worst-case hardness of 3SAT. (I mean, who is going to prove P = NP?) There are two steps:

1) First, show cryptography from average-case hardness of a "more believable" problem.

2) Second, to show average-case hardness of that problem is implied by the worst-case hardness of an NP-complete problem.

(1) is sort of better studied now, with a line of work on meta-complexity. (2) I believe is wide open, as previously mentioned.

Anyways I'm not really an expert in this area, but it's really quite fascinating.

CJefferson 143 days ago [-]
In general, it's very hard, and a constantly moving target, to generate 'hard' instances of NP-complete problems.

For example, it's really nice to have a problem generator which makes a series of hard problems, for things like benchmarking. It's very common that someone comes up with such a generator, and then someone else ends up finding a way of solving "almost everything" it makes, by finding a good way of finding solutions to the solvable problems, and/or finding a way to easily prove that the unsolvable problems have no solutions.

The second category tends to be more interesting: it ends up being really hard to make large random unsolvable problems which don't, with high probability, include some kind of "flaw", where the flaw ends up being easy to find and easily proves the problem is unsolvable.

strulovich 143 days ago [-]
Generally, all problems used for crypto are in NP, since, given the secret, you need to be able to compute the answer efficiently.

It's just that being NP-hard or NP-complete is not enough.

The theoretical definition of one-way functions is what's used to define cryptography, so reading up on that may clarify this some more.

zeroonetwothree 143 days ago [-]
> For example, it could be possible that, if we generated a random map in some way, 99% percent of the time, the three-color map coloring problem would be easy, but 1% of the time, it would be very difficult.

> This would still be NP-complete, but would make for a terrible cryptosystem - a scheme based on this problem would be easily broken 99% of the time!

I don't think this explanation is great. We could just generate instances until we found a hard one and then use it. So the answer is more complex than just as written.

cobbal 143 days ago [-]
The point is not that all NP-hard problems make bad cryptosystems, but rather that the NP-hardness guarantee fails to translate into a useful guarantee for a cryptosystem.

It's about the proof, not the computation.

ethanwillis 143 days ago [-]
Can you explain your usage of NP hard in your comment?
saghm 143 days ago [-]
A problem being NP-hard is one of the two requirements for being NP-complete (the other being that it's in NP). If I remember correctly, problems that are NP-hard without being in NP are undecidable, so in contexts like "what problems can we base cryptography on?", it's basically the same as saying NP-complete, since we aren't able to base cryptography on undecidable problems.

https://en.wikipedia.org/wiki/NP-hardness

gpm 143 days ago [-]
Not undecidable; it's just not possible to verify whether an answer is correct in polynomial time. There are problems known to be decidable, known to be NP-hard, and outside of NP (e.g. in NEXPTIME).
saghm 142 days ago [-]
Good to know! I guess I didn't recall as well as I thought.
Vervious 143 days ago [-]
how do you know if an instance you picked is hard?

Another problem with your suggestion: do we even know that a 1/polynomial-sized fraction of 3-coloring instances are hard? (Otherwise you'll never sample one.) NP-hardness certainly does not give such a guarantee, even though that sounds like a weak requirement.

ulrikrasmussen 143 days ago [-]
You don't know if the one you generated is hard. Graph coloring is easy for large but trivially connected graphs, and many, many problem instances turn out to have structure that can be exploited to solve them efficiently. You would basically need a procedure for lower-bounding the runtime of every known algorithm on your particular problem instance, which is most likely very difficult; and even if that were possible, you wouldn't know whether your particular class of generated problems turns out to be easily solvable by a yet-to-be-discovered algorithm.

Amateur observation, take with a grain of salt: on the other hand, I guess we are doing something similar when generating a new public/private key pair on an elliptic curve. It just turns out that elliptic curve groups have very little (known) structure to exploit compared to e.g. colored graphs, so we have more confidence in the assumption that the generated discrete logarithm problem is indeed hard.
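
Roughly what that sampling looks like, sketched in a toy mod-p group rather than an actual elliptic curve (made-up, toy-sized parameters):

    import secrets

    p = 2**61 - 1    # a Mersenne prime; toy-sized, breakable in practice
    g = 3            # fixed base (not checked to be a generator -- sketch only)
    x = secrets.randbelow(p - 2) + 1   # private key
    y = pow(g, x, p)                   # public key
    # each keypair is a freshly sampled instance of "given (p, g, y), find x"

Every key generation is a draw from the problem distribution; the hope, exactly as with the graphs above, is that the drawn instance is hard.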

goalieca 143 days ago [-]
RSA is kind of shaky like this. You generate a key and then run some tests, hoping it's "good". Most of the time it isn't, so you try again. Many products have failed because their keys were not.
adgjlsfhk1 143 days ago [-]
This mostly isn't true. For much smaller keys there were risks of "weak primes", but as general-purpose factoring algorithms got better and pushed keys bigger, a higher and higher percentage of primes end up safe. With 2048-bit primes, the odds of your prime being significantly weaker than average drop to something like 1 in 2^1024.
aleph_minus_one 143 days ago [-]
What is a "weak prime" here (i.e. by which properties is such a prime number characterized)?
less_less 143 days ago [-]
Suppose that p-1 has no large prime factors, e.g. suppose p-1 divides 10000!. Then you can factor N by calculating something like x = random() ^ (10000!) % N. Then x will be 1 mod p (or rarely 0, if the random base was), but probably not mod q, unless q has the same property. Then gcd(x-1, N) = p, and you've factored N. A similar trick works with Lucas sequences if p+1 has no large prime factors. So instead of attacking your key with a very expensive algorithm that doesn't care about special properties of p, q, an attacker might gamble that your key is weak and try a cheaper algorithm first. So folks would avoid keys where p+1 or p-1 was "smooth" in this way.

However, we don't care much about this attack anymore, for two reasons. One is that N must now be big enough to resist attack by the Number Field Sieve (and not just the Quadratic Sieve); at least if there are only two primes in the RSA key, this means that p, q must be big enough for these attacks to be very unlikely to work. But just as importantly, it turns out that you can do the above attack with exponentiation on a random elliptic curve instead of regular exponentiation / Lucas sequences. This is how most math libraries implement factor(). It's slightly slower, but since the random elliptic curve will have a random order instead of exactly p+1 or p-1, it is equally likely to work for any p, q of a given size, regardless of special properties they might have.

In other words, an attacker can re-randomize how weak or strong a key is by using elliptic curves. So there's not much point in avoiding ones that look weak at first glance.
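
For concreteness, a minimal sketch of the p-1 idea (textbook Pollard p-1 with a fixed smoothness bound, not what production libraries ship):

    from math import gcd

    def pollard_p_minus_1(n, bound=100):
        # builds a = 2^(bound!) mod n incrementally; once the exponent is a
        # multiple of p-1, a == 1 (mod p), and gcd(a - 1, n) exposes p
        a = 2
        for k in range(2, bound + 1):
            a = pow(a, k, n)
            d = gcd(a - 1, n)
            if 1 < d < n:
                return d
        return None   # p-1 not smooth enough (real code also handles d == n)

    # 1009 * 10007: 1008 = 2^4 * 3^2 * 7 is smooth, but 10006 = 2 * 5003 is not
    print(pollard_p_minus_1(1009 * 10007))   # -> 1009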

asimpletune 143 days ago [-]
A prime pair whose product is easily factorable
ars 143 days ago [-]
How do you know which ones are the hard ones? I don't think it's easy to figure that out.
alexey-salmin 143 days ago [-]
I'd also note that zero-knowledge proofs (which are considered a branch of cryptography) often do rely on NP-complete problems, like three-coloring of maps.
Vervious 143 days ago [-]
That's a completely different kind of reliance: there, you want to be able to prove "any statement in NP" in zero knowledge, and it suffices to give a protocol for just 3-colorings, because "any statement in NP" can then be reduced to a 3-coloring.
saurik 143 days ago [-]
This seems like a definition issue with respect to the set from which we are randomly sampling... what constitutes the problem? If you randomly select numbers and try to factor them -- which to me seems like a fair way to define the problem: factoring, not two-prime factoring -- the vast majority of numbers are, in fact, very easy to factor ;P. Instead, we say we only want to randomly select a number which is only two primes multiplied together, in order to find the juicy hard versions of the problem. (And even then, finding those numbers isn't exact! We rely on probabilistic primality tests to weed out numbers that are probably not prime.)
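
A quick sketch of just how easy factoring a random number is (random odd 60-bit integers, trial division by small candidates only):

    import random

    def smallest_factor(n, limit=1000):
        # trial division; the first divisor found is necessarily prime
        for p in range(2, limit):
            if n % p == 0:
                return p
        return None

    trials = 10_000
    easy = sum(smallest_factor(random.getrandbits(60) | 1) is not None
               for _ in range(trials))
    print(f"{easy / trials:.0%} fall to trial division alone")  # ~84% expected

(By Mertens' theorem, the fraction of numbers with no prime factor below B only shrinks like 1/ln B.)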

Similarly, who is to say that there isn't some way to define what the hard instances of NP-complete problems are, so that we should just be sampling from that difficult region of the problem space instead of from all possible problems, the same way we sample only from the difficult-to-factor semiprimes instead of from all possible numbers? There is definitely something interesting going on with the harder instances of these problems: researchers have discovered a kind of "phase transition" among instances of some of the problems, between ones with so many solutions that they are easy and ones with so few solutions that they are also once again easy. (My knowledge on this is about 20 years rusty and outdated... maybe this is better understood and characterized now.)

delifue 143 days ago [-]
The RSA problem in the article doesn't mention that RSA's difficulty rests on factoring a modulus that is the product of two large primes, not on factoring arbitrary integers.

https://en.wikipedia.org/wiki/RSA_(cryptosystem)

LegionMammal978 143 days ago [-]
Are there any known attack methods that don't involve factoring the public key into two primes?
bmenrigh 143 days ago [-]
Yeah, but they usually require RSA to be used in some rather unusual and bad way.

For example, encrypting one message with many different public keys can be broken with the Chinese remainder theorem and Nth roots. This reveals the message without factoring any key. This is why randomized padding (among other things) is a must with RSA.
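
A toy sketch of that broadcast attack (e = 3, tiny made-up moduli, primes chosen to be 2 mod 3 so that e = 3 is a legal exponent):

    def icbrt(n):
        # integer cube root (floor) via Newton's method
        x = 1 << ((n.bit_length() + 2) // 3)
        while True:
            y = (2 * x + n // (x * x)) // 3
            if y >= x:
                return x
            x = y

    def crt(residues, moduli):
        # Chinese remainder theorem for pairwise-coprime moduli
        N = 1
        for n in moduli:
            N *= n
        x = 0
        for r, n in zip(residues, moduli):
            Ni = N // n
            x += r * Ni * pow(Ni, -1, n)   # pow(.., -1, n): mod inverse, 3.8+
        return x % N

    ns = [53 * 59, 71 * 83, 89 * 101]   # three recipients' toy moduli
    m = 2024                            # unpadded message, m**3 < product(ns)
    cs = [pow(m, 3, n) for n in ns]     # the three intercepted ciphertexts
    assert icbrt(crt(cs, ns)) == m      # recovered with no factoring at all

CRT stitches the three ciphertexts into m^3 over the integers (since m^3 is smaller than the product of the moduli), and a plain integer cube root finishes the job.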

fenomas 143 days ago [-]
There's a footnote along these lines that links to the actual algorithm.
stncls 143 days ago [-]
> Instead, cryptography needs problems that are hard in the average case, like the RSA problem, the discrete logarithm for elliptic curves, and the shortest vector problem for lattices. We don’t technically know whether these are NP-complete

But we do know that the shortest vector problem is NP-hard (in the L-infinity norm), and so is its decision version [1]. We don't have a reduction in the other direction, but we know that SVP is as hard as NP-complete problems, or harder.
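
For reference, the problem in question: given a basis B of a lattice, find a shortest nonzero lattice vector,

    \mathrm{SVP}: \text{ given } B, \text{ find } v \in \mathcal{L}(B) \setminus \{0\} \text{ with } \|v\| = \lambda_1(\mathcal{L}(B)) := \min_{w \in \mathcal{L}(B) \setminus \{0\}} \|w\|

where the choice of norm (the L-infinity norm, for the NP-hardness result above) is part of the problem statement.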

This goes against the general argument of the article, but only weakly so, because I think lattice-based systems are less broadly deployed than elliptic curves or factoring (not a crypto person, though).

[1] H. Bennett, "The Complexity of the Shortest Vector Problem.", 2022 https://simons.berkeley.edu/sites/default/files/docs/21271/s...

less_less 142 days ago [-]
Yeah, SVP is NP-complete for certain parameters. But lattice cryptography uses other parameters, where SVP might not be NP-complete. Also, lattice crypto is usually based on some other lattice problem, like LWE, MLWE, MLWR, SIVP, etc.

Lattice crypto is going to be broadly deployed because it hopefully can resist quantum attack, unlike ECDLP and factoring.

ccppurcell 143 days ago [-]
The keyword not mentioned here is "one-way function". Cryptography is for the most part based on candidates for one-way functions: functions that are easy to compute on any input but difficult to invert on average. The existence of such a function would imply that P is not NP, but the converse doesn't hold.
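
The standard formalization, roughly: f is one-way if it is computable in polynomial time and every probabilistic polynomial-time adversary A inverts it only with negligible probability,

    \Pr_{x \leftarrow \{0,1\}^n}\left[ f\big(\mathcal{A}(1^n, f(x))\big) = f(x) \right] \le \mathrm{negl}(n)
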
praptak 143 days ago [-]
Another problem with NP-hardness is that it only gives guarantees against deterministic algorithms.

Even if most instances of a problem are deterministically hard, they might still have a randomized algorithm which solves them feasibly.

tyilo 142 days ago [-]
It might also be the case that P=NP
userbinator 143 days ago [-]
Aren't hash functions a counterexample? There have been attempts at using SAT solvers to find preimage and collision attacks on them.
elchananHaas 143 days ago [-]
Yes, this article is only talking about asymmetric cryptography. Symmetric cryptography and hashes are based on NP-complete problems, but only when generalized appropriately; most individual hashes/ciphers are fixed-size, so there isn't really a way to talk about their complexity. The broader class of block-cipher-like problems is NP-complete, and very hard.
Vervious 142 days ago [-]
no, the article is talking about symmetric-key crypto also (i.e. one-way functions). You're talking about cryptosystems that don't have proofs of security, or reductions to heuristic assumptions.
less_less 143 days ago [-]
Breaking hash functions is in NP, but isn't expected to be NP-complete, meaning that you can't easily encode e.g. a SAT problem into a hash function such that breaking the hash solves the SAT problem.

You can do the reverse (encode the hash into a SAT problem), but that's possible for any NP problem, because SAT is NP-complete.

memkit 142 days ago [-]
Your last statement is misleading. A problem being NP-complete isn't exactly the property that allows you to reduce any NP problem to it. Suppose there were a complexity class MP and an MP-hard problem with no efficient reduction to an NP-complete problem. Then a problem being in MP isn't what allows me to write a reduction to the MP-hard problem; I could just as easily write a reduction from a problem in P to the MP-hard problem. Your statement is misleading but incidentally correct, because the true condition (NP-hard or easier) happens to be equivalent to your stated condition (in NP) for this particular complexity class. It would be clearer to simply state that you can always reduce an easy problem to a harder one.
less_less 142 days ago [-]
What?

Being in NP doesn't exclude being in P. Every problem in P is also in NP.

The definition of "NP-complete" is "in NP, and also NP-hard". The definition of "NP-hard" is that you can reduce any NP problem to it using a poly-time mapping reduction (also known as a Karp reduction). So yes, SAT being NP-complete does mean that you can reduce any NP problem to SAT, using a poly-time mapping reduction.

Breaking a hash (e.g. collision finding) is in NP, because you can easily check a proposed solution. Well, with an obvious quibble: P and NP are about asymptotic complexity, but most hash functions are fixed-size. Also if you're looking at complexity theory you might want to talk about targeted collision finding, or first or second preimage resistance, but same deal there. But anyway, supposing you choose a keyed hash that does scale so that you can talk about its asymptotic complexity at all, and has a poly-time cost to evaluate it, breaking that hash would be in NP. Therefore it can be reduced to SAT using a poly-time mapping reduction.

memkit 142 days ago [-]
I agree with you, I just think the condition "being in NP" is needlessly confusing. The whole point is that you can always find a reduction from easier problems to harder ones. It just so happens that NP encompasses all the problems easier than SAT.

The reason why your statement is confusing to me is that if you generalize it beyond NP, it breaks down; for an arbitrarily hard complexity class M and an arbitrary M-hard problem, you don't need to be in M to be able to find a reduction to the M-hard problem.

less_less 142 days ago [-]
OK, I'm sorry for the touchy response. But I still don't understand your point.

Breaking a hash is a prototypical NP problem (ok maybe FNP). SAT is the prototypical NP-hard problem.

I was just trying to explain that using SAT to attack hashes is therefore unsurprising, and does not in any way imply that breaking hashes is NP-complete, the way that it would if the reduction went in the other direction.

Surely the same logic would make sense for another class M, if you had a problem "M-HASH" that's clearly in M, and an M-hard problem "M-SAT" to reduce it to? There might be other problems that you could also reduce to M-SAT, but mentioning that it solves all of M is what's relevant if M-HASH is in M.

memkit 142 days ago [-]
Yes, I apologize for being combative. I see your point now.

I think I'm also wrong.

I thought about my original response some more and this is a more coherent version of what I was trying to say:

A problem being in NP is sufficient but not necessary to reduce it to an NP-complete problem.

But that's wrong. It's both sufficient and necessary to be in NP. It intuitively feels like you're tacking on more than you need to by introducing the "necessary" constraint, but it makes sense.

less_less 142 days ago [-]
Ah. I didn't mean to introduce a "necessary" constraint at all, but my wording wasn't the best.
westurner 143 days ago [-]
SAT solvers, GAs, LLMs, brute force.

Birthday paradox probability changes with memoization.

Rainbow tables trade CPU for storage for lookup; but salting, double hashing, and key derivation functions with many rounds, like PBKDF2 and Argon2, defeat them.
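
For example, the slow-KDF defense using the standard library (a minimal sketch; the password and iteration count here are made up):

    import hashlib, os

    salt = os.urandom(16)    # a per-password salt defeats rainbow tables
    key = hashlib.pbkdf2_hmac("sha256", b"correct horse", salt, 600_000)
    print(key.hex())         # 600k rounds make each offline guess expensive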

impossiblefork 142 days ago [-]
Surely because of Goldreich and Goldwasser's result that public-key encryption based on NP-completeness would necessitate NP = co-NP?
cubefox 142 days ago [-]
Does this mean there is a relationship between the NP-completeness of a problem and the worst-case time complexity of algorithms which solve it?
tines 143 days ago [-]
Some of the graphics have black text on a transparent background and so are hard to read with a dark theme.
dc443 135 days ago [-]
Slightly off topic, but I would like to understand better why there is hand-wringing going around about cryptosystems being broken by future quantum computers. Don't we already have quantum-resistant cryptosystems? Why not just switch to them across the board?
graymatters 142 days ago [-]
The author should’ve stayed in the NP class and not drop out.
tofof 143 days ago [-]
It's unfortunate that he doesn't understand complexity analysis and has taken it upon himself to write an article that requires it as an underpinning.

In particular, I stopped reading at

> the discrete logarithm for elliptic curves, ... We don’t technically know whether these are NP-complete

This is wildly misleading. We do know with absolute certainty that the decision problem for DLP is NP-hard, and therefore that ECDLP inherits at least NP-hardness. While it is true that their presence in NP-hard does not require their presence in NP (and thus they are not proved NP-complete), it is misleading to characterize them, as the author has here, as weaker than NP-complete, instead of acknowledging the known certainty that NP is the absolute minimum bound for their hardness.

cevi 143 days ago [-]
If the discrete logarithm problem is NP-hard, then I will eat my hat. The discrete logarithm problem can be solved by Shor's algorithm on a quantum computer, placing it in the complexity class BQP. Anyone who claims that BQP contains an NP-hard problem is selling something - I would bet at a trillion-to-one odds against such a claim (if there were any hope of definitively settling the problem).
Vervious 143 days ago [-]
Not sure it's misleading; he did write the word "technically", and anyone who knows what NP-complete is knows that NP-hard does not necessarily mean NP-complete. I am a cryptographer, and the article is fine.

Also, do you have a citation for "We do know with absolute certainty that the decision problem for DLP is NP-hard"?

JohnKemeny 143 days ago [-]
DLP is in NP and co-NP, so it is very unlikely to be NP-hard. It is usually listed as one of the candidates for NP-intermediate problems, i.e., problems in between P and NP-hard (should they be different).

See e.g. https://cs.stackexchange.com/a/2765

memkit 142 days ago [-]
> While it is true that their presence in NP-hard does not require their presence in NP (and thus they are not proved NP-complete)

You're confused here. The two conditions for a problem being NP-complete are (1) it being NP-hard and (2) it being in NP.

You suggest (2) is the issue, but usually it's harder to prove (1) than (2). In the context of factorization problems, the factors are simply the certificate that satisfies condition (2).