I'm reminded of the Philosophy of Computer Science entry in the Stanford Encyclopedia of Philosophy [0], which briefly considers what it means for two programs to be identical.
"... it has been argued that there are cases in which it is not possible to determine whether two programs are the same without making reference to an external semantics. Sprevak (2010) proposes to consider two programs for addition which differ from the fact that one operates on Arabic, the other one on Roman numerals. The two programs compute the same function, namely addition, but this cannot always be established by inspecting the code with its subroutines; it must be determined by assigning content to the input/output strings"
"The problem can be tackled by fixing an identity criterion, namely a formal relation, that any two programs should entertain in order to be defined as identical. Angius and Primiero (2018) show how to use the process algebra relation of bisimulation between the two automata implemented by two programs under examination as such an identity criterion. Bisimulation allows to establish matching structural properties of programs implementing the same function, as well as providing weaker criteria for copies in terms of simulation."
(Of course, it isn't surprising that this would be relevant, because proofs and programs themselves are isomorphic).
This technique seems rather stricter than what Gowers has in mind, but it seems helpful as a baseline.
0. https://plato.stanford.edu/entries/computer-science/
I think it's also important to make a distinction between a pair of programs which compute the same function using an identical amount of space and time and a pair of programs which compute the same function with different amounts of either space or time (or both). Two programs might compute the same function and be considered formally identical in that sense but may be in radically different complexity classes [O(1) vs O(n) vs O(n^2) vs O(2^n)].
Formally we may not be interested in this distinction but practically we definitely are. One program may be extremely practical and useful whereas the other might not finish computing anything before the heat death of the universe on anything but trivial-sized inputs.
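To make that distinction concrete, here is a minimal Python sketch (my own illustration, not from the thread): two programs that compute the same function, one in roughly exponential time and one in linear time with constant space.

    def fib_exponential(n: int) -> int:
        # Same function as below, but O(2^n)-ish time: it recomputes subproblems over and over.
        return n if n < 2 else fib_exponential(n - 1) + fib_exponential(n - 2)

    def fib_linear(n: int) -> int:
        # Same function, O(n) time and O(1) extra space.
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    assert fib_exponential(20) == fib_linear(20) == 6765  # extensionally equal, practically very different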
setopt 5 hours ago [-]
On the other hand, compiler tricks like tail call optimization can e.g. reduce an O(n) algorithm to an O(1) algorithm. Is it a “different program” if the same source code is compiled with a new compiler?
chongli 7 minutes ago [-]
Tail call elimination is not an optimization because it changes the semantics of the program. The feature can take a program which would previously fail to terminate due to a stack overflow and cause it to terminate without error.
Perhaps TCO is better thought of as a language extension.
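A rough sketch of that point in Python, which never eliminates tail calls (names are my own): both forms compute the same thing, but only the recursive one can blow the stack.

    import sys

    def countdown_rec(n: int) -> int:
        # Tail-recursive in shape, but CPython keeps one stack frame per call,
        # so a large n raises RecursionError instead of returning 0.
        return 0 if n == 0 else countdown_rec(n - 1)

    def countdown_loop(n: int) -> int:
        # What a tail-call-eliminating compiler effectively produces:
        # O(1) stack space, terminates normally for any n.
        while n > 0:
            n -= 1
        return 0

    print(sys.getrecursionlimit())   # roughly 1000 by default in CPython
    print(countdown_loop(10 ** 6))   # 0
    # countdown_rec(10 ** 6)         # would raise RecursionError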
Jtsummers 5 hours ago [-]
Tail call optimization does not turn O(n) algorithms into O(1) algorithms unless you're talking about the space used and not the runtime.
naniwaduni 4 hours ago [-]
At a certain level of abstraction, that's easily an example of converting an O(n log n) algorithm into an O(n) one.
In practice, of course, the effect is far more dramatic with an MMU.
Jtsummers 4 hours ago [-]
Can you show a O(n log n) algorithm with tail calls but not TCO that's O(n) after being optimized with TCO?
naniwaduni 4 hours ago [-]
Computing f(0)=0; f(n)=f(n-1) is O(n log n) without tail calls because you need O(log n) addresses to hold your stack frames.
Jtsummers 4 hours ago [-]
> Computing f(0)=0; f(n)=f(n-1) is O(n log n) without tail calls because you need O(log n) addresses to hold your stack frames.
There are two principal ways of applying asymptotic analysis to algorithms: time or memory used. In both, your procedure is O(n) without TCO. With TCO it is O(n) for runtime (though further optimization would reduce it to O(1) since it's just the constant function 0, but TCO alone doesn't get us there) and O(1) for space since it would reuse the same stack frame.
What O(log n) addresses do you need to hold the stack frames when there are O(n) stack frames needing O(n) addresses (without TCO, which, again, reduces it to O(1) for memory)?
Also, regarding "without tail calls", your example already has tail calls. What do you mean by that?
zeroonetwothree 2 hours ago [-]
I assume they mean the size of the address is log n, since there are >n addresses.
Dylan16807 56 minutes ago [-]
If we don't treat almost all integers in an algorithm as fixed size then the analysis gets really messy and annoying in a way that has nothing to do with real computers.
And if the algorithm actually did anything with the value that made it grow or shrink with the recursion, the TCO version would stop being O(n) under such a framework. This only works because it's passing 0 around every iteration. And this probably already applies to the TCO version's flow control depending on how you expect to run it.
Jtsummers 52 minutes ago [-]
I was going to write something similar.
Regardless, the comment I replied to is fundamentally confused (presented a tail recursive algorithm and said it didn't have tail calls, presented a linear algorithm that uses a linear amount of memory and claimed it's O(n log n) for some reason, with no clarification whether that's in time or space). I'd rather hear from the person I responded to than whatever guesses the rest of us can come up with, because it needs several points of clarification to be understood.
Sharlin 5 hours ago [-]
From complexity analysis we can adopt the concept of polynomial-time reducibility and might define a type of equivalence relation where two algorithms are equivalent only if both are pt-reducible to each other. Intuitively it’s not a sufficient condition for "sameness" – otherwise, for example, all NP-complete problems are the "same" and solvable with the "same" algorithm – but it’s arguably a necessary one.
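Spelled out (my notation, just a sketch of the relation being described), the candidate equivalence is mutual polynomial-time reducibility:

    A \equiv_p B \iff (A \le_p B) \land (B \le_p A)

This is reflexive, symmetric, and transitive, so it is a genuine equivalence relation; the caveat above is that it lumps all NP-complete problems into a single class.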
tightbookkeeper 47 minutes ago [-]
Knuth answers this question in chapter 0.
cloogshicer 18 minutes ago [-]
In which book? Sounds interesting.
qubitly 3 hours ago [-]
Reducing two mathematical proofs to being 'essentially the same' just because they reach the same conclusion overlooks something crucial: each proof isn’t merely a path to a result but a unique expression of understanding. A proof has its own logical and conceptual structure, and that structure isn’t interchangeable without losing some of its inherent value. Comparing proofs shouldn’t just focus on a shared outcome: the path taken, the relationships it establishes, and the concepts it explores are as fundamental as the conclusion itself. Perhaps it’s time to view mathematics not just as calculation, but as a real act of knowledge that in its diversity deepens our grasp of reality.
red_trumpet 2 hours ago [-]
The result of a proof is a theorem. I don't see any claim in the article that any two proofs of the same theorem are essentially the same?
joe_the_user 6 hours ago [-]
Well, if you define a proof system as a series of potential manipulations of a space of true statements, a given proof is a sequence of manipulations and states and thus a path in a sort-of-metric space. Two proofs could be said to be similar if their paths are "close" in that sort-of-metric space. Of course, you're left with the question of how close is close and whether "close" means close at one intermediate point or many. Moreover, mathematicians often like proofs that are more "cohesive" than just sequences of manipulations. So the question in the real world would probably be a matter of mathematical taste as well as objective measures.
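As a toy version of that idea (my own sketch; the step labels and the metric are invented purely for illustration), one could treat a proof as a sequence of named steps and measure "closeness" by edit distance over those sequences:

    def proof_distance(p: list, q: list) -> int:
        """Toy distance between two proofs viewed as sequences of step labels:
        plain Levenshtein edit distance, a crude stand-in for the sort-of-metric
        described above."""
        dp = list(range(len(q) + 1))
        for i, a in enumerate(p, 1):
            prev, dp[0] = dp[0], i
            for j, b in enumerate(q, 1):
                cur = dp[j]
                dp[j] = min(dp[j] + 1,          # drop a step of p
                            dp[j - 1] + 1,      # insert a step of q
                            prev + (a != b))    # keep or substitute a step
                prev = cur
        return dp[-1]

    evens = ["assume sqrt2 = a/b in lowest terms", "square both sides",
             "a is even", "b is even", "contradiction"]
    valuation = ["assume sqrt2 = a/b", "count factors of 2 on both sides",
                 "parity mismatch", "contradiction"]
    print(proof_distance(evens, valuation))  # 4

Whether any such crude metric tracks what mathematicians mean by "essentially the same" is exactly the taste question raised above.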
setopt 5 hours ago [-]
It’s also hard to prove that a statement definitely lies in the “space of true statements”. Moreover, whether a proof assumes “A = B” or “B = C” can make them closer together or further apart in such a space depending on whether it is established that “A = C” or not, which also makes it tricky to establish rigorously.
Xcelerate 8 hours ago [-]
One way to compare proofs is to consider whether they belong to the same "level" or not. Consider by analogy whether a particular Turing machine halts. You can look at the sequence of configurations of the Turing machine at each step. Since the evolution of the machine's configuration is deterministic, any configuration along a "halting path" ends up in the same final configuration (i.e., the first configuration in a halting state).
But that's too difficult in some cases. Most of the Goodstein sequences reach extraordinarily high values before coming back down. How can we prove they all eventually reach 0? Even at small values of n, the sequence length of G(n) requires something on the order of the Ackermann function to specify bounds. We can't inspect these sequences directly to prove whether they reach 0. Instead we create a "parallel" sequence to a Goodstein sequence. Then we prove there exists an algorithm that maps from each item in the parallel sequence to an item in the Goodstein sequence such that both sequences are well-ordered and decreasing. If the parallel sequence reaches 0, then so does the Goodstein sequence. You could think of this as one Turing machine computing the configurations of another Turing machine or perhaps one branch of a tree "cross-predicting" the items along another branch. You aren't just following the branch to its end. In this sense, the proof occurs at a higher "level".
This concept is known as ordinal analysis and one can consider the proof-theoretic ordinal of any theory T. If T_1 and T_2 both prove a specific theorem and have the same proof-theoretic ordinal, you could consider the two proofs to occur on the same "level". Interestingly, Peano Arithmetic can prove that any specific Goodstein sequence reaches 0 but not that all Goodstein sequences reach 0—this requires a more powerful formal system. So if you prove a specific sequence reaches 0 using the more powerful system, I would say that's a fundamentally different proof.
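For readers who have not met them, here is a small Python sketch (mine, not from the comment) of how Goodstein sequences are generated, just to make the "extraordinarily high values" point concrete:

    def bump_base(n: int, b: int) -> int:
        """Write n in hereditary base-b notation, then replace every b by b+1."""
        total, power = 0, 0
        while n > 0:
            digit = n % b
            if digit:
                total += digit * (b + 1) ** bump_base(power, b)
            n //= b
            power += 1
        return total

    def goodstein(n: int, steps: int) -> list:
        """First `steps` terms of the Goodstein sequence starting at n."""
        seq, base = [n], 2
        while len(seq) < steps and n > 0:
            n = bump_base(n, base) - 1
            base += 1
            seq.append(n)
        return seq

    print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0] -- hits 0 almost immediately
    print(goodstein(4, 6))   # [4, 26, 41, 60, 83, 109] -- keeps growing for a very long time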
6gvONxR4sf7o 8 hours ago [-]
It seems like in your first part, you're saying that proofs are the same as their normalized proofs, up to some rewriting system. So like how we say 3-2 is the same as 1, basically, or (more interestingly) saying that x-x is the same as zero, or that e^(i pi (2n+1)) is the same as -1. Yes, they can be reduced/normalized to the same thing, but in basically any system with terms, `reduction(term)` is not always the same as `term`. And 'a sequence of term transformations' is a common proof method.
There's obviously a sense in which they're the same, but at the proof level, I would be surprised if that's a particularly useful sense, because the whole point of a proof is that it's the journey, not the destination. Even within the same "level," in your terms.
Xcelerate 8 hours ago [-]
My first sentence didn't make sense and wasn't well-thought-out. Removed it in favor of keeping the discussion about proof-theoretic strength.
colechristensen 8 hours ago [-]
>I agree with pkoird's point that philosophically, two correct proofs of the same theorem should be considered "the same". Any theorem is ultimately a property of the natural numbers themselves along with the various paths that lead there from the axioms (since all proofs are essentially a finite sequence of Gödel numbers).
As with a lot of philosophy, the argument turns out to actually be much more about defining terms being used than the objects those terms are referring to. I mean when you are making an argument about "x is the same as y because..." your philosophical argument is actually about what should be meant by the same instead of any particular properties of x or y.
The article seems to be digging at the existence of a few categories of proofs
* proofs that are trivially transformed into one another
* proofs that use substantially similar arguments that don't necessarily have a direct transformation
* proofs that arrive at the same destination through a much different path and have no obvious transformation to another proof
So the question is: how easy does it have to be to transform one proof to another in order for them to be considered "the same"?
One extreme is "the slightest change in wording makes a proof unique"
The other extreme is "any two proofs of the same concept are by definition the same proof"
I would argue that neither extreme is particularly useful, because both are just obvious. One means "these are different sheets of paper" and the other means "these are both proofs of X", neither of which are interesting statements.
What is an interesting statement is commentary on the path made to a proof and the differences in paths to proving a statement. Both in the ability to transform one into another easily to show similarity, and in the difficulty to transform one into another to show divergence.
VirusNewbie 3 hours ago [-]
Why not make this rigorous and actually quantify how similar proofs are? I assume this could be done.
colechristensen 2 hours ago [-]
You would need a rigorous way to encode proofs likely akin to Gödel numbering or at least something related to automated theorem proving and then add on transformation mechanisms and then rigorously prove that all proofs have transforms from one to the other.
I strongly assume this would be hard.
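For the encoding half, here is a toy sketch (my own; real formalizations would go through a proof assistant or at least a full symbol table) of the classic prime-power Gödel numbering of a finite symbol sequence:

    def primes():
        """Yield 2, 3, 5, 7, ... by trial division (fine for toy inputs)."""
        n = 2
        while True:
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                yield n
            n += 1

    def godel_number(codes) -> int:
        """Encode a finite sequence of positive integer symbol codes
        as 2^c1 * 3^c2 * 5^c3 * ... (the classic prime-power encoding)."""
        g, gen = 1, primes()
        for c in codes:
            g *= next(gen) ** c
        return g

    print(godel_number([1, 2, 3]))  # 2^1 * 3^2 * 5^3 = 2250

The hard part the comment points at is not this encoding step but defining, and proving things about, the transformations between encoded proofs.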
Xcelerate 8 hours ago [-]
Yeah, my first sentence is sort of nonsense the more I think about it... removed it to keep the focus of my comment on different kinds of proofs.
082349872349872 7 hours ago [-]
See Girard, The Blind Spot: Lectures on Logic (2011) for some attempts at tackling this question. (in particular, his "proof nets" attempt to have a canonical form, such that we can identify differently drawn concrete proof nets as representing the same abstract proof)
https://en.wikipedia.org/wiki/Proof_net
> Is it ever possible to give a completely compelling argument that two proofs are genuinely different?
I think in some cases, we can. Sometimes one of the proofs generalizes better than the other because it uses strictly fewer assumptions. It seems fair to say those would have to be inequivalent.
https://mathoverflow.net/questions/3776/when-are-two-proofs-...
> For example, it is often possible to convert a standard inductive proof into a proof by contradiction
I would not consider those the same though, as one is constructive and the other is not.
VikingCoder 7 hours ago [-]
I was working with a friend writing a paper about the Ship of Theseus, but my friend kept replacing all of my arguments.
lisper 5 hours ago [-]
I had a similar experience, but my collaborator had a touch of OCD and just kept micro-editing my original draft, each time replacing exactly one word with a different word that had nearly the same meaning. By the end of the process, my collaborator had produced a word-for-word copy of William Shakespeare's "Julius Caesar". It is a remarkable coincidence that my original draft just happened to have the same number of words to begin with to make this transformation possible.
My collaborator then translated the original paper into Greek. Or maybe he translated "Julius Caesar" into Greek. I don't speak Greek so I have no way of knowing.
;-)
pkoird 9 hours ago [-]
If I were allowed a little philosophical leeway, I'd argue that two correct proofs are always the same. For sure they may contain different words or make use of different "abstractions", but it just seems to me that these abstractions should be equivalent if one were willing to unravel it all to a certain degree. Essentially, all a proof is is a statement that says "this is true", and no matter which language you use to say it, you are saying the same thing.
mjcohen 13 minutes ago [-]
How about the following proofs that sqrt{2} is irrational.
1. If 2=a^2/b^2 with a and b relatively prime then a^2=2b^2, so a is even, and, letting a=2c, b is also even, a contradiction.
2. Use the lemma that a positive real r is rational if and only if there is a positive integer b such that br is an integer. (Proof left to the reader.) If sqrt{2} is rational then there is an integer b such that b*sqrt{2}=a for some integer a. Let b be the smallest such integer. Then, if c=b(sqrt{2}-1), c is smaller than b, c=b*sqrt{2}-b is an integer, and sqrt{2}c=sqrt{2}b(sqrt{2}-1)=2b-b*sqrt{2} is an integer, a contradiction (this uses infinite descent).
3. Use the theorem that if x, y, and n are positive integers such that x^2-ny^2=1 then sqrt{n} is irrational, and apply with n=2, x=3, y=2.
Proof of theorem.
a. If x^2-ny^2=1, then use (x^2-ny^2)^2=(x^2+ny^2)^2-n(2xy)^2 to show that there are arbitrarily large solutions to x^2-ny^2=1.
b. If n=a^2/b^2 then 1=x^2-(a^2/b^2)y^2, so b^2=b^2x^2-a^2y^2=(bx+ay)(bx-ay)>=bx+ay>bx, so b>x, but this contradicts the existence of arbitrarily large x.
How are any of these the same as any other?
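A quick numeric sanity check of the identity behind proof 3 (my own sketch; the seed (3, 2) is the one named above):

    def next_solution(x: int, y: int, n: int = 2):
        """If x^2 - n*y^2 = 1, then (x^2 + n*y^2, 2*x*y) is a strictly larger solution,
        by the identity (x^2 - n*y^2)^2 = (x^2 + n*y^2)^2 - n*(2*x*y)^2."""
        return x * x + n * y * y, 2 * x * y

    x, y = 3, 2                       # 3^2 - 2*2^2 = 1
    for _ in range(3):
        assert x * x - 2 * y * y == 1
        x, y = next_solution(x, y)
    print(x, y)                       # 665857 470832 -- the solutions grow without bound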
ColinWright 9 hours ago [-]
This is like saying that if I walk out of my house, turn right, and walk 10 minutes to the local food store, it's the same as coming out of the house, turning left, and walking 15 minutes around the block. The destination is the same, so surely these are "the same".
I'd argue that this is not the case.
pkoird 8 hours ago [-]
Not quite.
If we consider that we are trying to prove "you can reach the local food store from your house" then starting from either side would consist of two proofs by example. And for sure these are different paths one is taking and should be different! But if you consider deeply, both of these proofs are implicitly encoding the same information about the space between your house and the local store:
1) there is continuous space between your house and the store i.e. the store is reachable from your house. (as opposed to your house being on an island and you not being able to swim)
2) you can traverse two points in a continuous space.
What I wanted to opine was merely the fact that since all proofs use logic, assuming certain premise, all theorems about a certain statement being true must be reducible to a single irreducible logical chain of argument. It is true that we use different abstractions that have relevant meaning and ease in different contexts but since all of our abstractions are based upon logic in the first place, it does not seem outlandish to me to think that any logical transformation and subsequent treatment between two proof structures should inherently encode the same facts.
Twisol 8 hours ago [-]
The path example is extremely fertile ground for this kind of discussion! It is definitely true that both paths encode the information that one's house is connected to the local store. But is that all they encode? Homotopy theory is all about the different paths between two points, and it tells us some quite interesting things! In particular, if you have two paths from point A to point B, you can ask: can you smoothly animate an image of the first path into an image of the second, such that every still frame in-between is also a legitimate path? (If you can't, that tells you that there's some form of hole in between them!)
In the house/store example, a path is also a witness to the fact that, if you perform a road closure anywhere not on the path, then connectivity is preserved. Simply stating that the two points are connected doesn't tell you whether it's safe to close a road! Moreover, taking the two paths together tells you that performing a single road closure that only affects one of the paths will still leave a route you can take.
In both examples, if the paths were logically interchangeable, you wouldn't be able to get more information out of the both of them than you could from just one. But because they aren't equivalent -- because each contains some information that the other does not -- we can deduce more from both together than from either individually.
seanhunter 5 hours ago [-]
> all theorems about a certain statement being true must be reducible to a single irreducible logical chain of argument.
Why is this necessarily true? We know that true statements in topology (for example) don't all reduce down to being equivalent (eg if I have a loop that goes through the ring of a donut/toroid it doesn't reduce the same as if I have a loop on the surface of the donut/toroid so establishing facts about one wouldn't tell me facts about the other). So how do we know that statements in logic reduce? Could the space of logical statements not have topological characteristics like that?
SkiFire13 4 hours ago [-]
> But if you consider deeply, both of these proofs are implicitly encoding the same information about the space between your house and the local store
That is only _some_ of the information they encode, and in particular information shared by both proofs, but it is not the only information they encode! The exact way to reach the local food store is also information, and they encode different ways, hence different information.
> What I wanted to opine was merely the fact that since all proofs use logic
Note that there's no single logic! There are at least two big logics, classical and constructive/intuitionistic, each with their own variants.
For example a proof by contradiction is valid in classical logic but not in the constructive one. It would give you a proof that there must be a way to reach the local store without giving you the way to reach it. Would it still count as the same proof as the other two for you? It doesn't encode how to reach it, so for some it's not even a valid proof.
ColinWright 5 hours ago [-]
You are being too literal -- I was providing an analogy, not an example.
Also:
> ... all theorems about a certain statement being true must be reducible to a single irreducible logical chain of argument.
Citation needed ... I have no reason to believe this is true.
But here's an example of two proofs.
Proving sqrt(2) is irrational.
Proof 1:
Every rational number has a finite Continued Fraction representation. But the normalised Continued Fraction representation of sqrt(2) is [1;2,2,2,...], which is infinite. Since this is infinite, sqrt(2) is irrational.
Proof 2:
Consider integers a and b, and suppose 2(b²)=a². Consider the prime decompositions of a and b, and count how many times "2" turns up on each side. It's odd on the left, it's even on the right, so this cannot happen. Therefore we can never have integers a and b with 2(b²)=a². Therefore we can't have 2=(a²)/(b²)=(a/b)². So any fraction when squared cannot equal 2, so sqrt(2) is irrational.
Do these really feel "the same" to you?
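A small Python sketch of the machinery behind Proof 1 (mine, not from the comment; cf_sqrt uses the standard exact recurrence for quadratic surds and assumes n is not a perfect square):

    from fractions import Fraction
    from math import isqrt

    def cf_rational(q: Fraction) -> list:
        """Continued fraction of a positive rational number; always finite."""
        out = []
        while True:
            a = q.numerator // q.denominator
            out.append(a)
            q -= a
            if q == 0:
                return out
            q = 1 / q

    def cf_sqrt(n: int, terms: int) -> list:
        """First `terms` partial quotients of sqrt(n), computed exactly."""
        a0 = isqrt(n)
        m, d, a, out = 0, 1, a0, [a0]
        for _ in range(terms - 1):
            m = d * a - m
            d = (n - m * m) // d
            a = (a0 + m) // d
            out.append(a)
        return out

    print(cf_rational(Fraction(41, 29)))  # [1, 2, 2, 2, 2] -- finite, so 41/29 is rational
    print(cf_sqrt(2, 8))                  # [1, 2, 2, 2, 2, 2, 2, 2] -- the 2s never stop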
lupire 9 hours ago [-]
There's a simple mechanical transformation from one path to the other.
As proofs that "the store is reachable", they are essentially the same if it is already known that you live on a "block" with the store. If it is not known that you live on a block, then the second proof together with the first gives a much deeper result, proving that you do live on a block. That makes a second proof valuable, but in the monograph of history, it is most parsimonious to make the block proof and then note how it implies two trivially distinct ways of reaching the store.
ColinWright 9 hours ago [-]
So you are saying that the two proofs are different, but there is a third proof that gives each of the first two as corollaries.
So ... the first two proofs are different, then.
lupire 9 hours ago [-]
That's one opinion. The OP and I have a different opinion.
Y_Y 9 hours ago [-]
Neglect considerations of homotopy at your peril!
gus_massa 8 hours ago [-]
Yep.
If you can go from A to C via B or B' and the whole place is a nice grass field, they are probably equivalent.
But if between B and B' there is an active volcano, most people would call the paths different.
Twisol 9 hours ago [-]
I disagree with this on two points.
First, oftentimes the interest in proving long-standing, difficult mathematical problems comes from the hope that a proof will demonstrate new tools for tackling similar problems. In that sense, the exact content of a proof is quite important. Not to mention, there is value in having multiple proofs that each demonstrate quite different toolkits. Mere truth is not often the most important thing -- after all, mathematicians can (and do!) take certain propositions as premises of downstream work. If we discover a proof for one of those premises, that just means we can drop the premise from downstream work. Not having a proof doesn't block us from making use of the proposition anyway.
Second, sometimes the content of the proof is relevant formally. A sibling comment gave an example in terms of paths between two points; it is often the case that you care not only that the points are merely connected, but you also have a preference for which path is taken. Or, you can do an analysis of the paths themselves, and determine their length or average furthest distance from the nearest McDonalds. A path is "just" a proof of connectivity, but the individual paths can be quite distinct when studied themselves. Less abstractly, a constructive proof will yield an algorithm that can be performed, and we know quite well that the variety of sorting algorithms (that "merely" prove that "all lists are sortable") actually vary in quite important ways, including asymptotics and stability.
pkoird 8 hours ago [-]
I don't think you have disagreed with me. You have advocated that different tools/methods are useful for different problems and may have unique properties that make them interesting in specific contexts. I completely agree and I have not stated anything against it. My opinion, admittedly abstract and stated without proof, was simply that if you have two ways of showing something to be true, they must be logically equivalent (in some sense of the word) if you are willing to dig deep enough. This does not necessarily imply that certain abstractions are not useful on their own, merely that at a certain level, they should represent the same thing.
I fully understand that this is not a concrete argument and I have not stated my opinion with desirable rigor (but the author of the original article does provide a few examples in support). Maybe someone with a better grasp on abstract mathematical concepts could convey my arguments better (if they think it's true).
Twisol 8 hours ago [-]
That's a fair response; thanks for taking the time.
I was primarily reacting to this part of your message...
> I'd argue that two correct proofs are always the same.
...with emphasis on the "always". To my eyes, a proof is any object that witnesses the truth of a proposition. The proof can be more than a witness, but to be called a proof, it must do at least that much.
To say that "two correct proofs are always the same" is, to me, to say that proofs can be no more than witnesses of the proposition; to be the same means to be indistinguishable. My argument is that two correct proofs may be distinct in useful ways.
I suppose this discussion depends on what "same" means ("depends on what the meaning of the word 'is' is", heh). Do you mean something other than what I mean? Your use of "logically equivalent" is probably telling -- so, two proofs should have the same deductive power? We often speak of propositions this way, but I'm not sure how to understand that notion on proofs. Terence Tao gives some possible notions of proof equivalence in a comment on the OP [0]; you might enjoy reading them and considering which is closest to your intuitive idea of equivalence :)
[0]: https://gowers.wordpress.com/2007/10/04/when-are-two-proofs-...
I can attempt to semi-formalize it but I'm sure I'd butcher it along the way so feel free to point out anything that doesn't feel correct.
Consider a set of premises P that are assumed to be true. Also, consider that we are trying to analyze a statement s0 assuming P.
One proof could be of the form:
P1: s0 -> s1 -> s2 -> ... -> T/F.
Another proof could be of the form:
P2: s0 -> s11 -> s12 -> ... -> T/F.
Where T/F represents a single terminal symbol, i.e. either T (true) or F (false), and s1..., s11..., etc. could be different abstractions that have been employed to illustrate the veracity of the statement.
Regardless, both of these abstractions make use of the same logical rules at each step, so you could argue that the logical chains of P1 and P2 are equivalent in some sense. If you think about it, it does seem obvious though, because if P1 and P2 disagreed, with P1 yielding T and P2 yielding F while using the same set of logical rules, it must be the case that either the logic is not consistent or one or both of the chains has errors.
So now, one could argue that all such correct logical chains (maybe of different lengths) that start with a statement s0 and terminate at a single symbol (say T) should essentially be the same.
s1 -> s2 -> s3 -> ... -> sn
↑................................↓
s0 -> s11 -> s12 -> ...->T
You could also argue that there must be exactly one such chain of the smallest possible complexity (in some sense) and that all other chains should be reducible to this one chain (not sure how).
At the end, I still agree with you in that two correct proofs can be distinct in useful ways but since proofs, to me, are a series of application of logic under certain premise to obtain a terminal symbol, all such logically sound chains must actually correspond to the one fundamental chain that's irreducible (in some sense).
Twisol 7 hours ago [-]
Thanks for taking a stab at it! I think I understand the angle you're attempting to take. May I offer a relatively contrived counterexample to poke at this a little more deeply?
Suppose I have a proposition that says, roughly, "if A and B and C then contradiction". Furthermore, suppose that A and B together are already contradictory, and B and C together are also already contradictory.
Now I can construct two proofs, one in which I use A and B (but not C) to yield the desired result, and another in which I use B and C (but not A).
In what way can we say that these two proofs are essentially the same? It appears that each uses potentially rather distinct information in order to derive the expected contradiction; it isn't clear how to go from a proof that avoids A to a proof that avoids C in a smooth way.
pkoird 7 hours ago [-]
That is a really good question. I suppose you could reduce it further by saying that you want the proof of "A or B". Assuming both true, it suffices to either get a proof for A or for B (of course, this may not be true in general).
Regardless, this is a really good counter-example that will force me to think some more about it. Thanks!
Twisol 5 hours ago [-]
> I suppose you could reduce it further by saying that you want the proof of "A or B". Assuming both true, it suffices to either get a proof for A or for B
Yes, absolutely :) I thought about this framing too, but figured the one I gave above might be more immediately convincing.
lupire 9 hours ago [-]
Asymptotics and stability are different theorems.
An algorithm is not a proof. It is a technique for proof.
Two algorithms can be different, while not being meaningfully different proofs that a list is sortable.
To the extent that they are different, they prove different theorems, such as "list can be sorted in O(f) time" for an f of interest.
Twisol 9 hours ago [-]
> An algorithm is not a proof.
That is an opinion that many do not share. FWIW, I framed my response as an opinion; you gave yours as a blanket statement. It is not wrong to treat algorithms as valid proofs.
In a dependent type theory, propositions are represented as types; the proposition that "all lists can be sorted" could be represented as the type "forall (t : Type) -> (le : Ordered t) -> forall (xs : List t) -> exists (ys : List t). (Increasing le ys, PermutationOf xs ys)". A proof of this proposition is exactly a program (algorithm) with that type; the sorted list is the `ys` component of the returned existential product. Yet the inhabitants of this type are not graded by asymptotics or stability; any sorting algorithm will do.
In a setting where inhabitants of the above type are distinguishable, you could then write proofs of asymptotics or stability against individual algorithms. That is, the proofs of the sorting proposition are themselves the subjects of subsequent propositions and proofs thereof.
js8 8 hours ago [-]
> It is not wrong to treat algorithms as valid proofs.
I think that's what PP meant, i.e. if you want to differentiate between sorting algorithms in terms of efficiency, you somehow should encode this demand into the types (specification).
Twisol 8 hours ago [-]
You can do that, yes. My argument is that you don't have to do that -- you can prove asymptotics or stability after the fact, having previously only given an algorithm as a proof of sortability.
Putting these properties in the specification assumes you already know (or suspect) that your algorithm has these properties; then you are simply verifying what you knew. If you develop your algorithm first, then want to analyze it in terms of (say) asymptotics, then not only is it far too late to change the type, you don't even know what the asymptotics are yet. You'd still like to treat the algorithm formally in order to determine those asymptotics, but since you don't know them yet, the algorithm can't inhabit a type that states those asymptotics outright.
winwang 8 hours ago [-]
Two programs which are semantically equivalent are not simply the same. See: bubblesort vs mergesort. (Yes I'm relying on curry-howard isomorphism here).
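To make that concrete, here is a hedged sketch (my own instrumented versions, counting comparisons): both "prove" that lists are sortable by producing the same sorted output, yet they are clearly not the same program.

    def bubble_sort(xs):
        """O(n^2) comparisons in the general case; returns (sorted list, comparison count)."""
        xs, comps = list(xs), 0
        for i in range(len(xs)):
            for j in range(len(xs) - 1 - i):
                comps += 1
                if xs[j] > xs[j + 1]:
                    xs[j], xs[j + 1] = xs[j + 1], xs[j]
        return xs, comps

    def merge_sort(xs):
        """O(n log n) comparisons; returns (sorted list, comparison count)."""
        if len(xs) <= 1:
            return list(xs), 0
        mid = len(xs) // 2
        left, cl = merge_sort(xs[:mid])
        right, cr = merge_sort(xs[mid:])
        merged, comps, i, j = [], cl + cr, 0, 0
        while i < len(left) and j < len(right):
            comps += 1
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged += left[i:] + right[j:]
        return merged, comps

    data = list(range(200, 0, -1))           # a reversed list
    b_sorted, b_comps = bubble_sort(data)
    m_sorted, m_comps = merge_sort(data)
    assert b_sorted == m_sorted == sorted(data)
    print(b_comps, m_comps)                  # 19900 vs far fewer: same output, very different cost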
andrewla 4 hours ago [-]
I don't know why this hasn't been voted to the top. Curry-Howard isomorphism is a hell of a bludgeon to apply here but it makes for a very straightforward and obvious refutation of the parent post.
justinpombrio 8 hours ago [-]
Some proofs that aren't "essentially the same":
1. Prove that the interior angles of a triangle sum to 180 degrees.
First proof: draw a line parallel to one of the triangle's sides passing through its opposite vertex. There are three angles on one side of this line, and they obviously add to 180 degrees because it's a line. One of the three angles is directly one of the triangle's interior angles; the other two can be shown to be equal to the triangle's other two interior angles. (Try drawing it out.)
Second proof: start at one side of the triangle and walk around it. By the time you return to where you started, you must have turned 360 degrees. Thus the sum of the exterior angles is 360 degrees. Each interior angle is 180 minus the corresponding exterior angle, and there are three of them, so calling the interior angles A, B, C and the exterior angles A', B', C' we have A'+B'+C' = 360 implies (180-A) + (180-B) + (180-C) = 360 implies 540 - A - B - C = 360 implies 180 = A + B + C.
2. Prove that the sum of the first N numbers is N(N+1)/2.
First proof: sum the first and last number to get 1 + N, then the second and second-to-last to get 2 + (N-1) = 1 + N, repeating until you get to the middle. There are N/2 such pairs, giving a total of (1 + N)N/2. (This assumes an even number of terms; consider the odd case too.)
Second proof: proceed by induction. For the base case, it's true for N=1 because 1*2/2 = 1. For the inductive case, suppose it's true for N-1. Then 1 + 2 + ... + N-1 + N = (1 + 2 + ... + N-1) + N = N(N-1)/2 + N = N(N-1)/2 + 2N/2 = N(N+1)/2.
pkoird 7 hours ago [-]
I'm responding to your second example simply because it's easy to argue about. I'd say that both proofs that you have presented are equivalent ways of saying that "since when you sum all the numbers from 1 to N you obtain a number that's N(N+1)/2, therefore, it is true that the sum of numbers from 1 to N is N(N+1)/2".
Now, this argument may appear trite but do consider that both of your proofs essentially do the same thing with the first one summing the numbers from extremities and the second one summing 1...N-1 first and then the last. I'd argue that if addition were not commutative, you may have obtained different results.
justinpombrio 6 hours ago [-]
If two programs are equivalent, you can typically show that they're equivalent with a sequence of small refactorings. Replace `x + x` with `2 * x`. Inline that function call. Etc.
Can you do that with these two proofs? What's a proof that's halfway in between the two?
If you can get from one proof to the other with small "refactorings", then I agree that they're fundamentally the same. If you can't---if there's an insurmountable gap that you need to leap across to transform one into the other---then I'd call them fundamentally different. If you insist that two proofs are "essentially the same thing" despite having this uncrossable gap between them, then I suspect you're defining "essentially the same" to mean "proves the same thing", which is a stupid definition because it makes all proofs the same by fiat, and avoids the interesting question.
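For contrast, here is what such a chain of small refactorings looks like when it does exist (my own toy example, deliberately close to the N(N+1)/2 proofs above): each version is one small, meaning-preserving step away from the next.

    # Three programs for 1 + 2 + ... + n, each a small refactoring away from the next.
    def sum_v0(n: int) -> int:
        return sum(range(1, n + 1))                                # direct loop over the terms

    def sum_v1(n: int) -> int:
        return sum(k + (n + 1 - k) for k in range(1, n + 1)) // 2  # pair k with n+1-k, then halve

    def sum_v2(n: int) -> int:
        return n * (n + 1) // 2                                    # each pair sums to n+1; there are n of them, halved

    assert sum_v0(7) == sum_v1(7) == sum_v2(7) == 28

The question posed above is whether anything like this exists between the two proofs; if every intermediate step requires a leap, that is evidence they are genuinely different.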
seanhunter 5 hours ago [-]
I don't think this is true because a proof does more than state a conclusion. It establishes a true path from some premises to that conclusion. Sometimes that path continues.
For example if you had a general constructive proof that there were infinitely many prime numbers it should be a simple matter to alter it a bit and prove the twin prime conjecture wouldn't it?
In general a constructive proof and a non-constructive proof of some fact (say proof by contradiction) are fundamentally different in terms of where you can go with the proof.
Someone 8 hours ago [-]
Theorem: there are 500,000 odd integers between zero and a million.
Proof #1: there are no odd integers between zero and 1 (inclusive), 1 is odd so there is 1 odd integer between zero and 2, 2 is even so there is 1 odd integer between zero and 3, 3 is odd so there are 2 odd integers between zero and 4, …, 999,998 is even so there are 499,999 odd integers between zero and 999,999, 999,999 is odd so there are 500,000 odd integers between zero and 1,000,000. QED.
Proof #2: this is a specific case of “there are n odd integers between zero and 2n (exclusive)”. (proof of the more general theorem). Picking n to be 500,000, the theorem follows.
I think most people would call those two proofs different.
drdeca 9 hours ago [-]
A proof is not a statement that something is true, but a demonstration that it is true.
Are you familiar with the proofs-as-programs idea?
The uh, something isomorphism? Idr the name.
Not all programs that implement a function are the same.
When you boil things down to the fundamental steps of the logic you are working on, you needn’t get the same thing.
For one thing, it may be that axioms A and B suffice to prove Z, and that axioms B and C suffice to prove Z, but that B alone doesn’t, and that A and B doesn’t prove C and that B and C doesn’t prove A.
So, the proofs using A and the proofs using C are certainly different.
js8 8 hours ago [-]
I think "propositions-as-types" is exactly why we should consider proofs to be the same if they prove the same type.
As others have already said, if you want to distinguish between different proofs, it's better to encode those distinctions formally into types (and thus potentially into another mathematical theory).
naniwaduni 4 hours ago [-]
> I'd argue that two correct proofs are always the same
All correct inferences proceeding from the same axioms are the same.
tightbookkeeper 45 minutes ago [-]
No. Because the whole point of proof from a human perspective is to express understanding of the question, not merely answer it.
shaunxcode 9 hours ago [-]
yes : if two discrete semiotic symbolic networks point to the same signified value they are the same in the way two different poems with the same meaning are the same. which is to say they are unique but have the same meaning.
lupire 9 hours ago [-]
A Proof is not a statement. A theorem is a statement.
Proofs are usually not completely formal or even formalizable. Math is not completely well founded.
"Unravelling it all the way" might be an open research project, or a new conjecture directly inspired by the second, apparently different proof. Showing these two profs to be equivalent might depend on a major new idea that happens after the two proofs are createdm
This is hinted at in the OP discussion of Terry Tao.
bjornsing 9 hours ago [-]
> Proofs are usually not completely formal or even formalizable. Math is not completely well founded.
This is often stated, but is it really true? I haven’t seen a persuasive argument that not all math could (in principle) be formalized.
lupire 9 hours ago [-]
> A couple of years ago I spoke at a conference about mathematics that brought together philosophers, psychologists and mathematicians. The proceedings of the conference will appear fairly soon
"... it has been argued that there are cases in which it is not possible to determine whether two programs are the same without making reference to an external semantics. Sprevak (2010) proposes to consider two programs for addition which differ from the fact that one operates on Arabic, the other one on Roman numerals. The two programs compute the same function, namely addition, but this cannot always be established by inspecting the code with its subroutines; it must be determined by assigning content to the input/output strings"
"The problem can be tackled by fixing an identity criterion, namely a formal relation, that any two programs should entertain in order to be defined as identical. Angius and Primiero (2018) show how to use the process algebra relation of bisimulation between the two automata implemented by two programs under examination as such an identity criterion. Bisimulation allows to establish matching structural properties of programs implementing the same function, as well as providing weaker criteria for copies in terms of simulation."
(Of course, it isn't surprising that this would be relevant, because proofs and programs themselves are isomorphic).
This technique seems rather stricter than what Gowers has in mind, but it seems helpful as a baseline.
0. https://plato.stanford.edu/entries/computer-science/
Formally we may not be interested in this distinction but practically we definitely are. One program may be extremely practical and useful whereas the other might not finish computing anything before the heat death of the universe on anything but trivial-sized inputs.
Perhaps TCO is better thought of as a language extension.
In practice, of course, the effect is far more dramatic with a MMU.
There are two principal ways of applying asymptotic analysis to algorithms: time or memory used. In both, your procedure is O(n) without TCO. With TCO it is O(n) for runtime (though further optimization would reduce it to O(1) since it's just the constant function 0, but TCO alone doesn't get us there) and O(1) for space since it would reuse the same stack frame.
What O(log n) addresses do you need to hold the stack frames when there are O(n) stack frames needing O(n) addresses (without TCO, which, again, reduces it to O(1) for memory)?
Also, regarding "without tail calls", your example already has tail calls. What do you mean by that?
And if the algorithm actually did anything with the value that made it grow or shrink with the recursion, the TCO version would stop being O(n) under such a framework. This only works because it's passing 0 around every iteration. And this probably already applies to the TCO version's flow control depending on how you expect to run it.
Regardless, the comment I replied to is fundamentally confused (presented a tail recursive algorithm and said it didn't have tail calls, presented a linear algorithm that uses a linear amount of memory and claims it's O(n log n) for some reason but no clarification if it's in time or space). I'd rather hear from the person I responded to than whatever guesses the rest of us can come up with because it needs several points of clarification to be understood.
But that's too difficult in some cases. Most of the Goodstein sequences reach extraordinarily high values before coming back down. How can we prove they all eventually reach 0? Even at small values of n, the sequence length of G(n) requires something on the order of the Ackermann function to specify bounds. We can't inspect these sequences directly to prove whether they reach 0. Instead we create a "parallel" sequence to a Goodstein sequence. Then we prove there exists an algorithm that maps from each item in the parallel sequence to an item in the Goodstein sequence such that both sequences are well-ordered and decreasing. If the parallel sequence reaches 0, then so does the Goodstein sequence. You could think of this as one Turing machine computing the configurations of another Turing machine or perhaps one branch of a tree "cross-predicting" the items along another branch. You aren't just following the branch to its end. In this sense, the proof occurs at a higher "level".
This concept is known as ordinal analysis and one can consider the proof-theoretic ordinal of any theory T. If T_1 and T_2 both prove a specific theorem and have the same proof-theoretic ordinal, you could consider the two proofs to occur on the same "level". Interestingly, Peano Arithmetic can prove that any specific Goodstein sequence reaches 0 but not that all Goodstein sequences reach 0—this requires a more powerful formal system. So if you prove a specific sequence reaches 0 using the more powerful system, I would say that's a fundamentally different proof.
There's obviously a sense in which they're the same, but at the proof level, I would be surprised if that's a particularly useful sense, because the whole point of a proof is that it's the journey, not the destination. Even within the same "level," in your terms.
As with a lot of philosophy, the argument turns out to actually be much more about defining terms being used than the objects those terms are referring to. I mean when you are making an argument about "x is the same as y because..." your philosophical argument is actually about what should be meant by the same instead of any particular properties of x or y.
The article seems to be digging at the existence of a few categories of proofs
* proofs that are trivially transformed into one another
* proofs that use substantially similar arguments that don't necessary have a direct transformation
* proofs that arrive at the same destination through a much different path that have no obvious transformation to another proof
So the question is: how easy does it have to be to transform one proof to another in order for them to be considered "the same"?
One extreme is "the slightest change in wording makes a proof unique"
The other extreme is "any two proofs of the same concept are by definition the same proof"
I would argue that neither extreme is particularly useful, because both are just obvious. One means "these are different sheets of paper" and the other means "these are both proofs of X", neither of which are interesting statements.
What is an interesting statement is commentary on the path made to a proof and the differences in paths to proving a statement. Both in the ability to transform one into another easily to show similarity, and in the difficulty to transform one into another to show divergence.
I strongly assume this would be hard.
https://en.wikipedia.org/wiki/Proof_net
I think in some cases, we can. Sometimes one of the proofs generalizes better than the other because it uses strictly fewer assumptions. It seems fair to say those would have to be inequivalent.
https://mathoverflow.net/questions/3776/when-are-two-proofs-...
I would not consider those the same though, as one is constructive and the other is not.
My collaborator then translated the original paper into Greek. Or maybe he translated "Julius Caesar" into Greek. I don't speak Greek so I have no way of knowing.
;-)
1. If 2=a^2/b^2 with a and b relatively prime then a^2=2b^2 so a is even, and, letting a=2c, b is also even a contradiction.
2. Use the lemma that a positive real r is rational if and only if there is a positive integer b such that br is an integer. (Proof left to the reader.) if sqrt{2} is ration then there is an integer b such that bsqrt{2}=a for some integer a. Let b be the smallest such integer. Then, if c=b(sqrt{2}-1) then c is smaller than b, c=bsqrt{2}-b is an integer, and sqrt{2}c=sqrt{2}b(sqrt{2}-1)=2b-b*sqrt{2} is an integer, a contradiction (this uses infinite descent).
3. Use the theorem that if x, y, and n are positive integers such that x^2-ny^2=1 then sqrt{n} is irrational, and apply with n=2, x=3, y=2. Proof of theorem. a. If x^2-ny^2=1, then using (x^2-ny^2)^2=(x^2+ny^2)^2-n(2xy)^2 to show that there are arbitrarily large solutions to x^2-ny^2=1. b. If n=a^2/b^2 then 1=x^2-(a^2/b^2)y^2 so b^2=b^2x^2-a^2y^2=(bx+ay)(bx-ay)>=bx+ay>bx so b>x but this contradicts the existence of arbitrarily large x.
How are any of these the same as any other?
I'd argue that this is not the case.
If we consider that we are trying to prove "you can reach the local food store from your house" then starting from either side would consist of two proofs by example. And for sure these are different paths one is taking and should be different! But if you consider deeply, both of these proofs are implicitly encoding same information about the space between your house and the local store:
1) there is continuous space between your house and the store i.e. the store is reachable from your house. (as opposed to your house being in on an island and you not being able to swim) 2) you can traverse two points in a continuous space.
What I wanted to opine was merely the fact that since all proofs use logic, assuming certain premise, all theorems about a certain statement being true must be reducible to a single irreducible logical chain of argument. It is true that we use different abstractions that have relevant meaning and ease in different contexts but since all of our abstractions are based upon logic in the first place, it does not seem outlandish to me to think that any logical transformation and subsequent treatment between two proof structures should inherently encode the same facts.
In the house/store example, a path is also a witness to the fact that, if you perform a road closure anywhere not on the path, then connectivity is preserved. Simply stating that the two points are connected doesn't tell you whether it's safe to close a road! Moreover, taking the two paths together tells you that performing a single road closure that only affects one of the paths will still leave a route you can take.
In both examples, if the paths were logically interchangeable, you wouldn't be able to get more information out of the both of them than you could from just one. But because they aren't equivalent -- because each contains some information that the other does not -- we can deduce more from both together than from either individually.
Why is this necessarily true? We know that true statements in topology (for example) don't all reduce down to being equivalent (eg if I have a loop that goes through the ring of a donut/toroid it doesn't reduce the same as if I have a loop on the surface of the donut/toroid so establishing facts about one wouldn't tell me facts about the other). So how do we know that statements in logic reduce? Could the space of logical statements not have topological characteristics like that?
That is only _some_ of the informations that they encode, and particularly informations shared by both proofs, but that it not the only information they encode! The exact way to reach the local food store is also some information, and they encode different ways, hence different informations.
> What I wanted to opine was merely the fact that since all proofs use logic
Note that there's no single logic! There are at least two big logics, classical and constructive/intuitionistic, each with their own variants.
For example a proof by contradiction is valid in classical logic but not in constructive one. It would give you a proof that there must be a way to reach the local store without giving you the way to reach it. Would it still count as the same proof as the other two for you? It doesn't encode how to reach it, so for some it's not even a valid proof.
Also:
> ... all theorems about a certain statement being true must be reducible to a single irreducible logical chain of argument.
Citation needed ... I have no reason to believe this is true.
But here's an example of two proofs.
Proving sqrt(2) is irrational.
Proof 1: Every rational number has a finite Continued Fraction representation. But the normalised Continued Fraction representation of sqrt(2) is [1;2,2,2,...], which is infinite. Since this is infinite, sqrt(2) is irrational.
Proof 2: Consider integers a and b, and suppose 2(b²)=a². Consider the prime decompositions of a and b, and count how many times "2" turns up on each side. It's odd on the left, it's even on the right, so this cannot happen. Therefore we can never have integers a and b with 2(b²)=a². Therefore we can't have 2=(a²)/(b²)=(a/b)². So any fraction when squared cannot equal 2, so sqrt(2) is irrational.
Do these really feel "the same" to you?
So ... the first two proofs are different, then.
If you can go from A to C by B or B' and all the place is a nice grass field they are probably equivalent.
But if between B anb B' there is an active vocano, most people would call the paths different.
First, oftentimes the interest in proving long-standing, difficult mathematical problems is because we hope a proof will demonstrate new tools for tackling similar problems. In that sense, the exact content of a proof is quite important. Not to mention, there is value in having multiple proofs that each demonstrate quite different toolkits. Mere truth is not often the most important thing -- after all, mathematicians can (and do!) take certain propositions as premises of downstream work. If we discover a proof for one of those premises, that just means we can drop the premise from downstream work. Not having a proof doesn't block us from making use of the proposition anyway.
Second, sometimes the content of the proof is relevant formally. A sibling comment gave an example in terms of paths between two points; it is often the case that you care not only that the points are merely connected, but you also have a preference for which path is taken. Or, you can do an analysis of the paths themselves, and determine their length or average furthest distance from the nearest McDonalds. A path is "just" a proof of connectivity, but the individual paths can be quite distinct when studied themselves. Less abstractly, a constructive proof will yield an algorithm that can be performed, and we know quite well that the variety of sorting algorithms (that "merely" prove that "all lists are sortable") actually vary in quite important ways, including asymptotics and stability.
I fully understand that this is not a concrete argument and I have not stated my opinion with desirable rigor (but the author of the original article does provide a few examples in support). Maybe someone with a better grasp on abstract mathematical concept could convey my arguments better (if they think it's true).
I was primarily reacting to this part of your message...
> I'd argue that two correct proofs are always the same.
...with emphasis on the "always". To my eyes, a proof is any object that witnesses the truth of a proposition. The proof can be more than a witness, but to be called a proof, it must do at least that much.
To say that "two correct proofs are always the same" is, to me, to say that proofs can be no more than witnesses of the proposition; to be the same means to be indistinguishable. My argument is that two correct proofs may be distinct in useful ways.
I suppose this discussion depends on what "same" means ("depends on what the meaning of the word 'is' is", heh). Do you mean something other than what I mean? Your use of "logically equivalent" is probably telling -- so, two proofs should have the same deductive power? We often speak of propositions this way, but I'm not sure how to understand that notion on proofs. Terence Tao gives some possible notions of proof equivalence in a comment on the OP [0]; you might enjoy reading them and considering which is closest to your intuitive idea of equivalence :)
[0]: https://gowers.wordpress.com/2007/10/04/when-are-two-proofs-...
Consider a set of premises P that are assumed to be true. Also, consider that we are trying to analyze a statement s0 assuming P.
One proof could be of the form:
P1: s0 -> s1 -> s2 -> ... -> T/F.
Another proof could be of the form:
P2: s0 -> s11 -> s12 -> ... -> T/F.
Where T/F represent a single terminal symbol i.e. either T (true) or F (false) and s1... and s11... etc. could be different abstractions that have been employed to illustrate the veracity of the statement.
Regardless, both of these abstractions make use of the same Logical rules at each step so you could argue that the logical chain of both P1 and P2 are equivalent in some sense. If you think about it, it does seem obvious though, because if P1 and P2 disagreed with P1 yielding T and P2 yielding F, while using the same set of logical rules, it must be the case that either the logic is not consistent or one or both of the chain has errors.
So now, one could argue that all such correct logical chains (maybe of different lengths) that start with a statement s0 and terminate at a single symbol (say T) should essentially be the same.
s1 -> s2 -> s3 -> ... -> sn
↑                         ↓
s0 -> s11 -> s12 -> ... -> T
You could also argue that there must be exactly one such chain of the smallest possible complexity (in some sense) and that all other chains should be reducible to this one chain (not sure how).
In the end, I still agree with you that two correct proofs can be distinct in useful ways, but since proofs, to me, are a series of applications of logical rules under certain premises to reach a terminal symbol, all such logically sound chains must ultimately correspond to one fundamental chain that is irreducible (in some sense).
Suppose I have a proposition that says, roughly, "if A and B and C then contradiction". Furthermore, suppose that A and B together are already contradictory, and B and C together are also already contradictory.
Now I can construct two proofs, one in which I use A and B (but not C) to yield the desired result, and another in which I use B and C (but not A).
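In Lean-style notation (just a sketch; A, B, C are abstract propositions, and hAB, hBC are hypothetical witnesses of the two contradictions assumed above), the two proofs might look like this:

    -- Proof 1: uses A and B, never touches C.
    example (A B C : Prop) (hAB : A → B → False) :
        A ∧ B ∧ C → False :=
      fun ⟨a, b, _⟩ => hAB a b

    -- Proof 2: uses B and C, never touches A.
    example (A B C : Prop) (hBC : B → C → False) :
        A ∧ B ∧ C → False :=
      fun ⟨_, b, c⟩ => hBC b c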
In what way can we say that these two proofs are essentially the same? It appears that each uses potentially rather distinct information in order to derive the expected contradiction; it isn't clear how to go from a proof that avoids A to a proof that avoids C in a smooth way.
Regardless, this is a really good counter-example that will force me to think some more about it. Thanks!
Yes, absolutely :) I thought about this framing too, but figured the one I gave above might be more immediately convincing.
An algorithm is not a proof. It is a technique for proof. Two algorithms can be different, while not being meaningfully different proofs that a list is sortable.
To the extent that they are different, they prove different theorems, such as "a list can be sorted in O(f) time" for an f of interest.
That is an opinion that many do not share. FWIW, I framed my response as an opinion; you gave yours as a blanket statement. It is not wrong to treat algorithms as valid proofs.
In a dependent type theory, propositions are represented as types; the proposition that "all lists can be sorted" could be represented as the type "forall (t : Type) -> (le : Ordered t) -> forall (xs : List t) -> exists (ys : List t). (Increasing le ys, PermutationOf xs ys)". A proof of this proposition is exactly a program (algorithm) with that type; the sorted list is the `ys` component of the returned existential product. Yet the inhabitants of this type are not graded by asymptotics or stability; any sorting algorithm will do.
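A rough Lean rendering of that type (my sketch, not an exact transcription; it assumes Mathlib's List.Sorted and List.Perm, and uses a subtype rather than a Prop-level existential so that different inhabitants remain distinguishable):

    import Mathlib

    -- Any inhabitant of SortSpec is literally a sorting procedure, bundled
    -- with evidence that its output is ordered and a permutation of its input.
    def SortSpec : Type 1 :=
      ∀ (t : Type) [LinearOrder t] (xs : List t),
        { ys : List t // ys.Sorted (· ≤ ·) ∧ xs.Perm ys }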
In a setting where inhabitants of the above type are distinguishable, you could then write proofs of asymptotics or stability against individual algorithms. That is, the proofs of the sorting proposition are themselves the subjects of subsequent propositions and proofs thereof.
I think that's what PP meant, i.e. if you want to differentiate between sorting algorithms in terms of efficiency, you should somehow encode this demand into the types (the specification).
Putting these properties in the specification assumes you already know (or suspect) that your algorithm has these properties; then you are simply verifying what you knew. If you develop your algorithm first, then want to analyze it in terms of (say) asymptotics, then not only is it far too late to change the type, you don't even know what the asymptotics are yet. You'd still like to treat the algorithm formally in order to determine those asymptotics, but since you don't know them yet, the algorithm can't inhabit a type that states those asymptotics outright.
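As a toy illustration of that after-the-fact workflow (purely a sketch; insertCounted and isortCounted are made-up names), one option is to instrument the algorithm with an explicit comparison counter and then state the cost bound as a separate theorem about that particular program:

    -- Insertion into a sorted list, returning the result together with the
    -- number of comparisons performed.
    def insertCounted (x : Nat) : List Nat → List Nat × Nat
      | [] => ([x], 0)
      | y :: ys =>
        if x ≤ y then
          (x :: y :: ys, 1)
        else
          let (rest, c) := insertCounted x ys
          (y :: rest, c + 1)

    -- Insertion sort with an accumulated comparison count.
    def isortCounted : List Nat → List Nat × Nat
      | [] => ([], 0)
      | x :: xs =>
        let (sorted, c1) := isortCounted xs
        let (ys, c2) := insertCounted x sorted
        (ys, c1 + c2)

    -- A bound one might now try to prove about this specific program,
    -- separately from the fact that it sorts:
    -- theorem isortCounted_cost (xs : List Nat) :
    --     (isortCounted xs).2 ≤ xs.length ^ 2 := ...

The point is that the cost statement lives alongside the program rather than inside its type, so nothing about the asymptotics needed to be known when the program was written.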
1. Prove that the interior angles of a triangle sum to 180 degrees.
First proof: draw a line parallel to one of the triangle's sides passing through its opposite vertex. There are three angles on one side of this line, and they obviously add to 180 degrees because it's a line. One of the three angles is directly one of the triangle's interior angles; the other two can be shown to be equal to the triangle's other two interior angles. (Try drawing it out.)
Second proof: start at one side of the triangle and walk around it. By the time you return to where you started, you must have turned 360 degrees. Thus the sum of the exterior angles is 360 degrees. Each interior angle is 180 minus the corresponding exterior angle, and there are three of them, so calling the interior angles A, B, C and the exterior angles A', B', C' we have A'+B'+C' = 360 implies (180-A) + (180-B) + (180-C) = 360 implies 540 - A - B - C = 360 implies 180 = A + B + C.
2. Prove that the sum of the first N numbers is N(N+1)/2.
First proof: sum the first and last number to get 1 + N, then the second and second-to-last to get 2 + (N-1) = 1 + N, repeating until you get to the middle. There are N/2 such pairs, giving a total of (1 + N)N/2. (This assumed that there were an even number of terms; consider the odd case too.)
Second proof: proceed by induction. For the base case, it's true for N=1 because 1*2/2 = 1. For the inductive case, suppose it's true for N-1. Then 1 + 2 + ... + N-1 + N = (1 + 2 + ... + N-1) + N = N(N-1)/2 + N = N(N-1)/2 + 2N/2 = N(N+1)/2.
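For what it's worth, the second proof mechanizes quite directly. A sketch in Lean (sumTo is my own definition, the statement is multiplied through by 2 to avoid natural-number division, and Mathlib is assumed for the ring tactic):

    import Mathlib

    -- Sum of the first n positive integers.
    def sumTo : Nat → Nat
      | 0     => 0
      | n + 1 => sumTo n + (n + 1)

    -- The inductive argument above, with the division cleared.
    theorem two_mul_sumTo (n : Nat) : 2 * sumTo n = n * (n + 1) := by
      induction n with
      | zero => rfl
      | succ k ih =>
        calc 2 * sumTo (k + 1) = 2 * sumTo k + 2 * (k + 1) := by simp only [sumTo]; ring
          _ = k * (k + 1) + 2 * (k + 1) := by rw [ih]
          _ = (k + 1) * ((k + 1) + 1) := by ring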
Now, this argument may appear trite, but do consider that both of your proofs essentially do the same thing: the first sums the numbers from the extremities inward, while the second sums 1...N-1 first and then adds the last term. I'd argue that if addition were not commutative, you might have obtained different results.
Can you do that with these two proofs? What's a proof that's halfway in between the two?
If you can get from one proof to the other with small "refactorings", then I agree that they're fundamentally the same. If you can't---if there's an insurmountable gap that you need to leap across to transform one into the other---then I'd call them fundamentally different. If you insist that two proofs are "essentially the same thing" despite having this uncrossable gap between them, then I suspect you're defining "essentially the same" to mean "proves the same thing", which is a stupid definition because it makes all proofs the same by fiat, and avoids the interesting question.
For example, if you had a general constructive proof that there were infinitely many prime numbers, it should be a simple matter to alter it a bit and prove the twin prime conjecture, wouldn't it?
In general, a constructive proof and a non-constructive proof of some fact (say, a proof by contradiction) are fundamentally different in terms of where you can go with the proof.
Proof #1: there are no odd integers between zero and 1 (exclusive), 1 is odd so there is 1 odd integer between zero and 2, 2 is even so there is 1 odd integer between zero and 3, 3 is odd so there are 2 odd integers between zero and 4, …, 999,998 is even so there are 499,999 odd integers between zero and 999,999, 999,999 is odd so there are 500,000 odd integers between zero and 1,000,000. QED.
Proof #2: this is a specific case of “there are n odd integers between zero and 2n (exclusive)”. (proof of the more general theorem). Picking n to be 500,000, the theorem follows.
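For concreteness (a sketch; countOdd and countOdd_double are made-up names), proof #1 is essentially a computation you replay step by step, while proof #2 instantiates a general lemma:

    -- How many odd naturals lie strictly below m.
    def countOdd (m : Nat) : Nat :=
      ((List.range m).filter (fun k => k % 2 == 1)).length

    -- Proof #1 in miniature: a small instance checked by direct computation.
    example : countOdd 10 = 5 := rfl

    -- Proof #2's shape: prove a general theorem once, then instantiate it.
    -- theorem countOdd_double (n : Nat) : countOdd (2 * n) = n := ...
    -- example : countOdd 1000000 = 500000 := countOdd_double 500000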
I think most people would call those two proofs different.
Are you familiar with the proofs-as-programs idea?
The, uh, something isomorphism? I don't recall the name.
Not all programs that implement a function are the same.
When you boil things down to the fundamental steps of the logic you are working in, you needn't get the same thing.
For one thing, it may be that axioms A and B suffice to prove Z, and that axioms B and C suffice to prove Z, but that B alone doesn't, and that A and B don't prove C, and that B and C don't prove A.
So, the proofs using A and the proofs using C are certainly different.
As others have already said, if you want to distinguish between different proofs, it's better to encode those distinctions formally into types (and thus potentially into another mathematical theory).
All correct inferences proceeding from the same axioms are the same.
Proofs are usually not completely formal or even formalizable. Math is not completely well founded. "Unravelling it all the way" might be an open research project, or a new conjecture directly inspired by the second, apparently different proof. Showing these two proofs to be equivalent might depend on a major new idea that happens after the two proofs are created.
This is hinted at in the OP discussion of Terry Tao.
This is often stated, but is it really true? I haven’t seen a persuasive argument that not all math could (in principle) be formalized.
Can we do better?