To formulate this question precisely, I will define an idealized hypothetical "perfect" hash function H(n) which has nice scalability properties, and will formulate a problem PERFECT HASHCASH in terms of that, understanding that practical considerations may end up yielding only an approximation of this ideal.
To keep it simple, we will say that our hash function H(n) takes as input a single natural number n. Then we say that H(n) is a perfect hash function iff:
- H(n) maps each natural number to an infinite binary sequence, any initial segment s of which can be computed in time polynomial in the sizes of n and s (making it a sponge function).
- For any initial segment of length d, the set of all natural numbers n such that H(n) shares that initial segment has natural density = 1/(2^d).
The first property formalizes the scalability of our function, and the second formalizes the idea that we want all hashes to appear roughly "equally often" as an output. Other than that, our perfect hash function is a black box, and we don't care much about exactly how it works, so long as it meets the above properties, as well as the usual desiderata applying to hash functions (easy to compute, hard to invert, hard to find collisions, etc).
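As a rough illustration of the two properties above, here is a minimal Python sketch that uses SHAKE256 from the standard library as a stand-in for H. This is an assumption made purely for illustration: SHAKE256 is a sponge-based extendable-output function, so initial segments of any length can be requested, but nothing here shows that it has natural density exactly 1/(2^d) for every prefix. The names `H_prefix` and `empirical_density` are hypothetical.

```python
import hashlib


def H_prefix(n: int, bits: int) -> str:
    """Return the first `bits` bits of the stand-in H(n) as a '0'/'1' string."""
    # Encode n as a minimal big-endian byte string (at least one byte).
    data = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    nbytes = (bits + 7) // 8
    # SHAKE256 is an extendable-output function: a shorter digest is a
    # prefix of a longer one, which mimics "initial segments" of H(n).
    digest = hashlib.shake_256(data).digest(nbytes)
    bitstring = "".join(f"{byte:08b}" for byte in digest)
    return bitstring[:bits]


def empirical_density(prefix: str, limit: int) -> float:
    """Fraction of n in range(limit) whose hash starts with `prefix`."""
    hits = sum(1 for n in range(limit) if H_prefix(n, len(prefix)) == prefix)
    return hits / limit
```

If the stand-in behaves like a perfect hash function on the sampled range, `empirical_density("00", 4000)` should come out near 1/4. A finite sample can only suggest this, since natural density is a statement about the limit.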
Predicated on the assumption that a perfect hash function exists, we can now define the problem PERFECT HASHCASH as follows: PERFECT HASHCASH takes as input a perfect hash function H and an all-zeroes vector 0^d of length d, which can be thought of as a unary representation of d. A solution to PERFECT HASHCASH is a natural number n such that H(n) starts with 0^d.
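Given those inputs, it is clear that PERFECT HASHCASH is in the complexity class TFNP: it is a function problem whose candidate solutions can be checked in polynomial time (by the first property), and a solution is guaranteed to exist (by the second property, since a set of positive natural density is nonempty).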
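To make the "easy to check" part concrete, here is a hedged sketch of a verifier: checking a candidate n costs one evaluation of the first d bits of H(n). As an assumption for illustration, SHAKE256 from Python's `hashlib` stands in for H, and the function name `starts_with_zero_bits` is hypothetical.

```python
import hashlib


def starts_with_zero_bits(n: int, d: int) -> bool:
    """Check whether the stand-in H(n) begins with d zero bits."""
    if d == 0:
        return True  # the empty prefix matches every n
    # Encode n as a minimal big-endian byte string (at least one byte).
    data = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    nbytes = (d + 7) // 8
    # Only the first ceil(d / 8) bytes of output are needed, so the
    # work grows with the size of n and with d, not with anything else.
    digest = hashlib.shake_256(data).digest(nbytes)
    value = int.from_bytes(digest, "big")
    # Drop the unused low-order bits of the last byte, keeping the first d bits.
    return (value >> (8 * nbytes - d)) == 0
```

A solver can then try n = 0, 1, 2, ... and return the first n for which `starts_with_zero_bits(n, d)` holds. Under the density assumption this search terminates, but it is expected to take on the order of 2^d evaluations.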
Can we also identify PERFECT HASHCASH as a member of any complexity class finer than TFNP?
Could it perhaps be in PPP? PPA? PPAD? Something else?
For background, see Complexity class on Wikipedia.
EDIT: the above question has been overhauled. As I originally formulated it, I assumed that SHA256 is what I'm now calling a perfect hash function. Many people have noted in the comments that this may not be true, so rather than place the emphasis in this question on whether SHA256 specifically has the nice scaling properties we want, I defined an idealized hash function that we hope SHA256 at least approximates nicely enough for real-world purposes, and rephrased the question in terms of that.
As a final note to clear up any potential confusion, to make PERFECT HASHCASH resemble real Hashcash, we'd have to make one more assumption: that there exists some way to start with a block of data (an email, a Bitcoin block, etc) and somehow derive a characteristic perfect hash function from that, perhaps by "salting" a different perfect hash function in such a way that the result is also a perfect hash function. So in the case of a "perfect Bitcoin," all of the miners on the Bitcoin network would be working with their own unique perfect hash functions H'(n) which are somehow tied to the block they're working on, and each miner would simply try H'(0), H'(1), H'(2), ... in order until they find something starting with enough 0's. Each H' would be a different input to PERFECT HASHCASH.
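The salting idea and the miner's loop can be sketched as follows. This is only one possible construction, assumed for illustration: it derives H' by hashing the block data together with n using SHAKE256, and it does not establish that the result is a perfect hash function in the sense defined above. The names `make_salted_hash` and `mine` are hypothetical.

```python
import hashlib
from itertools import count


def make_salted_hash(block: bytes):
    """Derive a block-specific stand-in H'(n) from the block data."""

    def H_prime(n: int, bits: int) -> int:
        """Return the first `bits` bits (bits >= 1) of H'(n) as an integer."""
        nbytes = (bits + 7) // 8
        n_bytes = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
        # Length-prefix the block so that (block, n) pairs encode unambiguously.
        msg = len(block).to_bytes(8, "big") + block + n_bytes
        digest = hashlib.shake_256(msg).digest(nbytes)
        return int.from_bytes(digest, "big") >> (8 * nbytes - bits)

    return H_prime


def mine(block: bytes, d: int) -> int:
    """Try H'(0), H'(1), H'(2), ... and return the first n with d leading zeros."""
    H_prime = make_salted_hash(block)
    for n in count():
        if H_prime(n, d) == 0:
            return n
```

For instance, `mine(b"some block data", 10)` returns the smallest n whose salted hash starts with ten zero bits. Two miners working on different blocks search different functions H', so their work does not overlap.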
+1 though not everyone on this site is a cryptographer; it may be beneficial to describe the acronyms (FNP, TFNP, etc) or to ask on crypto.stackoverflow.com – goodguys_activate – 2013-10-16T14:09:20.230
I think it is the responsibility of the asker to make his question understandable to potential experts answering the question. Giving all background information shouldn't be necessary in every question; it would inflate questions too much. – Murch – 2013-10-25T10:33:53.723
Polynomial in what value? What's playing the role of input size here? – Nate Eldredge – 2014-02-01T20:47:38.920
Nate: Difficulty. – Mike Battaglia – 2014-02-02T00:21:16.067
The pigeonhole principle guarantees that there are an infinite number of solutions for some x. It does not guarantee an infinite number of solutions for every x - I'm not sure we can disprove that there is some x for which there is no solution. – Meni Rosenfeld – 2014-02-03T14:12:16.763
I'm not sure it makes sense to consider difficulty (or its log) as input size. In particular because if the hash function is fixed as SHA-256, difficulty values above 2^256 are meaningless so you can't talk about asymptotics. – Meni Rosenfeld – 2014-02-03T14:15:01.963
@MeniRosenfeld The pp guarantees that there are solutions (more than zero, but not infinitely many, which is impossible for a finite discrete input size).
The correct input size from a complexity standpoint should be the number of bits in x and y. But, to be sloppy, in practice that's going to be essentially the number of bits in the hash function output---less would bring about trouble for the existence of a solution at extreme difficulty, more would presumably(!) quickly become unnecessarily many. – pyramids – 2014-02-03T14:30:39.027
@pyramids: The pp guarantees there is a solution for some x. Not for every x. – Meni Rosenfeld – 2014-02-03T19:45:21.187
Many of the questions here are particular to the real-world limitations of using SHA-256, so I overhauled the question to consider an idealized perfect hash function which scales perfectly and which we hope SHA-256 at least approximates decently in real-world situations. – Mike Battaglia – 2014-02-05T07:00:08.467
@MeniRosenfeld I stand corrected, but then again you already said the "for some x" part; my point was just that "infinite number" was incorrect, I didn't mean to dispute the "for some x" part. I guess it took both of us to find (hopefully) all the errors in your statement. – pyramids – 2014-02-05T18:10:32.170
@MikeBattaglia I saw your changes. I'll see if I can find time for it later, but I suppose all I'll be able to do is (likely) concur that you changed the question such that much of my answer doesn't apply anymore. Which I fear will not help you with the core issue. – pyramids – 2014-02-05T18:12:34.823
@pyramids: My comments were a reply to the OP. The OP said there are infinitely many preimages, which results from the fact that the input string to SHA-256 can be arbitrarily long (it's not confined to any particular input size). This is true and I only contested the "for all x" part. The OP then said that the input size is difficulty, which I said makes no sense. I see no errors in my statement, so I expect an apology for your disrespectful comment and failure to follow the conversation. – Meni Rosenfeld – 2014-02-06T07:45:26.943
@MeniRosenfeld: Sorry to read you're unhappy about my comments. I'm only trying to help in the issue, not to follow orders or wishful thinking along the lines of "I see no errors, so there must be none." If we'd instead look at the issue, we'd probably learn new things---such as that the pigeonhole principle really only works if input and output range have the same size. There is something wrong with the combination of infinitely many and pp here, be it in your answer, the pre-revision question, or somewhere in how we combined them. – pyramids – 2014-02-11T00:15:38.640
@pyramids: I suggest you read https://en.wikipedia.org/wiki/Pigeonhole_principle. The PP definitely applies for sets of different size and infinite sets. In a bijection from an infinite set to a finite set, there must be an element in the output range with infinitely many preimages. This argument is pointless - you're not engaging in the technical discussion, you ignore your own faults and you blame me for things I didn't do. "I see no errors" doesn't mean "there must be none", it means the onus on finding errors is on you if you wish to speak disrespectfully as you did. – Meni Rosenfeld – 2014-02-12T08:22:28.843
@MeniRosenfeld: Wow, you're right, that statement of the pigeonhole principle does have an (even explicitly given!) extension to infinite input. But it gets even stranger than that: The one you linked to explicitly requires the input to be larger than the output, quite the opposite to the way basically the same pigeon hole principle is applied in the polynomial pigeon hole principle (PPP), see http://en.wikipedia.org/wiki/PPP_%28complexity%29, where input and output need to have exactly the same size! – pyramids – 2014-02-12T22:08:25.890
@pyramids: Ok. PPP is an interesting read which I was not previously familiar with, but it's an esoteric concept. What I linked to is what is commonly known as the pigeonhole principle. PPP is so called because of its use of the PP. Note that a solution to PC is "either a preimage of 0, or two preimages of an output". The PP guarantees a solution to this because if there is no preimage of 0, the effective output range is reduced by 1 element - hence there are more input values than output values, the PP applies and there are two preimages to some output. – Meni Rosenfeld – 2014-02-12T22:32:46.773
(PS - Where I said above "bijection" it should have been instead just "function") – Meni Rosenfeld – 2014-02-12T22:37:47.900
@MeniRosenfeld: Yes, exactly. I agree with that explanation. Plus I suppose with that information we can probably agree that there was no malice or disrespect involved? I'd hate to leave you feeling wronged; I was simply narrowly addressing everything in the context of the PPP (and finer subclasses) specifically asked about by the OP, rather than a more general form of the PP. – pyramids – 2014-02-12T22:44:05.217
@pyramids: We can agree that there were misunderstandings. – Meni Rosenfeld – 2014-02-14T09:51:27.497