In this paper we propose a quantification of ensemble of distributions on a set of strings, in terms of how close to pseudorandom a distribution is. The quantification is an adaptation of the theory of dimension of sets of infinite sequences introduced by Lutz. Adapting Hitchcock’s work, we also show that the logarithmic loss incurred by a predictor on an ensemble of distributions is quantitatively equivalent to the notion of dimension we define. Roughly, this captures the equivalence between pseudorandomness defined via indistinguishability and via unpredictability. Later we show some natural properties of our notion of dimension. We also do a comparative study among our proposed notion of dimension and two well known notions of computational analogue of entropy, namely HILL-type pseudo min-entropy and next-bit pseudo Shannon entropy.
Further, we apply our quantification to the following problem. If we know that the dimension of an ensemble of distributions on the set of n-length strings is , can we extract out pseudorandom bits out of the distribution? We show that to construct such extractor, one needs at least bits of pure randomness. However, it is still open to do the same using random bits. We show that deterministic extraction is possible in a special case – analogous to the bit-fixing sources introduced by Chor et al., which we term nonpseudorandom bit-fixing source. We adapt the techniques of Gabizon, Raz and Shaltiel to construct a deterministic pseudorandom extractor for this source.
By the end, we make a little progress towards P vs. BPP problem by showing that existence of optimal stretching function that stretches input bits to produce n output bits such that output distribution has dimension , implies .
Incorporating randomness in a feasible computation is one of the basic primitives in theoretical computer science. Fortunately, any efficient (polynomial time) randomized algorithm does not require pure random bits. What it actually needs is a source that looks random to it and this is where the notion of pseudorandomness [5,35] comes into picture. Since its introduction, pseudorandomness has been fundamental to the domain of cryptography, complexity theory and computational learning theory. Pseudorandomness is mainly a computational approach to study the nature of randomness, and computational indistinguishability [11] played a pivotal role in this. Informally, a distribution is said to be pseudorandom if no efficient algorithm can distinguish it from the uniform distribution. Another way of looking at computational indistinguishability is via the notion of unpredictability of distributions, due to Yao [35]. Informally, a distribution is unpredictable if there is no efficient algorithm that, given a prefix of a string coming from that distribution, can guess the next bit with a significant success probability. This line of research naturally posed the question of constructing algorithms that can generate pseudorandom distributions, known as pseudorandom generators. Till now we know such constructions by assuming the existence of one-way functions. It is well known that constructibility of an optimal pseudorandom generator implies complete derandomization (i.e., ) and exponential hardness assumption on one-way function enables us to do that. However, Nisan and Wigderson [26] showed that the existence of an exponential hard function, which is a much weaker assumption, is also sufficient for this purpose. The assumption was further weakened in [18].
In order to characterize the class of random sources, information theoretic notion of min-entropy is normally used. A computational analogue of entropy was introduced by Yao [35] and was based on compression. Håstad, Impagliazzo, Levin and Luby [13] extended the definition of min-entropy in computational settings while giving the construction of a pseudorandom generator from any one-way function. This HILL-type pseudoentropy basically extends the definition of pseudorandomness syntactically. Relations among above two types of pseudoentropy was further studied in [4]. A more relaxed notion of pseudoentropy, known as next-bit Shannon pseudoentropy, was later introduced by Haitner, Reingold and Vadhan [12] in the context of an efficient construction of a pseudorandom generator from any one-way function. In a follow up work [33], the same notion was alternatively characterized by KL-hardness. So far it is not clear which of the above notions is the most appropriate or whether they are at all suitable to characterize distributions in terms of the degree of pseudorandomness in it.
In this paper, we first propose an alternative measure to quantify the amount of pseudorandomness present in a distribution. This measure is motivated by the ideas of dimension [23] and logarithmic loss unpredictability [14]. Lutz used the betting functions known as gales to characterize the Hausdroff dimension of sets of infinite sequences over a finite alphabet. The definition given by Lutz cannot be carried over directly, because here we consider the distributions over finite length strings instead of sets containing infinite length strings. To overcome this difficulty, we allow “non-uniform” gales and introduce a new probabilistic notion of success of a gale over a distribution. We use this to define two notions of dimension of a distribution – strong one and a weak one. Both the notions were already there for the case of infinite strings [24]. In [14], Hitchcock showed that the definition of dimension given by Lutz is equivalent to logarithmic loss unpredictability. In this paper, we show that this result can be adapted to establish a quantitative equivalence between the notion of logarithmic loss unpredictability of a distribution and our proposed notion of dimension. Roughly, this captures the essence of equivalence between pseudorandomness defined via indistinguishability and via unpredictability [35]. We show some important properties of the notion of dimension of a distribution, which eventually makes this characterization much more powerful and flexible. We also do a comparative study between our notion of dimension and two known notions of pseudoentropy, namely HILL-type pseudo min-entropy and next-bit pseudo Shannon entropy. We show that the class of distributions with high dimension is a strict superset of the class of distributions having high HILL-type pseudo min-entropy. Whereas, there is a much closer relationship between dimension and next-bit pseudo Shannon entropy.
Once we have a quantification of pseudorandomness of a distribution, the next natural question is how to extract the pseudorandom part from a given distribution. The question is similar to the question of constructing randomness extractors which is an efficient algorithm that converts a realistic source to an almost ideal source of randomness. The term randomness extractor was first defined by Nisan and Zuckerman [27]. Unfortunately there is no such deterministic algorithm and to extract out almost all the randomness, extra pure random bits are always required [28,30]. There is a long line of research on construction of extractors towards achieving this bound. For a comprehensive treatment on this topic, we refer the reader to excellent surveys by Nisan and Ta-Shma [25] and Shaltiel [31]. Finally, the desired bound was achieved up to some constant factor in [20].
Coming back to the computational analogue, it is natural to study the same question in the domain of pseudorandomness. Given a distribution with dimension s, the problem is to output many bits that are pseudorandom. A simple argument can show that deterministic pseudorandom extraction is not possible, but it is not at all clear that how many pure random bits are necessary to serve the purpose. In this paper, we show that we need to actually involve random bits to extract out all the pseudorandomness present in a distribution. However explicit construction of one such extractor with random bits is not known. If it is known that the given distribution has high HILL-type pseudo min-entropy, then any randomness extractor will work [4]. Instead of HILL-type pseudoentropy, even if we have Yao-type pseudo min-entropy, then also some special kind of randomness extractor (namely with a “reconstruction procedure”) could serve our purpose [4]. Unfortunately both of these notions of pseudoentropy can be very small for a distribution with very high dimension. Actually the same counterexample will work for both cases. So it is interesting to come up with an pseudorandom extractor for a class of distributions having high dimension.
As a first step towards this goal, we consider a special kind of source which we call the nonpseudorandom bit-fixing source. It is similar to the well studied notion of bit-fixing random source introduced by Chor et al. [6], for which we know the construction of a deterministic randomness extractor due to [19] and [9]. In this paper, we show that the same construction yields a deterministic pseudorandom extractor for all nonpseudorandom bit-fixing sources having polynomial-size support.
In the concluding section, we make a little progress towards the question of P vs. by showing that in order to prove , it is sufficient to construct an algorithm that stretches pure random bits to n bits such that the output distribution has a non-zero weak dimension (not necessarily pseudorandom). The idea is that using such stretching algorithm, we easily construct a hard function, which eventually gives us the most desired optimal pseudorandom generator.
In this paper, we consider the binary alphabet . We denote as , where E is an event and x is drawn randomly according to the distribution D. We use to denote the uniform distribution on . Given a string , denote the ith bit of x and denotes the first i bits of x. Now suppose and , then by , we denote the string .
Quantification of pseudorandomness
In this section, we propose a quantification of pseudorandomness present in a distribution. We adapt the notion introduced by Lutz [23] of an s-gale to define a variant notion of success of an s-gale against a distribution D on . Throughout this paper, we will talk about non-uniform definitions. First, we consider the definition of pseudorandomness.
Pseudorandomness
We start by defining the notion of indistinguishability which we will use frequently in this paper.
(Indistinguishability).
A distribution D over is -indistinguishable from another distribution over (for ) if for every circuit C of size at most S,
Now we are ready to introduce the notion of pseudorandomness.
(Pseudorandomness).
For an ensemble of distributions , where the distribution is on and for any ,2
Throughout this paper, we consider so that the circuit can at least read the full input.
,
(via computational indistinguishability) D is said to be -pseudorandom if for all sufficiently large n, is -indistinguishable from ; or equivalently,
(via unpredictability [35]) D is said to be -pseudorandom if for all sufficiently large n,
for all circuits C of size at most and for all .
It is always natural to consider asymptotic definitions with respect to all polynomial size circuits and allowing bias term to be any inverse polynomial. An ensemble of distributions , where a distribution is on , is said to be pseudorandom if for every constant and , D is -pseudorandom [10].
Martingales, s-gales and predictors
Martingales are “fair” betting games which are used extensively in probability theory (see for example, [3]). Lutz introduced a generalized notion, that of an s-gale, to characterize Hausdorff dimension [22] and Athreya et al. used a similar notion to characterize packing dimension [2].
A functionis an s-gale if and only if the functiondefined asis a martingale.
In order to adapt the notion of an s-gale to the study of pseudorandomness, we first relate it to the notion of predictors, which have been extensively used in the literature [33]. Given an initial finite segment of a string, a predictor specifies a probability distribution over Σ for the next symbol in the string.
A function is a predictor if for all , .
Note that the above definition of a predictor is not very different from the type of predictor used in Definition 2.2. If we have a predictor that given a prefix of a string outputs the next bit, then by invoking that predictor independently polynomially many times we can get an estimate on the probability of occurrence of 0 or 1 as the next bit. Using Chernoff bound it can easily be shown that the estimation is correct up to some inverse exponential error. For the detailed equivalence, the reader may refer to [33]. In this paper, we only consider the martingales (or s-gales) and predictors that can be computed using family of non-uniform circuits and from now onwards we refer to them just as martingales (or s-gales) and predictors, and by the size of a martingale (or an s-gale or a predictor), we refer the size of the circuit corresponding to that martingale (or s-gale or predictor).
Conversion between s-gale and predictor
There is an equivalence between an s-gale and a predictor. An early reference to this is [7]. We follow the construction given in [14].
A predictor π induces an s-gale for each and is defined as follows: , for all and ; equivalently for all .
Conversely, an s-gale d with induces a predictor defined as: if , ; otherwise, , for all and .
Hitherto, s-gales have been used to study the dimension of sets of infinite sequences – for an extensive bibliography, see [15] and [16]. Although in this paper, we consider distributions on finite length strings, the conversion procedure between s-gale and predictor will be exactly same as described above.
Defining dimension
For any , an s-gale is said to ε-succeed over a distribution on if
Note that the above definition of win of an s-gale is not arbitrary and reader may refer to the last portion of the proof of Theorem 4.3 to get some intuition behind this definition. The following lemma states the equivalence between the standard definition of pseudorandomness and the definition using martingale.
Consider an ensemble of distributions, where the distributionis on. If D is-pseudorandom then for all large enough n, there is no martingale of size at mostthat-succeeds on. Conversely, if for all large enough n, there is no martingale of size at mostthat-succeeds on, then D is-pseudorandom.
Assume is a martingale which -succeeds on for some , i.e.,
By the Markov Inequality, .
Let be a circuit of size obtained by instantiating d at length n. Now let C be a circuit which outputs 1 if . Then,
Thus D is not -pseudorandom.
Now for the converse direction, assume that D is not -pseudorandom. Then there exists an such that for any there exists an bit position and some circuit C of size at most for which
Now build a martingale using this circuit C as follows. Let . Now, , , and , if .
Now it is clear that
and for all large enough n, the size of the martingale d is at most . □
The next definition gives a complete quantification of distributions in terms of dimension.
(Weak dimension or dimension).
For an ensemble of distributions , where the distribution is on and for any , , the -weak dimension or simply dimension of D is defined as
Informally, if the dimension of an ensemble of distribution is s, we say that it is s-pseudorandom. It is clear from Lemma 2.7 that for any -pseudorandom ensemble of distributions D, . In Section 4 we will see that it is actually an equality. We can also define dimension of distributions in slightly stronger sense.
(Strong dimension).
For an ensemble of distributions , where the distribution is on and for any , , the -strong dimension of D is defined as
It follows from the definition that weak dimension is smaller or equal to strong dimension.
For an ensemble of distributions, where the distributionis onand for any,,andare well defined and
We will show separation between both of these notions of dimensions in Section 4. Now just like the asymptotic definition of pseudorandomness with respect to all polynomial size circuits and inverse polynomial bias, for any ensemble of distributions , we can give definitions of weak dimension or simply dimension (denoted by ) and strong dimension (denoted by ) by allowing to be and to be in the Definition 2.8 and 2.9 respectively.
Unpredictability and dimension
It is customary to measure the performance of a predictor utilizing a loss function [14]. The loss function determines the penalty incurred by a predictor for erring in its prediction. Let the next bit be b and the probability induced by the predictor on it is .
Commonly used loss functions include the absolute loss function, which penalizes the amount ; and the logarithmic loss function, which penalizes . The latter, which appears complicated at first glance, is intimately related to the concepts of Shannon Entropy and dimension. In this section, adapting the result of Hitchcock [14], we establish that there is an equivalence between the notion of dimension that we have defined in the previous section, and the logarithmic loss function defined on a predictor.
The logarithmic loss function on is defined to be .
Using this, we define the running loss that a predictor incurs while it predicts successive bits of a string in , as the sum of the losses that the predictor makes on individual bits.
Let be a predictor.
The cumulative loss of π on , denoted as , is defined by .
The loss rate of π on is .
The ε-loss rate of π over a distribution on is
Intuitively the unpredictability of a distribution is defined as the infimum of the loss rate that any predictor has to incur on the distribution.
(Weak unpredictability or unpredictability).
For an ensemble of distributions , where the distribution is on and for any , , the -weak unpredictability or simply unpredictability of D is
With this, we can prove that dimension can equivalently be defined using unpredictability. The proof is motivated from the proof of the equivalence between logarithmic loss unpredictability and dimension [14].
Consider any. For an ensemble of distributions, where the distributionis onand for any,, ifthen. Conversely,implies.
Assume that is any number such that and then take a number such that and is rational.3
We consider to be rational to ensure that the value of can be computed using constant size circuit. Note that we can always find such due to the fact that the function for is continuous, monotonically increasing and within any two real numbers there exists a rational number.
For some large enough n, suppose d is an -gale of size at most that -succeeds on . Let be defined by
For any with , we have
So . Thus,
Note that implementation of involves division of two at most bits rational numbers4
As we are considering martingales and predictors that can be implemented by circuits of size so the output must be a rational number which can be represented by at most bits.
and thus can be done using a circuit of size at most [34].
Conversely, assume that . Assume that is any number satisfying and then take any number such that and is rational. For some large enough n, let π be a predictor of size at most such that
If is the -gale defined by
then for any with , we have the following,
The last inequality holds for all sufficiently large n. Hence, for infinitely many n,
Moreover, computation of d involves multiplication of n rational numbers of at most bits each and thus can be implemented by a circuit of size [34]. □
Just like dimension, one can also define strong unpredictability in the following way.
(Strong unpredictability).
For an ensemble of distributions , where the distribution is on and for any , , the -strong unpredictability of D is
Now by following the proof of Theorem 3.4 it is easy to see that a similar relation also holds between strong dimension and strong unpredictability.
Consider any. For an ensemble of distributions, where the distributionis onand for any,, ifthen. Conversely,implies.
Analogous to pseudorandomness and dimension, for any ensemble of distributions , one can define weak unpredictability or simply unpredictability (denoted by ) and strong unpredictability (denoted by ) by allowing to be and to be in the Definition 3.3 and 3.5 respectively. Following is a straight forward implication of Theorem 3.4 and Theorem 3.6.
For any ensemble of distributions, where the distributionis on,
Properties of dimension
We now establish a few basic properties of our notion of dimension. We begin by exhibiting existence of an ensemble of distributions with dimension s, for any .
First, we observe that the dimension of any ensemble of distributions D is the infimum of a non-empty subset of for any and hence the dimension of a distribution is well-defined. The following lemma establishes the above claim.
For an ensemble of distributions, where the distributionis onand for any,,.
Let us first take any such that is rational and then consider the following function . Let and , . It is easy to see that d is an s-gale and for any , . Thus for all large enough n, d will -succeed over . Also note that for all large enough n, this function d can be implemented using a circuit of size .5
As for every string w of length n, the value of will be same, so one can hardcode the value inside the non-uniform circuit implementing the function d.
Hence the statement of the lemma follows. □
Above lemma along with Proposition 2.10 implies the following corollary.
For an ensemble of distributions, where the distributionis onand for any,,.
Since it is clear that any ensemble of distributions has a dimension, the following theorem establishes the fact that our definition yields a nontrivial quantification of the set of ensembles of distributions.
Let. Then for any,, there is an ensemble of distributions, where the distributionis on, such that-dimension (also strong dimension) of D is s.
Let us take the ensemble of uniform distributions, i.e., , where is the uniform distribution on . Then by Lemma 2.7 together with Lemma 4.1 shows that for any S and ε, . By using Corollary 4.2 instead of Lemma 4.1, one can also show that .
On the other hand we consider an ensemble where support size of each is one, or in other words, imposes all the probability on a single string, say . Now first take any such that is rational and then consider the following function . Let and , . It is easy to see that d is an s-gale and . Thus for all large enough n, d will -succeed over . Also note that for all large enough n this function d can be implemented using a circuit of size . Hence for any S and ε, -dimension as well as -strong dimension of D is 0.
Otherwise, assume that . Let us take the ensemble of uniform distributions . To each string , we append many zeros, and denote the resulting string as . Let . For strings which do not terminate in a sequence of many zeros, we set .
For any large enough n, let be the predictor for distribution which testifies that the -strong unpredictability of D is at most 1. Define the new predictor by
For every which is in the support of such that , for any , we have that
The last inequality holds for all small enough and this testifies that the -strong unpredictability (hence the -strong dimension) of the ensemble of distributions is at most s. Now by Proposition 2.10, it follows that is also at most s.
Now, assume that -dimension of is less than s. For any such that , for infinitely many n, there exists an -gale d of size at most which -succeeds on . We show that this would imply that is not uniform. Now consider a string , which is in the support of . For any , and thus will imply that . Now consider the martingale (needs not be computed by any circuit) corresponding to the -gale d. According to [22], we have , for any string . Thus,
Note that is same as , which contradicts the fact that by Markov inequality, . This shows that and hence by Proposition 2.10, is also greater than or equal to s. □
Informally the following theorem shows that our notion of dimension is able to capture the fact that if we mix a “good” distribution with a small amount of an extremely “bad” distribution, then also “quality” of the first distribution would not change much.
Let,andbe three ensembles of distributions such that for some, for every,. If for anyand,, then.6
Note that bias term in the dimension of depends on δ.
For the contrary, let us assume that, and assume , for some , . So for infinitely many n, there exists an s-gale of size at most that -succeeds over . Thus,
Let the strings w for which holds be , . So,
where , for all .
Now, as ,
and thus which is a contradiction.
The above argument is valid for all n for which there exists an s-gale of size at most that -succeeds over and hence the same claim holds even if we consider strong dimension instead of dimension. □
If we follow the proof of Theorem 4.4 with martingale instead of s-gale, we get the following weaker version of the above theorem, which we will require in the construction of deterministic extractor for a special kind of sources in Section 6.1.
Let,andbe three ensembles of distributions such that for some, for every,. If for anyand,is-pseudorandom, then D is-pseudorandom.
In subsequent sections, we will see how to extract pseudorandom parts from a convex combination of distributions. For that purpose we need the following lemma which establishes the fact that convex combination of two pseudorandom distributions will be pseudorandom as well.
Consider two ensembles of distributionsand. Suppose for anyand, bothandare-pseudorandom. If forthere exists an ensemble of distributionswhich can be expressed as for all,, then D is also-pseudorandom, where.7
Note that this lemma will be useful only when we consider .
The claim clearly holds when is either 0 or 1, so assume that . For the contrary, let us assume that D is not -pseudorandom where , i.e., by Lemma 2.7, for infinitely many there exists an martingale d of size at most that -succeeds on , i.e.,
As is -pseudorandom, by Lemma 2.7, for all large enough n there exists no martingale of size at most that -succeeds on . Thus it is possible to consider an such that there exists a martingale d of size at most that -succeeds on , but does not -succeed on . Let the strings for which holds be , . So,
where , . Now, since we have that
Thus the following holds
As for infinitely many the above happens, so is not -pseudorandom, which is a contradiction. □
We can also slightly generalize the above lemma and get the following theorem.
Let,andbe three ensembles of distributions such that for some, for every,. Then for anyand,
The claim clearly holds when is either 0 or 1, so assume that . Let , and .
For the contrary, let us assume that, . Now consider , for some , . Then for infinitely many n, there exists an s-gale d of size at most such that
As , so for all large enough n, there exists no s-gale of size at most that -succeeds on . Thus we can consider an n such that there exists an s-gale d of size at most that -succeeds on , but does not -succeed on . Let the strings w for which holds be , . So,
where , for . Also, we have that
Thus
and this happens for infinitely many implying , which is a contradiction. □
The following theorem shows that in order for a distribution to have dimension less than 1, it is not sufficient to have a few positions where we can successfully predict – it is necessary that these positions occur often.
For anyand, there is an ensemble of distributionssuch that, but D is not-pseudorandom.
Let on be defined as follows.
Then D is clearly not -pseudorandom, for any value of and . Consider a predictor defined as follows. For strings w of length i, , set , and , otherwise. Then
However, we will show that . Assume that and thus for some , , for infinitely many , there exists an s-gale d, where , of size at most which -succeeds on . Now consider a string , which is in the support of . Now, and thus will imply that . Next consider the martingale (needs not be computed by any circuit) corresponding to the s-gale d. According to [22], we have , for any string . Thus,
The first inequality holds for all large enough n. Note that is same as , where is drawn according to the distribution . However by Markov inequality, , which is a contradiction and this completes the proof. □
We now give separation between two notions of dimension by providing an example of ensemble which has a very high strong dimension, but very low weak dimension.
There exists an ensemble of distributionssuch that for anyand,.
Here we actually show a much stronger claim by giving an example of an ensemble of distributions such that whereas . This is the largest possible gap between weak and strong dimension of any ensemble of distributions.
Construct an ensemble of distributions as follows: for all the odd value of n, set where is the uniform distribution over and for all even value of n, set to be such that it imposes all the probability on a single string .
Due to Markov inequality, there exists no martingale that -succeeds over distribution for any odd value of n and by Lemma 4.1, strong dimension can be at most 1. Hence, . By the argument used in the proof of Theorem 4.3, we can say that for any such that is rational, for all large enough even value of n, there exists an s-gale of size at most that -succeeds over and hence . □
Pseudoentropy and dimension
In this section we study the relation between our notion of dimension and different variants of computational or pseudo (min/Shannon) entropy. We will use standard notions and notations of information theory (e.g., Shannon entropy, KL divergence) without defining them. Readers can find a few basic notations and propositions of information theory in the following subsection and for more details, we refer the reader to the book by Cover and Thomas [8].
Basics of information theory
(Shannon entropy).
The Shannon entropy of a discrete random variable X is defined as
The joint entropy is defined to be and the conditional entropy is defined to be .
(Chain rule for Shannon entropy).
(KL divergence).
The Kullback–Leibler distance or KL divergence between two distributions P and Q is defined as
(Conditional KL divergence).
For random variables and , the conditional KL divergence from to is defined as
Just like Shannon entropy, in this case also, we have chain rule stated below.
(Chain rule for KL divergence).
High HILL-type pseudo min-entropy implies high dimension
For a distribution D, the min-entropy of D is defined as . We start with the standard definition of computational min-entropy, as given by [13].
(HILL-type pseudo min-entropy).
For any and , an ensemble of distributions has -HILL-type pseudo min-entropy (or simply -pseudo min-entropy) at least , denoted as if there exists an ensemble of distributions such that for all large enough n,
, and
is -indistinguishable from the distribution .
Several other definitions of pseudo min-entropy (metric-type, Yao-type or compression type) are present in the literature. We refer the reader to [4] for a comprehensive treatment on different definitions and the connections between them. In the remaining portion of this subsection, we focus only on HILL-type pseudo min-entropy. Now we state the main result of this subsection.
For every ensemble of distributionsand for any,, if, wherefor some, then.
The theorem is a consequence of the following lemma.
For every ensemble of distributions, if for all large enough n,, wherefor some, thenfor anyand.
Now observe that if for all large enough n, distribution is -indistinguishable from another distribution , then as otherwise the s-gale of size at most which -succeeds over exactly one of them, acts as a distinguishing circuit. This fact along with Lemma 5.8 completes the proof. □
For the sake of contradiction, let us assume that for infinitely many , there exists an s-gale d of size at most that -succeeds over , i.e.,
Now consider the following set
As ,
By taking the corresponding martingale (needs not be computed by any circuit) according to the Proposition 2.4, we have that for any , and as a consequence,
which contradicts the fact that by Markov inequality, . □
One can extend the definition of HILL-type pseudo min-entropy by allowing to be and to be and let us denote this by . We can extend the result stated in Theorem 5.7 as follows.
For any ensemble of distributions, whereis a distribution over, and, if, where, then.
The converse direction of the statement of Theorem 5.7 is also true if the distribution under consideration is pseudorandom. If the converse is true then we can apply any randomness extractor to get pseudorandom distribution from any distribution having high dimension [4]. However, we should always be careful about the circuit size with respect to which we call the output distribution pseudorandom. Unfortunately, in general the converse is not true.
Suppose one-way functions (see Definition 2.2.1 of [10]) exist, then it is well-known that we can construct a pseudorandom generator (see Definition 3.3.1 of [10]) where such that m is any polynomial in l, say . For the construction of pseudorandom generator with polynomial stretch from any one-way function, interested reader may refer to [13,33]. Now consider the ensemble of distributions . For large enough l, using the argument similar to the proof of Theorem 4.3, it can easily be shown that the distribution D has dimension (as well as weak dimension) almost 1 as the distribution on the first m bits are pseudorandom, but pseudo min-entropy is not larger than l.
Equivalence between dimension and next-bit pseudo Shannon entropy
In the last subsection, we have talked about pseudo min-entropy. In similar fashion, one can also define pseudo Shannon entropy and a natural generalization of it is conditional pseudo Shannon entropy [12,17,33].
(Conditional pseudo Shannon entropy).
A random variable jointly distributed with is said to have -conditional pseudo Shannon entropy at least k, for any and if there exists a random variable jointly distributed with such that
, and
and are -indistinguishable.
Now suppose is an ensemble of random variables jointly distributed with another ensemble of distributions . For any and , Y is said to have -conditional pseudo Shannon entropy at least given X if there exists another ensemble of distributions Z jointly distributed with X such that for all sufficiently large n, has -conditional pseudo Shannon entropy at least .
The following is the variant of pseudoentropy that we are looking for in this subsection and was introduced by Haitner et al. [12].
(Next-bit pseudo Shannon entropy).
An ensemble of random variables , where takes values in , has -next-bit pseudo Shannon entropy for any and at least , denoted as if there exists an ensemble of random variables where takes values in and for all sufficiently large n,
, and
for all , and are -indistinguishable.
Later, Vadhan and Zheng [33] provided an alternative characterization of conditional pseudo Shannon entropy by showing an equivalence between it and KL-hardness (defined below). We use this alternative characterization extensively for our purpose.
(KL-hardness).
Suppose is a -valued random variable and π be any predictor. Then π is said to be a δ-KL-predictor of given if where for all and .
Moreover, for any and , an ensemble , where each is a random variable taking value in Σ, is said to be -KL-hard given another ensemble of random variables , where each takes value in , if for all large enough n there is no predictor π of size at most that is a -KL-predictor of given .
The following theorem provides the equivalence among KL-hardness and conditional pseudo Shannon entropy of a distribution.
For an ensemblewhere eachis a-valued random variable and for any,,
If Y is-KL-hard given X, then for all large enough n,has-conditional pseudo Shannon entropy at least, where.
Conversely, if for all large enough n,has-conditional pseudo Shannon entropy at least, then for every, Y is-KL-hard given X, whereand.
Now we are ready to state the main theorem of this subsection which conveys the fact that the distributions with high dimensions also have high next-bit pseudo Shannon entropy.
For an ensemble of distributions, whereis a distribution over, for anyandsuch thatas, if, then, whereand.8
Note that constant in the expression of here is related to that appeared in Item 1 of Theorem 5.13, but not exactly the same.
For the sake of contradiction, let us assume that . Thus there exists a subset of cardinality equals to the cardinality of such that for all , Item 1 and Item 2 of Definition 5.11 do not hold simultaneously. On the other hand and thus by following the proof of Theorem 3.4, we can say that there exists a such that . Now consider some large enough n belongs to S and break into 1-bit blocks, i.e., .
For any predictor , let us define such that for . Then,
where the last equality follows from the chain rule of Shannon entropy (Proposition 5.2).
Now the definitions of next-bit pseudo Shannon entropy, conditional pseudo Shannon entropy and KL-predictor along with Item 1 of Theorem 5.13 imply that for any , there exists a predictor π of size at most such that
Hence for any ,
Thus,
As , by the definition, it implies that for all but finitely many n belong to the set S,
For all large enough n, the above inequality gives us , for any because as . Hence which is a contradiction and this completes the proof. □
The following weak converse can easily be proven.
For an ensemble of distributions, whereis a distribution over, for anyand, if, then for any, where.9
Note that constant of the term here is related to that appeared in Item 2 of Theorem 5.13, but not exactly the same.
Suppose an ensemble of distributions D is given such that . Now take any large enough n and break into 1-bit blocks, i.e., . Let us denote and thus by Theorem 3.4, there exists a constant such that . This implies that for infinitely many n, there exists a predictor π of size at most such that for any ,
Thus we can write the following,
We have already seen that for any predictor ,
Now the definitions of next-bit pseudo Shannon entropy, conditional pseudo Shannon entropy and KL-predictor along with Item 2 of Theorem 5.13 imply that for all large enough n, for every ,
Hence,
for any , for infinitely many large enough n and now setting to be such that equals to the value of mentioned in Item 2 of Theorem 5.13 concludes the proof. □
It is natural to allow to be and to be and then consider the definitions of conditional pseudo Shannon entropy, next-bit pseudo Shannon entropy (denoted as ) and KL-hardness. For details we refer the reader to [33]. Now we use the equivalence between conditional pseudo Shannon entropy and KL-hardness from [33] to derive the following corollary of Theorem 5.14.
For any ensemble of distributions, whereis a distribution over, and for any, if, thenfor.
In a similar fashion, we get the following as a corollary of Theorem 5.15.
For any,and, for any ensemble of distributions, whereis a distribution over, if, then.
Pseudorandom extractors and lower bound
We now introduce the notion of pseudorandom extractor similar to the notion of randomness extractor. Intuitively, a randomness extractor is a function that outputs almost random (statistically close to uniform) bits from weakly random sources, which need not be close to the uniformly random source. Two distributions X and Y on a set Λ are said to be ε-close (statistically close) if or equivalently .
(Deterministic randomness extractor).
For any , a family of functions where is said to be a deterministic ε-extractor for a class of ensemble of distributions if for every ensemble of distributions in , where is on , for all large enough n, the distribution is -close to .
Likewise, a seeded ε-extractor is defined and the only difference is that now it takes a -bit string chosen according to , as an extra input. Before going further, we mention that for ease of presentation, now onwards we will only talk about the definitions and results derived so far related to pseudorandomness and dimension by considering all polynomial size circuits and inverse polynomial bias. We now define the notion of a pseudorandom extractor, the purpose of which is to extract out pseudorandom distribution from a given distribution.
(Pseudorandom extractor).
A family of functions where is said to be a deterministic pseudorandom extractor for a class of ensemble of distributions if for every ensemble of distributions in , where is on , is pseudorandom.
A family of functions where is said to be a seeded pseudorandom extractor for a class of ensemble of distributions if for every ensemble of distributions in , is pseudorandom.
In this section, we will concentrate on the class of distributions having dimension at least s. It is clear from the results stated in Section 5.2 that this class of distribution is a strict superset of the class of distributions with HILL-type pseudo min-entropy at least , for which any randomness extractor will act as a pseudorandom extractor [4]. Thus it is natural to ask the following.
For any , does there exist a deterministic/seeded pseudorandom extractor for the class of ensemble of distributions having dimension at least s?
Just like the case of randomness extraction, one can easily argue that deterministic pseudorandom extraction is not possible.10
Suppose with is a deterministic pseudorandom extractor, then for all n, there exists such that . Thus E is not a pseudorandom extractor for a source D that is a uniform distribution on for all n and by Lemma 5.8, for any .
The next natural question is what the lower bound on the seed length will be. We answer this question in the following theorem.
Suppose for any,wherebe a seeded pseudorandom extractor for the class of ensemble of distributions having dimension at least s and for some,. Then.
For the sake of contradiction, let us assume that . Now by doing a walk according to the output distribution on an odd-length cycle, we achieve the following claim.
There is a deterministic -extractor where for all pseudorandom ensemble of distributions.
This claim follows from the stronger result stated in Theorem 6.8. Now construct the following function for some constant such that for all . The function is a seeded -extractor with , but it is well known due to [30, Theorem 1.9] that any such randomness extractor must satisfy and hence we get a contradiction. □
However, the question on constructing an explicit or polynomial time computable seeded pseudorandom extractor with seed length is still open and next, we formally pose this question.
For any , can one construct a seeded pseudorandom extractor where in polynomial time, for the class of ensemble of distributions having dimension at least s such that for some and ?
Note that it is important to consider dimension in the statement of Questions 6.3 and 6.5, because if we consider strong dimension instead of dimension then sometimes it might be just impossible to extract out pseudorandom distributions. For example one can consider the ensemble of distributions mentioned in the proof of Theorem 4.9 where however strong dimension is 1, as for infinitely many n, support of contains just a single string, thus one cannot hope to get any pseudorandom distribution out of it. In the next part of this section, we will see a special type of nonpseudorandom source and give an explicit construction of deterministic pseudorandom extractor for that particular type of source. Before proceeding further, we want to mention that it is also very interesting to consider nonpseudorandom distributions samplable by poly-size circuits and we will discuss on the existence of extractor for that particular source in Section 6.2.
Deterministic pseudorandom extractor for nonpseudorandom bit-fixing sources
In Section 4 while proving Theorem 4.3, we have introduced a special type of nonpseudorandom distribution which looks similar to the -bit-fixing source defined as a distribution X over such that there exists a subset where all the bits at the indices of I are independent and uniformly chosen and rest of the bits are completely fixed. This distribution was introduced by Chor et al. [6]. Now we define an analogous notion for the class of nonpseudorandom distributions, which we term nonpseudorandom bit-fixing sources.
(Nonpseudorandom bit-fixing source).
Let . An ensemble of distributions , where is an distribution over , with dimension s is an -nonpseudorandom bit-fixing source if for all n there exists a subset such that all the bits at the indices of I come from a pseudorandom distribution and rest of the bits are fixed.
We devote the rest of the section to achieve an affirmative answer to the question of constructing deterministic pseudorandom extractor for the nonpseudorandom bit-fixing sources. For this purpose, we show that a careful analysis of the technique used in the construction of the deterministic randomness extractor for bit-fixing random sources by Gabizon, Raz and Shaltiel [9] will lead us to the desired deterministic pseudorandom extractor.
For any, there is an explicit deterministic pseudorandom extractor, for all-nonpseudorandom bit-fixing sources having polynomial-size support where.
We first extract amount of almost random bits and then use the same as seed in the seeded extractor. To use the seeded extractor, we modify the source such that it becomes independent of the random bits extracted. Before going into the exact details of the proof, we first discuss the ingredients required in the proof of the theorem.
Pseudorandom walk and extracting a few random bits
Kamp and Zuckerman [19] use a technique of random walk on odd-length cycles to extract almost random bits from a bit-fixing source. We adapt this to extract almost random bits from a -nonpseudorandom bit-fixing source.
Let,. Then there is a deterministic-extractorfor all-nonpseudorandom bit-fixing sources.
We prove the above theorem using the property of pseudorandom walk together with the fact that the second largest eigenvalue of a l length odd cycle is . Note that a corollary of the above theorem is the claim used in the proof of Theorem 6.4. Before proving the above theorem, we state two lemmas required for the proof. The first is a very special case of a lemma given in [19] which was restated in [9].
Let,and. Suppose G is an odd length cycle having M vertices and having second largest eigenvalue λ. If we take a walk on G according to the bits from a-bit-fixing source, starting from any fixed vertex, then at the end of the n step of the walk, the distribution P on the vertices will be-close to.
Now we prove a similar result for -nonpseudorandom bit-fixing source using the property of pseudorandom walk. The idea of pseudorandom walk was also used previously in the domain of space bounded computation by Reingold et al. [29].
Letand. Let G be an odd length cycle having M vertices and having second largest eigenvalue λ. If we take a walk on G according to the bits from a-nonpseudorandom bit-fixing source starting from any fixed vertex, then for all large enough n, at the end of the n step of the walk, the distribution Q on the vertices will be-close to, where M is polynomial in n andfor any constant.
Let π be the stationary distribution on the vertices and since we consider an odd length cycle (a 2-regular graph), the stationary distribution is the uniform distribution on M vertices. Suppose we take a n step walk on the graph G starting from any vertex v according to the bits from a -bit-fixing source, where and the probability vector on the vertices at the end of the walk is . Now we take a n step walk on the graph G starting from the same vertex v according to the bits from a -nonpseudorandom bit-fixing source and the probability vector on the vertices at the end of the walk is , where and for any constant . This bound on can be justified as follows.
Note that the only difference between -nonpseudorandom bit-fixing source and -bit-fixing source is that on the set I, in -bit-fixing source, we have instead of pseudorandom distribution (say D) on . Also observe that actually P and Q are the distributions on vertices at the end of a k step walk, where the walk was started from the vertex v and done according to the bits coming from and D respectively, because a step according to a fixed bit will not change the output distribution and in a -bit-fixing source (also in a -nonpseudorandom bit-fixing source), all the bits are fixed. For a step according to a fixed bit gives rise to a transition matrix that is actually a permutation matrix and thus it leaves the distance from uniform unchanged. Hence, if the bound on is not true then we can use this k step walk on G as a polynomial (polynomial in k) time algorithm to distinguish between and D. Thus,
□
The above lemma together with the fact that the second largest eigenvalue of a l length odd cycle is , implies Theorem 6.8.
Let us take an odd cycle G with vertices. The second largest eigenvalue of G is . Now take a walk starting from a fixed vertex of G according to the bits from -nonpseudorandom bit-fixing source and finally output the vertex number of the graph G. Thus we get bits. From Lemma 6.10, we reach distance from uniform.
By the Taylor expansion of the cosine function, for , .
Therefore, . Hence, . Thus we get distribution of bit strings which is -close to uniform in statistical distance. □
Sampling and partitioning with a short seed
Here we restate some of the results on sampling and partitioning used in construction of deterministic extractor for bit-fixing sources from [9]. Let be some subset of size k. Now we consider a process of generating a subset such that and this process is known as Sampling.
A function is called a -sampler if for any subset , where , .
Now consider a similar process known as Partitioning, the task of which is to partition into m distinct subsets such that for every , .
According to [9], the above two processes can be performed using only a few random bits.
([9]).
For any constant, there exist constants,andsuch that for anyand, there is an explicit construction of a functionwhich is a-sampler, where.
For any constant, there exist constantsandsuch that for anyand, there is an explicit construction that uses onlyrandom bits and partitionintomany subsetssuch that for any subset, where,.
Generating an independent seed
In this subsection, we see the way of obtaining a short seed from a nonpseudorandom bit-fixing source so that we can use them in a seeded pseudorandom extractor to extract out almost all the pseudorandom part from the source. The main problem of using this short seed in a seeded pseudorandom extractor is that the already obtained seed is dependent on the main distribution. Now we describe that this problem can be removed in the case of nonpseudorandom bit-fixing sources. Even though the result is analogous to [9], the proofs differ in essential details.
(Seed obtainer).
A family of functions is said to be a -seed obtainer () if for every -nonpseudorandom bit-fixing source , the distribution can be written as 11
It means for all n, .
for , and such that and for every a, there exists a -nonpseudorandom bit-fixing source such that for all large enough n, is -close to .
In the above definition, by , we mean the product of two distributions and . From the above definition it is clear that given a seed obtainer and a seeded pseudorandom extractor for nonpseudorandom bit-fixing sources, we can easily construct a deterministic pseudorandom extractor. The following theorem provides us the details of such construction, where the correctness follows from the properties of our proposed notion of dimension described in Section 4.
Supposeis a-seed obtainer, wherefor some constantandis a seeded pseudorandom extractor for-nonpseudorandom bit-fixing sources, where. Then the functiondefined asfor all, is a deterministic pseudorandom extractor for-nonpseudorandom bit-fixing sources.
By the definition of the seed obtainer, we can write , for some ensemble of distributions Y. Now by Lemma 4.5, for all a, is pseudorandom and as a consequence, by Lemma 4.6, Y is pseudorandom as well. Then using Lemma 4.5, we can argue that is also pseudorandom as , for some constant . □
Now we give an explicit construction of -seed obtainer, which is crucial in the later part of this paper. To understand the notion of sampler used in the following theorem, the readers may refer to the last subsection.
Letwherebe a-sampler andwithbe a deterministic ε-extractor for-nonpseudorandom bit-fixing sources, wherefor any constant. Then there is an explicit-seed obtainer, where,, and.
The construction of the seed obtainer is the same as that mentioned in [9], however the proof requires a slightly different argument.
The construction of F mentioned in the theorem is as follows:
Given , compute . Denote the first bits of by and the last bits by ;
Compute and denote it as T;
Let and . If , pad it with zeros to get n-bit long string. Now output .
Note that the above construction is the same as the construction of seed obtainer given in [9]. However, the proof is not the same and more specifically the proof of the next claim differs from that of the similar claim made in [9]. Here, in the proof we use the properties of pseudorandomness and the fact that the distribution under consideration has polynomial-size support.
Let X be a -nonpseudorandom bit-fixing source and now consider any large enough n and let I be the set of indices at which the bits are not fixed. For a string , denotes and denotes . Given a string , denotes and denotes n-bit string obtained by padding with zeros. Let and . We say that a string correctly splits if .
For every which correctly splits , is -close to , where for any constant .
Let . Given a string and a string , we define as follows:
Suppose l indices of are and the indices of are . The string is defined as:
In this notation, we denote . Now consider the distribution . For every , we consider the event . As a correctly splits , there are at least “good” indices in . Now fix some such that .
Now we claim that for all subsets where , there exists a such that the distribution forms an -nonpseudorandom bit-fixing source if for some constant , and
For the sake of contradiction, let us assume that the above claim is not true. It means that there exists a subset , where
, ,
where , for some constant , and also
for all , the distributions are not forming -nonpseudorandom bit-fixing sources.
Now let us consider only the “good” positions which are many in X and at least many in . So the above assumption implies that the ensemble of distributions formed by considering those bits (this part of the string b is denoted as ) in is not pseudorandom, i.e., it has its corresponding distinguishing circuits . If this is the case, then the circuit C (by hard-wiring the good random bits) corresponding to the following algorithm A, will act as a distinguishing circuit for the pseudorandom distribution P on many bits; which is a contradiction. The algorithm A is as follows: on input , if for any , then return ; otherwise return 0 or 1 uniformly. And thus clearly,
Circuit C is nothing but the combination of all the circuits , for , each of which is of polynomial size. Now as , and by our assumption that the distribution under consideration has polynomial-size support (see statement of Theorem 6.7), the support of the subset J is at most polynomial. Hence the circuit C is of polynomial size. Note that this is the only place where we use the fact that the distribution under consideration is of polynomial-size support.
So, we can write,
where for any constant . The first inequality follows from the fact that we can split the sum in two parts one in which ’s are not -nonpseudorandom bit-fixing sources and another in which ’s are at least -nonpseudorandom bit-fixing sources. □
Next we mention a claim from [9] that makes comment on independence of the pair conditioned on the event .
([9]).
For every fixed that correctly splits , the distribution is -close to .
Note that as a correctly splits , so forms a -nonpseudorandom bit-fixing source.
The rest of the proof follows directly from the proof of correctness of the construction of seed obtainer given in [9] with the following parameters , . □
A seeded pseudorandom extractor
In this subsection, we discuss about how we can extract many pseudorandom bits using random bits. In the next subsection, we will use this seeded pseudorandom extractor and the techniques discussed in the previous subsections, to construct deterministic extractor. The construction of seeded pseudorandom generator given in the proof of the following theorem is same as that of the seeded randomness extractor given in [9]. However, the analysis is quite different and uses some of the properties of dimension.
For anand any constant, there exist constants,such that there is an explicit functionwhich acts as a seeded pseudorandom extractor for-nonpseudorandom bit-fixing sources withand.
Let X be a -nonpseudorandom bit-fixing source and for some large enough n, x be a string sampled by . The description of the extractor is as follows:
According to Lemma 6.13 provided in Section 6.1.2, using y as seed, we obtain a partition of into many sets with the parameter α;
For , compute ;
Output .
Let be the set of indices at which the bits are not fixed and let be the distribution of the output strings. We need to show that is pseudorandom.
Let be the event and be the complement event. According to Lemma 6.13, . Now we can write the output distribution as
and hence due to Lemma 4.5, Z is pseudorandom. □
Deterministic pseudorandom extractor
Now it only remains to combine all the components we discussed so far to build the final deterministic pseudorandom extractor mentioned in Theorem 6.7. We first extract amount of almost random bits by Theorem 6.8 and then use the same as seed in the seeded extractor described in Theorem 6.17. To use the seeded extractor it is required to modify the source such that it becomes independent of the random bits extracted using Theorem 6.8. For that purpose, we use the technique developed in Section 6.1.3 and this concludes the proof of Theorem 6.7.
Due to Lemma 6.12, we have a -sampler , where and . From Theorem 6.8, we have a deterministic -extractor for -nonpseudorandom bit-fixing sources where for all large enough n, and . Now we use Theorem 6.16 to get -seed obtainer where for all large enough n, and , for some constant p. According to Theorem 6.17, we have a seeded pseudorandom extractor with and for -nonpseudorandom bit-fixing sources. Since , we use F and in Theorem 6.15 to construct a deterministic pseudorandom extractor . For a large enough n, and this completes the proof. □
Discussion on pseudorandom extractor for nonpseudorandom samplable distributions
Another interesting special kind of source is samplable distributions studied by Trevisan and Vadhan [32]. In a natural way, one can extend the definition of samplable distribution to nonpseudorandom distribution as follows: for any , an ensemble of distributions is said to be s-nonpseudorandom samplable by circuit of size if for all large enough n, there exists a circuit C of size at most that samples from and . Observe that the negative results for deterministic randomness extractor in case of samplable distributions will also applicable for deterministic pseudorandom extractor in case of s-nonpseudorandom samplable distribution. By Lemma 5.8, if for all large enough n, then for any . Now by applying the argument in [32], we get the following.
Supposeis a family of functions computable in timesuch that E is a deterministic pseudorandom extractor for ensemble of distributions that are s-nonpseudorandom samplable by circuit of sizefor any. Then there is a language inof circuit complexity at least.
The existence of deterministic pseudorandom extractors implies separations between deterministic complexity classes and non-uniform circuit classes that are not yet known. So one might have to consider some complexity theoretic assumptions like in [32] to construct deterministic pseudorandom extractor. However, we do not think construction with such strong assumption like in [32] will be interesting in this case as it is known that certain hardness assumption already leads to a construction of optimal pseudorandom generator (see Section 7). Nevertheless, it is natural to ask the question of constructing explicit extractor using amount of extra randomness. We do not know any such result so far, but in the next section we will see that if some distribution is samplable using very few () random bits, then it is possible to extract out all the pseudorandom bits using extra random bits.
Approaching towards
We now show that if there is an exponential time computable algorithm with where the output distribution has dimension s (), then this will imply . We refer to this algorithm G as optimal nonpseudorandom generator. The proof of this is similar to the proof of Theorem 7.4 [26]. We start with some basic definitions.
A pseudorandom generator against a class of circuits is a function which takes a random seed as input and outputs a sequence of bits which is a pseudorandom distribution.
(Pseudorandom generators).
A function G is said to be a -pseudorandom generator if
with .
is computable in time.
For sufficiently large n, is -pseudorandom.
(Optimal pseudorandom generators).
A function G is said to be an optimal pseudorandom generator if it is an -pseudorandom generator.
Nisan and Wigderson [26] showed that there is a connection between pseudorandom generators and hard functions in :
(Hard function).
A function where is -hard for any and , if for all large enough n, for all circuits C of size at most ,
The following theorem shows that under the assumption of existence of hard function in , optimal pseudorandom generator exists [26].
There exists an optimal pseudorandom generators if and only if there is a language L inandsuch that L is-hard.
The proof of the above theorem is constructive and thus we can explicitly convert optimal pseudorandom generators to the hard function and conversely. However this is still a very strong requirement and later Impagliazzo and Wigderson weakened it.
Suppose there is a language L inandsuch that L on inputs of length n cannot be solved by circuits of size at most. Then there exists a languageinandsuch thatis-hard and as a consequence optimal pseudorandom generator exists.
Now let us state and prove the main result of this section.
Consider anyand. If there exists an algorithmwherecomputable insuch that, then.
Suppose . If , then for all large enough n, there must be a subset of indices such that and for any , loss incurred by any polynomial-size predictor at ith bit position is non-zero or in other words, for any polynomial-size circuit family , . Actually one can show a much stronger claim that there exists such a subset S such that , for some constant . Otherwise . To prove this, suppose and thus there exists a predictor π such that for every ,
Now as , so for all large enough n, for any , . Hence and by Corollary 3.7, as well.
Suppose S contains first many such indices. Also assume that and . Now we define two languages and as follows: for ,
where denotes the string generated by concatenating j and y. First of all, note that as , none of and is an empty set. Now clearly either or is the language that satisfies all the conditions of Theorem 7.5 [18]. Otherwise, there exists a predictor circuit of size at most , for some , i.e., polynomial in n, by which we can predict bit position or loss incurred by that predictor at bit position will be zero implying which is a contradiction. Thus either or can be used to construct an optimal pseudorandom generator and that eventually implies . □
Footnotes
Acknowledgements
The authors thank Somenath Biswas for helpful discussions and comments. The second author thanks Andrej Bogdanov for suggesting the study of the notion of pseudoentropy. The authors also thank the anonymous reviewers for their helpful comments and suggestions. The second and third author would like to acknowledge the support of Research-I Foundation and the third author would also like to acknowledge the support of the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n. 616787.
References
1.
M.Agrawal, D.Chakraborty, D.Das and S.Nandakumar, Dimension, pseudorandomness and extraction of pseudorandomness, in: 35th IARCS Annual Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, Bangalore, India, 16–18 December 2015, 2015, pp. 221–235.
2.
K.B.Athreya, J.M.Hitchcock, J.H.Lutz and E.Mayordomo, Effective strong dimension, algorithmic information, and computational complexity, SIAM Journal on Computing37 (2007), 671–705. doi:10.1137/S0097539703446912.
3.
K.B.Athreya and S.N.Lahiri, Measure Theory and Probability Theory, Springer Verlag, 2006.
4.
B.Barak, R.Shaltiel and A.Wigderson, Computational analogues of entropy, in: RANDOM-APPROX, S.Arora, K.Jansen, J.D.P.Rolim and A.Sahai, eds, Lecture Notes in Computer Science, Vol. 2764, Springer, 2003, pp. 200–215.
5.
M.Blum and S.Micali, How to generate cryptographically strong sequences of pseudo-random bits, SIAM J. Comput.13(4) (1984), 850–864. doi:10.1137/0213053.
6.
B.Chor, O.Goldreich, J.Håstad, J.Freidmann, S.Rudich and R.Smolensky, The bit extraction problem or t-resilient functions, 1985.
7.
T.Cover, Universal gambling schemes and the complexity measures of Kolmogorov and Chaitin, Technical Report 12, Stanford University Department of Statistics, 1974.
8.
T.M.Cover and J.A.Thomas, Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing), Wiley-Interscience, 2006.
9.
A.Gabizon, R.Raz and R.Shaltiel, Deterministic extractors for bit-fixing sources by obtaining an independent seed, in: Proceedings of the 45th Symposium on Foundations of Computer Science (FOCS 2004), Rome, Italy, 17–19 October 2004, 2004, pp. 394–403. doi:10.1109/FOCS.2004.21.
10.
O.Goldreich, The Foundations of Cryptography – Volume 1, Basic Techniques, Cambridge University Press, 2001.
11.
S.Goldwasser and S.Micali, Probabilistic encryption, J. Comput. Syst. Sci.28(2) (1984), 270–299. doi:10.1016/0022-0000(84)90070-9.
12.
I.Haitner, O.Reingold and S.P.Vadhan, Efficiency improvements in constructing pseudorandom generators from one-way functions, in: Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5–8 June 2010, 2010, pp. 437–446.
13.
J.Håstad, R.Impagliazzo, L.A.Levin and M.Luby, A pseudorandom generator from any one-way function, SIAM Journal on Computing28(4) (1999), 1364–1396. doi:10.1137/S0097539793244708.
14.
J.M.Hitchcock, Fractal dimension and logarithmic loss unpredictability, Theoretical Computer Science304(1–3) (2003), 431–441. doi:10.1016/S0304-3975(03)00138-5.
15.
J.M.Hitchcock, Effective Fractal Dimension Bibliography, 2011, http://www.cs.uwyo.edu/~jhitchco/bib/dim.shtml (current April, 2011).
16.
J.M.Hitchcock, Resource Bounded Measure – Bibliography, http://www.cs.uwyo.edu/~jhitchco/bib/rmb.shtml (current April, 2011).
17.
C.-Y.Hsiao, C.-J.Lu and L.Reyzin, Conditional computational entropy, or toward separating pseudoentropy from compressibility, in: Proceedings of the Advances in Cryptology – EUROCRYPT 2007, 26th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Barcelona, Spain, 20–24 May 2007, 2007, pp. 169–186.
18.
R.Impagliazzo and A.Wigderson, P = BPP unless E has sub-exponential circuits: Derandomizing the xor lemma (preliminary version), in: Proceedings of the 29th STOC, ACM Press, 1996, pp. 220–229.
19.
J.Kamp and D.Zuckerman, Deterministic extractors for bit-fixing sources and exposure-resilient cryptography, in: Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 2003, pp. 92–101.
20.
C.-J.Lu, O.Reingold, S.P.Vadhan and A.Wigderson, Extractors: Optimal up to constant factors, in: STOC, L.L.Larmore and M.X.Goemans, eds, ACM, 2003, pp. 602–611.
21.
J.H.Lutz, Gales and the constructive dimension of individual sequences, in: Proceedings of the 27th International Colloquium on Automata, Languages, and Programming, 2000, pp. 902–913, Revised as [22]. doi:10.1007/3-540-45022-X_76.
22.
J.H.Lutz, The dimensions of individual strings and sequences, Information and Computation187 (2003), 49–79, Preliminary version appeared as [21]. doi:10.1016/S0890-5401(03)00187-1.
23.
J.H.Lutz, Dimension in complexity classes, SIAM J. Comput.32(5) (2003), 1236–1259. doi:10.1137/S0097539701417723.
24.
J.H.Lutz, A divergence formula for randomness and dimension, Theor. Comput. Sci.412(1–2) (2011), 166–177. doi:10.1016/j.tcs.2010.09.005.
25.
N.Nisan and A.Ta-Shma, Extracting randomness: A survey and new constructions, J. Comput. Syst. Sci.58(1) (1999), 148–173. doi:10.1006/jcss.1997.1546.
26.
N.Nisan and A.Wigderson, Hardness vs randomness, J. Comput. Syst. Sci.49(2) (1994), 149–167. doi:10.1016/S0022-0000(05)80043-1.
27.
N.Nisan and D.Zuckerman, More deterministic simulation in logspace, in: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, San Diego, CA, USA, 16–18 May 1993, 1993, pp. 235–244.
28.
N.Nisan and D.Zuckerman, Randomness is linear in space, Journal of Computer and System Sciences52 (1996), 43–52. doi:10.1006/jcss.1996.0004.
29.
Omer Reingold, L.Trevisan and S.Vadhan, Pseudorandom walks on regular digraphs and the rl vs. l problem, in: Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC‘06), 2006, pp. 457–466.
30.
J.Radhakrishnan and A.Ta-Shma, Bounds for dispersers, extractors, and depth-two superconcentrators, SIAM Journal on Discrete Mathematics13 (2000), 2000. doi:10.1137/S0895480197329508.
31.
R.Shaltiel, Recent developments in explicit constructions of extractors, Bulletin of the EATCS77 (2002), 67–95.
32.
L.Trevisan and S.P.Vadhan, Extracting randomness from samplable distributions, in: FOCS, IEEE Computer Society, 2000, pp. 32–42.
33.
S.P.Vadhan and C.J.Zheng, Characterizing pseudoentropy and simplifying pseudorandom generator constructions, in: Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, 19–22 May 2012, 2012, pp. 817–836.
34.
H.Vollmer, Introduction to Circuit Complexity – A Uniform Approach, Texts in Theoretical Computer Science. An EATCS Series, Springer, 1999.
35.
A.C.Yao, Theory and application of trapdoor functions, in: Proceedings of the 23rd Annual Symposium on Foundations of Computer Science, SFCS’82, IEEE Computer Society, Washington, DC, USA, 1982, pp. 80–91.