I created a problem for the National Olympiad in Informatics (NOI) 2015 which I liked a lot. The problem title is “radioactive”. It hasn’t been uploaded on the official site yet, but I’ll paste the link here when it has. Unfortunately, no one solved it fully which is quite a waste.

In short, the problem is very simple. Given N < 1018 and C < 200, count the number of binary strings of length N such that no more than C ‘1’s appear in a row. It is a very nice problem, so I would recommend giving it a few days or weeks of thought before reading the solution, which I will detail here.

The first thing we note is that for C = 1, the problem reduces to finding Fibonacci numbers, which is a very classic problem. The Nth Fibonacci number can be found in logarithmic time through fast matrix exponentiation, which is a divide and conquer algorithm. We would expect that the solution to the general problem might be rather similar. Fortunately, an understanding of “advanced” mathematical ideas like linear algebra is not required to solve this problem.

The most obvious way to divide the problem into two is to cut the binary string into two smaller strings of length $\frac{N}{2}$. If we count the number of binary strings of length $\frac{N}{2}$ that satisfy the condition that no C ‘1’s appear together – call it the C-condition – perhaps we could join them together to make a binary string of length N that satisfies the C-condition. Unfortunately, this does not always work, as the below example for C = 2 shows:

1011 + 1101 = 10111101

Clearly, while the first two strings satisfy the C-condition, the combined string does not. We note that if we split a binary string into substrings of repeated ‘1’s and ‘0’s, only the first and last substrings of repeated ‘1’s contribute to the above problem. All other substrings in between do not matter.

This irrelevance of information suggests some form of dynamic programming. In particular, we only need to store, for each N, the number of strings of length N that satisfy the C-condition such that the first substring of repeated ‘1’s is of length A and the last substring of repeated ‘1’s is of length B for each A and B. Of course, if a binary string begins with a ‘0’, then A = 0, and the same holds for B.

If we additionally note that A and B will always be less than C, we see that we will need C2 log N states in total. If we then note that for the same N, all combinations of A and B with the same A + B have the same solution, we note that we only need to store X = A + B for up to 2C different values. This gives us $\mathcal{O}(C \log{N})$ states.

Let the value of A and B for the left string be A1 and B1 and that of the right string be A2 and B2. As an example, the following is a possible way to fuse two strings of length N of A1 = 1, B1 = 2, A2 = 1 and B2 = 1 for C = 3 (so X1 = 3, X2 = 2):

1011 + 1001 = 1011 1001

The allowed combinations of A2 and B1, then, is all A2, B1 such that A2 + B1 ≤ C. We can iterate through all such combinations for each state, taking C2 time, resulting in a complexity of $\mathcal{O}(C^2 \log{N})$ time in total.

Now, let’s take a look at an example to illustrate how we could speed this up even further. Let the number of strings of length N that satisfy the C condition with A + B = X be dp[N][X]. Say C = 3. The following is, roughly, the expression for X = 0, 1, 2, 3 (assuming N is very large):

dp[N][0] = dp[N/2][0] * (dp[N/2][0] + dp[N/2][1] + dp[N/2][2] + dp[N/2][3])

+ dp[N/2][1] * (dp[N/2][0] + dp[N/2][1] + dp[N/2][2])

+ dp[N/2][2] * (dp[N/2][0] + dp[N/2][1])

+ dp[N/2][3] * (dp[N/2][0])

If we were to compute every combination of A2 and B1, it is clear that we would take C2 time. However, by grouping the terms in the above manner, we note that there is a lot of repetition in the terms in the brackets. We can thus use a cumulative sum method to first compute each of the bracketed expressions in linear time, then compute the C multiplications and add them up again in linear time. Thus we have achieved $\mathcal{O}(C)$ time for $\mathcal{O}(C \log{N})$ states, giving us the desired $\mathcal{O}(C^2 \log{N})$ total running time.

This completes the solution. Interestingly, this solution can be motivated by a more mathematical perspective in which we express the dp[N] subarray as a vector and try to derive a recurrence to obtain the subarray for larger N. If we adapt directly from the Fibonacci numbers solution and find a matrix that brings dp[N] to dp[N + 1], the sub-cubic time complexity of matrix multiplication results in a running time of slightly less than $\mathcal{O}(C^3 \log{N})$ time. If instead we found a matrix that brings dp[N] to dp[2N], since multiplying a matrix by a vector takes quadratic time, we can achieve a running time of $\mathcal{O}(C^2 \log{N})$ instead. This array can easily be constructed from the above discussion.

More interestingly, there is actually a way to do this problem in $\mathcal{O}(C \log{C} \log{N})$ time. If we express the dp[N/2] subarray as A, the cumulative sum of the dp[N/2] subarray as B and the dp[N] subarray as C, we note that $C = A \ast B$ where here the asterisk indicates convolution. We can thus use the Fast Fourier Transform to compute the transitions of all C states in linearithmic time, leading to the above overall running time. To mitigate floating point rounding errordps, the Number Theoretic Transform could be used instead.

Finally, by looking at the problem from an entirely different albeit more classical angle, we can, if we wanted to, solve the problem in $\mathcal{O}(N)$ time. This is a simple adaptation from the naive dynamic programming approach which takes $\mathcal{O}(NC)$ time.

Let dp[N] be the number of binary strings of length N + 1 that end with a zero (effectively the total number of binary strings of length N). Before that zero, we can have a contiguous series of ‘1’s of any length from zero to C:

dp[N] = dp[N – 1] + dp[N – 2] + … + dp[N – C – 1]

Rewrite this for N + 1:

dp[N + 1] = dp[N] + dp[N – 1] + … + dp[N – C]

Note that almost all the terms in the sum are the same as before. Thus, we could use a sliding window technique to speed up the computation by only adding and removing a constant number of terms at both ends of the sum each iteration. In other words:

dp[N + 1] = 2 * dp[N] – dp[N – C – 1]