Wednesday, May 06, 2026

When do we know someone has died

As the blog of record in computational complexity, we like to bring attention to those in the community who have left us. When we learn of someone in our field who has died, Bill and I will talk to each other and decide whether we should do a social media post or a full blog post, and who should write it, Bill, me, or someone else. In fact, if I call Bill, he'll often answer the phone with "who died?"

We also remember those who passed away during the year in our end-of-year post.

One challenge is how we actually know when somebody has died. Consider Michael Rabin. His death was announced on Wikipedia based on the following announcement in Haaretz, an Israeli newspaper.

Haaretz Obit (Translated by Google)

That's a pretty simple obituary for a very famous computer scientist. Rabin is a common name in Israel, and there easily could have been another professor named Michael Rabin somewhere in the country. Every mention of Michael Rabin's death that I saw was just citing the Wikipedia article, nothing from Tal Rabin or some other source that cited the family.

By the time Bill put up the first Rabin post on April 22, we figured that if our Michael Rabin had not died, someone would have come forward to say so.

Tony Hoare is a different story, where our blog was one of the first to break the news. I heard from two separate people that they had heard from the family that he had passed away. It helped that I was in Oxford at the time, where Hoare spent much of his career.

And too often a theoretical computer scientist passes away but the news never reaches us and we don't remember them. It's always sad when someone passes, but it is a good opportunity to remember how they helped shape our field. But we need your help to know when someone has passed away. So if you know someone in our community has passed away, please let us know, and how you know, so that we can know we know.

Sunday, May 03, 2026

A few notes on Michael Rabin

Michael Rabin passed away on April 14, 2026. I blogged about him here.

My post listed results of his that proved upper and lower bounds on problems. My point was that he proved upper and lower bounds at MANY different levels, from decidable down to regular. And I am sure I left out some of his results.

Here are some things I did not mention.

1) Rabin and Scott shared the Turing Award in 1976.  My not mentioning it raises the following question:

If I want to convey that someone has an impressive set of results, which is the better way:

listing the awards they've won, or

listing  their results. 

I leave this to the reader. 

2) I had Rabin for two graduate courses at Harvard: Algorithms and Complexity Theory. He was a great teacher and gave insights into the results, some of which he had either proven or worked on.

3) I recalled thanking him in my PhD Thesis. So I dusted it off to see what I had said: 

The many courses I have taken at Harvard and MIT have helped me create this thesis. I am especially indebted to Michael Rabin, Mike Sipser, and Michael Stob for their excellent courses in algorithms, complexity theory, and recursion theory. Their pedagogy has been an inspiring example of what good teaching can and should be.

What is the probability that all three great teachers were named Michael? I do not know; however, I suspect Michael Rabin could have told me.




Wednesday, April 29, 2026

Because It Doesn't Have To

My favorite quote about networking came from Jim Kurose.

The Internet works so well because it doesn't have to.

The IP and lower layers of the Internet stack make no promises of delivery; complete failure still fulfills the protocol. This allows for simpler and more powerful protocols, without the extra complexity needed to guarantee success. TCP achieves delivery essentially by retrying the IP communication when it fails, and even TCP can report failure to the layers above.
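To make the layering concrete, here is a toy sketch (not real networking code; the unreliable_send function and its 30% loss rate are made up for illustration): the lower layer may silently drop data and still fulfill its contract, while a TCP-like layer on top adds reliability by retrying and reports failure upward if the retries run out.

    import random

    def unreliable_send(packet):
        # Best-effort delivery, like IP: dropping the packet still fulfills the contract.
        if random.random() < 0.3:   # made-up 30% loss rate
            return None
        return packet

    def reliable_send(packet, max_retries=10):
        # A TCP-like layer: retry the best-effort channel, and report failure
        # upward if it never succeeds.
        for _ in range(max_retries):
            ack = unreliable_send(packet)
            if ack is not None:
                return ack
        raise TimeoutError("delivery failed; the layer above must cope")

    print(reliable_send(b"hello"))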

We can say the same about modern artificial intelligence.

Machine learning works so well because it doesn't have to.

With the softmax function that neural nets use to determine the probability of outputs, neural nets never completely rule out a possibility, always giving it at least some tiny probability. In cases where the problem is just too complex, neural nets spread nontrivial probability over several possibilities, as I described in my recent post, where a machine learning model would generate a uniform distribution to capture the output of a pseudorandom generator. Instead of rigidly forcing the model to give us a specific answer, by looking at distributions we allow the models to make mistakes.
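Here is a quick numerical illustration (a generic softmax computation, not any particular model's code): even an option whose score is hopelessly low keeps a tiny but nonzero probability.

    import math

    def softmax(scores):
        m = max(scores)                          # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    logits = [10.0, 2.0, 1.5, -20.0]             # made-up scores for four candidate tokens
    probs = softmax(logits)
    print(probs)                                 # the last token gets roughly 1e-13
    print(min(probs) > 0)                        # True: nothing is completely ruled out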

Thus a machine learning model can be correct when it makes probabilistic guesses in situations too complicated to solve directly, which allows it to achieve its best possible performance. Because we allow the models to make mistakes, they have the flexibility to solve complex problems far more frequently.

Sunday, April 26, 2026

LEAPing into the Future of Coding

A few months ago in Oxford, Bernard Sufrin, an emeritus fellow, said he was looking to hire a student to implement LEAP (Logic Engine for Argument by Pointing), a way to teach logic by proving basic logic theorems via pointing and clicking. Rahul Santhanam suggested giving it to AI. Bernard said AI couldn't handle the task. I thought, why not give it a try?

I gave Claude a single prompt that Bernard helped me formulate:

Architecture of a system to support proof by pointing in first order logic using LEAP

About an hour later, without my giving any additional details, we had a working prototype. Give it a try. I put the code and some more details on GitHub.

Watching Claude work was amazing. Claude created an architecture based on the paper Proof by Pointing by Yves Bertot, Gilles Kahn and Laurent Théry.

Claude then asked me:

Would you like me to proceed with implementing any of these layers?

Why not? So I told Claude to go ahead.

It started coding, created 78 test cases, and kept debugging itself until all of them passed. It took me longer to get the program working on the web than it took Claude to write it. I sent the link to Bernard, who responded "Wow! I take it back if the program was all synthesized from the prompt."

When I asked Bernard a month later if I could post about this program, he agreed, though he had some additional comments.

The consolation for me is that the outcome, though surprising and I suppose delightful (though it didn't manage rules for quantifiers), didn't refute my conjecture that Claude et al cannot do "architecture". The architecture (MVC) is stock; the "proof by pointing" hint led it to a paper of the same name which gave details of how to derive terms/formulae/goals in the (unfinished) proof from screen coordinates. Maybe you could give a substantively different prompt.

It may be of interest to know that the Bornat/Sufrin JAPE program, still on GitHub, was eventually a lot more ambitious when it came to selecting things: terms, subterms, goals. But it took more than 2 hours to build!

Bernard has a point. It wasn't just the fifteen-word prompt alone; Claude leaned on the Bertot et al. paper to guide its design and implementation. Still, that Claude architected, built and debugged the system from the prompt and the paper is truly impressive. We've crossed a threshold for coding, far beyond what would have been possible just a few months earlier.

Wednesday, April 22, 2026

Michael Rabin Passed Away on April 14, 2026, at the age of 94

Michael Rabin passed away on April 14, 2026 at the age of 94.  (Scott Aaronson has also blogged about his passing, see  here.) 


I had many points to make about him; however, the first one got so long that I will just do that one for today's blog post.

Rabin is an extremely well-rounded \(\ldots\) computer scientist? Computer scientist seems too narrow, and the whole point here is that he was well-rounded. So I will start this thought again.

The following is an extremely important question that permeates computer science, mathematics, and I am sure other fields:

Given a problem, how hard is it?

Note that this is a rather broad question, since the terms problem and hard are themselves very broad. And by hard I mean both upper bounds and lower bounds.

Rabin had worked on this problem in many different domains. I list them in roughly the order of hardness, starting from the hardest.

1) Recursive Algebra: Mathematicians had proven that every field has an algebraic closure (an extension that is algebraically closed). In 1960 Rabin asked, and answered affirmatively: does every computable field (the elements of the field are computable, and \(\times\) and \(+\) are computable) have a computable algebraic closure? This was an early result in what later became Nerode's Recursive Math Program, which in turn became a subcase of the Friedman/Simpson Reverse Math Program.

2) The Decidability of S2S: In 1969 Rabin proved that the second order theory of 2 successors (S2S) is decidable. In a course I had with him he taught the proof that the weak second order theory of 1 successor (WS1S) is decidable. I teach that when I teach Automata Theory since it brings together Finite Automata and Decidability. My slides are here.

S2S is one of the only decidable theories in which one can state mathematical theorems of genuine interest. (I may blog about that some other time.)

S2S is the decidable theory with the hardest proof of its decidability. Rabin's proof used transfinite induction, though later proofs did not.

Personal Note: I was the subreferee on a paper by Gurevich and Harrington that simplified the proof tremendously, back in the early 1980's. Their proof is the one to read now.  Rabin was happy that the proof was simplified. The proof is still hard, just not as hard. 


3) In 1974 Fischer & Rabin showed that any algorithm to decide Presburger arithmetic requires time \(\ge 2^{2^{cn}}\) for some constant \(c\). I asked chatty whether anything better is known and it gave me answers that didn't make sense; an earlier version of this paragraph said so and had some incorrect statements in it. One of the commenters told me what is TRUE, which made it easier to look up. Anyway, there is a triple-exponential algorithm, and there is a complexity class that Presburger arithmetic is complete for---see the comment. Spellcheck thinks Presburger is spelled incorrectly. I can understand why it thinks so, but it's wrong.

When Rabin taught this result he pointed out that Hilbert and others not only thought that math was decidable but also that perhaps the decision algorithm could be used to really do math. The complexity results show that even if a theory is decidable it may still be really hard. (With AI maybe we can use computers to do math, but that is way too big a topic, and too big a tangent, to get into within a blog-obit.)

4) In 1972 Rabin proved the following. Given linear forms \(L_1(x),\ldots,L_m(x)\) over the reals (so the intent is \(x\in R^n \)) we want to know if there exists \(x\in R^n \) such that, for all \(1\le i\le m \), \(L_i(x)>0 \).  The model of computation is a decision tree where each internal node can ask a question of the form \(L(x) {\rm\  RELATION\  } 0 \) where RELATION can be any of \(<,=,>\). Each leaf is labelled YES or NO.  Then the depth of the tree is \(\ge 2^{\Omega(n)} \).  This is an early paper in decision tree complexity.
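To make the model concrete, here is a tiny sketch of such a decision tree (my own illustration of the model of computation, not Rabin's construction): each internal node queries the sign of one linear form, and the cost is the number of nodes visited on the way to a leaf.

    class Leaf:
        def __init__(self, answer):
            self.answer = answer      # "YES" or "NO"

    class Node:
        def __init__(self, coeffs, lt, eq, gt):
            self.coeffs = coeffs      # the linear form L(x) = sum_i coeffs[i] * x[i]
            self.lt, self.eq, self.gt = lt, eq, gt   # children for L(x) < 0, = 0, > 0

    def evaluate(tree, x):
        # Walk the tree; the depth bounds the number of linear-form queries.
        while isinstance(tree, Node):
            val = sum(a * xi for a, xi in zip(tree.coeffs, x))
            tree = tree.lt if val < 0 else tree.eq if val == 0 else tree.gt
        return tree.answer

    # A toy tree deciding whether x_1 > 0 and x_2 > 0 (n = 2, L_1 = x_1, L_2 = x_2).
    tree = Node([1, 0], lt=Leaf("NO"), eq=Leaf("NO"),
                gt=Node([0, 1], lt=Leaf("NO"), eq=Leaf("NO"), gt=Leaf("YES")))
    print(evaluate(tree, [3.0, 2.5]))    # YES
    print(evaluate(tree, [3.0, -1.0]))   # NO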

5) In 1977 the Handbook of Math Logic came out. Rabin did the article on Decidable theories. He mentioned P vs NP as being really important and being the next logical (no pun intended) direction for logic. He was the only author to mention P vs NP.

6) While Rabin did not invent randomized algorithms, he worked on them a lot early on. In 1977 Solovay and Strassen obtained a polynomial-time randomized algorithm for primality. In 1976 Rabin noticed that Miller's primality algorithm (which showed that if the Extended Riemann Hypothesis is true then primality is in P) could be modified into a randomized polynomial-time algorithm for primality. While preparing this blog post I noticed that I often hear about the Miller-Rabin algorithm (many cryptography protocols need a primality algorithm) but I hardly ever hear about the Solovay-Strassen algorithm. I asked Google AI why this is. In brief: Miller-Rabin is faster, has a lower probability of error, and is simpler. Is Google AI correct? I think yes, since Miller-Rabin is used and Solovay-Strassen is not. I might be employing circular reasoning here.
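For the curious, here is the textbook version of the Miller-Rabin test (a standard presentation, not Rabin's original code, and without the optimizations real crypto libraries use): each round errs with probability at most 1/4, so 40 rounds is plenty.

    import random

    def miller_rabin(n, rounds=40):
        # Returns False if n is definitely composite, True if n is probably prime.
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        d, s = n - 1, 0                # write n - 1 = 2^s * d with d odd
        while d % 2 == 0:
            d //= 2
            s += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False           # a witnesses that n is composite
        return True

    print(miller_rabin(2**89 - 1))     # a Mersenne prime: True
    print(miller_rabin(2**89 + 1))     # composite (divisible by 3): False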

The Miller-Rabin primality test might be his second-best-known work, the best known being the last item on this list, the result of Rabin and Scott that NFAs are equivalent to DFAs. (It's last since it involves the lowest complexity class he worked on.)

 
7) Rabin obtained other randomized algorithms. I mention two:

a) Rabin-Shallit: When teaching automata theory I often give my students the following sets in NP and ask them to determine which ones are known to be in P:

\( \{ x \in N \colon (\exists y)[x=y^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2)[x=y_1^2+ y_2^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3)[x=y_1^2+ y_2^2+ y_3^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4)[x=y_1^2+ y_2^2+ y_3^2+y_4^2] \} \)

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4,y_5)[x=y_1^2+ y_2^2+ y_3^2+y_4^2+y_5^2] \} \)

etc.

The input is in binary, so searching over all possible \(y_1,y_2,y_3,\ldots\) is exponential in the length of the input.

I leave the first three to the reader.

This one:

\( \{ x \in N \colon (\exists y_1,y_2,y_3,y_4)[x=y_1^2+ y_2^2+ y_3^2+y_4^2] \} \)

is sort of a trick question. Every natural number is the sum of 4 squares (Lagrange's theorem), so this set, and all later ones, are trivially in P. But that raises the question: how do we find \( y_1,y_2,y_3,y_4 \)? This is a curious case of a problem in NP whose decision version is in P (in fact, trivial), but whose witness seems hard to find.

When I first assigned this problem I looked up what was known about finding the \(y_1,y_2,y_3,y_4\).

In 1986 Rabin and Shallit showed that, assuming ERH, there is a randomized \( O(\log^2 n) \) algorithm for finding such a representation. I am surprised that you need both ERH and randomness. This seems to be a less well-known result, though I don't know why, since it's a natural question.
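The Rabin-Shallit algorithm itself is too involved for a blog post, but here is the naive search it improves on (my own illustration): brute force over \(y_1, y_2, y_3\), which is fine for small numbers but exponential in the bit-length of \(x\).

    import math

    def four_squares_naive(x):
        # Find (y1, y2, y3, y4) with x = y1^2 + y2^2 + y3^2 + y4^2 by brute force.
        # Lagrange's theorem guarantees a solution exists; the cost is the problem.
        r = math.isqrt(x)
        for y1 in range(r + 1):
            for y2 in range(y1, r + 1):
                if y1*y1 + y2*y2 > x:
                    break
                for y3 in range(y2, r + 1):
                    rest = x - y1*y1 - y2*y2 - y3*y3
                    if rest < 0:
                        break
                    y4 = math.isqrt(rest)
                    if y4*y4 == rest and y4 >= y3:
                        return (y1, y2, y3, y4)

    print(four_squares_naive(2026))    # (0, 0, 1, 45): 0 + 0 + 1 + 2025 = 2026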

b) Karp-Rabin: In 1987 Karp and Rabin obtained a really fast, really simple (that's a plus in the computer science world)  randomized pattern matching algorithm.  Since it is fast and simple I wondered if it is really used. It is! To quote Google AI:

Yes, the Karp-Rabin algorithm is used in the real world, particularly in scenarios requiring the detection of multiple patterns simultaneously, such as plagiarism detection, data deduplication, and bioinformatics.

Is Google AI correct? I leave that as an exercise for the reader. 
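Here is a bare-bones version of the idea (a sketch only: the real algorithm picks the modulus as a random prime, while I hard-code one for brevity): hash the pattern, slide a rolling hash across the text, and compare characters only when the hashes agree.

    def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
        # Return all starting indices where pattern occurs in text.
        n, m = len(text), len(pattern)
        if m == 0 or m > n:
            return []
        high = pow(base, m - 1, mod)           # weight of the window's leading character
        p_hash = t_hash = 0
        for i in range(m):
            p_hash = (p_hash * base + ord(pattern[i])) % mod
            t_hash = (t_hash * base + ord(text[i])) % mod
        matches = []
        for i in range(n - m + 1):
            if p_hash == t_hash and text[i:i+m] == pattern:   # verify on a hash match
                matches.append(i)
            if i < n - m:                      # roll the window one character to the right
                t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i+m])) % mod
        return matches

    print(rabin_karp("abracadabra", "abra"))   # [0, 7]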

8) In 1979 Rabin devised a cryptosystem whose security is equivalent to factoring; a toy sketch of the system appears at the end of this item. How come RSA became the standard and not Rabin's system? Broadly, there are two possibilities:

a) RSA was better.

b) RSA was faster to the marketplace and other random factors.

Which is it? I leave that to the reader.
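For concreteness, here is a toy version of Rabin's system (my own sketch with absurdly small hard-coded primes; real use needs large random primes plus redundancy or padding so the receiver can pick the right root): encryption is just squaring mod \(n\), and decryption takes square roots using the factorization.

    def egcd(a, b):
        # Extended gcd: returns (g, x, y) with a*x + b*y = g.
        if b == 0:
            return a, 1, 0
        g, x, y = egcd(b, a % b)
        return g, y, x - (a // b) * y

    p, q = 499, 547            # toy primes, both 3 mod 4; far too small for real use
    n = p * q                  # public key; (p, q) is the private key

    def encrypt(m):
        return (m * m) % n     # Rabin encryption: square modulo n

    def decrypt(c):
        # Return all four square roots of c mod n; redundancy in the message
        # is what lets the receiver recognize the real plaintext among them.
        mp = pow(c, (p + 1) // 4, p)    # square root mod p (works since p = 3 mod 4)
        mq = pow(c, (q + 1) // 4, q)    # square root mod q
        _, yp, yq = egcd(p, q)          # combine the roots with the Chinese Remainder Theorem
        r = (yp * p * mq + yq * q * mp) % n
        s = (yp * p * mq - yq * q * mp) % n
        return sorted({r, n - r, s, n - s})

    m = 271828
    print(m in decrypt(encrypt(m)))     # True: the plaintext is among the four roots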

9) In 1958 Rabin and Scott showed that NFAs are equivalent to DFAs. This may be his best known work.
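Their subset (powerset) construction is short enough to sketch (a generic textbook rendering, not their notation): each DFA state is the set of NFA states currently reachable, so an n-state NFA becomes a DFA with at most \(2^n\) states.

    def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
        # nfa_delta maps (state, symbol) to a set of states.
        # Returns the DFA start state, transition table, and accepting states,
        # where each DFA state is a frozenset of NFA states.
        dfa_start = frozenset([start])
        dfa_delta, dfa_accepting, todo = {}, set(), [dfa_start]
        while todo:
            S = todo.pop()
            if S in dfa_delta:
                continue
            if S & accepting:
                dfa_accepting.add(S)
            dfa_delta[S] = {}
            for a in alphabet:
                T = frozenset(t for s in S for t in nfa_delta.get((s, a), set()))
                dfa_delta[S][a] = T
                todo.append(T)
        return dfa_start, dfa_delta, dfa_accepting

    # An NFA for strings over {0,1} whose second-to-last symbol is 1.
    nfa = {('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
           ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'}}
    start, dfa, acc = nfa_to_dfa(nfa, 'q0', {'q2'}, ['0', '1'])
    print(len(dfa))    # 4 reachable DFA states here, versus the 2^3 = 8 possible subsets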



Thursday, April 16, 2026

Machine Learning and Complexity

At Oxford I focused my research and discussions on how we can use the tools of computational complexity to help us understand the power and limitations of machine learning. Last week I posted my paper How Does Machine Learning Manage Complexity?, a first step in this direction. Let me give a broad overview of the paper. Please refer to the paper for more technical details.

Instead of focusing on machine learning concepts like neural nets, transformers, etc., I wanted to abstract out a model defined in terms of complexity: time-efficient non-uniform computable distributions with minimum probabilities. Let's unpack this abstraction.

Neural nets as next-token predictors don't always give the same next token; rather, they give a distribution over tokens, which leads to a distribution on text of length up to the size of the context window. The way probabilities are computed via a softmax function guarantees that every output can occur with at least an exponentially small probability, the "minimum probability" in the abstraction.

In computational complexity, we study two main kinds of distributions. A distribution is sampleable if there is an algorithm that takes in uniformly random bits and outputs text according to that distribution. A distribution is computable if we can compute not only the probability that a piece of text occurs but also the cumulative distribution function, the sum of the probabilities of all outputs lexicographically less than that text. Every efficiently computable distribution is efficiently sampleable, but likely not the other way around.

Neural nets as next-token predictors turn out to be equivalent to computable distributions. We need these kinds of distributions both for how neural nets are trained and for how we use them. Computable distributions allow for conditional sampling, which lets us use a large language model to answer questions or have a conversation. You can't get conditional sampling from a sampleable distribution.
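A toy illustration of the point (entirely my own example, not the paper's formalism): with a computable next-token rule you can both compute the probability of any full string and sample conditioned on a prompt, which is exactly how we use language models.

    import random

    def next_token_probs(prefix):
        # A made-up next-bit rule standing in for a neural net's softmax output.
        p_one = 0.9 if prefix.endswith("1") else 0.3
        return {"0": 1.0 - p_one, "1": p_one}

    def prob(text):
        # Because each conditional probability is computable, so is the probability
        # of any full string.
        p = 1.0
        for i, ch in enumerate(text):
            p *= next_token_probs(text[:i])[ch]
        return p

    def conditional_sample(prompt, length):
        # Conditional sampling: fix the prompt and keep drawing next tokens.
        out = prompt
        while len(out) < length:
            probs = next_token_probs(out)
            out += random.choices(list(probs), weights=list(probs.values()))[0]
        return out

    print(prob("110"))                    # 0.3 * 0.9 * 0.1 = 0.027 (up to rounding)
    print(conditional_sample("110", 10))  # a random completion of the prompt "110"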

Neural nets have a limited number of layers, typically about a hundred, which prevents them from handling full time-efficient (polynomial-time) computations. They can get around this restriction either with reasoning models that allow a model to talk to itself, or by directly writing and executing code that can run efficient algorithms of any kind.

Most of the algorithms we use, think Dijkstra or matching, are uniform, that is, the same algorithm runs on graphs of twenty nodes as on twenty million. But neural nets change their weights based on the distributions they train on. These weights encode a compressed view of that data and of the extra computation needed to process it. So we get different algorithms as our technology and data sources improve. I wrote more about this connection between AI and nonuniformity last fall.

What does this abstraction buy us?

Restricting to computable distributions forces machine learning algorithms to treat complex behavior as random behavior, much like we treat a coin flip as a random event because it is too complicated to work out all the physical interactions that would determine which side it lands on.

We illustrate this point with our main result. Let D be the distribution of outputs from a cryptographic pseudorandom generator and U be the uniform distribution on words of the same length. If \(\mu\) is a time-efficient non-uniform computable distribution with minimum probabilities, and \(\mu\) does at least as good a job as U of modeling D, then \(\mu\) and U are essentially the same distribution. Machine learning models the complex PRG as a uniform distribution, simplifying what we can't directly compute by using randomness.

More in the paper and future blog posts.

Tuesday, April 14, 2026

Guest Post from Peter Brass, Former NSF Theory Director, on the NSF budget.

 Guest post from Peter Brass, Former NSF Theory director (though not affiliated with the NSF now) on the White House NSF budget for FY 2027.

---------------------------------------------

Dear Colleagues

A week ago the White House released the NSF budget request for FY 2027 (here), so I want to provide you an update. As always, this is just my interpretation; I am not connected to the NSF any more.

The general news is bad: we get a replay of FY 2026. (FY 2026 runs from October 2025 to September 2026.)

For context: the NSF enacted budget in FY 2023 was $9.5B. The CISE budget was $0.9B. Budgets start with a request by the White House, and then get negotiated with Congress.

A year ago, the White House made a FY 2026 budget request of $3.9B, which was a ~60% cut from the previous level. After negotiation with Congress, the enacted budget became $8.7B.

However, a long period of FY 2026 was funded by continuing resolutions, and continuing resolutions are based on the White House budget request, so for the first half of FY 2026 the 60% cut was in effect, and awards are only now picking up. In addition, there was the government shutdown, and the removal of NSF from the Eisenhower Ave building in Alexandria, so processing will be delayed.

The cancellation of awards, although it caused a huge media echo, had a very small impact outside the EDU and SBE directorates: about 2% of awards were affected, and perhaps a third of them were restored.

The FY 2027 budget request asks for $4.8B, but after subtracting the $0.9B reserved for a new Antarctic research vessel, the request is the same $3.9B. We hope that Congress will again restore much of it, but clearly the White House remains unsupportive of basic research. The proposed cuts for NSF are a larger percentage than those for NASA, NIH, NIST, or DoE research. The request reduces the CISE budget from $0.9B to $0.3B.

The NSF projects a slight decrease in the number of proposals and a huge decrease in the funding rate: across all research proposals, a decline from 18% to 7%. The average award size and award duration are projected to increase slightly, giving far fewer people a bit more resources.

So far NSF has not done any hiring since the start of the current administration, and although Congress restored much of the research funding, it did not restore the cuts in the staffing level. Rotators continue to end their rotations and are not replaced; the current plan seems to be to discontinue the concept of a rotator entirely.
With the reduction in program staff, the remaining program directors are overworked and are put in charge of programs where they have no previous connection to the research community, without time to establish that connection. The programs themselves remain unchanged, as they are governed by the solicitations, but the people managing the programs can give less time to each individual program. And the money available depends on the outcome of the budget process.

That is the current situation, or at least the plan of the White House; we will see what Congress makes of it.

TO DO: Tell your member of Congress that you thank them for their previous support of basic research at the NSF, and that you hope they continue it.



Thursday, April 09, 2026

Afterthoughts on Banach-Tarski and the Miracle of Loaves and Fishes

I posted about using the Banach-Tarski Paradox (BT) to explain the miracle of Loaves and Fishes (LF) here.

Darling says that whenever I fool my readers or my students I have to tell them later, so I'll tell you now: the story about me meeting with the Pope and talking about the BT Paradox (that would be a good name for a rock band: B-T-Paradox) was not true. I think my readers know that.

 

1) I first learned of the Banach-Tarski Paradox as a grad student in 1981 when I read Hilary Putnam's article Models and Reality, where he writes on page 470:

One cannot simply sit down in one's study and "decide" that "V=L" is to be true, or that the axiom of choice is to be true. Nor would it be appropriate for the mathematical community to call an international convention and legislate these matters. Yet, it seems to me that if we encountered an extra-terrestrial species of intelligent beings who had developed a high level of mathematics, and it turned out they rejected the axiom of choice (perhaps because of the Tarski-Banach Theorem), it would be wrong to regard them as simply making a mistake. To do that would, on my view, amount to saying that acceptance of the axiom of choice is built into our notion of rationality itself; that does not seem to me to be the case.

I agree with him and I wonder if we accept AC too readily. See my blog post on the BT paradox and my wife's strong opinion (she's against it).    

2) Back in 1981 my first thought was I wonder if someone has thought to use the BT paradox to explain the LF? And if so, were they serious or was it some kind of  joke? And does the Pope really get a discount at Pope-Yes? 

I also had the meta-thought (which I could not have said as cleanly as I will now):

 I wonder how I could find out if anyone else has thought of the BT-LF connection? 

Recall that back in 1981 the Internet was but a glint in Al Gore's eyes. So back then I could not find out if anyone else had thought of the BT-LF connection.

But now I can! And indeed, as I expected, some other people have made the connection of BT to LF: 

A tweet and a Reddit thread discussing the tweet: here. Not serious 

A serious article, I think, here   

A 24-page article about Holy Water and BT. I can't imagine an article that long being a parody, so I think it's serious; see here. On the other hand, there is a 12-page article about Ramsey Theory and History that I think is supposed to be a parody; see here. (My proofreader points out that a different definition of parody is "a feeble or ridiculous imitation," so the article may well be a parody in that sense.)

A parody article, I think, here

I am sure there are more. 

3) I had thought of doing a blog post about BT and LF  a long time ago,  but Pope Leo having a math degree was the final push I needed.

4) The word cardinal has three very different meanings: (a) a type of infinity, (b) a position in the Catholic church, (c) the bird.  Same for large cardinal.

5) One of my students who proofread the post thought that people will know it's a hoax since I am a vegetarian and hence would not eat at Pope-Yes, even if the Pope was paying.