The Outer Limits of Reason: A Summary

In the last 2 years, I have picked up my reading speed to about 5 books per month. And since I mostly read non-fiction, the concepts and ideas keep swirling in my head, and sometimes I get a dizzy sense of not having completely understood what I just read. I’m talking about that odd feeling when someone asks you what a book is about and you realize that, while you understood its main ideas, you still struggle to explain the details. So, taking “if you can’t explain it, you haven’t understood it” to its logical extreme, I will start summarizing books that I like. In addition to cementing my own understanding, hopefully it will help others decide whether they want to read the book or not.

The Outer Limits of Reason by Noson S. Yanofsky

This book is written by Yanofsky, a professor in the Department of Computer and Information Science at Brooklyn College of the City University of New York. Most popular science books, like A Brief History of Time or The Selfish Gene, talk about what we know about specific subjects. This book, as the title says, describes problems and concepts that are inherently out of bounds for us. This fantastic, eye-popping perspective challenges the intellectual bravado of mankind, the belief that progress in science will ultimately solve all problems and enable us to understand everything. What’s intriguing about the entire exercise of knowing what cannot be known is that the glimpse of the unknown must be had with the known. It’s like taking a telescope, pointing it at the sky and observing a sign saying ‘there’s nothing to see here’.

The book does a great job of explaining concepts simply, and I’m going to try simplifying them even more in this summary. Instead of going in a random order, I’ll follow the order in which the ideas appear in the book’s chapters.

Language Paradoxes

The book begins with a simple statement many of us might have come across before. Consider the sentence: “This sentence is false.” Is that sentence true or false? If we assume it’s true, then as per the sentence itself, it must be false. If we assume it’s false, it means it must be true. So we have an inconsistency, and it’s there because language is a human-made construct full of inconsistencies. We can make up sentences that make no sense, and that sentence was an example of such meaningless sentences. So far, so good.

Let’s talk about a famous paradox called Russell’s Paradox. In the late 19th century, mathematicians working on set theory (notably Georg Cantor) believed that all of mathematics could be derived from set theory. For example, zero could be represented by the empty set (the one with no elements). Addition could simply be combining two sets and counting the number of elements in the combined set. Taking set theory as it stood in the late 1800s, Russell proposed the paradox that I’ll discuss soon.

A set is simply a collection of items: if you have a bunch of items in a group, you can ask questions about it or perform a series of operations on it. For example, if you have a set of 5 oranges, you can ask how many items remain if we take 1 orange from it, or you can combine that set of 5 oranges with a set of 5 mangoes and now you have a set of 10 items. With physical objects, sets are intuitive and easy to comprehend, but the key idea is that a set is an abstraction that can be applied to any collection of any type of items. So even though you will never see an infinite number of oranges, you can still define an infinite set (say, the set of all whole numbers 0, 1, 2, 3, …).

There are some sets that contain themselves. The idea may seem strange, but remember we’re talking mathematics, and the idea of some sets containing themselves makes perfect sense. For example, define S as the set of all sets that contain more than 1 element. There are infinitely many sets with more than 1 element (for example, you can define sets of two consecutive numbers: {1,2}, {2,3}, {3,4}, etc.). So S itself fits the definition of S (i.e. a set with >1 element). Hence S contains itself. For a simpler example, consider a book listing the names of all published books. That book must contain its own name.

Back to Russell’s paradox. What Russell did was to define a set R as the set of all sets that do not contain themselves. The question we must ask now is whether R contains itself. Thinking about this for a minute, you’ll realize that it leads to an inconsistency. If R contains itself, it cannot contain itself (because it’s the set of all sets that do not contain themselves). If R does not contain itself, then it must contain itself. Voila, there we have the mathematical version of the “this sentence is false” statement. Notice that here we can’t dismiss the paradox as easily. It’s built on logic – which is the foundation of science.
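To make the self-reference concrete, here is a loose illustration of my own (in Python, not from the book), where a “set” is modelled as a membership test – a function that answers whether something belongs to it:

from sys import setrecursionlimit   # only needed if you uncomment the last line

def contains_itself(s):
    # A "set" here is a function: s(x) answers whether x belongs to s.
    return s(s)

# R = "the set of all sets that do not contain themselves"
R = lambda s: not s(s)

# Asking whether R contains itself has no consistent answer: R(R) would have
# to equal (not R(R)). In Python the call below never settles on True or
# False -- it recurses until it hits the interpreter's recursion limit.
# contains_itself(R)   # raises RecursionError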

Russell’s paradox, announced in 1901, threw mathematics into a crisis. It was rescued by reworking the axioms (the basic statements that mathematicians accept without proof), which led to a more careful version of set theory called Zermelo–Fraenkel set theory. But as we will see later, even that wasn’t enough, because Gödel later demonstrated that in any consistent formal system powerful enough to express arithmetic, there are statements that can neither be proved nor disproved.

Multiple types of infinities

One of the most interesting topics in mathematics is the topic of infinity. Even more interesting is the fact that there are multiple types of infinities. Well, what does that mean? Hang on.

Let’s start by counting. This seems like a simple enough exercise: if you have a few bananas, to count the total number you mentally put them into a one-to-one correspondence with numbers. So you go… “1 banana (pointing to the first banana), 2 bananas (pointing to the second one), 3 bananas (you know the drill)” and that’s how you know there are 3 bananas. Similarly, if we define the concept of infinity as the count of all natural numbers (1, 2, 3, 4, 5, …), then by putting other sets into one-to-one correspondence with them we can see which other sets are infinite too.

Consider the set of all even numbers. Since even numbers are a subset of natural numbers, it’s very odd to think that there are as many even numbers as there are natural numbers, but that’s exactly how it turns out through one-to-one correspondence. For each natural number, you can multiply it by 2 and get a corresponding even number, and you can keep doing this without ever running out of numbers. This means that both sets are of equal size, which is infinite. To distinguish it from other sets, this particular type of infinity is called countable infinity and is represented by ℵ₀ (pronounced aleph-zero). It may seem surprising, but even the set of all rational numbers (those of the form m/n where both m and n are natural numbers) is of the same size as the natural numbers. So 1/2, 1/3, 1/4, … 2/3, 2/4, 2/5, … 3/4, 3/5, 3/6, … is equal in size to 1, 2, 3, 4, 5, … The proof is simple, you can check it out here.
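Here is a rough sketch of my own (the pairing is the standard one, but the code is not from the book) that pairs natural numbers with even numbers and then lists the positive rationals one after another by walking diagonals where m + n is constant – the same zig-zag idea behind Cantor’s argument:

from fractions import Fraction
from itertools import islice

# Natural numbers paired with even numbers: n <-> 2n, never runs out.
for n in range(1, 6):
    print(n, "<->", 2 * n)

def rationals():
    # List every positive rational m/n exactly once, diagonal by diagonal.
    seen = set()
    total = 2
    while True:
        for m in range(1, total):
            q = Fraction(m, total - m)
            if q not in seen:          # skip duplicates such as 2/4 == 1/2
                seen.add(q)
                yield q
        total += 1

print(list(islice(rationals(), 10)))   # 1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, 1/5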

The amazing thing is that an infinity larger than these countable infinite numbers exists. And what’s even more amazing is that it can be proven. In order to do so, let’s talk about real numbers.

For those who need a little refresher, real numbers include all numbers on the number line. For this discussion, we can restrict ourselves to numbers between 0 and 1, but the argument holds for all real numbers out there. To prove that there exists a larger infinity, we will assume there isn’t one – that all real numbers can be put into correspondence with natural numbers – and derive a contradiction. This is called the diagonalization proof.

To prove that the set of real numbers is bigger than the set of natural numbers, it’s sufficient to prove that there exists a real number that can’t be put into correspondence with the natural numbers. To do that, first assume that there’s some fancy correspondence between the natural numbers (1, 2, 3, …) and the real numbers between 0 and 1. It’s not important what this correspondence is; the point is to assume that a perfect correspondence between real numbers and natural numbers exists and then find a real number that isn’t contained in it. See below:

Diagonal proof

The proof simply demands that for any correspondence, you take the first digit of the first row and change it to something else, take the second digit of the second row and change it to something else, and so on. The changed digits, when combined, make up a real number that isn’t contained in the original one-to-one correspondence (because it differs from the first number, the second number, and so on – in fact it differs from every number in the list). Voila! We have a real number that isn’t contained in the correspondence, and hence we have proven that the set of real numbers is larger than the set of natural numbers. No matter what correspondence you start with, the diagonal trick produces a number missing from it, so no listing of the real numbers can ever be complete. This larger infinity is called an uncountable infinity and its size is 2^ℵ₀. (This is because the set of real numbers can be put into one-to-one correspondence with the power set of the natural numbers, and the size of any power set is 2^(size of the set); e.g. the power set of {1,2} is {{}, {1}, {2}, {1,2}}. For more info, see the cardinality of the continuum.)
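Here is a toy version of the diagonal trick (my own illustration with a tiny hypothetical list): whatever list of digit sequences you hand it, it builds a number that disagrees with the n-th entry in its n-th digit, so the new number cannot appear anywhere in the list.

def diagonal_number(listed_digits):
    # listed_digits[n][k] is the k-th decimal digit of the n-th listed number.
    new_digits = []
    for n, digits in enumerate(listed_digits):
        d = digits[n]                            # the digit on the diagonal
        new_digits.append(1 if d != 1 else 2)    # pick anything except that digit
    return "0." + "".join(str(d) for d in new_digits)

listed = [
    [5, 0, 0, 0],   # 0.5000...
    [3, 1, 4, 1],   # 0.3141...
    [7, 1, 8, 2],   # 0.7182...
    [9, 9, 9, 9],   # 0.9999...
]
print(diagonal_number(listed))   # 0.1211 -- differs from every row in at least one digit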

Now here’s the kicker, and the first time the book introduces something that’s truly beyond reason: the continuum hypothesis (CH), which postulates that there exists no set whose size is strictly between the size of the natural numbers (ℵ₀) and the real numbers (2^ℵ₀). Cantor put out this hypothesis in 1878 and spent many years trying to prove it. Then in 1940, Gödel proved that CH cannot be disproved from the axioms of set theory (the ZFC axioms introduced above). And in 1963, Paul Cohen proved that the axioms of set theory cannot prove CH either. In short, CH is independent of set theory. There have been many attempts to extend set theory, but they all seem artificial, since the current ZFC axioms are foundational to much of modern mathematics. Anything we add to or modify in ZFC specifically to settle CH seems arbitrary, and doing so without introducing an inconsistency is extremely delicate.

So, returning to the question: is there any set whose size is between that of the natural numbers and that of the real numbers? That’s a question reason cannot answer. It’s also a reflection on the question of whether mathematics is “discovered” or “invented”. The immediate thought is to call it a discovery, since many people independently arrive at the same ideas (e.g. the numbers 1, 2, 3 exist in virtually all cultures), but then the question arises: discovered from where? If mathematics is discovered, there’s a definite answer to CH and we just don’t know it (or can’t know it). If it’s invented, then the question is meaningless. All we can say is that CH is independent of ZFC set theory, and demanding an answer about CH in the context of ZFC doesn’t make sense.


Practically impossible problems to solve: computational complexity theory aka P=NP?

Imagine you have a well defined puzzle or a problem that needs a solution. You further know that a solution exists for that problem. The question is: even if the solution exists, is there a guarantee that we will be able to find it?

I used to naively believe that, given enough computing resources or human minds, nothing is impossible. However, that’s not true, and to understand why, we have to look at two classes of problems: easy problems and hard problems.

We call a problem easy to solve if it requires a polynomial number of steps to find a solution. To simplify, imagine a problem with n inputs (say n = 5 for the list {4, 2, 7, 9, 1}). Now the problem is to sort these numbers from lowest to highest. One way to solve it is the selection sort algorithm, which scans the unsorted part of the list, finds the minimum, swaps it into the next position, and keeps doing that until the list is sorted. To count the steps: the first pass looks at all n numbers, the second pass looks at n − 1 (the first position is already settled), then n − 2, n − 3, and so on. The total number of steps is therefore n + (n − 1) + (n − 2) + … + 1, which comes out to be close to half of n^2, so we say the complexity of this algorithm is of the order of n^2.

Selection sort time complexity (source: Stackoverflow)
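As a concrete (if minimal) sketch of my own, here is selection sort with a comparison counter, showing the roughly n^2/2 growth described above:

def selection_sort(items):
    items = list(items)
    comparisons = 0
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[j] < items[smallest]:
                smallest = j                               # remember the smallest so far
        items[i], items[smallest] = items[smallest], items[i]   # swap it into place
    return items, comparisons

print(selection_sort([4, 2, 7, 9, 1]))        # ([1, 2, 4, 7, 9], 10)  -- 10 = 5*4/2
print(selection_sort(range(100, 0, -1))[1])   # 4950 = 100*99/2, about half of 100^2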

Polynomials are expressions with fixed powers of n: say n, or n^2, or a combination like n/2 + n^3, etc. In computational complexity theory, problems whose best known algorithms take a polynomial number of steps belong to the class P. Other examples of such problems are searching in a list, multiplying large numbers, and many others.

The other class is the hard problems: problems for which the best known algorithms take more than a polynomial number of steps – typically an exponential number. (These are loosely called NP problems; strictly speaking NP stands for “nondeterministic polynomial time” and means a proposed solution can be checked quickly, but the shorthand will do for this summary.) An example of such a problem is the travelling salesman problem.

Travelling Salesman Example (source: Mathworks website)

Imagine you are a salesperson in the US and you have 100 cities to visit in a year. Because you have a salesman’s bravado and an interest in solving hard problems, you decide to spend the minimum time on the road – that is, you’d like to find the route with the smallest total distance travelled.

This problem seems easy enough to understand and on the surface seems solvable, but finding the correct solution (the shortest possible route) takes an exceptionally large number of steps (and hence time). To solve the 100-city problem exactly, you have to try every possible ordering of the cities, comparing each route to the shortest one found so far. For n cities, such an algorithm requires n × (n − 1) × (n − 2) × … × 1 steps, which equals n! (n factorial). To see how big this is, consider that 100! ≈ 9.332621544 × 10^157. That is roughly a 1 followed by 158 zeros – operations for a problem with merely 100 inputs.
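To see the explosion first-hand, here is a brute-force sketch of my own (with made-up city coordinates): it tries every ordering and keeps the shortest round trip, which is fine for four cities and hopeless for a hundred.

from itertools import permutations
from math import dist, factorial

cities = {"A": (0, 0), "B": (3, 4), "C": (6, 1), "D": (2, 7)}   # hypothetical points

def tour_length(order):
    # Total distance of visiting the cities in this order and returning home.
    return sum(dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

best = min(permutations(cities), key=tour_length)    # checks all 4! = 24 orderings
print(best, round(tour_length(best), 2))
print(factorial(100))   # number of orderings for 100 cities -- about 9.3 x 10^157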

Now you may say that computers work really fast, so does this matter? A modern computer does about a billion operations per second. Divide 10^158 by a billion (10^9) and you get roughly 10^149 seconds to solve this problem, which is about 10^139 centuries on a single computer. Even if you put every computer on Earth to work on it (say a billion of them), it still comes to about 10^130 centuries to arrive at the correct solution.
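A quick back-of-the-envelope check of those numbers (rough, and assuming the billion-operations-per-second figure above):

from math import factorial

operations = factorial(100)                      # ~9.3 x 10^157 routes to check
seconds = operations / 1e9                       # at a billion operations per second
centuries = seconds / (100 * 365 * 24 * 3600)
print(f"{centuries:.1e} centuries on one computer")                  # ~3.0e+139
print(f"{centuries / 1e9:.1e} centuries on a billion computers")     # ~3.0e+130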

This is mind-boggling. Like the traveling salesman problem, there are many other hard problems. Their defining feature is that the best known algorithms take exponential time (say, n! steps). Other such problems, like set partitioning, take on the order of 2^n steps. To see how quickly these blow up as the number of inputs grows, compare the running times of polynomial (P) and exponential algorithms below.

Growth of algorithms
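And a tiny table of my own to make that growth visible – the polynomial step counts stay manageable while 2^n and n! explode almost immediately:

from math import factorial

print(f"{'n':>3} {'n^2':>6} {'2^n':>22} {'n!':>26}")
for n in [5, 10, 20, 30, 50]:
    # Integers happily format in scientific notation, which keeps n! readable.
    print(f"{n:>3} {n**2:>6} {2**n:>22} {factorial(n):>26.3e}")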

What I find amazing is that this constraint on whether we can ever know the solution to a 100-city traveling salesman problem is imposed by the nature of the problem and not by the limitation of physical resources available to us. We could have every atom in the universe calculating the answer to a hard problem and still not get an answer anytime soon. I find this inability to find answers to hard problems absolutely fascinating – it’s like a teacher playing a cruel joke on a student by assigning a problem she knows cannot be solved. It’s beyond the limits of what the universe lets humans find out.

I’m sure you must have heard of the famous question P = NP. To understand it, you have to realize that problems per se aren’t hard or easy; the best known solutions or algorithms for them are. The P = NP question asks whether, for problems whose best known algorithms take more than polynomial time (the hard ones), an easier polynomial-time algorithm exists. That is, does an algorithm for the traveling salesman problem exist that can find the shortest route in polynomial time, rather than in n! time (like the one we discussed)? The interesting thing about P = NP is that if we find a polynomial-time solution to one of these hard problems (more precisely, to one NP-complete problem, and the traveling salesman problem is one), we will have found a polynomial-time solution to all NP problems. This is because, in a precise sense, all NP problems are related to each other – any NP problem can be translated into an NP-complete one and solved with its algorithm (this is related to the Cook–Levin theorem).

Nobody knows whether P = NP or P ≠ NP. Just like the 100-city travelling salesperson problem, answering whether P = NP is itself extremely hard. Many computer scientists believe P ≠ NP, and there are strong indications pointing that way, but there is no definitive proof. The fun thing is that we may never be able to prove or disprove it – it may be beyond the limits of our reason!

The problems that can’t even theoretically be solved: the halting problem

Have you ever been frustrated by your laptop or desktop freezing completely? The infamous and dreaded ‘System Not Responding’ message. As a programmer, I knew that this usually happened when a program somehow went into an infinite loop. In fact, as a child, I remember we used to write infinite-loop programs and set them as autoexec.bat so that our school computer lab machines would hang at startup. It was a sweet little indulgence in mischief, but I used to wonder: why the hell can’t Windows detect such programs and break these loops?

System not responding – but why?

I think I now know the answer. You may have also heard of the Halting Problem before. This is a problem that Alan Turing posed and famously proved to be impossible to solve. The formulation is simple and goes like this:

Given a computer program X and a corresponding set of inputs Y, can there exist a program that predicts whether X, run on Y, will go into an infinite loop or halt?

Turing proved that no such program can exist. This is why Windows cannot predict and stop a program that goes into an infinite loop. There is an elegant proof of the undecidability of the halting problem, but the basic logic goes like this: imagine there was a program that could predict whether another program X halts on inputs Y or not. Let’s call this program Halt(X, Y). Using it, let’s build another program D, which takes a program X as its input and goes like:

Program D (input: a program X):
If Halt(X, X) = “Yes” Loop Infinitely
Else Output “No” and Stop

In essence, program D does the opposite of whatever Halt predicts: if Halt says X halts when run on its own code, D loops infinitely, and if Halt says X loops infinitely, D halts. Now feed program D to itself, i.e. ask Halt(D, D): does D halt when given its own code as input? We immediately hit a contradiction: if Halt answers “Yes”, D loops forever; if Halt answers “No”, D halts. Either way, Halt gives the wrong answer about D. Since a program that is always correct cannot also be wrong, Halt(X, Y) cannot exist.
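Here is the same idea in Python (a sketch of my own, not from the book): we obviously cannot write a real Halt, but we can show that any halt function someone hands us is necessarily wrong about the troublemaker D.

def make_trouble(halt):
    # Given a supposed halting oracle halt(program, argument), build the
    # program D that does the opposite of whatever halt predicts.
    def D(program):
        if halt(program, program):   # "program halts when run on its own code"
            while True:              # ...then do the opposite: loop forever
                pass
        return "halted"              # ...otherwise: halt immediately
    return D

fake_halt = lambda program, argument: True   # any guess we plug in must be wrong about D
D = make_trouble(fake_halt)
# D(D) now loops forever, even though fake_halt claimed it would halt.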

I find it very, very interesting that there can exist no program, logic, algorithm or process that can tell whether another program will halt or not. The only way to find out is to run the program itself, but even then, how would you know whether it will ever halt? Given a non-trivial program, if it is still running after 1 hour and has produced no output, will you conclude that it will never halt? How about 1 year? 100 years? A million years? This is the amazing part: we can never tell whether the program will keep running forever, or whether, maybe after a billion years, it will stop and give an output (hopefully 42).

The halting problem is one of a large class of problems known as undecidable problems.

Limitations of Science

If Mathematics is about the abstract, Science is about pragmatism. The major force in science is to explain and predict the various phenomena happening all around us, and – in the search for a grand unified theory of everything – one hopes that mankind will keep progressing towards a complete understanding of the universe. This gives me a lot of hope, but there is a big gorilla in Science that keeps saying ‘No, you can’t have that’. Yep, I’m talking about Quantum Mechanics.

Quantum Mechanics has been talked about and explained a million times already, so I’m not going to repeat the basics here.

We have often heard that light is both a wave and a particle, and other variations of that statement. The famous Heisenberg Uncertainty Principle says we cannot measure both the momentum and the position of a particle (like an electron) with arbitrary accuracy. You measure one precisely, and the other becomes uncertain (i.e. you won’t know what it is). If you are able to locate a particle, you cannot know how fast it is going, and if you can tell how fast it is going, you cannot tell where it is. Quantum phenomena express themselves in many, many ways. For example, if you have one atom of radioactive uranium (or radium), you cannot predict when it will disintegrate into smaller atoms. Sure, you know the half-life of uranium (i.e. if you have 100 atoms, you know the time period in which about 50 will spontaneously disintegrate), but you cannot tell which ones will disintegrate and when. It’s as if, just like humans, uranium atoms are bound by the laws of the universe but also have a mind of their own.
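A crude classical stand-in of my own (it uses pseudorandom numbers, so it is not real quantum behaviour) can at least illustrate the distinction: the ensemble follows the half-life very predictably, yet nothing says which particular atom goes next.

import random

atoms = 100_000
p_decay_per_tick = 0.05           # chosen so the half-life is about 13.5 ticks
for tick in range(1, 28):
    # Each surviving atom independently "decides" whether to decay this tick.
    atoms = sum(1 for _ in range(atoms) if random.random() > p_decay_per_tick)
    if tick in (14, 27):
        print(f"after {tick} ticks: {atoms} atoms left")   # roughly half, then a quarter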

The baffling aspect of quantum mechanics is not that we don’t have instruments precise enough to measure. The limitation is not in the instruments; it’s in the fundamental nature of reality. The universe does not, or cannot, let us know everything there is to know – and we don’t know why that is so.

Another limitation of science shows up in predictions about complex systems that depend on many variables. In theory, we say classical science is deterministic and that everything happens mechanically according to the laws of physics (which is the simple explanation offered by those who don’t believe in free will). However, is something really predictable if we can say it is theoretically predictable but cannot practically predict it? After all, just like NP problems, if we say that a solution exists but we can’t hope to find it, does the solution really exist?

Take the famous butterfly effect, which says that a butterfly flapping its wings in Australia can cause a tornado in the US. It sounds like an absurd oversimplification, but it points at something real: in chaotic systems, tiny differences in initial conditions get amplified into wildly different outcomes. If complex systems like weather, stock markets and economies depend on many factors with many variables (each affecting another), can we ever predict a future scenario with confidence? If we cannot, then the future is one big open problem which we can always guess at but never be certain about. If even a trivial long-range prediction requires more computing resources than are available in the universe, we should be confident in giving up on predicting that phenomenon and accept the limits of reason and science there.
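A tiny illustration of my own of that sensitivity, using the logistic map – a standard toy model of chaos, not a weather model: two starting values that differ by one part in a billion end up nowhere near each other after about fifty steps.

x, y = 0.400000000, 0.400000001   # differ by one billionth
for step in range(1, 51):
    # One step of the logistic map with r = 3.9 (a chaotic setting).
    x = 3.9 * x * (1 - x)
    y = 3.9 * y * (1 - y)
    if step in (10, 30, 50):
        print(f"step {step}: x = {x:.6f}, y = {y:.6f}")   # the two trajectories drift apart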

Which theories are true?

If you had been living in the year 1543 and were either associated with the Church or were a physicist other than Copernicus, your world would have turned upside down when it was postulated that the Earth revolves around the Sun. I think it’s hard to imagine now the disbelief Copernicus’ brave result must have caused. How could what you used to know with such certainty change overnight?

If you have read Popper’s version of the philosophy of science, you will know that scientific theories can never be proven; they can only be disproved. This relates directly to the issue of induction: if you have seen 100 white swans, how sure can you be that swans are always white? No matter how many swans you see, you will never be 100% sure, since just 1 black swan can disprove your theory.

It’s a little unfair to throw science under the bus just because we can’t be 100% sure, but I do find it interesting that with a new discovery our entire worldview could change at any moment in the future. That does put some limits on our confidence in knowing and understanding. We will always be a little bit uncertain about even the staunchest of our beliefs. (This is also called the Problem of Induction and was first made popular by the philosopher David Hume.)

The question of why

Alas, we encounter the biggest question of all: why does anything exist? Science, mathematics and logic can tell us what and how, but they fail miserably when the why question is asked. Why is there something rather than nothing? What breathes fire into the mathematical equations? Is there a God or not?

If these questions can only be answered on religious grounds, then rationalists can never hope to get a satisfying answer via reason and logic. I find this last frontier endlessly fascinating, and as the book The Beginning of Infinity hinted, perhaps these questions cannot be answered by reason but only with faith, beauty or aesthetics. If that is so, then poets, mystics and artists found the answer to this question long ago! But the scientist and skeptic in me is still not 100% convinced 🙂

PS: This post took me a couple of weekends, and since I rushed at the end, I skipped one chapter from the book, called Mathematical Obstructions. It’s an interesting chapter about the shock caused in the Greek world by the discovery of irrational numbers, and about various mathematical problems such as the tiling problem, Galois theory and Tarski’s theorem.

Thanks Aakanksha for reviewing this post. You can follow the books I read and recommend on my ShelfJoy page.
