A friend of mine and I recently had a conversation about the accumulation of knowledge and the limits of human information processing. He asked me whether my reading and research habits (which I’ll admit are a bit maniacal) were an example of cognitive overload that would lead to a bias against action.
This was a continuation of many similar discussions we’d had before about how brains compare to computers, and how to best use knowledge once it’s been acquired. Our most recent conversation came down to one root question: is knowledge accumulation worth our time if we have limited processing abilities?
The line of questioning grew out of a book he had been reading about how computer science can be applied to everyday life (which I have not read yet).
This was incidentally something I’d been thinking about and focusing on quite a bit, and my response morphed into an essay on its own. Since the nature of this conversation ended up taking on a form that could be generally useful, I’ve decided to edit/expand my original writing and turn it into a post here. Enjoy.
Q: Are we becoming computers with tons of knowledge and finite processing?
Although there are some useful analogies that can be gleaned from computer science, brains and computers represent two very different domains of operation and, as a result, the best way to use them depends heavily on the problem you’re trying to solve. On the one hand, yes, brains do have a finite amount of processing power. But, as you’ll see, this is not the end of the story.
Linearity vs. Nonlinearity
Computers operate within well-defined, highly deterministic environments. In general, if you want to solve a computer science problem, you go through this process:
- Come up with some well-defined inputs.
- Put together a transformation or set of transformations for those inputs.
- Generate outputs and check them against what you want the outputs to be.
- Repeat until you get the outputs you’re looking for.
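Here is a minimal sketch of that loop in Python. Everything in it is made up for illustration: the `solve` function, the toy inputs and the three candidate transformations are my own assumptions, not a description of how any real system is built.

```python
def solve(inputs, candidate_transforms, expected):
    """Try each candidate transformation until one yields the expected outputs."""
    for name, transform in candidate_transforms:
        outputs = [transform(x) for x in inputs]  # generate outputs
        if outputs == expected:                   # check them against what we want
            return name
    return None  # none of the candidates worked; time to come up with new ones


# Hypothetical, well-defined inputs and the outputs we are looking for.
inputs = [1, 2, 3]
expected = [2, 4, 6]

# Hypothetical candidate transformations, tried in order.
candidates = [
    ("add_one", lambda x: x + 1),
    ("square", lambda x: x * x),
    ("double", lambda x: x * 2),
]

result = solve(inputs, candidates, expected)
print(result)  # prints "double"
```

The point is only that every piece (inputs, transformation, check) is spelled out ahead of time, which is what makes the process so predictable.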
Although I’m simplifying to a large degree, this is in fact the base model for all operations within a computer. And because computers work this way, what you end up with are groups of linear systems. What’s a linear system? Glad you asked!
A linear system is a system that operates in a predictable, proportional manner. For example, let’s say you have a fleet of 10 trucks that can all move 10,000 pounds of cargo each. If you need to move 1,000,000 pounds of cargo from point A to point B, moving that cargo will require 100 total round trips, or 10 round trips per truck if all 10 trucks share the work.
It’s easy to understand because it’s predictable, and both inputs and outputs are clearly defined. You have a set amount of cargo that needs to be moved and a specific number of trucks that each have a physical limit for hauling cargo.
If you need to move more cargo, you can increase the number of trips, increase the number of trucks or upgrade the trucks. You can’t magically change that formula because it’s a linear relationship: each truck adds a specific amount of capacity, and each trip adds a specific amount of delivered cargo. One or zero, heads or tails, black or white.
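The truck arithmetic is simple enough to write down directly. This is just a worked example using the numbers above; the function names are my own, and I'm assuming every truck always runs fully loaded.

```python
def total_truck_trips(cargo_lb, capacity_lb):
    """Round trips needed in total, rounding up for any partial final load."""
    return -(-cargo_lb // capacity_lb)  # integer ceiling division


def round_trips_per_truck(cargo_lb, trucks, capacity_lb):
    """Round trips each truck makes if the fleet shares the work evenly."""
    return -(-total_truck_trips(cargo_lb, capacity_lb) // trucks)


total = total_truck_trips(1_000_000, 10_000)              # 100 round trips in total
per_truck = round_trips_per_truck(1_000_000, 10, 10_000)  # 10 round trips per truck
doubled = total_truck_trips(2_000_000, 10_000)            # 200: twice the cargo, twice the trips

print(total, per_truck, doubled)  # prints "100 10 200"
```

Doubling the cargo doubles the trips, and doubling the fleet halves the trips per truck. That proportionality is the whole story with a linear system.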
For all you fitness buffs out there, weightlifting can be seen in linear terms as well. Someone who can deadlift 500 pounds is twice as strong as someone with a deadlift max of 250 pounds. Again, this is an unambiguous, additive relationship: you add something to the system, and you get a predictable output. Twice as much weight equals twice as much strength.
Mathematicians and computer scientists prefer linear systems. It’s much easier to model things if they’re linear because the behavior exhibited by linear systems is so predictable. Need more of something? Add a piece here. Too much of something else? Just get rid of another piece there.
Adding to this is the fact that computers, at the lowest level, operate in a binary fashion. Everything that happens on a computer comes down to 1s and 0s, making computers excellent at handling problems that can be framed in black and white terms.
Nonlinear systems, on the other hand, are much more difficult to deal with or even reason about. A nonlinear system that many people have experience with is the stock market (I’m inclined to think of the US stock market because I live in the US, but this applies to any stock market).
At any given moment, there are millions of people participating within the market, and the behavior that emerges from all this interaction is largely unpredictable and lacking any kind of proportion.
For example, if a company misses an earnings estimate by even one cent, the response can be tremendously negative. Large percentage moves can result, industry “experts” might start warning about the company’s impending downfall and a feedback loop forms that turns all the negative press into a self-fulfilling prophecy.
Meanwhile, other companies may beat earnings expectations and see their stocks go down, stay neutral or go up only slightly. A few might even see their stock prices soar. Financial news outlets will almost always print some kind of reason for what’s happening in these types of situations, but the reality is that nobody really knows exactly why these sorts of movements occur.
There are so many actors interacting within the system that untangling their motivations and figuring out how those motivations figure into large-scale market behavior is all but impossible.
This sort of ambiguity makes mathematicians and computer scientists very uncomfortable, because modeling this sort of system is difficult, or even impossible in some circumstances.
It also makes building computer systems for solving ambiguous real-world problems very hard. How do you model something so unpredictable and, more importantly, how can you create computer programs that accurately predict (and deal with) the behavior of this type of system?
Nobody has a clear answer to this question, although some people in the tech world (most notably Ray Kurzweil and his followers) believe that our most important nonlinear questions will be answered by AI. In the meantime, we tend to utilize computers for linear problems.
It’s much easier to calculate the trajectory of a rocket launch than it is to predict the weather, because we understand physics much better than we understand the complex, nonlinear interactions of weather systems.
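To get a feel for how differently the two kinds of systems treat a tiny change in input, here is a small sketch. It is a toy, not a model of markets or weather: I'm using the logistic map (a standard textbook example of a nonlinear system) as a stand-in, and the starting value, the size of the nudge and the number of steps are arbitrary assumptions.

```python
def linear_step(x, a=0.5):
    """A linear update: the output is always proportional to the input."""
    return a * x


def logistic_step(x, r=4.0):
    """A nonlinear update (the logistic map), chaotic for r = 4."""
    return r * x * (1.0 - x)


def max_gap(step, x0, nudge, steps=50):
    """Largest distance between two runs whose starting points differ by `nudge`."""
    a, b = x0, x0 + nudge
    largest = abs(a - b)
    for _ in range(steps):
        a, b = step(a), step(b)
        largest = max(largest, abs(a - b))
    return largest


nudge = 1e-7  # a tiny difference in the starting value
linear_gap = max_gap(linear_step, 0.2, nudge)
logistic_gap = max_gap(logistic_step, 0.2, nudge)

print(f"linear gap:   {linear_gap:.2e}")
print(f"logistic gap: {logistic_gap:.2e}")
```

In the linear case the two runs never drift further apart than the original nudge. In the nonlinear case the same nudge grows until the two runs have essentially nothing to do with each other, which is the kind of behavior that makes long-range prediction so hard.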
What’s funny about even comparing the two types of systems is that nonlinear systems are often treated by mathematicians as oddities, even though the real world is mostly filled with nonlinear systems.
Mathematician Stanislaw Ulam was opposed to even using the term “nonlinear system” because such systems are so ubiquitous, and famously quipped that “Using a term like nonlinear science is like referring to the bulk of zoology as the study of non-elephant animals.”
Brains vs. Computers
By and large computers operate in a very linear, predictable fashion. Information is processed according to extremely well-defined algorithms (from the motherboard up). It’s also stored in a very simplistic, linear manner: once a bit is stored in one place, you can count on it to be there until something physically alters the medium it is being held in or some other operation purposefully changes it. Retrieving that information is a matter of using an algorithm to find it. There is very little ambiguity. The information is either there or it’s not.
There’s also a hard limit on what can be stored in a computer. It may be very large (some computers have petabytes worth of space), but at no point will you be able to continue storing anything beyond that limit without adding more storage to the system. Think back to the truck example I outlined before and you’ll see how this is yet another manifestation of linearity.
Brains, while possessing some qualities that line up with how computers operate – namely that they both process information – are a totally different animal. Exact details about how the brain processes information are still largely mysterious, but we do know a few things about it as of right now:
- Processing is nonlinear. A small sensory input (even a very noisy or seemingly irrelevant one) to a brain can produce a large, nontrivial output. Likewise, seemingly significant inputs can produce little or no reaction. If you’ve ever had a “Eureka!” moment of insight that was produced by a seemingly trivial cue – a bit of music, the taste of a cookie – then you have experienced your brain’s nonlinear processing.
- Long-term memory capacity is nearly infinite. Information needs to be properly encoded in order for reliable retrieval to take place, but once it’s in your memory banks, it will be accessible for a very long time. Memories are also stored in a distributed fashion, giving them an incredibly durable nature that is highly resistant to total destruction (in contrast to catastrophic interference, the abrupt overwriting of old information by new learning that can affect simple artificial neural networks). There are some downsides to this – namely that memories can be distorted in the encoding and recall processes – but it also means that memories are rarely completely gone. Keep this point in mind, as it will come into focus later.
- Our recall abilities are tremendous. The downside of massive storage capacity is, as you alluded to in your question, trying to find specific bits of information. Once again, nonlinearity comes into the picture. Although recall (like most operations in the brain) is not fully understood, one popular model is a process called spreading activation, which involves networks of semantic chunks being activated by retrieval cues. See the picture below for reference. This gives us a strange sort of hack for finding information in our memories: if you can associate ideas with specific cues (images, sounds, smells, etc.), then you can rapidly access large amounts of information. It’s even faster if you can internalize structure for information, such as creating a hierarchy with general concepts at the top and increasingly detailed chunks as you move downward.
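To make the spreading activation idea a little more concrete, here is a heavily simplified sketch. The little network of associations, the `decay` factor and the `threshold` are all hypothetical values I picked for illustration; real models of memory are far richer than this.

```python
def spread_activation(graph, cue, decay=0.5, threshold=0.1):
    """Spread activation outward from a cue, weakening by `decay` at each hop."""
    activation = {cue: 1.0}
    frontier = [cue]
    while frontier:
        next_frontier = []
        for node in frontier:
            passed = activation[node] * decay
            if passed < threshold:
                continue  # too weak to activate anything further
            for neighbor in graph.get(node, []):
                # Keep only the strongest activation that reaches each chunk.
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


# Hypothetical network of semantic chunks: each chunk lists its associations.
graph = {
    "cookie": ["kitchen", "childhood"],
    "kitchen": ["grandmother"],
    "childhood": ["grandmother", "school"],
    "grandmother": ["garden"],
    "school": [],
    "garden": [],
}

activation = spread_activation(graph, "cookie")
for chunk, level in sorted(activation.items(), key=lambda item: -item[1]):
    print(f"{chunk}: {level}")
```

A single cue ("cookie") ends up lighting up chunks that are several associations away, with closer associations activated more strongly than distant ones. That is the intuition behind tying ideas to specific cues.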
We do have a limited ability to process, yet our brains have evolved ways to compensate for that, ways that make them far superior to computers in some important respects. We can overcome many of our biological handicaps by understanding and exploiting the nature of our brains.
Using Brains vs. Using Computers
From everything I’ve just described, we can deduce that brains are nonlinear information processing machines and computers are linear information processing machines. So what?
Here’s where things get interesting. Computers offer us the ability to do all kinds of things very quickly and to store massive amounts of information that we can safely leave out of our own brains.
Computers can do math billions of times better than us, send messages across thousands of miles in less than a second and generally take care of tasks that our brains can’t match when it comes to speed and/or scale.
Need to compute the movements of planetary bodies? How about airflow over the wing of an airplane that hasn’t been built yet? Or perhaps you simply need to do a large calculation to finish your math homework. Computers are well-suited to all of those tasks.
Where computers fall tremendously short, however, is using all information to generate useful novelty.
What I mean by “useful novelty” is fairly straightforward: new things that humans can use to solve problems. Computers are very good at coming up with useless novelty. It’s not difficult to get a computer to generate random pixels, for example. But if you told a computer to solve something truly complex, such as world hunger, the computer wouldn’t even know where to begin.
To begin with, you’d need to create a linear, deterministic definition of “world.” You could probably just cram a bunch of geographic, political and economic data into a model that could suffice as “the world,” although you’d be leaving out quite a bit of detail.
But that would be the easy part. After this first step, you’d need to really put on your thinking cap and come up with a linear, deterministic way to describe factors like geopolitics, resource scarcity, war, drought, and so on. It’s so far beyond what a computer can currently do because the descriptive phase alone would overwhelm any current computer system so quickly that implementation would be completely out of the question.
Even if you accommodated the descriptive phase of this project, you’d immediately run into a (currently insurmountable) roadblock because you’d need to be able to accurately describe and predict what happens when all these factors interact.
[Image: What happened when one computer attempted to solve world hunger.]
This happens because computers run into problems with the defining characteristic of nonlinear systems: interaction between components. For example, think about the difference between a single neuron and the roughly 86 billion neurons that make up your brain. It’s actually pretty easy to describe and understand a single neuron.
There’s a head (soma) with branches (dendrites) for receiving information and a tail (axon) for sending information to other neurons. This is a gross simplification, of course, but what I’m leaving out is mostly just a matter of biological specifics – it’s not unknowable. If you go pick up a neuroscience textbook, you can learn the exact structure of a neuron pretty quickly.
While understanding a single neuron is straightforward, things get fuzzy as soon as you start looking at groups of interacting neurons.
There is nothing that hints at emergent phenomena like consciousness when you look at one neuron in isolation. Yet somehow it pops up as a result of large networks of neurons firing in seemingly random patterns. This is known as emergence, and it’s everywhere in nature.
Any time you combine large numbers of components (even simple ones) and let them interact, you can get emergent behavior. Your conscious experience of the world is just one example of a nonlinear, emergent phenomenon, and it fits into an entire class of problems that is currently beyond our ability to solve.
The complex nature of stock markets that we considered earlier is another real-world example of emergence and the problems it creates for analysis.
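You can watch a toy version of emergence happen with a few lines of code. The sketch below uses an elementary cellular automaton (Rule 30, a well-known example from the study of simple programs), which I'm bringing in purely as an illustration. It says nothing about neurons or markets specifically, and the grid width and number of generations are arbitrary choices.

```python
def step(cells, rule=30):
    """Apply an elementary cellular automaton rule once (cells past the edges count as 0)."""
    padded = [0] + cells + [0]
    new_cells = []
    for i in range(1, len(padded) - 1):
        # Each cell looks only at itself and its two immediate neighbors.
        index = padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1]
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(width=31, generations=15, rule=30):
    """Start from a single live cell in the middle and record every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


history = run()
for row in history:
    print("".join("#" if cell else "." for cell in row))
```

Each cell follows a trivial rule that only involves its immediate neighbors, yet the printed pattern quickly becomes irregular and hard to predict without actually running it. Nothing in the rule for a single cell hints at the large-scale pattern.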
And yet, for all our foibles, computers lag severely behind us when it comes to describing and understanding our nonlinear world. You can feed a computer loads of data, set up sophisticated learning systems that utilize the latest deep learning whiz-bang technology, and it still won’t give you anything other than noise.
The only time we’ve been able to really get computers to do our bidding in a successful way has been when we give them very narrow domains to operate in, and the task at hand cannot be done (or cannot be done efficiently) by a human mind. Calculating numbers rapidly is simple for a computer, but once you start feeding it very human, ambiguous concepts, it breaks down quickly.
What Sets Us Apart
My primary message, after all of that text you just read, is this: creativity is largely the distinguishing factor between computers and people. Computers lack it entirely, and nearly every person possesses it in droves – whether they know it or not. We can look at disparate facts, ideas and abstract thoughts and come up with novel ideas.
Most of the time we’re just creating models, but, as George Box famously put it, “All models are wrong, but some are useful.” More importantly, we have the ability to use our current models to move on to whatever the next evolution of understanding is.
We can update and reshape models to reflect new information, or destroy them entirely and replace them with either an entirely new type of model or, better yet, a true solution.
We can look at a collection of pictures and almost instantly know which ones are cats. Seeing specific types of cats can trigger a cascade of other information tidbits, such as what the breed is, what your first cat was like, and so on.
A computer can be trained to identify what a cat is, yet its accuracy will be far below our level, it will need millions upon millions of examples in order to pick up on some patterns, and, at the end of the day, all it will really be able to do is identify cats.
Its ability to generalize from there is nonexistent – it isn’t going to actually know anything about cats except a rough idea of what they look like. Concepts like fur, meowing, purring and behavior don’t exist to the computer.
To go back to our “world hunger” example earlier, you can feed a computer boatloads of data about each part of the problem – income disparities, frequency of violent conflict, etc. – and all the computer can shoot back is more numbers. A computer cannot currently look at all the data and reply with “Here’s a step-by-step process for eliminating this problem…” and if it did, the answers would probably be unintelligible.
There is some evidence that creativity is correlated with long-term memory (paywall). Long-term memory is the most malleable of the memory registers, so a person with a desire to improve their LTM can do so with enough time and deliberate effort (something that you can do by following the system laid out in my book). The more effectively encoded informational chunks exist within LTM, the more creative a person can potentially be.
But it isn’t just about filling out LTM with information. Once we have a large number of organized chunks related to a large number of domains, the real skill that becomes important is determining relationships at both the chunk and domain levels. You need to be able to see how things are connected to one another, and, even more importantly, what emerges when those concepts are combined. Most great innovations have sprung up from combining existing concepts in novel ways.
For example, the Wright brothers combined the aerodynamic expertise of early glider pilots like Otto Lilienthal with combustion engines, which were designed for ground transportation. Neither of these was invented by the Wright brothers, but when they combined them they created something novel.
Knowing this, our great task becomes determining where useful connections are not being made across domains and figuring out how to make them. And the only way to do this in a meaningful way is to build up our base of knowledge via intellectual exploration.
Our problem ends up being an example of the exploration-exploitation dilemma: deciding whether to continue using one option whose payoff is known and understood or to explore potentially better options at the risk of ending up with a payout that is lower than the currently exploited path. In other words, do you keep going down a known successful path, or take a risk by exploring other paths – which may or may not be more rewarding?
My overall answer to this problem is that you cannot find a reasonably suitable path to exploit without first exploring at a nontrivial level. Over time, the impulse to explore needs to go down substantially and a behavior pattern of mostly exploitation should be the end result.
But exploration should never cease completely, and in the beginning stages (where we currently find ourselves) many different avenues should be explored. Breadth of knowledge is crucial for the process of connection that I described, and we cannot shy away from that just because it isn’t profitable upfront.
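One common way to sketch this pattern in code is an "epsilon-greedy" strategy with a decaying exploration rate. To be clear, this is a standard textbook technique that I'm borrowing as an analogy, not a model of how people actually learn. The three "paths", their payoffs and every parameter below are hypothetical.

```python
import random


def run_bandit(true_payoffs, steps=2000, eps_start=1.0, eps_floor=0.05,
               decay=0.995, seed=0):
    """Epsilon-greedy choice among paths: explore a lot early, mostly exploit later."""
    rng = random.Random(seed)
    n = len(true_payoffs)
    counts = [0] * n        # how often each path has been tried
    estimates = [0.0] * n   # running average payoff seen on each path
    epsilon = eps_start
    for _ in range(steps):
        if rng.random() < epsilon:
            path = rng.randrange(n)                           # explore at random
        else:
            path = max(range(n), key=lambda i: estimates[i])  # exploit best known path
        reward = rng.gauss(true_payoffs[path], 1.0)           # noisy payoff
        counts[path] += 1
        estimates[path] += (reward - estimates[path]) / counts[path]
        # Exploration shrinks over time but never drops below the floor.
        epsilon = max(eps_floor, epsilon * decay)
    return counts, estimates


# Hypothetical average payoffs for three paths, unknown to the learner up front.
counts, estimates = run_bandit([1.0, 2.0, 5.0])
print("times each path was taken:", counts)
print("estimated payoffs:", [round(e, 2) for e in estimates])
```

The `eps_floor` parameter is the code version of "exploration should never cease completely": even after the learner has settled on a favorite path, it keeps spending a small share of its time checking the alternatives.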
Bottom line: In order to figure out what to exploit, a large base of information in LTM needs to be present so creative solutions can be identified and later used for maximum reward.
Furthermore, it should be viewed not as a one-time process, but rather a cycle that needs to be repeated throughout life in order to stay ahead of opposing forces. Computers are not able to handle even the first stages of this process, and if/when they do, then we will have likely achieved true artificial intelligence.
P.S. If you think you’re too old to engage in this process, you’re wrong.