Splendid. At last, a coherent response to the question I've been asking since we first discussed Tononi's Integrated Information Theory (more than a year ago, I think). I copied out some extracts into a Word doc but can't at the moment get my Word program to cooperate. So in the meantime, I'll c&p the following piece from farther down that page Steve linked for us:
"This past Thursday, Natalie Wolchover—a math/science writer whose work has typically been outstanding—published a piece in
Quanta magazine entitled
“A Theory of Reality as More Than the Sum of Its Parts.” The piece deals with recent work by
Erik Hoel and his collaborators, including
Giulio Tononi (Hoel’s adviser, and the founder of
integrated information theory, previously
critiqued on this blog). Commenter Jim Cross
asked me to expand on my thoughts about causal emergence in a blog post, so:
your post, monsieur.
In their new work, Hoel and others claim to make the amazing discovery that scientific reductionism is false—or, more precisely, that there can exist “causal information” in macroscopic systems, information relevant for predicting the systems’ future behavior, that’s not reducible to causal information about the systems’ microscopic building blocks. For more about what we’ll be discussing, see Hoel’s FQXi essay “Agent Above, Atom Below,” or better yet, his paper in Entropy, “When the Map Is Better Than the Territory.” Here’s the abstract of the Entropy paper:
"The causal structure of any system can be analyzed at a multitude of spatial and temporal scales. It has long been thought that while higher scale (macro) descriptions may be useful to observers, they are at best a compressed description and at worst leave out critical information and causal relationships. However, recent research applying information theory to causal analysis has shown that the causal structure of some systems can actually come into focus and be more informative at a macroscale. That is, a macroscale description of a system (a map) can be more informative than a fully detailed microscale description of the system (the territory). This has been called “causal emergence.” While causal emergence may at first seem counterintuitive, this paper grounds the phenomenon in a classic concept from information theory: Shannon’s discovery of the channel capacity. I argue that systems have a particular causal capacity, and that different descriptions of those systems take advantage of that capacity to various degrees. For some systems, only macroscale descriptions use the full causal capacity. These macroscales can either be coarse-grains, or may leave variables and states out of the model (exogenous, or “black boxed”) in various ways, which can improve the efficacy and informativeness via the same mathematical principles of how error-correcting codes take advantage of an information channel’s capacity. The causal capacity of a system can approach the channel capacity as more and different kinds of macroscales are considered. Ultimately, this provides a general framework for understanding how the causal structure of some systems cannot be fully captured by even the most detailed microscale description."
Anyway, Wolchover’s popular article quoted various researchers praising the theory of causal emergence, as well as a single inexplicably curmudgeonly skeptic—some guy who sounded like he was so off his game (or maybe just bored with debates about ‘reductionism’ versus ‘emergence’?) that he couldn’t even be bothered to engage the details of what he was supposed to be commenting on.
"Hoel’s ideas do not impress Scott Aaronson, a theoretical computer scientist at the University of Texas, Austin. He says causal emergence isn’t radical in its basic premise. After reading Hoel’s recent essay for the Foundational Questions Institute, “Agent Above, Atom Below” (the one that featured Romeo and Juliet), Aaronson said, “It was hard for me to find anything in the essay that the world’s most orthodox reductionist would disagree with. Yes, of course you want to pass to higher abstraction layers in order to make predictions, and to tell causal stories that are predictively useful — and the essay explains some of the reasons why.”"
After the Quanta piece came out, Sean Carroll tweeted approvingly about the above paragraph, calling me a “voice of reason [yes, Sean; have I ever not been?], slapping down the idea that emergent higher levels have spooky causal powers.” Then Sean, in turn, was criticized for that remark by Hoel and others.
Hoel in particular raised a reasonable-sounding question. Namely, in my “curmudgeon paragraph” from Wolchover’s article, I claimed that the notion of “causal emergence,” or causality at the macro-scale, says nothing fundamentally new. Instead it simply reiterates the usual worldview of science, according to which
1. the universe is ultimately made of quantum fields evolving by some Hamiltonian, but
2. if someone asks (say) “why has air travel in the US gotten so terrible?”, a useful answer is going to talk about politics or psychology or economics or history rather than the movements of quarks and leptons.
But then, Hoel asks, if there’s nothing here for the world’s most orthodox reductionist to disagree with, then how do we find Carroll and other reductionists … err, disagreeing?
I think this dilemma is actually not hard to resolve. Faced with a claim about “causation at higher levels,” what reductionists disagree with is not the object-level claim that such causation exists (I scratched my nose because it itched, not because of the Standard Model of elementary particles). Rather, they disagree with the meta-level claim that there’s anything shocking about such causation, anything that poses a special difficulty for the reductionist worldview that physics has held for centuries. I.e., they consider it true both that
1. my nose is made of subatomic particles, and its behavior is in principle fully determined (at least probabilistically) by the quantum state of those particles together with the laws governing them, and
2. my nose itched.
At least if we leave the hard problem of consciousness out of it—that’s a separate debate—there seems to be no reason to imagine a contradiction between 1 and 2 that needs to be resolved, but “only” a vast network of intervening mechanisms to be elucidated. So, this is how it is that reductionists can find anti-reductionist claims to be both wrong and vacuously correct at the same time.
(Incidentally, yes, quantum entanglement provides an obvious sense in which “the whole is more than the sum of its parts,” but even in quantum mechanics, the whole isn’t more than the density matrix, which is still a huge array of numbers evolving by an equation, just different numbers than one would’ve thought a priori. For that reason, it’s not obvious what relevance, if any, QM has to reductionism versus anti-reductionism. In any case, QM is not what Hoel invokes in his causal emergence theory.)
From reading the philosophical parts of Hoel’s papers, it was clear to me that some remarks like the above might help ward off the forehead-banging confusions that these discussions inevitably provoke. So standard-issue crustiness is what I offered Natalie Wolchover when she asked me, not having time on short notice to go through the technical arguments.
But of course this still leaves the question: what is in the mathematical part of Hoel’s Entropy paper? What exactly is it that the advocates of causal emergence claim provides a new argument against reductionism?
To answer that question, yesterday I (finally) read the Entropy paper all the way through.
Much like Tononi’s integrated information theory was built around a numerical measure called Φ, causal emergence is built around a different numerical quantity, this one supposed to measure the amount of “causal information” at a particular scale. The measure is called effective information or EI, and it’s basically the mutual information between a system’s initial state sI and its final state sF, assuming a uniform distribution over sI. Much like with Φ in IIT, computations of this EI are then used as the basis for wide-ranging philosophical claims—even though EI, like Φ, has aspects that could be criticized as arbitrary, and as not obviously connected with what we’re trying to understand.
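{To make the definition concrete, here is a minimal sketch in Python of EI as just described: the mutual information between sI and sF with sI forced to be uniform. This is my own illustration, not code from Hoel's paper; the function name and the transition-matrix representation are assumptions.}

```python
import numpy as np

def effective_information(tpm):
    """Effective information (EI) of a discrete system.

    tpm[i, j] = P(s_F = j | s_I = i).  EI is the mutual information
    I(s_I; s_F) computed with s_I forced to the uniform distribution.
    """
    tpm = np.asarray(tpm, dtype=float)
    n = tpm.shape[0]
    p_init = np.full(n, 1.0 / n)                 # uniform "intervention" over s_I
    joint = p_init[:, None] * tpm                # P(s_I = i, s_F = j)
    p_final = joint.sum(axis=0)                  # marginal distribution of s_F
    prod = p_init[:, None] * p_final[None, :]    # product of the marginals
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))
```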
Once again like with Φ, one of those arbitrary aspects is the assumption of a uniform distribution over one of the variables, sI, whose relatedness we’re trying to measure. In my IIT post, I remarked on that assumption, but I didn’t harp on it, since I didn’t see that it did serious harm, and in any case my central objection to Φ would hold regardless of which distribution we chose. With causal emergence, by contrast, this uniformity assumption turns out to be the key to everything.
For here is the argument from the Entropy paper, for the existence of macroscopic causality that’s not reducible to causality in the underlying components. Suppose I have a system with 8 possible states (called “microstates”), which I label 1 through 8. And suppose the system evolves as follows: if it starts out in states 1 through 7, then it goes to state 1. If, on the other hand, it starts in state 8, then it stays in state 8. In such a case, it seems reasonable to “coarse-grain” the system, by lumping together initial states 1 through 7 into a single “macrostate,” call it A, and letting the initial state 8 comprise a second macrostate, call it B.

We now ask: how much information does knowing the system’s initial state tell you about its final state? If we’re talking about microstates, and we let the system start out in a uniform distribution over microstates 1 through 8, then 7/8 of the time the system goes to state 1. So there’s just not much information about the final state to be predicted—specifically, only 7/8 × log2(8/7) + 1/8 × log2(8) ≈ 0.54 bits of entropy—which, in this case, is also the mutual information between the initial and final microstates. If, on the other hand, we’re talking about macrostates, and we let the system start in a uniform distribution over macrostates A and B, then A goes to A and B goes to B. So knowing the initial macrostate gives us 1 full bit of information about the final state, which is more than the ~0.54 bits that looking at the microstate gave us! Ergo reductionism is false.
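{Both numbers can be checked directly. A minimal sketch, mine rather than the paper's, with the two transition matrices written out explicitly; mutual_info is the same mutual-information calculation as above, just with the initial distribution as an explicit argument:}

```python
import numpy as np

def mutual_info(p_init, tpm):
    """I(s_I; s_F) for initial distribution p_init and
    transition matrix tpm[i, j] = P(s_F = j | s_I = i)."""
    joint = p_init[:, None] * tpm
    p_final = joint.sum(axis=0)
    prod = p_init[:, None] * p_final[None, :]
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))

# Microscale: states 1 through 7 all go to state 1; state 8 stays at 8.
micro = np.zeros((8, 8))
micro[0:7, 0] = 1.0
micro[7, 7] = 1.0
print(mutual_info(np.full(8, 1 / 8), micro))   # ~0.5436 bits

# Macroscale: A = {1,...,7} goes to A, B = {8} goes to B.
macro = np.eye(2)
print(mutual_info(np.full(2, 1 / 2), macro))   # 1.0 bit
```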
Once the argument is spelled out, it’s clear that the entire thing boils down to, how shall I put this, a normalization issue. That is: we insist on the uniform distribution over microstates when calculating microscopic EI, and we also insist on the uniform distribution over macrostates when calculating macroscopic EI, and we ignore the fact that the uniform distribution over microstates gives rise to a non-uniform distribution over macrostates, because some macrostates can be formed in more ways than others. If we fixed this, demanding that the two distributions be compatible with each other, we’d immediately find that, surprise, knowing the complete initial microstate of a system always gives you at least as much power to predict the system’s future as knowing a macroscopic approximation to that state. (How could it not? For given the microstate, we could in principle compute the macroscopic approximation for ourselves, but not vice versa.)
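{The fix amounts to one changed line in the sketch above: replace the uniform macro-distribution with the one induced by coarse-graining the uniform micro-distribution, namely (7/8, 1/8) over (A, B). The macroscale figure then drops back to roughly 0.5436 bits, no better than the microscale, as the argument in this paragraph predicts:}

```python
import numpy as np

def mutual_info(p_init, tpm):
    """I(s_I; s_F); same illustrative helper as in the sketch above."""
    joint = p_init[:, None] * tpm
    p_final = joint.sum(axis=0)
    prod = p_init[:, None] * p_final[None, :]
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))

macro = np.eye(2)                    # A -> A, B -> B
induced = np.array([7 / 8, 1 / 8])   # uniform over microstates, then coarse-grained
print(mutual_info(induced, macro))   # ~0.5436 bits: the macro "advantage" vanishes
```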
The closest the paper comes to acknowledging the problem—i.e., that it’s all just a normalization trick—seems to be the following paragraph in the discussion section:
"Another possible objection to causal emergence is that it is not natural but rather enforced upon a system via an experimenter’s application of an intervention distribution, that is, from using macro-interventions. For formalization purposes, it is the experimenter who is the source of the intervention distribution, which reveals a causal structure that already exists. Additionally, nature itself may intervene upon a system with statistical regularities, just like an intervention distribution. Some of these naturally occurring input distributions may have a viable interpretation as a macroscale causal model (such as being equal to Hmax [the maximum entropy] at some particular macroscale). In this sense, some systems may function over their inputs and outputs at a microscale or macroscale, depending on their own causal capacity and the probability distribution of some natural source of driving input.'
As far as I understand it, this paragraph is saying that, for all we know, something could give rise to a uniform distribution over macrostates, so therefore that’s a valid thing to look at, even if it’s not what we get by taking a uniform distribution over microstates and then coarse-graining it. Well, OK, but unknown interventions could give rise to many other distributions over macrostates as well. In any case, if we’re directly comparing causal information at the microscale against causal information at the macroscale, it still seems reasonable to me to demand that in the comparison, the macro-distribution arise by coarse-graining the micro one. But in that case, the entire argument collapses.
Despite everything I said above, the real purpose of this post is to announce that I’ve changed my mind. I now believe that, while Hoel’s argument might be unsatisfactory, the conclusion is fundamentally correct: scientific reductionism is false. There is higher-level causation in our universe, and it’s 100% genuine, not just a verbal sleight-of-hand. In particular, there are causal forces that can only be understood in terms of human desires and goals, and not in terms of subatomic particles blindly bouncing around.
So what caused such a dramatic conversion?
By 2015, after decades of research and diplomacy and activism and struggle, 196 nations had finally agreed to limit their carbon dioxide emissions—every nation on earth besides Syria and Nicaragua, and Nicaragua only because it thought the agreement didn’t go far enough. The human race had thereby started to carve out some sort of future for itself, one in which the oceans might rise slowly enough that we could adapt, and maybe buy enough time until new technologies were invented that changed the outlook. Of course the Paris agreement fell far short of what was needed, but it was a start, something to build on in the coming decades. Even in the US, long the hotbed of intransigence and denial on this issue, 69% of the public supported joining the Paris agreement, compared to a mere 13% who opposed. Clean energy was getting cheaper by the year. Most of the US’s largest corporations, including Google, Microsoft, Apple, Intel, Mars, PG&E, and ExxonMobil—ExxonMobil, for godsakes—vocally supported staying in the agreement and working to cut their own carbon footprints. All in all, there was reason to be cautiously optimistic that children born today wouldn’t live to curse their parents for having brought them into a world so close to collapse.
In order to unravel all this, in order to steer the heavy ship of destiny off the path toward averting the crisis and toward the path of existential despair, a huge number of unlikely events would need to happen in succession, as if propelled by some evil supernatural force.
Like what? I dunno, maybe a fascist demagogue would take over the United States on a campaign based on willful cruelty, on digging up and burning dirty fuels just because and even if it made zero economic sense, just for the fun of sticking it to liberals, or because of the urgent need to save the US coal industry, which employs fewer people than Arby’s. Such a demagogue would have no chance of getting elected, you say?
So let’s suppose he’s up against a historically unpopular opponent. Let’s suppose that even then, he still loses the popular vote, but somehow ekes out an Electoral College win. Maybe he gets crucial help in winning the election from a hostile foreign power—and for some reason, pro-American nationalists are totally OK with that, even cheer it. Even then, we’d still probably need a string of additional absurd coincidences. Like, I dunno, maybe the fascist’s opponent has an aide who used to be married to a guy who likes sending lewd photos to minors, and investigating that guy leads the FBI to some emails that ultimately turn out to mean nothing whatsoever, but that the media hyperventilate about precisely in time to cause just enough people to vote to bring the fascist to power, thereby bringing about the end of the world. Something like that.
It’s kind of like, you know that thing where the small population in Europe that produced Einstein and von Neumann and Erdös and Ulam and Tarski and von Karman and Polya was systematically exterminated (along with millions of other innocents) soon after it started producing such people, and the world still hasn’t fully recovered? How many things needed to go wrong for that to happen? Obviously you needed Hitler to be born, and to survive the trenches and assassination plots; and Hindenburg to make the fateful decision to give Hitler power. But beyond that, the world had to sleep as Germany rebuilt its military; every last country had to turn away refugees; the UK had to shut down Jewish immigration to Palestine at exactly the right time; newspapers had to bury the story; government record-keeping had to have advanced just to the point that rounding up millions for mass murder was (barely) logistically possible; and finally, the war had to continue long enough for nearly every European country to have just enough time to ship its Jews to their deaths, before the Allies showed up to liberate mostly the ashes.
In my view, these simply aren’t the sort of outcomes that you expect from atoms blindly interacting according to the laws of physics. These are, instead, the signatures of higher-level causation—and specifically, of a teleological force that operates in our universe to make it distinctively cruel and horrible.
Admittedly, I don’t claim to know the exact mechanism of the higher-level causation. Maybe, as the physicist Yakir Aharonov has advocated, our universe has not only a special, low-entropy initial state at the Big Bang, but also a “postselected final state,” toward which the outcomes of quantum measurements get mysteriously “pulled”—an effect that might show up in experiments as ever-so-slight deviations from the Born rule. And because of the postselected final state, even if the human race naïvely had only (say) a one-in-a-thousand chance of killing itself off, even if the paths to its destruction all involved some improbable absurdity, like an orange clown showing up from nowhere—nevertheless, the orange clown would show up. Alternatively, maybe the higher-level causation unfolds through subtle correlations in the universe’s initial state, along the lines I sketched in my 2013 essay “The Ghost in the Quantum Turing Machine.”
Or maybe Erik Hoel is right after all, and it all comes down to normalization: if we looked at the uniform distribution over macrostates rather than over microstates, we’d discover that orange clowns destroying the world predominated. Whatever the details, though, I think it can no longer be doubted that we live, not in the coldly impersonal universe that physics posited for centuries, but instead in a tragicomically evil one.
I call my theory “reverse Hollywoodism,” because it holds that the real world has the inverse of the typical Hollywood movie’s narrative arc. Again and again, what we observe is that the forces of good have every possible advantage, from money to knowledge to overwhelming numerical superiority. Yet somehow good still fumbles. Somehow a string of improbable coincidences, or a black swan or an orange Hitler, shows up at the last moment to let horribleness eke out a last-minute victory, as if the world itself had been rooting for horribleness all along.
That’s our universe.
I’m fine if you don’t believe this theory: maybe you’re congenitally more optimistic than I am (in which case, more power to you); maybe the full weight of our universe’s freakish awfulness doesn’t bear down on you as it does on me. But I hope you’ll concede that, if nothing else, this theory is a genuinely non-reductionist one."
{A blind third alternative (given my general ignorance of mathematics and quantum mechanics): is it possible that in our little part of the cosmos, where life has evolved to a certain degree of intelligence, humanly generated ideas in the humanly lived macrosphere also exist in fragile temporary superpositions that [for some reason or reasons] fall apart completely -- decohere? What better concept than decoherence could we arrive at to describe the human world we've constructed upon the prolific ground of earth? If so, our species might well expect to revert to an earlier stage of life on earth, making way for another effort by nature to evolve intelligent beings, perhaps producing a species capable of learning how to live sensibly and cooperatively, even justly, within the constraints of our natural ecology. So I can't put it down, as the author, Scott Aaronson, does, to "our universe’s freakish awfulness." Why has our species failed to assume its proper role as 'shepherds of being' in Heidegger's later philosophy? That's something we could try to figure out.}