Book review: “The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race” by Walter Isaacson

Scientists analyzing DNA helix and editing genome within organisms, CRISPR technology.

Long gone are the days when genetically engineered humans existed solely in science fiction. Mary Shelley’s “Frankenstein: or, The Modern Prometheus,” H.G. Wells’ “The Time Machine,” and Aldous Huxley’s “Brave New World” all imagined human-like products of scientific experiment and asked how humanity might change once it held the power of genetic engineering. Over the past few decades, those fictional ideas have gradually become more real. Like a story unfolding, American biochemist Jennifer Doudna, a hero behind the Crispr-Cas9 technique, takes center stage.

In “The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race,” the famous historian Walter Isaacson follows the steps of Doudna on her journey to uncover the secrets of the genetic code and the moral questions resulting from tampering with human life. 

Having established himself thoroughly as a historian and journalist with his biographies of Albert Einstein, Steve Jobs, Benjamin Franklin, and other men, Isaacson now shares the story of Doudna, one of the women behind the pioneering work in the field of gene editing.

Her technique relies on Crispr, an acronym for “clustered regularly interspaced short palindromic repeats,” sequences found in the DNA of bacteria. With the associated protein Cas9, an enzyme that uses Crispr-derived guide sequences to find and cut specific stretches of DNA, Doudna’s Crispr-Cas9 technique lets scientists edit parts of the genome with precision and versatility, in vivo, right in place in a living organism.

In the hands of capable scientists like Doudna and her team at Berkeley, Crispr-Cas9 could, they hope, “engineer inheritable edits in humans that will make our descendants less vulnerable to virus infections.” Such a powerful technology could change the human race as we know it.

Coming at no more opportune time than the pandemic, the push to create effective vaccines for COVID-19 has underscored the need for technologies like Crispr-Cas9. And when Doudna earned the Nobel Prize in chemistry last year alongside French microbiologist Emmanuelle Charpentier, the story ended with events still unfolding in the news today. When you close the cover of the book at the end, the next chapter continues before your eyes.

Isaacson’s biography isn’t just history: it’s reality. Reading the book feels like watching a movie, with everything we’ve witnessed in Crispr-Cas9 research leading up to the momentous events we’ve read about in the news only recently.

In 2012, the two women, Charpentier and Doudna, published their landmark paper on the technique, showing how the Crispr RNAs that give bacteria and archaea adaptive immunity could be repurposed to guide Cas9 in cutting DNA. Eight years later, they won the Nobel Prize for their groundbreaking work. Their story involves a race between scientists competing against one another for publications and patents while questions of how the technology will shape the future take the stage.

The power of genetic engineering has raised questions about how scientists should use it for decades. In the early 1970s, American biochemists Stanley N. Cohen and Herbert W. Boyer pioneered recombinant DNA technology, letting scientists cut DNA into fragments and rejoin them to create and insert new sequences at will. When 150 physicians and biologists gathered at the Asilomar conference in 1975 to debate what restraints should be placed on these new genetic engineering technologies, Stanford biochemist Paul Berg set the stage by describing the science behind them: recombinant DNA, assembled from different organisms, had made it “ridiculously simple” to create new genes. Berg claimed that, because its risks were so difficult to determine, the technology should be banned. Others dissented, including MIT biologist David Baltimore, who argued for a solution that would restrict the use of recombinant DNA to “crippled” viruses that couldn’t spread.

Using Crispr-Cas9 to get rid of sickle cell anemia, cancer, Huntington’s disease, or COVID-19 may already sound amazing, but, given the power to change the genome, why not use it to change other things about who we are? If the technology can change our height or make us smarter, what right do we have to use it that way? Isaacson digs into the deep moral questions the technique poses. Why not engineer a child to be the perfect athlete, the perfect student, or the perfect scientist?

The success of Crispr scientists in pushing the boundaries of how closely and thoroughly the genome can be changed has also raised questions of how much one should tamper with what makes us human. Isaacson doesn’t stop at the science, though. His writing on the theories of justice of philosophers John Rawls and Robert Nozick, as they relate to the ethics of Crispr, helps readers understand methods of philosophical reasoning that can define and frame hot-button issues such as eugenics and pre-implantation genetic diagnosis.

Even the question of how unequal access to the technology could entrench disadvantage among marginalized people and groups, or how we weigh human life in our reasoning, as when we justify giving an organ to the patient who needs it immediately to prevent death even though a utilitarian calculus might favor a healthier recipient, comes back to the essential questions about human life that Crispr raises.

It’s difficult to overlook the sexism Doudna faces on her journey to own her own research. Isaacson brings to light the subtle ways men overshadow and overlook the work of women in science, with notable similarities to the treatment of Rosalind Franklin. When Eric Lander, biologist at MIT and Harvard’s Broad Institute, published his essay “The Heroes of CRISPR” in the journal Cell in January 2016, he concluded that, “the narrative underscores that scientific breakthroughs are rarely eureka moments. They are typically ensemble acts, played out over a decade or more, in which the cast becomes part of something greater than what any one of them could do alone. It’s a wonderful lesson for the general public, as well as for a young person contemplating a life in science.” Though it may seem noble to praise the scientists behind Crispr this way, what wasn’t said revealed the misogynistic undertones.

In her Jezebel article “How One Man Tried to Write Women Out of Crispr, the Biggest Biotech Innovation in Decades” writer Joanna Rothkopf pointed out that Lander failed to mention the billion-dollar patent battle between his institute and Doudna and Charpentier. “Not only did the Cell paper fail to disclose the potential conflict of interest, it significantly minimized the role of Doudna’s lab in advancing the technology,” Rothkopf wrote. Such a manipulative method of subtly ignoring the work of women spoke to larger systemic issues of sexism in science. Similarly, the patent battle only epitomized the greater, more universal competition that pitted scientists against one another throughout Isaacson’s book.

And what a war it was. Isaacson emphasizes the nature of scientific research as competition with individuals racing to discover the blueprints of the genetic code. Through bitter rivalries and battles, each combatant with their own motives and research techniques, the book feels like a series of duels between scientists in a culture that rewards national competitiveness and firsts. The reader is left to wonder whether the fault for misconduct and discrimination lies with the decisions the individual makes or with the culture that glorifies provocative research and celebrity status. 

Like a mystery story unraveling, Doudna’s journey is all about finding the clue that leads to the next step, one by one, with each step getting closer to fighting against disease. One finds a noble cause among Crispr scientists searching for truth and answers for the greater good, showing the worth of their character. Echoing what Lander wrote, the young person contemplating a “life in science” should understand how grand narratives emerge out of ensemble acts that play out over decades. Indeed, one should hope that life in science would be determined by the actions of the scientist herself, not by her DNA. 

Explaining Hidden Markov Models (HMMs) using Pac-Man!

The DNA Pac-Man game (https://github.com/HussainAther/dnapacman) can illustrate how protein sequences are generated using Hidden Markov Models (HMMs). The analogy: each letter Pac-Man eats is the next amino-acid letter emitted in a growing protein sequence.

Essentially, if regions of the Pac-Man board represent the hidden states of a Markov model, then the letter Pac-Man eats next is the symbol the HMM emits from its current state. We can set different probabilities for which letters appear in each region, so when Pac-Man moves into a new region (hidden state), the emission probabilities change.

We can observe how the probability of being in each hidden state changes as people play, based on which letters they eat next, and compare the Pac-Man HMM to the HMMs used to model eukaryotic genes following the methods of the manuscript “Hidden Markov Models and their Applications in Biological Sequence Analysis”: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2766791/
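As a rough sketch of this analogy in Python (the regions, letters, and probabilities below are invented for illustration and are not taken from the DNA Pac-Man code), a tiny two-state HMM that “eats” amino-acid letters could look like this:

import numpy as np

# Hypothetical board regions as hidden states, and amino-acid letters as emissions.
states = ["region_A", "region_B"]
letters = ["M", "K", "L", "V"]

transition = np.array([[0.8, 0.2],     # P(next region | current region)
                       [0.3, 0.7]])
emission = np.array([[0.5, 0.2, 0.2, 0.1],    # P(letter | region_A)
                     [0.1, 0.1, 0.3, 0.5]])   # P(letter | region_B)

def generate_sequence(length, rng=np.random.default_rng(0)):
    """Generate a letter sequence the way Pac-Man eats letters: the hidden
    region determines which letter is likely to appear next."""
    state = rng.integers(len(states))
    seq = []
    for _ in range(length):
        seq.append(rng.choice(letters, p=emission[state]))    # emit a letter
        state = rng.choice(len(states), p=transition[state])  # move to the next region
    return "".join(seq)

print(generate_sequence(20))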

Full post here: https://github.com/HussainAther/DNAPacManHMM

Political Virtue of a Virus

A cell infected with a virus in the show “Cells at Work!” Metaphors for a disease can teach us more about it.

Coronavirus has become something else. We reach for analogies of COVID-19 as an evil force of nature, hovering on the edge of life and death just as a virus does. As alluring as it is to use grandiose metaphors and contemplate their meaning, it’s hard to separate truth from fiction. Any metaphor that lets us grasp a deeper or hidden similarity we couldn’t otherwise explain runs the risk of straying from what something actually is. Still, a global pandemic that overturns notions of morality, reality, politics, and everything else can’t be explained without resorting to analogies. With that, the coronavirus is an experimental hypothesis of ethics. It’s a test of our character and morality in how we fight a virus as though it were something evil.

The politics of globalization and communication, like the anonymous force that spreads a viral video, are at their end for this era. The promise of rising living standards and faith in government authority will fall along with them. With them, the experiment of liberalism, in forming unity and common bonds between people, has ended. The virus becomes a test of what can best answer the issues raised by these losses, with the mistrust and the tension between individualism and collectivism it brings.

When 19th-century Austrian physician Ignaz Semmelweis realized that washing hands would prevent the high death rate among women giving birth due to post-partum infections, he was ostracized and sent to a mental asylum, where he would die. Just as Aristotle in Politics described the “exceptional man” who could sing better than the others in the chorus and, as a result, become ostracized by them, we can ask which exceptions we can’t afford to ignore, such as hand-washing and vaccination.

When philosopher Michel Foucault wrote that modern sovereign power was biopolitical, expressed through the production, management, and administration of “life,” philosopher Giorgio Agamben responded that there was a “state of exception” in which an authority could exercise power in areas law had not otherwise granted to it. During the emergency of the pandemic, we find ourselves in this state. Knowledge itself has become a privilege. Only some voices are valued. Those who choose to spread knowledge and let ideas flourish would be virtuous during this time.

The virus invites us to reflect and meditate upon the world. We are mortal, finite, contingent, lacking, wanting, and many other things. These truths have always held and always will, but the virus lays them bare. Philosopher Baruch Spinoza ridiculed how other thinkers put humans above nature, the idea that man, in nature, is a dominion within a dominion. The coronavirus breaks down solidarity between humans and creates walls between them. It sows divisions and prevents information and righteousness from reaching one another, much the way we self-isolate and quarantine. We must, then, find common solutions that can overcome these obstacles.

We may see the fall of postmodernism. Though nature may seem sinister with how threatening the virus is, we can’t address these issues and help one another without turning to nature. With the rise of “red zone” hotspots, domestic seclusion, and militarized territories, “neighbors” can be “anyone.” Turning to nature for answers and seeking unifying, grand narratives to unite people among one another would bring about a return to modernist ideals. Even fighting against fascism, an ideology that would otherwise welcome barricaded borders and segregation from superior groups, means coming to terms with the idea that the enemy is not some foreigner or outsider. As Agamben wrote, on coronavirus, “The enemy is not outside, it is within us.” Blocking communication with other nations, as sovereignists like Trump may want, won’t solve the problem. Conspiracy theories that Asian individuals or 5G are to blame may also show this xenophobia that attempts to remedy our anxieties.

With certainty, I believe the virus has made politics more of a morality test. There’s a political “virtue” in how we react to it with wisdom and resilience. If political virtue abandons the “human, all too human” illusion that we can appropriate nature like a dominion within a dominion, then the morality test of politics means we must learn how to govern nature, not control it. The Greeks would have called politics “cybernetic,” or nautical, and, like a sailor fighting against a stormy sea, politics means caring for the crew so that it survives.

Much as the coronavirus was named “corona” for its crown shape, the authority, legitimacy, and power of the individuals who rule nations come into question. Like a virus, neither dead nor alive, we find ourselves in a state of motionless solitude during isolation and quarantine. Teetering on the brink of despair, we have to regain our balance. When governments and economies start up again, we can keep fighting the virus so that it doesn’t retain its power.

Can we upload our minds onto computers?

Is the singularity approaching? Science and philosophy have raised possible answers. We can now scan human brains down to the molecular level. Recording this data is only a step toward artificial immortality, some argue, where we’d exist forever as data. That data would provide the basis for emulating everything the brain normally does, whether through a robotic body or a virtual being. Though it wouldn’t be the exact molecules that make up who you are, this digital copy of yourself could, in some ways, be you.

Such ideas open up questions of metaphysics and being about whether it’s even possible to upload minds to computers. If you’re having doubts about whether a mind can actually become completely digital, you probably won’t be surprised to hear there’s been debate. Even if you could upload your mind to a computer, it would be a matter of arranging everything in a way that matches your mind, and it’s not clear that this could account for everything a mind is capable of. But, even if your identity remained, would it still be you?

In “The Singularity: A Philosophical Analysis,” David Chalmers wrote about how a computer might take someone’s uploaded mind, or even follow someone’s social media feed, to reconstruct everything about who they are. Philosopher Mark Walker has talked about a “type identity” that mind uploading preserves: mental events can have types corresponding to physical events in the brain. Philosopher John Searle has argued that mind uploading, which amounts to starting a computer program, couldn’t lead to a computer consciously thinking; he goes into more detail with his Chinese room argument. Others, like philosopher Massimo Pigliucci, have been more pessimistic. Pigliucci has argued that consciousness is a biological phenomenon that doesn’t lend itself to mind uploading the way others claim. Even more pressing, the philosophers Joseph Corabi and Susan Schneider believe you might not even survive being uploaded.

Despite these issues, scientists and philosophers have put effort toward making this future a possibility. Ray Kurzweil, Director of Engineering at Google, has worked toward this immortality. In the hopes of surviving until the singularity, he has written on the possibility of machines reaching human-like intelligence by 2045. Transhumanists like philosopher Nick Bostrom argue we’ll see mind uploading technology during the 21st century. The nonprofit Carbon Copies, headed by neuroscientist Randal Koene, has directed efforts toward mind uploading.

Mind uploading also centers on the question of what you are, Kenneth Hayworth suggests. With personal identity considered by some the most important thing to preserve through mind uploading, and with the mind used to define personal identity, many have chosen the phrase “personal transfer to a synthetic human” (PTSH) in lieu of “mind uploading.” This has led philosophers to argue over what would constitute a “personal identity.”

Work in mind uploading should remain conscious of the ethics of the various outcomes for those who come after us. Seeking the best outcomes for mankind as a whole could mean that those more optimistic about mind uploading believe the process would produce more intellectual and social good for the species, with humanity progressing toward a future dominated by uploading, as a transhumanist or posthumanist would have it. The uploaded might even overpower others and thrive in a futuristic “digital Darwinist” scenario, while those more wary and cautious of the technology would be cast aside, even if it meant humans going extinct, or might be deleted without any sort of backup. In any case, the rest would be history, and, perhaps, a bit of metaphysics.

Life and Logic: “Hegel’s Concept of Life” by Karen Ng

Georg Wilhelm Friedrich Hegel is one of philosophy’s giants, and his influence on the science of logic and self-consciousness can’t be ignored. Philosopher Karen Ng lays out Hegel’s thought and arguments in Hegel’s Concept of Life. Reason comes from life itself, Ng explains.

Following and responding to Immanuel Kant’s writing, Hegel describes a type of internal purposiveness around which self-consciousness, freedom, and logic develop. Hegel derives this purposiveness from Kant’s third Critique, the Critique of Judgment. Nature itself has a purposiveness, and from this, judgement attains its power.

For a thing generated either by art, or by nature, …Art is the principle in a thing other than that which is generated, nature is a principle in the thing itself.

Aristotle, Metaphysics

Hegel cites Kant’s use of Aristotle’s understanding of nature in distinguishing between external and internal purposiveness. While external purposiveness covers artifact creation and instrumental action, the internal type covers organic production and life, the same way Aristotle differentiated between art and nature. This is pertinent for understanding Hegel’s philosophical method in the Differenzschrift (1801) and the Phenomenology of Spirit (1807). In those texts, Hegel draws on Fichte and Schelling in arguing, against Kant, that internal purposiveness is part of the activity of cognition. Ng offers her own interpretation, too: Hegel’s critique of Fichte’s idealism as “subjective” rests on Fichte’s inability to conceive of nature as internally purposive and living. From there, cognition relates the self to the world.

Ng reads Hegel’s Science of Logic in a nuanced fashion, arguing that Hegel’s Subjective Logic is Hegel’s version of a critique of judgement. One can understand life as what makes intelligibility possible. Hegel’s theory of judgement is built from reflective and teleological judgements, such that a species or kind creates the objective context for predication. “Objective universality” is the context needed for predication, particularly for ascribing normative predicates to a subject. Life is, then, something original to judgement, bound up with the actualization of self-conscious cognition.

Books about Nothing: On the death of the novel

What use do books have nowadays? In The Decline of the Novel, author Joseph Bottum paints a grim portrait. The novel is dead, and, if not, dying. Fiction is no longer about grappling with reality.

For almost three hundred years, the novel was a major art form, perhaps the major art form, of the modern world—the device by which . . . we tried to explain ourselves to ourselves.

Joseph Bottum

Novels emerged from storytelling. They were a more mature form of the tale, one that let the reader look inside someone else. Yet they have met their fate. Long gone are the days of Austen, the Brontës, Dickens, Kafka, or anyone else we can remember. Now, novels are about letting readers find something to divert or entertain themselves. Bottum delivers this story of stories through examples and illustrations of these changes in reading, and he supports his arguments through an exploration of how the role art plays has changed over time.

The Catholic author Bottum began his research on the historic trend of Protestantism losing cultural significance in American and European life. He believes the first novel, Cervantes’ 1605 Catholic Don Quixote, and the beginnings of the form itself, were meant to reveal the “thick self in a thin universe.” For the first 300 years of its life, the novel kept this purpose. In the midst of the Reformation, the “Protestantism of the air” set the scene for the writing of the time, even for non-Protestants. And it caught on among readers and scholars of religious groups, especially Protestants.

The modern novel…came into being to present the Protestant story of the individual soul as it strove to understand its salvation and achieve its sanctification, illustrated by the parallel journey of the new-style characters, with their well-finished interiors, as they wandered through their adventure in the exterior world.

Joseph Bottum

Society began embracing the idea that it was pushing humanity forward somewhere. With progress of all forms, the novel promised something more than detailed stories of modern selves. It told stories of the crises of modern selves with an urgency and relevance for readers of the period, and in some cases it offered solutions to the problems of the self. Like a remedy for the soul, novels could enchant readers who hungered for stories to understand the world around them. Against the reduction of existence to science and technology, government and bureaucracy, and commerce and economics, novels provided meaning in reality. They gave purpose to a natural and physical world in which there was none, making them supernatural and metaphysical.

With this power, the novel was religious, Bottum argues. At the time, the secular realm consisted of the social norms that built civilization enforced by power. Uncovering this social value was, then, a religious act.

But those were the days of the past. The novel’s slow and steady decline reflects society’s inability to address the issues in the supernatural and metaphysical realm. “The decline of the novel’s prestige reflects and confirms…a new crisis born of the culture’s increasing failure of intellectual nerve and terminal doubt about its own progress,” Bottum argues. With modernity of all aspects of society, “the thick inner world of the self increasingly came to seem ill-matched with the impoverished outer world, stripped of all the old enchantment that had made exterior objects seem meaningful and important. . . . This is what we mean by the crisis of the self: Why does anything matter, what could be important, if meaning is invented, coming from the self rather than to the self?”

The dying began in the final decade of the twentieth century, as Protestantism lost traction in Western civilization. “Of the authors who have published novels since the early 1990s, none are mandatory reading,” Bottum writes. How true this is depends on whom you ask. It’s certainly possible that the novels of the past century don’t conform to Bottum’s view. Writers who excel at characterization, dialogue, plot, and other features can still fall far from this purpose of a novel; the novel’s purpose of creating a meaning that transcends the words themselves doesn’t follow from those aspects of craft. It’s something else. Many books of the past century start off with a lot of potential but never reach this stage. They leave the reader wanting more, circle back to how things began, or use some other method of avoiding transcendence. A lot of writers shy away from metaphysical purpose and stick with themes that there’s no meaning in their work, that it was never meant to provide that to begin with, or some similar postmodern stance. We’ve come to associate these secular searches for truth with the avoidance of what Bottum describes as the purpose of the novel.

In thrusting the novel to the limits of postmodernism, one can ask, “Where do we go from here?” Hungarian novelist László Krasznahorkai has tried pushing through this nihilism in writing like The Melancholy of Resistance, an apocalyptic, godless story that describes a human body decomposing in graphic chemical detail, creating a redundancy of the form. Bottum argues that the American writer Tom Wolfe reaches for a metaphysical component in the absence of a moral framework. Lacking an ideal conception of ethics against which to measure events and actions, Wolfe’s writing “needs a greater thickness than the world seems to possess,” Bottum says. “What he discovers instead is the culture’s failure of nerve, and it ruins the attempt to go where he wants to go,” Bottum writes. “The ending of a Tom Wolfe novel is usually a disaster, or at least a minor fall, because the resources necessary to conclude a story of justification and sanctification simply do not exist for him.” The American George Saunders and the French Michel Houellebecq have established themselves as post-postmodernists to this end. Beyond the boundaries of postmodernism, they’ve written stories that capture what Bottum is after. They give life to the novel in a sort of resurrection, providing a philosophy, with aesthetics, metaphysics, and other characteristics, that Bottum argues novels have lost.

Krasznahorkai and Houellebecq perceive Bottum’s problem with the death of the novel. They both attempt to provide solutions with a metaphysics of the imagination for the empty space in their work. Houellebecq creates worlds that use transcendence in fitting ways even when in the lost abyss. Krasznahorkai avoids the problem by making a testament to modernity in ways other writers don’t.

Works across popular biography, New Journalism, graphic novels, and genre fiction have explored new forms of the novelistic. Though these aren’t novels, we can turn to them to see what novels have missed. They don’t quite capture the movements that Dickens or Austen once created. Bottum believes this “signals…an end of confidence, about the past values and future goals of what conceived itself as Western culture.”

Bottum is careful to argue that, though the novel has died, the writers before our time did not have more genius than current ones do. He mentions Naipaul, Vargas Llosa, Byatt, Pynchon, Roth, DeLillo, Coetzee, Robinson, Amis, Rushdie, McCarthy, Murakami, and Eugenides as talented writers who nonetheless show something different from what Defoe, Dickens, Austen, Faulkner, Hemingway, Steinbeck, or Mann did in their work.

All this is testimony, I think, to the current problem of culture’s lack of belief in itself, derived from the fading of a temporal horizon….Without a sense of the old goals and reasons…all that remains are the crimes the culture committed in the past to get where it is now. Uncompensated by achievement, unexplained by purpose, these unameliorated sins must seem overwhelming: the very definition of the culture.

Where do we go? “Why, indeed, should we write or even read book-length fiction for insight into the directions of the culture and the self?” Bottum asks.

There aren’t many people nowadays who believe reading novels is essential to being part of the public sphere, the way reading the news is, or who turn to them for a peek into the human condition. While Bottum’s book suffers from issues in the way it’s constructed, such as its reliance on essays that don’t resonate as well as they could, it still reveals this truth about our novel-reading habits. One may argue about why or how these changes have occurred. In the age of information technology, our communication has become more superficial and simplistic. Fiction no longer carries the mysterious aura of power it once did. Freudian psychoanalysis has made the human person itself an instrument, such that everything we do becomes a mere mechanical process. The novelist, in these dimensions, doesn’t have much to work with. There’s no transcendence.

It’s also worth emphasizing that the religious authority that once held sway over many areas of society no longer carries the same weight, given the advances in communication and culture over the centuries since Don Quixote. As the novel gained its own aesthetic and cultural significance, it had already begun losing its curiosity about the elusive human condition.

Writing itself has changed, too. It’s become a form of seeking status. Nowadays writing is more about changing the world than investigating it, and many people are more concerned with the prestige and power that come along with it than with the long, arduous craft of becoming a better writer.

The media’s portrayal of reality, through all these trends, becomes more supra-fictional. The phrase “truth is stranger than fiction” may resonate with many readers nowadays. Some areas of fiction like crime and horror still try their best to catch our attention, though.

Overall, the story of the novel meeting its demise presents these postmodern and post-postmodern issues that many of us experience whenever we open up a book. When its story ends, we’ll see if a new one begins.

Where do numbers come from? Philosophers have sought answers.

Zellini’s book is a nice story about numbers that introduces you to new ways of looking at the world.

The answer may lie in Italian mathematician Paolo Zellini’s recent book The Mathematics of the Gods and the Algorithms of Men: A Cultural History. The philosophical debate seeks to answer whether numbers are discovered, like a diamond in a cave, or invented, like a new phone. Whether numbers are real or not makes no difference to most people, even those who use mathematics in their everyday lives. An engineer needs to know whether the physics of a bridge is sound, but probably doesn’t need to know whether that physics was invented or discovered. Still, recognizing that the question isn’t relevant to most practical issues is what makes approaching it a genuinely philosophical inquiry; figuring it out for its own sake opens up new methods of reasoning. When there’s nothing practical to gain, the real learning begins.

So where did mathematics come from? How did we start using numbers to count things? Zellini says that, historically, “2 apples” came before the number 2 did. We saw many things in front of us and used numbers to count them. Enumeration itself was meant to give reality to these things. Mathematics and numbers were powerful, and this attribution gave them their power. Philosophers who wrote about divinity believed numbers created this reality through divine powers, as Zellini explains in The Mathematics of the Gods and the Algorithms of Men.

So if math was from the gods, were algorithms from the men? In some ways. The debates throughout the 1800s and 1900s led to the theories of computer science about algorithms and difficult math problems. The ways numbers behaved in different calculations became the basis for questions about what can change and what cannot. Einstein’s theories of relativity and the development of computers took advantage of these methods of thinking, and there the foundation of mathematics in science and technology is apparent. But Zellini takes things a step further. Math not only showed how important calculations are to society, but also directed fundamental searches for what is real.

Numbers have a reality. This isn’t the same reality as the difference between real and imaginary numbers (such as the imaginary unit i). It’s a reality of how these numbers came about. They tell us what is and isn’t. Zellini writes that their “calculability,” their mathematical practicality, determines this. Theoretical questions about which kinds of math problems can be solved, or how algorithms behave, speak to how a system of rules for numbers may work. Zellini is careful not to conclude that math is the sole method of understanding reality or that these revelations will change every field of research that uses numbers. Instead, he offers more of a guide for how the amount of money in your pocket or tomorrow’s temperature forecast are real enough for the purposes they serve, even if other numbers aren’t as real.

Zellini’s writing is still insightful and relevant, though. Numbers are different from what they enumerate. The power of hundreds of thousands of voters supporting one candidate over another rests on calculations in an increasingly data-driven world. The models built upon machine learning and statistics depend upon all sources of information, yet this data comes from only a small part of our experience. The algorithms and computers that control the analysis, prediction, and other methods create a reality that can dominate the experience they claim to represent. As we rely increasingly on forecasts and cost-benefit models of risk, we find ourselves turning back to the philosophical power of numbers, back to the big questions of what a 50% chance of winning an election really means, Zellini reveals.

It’s disappointing, then, that Zellini’s appeal to philosophy leans so heavily on ancient mathematics that doesn’t mesh well with the philosophy itself. Strong references to Heidegger and Nietzsche and a rambling explanation from classical philosophy are fine, but the work still falls short. It stays so well within the intellectual landscape of dead white men that it doesn’t represent numbers, calculation, and algorithms as well as it could. The connections between mathematics and philosophy remain weak, and Zellini even makes incorrect historical claims about the cultural history of math and philosophy.

I’m sure there are better stories of the history of mathematics and philosophy, such as historian David Wootton’s The Invention of Science: A New History of the Scientific Revolution. Still, Zellini’s explanation of the power of numbers is difficult to ignore in today’s debates over population and economics.

Why life might be worth living, according to philosopher William James

If you ever find yourself doubting yourself and other things in your life, remember to stay deliberate in how you evaluate them. Questioning whether life is worth living is only one of the larger questions many people face at some point or another. Finding satisfying answers can be difficult, though. Turning to philosophy can provide them, with some effort at least.

“Is Life Worth Living?” A bold title for the 1896 lecture of philosopher and psychologist William James. And what better way to begin such a lecture than with an 1881 self-help book of a similar title. James himself had been through the existential dilemma. He would ask: was life worth living?

The short answer is that it depends on the liver. Satisfied? If not, there is a more elaborate response. Philosopher John Kaag’s new book Sick Souls, Healthy Minds: How William James Can Save Your Life explores the Father of American Psychology’s personal journey in figuring out if life is worth living.

James would wake up each day “with a horrible dread at the pit of [his] stomach,” contemplating suicide in his early 20s and wondering “how other people could live, how I myself had ever lived, so unconscious of that pit of insecurity beneath the surface of life.” Through an arduous journey of figuring out what made life meaningful and worth living, the philosopher ends up conceding to “our usual refined optimisms and intellectual and moral consolations” and living as though life were worth living.

After Kaag witnessed someone jump to their death from William James Hall at Harvard University in 2014, the philosopher began questioning why it had happened. Sick Souls, Healthy Minds aims to offer a remedy by presenting James as a friend in those trying times of misery. Kaag shares his own difficult period at age 30, when he was researching William James at Harvard while going through a divorce and dealing with the death of his alcoholic father. As in his previous book on Nietzsche, Kaag searches for practical wisdom by weaving his autobiographical experience alongside the famous philosopher’s. I still found myself believing that, though Kaag went through a tremendous amount of stress, his own story pales in comparison to James’ style and work.

James’ practice of studying philosophy and psychology alongside one another, his radical empiricism, pragmatism, “anti-intellectualism” (to be clarified later), and revolutionary role in the theory of emotion that still resounds to this day make his life, and his rumination on its meaning, all the more impactful. His own life, from a scientific journey through the Amazon to studying medicine and pondering life’s purpose, especially in light of On the Origin of Species, published in 1859, led him to think humans were merely animals in a deterministic world of cause and effect. Choice, like free will, was only an illusion. This led to his diary entries of 1870, in which he assumed free will was no illusion and, out of his own free will, resolved to believe in free will. He wrote he would “accumulate grain on grain of willful [sic] choice like a very miser” through making habits. After reading the French philosopher Charles Renouvier’s work, he came to believe these thoughts and kept them close in everything he did.

With James’ pragmatism, the idea that truth is not statically there to be perceived or discovered but is, in many cases, what we create in the stride of living, we can cross the abyss Nietzsche warned us about staring into by jumping over it. James would write about a type of “anti-intellectualism” against the idea that minds face “a world complete in itself” and need simply to find this world while having no power to re-determine its already-given character. This gave the psychologist-philosopher a way to describe a “rich and active commerce” between minds and reality.

When new ideas join older ones, they “marry” one another, James described. You can form beliefs as hypotheses, and their values depend on how they relate to you. This hypothesis of life makes life valuable.

But Kaag also warns of the prideful dangers of pragmatism, even if his explanations are a bit indulgent. Kaag’s doubts crept up on him during his first wedding, but his mother suggested he continue with it as planned. He realized he could will himself toward the truth that his marriage would be a happy one, and yet, in another sense, he couldn’t. It seemed as though James’ free will wouldn’t have helped.

James’ other work, carried out while he created the Department of Experimental Psychology at Harvard, reflects groundbreaking discoveries in psychology and cognitive science. James believed emotions are “constituted by, and made up of, those bodily changes which we ordinarily call their expression or consequence.” Being sad is not the cause of crying, but is what it feels like to cry, a sort of “biofeedback” through which we figure out our own emotions. This means, according to James, that whistling a happy tune could keep you from feeling sad. The psychologist-philosopher mocked the cognitivist idea that emotions could simply be states of mind that cause us to have visceral reactions. Without the fiery passion of anger in your heart or the heavy weight of mourning at a funeral, an emotion would only be “feelingless cognition.”

If, as Nietzsche said, every great philosophy is “a confession on the part of its author and a kind of involuntary and unconscious memoir,” then the emphasis should be on “involuntary and unconscious.” Maybe, in philosophizing, the personal should let themselves feel what they feel.

Neurons that work together, explained

A theoretical physicist sitting at a computer with pen and paper may not seem like a likely candidate for understanding how the brain works, but physicists who work with statistics and algebra can arrive at revolutionary theories about how the nervous system works. When I met Princeton theoretical physicist William Bialek in 2013 during my undergraduate years at Indiana University-Bloomington, I asked him about the “magic” of physics and how scientists can capture abstract ways of thinking and apply them to how neurons in the brain work. Bialek’s book “Spikes: Exploring the Neural Code,” one of my inspirations for stepping into neuroscience research, and his work alongside other researchers in physics and mathematics can answer key questions in neuroscience.

Pairwise Interactions

Often in neuroscience we are confronted with a small sample of a few neurons recorded from a large population. Although many have assumed, few have actually asked: What are we missing here? What does recording a few neurons really tell you about the entire network? Correlations between neurons dominate the behavior of large networks. Using Ising models from statistical physics, the researchers of Schneidman et al. 2006 looked at large networks and their ability to correct for errors in representing sensory data. They argue that the observed correlations are due to pairwise, but not three-way, interactions between neurons, although some might argue that closer inspection reveals otherwise. Pairwise interactions describe how neurons act together in pairs. Their pairwise maximum entropy approach captures the activity of retinal ganglion cells (RGCs) effectively.

Using an elegant preparation, retina on a multi-electrode array (MEA) viewing defined scenes and stimuli, the researchers showed that statistical physics models that assume pairwise correlations, but disregard any higher-order phenomena, perform very well in modeling the data. This indicates a certain redundancy in the neural code. The results are also replicated with cultured cortical neurons on an MEA. They noted a dominance of pairwise interactions, implying that learning rules depending on pairwise correlations could, on their own, create nearly optimal internal models describing how the retina computes codewords. The brain could then assess new events for their degree of surprise with reasonable accuracy. The central nervous system could learn the maximum entropy model from the data provided by the retina alone, whereas the conditionally independent model is not biologically realistic in this sense. Although the pairwise correlations are small and weak and the multi-neuron deviations from independence are large, the maximum entropy model consistent with the pairwise correlations captures almost all of the structure in the distribution of responses from the full population of neurons. The weak pairwise correlations imply strongly correlated states.

If you modeled the cells as independent of one another, the number of simultaneously spiking cells would follow an approximately Poisson distribution. The actual distribution is almost exponential, so the independent model doesn’t fit well. For example, the probability of K = 10 neurons spiking together is roughly 10^5 times larger than the independent model predicts. The specific response patterns across the population, the N-letter binary words (patterns of 0s and 1s), also differ greatly from what the independent model expects. These discrepancies show the failure of independent coding, and the largest differences between prediction and empirical observation appear for clusters of spikes.
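As a minimal sketch of that comparison (assuming binarized spike data of shape (n_samples, N); variable names are mine, not the paper’s), the independent-model distribution of the number of simultaneous spikes K is a Poisson binomial, which can be computed exactly by convolving the single-cell Bernoulli distributions and compared with the empirical histogram:

import numpy as np

def count_distribution_independent(p_fire):
    """P(K = k) if cells fired independently; p_fire holds per-cell spike probabilities."""
    dist = np.array([1.0])
    for p in p_fire:
        dist = np.convolve(dist, [1 - p, p])   # add one independent Bernoulli cell
    return dist

def count_distribution_empirical(spikes):
    """Empirical P(K = k) from binary spike patterns of shape (n_samples, N)."""
    k = spikes.sum(axis=1).astype(int)
    return np.bincount(k, minlength=spikes.shape[1] + 1) / len(k)

# Comparing, say, the two estimates of P(K = 10) would expose the large excess
# of synchronous spiking relative to the independent prediction.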

Instead, a group of neurons comes to a decision through pairwise correlations. The pattern rates are predicted to within roughly 10%, and the scatter between predicted and observed rates is confined largely to rare events for which the measurement of rates is itself uncertain.

The Jensen–Shannon divergence measures similarity between two probability distributions. It can be interpreted as the mutual information between a random variable and an associated mixture distribution, which is how the researchers used it. In previous work, they had applied the same principle to a joint distribution and the product of its two marginal distributions, measuring how reliably one can decide whether a given response comes from the joint distribution or the product distribution.
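For reference, a small self-contained sketch of the Jensen–Shannon divergence between two discrete distributions (the general definition, not code from the paper); scipy’s entropy(p, q) returns the Kullback–Leibler divergence D(p || q):

import numpy as np
from scipy.stats import entropy

def jensen_shannon_divergence(p, q, base=2):
    """JSD(p, q) = 0.5 * D(p || m) + 0.5 * D(q || m), with m the equal mixture of p and q."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)
    return 0.5 * entropy(p, m, base=base) + 0.5 * entropy(q, m, base=base)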

The fraction of the full network correlation in 10-cell groups captured by the second-order maximum entropy model can be plotted as a function of the full network correlation, measured by the multi-information I_N. The ratio is larger when I_N itself is larger, so the pairwise model is more effective in describing populations of cells with stronger correlations, and its ability to capture ~90% of the multi-information holds independent of many details.

The Maximum Entropy Method


Maximum entropy estimate: constructive criterion for setting up probability distributions, on the basis of partial knowledge.

The most general description of the population activity of N neurons, which uses all possible correlation functions among cells, can be written using the maximum entropy principle for a probability p̂, with Lagrange multipliers h_i and J_ij, Z as the normalization constant, and the remaining variables representing each individual event’s probability. This method draws on Laplace’s principle of insufficient reason, which states that two events are to be assigned equal probabilities if there is no reason to think otherwise, and on Jaynes’ principle of maximum entropy, the idea that distributions should be chosen to maximize the entropy (as a measure of uncertainty) in a way consistent with the given measurements.
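The equation itself isn’t reproduced in this post, but the standard form of the second-order (pairwise) maximum entropy distribution, written with σ_i ∈ {0, 1} marking whether cell i spikes in a given time bin, is

p̂(σ_1, …, σ_N) = (1/Z) exp( Σ_i h_i σ_i + ½ Σ_{i<j} J_ij σ_i σ_j ),

where Z is the normalization constant (partition function), h_i acts as each cell’s intrinsic bias, and J_ij is the pairwise coupling between cells i and j.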

For N neurons, the maximum entropy distributions with Kth-order correlations (K = 1, 2, …, N) can account for the interactions. The entropy difference (multi-information) I_N = S_1 − S_N measures the total amount of correlation in the network, independent of whether it arises from pairwise, triplet, or more complex correlations, and the researchers found this held across organisms, network sizes, and appropriate bin sizes. Each entropy value S_K decreases monotonically toward the true entropy S: S_1 ≥ S_2 ≥ … ≥ S_N. The contribution of the Kth-order correlation, I(K) = S_{K−1} − S_K, is always positive: more correlation always decreases entropy.
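As a concrete (if naive) sketch, a plug-in estimate of S_1, S_N, and the multi-information from binarized spike patterns could be computed as below; this assumes data shaped (n_samples, N) and is only reliable when the 2^N patterns are well sampled, which is the point of working with groups of around 10 cells:

import numpy as np
from collections import Counter

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete probability vector."""
    probs = np.asarray([p for p in probs if p > 0], dtype=float)
    return -np.sum(probs * np.log2(probs))

def multi_information(spikes):
    """I_N = S_1 - S_N for binary spike patterns of shape (n_samples, N)."""
    n_samples, N = spikes.shape
    # S_1: entropy if the cells were independent (sum of single-cell entropies).
    s1 = sum(entropy_bits([spikes[:, i].mean(), 1 - spikes[:, i].mean()]) for i in range(N))
    # S_N: entropy of the empirical joint distribution over N-cell patterns.
    counts = Counter(map(tuple, spikes.astype(int)))
    sN = entropy_bits(np.array(list(counts.values())) / n_samples)
    return s1 - sN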

In a physical system, the maximum entropy distribution is the Boltzmann distribution, and the behavior of the system depends on the temperature, T. For the network of neurons, there is no real temperature, but the statistical mechanics of the Ising model predicts that when all pairs of elements interact, increasing the number of elements while fixing the typical strength of interactions is equivalent to lowering the temperature, T, in a physical system of fixed size, N. This mapping predicts that correlations will be even more important in larger groups of neurons.

The active neurons are those that send an action potential down the axon in any given time window, and the inactive ones are those that do not. Because the neural activity at any one time is modelled by independent bits, Hopfield suggested that a dynamical Ising model would provide a first approximation to a neural network which is capable of learning.

The researchers looked for the maximum entropy distribution consistent with the experimental findings. Ising models with pairwise interactions are the least structured, or maximum-entropy, probability distributions that exactly reproduce measured pairwise correlations between spins. Schneidman and colleagues used such models to describe the correlated spiking activity of populations of neurons in the salamander retina subjected to naturalistic stimuli. They showed that for groups of N ≈ 10 neurons (which can be fully sampled during a typical experiment), these models with O(N^2) tunable parameters provide a good description of the full distribution over the 2^N possible states.
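A minimal sketch of how such a pairwise maximum entropy (Ising) model can be fit for small N, by exact enumeration of the 2^N states and gradient ascent on the likelihood until the model’s means and pairwise correlations match the measured ones; this illustrates the idea rather than reproducing the authors’ code (which relied on more scalable methods for larger populations):

import itertools
import numpy as np

def fit_pairwise_maxent(spikes, n_iter=2000, lr=0.1):
    """Fit fields h and couplings J of a pairwise maximum entropy model.

    spikes: binary array of shape (n_samples, N); practical only for small N
    (roughly 15 or fewer cells), since all 2^N states are enumerated explicitly.
    """
    n_samples, N = spikes.shape
    mean_emp = spikes.mean(axis=0)                 # <sigma_i> from data
    corr_emp = (spikes.T @ spikes) / n_samples     # <sigma_i sigma_j> from data

    states = np.array(list(itertools.product([0, 1], repeat=N)), dtype=float)
    h = np.zeros(N)
    J = np.zeros((N, N))
    for _ in range(n_iter):
        # Log of the unnormalized pairwise maximum entropy probability of each state.
        log_p = states @ h + 0.5 * np.einsum('si,ij,sj->s', states, J, states)
        p = np.exp(log_p - log_p.max())
        p /= p.sum()
        mean_model = p @ states                    # <sigma_i> under the model
        corr_model = states.T @ (states * p[:, None])
        # Moment matching: move parameters toward reproducing the measured statistics.
        h += lr * (mean_emp - mean_model)
        J += lr * (corr_emp - corr_model)
        np.fill_diagonal(J, 0.0)
    return h, J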

They found that the maximum entropy model of second order captures over 95% of the multi-information in experiments on cultured networks of cortical neurons. The implication is that learning rules based on pairwise correlations could be enough to generate nearly optimal internal models for the distribution of “codewords” in the retinal vocabulary, letting the brain accurately evaluate new events for their degree of surprise.

Accounting for Cell Bias

The researchers noted they needed to account for both the pairwise interactions and the cell bias values. Because interactions have different signs, the researchers showed that frustration would prevent the system from freezing into a single state in about 40% of all triplets. With enough minimum-energy patterns, the system has a representational capacity, and the network can identify a whole pattern uniquely, just as Hopfield models of associative memory do. The system would have a holographic or error-correcting property, so that an observer with access to only a fraction of the neurons would nonetheless be able to reconstruct the activity of the whole population.

The pairwise correlation model also uncovers subtle biases in decision making: it tells you how the neurons influence each other, on average. Pairwise maximum entropy models reveal that the code relies on strongly correlated network states and shows a distributed error-correcting structure.

To figure out whether the pairwise correlations are an effective description of the system, you need to determine whether the reduction in entropy from the correlations captures all or most of the multi-information I_N. The researchers conclude that, even though the pairwise correlations are small and the multi-neuron deviations from independence are large, the maximum entropy model consistent with the pairwise correlations captures almost all of the structure in the distribution of responses from the full population of neurons. This means the weak pairwise correlations imply strongly correlated states.

Other Effects

Intrinsic bias dominates in small groups of cells, but in large groups almost all of the ~N^2 pairs of cells interact significantly. This shifts the balance, so that the typical values of the intrinsic bias are reduced while the effective field contributed by other cells grows. In the Ising model, if all pairs of cells interact significantly with one another, you can ask whether the typical size of the interactions is limited by checking how J_ij changes with increasing N; the researchers saw no sign of significant changes in J with growing N over the values they tested.

Extrapolation

For weak correlations, you can solve the Ising model in perturbation theory to show that the multi-information I_N is the sum of mutual information terms between all pairs of cells, so I_N ~ N(N − 1). This is in agreement with the empirically estimated I_N up to N = 15, the largest value for which direct sampling of the data provides a good estimate. Monte Carlo simulations of the maximum entropy models suggest that this agreement extends up to the full population of N = 40 neurons in their experiment (G. Tkačik, E.S., R.S., M.J.B. and W.B., unpublished data). The potential for extrapolation to larger networks can be seen in the error correction that emerges when you ask how well N-cell activity predicts (N+1)-cell activity: uncertainty decreases as the number of cells grows. In a 40-cell population, the spiking probability of three example cells provides a nearly perfect linear encoding of the number of spikes generated by the other cells in the network. Through these increasingly accurate and robust predictions, they showed findings similar to how single pyramidal cell spiking correlates with more collective responses.

Challenges to the Model

The case of two correlated neurons has proven to be particularly challenging, because the Fokker–Planck equations are analytically tractable only in the linear regime of correlation strengths (r ≈ 0) and only for a limited set of current correlation functions. Some analytical results for the spike cross-correlation function have been obtained using advanced approximation techniques for the probability density and expressed as an infinite sum of implicit functions (Moreno-Bote and Parga, 2004, 2006). Similarly, the correlation coefficient of two weakly correlated leaky-integrate-and-fire neurons has been obtained for identical neurons in the limit of large time bins. 

Correlations between neurons can occur at various timescales. By integrating the cross-correlation function (xcorr in Matlab, correlate in numpy) between two neurons, it’s possible to read off the timescale of the correlation (Bair, Zohary and Newsome 2001). This helps distinguish correlations due to monosynaptic or disynaptic connections, which are visible at short timescales, from correlations due to slow drifts in oscillations, up-down states, attention, and so on, which occur at much longer timescales. Correlations also depend on physical distance on the cortical map as well as the tuning distance between two neurons (Smith and Kohn, 2008).
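A small numpy sketch of that procedure (function names and window choices here are illustrative, not from the cited papers):

import numpy as np

def cross_correlogram(spikes_a, spikes_b, max_lag):
    """Covariance of two binned spike trains (0/1 arrays of equal length) at lags -max_lag..max_lag."""
    a = spikes_a - spikes_a.mean()
    b = spikes_b - spikes_b.mean()
    full = np.correlate(a, b, mode='full') / len(a)
    center = len(a) - 1                        # index of zero lag
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, full[center - max_lag: center + max_lag + 1]

# Integrating the correlogram over a few-millisecond window versus a window of
# hundreds of milliseconds helps separate fast (synaptic) correlations from slow
# (state-drift) correlations, in the spirit of Bair, Zohary and Newsome 2001.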

Decoding techniques based on the Ising model can be applied to simulated neural ensemble responses from a mouse visual cortex model, with an improvement in decoder performance for a model with heterogeneous as opposed to homogeneous neural tuning and response properties. Those results demonstrate the practicality of using the Ising model to read out, or decode, spatial patterns of activity comprising many hundreds of neurons (Schaub et al. 2011).

Discussion

The research seems to reflect general trends of “the whole is greater than the sum of its parts” or even “less is more,” both concepts in science and philosophy that date back centuries. I even emailed Elad Schneidman a few days ago about this, and he responded, “I think that this idea must predate the ancient greeks ;-)”.

Their work applied the maximum entropy formalism of Schneidman et al. 2003 to ganglion cells. The fact that a group of neurons behaves differently from the sum (or combination) of its independent neurons gives the research leverage and potential for systems-level problems of neurocomputation and emergent phenomena.

The work in deriving an Ising model (or using a maximum entropy method) from statistical mechanics shows the importance of a priori proof work in using equations and theories to deduce “what follows from what.” It’s a great example of using the principles and methods of abstraction that mathematicians and physicists use in solving problems in biology and neuroscience. In my own writing, I’ve described this sort of attention to abstract models and ideas as relevant to biology in a previous blogpost.

In this paper, the researchers anticipated the shortcomings and limitations their model would have and addressed them appropriately by fitting the model to experimental work. As a result, their research testifies to the power of computational and theoretical research in both describing and explaining empirical phenomena.

Recreating the Results

With the MaxEnt Toolbox, I used MATLAB to recreate the results, which can be found here: https://github.com/HussainAther/neuroscience/tree/master/maxent/schneidman.

Related Research

That same year, Tkačik and other researchers used the same recordings and Monte Carlo methods to construct the appropriate Ising model for the complete 40-neuron dataset. They showed that pairwise interactions still account for the observed higher-order correlations and argued why the effects of three-body interactions should be suppressed. 

They examined the thermodynamic properties of Ising models of various sizes derived from the data to suggest a statistical ensemble from which the observed networks could have been drawn and, consequently, to create synthetic networks of 120 neurons. They found that with increasing size the networks operate closer to a critical point and start exhibiting collective behaviors reminiscent of spin glasses. They examined more closely the appearance of multiple single-spin-flip stable states.

This use of a maximum entropy model is equivalent to the approach of Roudi et al. 2009, who quantified how well a pairwise model performs by normalizing the Kullback–Leibler divergence DKL(P, P˜), where P˜ is the pairwise approximation to the true distribution P, by the corresponding distance DKL(P, P1) of P from the independent maximum entropy fit. Here P1 is the highest-entropy distribution consistent with the mean firing rates of the cells (equivalently, the product of single-cell marginal firing probabilities). The quality measure is Δ = 1 – DKL(P, P˜)/DKL(P, P1), where Δ = 1 means the pairwise model captures all of the information the independent model leaves out, and Δ = 0 means the pairwise model doesn’t improve on the independent model at all. 
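
Given empirical estimates of the three distributions, Δ is straightforward to compute. Here is a minimal Python sketch with invented numbers; the “pairwise” distribution below is just a stand-in, since fitting a true pairwise maximum entropy model requires an iterative procedure not shown here:

    import numpy as np

    def dkl(p, q):
        # Kullback-Leibler divergence in bits; assumes q > 0 wherever p > 0
        mask = p > 0
        return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

    def pairwise_quality(p_true, p_pair, p_indep):
        # Delta = 1 - D(P, P_pair) / D(P, P_indep)
        return 1.0 - dkl(p_true, p_pair) / dkl(p_true, p_indep)

    # Toy distribution over the 2^3 binary words of a 3-cell population
    p_true = np.array([0.70, 0.08, 0.08, 0.02, 0.08, 0.02, 0.01, 0.01])
    words = np.array([[(w >> i) & 1 for i in range(3)] for w in range(8)])

    rates = words.T @ p_true                          # single-cell firing probabilities
    p_indep = np.prod(np.where(words, rates, 1 - rates), axis=1)
    p_pair = 0.5 * (p_true + p_indep)                 # stand-in for a fitted pairwise model

    print(pairwise_quality(p_true, p_pair, p_indep))  # 1 = perfect, 0 = no improvement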

In 2014, Tkačik and the researchers from Schneidman et al. 2006 published “Searching for Collective Behavior in a Large Network of Sensory Neurons,” using K-pairwise models, more specialized variations of the pairwise models, to estimate entropy, classify activity patterns, show that the neural codeword ensembles are extremely inhomogeneous, and demonstrate that the state of individual neurons is highly predictable from the rest of the population, which would allow for error correction. 

Barreiro et al. 2014 found that, over a broad range of stimuli, output spiking patterns are surprisingly well captured by the pairwise model. They studied an analytically tractable simplification of the retinal ganglion cell model and found that, in the simplified model, bimodal input signals produce larger deviations from pairwise predictions than unimodal inputs do. The characteristic light-filtering properties of the circuitry upstream of the retinal ganglion cells would suppress bimodality in light stimuli, thus removing a powerful source of higher-order interactions. The researchers offered this as a novel explanation for the surprising empirical success of pairwise models.

Ostojic et al. 2009 studied how functional interactions depend on biophysical parameters and network activity, finding that variations in the background noise change the amplitude of the cross-correlation function as strongly as variations in synaptic strength do. They also found that the spiking regularity of the postsynaptic neuron has a pronounced influence on the amplitude of the cross-correlation function. This suggests an efficient and flexible mechanism for modulating functional interactions.

In 1995, Mainen & Sejnowski showed that single neurons have very reliable responses to current injections. Nevertheless, cortical neurons seem to have Poisson or supra-Poisson variability. It’s possible to find a bound on decodability using the Fisher information matrix (Sompolinsky & Seung 1993). Under the assumption of independent Poisson variability, it is possible to derive a simple scheme for ML decoding that can be implemented in neuronal populations (Jazayeri & Movshon 2006).
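
As a toy illustration of that last point (my own sketch, not code from Jazayeri & Movshon; the tuning curves and parameters are made up), maximum likelihood decoding under independent Poisson variability reduces to a weighted sum of log tuning curves:

    import numpy as np

    # Maximum likelihood decoding under independent Poisson variability:
    # log L(s) = sum_i n_i * log f_i(s) - sum_i f_i(s)   (up to a constant)
    stimuli = np.linspace(-np.pi, np.pi, 181)               # candidate stimulus values
    prefs = np.linspace(-np.pi, np.pi, 32, endpoint=False)  # preferred directions
    tuning = 5 * np.exp(2 * (np.cos(stimuli[None, :] - prefs[:, None]) - 1)) + 0.5

    rng = np.random.default_rng(2)
    true_idx = 90                                           # true stimulus (0 radians)
    counts = rng.poisson(tuning[:, true_idx])               # one trial of spike counts

    log_like = counts @ np.log(tuning) - tuning.sum(axis=0)
    print(stimuli[np.argmax(log_like)], stimuli[true_idx])  # estimate vs. truth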

The accumulation of noise sources and various other mechanisms cause cortical neuronal populations to be correlated. This poses challenges for decoding. You can get a little more juice out of decoding algorithms by considering pairwise correlations (Pillow et al. 2008).

References

Bair, W., Zohary, E., and Newsome, W.T. “Correlated firing in macaque visual area MT: time scales and relationship to behavior.” (2001). Journal of Neuroscience. 

Barreiro, et al. “When do microcircuits produce beyond-pairwise correlations?” (2014). Frontiers. 

Bialek, William, and Ranganathan, Rama. “Rediscovering the power of pairwise interactions.” (2007). arXiv. 

Hopfield, J.J. “Neural networks and physical systems with emergent collective computational abilities.” (1982). Proc. Natl Acad. Sci. USA 79, 2554–2558. 

Jazayeri, M., and Movshon, J.A. “Optimal representation of sensory information by neural populations.” (2006). Nature Neuroscience. 

Mainen, Z.F., and Sejnowski, T.J. “Reliability of spike timing in neocortical neurons.” (1995). Science.

Moreno-Bote, R., and Parga, N. “Role of synaptic filtering on the firing response of simple model neurons.” (2004). Phys. Rev. Lett. 92, 028102.

Moreno-Bote, R., and Parga, N. “Auto- and crosscorrelograms for the spike response of leaky integrate-and-fire neurons with slow synapses.” (2006). Phys. Rev. Lett. 96, 028101.

Ostojic, et al. “How Connectivity, Background Activity, and Synaptic Properties Shape the Cross-Correlation between Spike Trains.” (2009). The Journal of Neuroscience. 

Pillow, Jonathan, et al. “Spatio-temporal correlations and visual signalling in a complete neuronal population.” (2008). Nature.

Roudi, et al. “Pairwise Maximum Entropy Models for Studying Large Biological Systems: When They Can Work and When They Can’t.” (2009). PLoS Computational Biology. 

Schaub, Michael, and Schultz, Simon. “The Ising decoder: reading out the activity of large neural ensembles.” (2011). Journal of Computational Neuroscience.

Schneidman et al. “Network Information and Connected Correlations.” (2003). Physical Review Letters. 

Schneidman et al. “Weak pairwise correlations imply strongly correlated network states in a neural population.” (2006). Nature.

Seung, H.S., and Sompolinsky, H. “Simple models for reading neuronal population codes.” (1993). PNAS.

Shlens, Jonathan, et al. “The structure of multi-neuron firing patterns in primate retina” (2006). The Journal of Neuroscience 26.32: 8254-8266.

Smith, Matthew, and Kohn, Adam. “Spatial and Temporal Scales of Neuronal Correlation in Primary Visual Cortex.” (2008). Journal of Neuroscience. 

Tkačik, Gašper et al.  “Ising models for networks of real neurons.” (2006). arXiv.org:q-bio.NC/0611072. 

Tkačik, Gašper et al. “Searching for Collective Behavior in a Large Network of Sensory Neurons.” (2014). PLoS Comput Biol. 

The Journalist’s Guide to Statistics

There are three kinds of lies: lies, damned lies, and statistics.

Mark Twain, “My Autobiography”

Journalists need a good understanding of numbers. Tapping into the power of data would let them create more meaningful and effective stories. But making sense of numbers can be difficult. Reporting on data is often not as straightforward or manageable as other types of journalism. Writers need to separate signal from noise.

What’s more, researchers and writers need to know the context of data to draw appropriate conclusions. You can know everything about how candidates in an election fare against one another through polls and surveys, but until you know why people would vote that way, you can’t say much about those statistics. I’ve written more here on the nature of causation in the context of scientific research. This guide provides a logical, reader-friendly approach for writers who want to harness the power of statistics.

Table of contents:

  1. Know the numbers
  2. Study the source
  3. Remember the reader
  4. Present the product

1. Know the numbers

Too often, writers throw around numbers not knowing what they mean. Here is a run-down of statistics terms you should know as a journalist:

  • Bayesian statistics
    • If it rains, how does that affect which football team will win? This branch of statistics lets you figure out how likely something is to occur based on how it depends on other factors. It also lets you account for things like false positives (when a test detects something that doesn’t exist), such as a medical screening flagging healthy tissue as cancer. With Bayesian models, you can combine different sources of information when putting together these conditional probabilities.
    • Using Bayesian statistics to predict how likely future events are is “Bayesian inference.”
  • Beta distribution
    • Using a pre-defined distribution, you can estimate how well a baseball player will do at the start of a season even before you’ve collected much data. Using her career batting average of .270, you can model her hit probability with a beta distribution with α = 81 and β = 219; its mean is 81/(81 + 219) = .270 and its standard deviation is about .026.
    • If you don’t know the exact probability that something occurs, you can figure out which probability is most likely by treating it as a draw from a beta distribution of probabilities. You can use α and β to calculate the mean μ = α/(α + β) and the standard deviation σ = sqrt(αβ/((α + β)^2(α + β + 1))).
    • You’ll also run into the binomial distribution, which describes the number of successes in a fixed number of trials that all share the same success probability; the beta distribution, by contrast, describes your uncertainty about that probability itself.
  • Chi-square test (χ2 test)
      Suppose you wanted to find the relationship between being HIV positive and sexual preference. You survey 30 males and find the following data (in a contingency table):
                      Sexual preference
                   Male   Female   Both   Total
      HIV+           4       2       3       9
      Not HIV+       3      16       2      21
      Total          7      18       5      30
      Then, for each cell, you can multiply its row total by its column total and divide by the grand total to get the counts you would expect if HIV status and sexual preference were unrelated. These expected values differ from the observed ones, as shown below:
                         Sexual preference
                      Male              Female             Both
      HIV+
        Observed (O)    4                 2                  3
        Expected (E)    (9*7)/30 = 2.1    (9*18)/30 = 5.4    (9*5)/30 = 1.5
        (O-E)           1.9               -3.4               1.5
        (O-E)^2         3.61              11.56              2.25
      Not HIV+
        Observed (O)    3                 16                 2
        Expected (E)    (21*7)/30 = 4.9   (21*18)/30 = 12.6  (21*5)/30 = 3.5
        (O-E)           -1.9              3.4                -1.5
        (O-E)^2         3.61              11.56              2.25
      (grand total = 30)
    • If you have an expectation or prediction of what your results should look like, the chi-square test compares them to what you actually observe to tell you how well your predictions match what happens. This example is borrowed from David Stockburger at Missouri State.
    • Researchers calculate the statistic by summing the squared differences between observed and expected values over every cell: χ2 = Σ (observed − expected)^2/expected. (For this calculation in code, see the sketch just after this list.)
    • Sometimes you’ll see the difference between observed and expected values referred to as the “residual.”
  • Confounding variable
    • If you want to test if texting leads to an increase in crashes, you would want to make sure that text messages, not weather or traffic, cause the crashes. These extra variables the study doesn’t account for are confounding variables.
  • Controlled experiment
    • If you give a drug to students to observe how it affects sleep, you should compare this group (the treatment group) to a control group, a set of students under the same conditions but without the drug. This makes sure you can determine that it was the drug causing differences in sleep and not some other variable.
  • Correlation
    • This tells you how well two variables are related to one another. Two stocks that change in similar ways to one another over time may be correlated.
  • Fisher’s exact test
                 Cured   Not Cured   Total
      Drug A       42        58        100
      Drug B       14        86        100
      Total        56       144        200
    • Similar to the chi-square test, this test uses a contingency table (shown above) to ask whether an outcome depends on group membership. Unlike the chi-square test, which relies on a large-sample approximation, Fisher’s exact test computes the probability of a table at least as extreme as the observed one exactly, which makes it especially useful for small samples.
  • Histogram
    • This shows you how your data are distributed. The height of each bar shows how many data points fall into the bin along the x-axis (or how likely a data point is to fall into that bin). When the bars show probabilities, their areas should sum to 100 percent.
  • Margin of error
    • When you make a measurement, the margin of error (sometimes called “uncertainty”) tells you how much that measurement can change due to other factors. You’ll typically find this in a range of a confidence interval, such as “40 percent +/- 1 percent.”
    • If you’re polling a sample of people, the margin of error tells you how closely the sample represents the entire population.
    • Writer Robert Niles defines this as “1 divided by the square root of the number of people in the sample.”
    • You can further break down error into bias and systematic error:
      • The same way standing on a weighing scale while wearing clothes makes you heavier, a bias creates an error based on how you measure something.
      • If, instead, the weighing scale itself isn’t calibrated properly, there’s a systematic error. This affects all results due to the nature of your measuring equipment itself.
  • Mean
    • This is the average of a set of data points, generally written using μ. When dealing with statistics, keep your language precise to communicate the most effective message possible. If the average life expectancy in the U.S. is 79 years, know the standard deviation and sample size. You may not need to report those factors, but they’ll help you put your averages in context.
    • When journalists write about the “average citizen” or the “average voter,” in most cases they’re not referring to the strict mathematical definition of an average (the sum of the data points divided by the number of data points). Rather, journalists tend to use “average” to mean a common, representative individual in a population. Keep in mind that the statistical average only represents this “average individual” as well as the standard deviation and sample size allow.
  • Median
    • If you list your data points from highest to lowest, the value in the middle is the median. Because it doesn’t depend on how spread out or varied the data points are, the median is, more or less, the “middle.” It doesn’t matter how much more the richest person in America makes than everyone else; all that matters for the median is whether each value falls above or below the middle.
    • In some cases, the median can give you a more accurate idea of the “average” person in a population when reporting. Make sure you understand where the median falls in the space between the highest and lowest data point. That can tell you more about how the numbers are distributed.
    • Paleontologist Stephen Jay Gould quoted Twain’s “damned lies” line to argue that the eight-month median survival time for peritoneal mesothelioma was misleading. Because the distribution was skewed, many patients, like Gould, who lived for another two decades, would survive far longer than the median suggests, which is reason for a more optimistic view of such statistics.
  • Mode
    • The mode is the number occurring most often. This simple and clean measurement can tell you who’s the most popular candidate in an election. You won’t see this much, but it’s helpful for comparing raw numbers against one another like sales figures.
  • Multiplication rule
    • If there’s a 1/2 chance you’ll draw a red card from a deck and a 1/13 chance a card is a King, then the chance of drawing a red King is 1/2 * 1/13 = 1/26. This holds for independent events.
    • Keep track of how one event may affect the other. If you draw a red card from a deck (with probability 1/2), the chance the next card is also red is now 25/51 because there’s one less red card in the deck.
  • Normal (or standard or Gaussian) distribution
    • Imagine taking a set of heights in a population and graphing the heights on the x-axis against how often they occur on the y-axis. If the data are “normally” distributed, most people fall around an average height, with fewer and fewer heights the farther you move from that average. The distribution is defined entirely by the mean μ and the standard deviation σ, with density f(x) = 1/(σ√(2π)) · exp(−(x − μ)^2/(2σ^2)).
    • The normal distribution centers on the average and, with a greater standard deviation, it becomes more spread out in both directions. You most likely won’t report the normal distribution explicitly in a news story.
    • The standard deviation lets you compare individual values to the distribution. About 68 percent of people fall within one standard deviation of the mean (in either direction), about 95 percent within two standard deviations, and almost everyone within three. The Z-score tells you how many standard deviations a data point is from the mean.
    • If you wanted to test whether a new psychiatric drug changed the frequency of mood swings, you might measure the number of mood swings in a population with the drug and a population without. If the means of the two distributions are separated by a certain number of standard deviations, you can convert that separation to a p-value. Strictly, the p-value is the probability of seeing a difference at least that large if the drug had no effect, so the smaller the p-value, the harder it is to attribute the difference to chance rather than to the drug itself.
  • Null hypothesis (H0)
    • To figure out if smoking truly causes cancer, scientists look for ways to show that “smoking doesn’t cause cancer” is false. This is a null hypothesis (H0), usually used to show that there is no effect or no relationship between what you want to show. In the words of scientists, they look for ways to “reject the null hypothesis.”
    • The p-value tells you how likely data at least as extreme as yours would be if the null hypothesis were true; a small enough p-value is grounds to reject the null hypothesis.
  • Quartile
    • Split the data into four equally sized groups. The lowest quarter is the lower quartile and the highest quarter is the upper quartile; the middle half of the data lies between them, and its range is the interquartile range.
  • Range
    • This is the highest value minus the smallest. Note that the range is a single number, not a range of numbers.
  • Regression
    • Regression describes how one variable changes with another. If smoking really does increase cancer risk, you should see it in a graph of cancer prevalence versus smoking, usually summarized with a line of best fit. With enough regressions, you can start to separate a scientific observation into the variables that explain it.
    • Keep in mind correlation does not imply causation. If you find that video game sales rise around similar times when violent crimes occur, you still need to show that one caused the other before drawing conclusions between the two. Otherwise things may be a coincidence or just a matter of randomness.
    • You’ll see an R value (how well one variable explains the other) or an R2 value (how well the model fits the data). An ANOVA (analysis of variance) reports these fit statistics along with whether the result is “statistically significant.”
  • Standard deviation
    • The standard deviation is how widely values are spread apart or how much the data varies. This, along with mean, defines a normal distribution.
    • For a population, the standard deviation is σ = sqrt(Σ(xi – x̄)^2 / n), where x̄ is the average of the n data points and Σ sums the squared deviation (xi – x̄)^2 over every data point. If you want the standard deviation of a specific sample, use n – 1 instead of n in the denominator, because you only know the mean of that sample, not of the population.
    • The standard deviation squared gives you the variance. Sometimes researchers use “deviation” and “variance” interchangeably so keep in mind the difference.
  • Stochastic models
    • These are ways to predict future data like financial portfolios or weather forecasts that depend on randomness. Using distributions like the normal or beta distributions, you can simulate what future data will look like and form predictions.
  • Variable
    • Variables are anything that differs from person to person or sample to sample.
    • Categorical variables are ways of labeling people into groups (like biological sex or state of residence), continuous ones lie on a scale (like age or temperature), qualitative ones use adjectives (like colors) and random variables are what scientists measure as outcomes of experiments (like flipping a coin).
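
As a concrete illustration of one of these terms, here is a minimal Python sketch (my own example, not part of the original guide, and assuming numpy and scipy are available) that runs the chi-square calculation on the HIV and sexual-preference contingency table above:

    import numpy as np
    from scipy.stats import chi2_contingency

    # Observed counts from the contingency table in the chi-square entry above
    observed = np.array([[4, 2, 3],     # HIV+
                         [3, 16, 2]])   # not HIV+

    # Expected counts under independence: (row total * column total) / grand total
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    expected = row_totals * col_totals / observed.sum()

    # Chi-square statistic: sum over cells of (observed - expected)^2 / expected
    chi2_manual = ((observed - expected) ** 2 / expected).sum()

    chi2, p_value, dof, expected_from_scipy = chi2_contingency(observed)
    print(chi2_manual, chi2, p_value)   # the manual and library statistics agree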
2. Study the source

    The Society of Professional Journalists dictates that you should remain accountable and transparent, seek the truth and report it, and act independently. In the context of numbers, this means remaining open and honest in data analysis, scrutinizing findings and mathematical methods, and doing so free from anything that might interfere with an investigation. After you know the definitions of statistics, you need to know where those numbers came from. This means not only knowing how data was collected, but appealing to statistics in a way that reflects the current principles of journalism.

    Writer William Davies has argued that the authority of statistics, and of the researchers who study them, is declining. In a post-statistical society, journalists need to remain objective and skeptical of statistics while still appreciating them for what they are. The battle won’t be between elite facts and populist feelings but between public rhetoric and the forces against it.

    Remember to keep numbers in the context of their original source and how they were measured. If someone asks where you got your information or how a number was calculated, you should have an appropriate answer. If you’re reporting a p-value for a biomedical study, which variables were measured? How does the standard deviation affect the certainty of the results? Make sure that, for whatever claim or argument a scientist has put forward in a study, you can stand behind however you report on it.

    As you become more statistically literate, you’ll naturally reevaluate how you reason. Becoming aware of common fallacies and pitfalls journalists fall into can leave you better prepared to present accurate scientific findings. Be careful about concluding that, because some people are losing jobs, the economy must be doing poorly, or about treating a study that found no evidence of a link between fossil fuels and climate change as evidence that no link exists. You can begin to see through the argument that something must be true because a majority of people say it is and, instead, take a more empirical approach to forming an opinion.

    Much more sinister are those who prey on individuals without a strong statistical or mathematical literacy. Showing that the cost of attending college is a smaller percent of the national debt now than it was in the 1960s doesn’t show that today’s college students pay less for their education. As you study the context and nuances of scientific findings, you’ll become better prepared to separate signal from noise in these situations.

    If there’s a 20 percent chance of rain, does that mean it will rain 20 percent of the time? If a medical procedure has a false positive rate of 1 out of 10 trials, how does that change its effectiveness? It’s easy to appeal to the authority of statistics and science without investigating for yourself. Check what experiments were performed or the historical use of tests like Fisher’s exact test.

    This way, you’re acting as both a writer and a researcher. The key here is to avoid resorting to phrases like “studies show” or “survey says,” and, instead, ask yourself if you really know what the scientific studies purport. Many times scientists will refer to terms like “standard deviation” or “variance” interchangeably so make sure you know what’s being reported.

    3. Remember the reader

    Now that you have a deep understanding of what you’re reporting and what it means, you need to put it in a context that a general audience can understand.

    If you ask a drunkard what number is larger, 2/3 or 3/5, he won’t be able to tell you. But if you rephrase the question: what is better, 2 bottles of vodka for 3 people or 3 bottles of vodka for 5 people, he will tell you right away: 2 bottles for 3 people, of course.

    Edward Frenkel, “Love and Math: The Heart of Hidden Reality”

    In the quote above, how does the drunkard arrive at the correct answer? The statistics are presented differently. In the rephrased question, he has a more tangible, usable way of seeing how the shares of vodka arise from dividing the bottles among the people: 2 bottles for 3 people is 2/3 ≈ 0.67 of a bottle each, while 3 bottles for 5 people is only 3/5 = 0.6.

    How well do you understand what you write? Try answering this question to find out.

    Imagine you conduct a breast cancer screening using mammography in a certain region. You know the following information about the women in this region: The probability that a woman has breast cancer is 1 percent (known as “prevalence”). If a woman has breast cancer, the probability that she tests positive is 90 percent (“sensitivity”). If a woman does not have breast cancer, the probability that she nevertheless tests positive is 9 percent (false-positive rate). A woman tests positive. She wants to know from you whether that means that she has breast cancer for sure, or what the chances are. What is the best answer?

      A. The probability that she has breast cancer is about 81 percent.
      B. Out of 10 women with a positive mammogram, about 9 have breast cancer.
      C. Out of 10 women with a positive mammogram, about 1 has breast cancer.
      D. The probability that she has breast cancer is about 1 percent.

    When German psychologist Gerd Gigerenzer posed the question to about 1000 gynecologists, about 21 percent chose the correct answer, C. That is a little worse than random guessing, and I must admit that, on my first attempt, I failed to answer the question correctly as well. Through his research, Gigerenzer has crafted a theory of understanding statistics that can help us in situations like this.

    Similar to Frenkel’s example with the fractions of vodka, psychologists like Daniel Kahneman and Gerd Gigerenzer have shown that asking statistics questions in different ways can influence the ways we understand them. For example, when the information preceding the question is framed differently (as shown below), 87 percent of gynecologists answered correctly.

    Assume you conduct breast cancer screening using mammography in a certain region. You know the following information about the women in this region:

    • Ten out of every 1,000 women have breast cancer
    • Of these 10 women with breast cancer, 9 test positive
    • Of the 990 women without cancer, about 89 nevertheless test positive

    In both examples (the breast cancer screening and the bottles of vodka), when we change from “conditional probabilities” to “natural frequencies,” we suddenly understand the statistics much better. Like Gigerenzer, I believe we can teach the appropriate way to interpret statistics, and, given the effect it has on our health and society, we have a moral imperative to do so.
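
    To see where answer C comes from, here is a minimal Python sketch (my own, not Gigerenzer’s) that computes the probability of cancer given a positive mammogram both ways: from the conditional probabilities with Bayes’ rule, and from natural frequencies out of 1,000 women.

    prevalence = 0.01        # P(cancer)
    sensitivity = 0.90       # P(positive | cancer)
    false_positive = 0.09    # P(positive | no cancer)

    # Bayes' rule on the conditional probabilities
    p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
    print(sensitivity * prevalence / p_positive)    # about 0.09, roughly 1 in 10 (answer C)

    # Natural frequencies out of 1,000 women
    with_cancer = 1000 * prevalence                           # 10 women have cancer
    true_positives = with_cancer * sensitivity                # 9 of them test positive
    false_positives = (1000 - with_cancer) * false_positive   # about 89 others test positive
    print(true_positives / (true_positives + false_positives))  # the same ~0.09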

    When presenting information to colleagues, you can use a confusion matrix, a table of true and false positives and negatives, to keep track of the accuracy metrics of an experiment.

    This isn’t a simple case of deliberately communicating false information or lying with statistics. While there may be agendas and conflicts of interest among professionals (including scientists), often we simply don’t understand how to interpret statistics. And, in the field of medicine, this can have disastrous results: we make poor decisions about how long a patient may live, about the prevalence of cancer among smokers, and about the harms and benefits of screening for breast cancer.

    4. Present the product

    Many ways of visualizing, illustrating or explaining statistics exist no matter the medium. Looking across FiveThirtyEight, The Guardian‘s Data section or other data journalism publications, you can find effective ways of communicating complicated concepts either to the audience of your publication or to colleagues. Use figures and graphs to explain take-home messages and conclusions from your reporting. Make sure they’re easy to read and follow.

    Python and R offer ways of visualizing statistical findings, with R providing much more extensive libraries for statistics than Python. My work in creating interactive network graphs, word clouds and even periodic tables shows some examples. To produce a confusion matrix like the one shown below, you can use this code.
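
    If you’d rather build one by hand, here is a minimal Python sketch (my own illustration, separate from the linked code) that tallies a confusion matrix from true and predicted labels and reads off a couple of accuracy metrics:

    import numpy as np

    # True and predicted labels for a toy binary classification (1 = positive)
    y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
    y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])

    matrix = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1               # rows: actual class, columns: predicted class

    (tn, fp), (fn, tp) = matrix
    print(matrix)
    print("accuracy:", (tp + tn) / matrix.sum())
    print("false positive rate:", fp / (fp + tn))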

    Compare this confusion matrix to the null hypothesis entry above. Though it might be too complicated for someone reading a newspaper, you can use it to present findings to other researchers.

    It’s a good idea to value openness and transparency with your code and your work in creating visualizations. This gives other researchers and writers ways to check and re-examine what you’ve done. The chart below shows how much the University of California Santa Cruz Science Communication class of 2020 used Slack during their fall quarter (with its code here). Interactive graphs give the reader a better sense of the data and let you communicate more information as effectively as possible.

    Make sure to perform statistical tests to confirm results from research when you report. In the movie “Rosencrantz and Guildenstern Are Dead,” the two protagonists flip a coin that lands heads 92 times in a row. The chance of that happening is about 1 in 5 octillion. In a more realistic setting, the Dallas Cowboys have won 6 out of 8 coin tosses in the history of Super Bowls. In R, you can use the binomial distribution to compute the probability of exactly 6 wins in 8 fair tosses, 0.109375.

    probability <- .5 # Set the odds of getting heads to .5. 
    wins <- 6 # number of winning coin flips
    totalFlips <- 8 # total coin flips
    dbinom(wins, totalFlips, probability)
    > 0.109375

    In the code, the comments are written with a # in front of them explaining what each line does. These comments are notes programmers write to explain things without affecting the code.

    With enough coin tosses, you can graph how these probabilities depend on the number of heads and the number of flips. A single toss with two outcomes (heads or tails) follows a Bernoulli distribution; the count of heads over many tosses follows a binomial distribution.

    How likely is it the coin is fair? (Code found here.)

    Not all visuals are created equal. Statisticians William Cleveland and Robert McGill found that people judge differences in position, length and angle much more easily than differences in shape and color. This means, where appropriate, you should use charts and plots that rely on lengths and slopes and avoid pie charts.

    No matter the code or plot you make, taking an independent, investigative approach to statistics can let you harness the power of data in your stories. Becoming more savvy with numbers and calculations lets you present more accurate, verified findings. You can’t just drop statistics without context or an understanding of how they came about, but newsrooms and other publications can take a more empirical approach and present scientific research for what it is. Whether it’s journalists themselves or a hired analyst creating statistical models of disease prevalence, they should adhere to the established standards of journalism.

    Life expectancy: visualized. (Code found here.)

    Journalism emphasizes quick, easy-to-understand conclusions and messages. While some projects require more complicated workflows such as Bayesian models, bootstrapping or exploratory data analysis, sometimes all that matters is whether an experiment worked or didn’t. In many cases, you simply don’t have the time or space to explain what a p-value or regression test is. Still, becoming statistically literate and understanding the mathematics behind the calculations involved in research can make you all the more prepared to present stories. Being able to tell the difference between causation and correlation can save you from drawing false conclusions and make your arguments better justified on the basis of statistics. It can give you the power to check the work of others and move journalism toward a domain of peer-reviewed, egalitarian work. In writing this guide, I hope to do so as well.