Graph Algorithms Practical Examples in Apache Spark and Neo4j 1st Edition Mark Needham
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com
Acquisitions Editor: Jonathan Hassell
Editor: Jeff Bleiel
Production Editor: Deborah Baker
Copyeditor: Tracy Brown
Proofreader: Rachel Head
May 2019: First Edition
Revision History for the First Edition
2019-04-15: First Release
Indexer: Judy McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
See http://oreilly.com/catalog/errata.csp?isbn=9781492047681 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Graph Algorithms, the cover image of a European garden spider, and related trade dress are trademarks of O’Reilly Media, Inc.
While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
This work is part of a collaboration between O’Reilly and Neo4j. See our statement of editorial independence.
978-1-492-05781-9
Preface
The world is driven by connections—from financial and communication systems to social and biological processes. Revealing the meaning behind these connections drives breakthroughs across industries, from identifying fraud rings and optimizing recommendations to evaluating the strength of a group and predicting cascading failures.
As connectedness continues to accelerate, it’s not surprising that interest in graph algorithms has exploded because they are based on mathematics explicitly developed to gain insights from the relationships between data. Graph analytics can uncover the workings of intricate systems and networks at massive scales—for any organization.
We are passionate about the utility and importance of graph analytics as well as the joy of uncovering the inner workings of complex scenarios. Until recently, adopting graph analytics required significant expertise and determination, because tools and integrations were difficult and few knew how to apply graph algorithms to their quandaries. It is our goal to help change this. We wrote this book to help organizations better leverage graph analytics so that they can make new discoveries and develop intelligent solutions faster.
What’s in This Book
This book is a practical guide to getting started with graph algorithms for developers and data scientists who have experience using Apache Spark™ or Neo4j. Although our algorithm examples utilize the Spark and Neo4j platforms, this book will also be helpful for understanding more general graph concepts, regardless of your choice of graph technologies.
The first two chapters provide an introduction to graph analytics, algorithms, and theory. The third chapter briefly covers the platforms used in this book before we dive into three chapters focusing on classic graph algorithms: pathfinding, centrality, and community detection. We wrap up the book with two chapters showing how graph algorithms are used within workflows: one for general analysis and one for machine learning.
At the beginning of each category of algorithms, there is a reference table to help you quickly jump to the relevant algorithm. For each algorithm, you’ll find:
• An explanation of what the algorithm does
• Use cases for the algorithm and references to where you can learn more
• Example code providing concrete ways to use the algorithm in Spark, Neo4j, or both
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
Using Code Examples
Supplemental material (code examples, exercises, etc.) is available for download at https://bit.ly/2FPgGVV.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Graph Algorithms by Amy E. Hodler and Mark Needham (O’Reilly). Copyright 2019 Amy E. Hodler and Mark Needham, 978-1-492-05781-9.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
O’Reilly Online Learning
For almost 40 years, O’Reilly has provided technology and business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/graph-algorithms.
To comment or ask technical questions about this book, send email to bookquestions@oreilly.com.
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
We’ve thoroughly enjoyed putting together the material for this book and thank all those who assisted. We’d especially like to thank Michael Hunger for his guidance, Jim Webber for his invaluable edits, and Tomaz Bratanic for his keen research. Finally, we greatly appreciate Yelp permitting us to use its rich dataset for powerful examples.
Foreword
What do the following things all have in common: marketing attribution analysis, anti-money laundering (AML) analysis, customer journey modeling, safety incident causal factor analysis, literature-based discovery, fraud network detection, internet search node analysis, map application creation, disease cluster analysis, and analyzing the performance of a William Shakespeare play? As you might have guessed, what these all have in common is the use of graphs, proving that Shakespeare was right when he declared, “All the world’s a graph!”
Okay, the Bard of Avon did not actually write graph in that sentence; he wrote stage. However, notice that the examples listed above all involve entities and the relationships between them, including both direct and indirect (transitive) relationships. Entities are the nodes in the graph—these can be people, events, objects, concepts, or places. The relationships between the nodes are the edges in the graph. Therefore, isn’t the very essence of a Shakespearean play the active portrayal of entities (the nodes) and their relationships (the edges)? Consequently, maybe Shakespeare could have written graph in his famous declaration.
What makes graph algorithms and graph databases so interesting and powerful isn’t the simple relationship between two entities, with A being related to B. After all, the standard relational model of databases instantiated these types of relationships in its foundation decades ago, in the entity relationship diagram (ERD). What makes graphs so remarkably important are directional relationships and transitive relationships. In directional relationships, A may cause B, but not the opposite. In transitive relationships, A can be directly related to B and B can be directly related to C, while A is not directly related to C, so that consequently A is transitively related to C.
With these transitive relationships—particularly when they are numerous and diverse, with many possible relationship/network patterns and degrees of separation between the entities—the graph model uncovers relationships between entities that otherwise may seem disconnected or unrelated, and are undetected by a relational database. Hence, the graph model can be applied productively and effectively in many network analysis use cases.
Consider this marketing attribution use case: person A sees the marketing campaign; person A talks about it on social media; person B is connected to person A and sees the comment; and, subsequently, person B buys the product. From the marketing campaign manager’s perspective, the standard relational model fails to identify the attribution, since B did not see the campaign and A did not respond to the campaign. The campaign looks like a failure, but its actual success (and positive ROI) is discov‐ered by the graph analytics algorithm through the transitive relationship between the marketing campaign and the final customer purchase, through an intermediary (entity in the middle).
Next, consider an anti-money laundering (AML) analysis case: persons A and C are suspected of illicit trafficking. Any interaction between the two (e.g., a financial transaction in a financial database) would be flagged by the authorities, and heavily scrutinized. However, if A and C never transact business together, but instead conduct financial dealings through safe, respected, and unflagged financial authority B, what could pick up on the transaction? The graph analytics algorithm! The graph engine would discover the transitive relationship between A and C through intermediary B.
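Both the attribution and AML scenarios reduce to a transitive reachability question: is there any path from A to C, even with no direct edge between them? The following is a minimal illustrative sketch of that check in Python using breadth-first search; the node names and adjacency-list format are assumptions for illustration, not code from this book.

```python
from collections import deque

def reachable(graph, start, target):
    """Breadth-first search: is target transitively reachable from start?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

# A and C never transact directly; both deal with intermediary B.
transactions = {"A": ["B"], "B": ["C"]}
print(reachable(transactions, "A", "C"))  # True: A relates to C through B
```

A relational join could answer this for a fixed number of hops, but the graph formulation handles any degree of separation with the same traversal.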
In internet searches, major search engines use a hyperlinked network (graph-based) algorithm to find the central authoritative node across the entire internet for any given set of search words. The directionality of the edge is vital in this case, since the authoritative node in the network is the one that many other nodes point at.
With literature-based discovery (LBD)—a knowledge network (graph-based) application enabling significant discoveries across the knowledge base of thousands (or even millions) of research journal articles—“hidden knowledge” is discovered only through the connection between published research results that may have many degrees of separation (transitive relationships) between them. LBD is being applied to cancer research studies, where the massive semantic medical knowledge base of symptoms, diagnoses, treatments, drug interactions, genetic markers, short-term results, and long-term consequences could be “hiding” previously unknown cures or beneficial treatments for the most impenetrable cases. The knowledge could already be in the network, but we need to connect the dots to find it.
Similar descriptions of the power of graphing can be given for the other use cases listed earlier, all examples of network analysis through graph algorithms. Each case deeply involves entities (people, objects, events, actions, concepts, and places) and their relationships (touch points, both causal and simple associations).
When considering the power of graphing, we should keep in mind that perhaps the most powerful node in a graph model for real-world use cases might be “context.”
Context may include time, location, related events, nearby entities, and more. Incorporating context into the graph (as nodes and as edges) can thus yield impressive predictive analytics and prescriptive analytics capabilities.
Mark Needham and Amy Hodler’s Graph Algorithms aims to broaden our knowledge and capabilities around these important types of graph analyses, including algo‐rithms, concepts, and practical machine learning applications of the algorithms. From basic concepts to fundamental algorithms to processing platforms and practical use cases, the authors have compiled an instructive and illustrative guide to the won‐derful world of graphs.
— Kirk Borne, PhD
Principal Data Scientist and Executive Advisor, Booz Allen Hamilton
March 2019
CHAPTER 1 Introduction
Graphs are one of the unifying themes of computer science—an abstract representation that describes the organization of transportation systems, human interactions, and telecommunication networks. That so many different structures can be modeled using a single formalism is a source of great power to the educated programmer.
The Algorithm Design Manual, by Steven S. Skiena (Springer), Distinguished Teaching Professor of Computer Science at Stony Brook University
Today’s most pressing data challenges center around relationships, not just tabulating discrete data. Graph technologies and analytics provide powerful tools for connected data that are used in research, social initiatives, and business solutions such as:
• Modeling dynamic environments from financial markets to IT services
• Forecasting the spread of epidemics as well as rippling service delays and outages
• Finding predictive features for machine learning to combat financial crimes
• Uncovering patterns for personalized experiences and recommendations
As data becomes increasingly interconnected and systems increasingly sophisticated, it’s essential to make use of the rich and evolving relationships within our data.
This chapter provides an introduction to graph analysis and graph algorithms. We’ll start with a brief refresher about the origin of graphs before introducing graph algorithms and explaining the difference between graph databases and graph processing. We’ll explore the nature of modern data itself, and how the information contained in connections is far more sophisticated than what we can uncover with basic statistical methods. The chapter will conclude with a look at use cases where graph algorithms can be employed.
What Are Graphs?
Graphs have a history dating back to 1736, when Leonhard Euler solved the “Seven Bridges of Königsberg” problem. The problem asked whether it was possible to visit all four areas of a city connected by seven bridges, while only crossing each bridge once. It wasn’t.
With the insight that only the connections themselves were relevant, Euler set the groundwork for graph theory and its mathematics. Figure 1-1 depicts Euler’s progression with one of his original sketches, from the paper “Solutio problematis ad geometriam situs pertinentis.”
Figure 1-1. The origins of graph theory. The city of Königsberg included two large islands connected to each other and the two mainland portions of the city by seven bridges. The puzzle was to create a walk through the city, crossing each bridge once and only once.
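Euler’s negative answer rests on a parity argument: a walk crossing every bridge exactly once can exist only if zero or two land masses touch an odd number of bridges. A short Python sketch of that check follows; the labels A–D are an illustrative encoding of the two riverbanks and two islands, not Euler’s own notation.

```python
from collections import Counter

# Each tuple is one bridge between two land masses (A, B = riverbanks;
# C, D = islands); repeated pairs represent parallel bridges.
bridges = [("A", "C"), ("A", "C"), ("A", "D"),
           ("B", "C"), ("B", "C"), ("B", "D"),
           ("C", "D")]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = [n for n, d in degree.items() if d % 2 == 1]
# An Eulerian walk requires 0 or 2 odd-degree nodes; Königsberg has 4.
print(len(odd) in (0, 2))  # False: the walk is impossible
```

All four land masses have odd degree, so no arrangement of the walk can work, exactly as Euler concluded.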
While graphs originated in mathematics, they are also a pragmatic and high fidelity way of modeling and analyzing data. The objects that make up a graph are called nodes or vertices and the links between them are known as relationships, links, or edges. We use the terms nodes and relationships in this book: you can think of nodes as the nouns in sentences, and relationships as verbs giving context to the nodes. To avoid any confusion, the graphs we talk about in this book have nothing to do with graphing equations or charts as in Figure 1-2.
Figure 1-2. A graph is a representation of a network, often illustrated with circles to represent entities, which we call nodes, and lines to represent relationships.
Looking at the person graph in Figure 1-2, we can easily construct several sentences which describe it. For example, person A lives with person B who owns a car, and person A drives a car that person B owns. This modeling approach is compelling because it maps easily to the real world and is very “whiteboard friendly.” This helps align data modeling and analysis.
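One lightweight way to make this nouns-and-verbs reading concrete is to store the graph as subject–relationship–object triples and generate the sentences mechanically. This is an illustrative sketch, not the data model used later in the book:

```python
# Nodes are the nouns; relationship types are the verbs that connect them.
relationships = [
    ("Person A", "LIVES_WITH", "Person B"),
    ("Person B", "OWNS", "Car"),
    ("Person A", "DRIVES", "Car"),
]

# Render each triple as the kind of sentence described above.
sentences = [f"{s} {verb.lower().replace('_', ' ')} {o}."
             for s, verb, o in relationships]
for line in sentences:
    print(line)  # e.g., "Person A lives with Person B."
```

The same triples read naturally in either direction of analysis: as a whiteboard sketch for humans, or as data a traversal can follow.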
But modeling graphs is only half the story. We might also want to process them to reveal insight that isn’t immediately obvious. This is the domain of graph algorithms.
What Are Graph Analytics and Algorithms?
Graph algorithms are a subset of tools for graph analytics. Graph analytics is something we do—it’s the use of any graph-based approach to analyze connected data. There are various methods we could use: we might query the graph data, use basic statistics, visually explore the graphs, or incorporate graphs into our machine learning tasks. Graph pattern–based querying is often used for local data analysis, whereas graph computational algorithms usually refer to more global and iterative analysis. Although there is overlap in how these types of analysis can be employed, we use the term graph algorithms to refer to the latter, more computational analytics and data science uses.
Graph algorithms provide one of the most potent approaches to analyzing connected data because their mathematical calculations are specifically built to operate on relationships. They describe steps to be taken to process a graph to discover its general qualities or specific quantities. Based on the mathematics of graph theory, graph algorithms use the relationships between nodes to infer the organization and dynamics of complex systems. Network scientists use these algorithms to uncover hidden information, test hypotheses, and make predictions about behavior.
Network Science
Network science is an academic field strongly rooted in graph theory that is concerned with mathematical models of the relationships between objects. Network scientists rely on graph algorithms and database management systems because of the size, connectedness, and complexity of their data.
There are many fantastic resources for complexity and network science. Here are a few references for you to explore.
• Network Science, by Albert-László Barabási, is an introductory ebook
• Complexity Explorer offers online courses
• The New England Complex Systems Institute provides various resources and papers
Graph algorithms have widespread potential, from preventing fraud and optimizing call routing to predicting the spread of the flu. For instance, we might want to score particular nodes that could correspond to overload conditions in a power system. Or we might like to discover groupings in the graph which correspond to congestion in a transport system.
In fact, in 2010 US air travel systems experienced two serious events involving multiple congested airports that were later studied using graph analytics. Network scientists P. Fleurquin, J. J. Ramasco, and V. M. Eguíluz used graph algorithms to confirm the events as part of systematic cascading delays and use this information for corrective advice, as described in their paper, “Systemic Delay Propagation in the US Airport Network”.
To visualize the network underpinning air transportation, Figure 1-3 was created by Martin Grandjean for his article, “Connected World: Untangling the Air Traffic Network”. This illustration clearly shows the highly connected structure of air transportation clusters. Many transportation systems exhibit a concentrated distribution of links with clear hub-and-spoke patterns that influence delays.
Figure 1-3. Air transportation networks illustrate hub-and-spoke structures that evolve over multiple scales. These structures contribute to how travel flows.
Graphs also help uncover how very small interactions and dynamics lead to global mutations. They tie together the micro and macro scales by representing exactly which things are interacting within global structures. These associations are used to forecast behavior and determine missing links. Figure 1-4 is a foodweb of grassland species interactions that used graph analysis to evaluate the hierarchical organization and species interactions and then predict missing relationships, as detailed in the paper by A. Clauset, C. Moore, and M. E. J. Newman, “Hierarchical Structure and the Prediction of Missing Links in Networks.”
Figure 1-4. This foodweb of grassland species uses graphs to correlate small-scale interactions to larger structure formation.
Graph Processing, Databases, Queries, and Algorithms
Graph processing includes the methods by which graph workloads and tasks are carried out. Most graph queries consider specific parts of the graph (e.g., a starting node), and the work is usually focused in the surrounding subgraph. We term this type of work graph local, and it implies declaratively querying a graph’s structure, as explained in the book Graph Databases, by Ian Robinson, Jim Webber, and Emil Eifrem (O’Reilly). This type of graph-local processing is often utilized for real-time transactions and pattern-based queries.
When speaking about graph algorithms, we are typically looking for global patterns and structures. The input to the algorithm is usually the whole graph, and the output can be an enriched graph or some aggregate value such as a score. We categorize such processing as graph global, and it implies processing a graph’s structure using computational algorithms (often iteratively). This approach sheds light on the overall nature of a network through its connections. Organizations tend to use graph algorithms to model systems and predict behavior based on how things disseminate, which components are important, how groups form, and how robust the system is overall.
There may be some overlap in these definitions—sometimes we can use processing of an algorithm to answer a local query, or vice versa—but simplistically speaking whole-graph operations are processed by computational algorithms and subgraph operations are queried in databases.
Traditionally, transaction processing and analysis have been siloed. This was an unnatural split based on technology limitations. Our view is that graph analytics drives smarter transactions, which creates new data and opportunities for further analysis. More recently there’s been a trend to integrate these silos for more real-time decision making.
OLTP and OLAP
Online transaction processing (OLTP) operations are typically short activities like booking a ticket, crediting an account, booking a sale, and so forth. OLTP implies voluminous low-latency query processing and high data integrity. Although OLTP may involve only a small number of records per transaction, systems process many transactions concurrently.
Online analytical processing (OLAP) facilitates more complex queries and analysis over historical data. These analyses may include multiple data sources, formats, and types. Detecting trends, conducting “what-if” scenarios, making predictions, and uncovering structural patterns are typical OLAP use cases. Compared to OLTP, OLAP systems process fewer but longer-running transactions over many records. OLAP systems are biased toward faster reading without the expectation of transactional updates found in OLTP, and batch-oriented operation is common.
Recently, however, the line between OLTP and OLAP has begun to blur. Modern data-intensive applications now combine real-time transactional operations with analytics. This merging of processing has been spurred by several advances in software, such as more scalable transaction management and incremental stream processing, and by lower-cost, large-memory hardware.
Bringing together analytics and transactions enables continual analysis as a natural part of regular operations. As data is gathered—from point-of-sale (POS) machines, manufacturing systems, or internet of things (IoT) devices—analytics now supports the ability to make real-time recommendations and decisions while processing. This trend was observed several years ago, and terms to describe this merging include translytics and hybrid transactional and analytical processing (HTAP). Figure 1-5 illustrates how read-only replicas can be used to bring together these different types of processing.
Figure 1-5. A hybrid platform supports the low latency query processing and high data integrity required for transactions while integrating complex analytics over large amounts of data.
According to Gartner:
[HTAP] could potentially redefine the way some business processes are executed, as real-time advanced analytics (for example, planning, forecasting and what-if analysis) becomes an integral part of the process itself, rather than a separate activity performed after the fact. This would enable new forms of real-time business-driven decision-making process. Ultimately, HTAP will become a key enabling architecture for intelligent business operations.
As OLTP and OLAP become more integrated and begin to support functionality previously offered in only one silo, it’s no longer necessary to use different data products or systems for these workloads—we can simplify our architecture by using the same platform for both. This means our analytical queries can take advantage of real-time data and we can streamline the iterative process of analysis.
Why Should We Care About Graph Algorithms?
Graph algorithms are used to help make sense of connected data. We see relationships within real-world systems from protein interactions to social networks, from communication systems to power grids, and from retail experiences to Mars mission planning. Understanding networks and the connections within them offers incredible potential for insight and innovation.
Graph algorithms are uniquely suited to understanding structures and revealing patterns in datasets that are highly connected. Nowhere are connectivity and interactivity more apparent than in big data. The amount of information that has been brought together, commingled, and dynamically updated is impressive. This is where graph algorithms can help make sense of our volumes of data, with more sophisticated analytics that leverage relationships and enhance artificial intelligence contextual information.
As our data becomes more connected, it’s increasingly important to understand its relationships and interdependencies. Scientists that study the growth of networks have noted that connectivity increases over time, but not uniformly. Preferential attachment is one theory on how the dynamics of growth impact structure. This idea, illustrated in Figure 1-6, describes the tendency of a node to link to other nodes that already have a lot of connections.
Figure 1-6. Preferential attachment is the phenomenon where the more connected a node is, the more likely it is to receive new links. This leads to uneven concentrations and hubs.
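The growth process in Figure 1-6 is easy to simulate. The sketch below uses a common trick: choosing a uniformly random endpoint of a uniformly random existing edge selects nodes with probability proportional to their degree. The node count and seed are arbitrary illustrative choices, not parameters from this book.

```python
import random

def preferential_attachment(n_nodes, seed=42):
    """Grow a tree where each new node attaches to an existing node
    chosen with probability proportional to its degree."""
    random.seed(seed)
    edges = [(0, 1)]
    degree = {0: 1, 1: 1}
    for new in range(2, n_nodes):
        # A random endpoint of a random edge is degree-biased sampling.
        target = random.choice(random.choice(edges))
        edges.append((new, target))
        degree[new] = 1
        degree[target] += 1
    return degree

degree = preferential_attachment(1000)
mean = sum(degree.values()) / len(degree)
print(mean)                  # just under 2 (a tree has n - 1 edges)
print(max(degree.values()))  # the largest hub, far above the mean
```

Even in this toy run, a handful of early nodes accumulate most of the links, which is the uneven concentration the figure describes.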
In his book, Sync: How Order Emerges from Chaos in the Universe, Nature, and Daily Life (Hachette), Steven Strogatz provides examples and explains different ways that real-life systems self-organize. Regardless of the underlying causes, many researchers believe that how networks grow is inseparable from their resulting shapes and hierarchies. Highly dense groups and lumpy data networks tend to develop, with complexity growing along with data size. We see this clustering of relationships in most real-world networks today, from the internet to social networks like the gaming community shown in Figure 1-7.
Figure 1-7. This gaming community analysis shows a concentration of connections around just 5 of 382 communities.
The network analysis shown in Figure 1-7 was created by Francesco D’Orazio of Pulsar to help predict the virality of content and inform distribution strategies. D’Orazio found a correlation between the concentration of a community’s distribution and the speed of diffusion of a piece of content.
This is significantly different from what an average distribution model would predict, where most nodes would have the same number of connections. For instance, if the World Wide Web had an average distribution of connections, all pages would have about the same number of links coming in and going out. Average distribution models assert that most nodes are equally connected, but many types of graphs and many real networks exhibit concentrations. The web, in common with graphs like travel and social networks, has a power-law distribution with a few nodes being highly connected and most nodes being modestly connected.
Power Law
A power law (also called a scaling law) describes the relationship between two quantities where one quantity varies as a power of another. For instance, the volume of a cube is related to the length of its sides by a power of 3. A well-known example is the Pareto distribution or “80/20 rule,” originally used to describe the situation where 20% of a population controlled 80% of the wealth. We see various power laws in the natural world and networks.
Trying to “average out” a network generally won’t work well for investigating relationships or forecasting, because real-world networks have uneven distributions of nodes and relationships. We can readily see in Figure 1-8 how using an average of characteristics for data that is uneven would lead to incorrect results.
Figure 1-8. Real-world networks have uneven distributions of nodes and relationships represented in the extreme by a power-law distribution. An average distribution assumes most nodes have the same number of relationships and results in a random network.
Because highly connected data does not adhere to an average distribution, network scientists use graph analytics to search for and interpret structures and relationship distributions in real-world data.
There is no network in nature that we know of that would be described by the random network model.
Another random document with no related content on Scribd:
When, after removing the lens of the eye in the larval salamander, we see it restored in perfect and typical form by regeneration from the posterior layer of the iris, we behold an adaptive response to changed conditions of which the organism can have no antecedent experience either ontogenetic or phylogenetic, and one of so marvelous a character that we are made to realize, as by a flash how far we still are from a solution of this problem.” Then, after discussing the attempt of evolutionists to bridge the enormous gap that separates living, from lifeless nature, he continues: “But when all these admissions are made, and when the conserving action (sic) of natural selection is in the fullest degree recognized, we cannot close our eyes to two facts: first, that we are utterly ignorant of the manner in which the idioplasm of the germ cell can so respond to the influence of the environment as to call forth an adaptive variation; and second, that the study of the cell has on the whole seemed to widen rather than to narrow the enormous gap that separates even the lowest forms of life from the inorganic world.” (“The Cell,” 2nd edit., pp. 433, 434.)
§ 5. A “New” Theory of Abiogenesis
Since true science is out of sympathy with baseless conjectures and gratuitous assumptions, one would scarcely expect to find scientists opposing the inductive trend of the known facts by preferring mere possibilities (if they are even such) to solid actualities. As a matter of fact, however, there are not a few who obstinately refuse to abandon preconceptions for which they can find no factual justification. The bio-chemist, Benjamin Moore, while conceding the bankruptcy of the old theory of spontaneous generation, which looked for a de novo origin of living cells in sterilized cultures, has, nevertheless, the hardihood to propose what he is pleased to term a new one. Impressed by the credulity of Charlton Bastian and the autocratic tone of Schäfer, he sets out to defend as plausible the hypothesis that the origination of life from inert matter may be a contemporaneous, perhaps daily, phenomenon, going on continually, but invisible to us, because its
initial stages take place in the submicroscopic world. By the time life has emerged into the visible world, it has already reached the stage at which the law of genetic continuity prevails, but at stages of organization, which lie below the limit of the microscope, it is not impossible, he thinks, that abiogenesis may occur. To lend plausibility to this conjecture, he notes that the cell is a natural unit composed of molecules as a molecule is a natural unit composed of atoms. He further notes that, in addition to the cell, there is in nature another unit higher than the monomolecule, namely, the multimolecule occurring in both crystalloids and colloids. The monomolecule consists of atoms held together by atomic valence, whereas the multimolecule consists of molecules whose atomic valence is completely saturated, and which are, consequently, held together by what is now known as molecular or residual valence. Moore cites the crystal units of sodium bromide and sodium iodide as instances of multimolecules. The crystal unit of ordinary salt, sodium chloride, is an ordinary monomolecule, with the formula NaCl. In the case of the former salts the crystal units consist of multimolecules of the formulas NaBr·(H2O)2 and NaI·(H2O)2, the water of crystallization not being mechanically confined in the crystals, but combined with the respective salt in the exact ratio of two molecules of water to one of the salt. Judged by all chemical tests, such as heat of formation, the law of combination in fixed ratios, the manifestation of selective affinity, etc., the multimolecule is quite as much entitled to be considered a natural unit as is the monomolecule.
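The fixed-ratio test appealed to here can be checked with simple arithmetic. A minimal sketch, assuming standard atomic masses (values rounded; the helper name is illustrative): because the hydrate combines salt and water in an exact 2:1 molar ratio, the mass fraction of water of crystallization is itself a fixed constant of the substance.

```python
# Standard atomic masses in g/mol (rounded values; enough for a ratio check).
ATOMIC_MASS = {"H": 1.008, "O": 15.999, "Na": 22.990, "Br": 79.904, "I": 126.904}

def molar_mass(atoms):
    """atoms: dict mapping element symbol to count, e.g. {'H': 2, 'O': 1}."""
    return sum(ATOMIC_MASS[el] * n for el, n in atoms.items())

nabr = molar_mass({"Na": 1, "Br": 1})   # the anhydrous salt
water = molar_mass({"H": 2, "O": 1})

# The hydrate combines salt and water in an exact 2:1 molar ratio,
# so its composition by mass is fixed, like any true chemical compound.
hydrate = nabr + 2 * water
water_fraction = 2 * water / hydrate

print(f"NaBr·(H2O)2 molar mass: {hydrate:.2f} g/mol")
print(f"water mass fraction:    {water_fraction:.1%}")
```

Any sample of the hydrate should show close to this water fraction on drying; that constancy, rather than mere mechanical admixture, is what entitles the multimolecule to be treated as a natural unit.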
But it is not in the crystalloidal multimolecule, but in the larger and more complex multimolecule of colloids (viscid substances like gum arabic, gelatine, agar-agar, white of egg, etc.), that Moore professes to see a sort of intermediate between the cell and inorganic units. Such colloids form with a dispersing medium (like water) an emulsion, in which the dispersed particles, known as ultramicrons or “solution aggregates,” are larger than monomolecules. It is among these multimolecules of colloids that Moore would have us search for a transitional link connecting the cell with the inorganic world. Borrowing Herbert Spencer’s dogma of the complication of homogeneity into heterogeneity, he asserts that such colloidal
multimolecules would tend to become more and more complex, and consequently more and more instable, so that their instability would gradually approach the chronic instability or constant state of metabolic fluxion manifest in living organisms. The end-result would be a living unit more simply organized than the cell, and evolution seizing upon this submicroscopic unit would, in due time, transform it into cellular life of every variety and kind. Ce n’est que le premier pas qui coûte! (It is only the first step that costs!)
It should be noted that this so-called law is a mere vague formula like the “law” of natural selection and the “law” of evolution. The facts which it is alleged to express are not cited, and its terms are far from being quantitative. It is certainly not a law in the sense of Arrhenius, who says: “Quantitative formulation, that is, the establishing of a connection, expressed by a formula, between different quantitatively measurable magnitudes, is the peculiar feature of a law.” (“Theories of Chemistry,” Price’s translation, p. 3.) Now, chemistry, as an exact science, has no lack of laws of this kind, but no branch of chemistry, whether physical, organic, or inorganic, knows of any law of complexity that can be stated in either quantitative or descriptive terms. We will, however, let Moore speak for himself:
“It may then be summed up as a general law universal in its application to all matter, ... a law which might be called the Law of Complexity, that matter so far as its energy environment will permit tends to assume more and more complex forms in labile equilibrium. Atoms, molecules, colloids, and living organisms, arise as a result of the operations of this law, and in the higher regions of complexity it induces organic evolution and all the many thousands of living forms....
“In this manner we can conceive that the hiatus between non-living and living things can be bridged over, and there awakens in our minds the conception of a kind of spontaneous production of life of a different order from the old. The territory of this spontaneous generation of life lies not at the level of bacteria, or animalculæ, springing forth into life from dead organic matter, but at a level of life lying deeper than anything the microscope can reveal, and
possessing a lower unit than the living cell, as we form our concept of it from the tissues of higher animals and plants.
“In the future, the stage at which colloids begin to be able to deal with external energy forms, such as light, and build up in chemical complexity, will yield a new unit of life opening a vista of possibilities as magnificent as that which the establishment of the cell as a unit gave, with the development of the microscope, about a century ago.” (“Origin and Nature of Life,” pp. 188-190.)
Having heard out a rhapsody of this sort, one may be pardoned a little impatience at such a travesty on science. Again we have the appeal from realities to fancies, from the seen to the unseen. Moore sees no reason to doubt and is therefore quite sure that an unverified occurrence is taking place “at a level of life lying deeper than anything the microscope can reveal.” The unknown is a veritable paradise for irresponsible speculation and phantasy. It is well, however, to keep one’s feet on the terra firma of ascertained facts and to make one’s ignorance a motive for caution rather than an incentive to reckless dogmatizing.
To begin with, it is not to a single dispersed particle or ultramicron that protoplasm has been likened, but to an emulsion, comprising both the dispersed particles and the dispersing medium, or, in other words, to the colloidal system as a whole. Moreover, even there the analogy is far from being perfect, and is confined exclusively, as Wilson has pointed out, to a rough similarity of structure and appearance. The colloidal system is obviously a mere aggregate and not a natural unit like the cell, and its dispersed particles (ultramicrons) do not multiply and perpetuate themselves by growth and division as do the living components or formed bodies of the cell. As for the single ultramicron or multimolecule of a colloidal solution, it may, indeed, be a natural unit, but it only resembles the cell in the sense that, like the latter, it is a complex of constituent molecules. Here, however, all resemblance ceases; for the ultramicron does not display the typically vital power of self-perpetuation by growth and division, which, as we have seen, is characteristic not only of the cell as a whole, but of its single
components or organelles. Certainly, the distinctive phenomena of colloidal systems cannot be interpreted as processes of multiplication. There is nothing suggestive of this vital phenomenon in the reversal of phase, which is caused by the addition of electrolytes to oil emulsions, or in gelation, which is caused by a change of temperature in certain hydrophilic colloids. Thus the addition of the salt of a bivalent cation (e.g. CaCl2 or BaCl2) to an oil-in-water emulsion (if soap is used as the emulsifier) will cause the external or continuous phase (water) to become the internal or discontinuous phase. Vice versa, a water-in-oil emulsion can be reversed into an oil-in-water emulsion, under the same conditions, by the addition of the salt of a monovalent cation (e.g. NaOH). Solutions of hydrophilic colloids, like gelatine or agar-agar, can be made to “set” from the semifluid state of a hydrosol into the semisolid state of a hydrogel, by lowering the temperature, after which the opposite effect can be brought about by again raising the temperature. In white of egg, however, once gelation has taken place, through the agency of heat, it is impossible to reconvert the “gel” into a “sol” (solution). In such phenomena, it is, perhaps, possible to see a certain parallelism with some processes taking place in the cell, e.g. the osmotic processes of absorption and excretion, but to construe them as evidence of propagation by growth and division would be preposterous.
Nor is the subterfuge of relegating the question to the obscurity of the submicroscopic world of any avail; for, as a matter of fact, submicroscopic organisms actually do exist, and manage, precisely by virtue of this uniquely vital power of multiplication or reproductivity, to give indirect testimony of their invisible existence. The microörganisms, for example, which cause the disease known as Measles are so minute that they pass through the pores of a porcelain filter, and are invisible to the highest powers of the microscope. Nevertheless, they can be bred in the test tube cultures of the bacteriologist, where they propagate themselves for generations without losing the definite specificity, which makes them capable of producing distinctive pathological effects in the organisms of higher animals, including man. Each of these invisible disease
germs communicates but one disease, with symptoms that are perfectly characteristic and definite. Moreover, they are specific in their choice of a host, and will not infect any and every organism promiscuously. Finally, they never arise de novo in a healthy host, but must always be transmitted from a diseased to a healthy individual. The microscopist is tantalized, to quote the words of Wilson, “with visions of disease germs which no eye has yet seen, so minute as to pass through a fine filter, yet beyond a doubt self-perpetuating and of specific type.” (Science, March 9, 1923, p. 283.)
Submicroscopic dimensions, therefore, are no obstacle to the manifestation of such vital properties as reproduction, genetic continuity, and typical specificity; and we must conclude that, if any of the ultramicrons of colloids possessed them, their minute size would not debar them from manifesting the fact. As it is, they fail to show any vital quality, whereas the submicroscopic disease germs give evidence of possessing all the characteristics of visible cells.
In fine, the radical difference between inorganic units, like atoms, molecules, and multimolecules, and living units, like protozoans and metazoans, is so obvious that it is universally admitted. Not all, however, are in accord when it comes to assigning the fundamental reason for the difference in question. Benjamin Moore postulates a unique physical energy, peculiar to living organisms and responsible for all distinctively vital manifestations. This unique form of energy, unlike all other forms, he calls “biotic energy,” denying at the same time that it is a vital force. (Cf. op. cit., pp. 224-226.) Moore seems to be desirous of dressing up vitalism in the verbal vesture of mechanism. He wants the game, without the name. But, if his “biotic energy” is unlike all other forms of energy, it ought not to parade under the same name, but should frankly call itself a “vital force.” Somewhat similar in nature is Osborn’s suggestion that the peculiar properties of living protoplasm may be due to the presence of a unique chemical element called Bion. (Cf. “The Origin and Evolution of Life,” 1917, p. 6.) Now, a chemical element unlike other chemical elements is not a chemical element at all. Osborn’s Bion, like Moore’s biotic energy, ought, by all means, to make up its mind definitely on Hamlet’s question of “to be, or not to be.” The policy of
“It is, and it is not,” is not likely to win the approval of either mechanists or vitalists.
§ 6. Hylomorphism versus Mechanism and Neo-vitalism
Mechanism and Neo-vitalism represent two extreme solutions of this problem of accounting for the difference between living and lifeless matter. Strictly speaking, it is an abuse of language to refer to mechanism as a solution at all. Its first pretense at solving the problem is to deny that there is any problem. But facts are facts and cannot be disposed of in this summary fashion. Forced, therefore, to face the actual fact of the uniqueness of living matter, mechanists concede the inadequacy of their physicochemical analogies, but obstinately refuse to admit the legitimacy of any other kind of explanation. Confronted with realities, which simply must have some explanation, they prefer to leave them unexplained by their own theory than have them explained by any other. They recognize the difference between a living animal and a dead animal (small credit to them for their perspicacity!), but deny that there is anything present in the former which is not present in the latter.
Neo-vitalism, on the other hand, is, at least, an attempt at solving the problem in the positive sense. It ascribes the unique activities of living organisms to the operation of a superphysical and superchemical energy or force resident in living matter. This unique dynamic principle is termed vital force. It is neither an entitive nor a static principle, but belongs to the category of efficient or active causes, being variously described as an agent, energy, or force. To speak precisely, the term agent denotes an active being or substance; the term energy denotes the proximate ground in the agent of a specific activity; while the term force denotes the activity or free, kinetic, or activated phase of a given energy. In practice, however, these terms are often used interchangeably. Thus Driesch, who, like all other Neo-vitalists, makes the vital principle a dynamic factor rather than an entitive principle, refers to the vital principle as a “non-material,”
“non-spatial” agent, though the term energy would be more precise. To this active or dynamic vital principle Driesch gives a name, which he borrowed from Aristotle, that is, entelechy. In so doing, however, he perverted, as he himself confesses, the true Aristotelian sense of the term in question: “The term,” he says, “ ... is not here used in the proper Aristotelian sense.” (“History and Theory of Vitalism,” p. 203.) His admission is quite correct. At the critical point, Driesch, for all his praise of Aristotle, deserts the Stagirite and goes over to the camp of Plato, Descartes, and the Neo-vitalists!
Driesch’s definition is as follows: “Entelechy is an agent sui generis, non-material and non-spatial, but acting ‘into’ space.” (Op. cit., p. 204.) Aristotle’s use of the term in this connection is quite different. He uses it, for example, in a static, rather than a dynamic, sense: “The term ‘entelechy,’” he says, “is used in two senses; in one it answers to knowledge, in the other to the exercise of knowledge. Clearly in this case it is analogous to knowledge.” (“Peri Psyches,” Bk. II, c. 1.) Knowledge, however, is only a second or static entelechy. Hence, in order to narrow the sense still further Aristotle refers to the soul as a first entelechy, by which he designates a purely entitive principle, that is, a constituent of being or substance (cf. op. cit. ibidem). The first, or entitive, entelechy, therefore, is to be distinguished from all secondary entelechies, whether of the dynamic order corresponding to kinetic energy or force, or of the static order corresponding to potential energy. Neither is it an agent, because it is only a partial constituent of the total agent, that is, of the total active being or substance. Hence, generally speaking, that which acts (the agent) is not entelechy, but the total composite of entelechy and matter, first entelechy being consubstantial with matter and not a separate existent or being. In fine, according to Aristotelian philosophy, entelechy (that is, “first” or “prime” entelechy) is not an agent nor an energy nor a force. In other words, it is totally removed from the category of efficient or active causes. The second difference between Driesch and Aristotle with respect to the use of the term entelechy lies in the fact that Driesch uses it as a synonym for the soul or vital principle, whereas, according to Aristotle, entelechy is common to the non-living units of inorganic nature as well as the
living units (organisms) of the organic world. All vital principles or souls are entelechies, but not all entelechies are vital principles. All material beings or substances, whether living or lifeless, are reducible, in the last analysis, to two consubstantial principles or complementary constituents, namely, entelechy and matter. Entelechy is the binding, type-determining principle, the source of unification and specification, which makes of a given natural unit (such as a molecule or a protozoan) a single and determinate whole. Matter is the determinable and potentially-multiple element, the principle of divisibility and quantification, which can enter indifferently into the composition of this or that natural unit, and which owes its actual unity and specificity to the entelechy which here and now informs it. It is entelechy which makes a chemical element distinct from its isobare, a chemical compound distinct from its isomer, a paramœcium distinct from an amœba, a maple distinct from an oak, and a bear distinct from a tiger.
The molecular entelechy finds expression in what the organic chemist and the stereochemist understand by valence, that is, the static aspect of valence considered as the structural principle of a molecule. Hence it is entelechy which makes a molecule of urea [O:C:(NH2)2] an entirely different substance from its isomer ammonium cyanate [NH4·O·C:N], although the material substrate of each of these molecular units consists of precisely the same number and kinds of atoms. Similarly, it is the atomic entelechy which gives to the isotopes of Strontium chemical properties different from those of the isotopes of Rubidium, although the mass and corpuscular (electronic and protonic) composition of their respective atoms are identical. It is the vital entelechy or soul, which causes a fragment cut from a Stentor to regenerate its specific protoplasmic architecture instead of the type which would be regenerated from a similar fragment cut from another ciliate such as Dileptus.
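The urea/ammonium cyanate example can be made concrete by counting atoms: both formulas reduce to the same inventory (one carbon, one oxygen, two nitrogens, four hydrogens), so whatever distinguishes the two substances is structural rather than material. The sketch below uses a deliberately minimal formula parser, a hypothetical helper written for this illustration that handles only single-level parentheses:

```python
import re
from collections import Counter

def composition(formula):
    """Atom counts for a chemical formula with at most single-level
    parentheses, e.g. 'CO(NH2)2' or 'NH4OCN'. A minimal illustrative
    parser, not a general-purpose one."""
    def expand(match):
        inner, mult = match.group(1), int(match.group(2) or "1")
        # Rewrite a group like (NH2)2 as N2H4 before the final count.
        return "".join(
            el + str(int(num or "1") * mult)
            for el, num in re.findall(r"([A-Z][a-z]?)(\d*)", inner)
        )
    flat = re.sub(r"\(([^)]*)\)(\d*)", expand, formula)
    counts = Counter()
    for el, num in re.findall(r"([A-Z][a-z]?)(\d*)", flat):
        counts[el] += int(num or "1")
    return counts

urea = composition("CO(NH2)2")          # O=C(NH2)2
ammonium_cyanate = composition("NH4OCN")

# Identical material substrate: the same number and kinds of atoms.
print("urea:            ", dict(urea))
print("ammonium cyanate:", dict(ammonium_cyanate))
print("same composition:", urea == ammonium_cyanate)
```

The equality of the two Counters is exactly the text's point: composition alone cannot distinguish the isomers, so the difference must lie in how the same atoms are bound together.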
In all the tridimensional units of nature, both living and non-living, the hylomorphic analysis of Aristotle recognizes an essential dualism of matter and entelechy. Hence it is not in the presence and absence of an entelechy (as Driesch contends) that living organisms differ from inorganic units. The sole difference between these two classes
of units is one of autonomy and inertia. The inorganic unit is inert, not in the sense that it is destitute of energy, but in the sense that it is incapable of self-regulation and rigidly dependent upon external factors for the utilization of its own energy-content. The living unit, on the other hand, is endowed with dynamic autonomy. Though dependent, in a general way, upon environmental factors for the energy which it utilizes, nevertheless the determinate form and direction of its activity is not imposed in all its specificity by the aforesaid environmental factors. The living being possesses a certain degree of independence with respect to these external forces. It is autonomous with a special law of immanent finality or reflexive orientation, by which all the elements and energies of the living unit are made to converge upon one and the same central result, namely, the maintenance and development of the organism both in its capacity as an individual and in its capacity as the generative source of its racial type.
The entelechies of the inert units of inorganic nature turn the forces of these units in an outward direction, so that they are incapable of operating upon themselves, of modifying themselves, or of regulating themselves. They are only capable of operating upon other units outside themselves, and in so doing they irreparably externalize their energy-contents. All physicochemical action is transitive or communicable in character, whereas vital action is of the reflexive or immanent type. Mechanical action, for example, is intermolar (i.e. an exchange between large masses of inorganic matter); physical action is intermolecular; chemical action is interatomic; while in radioactive and electrical phenomena we have intercorpuscular action. Hence all the forms of activity native to the inorganic world are reducible to interaction between discontinuous and unequally energized masses or particles. Always it is a case of one mass or particle operating upon another mass or particle distinct from, and spatially external to, itself. The effect or positive change produced by the action is received into another unit distinct from the agent or active unit, which can never become the receptive subject of the effect generated by its own activity. The living being, on the contrary, is capable of operating upon itself, so that what is modified
by the action is not outside the agent but within it. The reader does not modify the book, but modifies himself by his reading. The blade of grass can nourish not only a horse, but its very self, whereas a molecule of sodium nitrate is impotent to nourish itself, and can only nourish a subject other than itself, such as the blade of grass. Here the active source and receptive subject of the action is one and the same unit, namely, the living organism, which can operate upon itself in the interest of its own perfection. In chemical synthesis two substances interact to produce a third, but in vital assimilation one substance is incorporated into another without the production of a third. Thus hydrogen unites with oxygen to produce water. But in the case of assimilation the reaction may be expressed thus: Living protoplasm plus external nutriment equals living protoplasm increased in quantity but unchanged in specificity. Addition or subtraction alters the nature of the inorganic unit, but does not change the nature of the living unit. In chemical change, entelechy is the variant and matter is the constant, but in metabolic change, matter is the variant and entelechy the constant. “Living beings,” says Henderson, “preserve, or tend to preserve, an ideal form, while through them flows a steady stream of energy and matter which is ever changing, yet momentarily molded by life; organized, in short.” (“Fitness of the Environment,” 1913, pp. 23, 24.) The living unit maintains its own specific type amid a constant flux of matter and flow of energy. It subjugates the alien substances of the inorganic world, eliminates their mineral entelechies and utilizes their components and energies for its own purposes. The soul or vital entelechy, therefore, is more powerful than the entelechies of inorganic units which it supplants. 
It turns the forces of living matter inward, so that the living organism becomes capable of self-regulation and of striving for the attainment of self-perfection. It is this reflexive orientation of all energies towards self-perfection that is the unique characteristic of the living being, and not the nature of the energies themselves. The energies by which vital functions are executed are the ordinary physicochemical energies, but it is the vital entelechy or soul which elevates them to a higher plane of efficiency and renders them capable of reflexive or vital action. There is, in short, no such thing as a special vital force. The radical difference
between living and non-living units does not consist in the possession or non-possession of an entelechy, nor yet in the peculiar nature of the forces displayed in the execution of vital functions, but solely in the orientation of these forces towards an inner finality.
§ 7. The Definition of Life
Life, then, may be defined as the capacity of reflexive or self-perfective action. In any action, we may distinguish four things: (1) the agent, or source of the action; (2) the activity or internal determination differentiating the agent in the active state from the selfsame agent in the inactive state; (3) the patient or receptive subject; (4) the effect or change produced in the patient by the agent. Let us suppose that a boy named Tom kicks a door. Here Tom is the agent, the muscular contraction in his leg is the activity, the door is the patient or recipient, while the dent produced in the door is the effect or change of which the action is a production. In this action, the effect is produced not in the cause or agent, but in a patient outside of, and distinct from, the agent, and the otherness of cause and effect is consequently complete. Such an action is termed transitive, which is the characteristic type of physicochemical action. In another class of actions, however, (those, namely, that are peculiar to living beings) the otherness of cause and effect is only partial and relative. When the agent becomes ultimately the recipient of the effect or modification wrought by its own activity, that is, when the positive change produced by the action remains within the agent itself, the action is called immanent or reflexive action. Since, however, action and passion are opposites, they can coëxist in the same subject only upon condition that said subject is differentiated into partial otherness, that is, organized into a plurality of distinct and dissimilar parts or components, one of which may act upon another. Hence only the organized unit or organism, which combines unity or continuity of substance with multiplicity and dissimilarity of parts is capable of immanent action. The inorganic unit is capable only of transitive action, whose effect is produced in an exterior subject
really distinct from the agent. The living unit or organism, however, is capable of both transitive action and immanent (reflexive) action. In such functions as thought and sensation, the living agent modifies itself and not an exterior patient. In the nutritive or metabolic function the living being perfects itself by assimilating external substances to itself. It develops, organizes, repairs, and multiplies itself, holding its own and perpetuating its type from generation to generation.
Life, accordingly, is the capacity of tending through any form of reflexive action to an ulterior perfection of the agent itself. This capacity of an agent to operate of, and upon, itself for the acquisition of some perfection exceeding its natural equilibrial state is the distinctive attribute of the living being. Left to itself, the inorganic unit tends exclusively to conservation or to loss, never to positive acquisition in excess of equilibrial exigencies; what it acquires it owes exclusively to the action of external factors. The living unit, on the contrary, strives in its vital operations to acquire something for itself, so that what it gets it owes to itself and not (except in a very general sense) to the action of external factors. All the actions of the living unit, both upon itself and upon external matter, result sooner or later in the acquisition on the part of the agent of a positive perfection exceeding and transcending the mere exigencies of equilibration. The inorganic agent, on the contrary, when in the state of tension, tends only to return to the equilibrial state by alienation or expenditure of its energy; otherwise, it tends merely to conserve, by virtue of inertia, the state of rest or motion impressed upon it from without. In the chemical changes of inorganic units, the tendency to loss is even more in evidence. Such changes disrupt the integrity of the inorganic unit and dissipate its energy-content, and the unit cannot be reconstructed and recharged, except at the expense of a more richly endowed inorganic unit. The living organism, however, as we see in the case of the paramæcium undergoing endomixis, is capable of counteracting exhaustion by recharging itself.
The difference between transitive and reflexive action is not an accidental difference of degree, but an essential difference of kind. In reflexive actions, the source of the action and the recipient of the effect or modification produced by it are one and the same
substantial unit or being. In transitive actions, the receptive subject of the positive change is an alien unit distinct from the unit, which puts forth the action. Hence a reflexive action is not an action which is less transitive; it is an action which is not at all transitive, but intransitive. The difference, therefore, between the living organism, which is capable of both reflexive and transitive action, and the inorganic unit, which is only capable of transitive action, is radical and essential. This being the case, an evolutionary transition from an inert multimolecule to a reflexively-operating cell or cytode, becomes inconceivable. Evolution might, at the very most, bring about intensifications and combinations of the transitive agencies of the physicochemical world, but never the volte face, which would be necessary to reverse the centrifugal orientation of forces characteristic of the inorganic unit into the centripetal orientation of forces which makes the living unit capable of self-perfective action, self-regulation, and self-renewal. The idea, therefore, of a spontaneous derivation of living units from lifeless colloidal multimolecules must be rejected, not merely because it finds no support in the facts of experience, but also because it is excluded by aprioristic considerations.
§ 8. An Inevitable Corollary
But, if inorganic matter is impotent to vitalize itself by means of its native physicochemical forces, the inevitable alternative is that the initial production of organisms from inorganic matter was due to the action of some supermaterial agency. Certain scientists, like Henderson of Harvard, while admitting the incredibility of abiogenesis, prefer to avoid open conflict with mechanism and materialism by declaring their neutrality. “But while biophysicists like Professor Schäfer,” says Henderson, “follow Spencer in assuming a gradual evolution of the organic from the inorganic, biochemists are more than ever unable to perceive how such a process is possible, and without taking any final stand prefer to let the riddle rest.” (“Fitness of the Environment,” p. 310, footnote.) Not to take a decisive stand on this question, however, is tantamount to making a
compromise with what is illogical and unscientific; for both logic and the inductive trend of biological facts are arrayed against the hypothesis of spontaneous generation.
In the first place, it is manifest that organic life is neither self-explanatory nor eternal. Hence it must have had its origin in the action of some external agency. Life as it exists today depends upon the precedence of numerous unbroken chains of consecutive cells that extend backward into a remote past. It is, however, a logical necessity to put an end to this retrogradation of the antecedents upon which the actual existence of our present organisms depends. The infinite cannot be spanned by finite steps; the periodic life-process could not be relayed through an unlimited temporal distance; and a cellular series which never started would never arrive. Moreover, we do not account for the existence of life by extending the cellular series interminably backward. Each cell in such a series is derived from a predecessor, and, consequently, no cell in the series is self-explanatory. When it comes to accounting for its own existence, each cell is a zero in the way of explanation, and adding zeros together indefinitely will never give us a positive total. Each cell refers us to its predecessor for the explanation of why it exists, and none contains within itself the sufficient explanation of its own existence. Hence increasing even to infinity the number of these cells (which fail to explain themselves) will give us nothing else but a zero in the way of explanation. If, therefore, the primordial cause from which these cellular chains are suspended is not the agency of the physicochemical forces of inorganic nature, it follows that the first active cause of life must have been a supermaterial and extramundane agency, namely, the Living God and Author of Life.
As a matter of fact, no one denies that life has had a beginning on our globe. The physicist teaches that a beginning of our entire solar system is implied in the law of the degradation of energy, and various attempts have been made to determine the time of this beginning. The older calculations were based on the rate of solar radiation; the more recent ones, however, are based on quantitative estimates of the disintegration products of radioactive elements. Similarly, the geologist and the astronomer propound theories of a
gradual constitution of the cosmic environment, which organic life requires for its support, and all such theories imply a de novo origin or beginning of life in the universe. Thus the old nebular hypothesis of Laplace postulated a hot origin of our solar system incompatible with the coëxistence of organic life, which, as the experiments of Pasteur and others have shown, is destroyed, in all cases, at a temperature just above 45° Centigrade (113° Fahrenheit). Even the enzymes or organic catalysts, which are essential for biochemical processes, are destroyed at a temperature between 60° and 70° Centigrade. This excludes the possibility of the contemporaneousness of protoplasm and inorganic matter, and points to a beginning of life in our solar system. Moreover, independently of this theory, the geologist sees in the primitive crystalline rocks (granites, diorites, basalts, etc.) and in the extant magmas of volcanoes evidences of an azoic age, during which temperatures incompatible with the survival of even the blue-green algæ or the most resistant bacterial spores must have prevailed over the surface of the globe. In fact, it is generally recognized by geologists that the igneous or pyrogenic rocks, which contain no fossils, preceded the sedimentary or fossiliferous rocks. The new planetesimal hypothesis, it is true, is said to be compatible with a cold origin of the universe. Nevertheless, this theory assumes a very gradual condensation of our cosmos out of dispersed gases and star dust, whereas life demands as the sine qua non condition of its existence a differentiated environment consisting of a lithosphere, a hydrosphere, and an atmosphere. Hence, it is clear that life did not originate until such an appropriate environment was an accomplished fact. All theories of cosmogony, therefore, point to a beginning of life subsequent to the constitution of the inorganic world.
Now, it is impossible for organic life to antecede itself. If, therefore, it has had a beginning in the world, it must have had a first active cause distinct from itself; and the active cause, in question, must, consequently, have been either something intrinsic, or something extrinsic, to inorganic matter. The hypothesis, however, of a spontaneous origin of life through the agency of forces intrinsic to
inorganic matter is scientifically untenable. Hence it follows that life originated through the action of an immaterial or spiritual agent, namely, God, seeing that there is no other assignable agency capable of bringing about the initial production of life from lifeless matter.
§ 9. Futile Evasions
Many and various are the efforts made to escape this issue. One group of scientists, for example, attempt to rid themselves of the difficulty by diverting our attention from the problem of a beginning of organic life in the universe to the problem of its translation to a new habitat. This legerdemain has resulted in the theories of cosmozoa or panspermia, according to which life originates in a favorable environment, not by reason of spontaneous generation, but by reason of importation from other worlds. This view has been presented in two forms: (1) the “meteorite” theory, which represents the older view held by Thomson and Helmholtz; (2) the more recent theory of “cosmic panspermia” advocated by Svante Arrhénius, with H. E. Richter and F. J. Cohn as precursors. Sir Wm. Thomson suggested that life might have been salvaged from the ruins of other worlds and carried to our own by means of meteorites or fragments thrown off from life-bearing planets that had been destroyed by a catastrophic collision. These meteorites discharged from bursting planets might carry germs to distant planets like the earth, causing them to become covered with vegetation. Against this theory stands the fatal objection that the transit of a meteorite from the nearest stellar system to our own would require an interval of 60,000,000 years. It is incredible that life could be maintained through such an enormous lapse of time. Even from the nearest planet to our earth the duration of the journey would be 150 years. Besides, meteorites are heated to incandescence while passing through the atmosphere, and any seeds they might contain would perish by reason of the heat thus generated, not to speak of the terrific impact, which terminates the voyage of a meteorite.
Arrhénius suggests a method by which microörganisms might be conveyed through intersidereal space with far greater dispatch and without any mineral vehicle such as a meteorite. He notes that particles of cosmic dust leave the sun as a coronal atmosphere and are propelled through intervening space by the pressure of radiation until they reach the higher atmosphere of the earth (viz. at a height of 100 kilometers from the surface of the latter), where they become the electrically charged dust particles of polar auroras (v.g. the aurora borealis). The motor force, in this case, is the same as that which moves the vanes of a Crookes’ radiometer. Lebedeff has verified Clerk-Maxwell’s conceptions of this force and has demonstrated its reality by experiments. It is calculated that in the immediate vicinity of a luminous surface like that of the sun the pressure exerted by radiation upon an exposed surface would be nearly two milligrams per square centimeter. On a nontransparent particle having a diameter of 1.5 microns, the pressure of radiation would just counterbalance the force of universal gravitation, while on particles whose diameter was 0.16 of a micron, the pressure of radiation would be ten times as great as the pull of gravitation. Now bacterial spores having a diameter of 0.3 to 0.2 of a micron are known to bacteriologists, and the ultramicroscope reveals the presence of germs not more than 0.1 of a micron in size.[11] Hence it is conceivable that germs of such dimensions might be wafted to the limits of our atmosphere, and might then be transported by the pressure of radiation to distant planets or stellar systems, provided, of course, they could escape the germicidal action of oxidation, desiccation, ultra-violet rays, etc. Arrhénius calculates that their journey from the earth to Mars would, under such circumstances, occupy a period of only 20 days.
Within 80 days they could reach Jupiter, and they might arrive at Neptune on the confines of our solar system after an interval of 14 months. The transit to the constellation of the Centaur, which contains the solar system nearest to our own (the one, namely, whose central sun is the star Alpha), would require 9,000 years.
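The proportion among Arrhénius’ figures admits of a simple check by a scaling argument (a modern gloss, not part of the original text): on an opaque sphere of diameter d, the force of radiation varies with the cross-section, the pull of gravitation with the mass, so that their ratio varies inversely with the diameter.

```latex
% Scaling of solar radiation pressure against solar gravitation on a
% small opaque sphere of diameter d. Both forces fall off as the
% inverse square of the distance from the sun, so their ratio is
% independent of that distance:
\[
  F_{\mathrm{rad}} \propto d^{2}, \qquad
  F_{\mathrm{grav}} \propto d^{3}, \qquad
  \frac{F_{\mathrm{rad}}}{F_{\mathrm{grav}}} \propto \frac{1}{d}.
\]
% Check against the figures quoted above: if the two forces balance
% at d = 1.5 microns, then at d = 0.16 of a micron the ratio becomes
% 1.5 / 0.16, or about 9.4 -- radiation pressure roughly ten times
% the pull of gravitation, as the text states.
```

It is this inverse proportion which explains why only the very smallest spores could be driven outward by the sun’s light.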
Arrhénius’ theory, however, that “life is an eternal rebeginning” explains nothing and leaves us precisely where we were. In the
metaphysical as well as the scientific sense, it is an evasion and not a solution. To the logical necessity of putting an end to the retrogradation of the subalternate conditions, upon which the realities of the present depend for their actual existence, we have already adverted. Moreover, the reasons which induce the scientist to postulate a beginning of life in our world are not based on any distinctive peculiarity of that world, but are universally applicable, it being established by the testimony of the spectroscope that other worlds are not differently constituted than our own. Hence Schäfer voices the general attitude of scientific men when he says: “But the acceptance of such theories of the arrival of life on earth does not bring us any nearer to a conception of its actual mode of origin; on the contrary, it merely serves to banish the investigation of the question to some conveniently inaccessible corner of the universe and leaves us in the unsatisfactory condition of affirming not only that we have no knowledge as to the mode of origin of life—which is unfortunately true—but that we never can acquire such knowledge—which it is to be hoped is not true. Knowing what we know, and believing what we believe, we are, I think (without denying the possibility of the existence of life in other parts of the universe), justified in regarding these cosmic theories as inherently improbable.” (Dundee Address of 1912, cf. Smithson. Inst. Rpt. for 1912, p. 503.)
Dismissing, therefore, all evasions of this sort, we may regard as scientifically established the conclusion that, so far as our knowledge goes, inorganic nature lacks the means of self-vivification, and that no inanimate matter can become living matter without first coming under the influence of matter previously alive. Given, therefore, that the conditions favorable to life did not always prevail in our cosmos, it follows that life had a beginning, for which we are obliged to account by some postulate other than abiogenesis. This conclusion seems inescapable for those who concede the scientific absurdity of spontaneous generation, but, by some weird freak of logic, not only is it escaped, but the very opposite conclusion is reached through reasoning, which the exponents are pleased to term philosophical, as distinguished from scientific, argumentation. The plight of these
“hard-headed worshippers of fact,” who plume themselves on their contempt for “metaphysics,” is sad indeed. Worsted in the experimental field, they appeal the case from the court of facts to that of aprioristic philosophy. “Physic of metaphysic begs defence, and metaphysic calls for aid on sense!”
Life, they contend, either had no beginning or it must have begun in our world as the product of spontaneous generation. But all the scientific theories of cosmogony exclude the former alternative. Consequently, not only is it not absurd to admit spontaneous generation, but, on the contrary, it is absurd not to admit it. It is in this frame of mind that August Weismann is induced to confide to us “that spontaneous generation, in spite of all the vain attempts to demonstrate it, remains for me a logical necessity.” (“Essays,” p. 34, Poulton’s Transl.) The presupposition latent in all such logic is, of course, the assumption that nothing but matter exists; for, if the possibility of the existence of a supermaterial agency is conceded, then obviously we are not compelled by logical necessity to ascribe the initial production of organic life to the exclusive agency of the physicochemical energies inherent in inorganic matter. Weismann should demonstrate his suppressed premise that matter coincides with reality and that spiritual is a synonym for nonexistent. Until such time as this unverified and unverifiable affirmation is substantiated, the philosophical proof for abiogenesis is not an argument at all, it is dogmatism pure and simple.
But, they protest, “To deny spontaneous generation is to proclaim a miracle” (Nägeli), and natural science cannot have recourse to “miracles” in explaining natural phenomena. For the “scientist,” miracles are always absurd as contradicting the uniformity of nature, and to recur to them for the solution of a scientific problem is, to put it mildly, distinctly out of the question. Hence Haeckel regards spontaneous generation as more than demonstrated by the bare consideration that no alternative remains except the unspeakable scientific blasphemy implied in superstitious terms like “miracle,” “creation,” and “supernatural.” For a “thinking man,” the mere mention of these abhorrent words is, or ought to be, argument enough. “If we do not accept the hypothesis of spontaneous