Lately, I’ve been thinking about how interactions between organisms in an ecosystem can be represented as a graph of nodes and edges, much like a social network. Each node represents an organism or group of organisms, and each edge represents a relationship between two of them. For example, a graph of an African savanna ecosystem would have lion and zebra as two nodes connected by an edge representing a predator/prey relationship. I began to wonder whether some of the methods for gaining insights into social networks could be applied in ecology. There is an entire field of mathematics, graph theory, devoted to analyzing networks. These analyses can identify things like important nodes and sub-networks. Could I use this math to identify important species? If the answer is yes, can I then use these derived characteristics to train a learner to identify important species in a database of interactions?
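To make the idea concrete, here is a minimal sketch of the lion/zebra example as a graph, using NetworkX, a Python graph library. Only the lion and zebra pair comes from the example above; the other species, the `eaten_by` label, and the helper names `build_web` and `rank_by_degree` are assumptions I made up for illustration. Degree centrality is just one of many possible measures of "importance", and nothing here says it is the right one for ecology.

```python
import networkx as nx

# Toy savanna food web. Edges point from the resource (prey) to the consumer.
# Only lion/zebra is from the post; the remaining rows are illustrative.
interactions = [
    ("grass", "zebra", "eaten_by"),
    ("grass", "wildebeest", "eaten_by"),
    ("zebra", "lion", "eaten_by"),
    ("wildebeest", "lion", "eaten_by"),
    ("wildebeest", "hyena", "eaten_by"),
]


def build_web(records):
    """Build a directed graph from (source, target, interaction) tuples."""
    web = nx.DiGraph()
    for source, target, kind in records:
        web.add_edge(source, target, interaction=kind)
    return web


def rank_by_degree(web):
    """Rank nodes by degree centrality (in-degree plus out-degree, normalized)."""
    scores = nx.degree_centrality(web)
    # Sort by score descending, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


web = build_web(interactions)
ranking = rank_by_degree(web)
betweenness = nx.betweenness_centrality(web)

for taxon, score in ranking:
    print(f"{taxon}: degree={score:.2f}, betweenness={betweenness[taxon]:.3f}")
```

In this toy web the wildebeest comes out on top simply because it touches the most edges, which hints at the real question: whether a measure like this tracks ecological importance or just how thoroughly a species has been recorded.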
Interesting questions. Where to start? Fortunately, there is a major species interaction database called GloBI that I can use as a data set. Several ecosystems have been studied extensively and have data on the relative importance of taxa, which I can use as training and test data sets. I’m not the first person to think along these lines. Lundgren and Olesen, Estrada and Bodin, Gonzalez et al., Dunne et al., Steinhaeuser and Chawla, and Jordan et al. have all published studies of the network structure of ecosystem graphs. Their work gives me some hope that this might actually work. My contribution will be designing a learner that can identify important taxa in a database of interactions not expressly designed for this purpose.
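As a rough idea of what "derived characteristics" could look like as input to a learner, the sketch below turns a small interaction graph into one feature row per taxon using NetworkX. The choice of features, the function name `taxon_features`, and the toy rocky intertidal web are all my own assumptions for illustration; none of this is taken from GloBI or from the studies mentioned above.

```python
import networkx as nx


def taxon_features(web):
    """Return {taxon: {feature_name: value}} for a directed interaction graph."""
    betweenness = nx.betweenness_centrality(web)
    # Closeness is computed on the undirected view so that every taxon in a
    # connected web gets a non-zero score regardless of edge direction.
    closeness = nx.closeness_centrality(web.to_undirected())
    return {
        taxon: {
            "in_degree": web.in_degree(taxon),    # number of resources used
            "out_degree": web.out_degree(taxon),  # number of consumers
            "betweenness": betweenness[taxon],
            "closeness": closeness[taxon],
        }
        for taxon in web.nodes
    }


# Hypothetical toy web; edges point from the resource (prey) to the consumer.
web = nx.DiGraph([
    ("algae", "limpet"),
    ("plankton", "mussel"),
    ("plankton", "barnacle"),
    ("mussel", "sea star"),
    ("barnacle", "sea star"),
    ("limpet", "sea star"),
    ("barnacle", "whelk"),
    ("whelk", "sea star"),
])

features = taxon_features(web)
for taxon, row in features.items():
    print(taxon, row)
```

Each row could then be paired with a label from a well-studied ecosystem (important or not) to form a training example. Which features actually carry signal is exactly the open question.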
I will be doing the analysis with the Python NetworkX library. For training and testing, I would like to focus on ecosystems of different sizes, granularities, and types. I want to capture important predation, habitat creation, and pollination interactions. I think I’ll start with rocky intertidal ecosystems and Yellowstone National Park, both of which have been well studied. The first task will be getting the data into a usable format for analysis. Stay tuned!