February 21, 2022
What is important in an Interconnected World?
By Tom Magelinski
Tags: Centrality; Community Structure; Community-Aware Centrality; Modularity Vitality; Network Robustness
Image caption: blue nodes connect across a sphere shaped network
Image credit: Canva
Which power generation stations need to be fortified? What social changes can we make to slow a pandemic? Which parts of the supply chain are most vulnerable? All of these questions can be answered through network analysis, specifically by analyzing how a network responds when nodes, the entities modeled in the network, are removed. If a power station is shut down and the network is no longer able to distribute power appropriately, the station in question is very important and should be fortified. If a port in a supply chain is closed and the flow of goods grinds to a halt, this is an important vulnerability. Experimentally removing all combinations of the nodes in a large network is infeasible, so how can we find these critical nodes? In our most recent work, we have improved our ability to answer this question by linking two of the most important areas of research in network science: community detection and centrality.
Historically, the question of node importance in a network is answered through network centrality measures. These measures give each node in a network a score that reflects how important it is. Some measures are simple: “degree centrality” simply counts the number of connections a node has. Other measures are complex, taking into account all the connections in the network. The more complex the measure, the more expensive it is to compute, and the more difficult it is to apply to very large networks.
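To make the simplest case concrete, here is a minimal sketch of degree centrality on a small made-up network. The function name `degree_centrality` and the toy edge list are illustrative assumptions, not code from our paper; the sketch counts distinct neighbors and ignores self-loops.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return the number of distinct neighbors of each node.

    `edges` is an undirected edge list of (u, v) pairs. Self-loops are
    skipped and repeated edges are counted once (assumptions of this sketch).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


# Hypothetical toy network: "a" touches three other nodes, "d" touches two.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
degree_scores = degree_centrality(toy_edges)
print(sorted(degree_scores.items(), key=lambda item: -item[1]))
```

Each edge is visited once, which is why this measure stays cheap even on very large networks.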
Classic centrality measures do a good job of identifying these critical nodes (those that harm networks when they are removed). However, recent work has shown that leveraging the community structure of a network can improve on classic approaches. Community structure is a core area of research in network science, which seeks to break down big complex networks into simpler networks by grouping nodes together and looking at the connections between groups. Research has found that many types of networks have strong group structure. For example, social networks modeling friendship can be divided into “friend groups” which have strong connections internally, and are weakly connected to other “friend groups”.
The work by da Cunha et al. has found that this structure can be exploited to find important nodes. Specifically, nodes that have connections outside of their group act like a “bridge” between communities. If this bridge is removed, the groups can no longer reach each other, and so the whole network has been harmed. While this research provides a method for determining important nodes, it does not give them a score, making it difficult to compare their importance to other nodes without further study. Also, the research leverages the idea of community bridges, but does not consider community hubs: nodes that are well connected within their own community.
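The distinction between bridges and hubs can be illustrated by splitting each node's connections into internal and external ones. The sketch below is only an illustration of that idea and is not the method of da Cunha et al.; the function name, the toy network (two triangles joined by one edge), and the community labels are all assumptions.

```python
from collections import defaultdict


def internal_external_degree(edges, community):
    """Return {node: (internal, external)} connection counts.

    `community` maps each node to a community label. An edge is internal
    when both endpoints share a label and external otherwise.
    """
    counts = defaultdict(lambda: [0, 0])
    for u, v in edges:
        slot = 0 if community[u] == community[v] else 1
        counts[u][slot] += 1
        counts[v][slot] += 1
    return {node: tuple(pair) for node, pair in counts.items()}


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
bridge_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
bridge_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
split = internal_external_degree(bridge_edges, bridge_groups)

# Nodes with at least one external connection are candidate bridges.
candidate_bridges = sorted(n for n, (_, ext) in split.items() if ext > 0)
print(candidate_bridges)
```

In this toy network only nodes 2 and 3 have an external connection, so they are the only candidate bridges.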
In our work, we developed “modularity vitality,” which allows us to find both community bridges and community hubs, while giving a score to each node. Modularity vitality builds on the concept of modularity, which is a way of measuring the quality of a network’s group structure. While the quality of group structure in a network can be quantified in a number of ways, modularity focuses on the number of internal connections. Internal connections are those which connect members of the same community. The more internal connections a network has, the easier it is to distinguish the communities. Conversely, external connections blur the boundaries between communities, and thus lower the quality of group structure observed in a network. Modularity then quantifies the fraction of internal edges a network has, while accounting for the number of internal connections we would see due to random chance.
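As a rough illustration of this definition, the sketch below computes the standard (Newman) modularity for an undirected, unweighted network: for each community, the fraction of edges that are internal minus the fraction expected by chance given the community's total degree. The function name and the toy network are assumptions for illustration, and the sketch assumes no self-loops.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a fixed partition.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2
    where m is the number of edges, L_c the number of internal edges of c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
q_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good_split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_good = modularity(q_edges, good_split)  # 5/14, roughly 0.357
q_single = modularity(q_edges, one_group)  # 0: no better than chance
print(q_good, q_single)
```

Grouping each triangle into its own community scores well above zero, while lumping every node into one community scores exactly zero, because the chance-expectation term cancels the internal-edge term.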
Modularity vitality quantifies the impact that each node has on the modularity value of the network. If a node has a positive modularity vitality, its presence in the network makes the modularity value higher. Nodes can make the modularity value higher when they have many internal connections, so these nodes are community hubs. If a node has a negative value, it is lowering the modularity value by having connections to nodes outside of its community. These are community bridges. In our paper, we have derived an efficient way of calculating modularity vitality scores, allowing us to test its effectiveness on networks with millions of nodes.
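The sign convention can be checked on a toy example. The sketch below is a brute-force version that recomputes modularity from scratch after deleting each node while keeping the community labels fixed; it is not the efficient calculation derived in our paper, and it would be far too slow for networks with millions of nodes. The function names and the toy network are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a fixed partition (undirected, unweighted)."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def modularity_vitality(edges, community):
    """Brute force: vitality(i) = Q(network) - Q(network without node i)."""
    base = modularity(edges, community)
    nodes = {node for edge in edges for node in edge}
    vitality = {}
    for node in nodes:
        # Deleting a node also deletes every edge that touches it.
        remaining = [(u, v) for u, v in edges if node not in (u, v)]
        vitality[node] = base - modularity(remaining, community)
    return vitality


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
mv_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
mv_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
mv_scores = modularity_vitality(mv_edges, mv_groups)

for node in sorted(mv_scores):
    print(node, round(mv_scores[node], 4))
```

Nodes 2 and 3, which carry the only connection between the two triangles, come out slightly negative (bridge-like), while the purely internal nodes 0, 1, 4, and 5 come out positive.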
We then tested modularity vitality’s ability to identify important nodes. The testing procedure iteratively removes the most important nodes and measures how disconnected the network becomes. The quicker a method can disconnect a network, the better it is at identifying important nodes. This is extremely similar to a disease simulation approach where important nodes are vaccinated against a disease which is spreading across the network; methods that best hamper simulated disease spread are considered the most effective.
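A minimal sketch of this kind of test is shown below. It makes two assumptions that the description above does not spell out: "how disconnected the network becomes" is measured as the fraction of the original nodes left in the largest connected component, and nodes are removed in a fixed, precomputed order. The function names and the toy network are also illustrative.

```python
from collections import defaultdict, deque


def largest_component_size(adjacency, removed):
    """Size of the largest connected component once `removed` is deleted."""
    seen = set(removed)
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def attack_curve(edges, ranking):
    """Largest-component fraction after each removal, in ranking order."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    total = len(adjacency)
    removed = set()
    curve = [largest_component_size(adjacency, removed) / total]
    for node in ranking:
        removed.add(node)
        curve.append(largest_component_size(adjacency, removed) / total)
    return curve


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
test_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
bridges_first = attack_curve(test_edges, [2, 3])  # remove the bridge nodes
others_first = attack_curve(test_edges, [0, 1])   # remove two internal nodes
print(bridges_first, others_first)
```

On this toy network, removing the two bridge nodes shrinks the largest component to a third of the original network, while removing two internal nodes leaves two thirds connected. A ranking method that drives the curve down faster is the better one under this test.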
Comparing Modularity Vitality to Other Methods
Modularity vitality was compared to five existing methods while studying the Pennsylvania road network. Nodes in this network represent all of the major intersections in Pennsylvania, while edges are the roads between them. There are 1.5 million roads in the dataset. We find that modularity vitality is the most successful method in finding vulnerabilities in the road network. These vulnerabilities are found to be community bridges. Now that we have better identified the most important intersections in the Pennsylvania road network, we know which intersections are the most crucial for traffic flow. If these were to be closed, due to construction or other reasons, transportation across the state would be affected.
We then tested modularity vitality on an online social network. This network consisted of 7.5 million Twitter users who communicated with each other in discussions about the Canadian Federal Election of 2019. In this network, we find that removing community bridges alone does not work. Instead, both community bridges and community hubs need to be removed. This finding has important implications for the diffusion of information online. We see that both community leaders and public figures who connect different groups are crucial to the spread of information online. Based on this, we see that strategies to mitigate the spread of misinformation and disinformation must address both types of users.
In summary, community structure can be leveraged to identify important nodes through modularity vitality. We demonstrate that modularity vitality is an effective way of finding community hubs and community bridges in networks with millions of nodes. We find that community bridges often play the most important role in networks, because they enable communities to exchange information, diseases, or resources, depending on the network in question. We also find that in online social networks community hubs play an important role because of the large size of online social communities and the number of connections that these users can make.
Modularity vitality is designed to work on networks where all nodes are of the same type, e.g., people connected to people. Now, we are working on expanding this so that we can apply modularity vitality to networks with different types of nodes, e.g., people connected to attributes. There are many sociological theories about how personal attributes relate to community structure. However, these theories have been difficult to test on social media due to the amount of data involved. We hope that the efficiency of modularity vitality will enable us to test how these social theories designed for small offline social networks apply to massive online social networks.