Complex systems of many components – the internet, economies, cells, ecosystems, and so on – depend on the rich web of interactions among many parts. In analysing such systems, however, it is natural that scientists often want to rank the components in terms or their importance or influence using some quantitative measure, which can be either local or global. For example, the simplest local measure is the degree of each element, or node, reflecting the number and strength of an element’s immediate contacts. In contrast,…global measures such as several types of centrality offer a more holistic view – a node’s importance depends on how fast information can travel through it from and towards the outer regions of the network. Unfortunately, as LML Fellow Fabio Caccioli and colleagues Silvia Bartolucci, Francesco Caravelli and Pierpaolo Vivo note in a recent paper, such global measures are usually more difficult to compute accurately, and are also sensitive to minute details of a network’s organisation.
The definition of most of these global measures, the researchers point out, rest on the idea that important or influential elements in a system should be closely linked to other important or influential elements. In ecology, for instance, species can be assigned a natural “influence” (trophic) level in the food-chain hierarchy, with apex predators being high-up in the food chain because they either eat top-level predators themselves, or because they consume a large number of medium- or low-level species. Similarly, the PageRank algorithm used by Google classifies web pages as very relevant if they are linked to by other very relevant addresses, or by very many other pages. Other examples occur in economics. In each case, there’s a self-consistency of influence across the whole network. Influence, in this sense, is an intrinsically non-local quantity which depends on the full set of interactions in the network.
Yet, the authors point out, practical calculations of such quantities often run into serious obstacles. For example, calculations often require the inversion of a large and ill-conditioned matrix, which makes a numerical approach computationally intensive and prone to inaccuracies. In other cases, estimates of influence measures require the complete and accurate knowledge of all pairwise interactions strengths in the network, which may be not realistically achieved in many real-life settings.
In their paper, Caccioli and colleagues address the possibility of overcoming this problem by finding a way to efficiently and accurately rank constituents of a system we know very little about. In a variety of important contexts, they show how this can be done with high accuracy using only local information. In particular, they demonstrate that the detailed knowledge of the complete state of the network is often irrelevant. Their analysis focuses on the interaction matrix A reflecting the pairwise interactions between a systems components. Mathematically, it turns out, a non-negative and sub-stochastic interaction matrix A often displays a “large” Perron-Frobenius eigenvalue, which is well separated from the bulk of all the others. When this happens, A can be faithfully approximated by a rank-1 matrix A^, which retains only some local information stored in A. As long as the network of interactions is “not too sparse”, this approximation dramatically reduces the amount of detailed information needed to rank constituents as fast as possible, without compromising accuracy. Moreover, it makes it possible to accurately approximate all sorts of non-linear functions of the interaction matrix, as the authors demonstrate explicitly in the case of exponential and power-law centrality measures.
Overall, the work establishes that a large class of nominally global rankings may be accurately estimated using only local information, and may not require the full knowledge – or even an accurate reconstruction – of the whole set of pairwise interactions. The accuracy of the approximation depends on the existence of a “large” spectral gap in the interaction matrix, a condition the authors believe is not too hard to find in many empirical data sets.
A preprint of the paper is available at https://arxiv.org/abs/2009.06307
Leave a Reply