In Southern California, Japan, Indonesia, or other areas around the world prone to earthquakes, people at risk face vast uncertainty about when, where, and how strong the next one will be. Throughout human history, this combination of immense uncertainty and potential for profound human consequences has stimulated efforts to predict earthquakes, with methods ranging from interpretation of animal movements to modern techniques based on sophisticated seismic monitoring and computational modelling. Yet our ability to make useful earthquake forecasts remains limited.
In large part, this simply reflects the complexity of the earthquake process, which depends on precise details of highly irregular fault zones and local geology, as well as chance events – the initial slip or rupture of rocks at one location rather than another. But progress on scientific prediction has also been hampered by a lack of sound methods for comparing the accuracy of forecasts, or even determining if a particular prediction was correct or not. If method X predicts a large quake in Southern California in six months, and a medium-size quake strikes in Central California in eight months, is this a success or failure? Worse still, insufficiently detailed forecasts leave room for subtle adjustment after the fact, making them appear more accurate than is justified.
For these and other reasons, an international group of earthquake scientists joined forces ten years ago to create a global infrastructure to help compare future prediction methods in an open and objective way. The aim of the Collaboratory for the Study of Earthquake Predictability (CSEP) is to provide the conditions for systematic improvement of prediction methods. In a recent paper, leaders of the collaboration including LML Fellows Max Werner and Jiancang Zhuang reviewed the successes and lessons learned from their activities over the first decade. In a brief interview, I asked both of them to comment on various aspects of the ongoing work.
Mark Buchanan: What has the project learned over the first decade? Do any particular forecasting approaches seem to stand out as working better?
Jiancang Zhuang: Yes. One thing that our work has at least proven is that the clustering effect – the statistical clustering of earthquake activity in space and time – is the biggest predictable component in seismicity. We have found strong evidence that the locations of past shocks, particularly many small shocks recorded by dense sensor networks, contain significantly more predictive information for moderate-to-strong earthquakes over a 5- to 10-year period than other forecast approaches, including geological (fault-based), geodetic and tectonic models. The standard earthquake clustering model, which is called the Epidemic-Type Aftershock Sequence (ETAS) model, is now the basic tool for forecasting short-term seismicity. Most of the more informative forecasting models are extensions of this model derived by incorporating precursory information and/or seismicity features not normally included in the simpler model. Ultimately, rigorous testing of forecast models is required to improve our ability to forecast seismic hazard.
Mark Buchanan: Have you found any surprises?
Max Werner: My biggest surprise was finding that the textbook approach for physics-based forecasting of aftershocks could not compete with simpler statistical approaches. The textbook theory holds that a large earthquake induces a change in the strain field in the surrounding crust, and that earthquakes are triggered where the strain change encourages further ruptures on pre-existing fractures. Hundreds or even thousands of papers have been published ‘explaining’ aftershocks and their anisotropic spatial arrangement around a mainshock with this theory. In a study published in 2011, CSEP asked: how informative are quantitative forecasts of this theory as compared to benchmark statistical models, which forecast aftershocks in a simpler, isotropic manner around a main shock? We found to our surprise that the simpler (statistical) models far outperformed the physics-based models.
Our finding sat uncomfortably with most of us. Was the theory really wrong, despite its intuitive and basic physical mechanism? Was this an outlier aftershock sequence? Luckily, the result spurred significant model development and a few years later, armed with new and improved physics-based forecast models, CSEP repeated the exercise, this time on the complex 2010-2012 Christchurch, New Zealand, earthquake sequence. This time (2018), we found that the physics-based models had caught up, and the leading physics-based forecast model was more informative than a statistical benchmark model. The physics-based models now better captured uncertainty in input data and parameters, and considered additional sources of strain from aftershocks during the evolving sequence. But the experiment was by no means a clear win for the physics-based theory: the leading statistical model still outperformed the leading physics-based model, though not by a lot!
These results challenged my own intuition and beliefs, and they taught me that earthquakes are (or appear to be) sufficiently random that we need to hold our favourite models and theories to the fire, repeatedly, to regain some objectivity.
Mark Buchanan: I understand that one limiting factor in current forecasting research is a lack of earthquake data. Do you have plans to push for better data?
Jiancang Zhuang: Yes, of course. The observed data reflect the true underlying seismicity only in an extremely approximate way. But models have to be tested against observational data. To improve the situation, some effort is being directed to constructing models that explicitly recognize and try to correct for known data deficits. But we also have plans to improve the quality of data, which is currently the main limitation in testing of earthquake forecasts. The project has plans to extend spatial coverage by encouraging forecast testing in other regions with good earthquake catalogues (e.g., seismic belts of Asia and South America), as well as globally. Also, we aim to extend temporal coverage by expanding retrospective testing capabilities to take advantage of well-recorded aftershock sequences and other datasets.
Besides modern instrumental data, we also need to consider paleoseismic data. Big earthquakes in small regions have recurrence intervals of hundreds of years. This requires us to investigate the completeness of fault databases and collect data globally for each individual fault rupture history, in order to make robust evaluations of hypotheses about the earthquakes that society cares most about.
Mark Buchanan: The basic idea of CSEP seems simple: find ways to make legitimate unbiased tests and comparisons of models by forcing them to predict future outcomes. What makes this harder to achieve in practice?
Max Werner: In my view, the hardest part is – perhaps surprisingly – defining the future outcome that researchers are forecasting. I’ll try to illustrate with an example. Many widely respected models make statements about future big earthquakes on geological faults, say the ‘Big One’ on the San Andreas Fault in California. To make such forecasts testable – meaning that a computer can retrieve independent data and decide if a future earthquake counts as a qualifying success or not – we need precise definitions of ‘big’, ‘fault’ and ‘earthquake’. This seems trivial, but there are many complications. For example, earthquake size can be measured by one of several magnitude scales, or by spatial extent. Defining the magnitude scale is easy in principle, but researchers usually like a little wiggle room, risking confirmation bias. Spatial extent is much harder, but is often implicit in forecasts (“the Big One will break the ‘southern’ San Andreas”). Measuring the spatial rupture extent of an earthquake is an inverse problem plagued by uncertainties: different research groups produce different estimates from different assumptions, input data, and approaches.
Problems continue with the definition of ‘fault’. Geologists trace them on the surface, but how do they look at depth? Are they infinitely thin planes as in most models, or are they distributed 3D structures embedded in fractal-like networks? This is a big issue if a computer is to decide whether any future earthquake fits the bill. In 1989, the M6.9 Loma Prieta earthquake in California occurred ‘right next’ to the San Andreas Fault. Whether ‘right next to’ qualifies as a direct hit or not continues to be debated today.
In short, making forecasts testable in a prospective manner requires very precise definitions, but most modellers have certain types of earthquakes in mind. Reality tends to be much more complex. To get started a decade ago, we had to agree to treat earthquakes as points, characterized by their magnitude, occurrence time, and location of rupture nucleation (rather than extent). This was much less controversial, but it left many important models and hypotheses beyond our testing programme. Testing these is a top priority for the second phase of CSEP.
The paper is available here.