Limits to Learning
Big data is no doubt a buzzword, but “big” is a relative concept. The term is typically used in the context of machine learning – of algorithms (rather than scientists) detecting meaningful structure in data. In a high-dimensional, complex system no data set is big enough for this task: as the dimension of the problem grows, there is no way of keeping up with the exponential growth in the demand for data. LML Fellow Imre Kondor became interested in this problem in the context of financial risk measures, capital regulation, and portfolio optimization. But the relative scarcity of data in portfolio optimization is just one example among the many fields where we cannot have sufficient data, yet have to make inferences and decisions.
In some cases, underlying physical laws or biological plausibility enable researchers to boldly reduce the dimension of the problem and remove noise from the estimation. This can be successful even if the dimension of the problem is orders of magnitude higher than the available sample sizes. Examples are large-scale astronomy and gene chips. Sparse representations of noisy data are at the heart of the machine learning philosophy. It remains to be seen whether meaningful compressions of financial market data exist.
In the early 2000s, while working in the private sector, Kondor was charged with implementing international capital regulation. He noticed a number of inconsistencies in the regulation and set out to clarify how serious the consequences of these inconsistencies would be for the regulated institutions or for systemic risk at large [1, 5]. This subject has remained at the centre of his research even after his return to academia, and has led him to a more general question: to what extent are the methods of risk measurement, risk management, asset management and capital allocation reliable, given the proliferation of financial assets and the large size of institutional portfolios?
In portfolio optimization the dimension of the problem is the number of different assets in the portfolio, and the data are the observed time series of the returns on these assets. With the sampling frequency limited by a realistic frequency of rebalancing, the length of available time series alone implies large statistical uncertainties. Even in benign toy models where well-defined optimal portfolios exist, estimated optima tend to be far off from their true values.
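The effect described above can be illustrated with a minimal sketch. It assumes a toy market that is not taken from the cited papers: uncorrelated Gaussian returns with unit variance, so that the true minimum-variance portfolio is known exactly. The function names and the values of `n_assets` and `n_obs` are hypothetical choices made for the illustration.

```python
import numpy as np


def min_variance_weights(cov):
    """Weights minimizing w' cov w subject to sum(w) == 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # proportional to cov^{-1} 1
    return raw / raw.sum()


def risk_ratio(n_assets, n_obs, rng):
    """True variance of the estimated portfolio over the true minimum variance."""
    # Assumption for this toy model: the true covariance is the identity.
    true_cov = np.eye(n_assets)
    returns = rng.standard_normal((n_obs, n_assets))
    sample_cov = np.cov(returns, rowvar=False)  # columns are assets
    w_est = min_variance_weights(sample_cov)
    w_opt = min_variance_weights(true_cov)
    return (w_est @ true_cov @ w_est) / (w_opt @ true_cov @ w_opt)


rng = np.random.default_rng(0)
ratio = risk_ratio(n_assets=50, n_obs=100, rng=rng)
print(f"true risk of estimated portfolio / true optimum: {ratio:.2f}")
```

A ratio of 1 would mean that the estimate recovers the true optimum. The ratio can never fall below 1, because the true optimum minimizes the true variance by construction.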
It has been known for decades that portfolio optimization is prone to estimation error. References [2,3,4] and with a particularly strong emphasis  were the first papers to report that this estimation error can actually diverge. The optimization algorithm undergoes a phase transition, accompanied by universal scaling laws that are independent of the particular risk measure, of the underlying probability distribution of returns, and even of whether the return process is stationary or of an autoregressive GARCH type. Kondor and coworkers approached these problems both by numerical simulations [2,3,4,8,9,11] and by analytic calculations, initiated in [6,7,10] by borrowing powerful techniques from the theory of disordered systems. It was recognized that the phase transitions observed in portfolio optimization are relatively simple special cases of a wide class of algorithmic phase transitions, a field where a fruitful collaboration has developed between computer science and statistical physics in the last two decades. Another interdisciplinary link has emerged between phase transitions discovered in random geometrical problems and those in portfolio optimization. This helped to establish another kind of universality: not only the critical behaviour, but also the critical point itself is independent of the probability distribution of the underlying returns. This has greatly extended the range of validity of the analytic results derived in [10,13,16,18,19,20,21,22,23] on the basis of Gaussian fluctuations.
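The growth of the estimation error can be made visible with a small simulation. The sketch below is an illustration under stated assumptions, not the procedure of the cited papers: it uses variance as the risk measure, uncorrelated unit-variance Gaussian returns, and a hypothetical grid of sample lengths. In this setting the sample covariance matrix becomes singular once the number of assets exceeds the number of observations, so the sweep stops just short of that point.

```python
import numpy as np


def mean_risk_ratio(n_assets, n_obs, n_trials, rng):
    """Average over trials of (true risk of estimated portfolio) / (true optimum)."""
    ones = np.ones(n_assets)
    ratios = []
    for _ in range(n_trials):
        returns = rng.standard_normal((n_obs, n_assets))
        sample_cov = np.cov(returns, rowvar=False)
        raw = np.linalg.solve(sample_cov, ones)
        w = raw / raw.sum()  # estimated minimum-variance weights
        # Assumption: the true covariance is the identity, so the true
        # variance of w is w @ w and the true optimum is 1 / n_assets.
        ratios.append((w @ w) * n_assets)
    return float(np.mean(ratios))


rng = np.random.default_rng(1)
n_assets = 40
sweep = {}
for n_obs in (400, 200, 80, 50, 44):  # hypothetical sample lengths
    sweep[n_obs] = mean_risk_ratio(n_assets, n_obs, n_trials=50, rng=rng)
    print(f"N/T = {n_assets / n_obs:.2f}  mean risk ratio = {sweep[n_obs]:.2f}")
```

As the ratio of assets to observations approaches 1, the printed risk ratio rises steeply, which is the finite-size signature of the divergence discussed above.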
The methods borrowed from the theory of disordered systems are powerful. They have made it possible to treat analytically [10,13,16,17,18,19,20,21] the optimization of Expected Shortfall, which became the official international regulatory market risk measure in 2016. The results were submitted to the Basel Committee on Banking Supervision in response to their Consultative Document on the new market risk regulation [17,19].
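For readers unfamiliar with the risk measure, a simple historical estimator of Expected Shortfall can be sketched as follows. This is the textbook sample estimator, shown only to fix ideas; it is not the analytic method of the cited papers, and the function name, the loss values and the confidence level are hypothetical.

```python
import numpy as np


def expected_shortfall(losses, alpha):
    """Average of the worst (1 - alpha) share of the observed losses."""
    ordered = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest loss first
    # Number of tail observations; at least one so the mean is defined.
    k = max(1, int(np.ceil((1.0 - alpha) * ordered.size)))
    return float(ordered[:k].mean())


# Hypothetical sample of eight losses (positive numbers are losses).
losses = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 10.0]
es = expected_shortfall(losses, alpha=0.75)
print(f"Expected Shortfall at the 75% level: {es}")
```

Because the estimate rests on only a small tail of the sample, it is especially sensitive to sample fluctuations, which is why the question of estimation error is pressing for this measure.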
After an attempt to reduce the dimensions of the optimization problem by tools from random matrix theory [6,7] Kondor and coworkers turned to high-dimensional statistics, which offers systematic and disciplined methods to rein in large sample fluctuations via regularization [15,16,18, 23]. This unavoidably introduces bias, and the art of regularization consists in finding a trade-off between the bias and variance. In  it was shown that a meaningful trade-off can only be found in a relatively narrow range of parameters, outside of which one either has plenty of data and does not really need regularization, or the regularizer dominates and then the estimation reflects only the bias. The root of the difficulty is the unboundedness of risk measures, which, without further restrictions, allows very large leverage – the mother of all financial risks.
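The trade-off can be explored numerically. The sketch below assumes an l2 (ridge) penalty on the portfolio weights combined with variance as the risk measure, which amounts to adding a constant `eta` to the diagonal of the sample covariance matrix. The asset variances, sample sizes and grid of `eta` values are hypothetical, and the code is an illustration rather than a reproduction of the cited calculations.

```python
import numpy as np


def ridge_min_variance_weights(sample_cov, eta):
    """Minimize w' C w + eta * w' w subject to sum(w) == 1."""
    n = sample_cov.shape[0]
    raw = np.linalg.solve(sample_cov + eta * np.eye(n), np.ones(n))
    return raw / raw.sum()


rng = np.random.default_rng(2)
n_assets, n_obs, n_trials = 40, 50, 100

# Assumption: uncorrelated assets with unequal variances, so that the
# equal-weight portfolio favoured by a strong penalty is biased.
true_vars = np.geomspace(0.2, 5.0, n_assets)
true_cov = np.diag(true_vars)
w_opt = (1.0 / true_vars) / (1.0 / true_vars).sum()
optimal_risk = w_opt @ true_cov @ w_opt

# Reuse the same samples for every eta so the comparison is like for like.
samples = [rng.standard_normal((n_obs, n_assets)) * np.sqrt(true_vars)
           for _ in range(n_trials)]

mean_ratio = {}
for eta in (1e-3, 0.1, 1.0, 10.0, 1e6):
    ratios = []
    for returns in samples:
        w = ridge_min_variance_weights(np.cov(returns, rowvar=False), eta)
        ratios.append((w @ true_cov @ w) / optimal_risk)
    mean_ratio[eta] = float(np.mean(ratios))
    print(f"eta = {eta:g}  mean true-risk ratio = {mean_ratio[eta]:.2f}")
```

At very small `eta` the result is dominated by sample noise, while at very large `eta` the weights collapse to the equal-weight portfolio and the ratio reflects only the bias. Comparing the printed values in between shows how a useful compromise depends on the choice of `eta`.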
More broadly, this research addresses the question of whether social systems are reducible or intrinsically complex. Kondor’s interest is shifting in this direction. With the powerful techniques of the theory of disordered systems he hopes to learn how much information can be retrieved from small, noisy samples of data on artificial market models. The phase transitions observed in the portfolio optimization context may generalize to these problems and may erect a barrier beyond which information retrieval becomes impossible.
From a short-term perspective, reliable algorithms to estimate risk and recognize patterns are hugely useful and beneficial. The flip side is that the power of these algorithms may lead to a dictatorship of the machine. In the long run, then, it may be more comforting to learn that social systems cannot be reduced to a small number of dimensions and will remain unpredictable.
1. I. Kondor: Spin glasses in the trading book, Int. J. of Theor. and Appl. Finance, 3, 537 (2000)
2. Sz. Pafka and I. Kondor: Noisy covariance matrices and portfolio optimization, Eur. Phys. J. B27, 277-280 (2002)
3. Sz. Pafka and I. Kondor: Noisy covariance matrices and portfolio optimization II., Physica A319C, 487-494 (2003)
4. Sz. Pafka and I. Kondor: Estimated correlation matrices and portfolio optimization, Physica A343, 623-634 (2004)
5. I. Kondor, A. Szepessy and T. Ujvárosi: Concave risk measures in international capital regulation, in: Risk Measures for the 21st Century, Ch. 4, pp. 51-59, ed. G. Szegö, John Wiley & Sons (2004)
6. Sz. Pafka, M. Potters and I. Kondor: Exponential weighting and random-matrix-theory-based filtering of financial covariance matrices for portfolio optimization, arXiv: cond-mat/0402573
7. G. Papp, Sz. Pafka, M.A. Nowak, I. Kondor: Random Matrix Filtering in Portfolio Optimization, Acta Physica Polonica B36, 2757-2766 (2005).
8. I. Kondor, Sz. Pafka, R. Karádi and G. Nagy: Portfolio selection in a noisy environment using absolute deviation as a risk measure, in H. Takayasu (ed.): Practical Fruits of Econophysics: Proceedings of the Third Nikkei Econophysics Symposium, Tokyo; Springer, New York (2006). ISBN: 4431289143.
9. I. Kondor, Sz. Pafka, G. Nagy: Noise sensitivity of portfolio selection under various risk measures, Journal of Banking and Finance, 31, 1545-1573 (2007).
10. S. Ciliberti, I. Kondor, M. Mezard: On the Feasibility of Portfolio Optimization under Expected Shortfall, Quantitative Finance, 7, 389-396 (2007)
11. I. Varga-Haszonits and I. Kondor: Noise Sensitivity of Portfolio Selection in Constant Conditional Correlation GARCH models, Physica A385, 307-318 (2007)
12. I. Kondor and I. Varga-Haszonits: Divergent estimation error in portfolio optimization and in linear regression, Eur. Phys. J. B 64, 601-605 (2008).
13. I. Varga-Haszonits and I. Kondor: The instability of downside risk measures, J. Stat. Mech. P12007 (2008), doi: 10.1088/1742-5468/2008/12/P12007
14. I. Kondor and I. Varga-Haszonits: Instability of portfolio optimization under coherent risk measures, Advances in Complex Systems, 13, 425-437 (2010), doi: 10.1142/S0219525910002591
16. F. Caccioli, S. Still, M. Marsili and I. Kondor: Optimal Liquidation Strategies Regularize Portfolio Selection, European Journal of Finance, 19, 554-571 (2013), Doi: 10.1080/1351847X.2011.601661, arXiv:1004.4169v1 [q-fin.PM] (2010).
17. Kondor, I.: Estimation Error of Expected Shortfall, submitted to the Basel Committee in response to the Consultative Document of the Basel Committee on Banking Supervision: Fundamental review of the trading book: A revised market risk framework (2014), http://www.bis.org/publ/bcbs265/imrekondor.pdf ; http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2400401
18. F. Caccioli, I. Kondor, M. Marsili and S. Still: Liquidity risk and instabilities in portfolio optimization, International Journal of Theoretical and Applied Finance, 19/5, 1650035 (2016), posted online on July 29, 2016, doi: 10.1142/S0219024916500357
19. I. Kondor, F. Caccioli, G. Papp and M. Marsili: Contour map of estimation error for Expected Shortfall, submitted to the Basel Committee in response to the Fundamental review of the trading book: outstanding issues – consultative document (2014), http://arxiv.org/abs/1502.06217, http://ssrn.com/abstract=2567876
20. F. Caccioli, I. Kondor and G. Papp: Portfolio optimization under Expected Shortfall: Contour maps of estimation error, http://arxiv.org/abs/1510.04943
21. G. Papp, F. Caccioli and I. Kondor: Fluctuation-bias trade-off in portfolio optimization under Expected Shortfall with l2 regularization, arXiv:1602.08297v1 [q-fin.PM]
22. I. Varga-Haszonits, F. Caccioli and I. Kondor: Replica approach to mean-variance portfolio optimization, J. Stat. Mech. (2016) 123404, http://arxiv.org/abs/1606.08679 [q-fin.RM, cond-mat.dis-nn]
23. I. Kondor, G. Papp and F. Caccioli: Analytic solution to variance optimization with no short-selling, https://arxiv.org/pdf/1612.07067 , https://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID2899431_code2199040.pdf?…1