Introduction

Risk management plays a central role in the financial system. Value at risk (VaR),1 a quantile of the loss distribution, is a widely used risk metric. For example, the Basel III regulations require banks to perform stress tests using VaR.2 A second important risk metric is conditional value at risk (CVaR, sometimes also called expected shortfall), defined as the expected loss for losses greater than VaR. In contrast to VaR, CVaR is more sensitive to extreme events in the tail of the loss distribution.
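
To make the two measures concrete, the following minimal NumPy sketch (an illustration with a hypothetical loss distribution, not data from this article) computes the empirical VaR and CVaR of a set of loss samples:

```python
import numpy as np

# Illustration with a hypothetical heavy-tailed loss distribution:
# VaR is the alpha-quantile of the losses, CVaR the mean loss beyond it.
rng = np.random.default_rng(42)
losses = rng.standard_t(df=3, size=100_000)   # hypothetical loss samples

alpha = 0.95
var = np.quantile(losses, alpha)              # 95% VaR
cvar = losses[losses >= var].mean()           # 95% CVaR (expected shortfall)
print(f"VaR_95 = {var:.3f}, CVaR_95 = {cvar:.3f}")
```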

Monte Carlo simulations are the method of choice to determine the VaR and CVaR of a portfolio.1 They are done by building a model of the portfolio assets and computing the aggregated value for M different realizations of the model input parameters. VaR calculations are computationally intensive since the width of the confidence interval scales as \(O(M^{-1/2})\): many runs are needed to obtain a representative distribution of the portfolio value. Classical attempts to improve the performance are variance reduction and Quasi-Monte Carlo techniques.3,4,5 The former reduces the constants while leaving the asymptotic scaling unchanged, whereas the latter improves the asymptotic behavior but only works well for low-dimensional problems.

Quantum computers process information using the laws of quantum mechanics.6 This has opened up novel ways of addressing some problems, e.g. in quantum chemistry,7 optimization,8 or machine learning.9 Problems in finance that make use of machine learning may benefit from quantum machine learning.10 A quantum computer may also be used to optimize the risk-return trade-off of portfolios and to sample from the optimal portfolio.11 Amplitude estimation (AE) is a quantum algorithm used to estimate an unknown parameter; it converges as \(O(M^{-1})\), a quadratic speed-up over classical algorithms like Monte Carlo.12 It has already been shown how AE can be used to price financial derivatives within the Black–Scholes model.13,14

In this article, we extend the use of AE to the calculation of the variance, VaR, and CVaR of random distributions. Furthermore, we develop methods to implement AE on actual hardware and discuss how to construct the corresponding quantum circuits to calculate the expected value, variance, VaR, and CVaR of random variables, making AE a powerful tool for risk management and security pricing. We illustrate our algorithm using portfolios made up of debt issued by the United States Treasury (US Treasury). As of December 2016 the US Treasury had 14.5 trillion USD in outstanding marketable debt held by the public.15 This debt is an actively traded asset class with typical daily volumes close to 500 billion USD16 and is regarded as high quality collateral.17 Government debt typically lacks some of the more complex features that other types of fixed-income securities have. These properties make US Treasuries a highly relevant asset class to study while allowing us to use simple models to illustrate our algorithm. We demonstrate AE on a real quantum computer by approximating the expected value of a very simple portfolio made up of one T-Bill, a short-term debt obligation issued by the US Treasury, analyzed over a single period of a binomial tree. We also study a more comprehensive two-asset portfolio and simulate the presented algorithms assuming both a perfect and a noisy quantum computer.

Results

Amplitude estimation,12 formally introduced in Supplementary Information, allows us to estimate a in the state \({\cal A}\left| 0 \right\rangle _{n + 1} = \sqrt {1 - a} \left| {\psi _0} \right\rangle _n\left| 0 \right\rangle + \sqrt a \left| {\psi _1} \right\rangle _n\left| 1 \right\rangle\) for an operator \({\cal A}\) acting on n + 1 qubits. AE uses m additional sampling qubits and Quantum Phase Estimation18 to produce an estimator \(\tilde a = {\mathrm{sin}}^2(y\pi {\mathrm{/}}M)\) of a, where y ∈ {0, …, M − 1} and M, the number of samples, is \(2^m\). The estimator \(\tilde a\) satisfies

$$\left| {a - \tilde a} \right| \le \frac{\pi }{M} + \frac{{\pi ^2}}{{M^2}} = O\left( {M^{ - 1}} \right),$$
(1)

with probability of at least \(8/\pi^2\). This represents a quadratic speed-up compared to the \(O(M^{-1/2})\) convergence rate of classical Monte Carlo methods.1
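
The classical post-processing behind this estimator is short enough to sketch: the following snippet enumerates the M possible outcomes, maps each y to \(\tilde a = {\mathrm{sin}}^2(y\pi/M)\), and checks the bound of Eq. (1) for a toy value of a:

```python
import numpy as np

# AE returns y in {0, ..., M-1}, and the classical post-processing maps it
# to a~ = sin^2(y*pi/M). For a toy amplitude a, the best reachable grid
# point already satisfies the error bound of Eq. (1).
m = 4
M = 2 ** m
a = 0.3                                        # toy "unknown" amplitude

grid = np.sin(np.pi * np.arange(M) / M) ** 2   # all M possible estimates
a_tilde = grid[np.argmin(np.abs(grid - a))]    # closest estimate to a

bound = np.pi / M + np.pi ** 2 / M ** 2
print(f"|a - a~| = {abs(a - a_tilde):.4f} <= {bound:.4f}")
```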

To use AE to estimate quantities related to a random variable X we must first represent X as a quantum state. Using n qubits we map X to the set {0, …, N − 1}, where \(N = 2^n\). X is then represented by the state

$${\cal R}\left| 0 \right\rangle _n = \left| \psi \right\rangle _n = \mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n\quad {\mathrm{with}}\quad \mathop {\sum}\limits_{i = 0}^{N - 1} {\kern 1pt} p_i = 1$$
(2)

created by the operator \({\cal R}\). Here pi ∈ [0, 1] is the probability of measuring the state \(\left| i \right\rangle _n\) and i ∈ {0, …, N − 1} is one of the N possible realizations of X. Next, we consider a function f:{0, …, N − 1} → [0, 1] and a corresponding operator

$$F:\left| i \right\rangle _n\left| 0 \right\rangle \mapsto \left| i \right\rangle _n\left( {\sqrt {1 - f(i)} \left| 0 \right\rangle + \sqrt {f(i)} \left| 1 \right\rangle } \right),$$
(3)

for all i ∈ {0, …, N − 1}, acting on an ancilla qubit. Applying F to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) yields

$$\mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {1 - f(i)} \sqrt {p_i} \left| i \right\rangle _n\left| 0 \right\rangle + \mathop {\sum}\limits_{i = 0}^{N - 1} \sqrt {f(i)} \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .$$

With AE we approximate the probability of measuring \(\left| 1 \right\rangle\) in the last qubit, which equals \(\mathop {\sum}\nolimits_{i = 0}^{N - 1} {\kern 1pt} p_if(i) = {\Bbb E}\left[ {f(X)} \right]\). Thus, AE can be used to approximate the expected value of a random variable.19,20 By choosing f(i) = i/(N − 1) we estimate \({\Bbb E}\left[ {\frac{X}{{N - 1}}} \right]\) and hence \({\Bbb E}[X]\). If we choose f(i) = i²/(N − 1)² we can efficiently estimate \({\Bbb E}[X^2]\) and obtain the variance \({\mathrm{Var}}(X) = {\Bbb E}[X^2] - {\Bbb E}[X]^2\).
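
The following NumPy sketch verifies this identity for a small example by constructing the amplitudes of the state above directly (a classical illustration, not a quantum simulation):

```python
import numpy as np

# Build the amplitudes of the state above for an arbitrary distribution p
# and f(i) = i/(N-1); the ancilla-|1> probability equals E[f(X)].
n = 3
N = 2 ** n
rng = np.random.default_rng(1)
p = rng.random(N)
p /= p.sum()                                  # illustrative distribution
f = np.arange(N) / (N - 1)                    # f(i) = i/(N-1)

amp1 = np.sqrt(f * p)                         # ancilla-|1> branch of F|psi>|0>
print(np.sum(amp1 ** 2), p @ f)               # both equal E[f(X)]
```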

We now extend this technique to risk measures such as VaR and CVaR. For a given confidence level α ∈ [0, 1], VaRα(X) is the smallest value x ∈ {0, …, N − 1} such that \({\Bbb P}[X \le x] \ge (1 - \alpha )\). To find VaRα(X) on a quantum computer, we define the function fl(i) = 1 if i ≤ l and fl(i) = 0 otherwise, where l ∈ {0, …, N − 1}. Applying Fl, the operator corresponding to fl, to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) produces

$$\mathop {\sum}\limits_{i = l + 1}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n\left| 0 \right\rangle + \mathop {\sum}\limits_{i = 0}^l \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .$$
(4)

The probability of measuring \(\left| 1 \right\rangle\) for the last qubit is \(\mathop {\sum}\nolimits_{i = 0}^l p_i = {\Bbb P}[X \le l]\). Therefore, with a bisection search over l we can find the smallest level lα such that \({\Bbb P}[X \le l_\alpha ] \ge 1 - \alpha\) in at most n steps. This smallest level lα equals VaRα(X). The estimation of VaRα(X) has accuracy \(O(M^{-1})\), i.e. a quadratic speed-up compared to classical Monte Carlo methods (omitting the additional logarithmic complexity of the bisection search).
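
A sketch of this bisection search, in which the exact cumulative probabilities stand in for the AE estimates of \({\Bbb P}[X \le l]\) (the distribution below is purely illustrative):

```python
import numpy as np

def value_at_risk(p, alpha):
    """Bisection for the smallest l with P[X <= l] >= 1 - alpha (the
    article's convention). `prob_le` stands in for one AE run per level."""
    cdf = np.cumsum(p)
    prob_le = lambda l: cdf[l]        # placeholder for an AE estimate
    lo, hi = 0, len(p) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if prob_le(mid) >= 1 - alpha:
            hi = mid                  # mid still satisfies the condition
        else:
            lo = mid + 1              # the level must be above mid
    return lo

p = np.array([0.01, 0.02, 0.07, 0.20, 0.40, 0.20, 0.07, 0.03])  # illustrative
print(value_at_risk(p, alpha=0.95))   # -> 2, since P[X <= 2] = 0.10 >= 0.05
```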

CVaRα(X) is the conditional expectation of X restricted to {0, …, lα}, where we compute lα = VaRα(X) as before. To estimate CVaR we apply the operator F corresponding to the function \(f(i) = \frac{i}{{l_\alpha }} \cdot f_{l_\alpha }(i)\) to \(\left| \psi \right\rangle _n\left| 0 \right\rangle\) to create

$$\begin{array}{r}\left( {\mathop {\sum}\limits_{i = l_\alpha + 1}^{N - 1} \sqrt {p_i} \left| i \right\rangle _n + \mathop {\sum}\limits_{i = 0}^{l_\alpha } \sqrt {1 - \frac{i}{{l_\alpha }}} \sqrt {p_i} \left| i \right\rangle _n} \right)\left| 0 \right\rangle \\ + \mathop {\sum}\limits_{i = 0}^{l_\alpha } \sqrt {\frac{i}{{l_\alpha }}} \sqrt {p_i} \left| i \right\rangle _n\left| 1 \right\rangle .\end{array}$$
(5)

The probability of measuring \(\left| 1 \right\rangle\) for the ancilla, approximated using AE, equals \(\mathop {\sum}\nolimits_{i = 0}^{l_\alpha } \frac{i}{{l_\alpha }}p_i\). However, since \(\mathop {\sum}\nolimits_{i = 0}^{l_\alpha } p_i\) does not sum up to one but to \({\Bbb P}[X \le l_\alpha ]\), as evaluated during the VaR estimation, we must normalize the probability of measuring \(\left| 1 \right\rangle\) to get

$${\mathrm{CVaR}}_\alpha (X) = \frac{{l_\alpha }}{{{\Bbb P}[X \le l_\alpha ]}}\mathop {\sum}\limits_{i = 0}^{l_\alpha } \frac{i}{{l_\alpha }}p_i.$$
(6)

We also multiplied by lα; otherwise we would estimate \({\mathrm{CVaR}}_\alpha \left( {\frac{X}{{l_\alpha }}} \right)\). Even though we replace \({\Bbb P}[X \le l_\alpha ]\) by an estimate, the error bound on CVaR, see Supplementary Information, still achieves a quadratic speed-up compared to classical Monte Carlo methods.
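
Continuing the illustrative example from the VaR sketch above, the following snippet combines the two AE estimates as in Eq. (6) and checks the result against the conditional expectation computed directly:

```python
import numpy as np

p = np.array([0.01, 0.02, 0.07, 0.20, 0.40, 0.20, 0.07, 0.03])  # as above
l_alpha = 2                                    # VaR level from the bisection

i = np.arange(l_alpha + 1)
prob_one = np.sum(i / l_alpha * p[i])          # AE estimate from Eq. (5)
prob_le = np.sum(p[i])                         # P[X <= l_alpha], from VaR step

cvar = l_alpha / prob_le * prob_one            # Eq. (6)
print(cvar, i @ p[i] / prob_le)                # matches E[X | X <= l_alpha]
```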

We have shown how to choose f to calculate the expected value, variance, VaR and CVaR of a random variable X represented by \(\left| \psi \right\rangle _n\). However, we are usually interested in \({\Bbb E}[f(X)]\) for a more general function f:{0, …, N − 1} → {0, …, N′ − 1}, \(N\prime = 2^{n\prime }\), \(n\prime \in {\Bbb N}\), for instance representing some payoff or loss depending on X. In some cases, as shown later, we can adjust F accordingly. In other cases, we can apply a corresponding operator \(\left| i \right\rangle _n\left| 0 \right\rangle _{n\prime } \mapsto \left| i \right\rangle _n\left| {f(i)} \right\rangle _{n\prime }\) and use the previously introduced algorithms on the second register. There exist numerous quantum algorithms for arithmetic operations21,22,23,24,25 as well as tools to translate classical logic into quantum circuits.26,27

Quantum circuits

We now show how the previously discussed algorithms can be mapped to quantum circuits. We start with the construction of \(\left| \psi \right\rangle _n\), see Eq. (2), representing the probability distribution of X. In general, the best known upper bound on the number of gates required to create \(\left| \psi \right\rangle _n\) is \(O(2^n)\).28 However, approximations with polynomial complexity in n are possible for many distributions, e.g., log-concave distributions.29

Approximating \({\Bbb E}[X]\) using AE requires the operator F corresponding to f(x) = x/(N − 1), defined in Eq. (3). In general, representing F for the expected value or for the CVaR either requires an exponential number of gates, \(O(2^n)\), or additional ancillas to pre-compute the (discretized) function f into qubits, using quantum arithmetic, before applying the rotation.30 The exact number of ancillas depends on the desired accuracy of the approximation of F. Another approach consists of piecewise polynomial approximations of f.31 However, this also implies a significant overhead in terms of the number of ancillas and gates. In the following, we show how to overcome these hurdles by approximating F without ancillas using polynomially many gates, at the cost of a lower—but still faster than classical—rate of convergence. Note that the operator required for estimating VaR is easier to construct and we can always achieve the optimal rate of convergence, as discussed later in this section.

Our contribution rests on the fact that an operator mapping \(\left| x \right\rangle _n\left| 0 \right\rangle\) to \(\left| x \right\rangle _n\left( {{\mathrm{cos}}(\zeta (x))\left| 0 \right\rangle + {\mathrm{sin}}(\zeta (x))\left| 1 \right\rangle } \right)\), for a given polynomial \(\zeta (x) = \mathop {\sum}\nolimits_{j = 0}^k {\kern 1pt} \zeta _jx^j\) of degree k, can be efficiently constructed using multi-controlled Y-rotations. Single qubit operations with n − 1 control qubits can be constructed exactly, e.g., using O(n) gates and O(n) ancillas, or \(O(n^2)\) gates without any ancillas. They can also be approximated with accuracy \(\epsilon \; > \; 0\) using \(O(n\,{\mathrm{log}}(1{\mathrm{/}}\epsilon ))\) gates.32 For simplicity, we use O(n) gates and O(n) ancillas. Since rewriting ζ in terms of the binary variables representing x leads to at most \(n^k\) terms, the corresponding operator can be constructed using \(O(n^{k+1})\) gates and O(n) ancillas. An example for a second order polynomial is shown in Supplementary Information.
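
For a linear polynomial ζ(x) = ζ0 + ζ1x this construction needs only singly-controlled Y-rotations, one per bit of x. A minimal Qiskit sketch (Qiskit is the framework used in Methods; API details may vary between versions, and higher degrees would additionally need rotations controlled on pairs, triples, etc. of qubits):

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

# Map |x>|0> to |x>(cos(zeta(x))|0> + sin(zeta(x))|1>) for the linear
# polynomial zeta(x) = z0 + z1*x, with x encoded in binary on n qubits.
n, z0, z1 = 3, 0.1, 0.2
qc = QuantumCircuit(n + 1)             # qubits 0..n-1 hold x, qubit n is the target
qc.h(range(n))                         # uniform superposition over x, for testing
qc.ry(2 * z0, n)                       # constant term
for j in range(n):
    qc.cry(2 * z1 * 2 ** j, j, n)      # bit j contributes z1 * 2^j to zeta(x)

probs = Statevector(qc).probabilities_dict()
for x in range(2 ** n):
    key = "1" + format(x, f"0{n}b")    # target qubit is the leftmost bit
    print(x, 2 ** n * probs.get(key, 0.0), np.sin(z0 + z1 * x) ** 2)
```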

For every analytic function f, there exists a sequence of polynomials such that the approximation error converges exponentially fast to zero with increasing degree of the polynomials.33 Thus, for simplicity, we assume that f is a polynomial of degree s.

If we can find a polynomial ζ(y) such that \({\mathrm{sin}}^2(\zeta (y)) = y\), then we can set y = f(x), and the previous discussion provides a way to construct the operator F. Since the expected value is linear, we may choose to estimate \({\Bbb E}\left[ {c\left( {f(X) - \frac{1}{2}} \right) + \frac{1}{2}} \right]\) instead of \({\Bbb E}[f(X)]\) for a parameter c ∈ (0, 1], and then map the result back to an estimator for \({\Bbb E}[f(X)]\). The rationale behind this choice is that \({\mathrm{sin}}^2\left( {y + \frac{\pi }{4}} \right) = y + \frac{1}{2} + O(y^3)\). Thus, we want to find ζ(y) such that \(c\left( {y - \frac{1}{2}} \right) + \frac{1}{2}\) is sufficiently well approximated by \({\mathrm{sin}}^2\left( {c\zeta (y) + \frac{\pi }{4}} \right)\). Setting the two terms equal and solving for ζ(y) leads to

$$\zeta (y) = \frac{1}{c}\left( {\sin ^{ - 1}\left( {\sqrt {c\left( {y - \frac{1}{2}} \right) + \frac{1}{2}} } \right) - \frac{\pi }{4}} \right),$$
(7)

and we choose ζ(y) as a Taylor approximation of Eq. (7) around y = 1/2. Note that Eq. (7) defines an odd function around y = 1/2, and thus the even terms in the Taylor series equal zero. The Taylor approximation of order 2u + 1 leads to a maximal approximation error for Eq. (7) of

$$\frac{{c^{2u + 3}}}{{2(2u + 3)}} + O(c^{2u + 5}),$$
(8)

for all y ∈ [0, 1], as shown in Supplementary Information.
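
Numerically, Eq. (7) simplifies to \(\zeta(y) = \frac{1}{2c}\,{\mathrm{sin}}^{-1}\left(2c\left(y - \frac{1}{2}\right)\right)\) (a trigonometric identity we use here for convenience), so its odd Taylor coefficients follow from the arcsin series. The sketch below checks the approximation quality against the bound of Eq. (8) for illustrative values of c and u:

```python
import numpy as np
from math import comb, pi

def zeta_taylor(y, c, u):
    """Odd Taylor expansion of Eq. (7) around y = 1/2, up to order 2u+1,
    via the arcsin series of zeta(y) = arcsin(2c(y - 1/2)) / (2c)."""
    w = 2 * c * (y - 0.5)
    s = sum(comb(2 * j, j) / (4 ** j * (2 * j + 1)) * w ** (2 * j + 1)
            for j in range(u + 1))
    return s / (2 * c)

c, u = 0.25, 1                                 # illustrative values
y = np.linspace(0, 1, 1001)
approx = np.sin(c * zeta_taylor(y, c, u) + pi / 4) ** 2
target = c * (y - 0.5) + 0.5
print(f"max error = {np.abs(approx - target).max():.2e}")
print(f"Eq. (8) bound ~ {c ** (2 * u + 3) / (2 * (2 * u + 3)):.2e}")
```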

Now we consider the resulting polynomial ζ(f(x)) of degree s(2u + 1). The number of gates required to construct the corresponding circuit scales as \(O(n^{s(2u+1)+1})\). The smallest scenario of interest is s = 1 and u = 0, i.e., both f and ζ are linear functions, which leads to a circuit for F where the number of gates scales quadratically with the number of qubits n representing \(\left| \psi \right\rangle _n\).

Thus, using AE to estimate \({\Bbb E}\left[ {c\left( {f(X) - \frac{1}{2}} \right) + \frac{1}{2}} \right]\) leads to a maximal error

$$\frac{\pi }{M} + \frac{{c^{2u + 3}}}{{2(2u + 3)}} + O\left( {c^{2u + 5} + M^{ - 2}} \right),$$
(9)

where we ignore the higher order terms in the following. Since our estimation uses the scaled function cf(X), we also need to analyze the scaled error \(c\epsilon\), where \(\epsilon > 0\) denotes the resulting estimation error for \({\Bbb E}[f(X)]\). Setting Eq. (9) equal to \(c\epsilon\) and rearranging leads to

$$c\epsilon - \frac{{c^{2u + 3}}}{{2(2u + 3)}} = \frac{\pi }{M}.$$
(10)

Maximizing the left-hand-side with respect to c, i.e. minimizing the number of required samples M to achieve a target error \(\epsilon\), results in \(c^ \ast = (2\epsilon )^{1/(2u + 2)}\). Plugging c* into Eq. (10) gives

$$2^{\frac{1}{{2u + 2}}}\left( {1 - \frac{1}{{2u + 3}}} \right)\epsilon ^{1 + \frac{1}{{2u + 2}}} = \frac{\pi }{M}.$$
(11)

Translating this into a rate of convergence for the estimation error \(\epsilon\) with respect to the number of samples M leads to \(\epsilon = O\left( {M^{ - \frac{{2u + 2}}{{2u + 3}}}} \right)\). For u = 0, we get \(O(M^{-2/3})\), which is already better than the classical convergence rate of \(O(M^{-1/2})\). With increasing u, the convergence rate quickly approaches the optimal rate of \(O(M^{-1})\).
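
The following sketch evaluates Eq. (11) to show how many samples M a target accuracy \(\epsilon\) requires for different u (illustrative numbers only):

```python
import numpy as np

# Number of samples M required by Eq. (11) for a target error eps; the
# exponent (2u+2)/(2u+3) approaches 1, i.e. the optimal O(M^-1) rate.
def samples_needed(eps, u):
    lhs = 2 ** (1 / (2 * u + 2)) * (1 - 1 / (2 * u + 3)) \
          * eps ** (1 + 1 / (2 * u + 2))
    return np.pi / lhs

for u in range(4):
    print(f"u={u}: eps = O(M^(-{2*u + 2}/{2*u + 3})), "
          f"M ~ {samples_needed(1e-3, u):.2e} for eps = 1e-3")
```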

For the estimation of the expected value we exploited \({\mathrm{sin}}^2\left( {y + \frac{\pi }{4}} \right) \approx y + \frac{1}{2}\) for small |y|. For the variance we apply the same idea but use \({\mathrm{sin}}^2(y) \approx y^2\). We employ this approximation to estimate \({\Bbb E}\left[ {f(X)^2} \right]\) and then, together with the estimate of \({\Bbb E}\left[ {f(X)} \right]\), we evaluate \({\mathrm{Var}}\left( {f(X)} \right) = {\Bbb E}\left[ {f(X)^2} \right] - {\Bbb E}\left[ {f(X)} \right]^2\). The resulting convergence rate is again \(O\left( {M^{ - \frac{{2u + 2}}{{2u + 3}}}} \right)\).

The previous discussion shows how to build quantum circuits to estimate \({\Bbb E}[f(X)]\) and Var(f(X)) more efficiently than possible classically. In the following, we extend this to VaR and CVaR.

Consider the state \(\left| \psi \right\rangle _n\) corresponding to the random variable X on {0, …, N − 1} and a fixed l ∈ {0, …, N − 1}. To estimate VaR, we need an operator Fl that maps \(\left| x \right\rangle _n\left| 0 \right\rangle\) to \(\left| x \right\rangle _n\left| 1 \right\rangle\) if x ≤ l and to \(\left| x \right\rangle _n\left| 0 \right\rangle\) otherwise, for all x ∈ {0, …, N − 1}. Then, for the fixed l, AE can be used to approximate \({\Bbb P}[X \le l]\), as shown in Eq. (4). With (n + 1) ancillas, adder-circuits can be used to construct Fl using O(n) gates,23 and the resulting convergence rate is \(O(M^{-1})\). For a given level α, a bisection search can find the smallest lα such that \({\Bbb P}[X \le l_\alpha ] \ge 1 - \alpha\) in at most n steps, and we get lα = VaRα(X).
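
One possible realization of Fl (an illustration on our part, not necessarily the adder-based construction of ref. 23) uses the comparator circuit available in Qiskit's circuit library; the exact API may vary between Qiskit versions:

```python
from qiskit import QuantumCircuit
from qiskit.circuit.library import IntegerComparator
from qiskit.quantum_info import Statevector

# IntegerComparator flips its target qubit when the n-qubit register holds
# i >= value; geq=False with value = l + 1 therefore marks i <= l.
n, l = 3, 2
comp = IntegerComparator(num_state_qubits=n, value=l + 1, geq=False)

qc = QuantumCircuit(comp.num_qubits)       # state + target + ancilla qubits
qc.h(range(n))                             # uniform superposition over i
qc.compose(comp, inplace=True)

p_one = Statevector(qc).probabilities([n])[1]   # target qubit is index n
print(p_one)                               # (l + 1) / 2**n = 0.375 here
```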

To estimate the CVaR, we apply the circuit Fl for lα to an ancilla qubit and use this ancilla qubit as a control for the operator F used to estimate the expected value, but with a different normalization, as shown in Eq. (5). Based on the previous discussion, it follows that AE can then be used to approximate CVaRα(X) with the same trade-off between circuit depth and convergence rate as for the expected value.

T-Bill on a single period binomial tree

Our first model consists of a zero coupon bond discounted at an interest rate r. See Supplementary Information for an introduction to the financial concepts used in this article. We seek to find the value of the bond today given that in the next time step there might be a δr rise in r. The value of the bond today, with face value VF, i.e. the amount of money the bond holder receives when the bond matures, is

$$V = \frac{{(1 - p)V_F}}{{1 + r + \delta r}} + \frac{{pV_F}}{{1 + r}} = (1 - p)V_{{\mathrm{low}}} + pV_{{\mathrm{high}}},$$
(12)

where p and (1 − p) denote the probabilities of a constant interest rate and a rise, respectively. This model is the first step of a binomial tree. Binomial trees can be used to price securities with a path dependency such as bonds with embedded options.34

The simple scenario in Eq. (12) could correspond to a market participant who bought a 1-year T-bill the day before a Federal Open Market Committee announcement and expects a δr = 0.25 percentage point increase of the Federal Funds Rate with (1 − p) = 70% probability and no change with p = 30% probability.

We show how to calculate the value of the investor’s T-bill by running AE on the IBM Q Experience and mapping V to [0, 1] such that Vlow and Vhigh correspond to $0 and $1, respectively.

Here, we only need a single qubit to represent the uncertainty and the objective and we have \({\cal A} = R_y(\theta _p)\), a Y-rotation of angle \(\theta _p = 2\,{\mathrm{sin}}^{ - 1}\left( {\sqrt p } \right)\), and thus, \({\cal A}\left| 0 \right\rangle = \sqrt {1 - p} \left| 0 \right\rangle + \sqrt p \left| 1 \right\rangle\).

AE requires applying controlled powers of an operator Q derived from \({\cal A}\). For the single qubit case Q = Ry(2θp). This implies \(Q^j = R_y(2j\theta_p)\), which allows us to construct the AE circuit efficiently to approximate the parameter \(p = {\Bbb E}[X] = 30\%\).
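
The full AE circuit for this single-qubit model is small enough to sketch in Qiskit (the framework used in Methods): the controlled powers of Q become controlled Y-rotations, followed by an inverse QFT on the evaluation qubits. The sketch below uses an ideal statevector instead of hardware, and API details may differ between Qiskit versions:

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit.library import QFT
from qiskit.quantum_info import Statevector

p = 0.3                                   # probability of no rate change
theta_p = 2 * np.arcsin(np.sqrt(p))
m = 4                                     # evaluation qubits
M = 2 ** m

qc = QuantumCircuit(m + 1)                # qubit m carries A|0>
qc.ry(theta_p, m)                         # A = Ry(theta_p)
for j in range(m):
    qc.h(j)
    qc.cry(2 ** j * 2 * theta_p, j, m)    # controlled Q^(2^j), Q = Ry(2 theta_p)
qc.compose(QFT(m, inverse=True), qubits=range(m), inplace=True)

probs = Statevector(qc).probabilities_dict(qargs=list(range(m)))
for bits, prob in sorted(probs.items(), key=lambda kv: -kv[1])[:4]:
    y = int(bits, 2)                      # measured integer y
    print(f"y = {y:2d}  a~ = {np.sin(np.pi * y / M) ** 2:.3f}  P = {prob:.3f}")
```

For m = 4 the dominant outcomes y = 3 and y = 13 both map to \(\tilde a \approx 0.31\), the grid point closest to p = 30%, mirroring the behavior seen in Fig. 2a.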

Although a single period binomial tree is a very simple model, it is straightforward to extend it to multi-period multinomial trees with path-dependent assets. It thus represents the smallest building block of interesting scenarios of arbitrary complexity.

We run several experiments on real quantum hardware in which we apply AE with different numbers of evaluation qubits m = 1, 2, 3, 4, corresponding to M = 2, 4, 8, 16 samples, respectively, to estimate \(p = {\Bbb E}[X]\). This requires at most five qubits and can be run on the IBM Q 5 Yorktown (ibmqx2) quantum processor.35 Since the success probability of AE is larger than \(8/\pi^2\), we would in principle only need, for instance, 24 repetitions to achieve a success probability of 99.75%.36 However, current quantum hardware introduces additional errors. Therefore, we repeat every circuit 8192 times (i.e., the maximal number of shots in the IBM Q Experience) to get a reliable estimate. We distinguish between the number of repetitions required by the imperfections of the hardware and the convergence rate of our algorithm. We thus consider the 8192 repetitions as a constant overhead, which we ignore when comparing the quantum and classical algorithms. The quantum circuit for m = 3 compiled to the IBM Q 5 quantum processor is illustrated in Fig. 1. The connectivity of the IBM Q 5 quantum processor, shown in Supplementary Information, requires swapping two qubits in the middle of the circuit, between the application of the controlled Q operators and the inverse Quantum Fourier Transform. The results of the algorithm are illustrated in Fig. 2a, where it can be seen that the most frequent estimate approaches the real value p and that the resolution of the algorithm increases with m. The quantum algorithm presented in this paper outperforms the Monte Carlo method already for M = 16 samples (i.e. m = 4 evaluation qubits), the largest scenario we ran on the real hardware, see Fig. 2b. The details of this convergence analysis are discussed in Supplementary Information.

Fig. 1

AE circuit for the T-Bill problem with m = 3. Dashed boxes highlight, from left to right: the controlled Q operators, the swap of two qubits, and the inverse Quantum Fourier Transform (QFT). The swap is needed to overcome the limited connectivity of the chip. U2 and U3, formally introduced in Supplementary Information, indicate single qubit rotations whose parameters are omitted. Note that the circuit could be further optimized, e.g., the adjacent CNOT gates at the beginning of the SWAP would cancel, but we keep them for illustration.

Fig. 2

a Results of running AE on real hardware for m = 1, …, 4 with 8192 shots each; the statistical error bars on the histograms are therefore of order \({\mathrm{1/}}\sqrt {{\mathrm{8192}}}\) and not shown. The green bars indicate the probability of the most frequent estimate and the blue bars the probabilities of the other estimates. The red dashed lines indicate the target value of 30%. The gray dashed lines show the probability of the second most frequent value to highlight the resulting contrast. The possible values are not equally distributed on the x-axis, since AE first returns a number y ∈ {0, …, M − 1} that is then classically mapped to \(\tilde a = {\mathrm{sin}}^2\left( {\frac{{y\pi }}{M}} \right)\). b Comparison of the convergence of the error of Monte Carlo simulation and of our algorithm with respect to the number of samples M. Although the quantum algorithm starts with a larger estimation error, for M ≥ 16 (m ≥ 4) the better convergence rate of the quantum algorithm takes over and the error stays below the Monte Carlo results. The blue solid line shows the error for our real experiments using up to m = 4 evaluation qubits. The blue dashed line shows how the estimation error would further decrease for experiments with m = 5 and 6 evaluation qubits.

Two-asset portfolio

We now illustrate how to use our algorithm to calculate the daily risk in a portfolio made up of 1-year US Treasury bills and 2-year US Treasury notes with face values \(V_{F_1}\) and \(V_{F_2}\), respectively. We chose a simple portfolio in order to put the focus on the AE algorithm applied to VaR. The portfolio is worth

$$V(r_1,r_2) = \frac{{V_{F_1}}}{{1 + r_1}} + \mathop {\sum}\limits_{i = 1}^4 \frac{{r_cV_{F_2}}}{{(1 + r_2{\mathrm{/}}2)^i}} + \frac{{V_{F_2}}}{{(1 + r_2{\mathrm{/}}2)^4}},$$
(13)

where rc is the annual coupon rate, i.e. the annual interest payment that the bondholder receives divided by the face value of the bond, paid every 6 months by the 2-year Treasury note, and r1 and r2 are the yields to maturity of the 1-year bill and 2-year note, respectively.

US Treasuries are usually assumed to be default free.37 The cash-flows are thus known ex ante and changes in the interest rates are the primary risk factors. Therefore, a proper understanding of the yield curve, i.e. the yield of bonds versus their maturity, suffices to model the risk in this portfolio. We use the Constant Maturity Treasury (CMT) rates to model the uncertainty in r1 and r2. To calculate the daily risk of our portfolio we study the difference in the CMT rates from one day to the next. These differences are highly correlated (as are the initial CMT rates), see Fig. 3a, making it unnecessary to model them all when simulating more complex portfolios. A principal component analysis reveals that the first three principal components, named shift, twist, and butterfly, account for 96% of the variance,38,39 see Fig. 3b, c. Therefore, when modeling a portfolio of US Treasury securities it suffices to study the distribution of these three factors. This dimensionality reduction also lowers the amount of resources needed by our quantum algorithm.
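
A minimal sketch of this dimensionality reduction on hypothetical rate data (the article uses historical CMT rates, which we do not reproduce here):

```python
import numpy as np

# PCA of 1-day rate changes (hypothetical correlated yield data standing in
# for the historical CMT rates used in the article).
rng = np.random.default_rng(0)
common = rng.normal(scale=0.03, size=(250, 1))          # common "shift" driver
rates = np.cumsum(common + rng.normal(scale=0.01, size=(250, 5)), axis=0)

deltas = np.diff(rates, axis=0)                         # daily differences
deltas -= deltas.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(deltas.T))
order = np.argsort(eigvals)[::-1]                       # largest variance first

print(np.cumsum(eigvals[order]) / eigvals.sum())        # cumulative explained variance
scores = deltas @ eigvecs[:, order[:2]]                 # shift- and twist-like factors
```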

Fig. 3

Daily change in the CMT rates. a Correlation matrix. The high correlation between the rates can be exploited to reduce the dimension of the problem. b Shift, Twist, and Butterfly components expressed in terms of the original constant maturity treasury rates. c Eigenvalues of the principal components. The numbers show the cumulative explained variance. d Historical constant maturity treasury rates (1-year against 2-years to maturity) as well as the resulting principal components: shift (longer vector), and twist (shorter vector). e 8-bin histogram of historical shift data (bars) as well as fitted distribution (dashed line). f 4-bin histogram of historical twist data (bars) as well as fitted distribution (dashed line). In both cases the labels show the quantum state that will occur with the corresponding probability. g, h show the quantum circuits used to load the distributions of e, f, respectively, into the quantum computer

To study the daily risk in the portfolio we write ri = ri,0 + δri for i = 1, 2, where ri,0 is the yield to maturity observed today and the random variable δri follows the historical distribution of the 1 day changes in the CMT rate with maturity i. For our demonstration we set \(V_{F_1} = V_{F_2} = \$ 100\), r1,0 = 1.8%, r2,0 = 2.25%, and rc = 2.5% in Eq. (13). We perform a principal component analysis of δr1 and δr2 and retain only the shift S and twist T components. Figure 3d illustrates the historical data as well as S and T, related to δri by

$$\left( {\begin{array}{*{20}{c}} {\delta r_1} \\ {\delta r_2} \end{array}} \right) = {\boldsymbol{W}}\left( {\begin{array}{*{20}{c}} S \\ T \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {0.703} & { - 0.711} \\ {0.711} & {0.703} \end{array}} \right)\left( {\begin{array}{*{20}{c}} S \\ T \end{array}} \right).$$
(14)

The correlation coefficient between shift and twist is −1%. We thus assume them to be independent and fit discrete distributions to each separately. We retained only the first two principal components to illustrate the use of principal component analysis despite the fact that, in this example, there is no dimensionality reduction. Furthermore, this allows us to simulate our algorithm in a reasonable time on classical hardware by keeping the number of required qubits low. We expect that all three components would be retained when running this algorithm on real quantum hardware for larger portfolios.

To model the uncertainty in the quantum computer we use three qubits, denoted by q0, q1, q2, to represent the distribution of S, and two, denoted by q3, q4, for T. Following Eq. (2), the probability distributions are encoded by the states \(\left| {\psi _S} \right\rangle = \mathop {\sum}\nolimits_{i = 0}^7 \sqrt {p_{i,S}} \left| i \right\rangle _3\) and \(\left| {\psi _T} \right\rangle = \mathop {\sum}\nolimits_{i = 0}^3 \sqrt {p_{i,T}} \left| i \right\rangle _2\) for S and T, which can thus take eight and four different values, respectively. We use more qubits for S than for T since the shift explains a larger part of the variance. Additional qubits may be used to represent the probability distributions at a higher resolution. The qubits naturally represent integers via binary encoding and we apply the affine mappings

$$S = 0.0626{\kern 1pt} x - 0.2188,$$
(15)
$$T = 0.0250{\kern 1pt} y - 0.0375.$$
(16)

Here x ∈ {0, …, 7} and y ∈ {0, …, 3} denote the integer representations of S and T, respectively. Given the almost perfect symmetry of the historical data we fit symmetric distributions to it. The operator \({\cal R}\) that we define prepares a quantum state \({\cal R}\left| 0 \right\rangle _5\), illustrated by the dots in Fig. 3e, f, that represents the distributions of S and T, up to the aforementioned affine mappings.
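
For illustration, such a state can also be prepared by letting Qiskit synthesize a generic state-preparation routine from the probabilities. The numerical values below are placeholders, not the fitted values of Fig. 3e, f, and the article instead uses the compact circuits of Fig. 3g, h:

```python
import numpy as np
from qiskit import QuantumCircuit

# Placeholder probabilities (symmetric, summing to one); the article fits
# these to historical shift/twist data and loads them with tailored circuits.
p_shift = np.array([0.02, 0.05, 0.13, 0.30, 0.30, 0.13, 0.05, 0.02])
p_twist = np.array([0.10, 0.40, 0.40, 0.10])

qc = QuantumCircuit(5)
qc.initialize(np.sqrt(p_shift), [0, 1, 2])   # |psi_S> on q0, q1, q2
qc.initialize(np.sqrt(p_twist), [3, 4])      # |psi_T> on q3, q4
```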

Next, we show how to construct the operator F to translate the random variables x and y into a portfolio value. Equations (13) through (16) allow us to define the portfolio value V in terms of x and y, instead of r1 and r2. For simplicity, we use a first order approximation

$$\tilde f(x,y) = 203.5170 - 13.1896x - 1.8175y$$
(17)

of V around the mid points x = 3.5 and y = 1.5. From a financial perspective, the first order approximation \(\tilde f\) of V corresponds to studying the portfolio from the point of view of its duration.40 Higher order expansions, e.g. convexity, could be considered at the cost of an increased circuit depth.

To map the approximated value of the portfolio \(\tilde f\) to a function f with target set [0, 1] we compute \(f = \left( {\tilde f - \tilde f_{{\rm min}}} \right){\mathrm{/}}\left( {\tilde f_{{\rm max}} - \tilde f_{{\rm min}}} \right)\), where \(\tilde f_{{\rm min}} = \tilde f(7,3)\) and \(\tilde f_{{\rm max}} = \tilde f(0,0)\), i.e., the minimum and maximum values \(\tilde f\) can take for the considered values of x {0, …, 7} and y {0, …, 3}. This leads to

$$f(x,y) = 1 - 0.1349x - 0.0186y.$$
(18)

Polynomial approximations allow us to construct an operator F corresponding to f for a given scaling parameter c ∈ (0, 1].
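
The rescaling from Eq. (17) to Eq. (18) is a one-line computation; the following sketch reproduces the coefficients of Eq. (18):

```python
# First-order portfolio value of Eq. (17) and its rescaling to [0, 1].
f_tilde = lambda x, y: 203.5170 - 13.1896 * x - 1.8175 * y

f_min, f_max = f_tilde(7, 3), f_tilde(0, 0)
f = lambda x, y: (f_tilde(x, y) - f_min) / (f_max - f_min)

# recover the coefficients of Eq. (18): 1.0, -0.1349, -0.0186
print(f(0, 0), f(1, 0) - f(0, 0), f(0, 1) - f(0, 0))
```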

We simulate the two-asset portfolio assuming an ideal quantum computer for different numbers m of sampling qubits to show the behavior of the accuracy and convergence rate. We repeat this task twice, once for a processor with all-to-all connectivity and once for a processor with a connectivity corresponding to the IBM Q 20 Tokyo chip (https://quantumexperience.ng.bluemix.net/qx/devices, accessed 2018-05-22), see Supplementary Information. This highlights the overhead imposed by a realistic chip connectivity. For \(M = 2^m\) samples, we need a total of m + 12 qubits for the expected value and VaR, and m + 13 qubits for CVaR. Five of these qubits are used to represent the distribution of the interest rate changes, one qubit is needed to create the state in Eq. (3) used by AE, and six ancillas are needed to implement the controlled Q operator. For CVaR we need one more ancilla for the comparison to the level l. Once the shift and twist distributions are loaded into the quantum computer, using the circuits shown in Fig. 3g, h, we apply the operator F to create the state defined in Eq. (3).

We compare the quantum estimation of risk to the exact 95% VaR level of $0.288. Based on Eq. (18), this classical VaR corresponds to 0.093, shown by the vertical line in Fig. 4. The quantum estimation of risk rapidly approaches this value as m is increased, see Fig. 4. With m = 5 sample qubits the difference between the classical and quantum estimates is 9%. The number of CNOT gates needed to calculate VaR approximately doubles each time a sample qubit is added, see Table 1, i.e. it scales as O(M), with a resulting error of \(O(M^{-1})\). We find that the connectivity of the IBM Q 20 Tokyo chip increases the number of CNOT gates by a factor of 2.3 compared to a chip with all-to-all connectivity.

Fig. 4

VaR estimated through a simulation of a perfect quantum computer. As the number of sample qubits m is increased the quantum estimated VaR approaches the classical value indicated by the vertical blue line. The dashed lines are intended as guides to the eye. The stars indicate the most probable values

Table 1 Summary of the number of CNOT gates to estimate VaR as a function of m for a processor architecture featuring an all-to-all qubit connectivity and an architecture with a qubit connectivity corresponding to the IBM Q 20 Tokyo chip with 20 qubits

Computing the expected value or risk measures for the two-asset portfolio requires a long circuit. However, it suffices for AE to return the correct state with the highest probability, i.e. measurements do not need to yield this state with 100% probability. We now run simulations with errors to investigate how much imperfection can be tolerated before the correct state can no longer be identified.

We study the effect of two types of errors: energy relaxation and cross-talk, where the latter is only considered for two-qubit gates (CNOT gates). We believe this captures the leading error sources. Errors and gate times for single qubit gates are in general an order of magnitude lower than for two-qubit gates.41,42,43 Furthermore, our algorithm requires the same order of magnitude in the number of single and two-qubit gates. Energy relaxation is simulated using a relaxation rate γ such that after a time t each qubit has a probability 1 − exp(−γt) of relaxing to \(\left| 0 \right\rangle\).44 We set the duration of the CNOT gates to 100 ns and assume that the single qubit gates are done instantly and are thus exempt from errors. We also include qubit–qubit cross-talk in our simulation by adding a ZZ error-term in the generator of the CNOT gate

$${\mathrm{exp}}\{ - i\pi (ZX + \alpha ZZ){\mathrm{/}}4\} .$$
(19)

Typical cross-resonance45 CNOT gate rates are of the order of 5 MHz whilst the cross-talk on IBM Q chips is of the order of −100 kHz.43 We thus estimate a reasonable value of α, i.e. the strength of the cross-talk, to be −2% and simulate its effect over the range [−3%, 0%].
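
A sketch of this noisy gate model: we exponentiate the generator of Eq. (19) and compare the resulting unitary with the ideal α = 0 gate via a simple trace overlap (our choice of comparison metric, not necessarily the figure of merit used for Fig. 5):

```python
import numpy as np
from scipy.linalg import expm

Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])
ZX, ZZ = np.kron(Z, X), np.kron(Z, Z)

def cr_gate(alpha):
    """Cross-resonance-type gate of Eq. (19) with ZZ cross-talk strength
    alpha; for alpha = 0 it is equivalent to a CNOT up to single-qubit
    rotations."""
    return expm(-1j * np.pi * (ZX + alpha * ZZ) / 4)

ideal = cr_gate(0.0)
for alpha in [0.0, -0.01, -0.02, -0.03]:
    overlap = np.abs(np.trace(ideal.conj().T @ cr_gate(alpha))) / 4
    print(f"alpha = {alpha:+.2f}:  |Tr(U^dag V)|/4 = {overlap:.5f}")
```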

We illustrate the effect of these errors by computing the expected value of the portfolio. Since the distributions are symmetric around zero and mapped to the interval [0, 1] we expect a value of 0.5, i.e. from one day to the next we do not expect a change in the portfolio value. This simulation is run with m = 2 sample qubits since this suffices to exactly estimate 0.5. The algorithm is successful if it manages to identify 0.5 with a probability >50%. With our error model this is achieved for relaxation rates \(\gamma < 10^{-4}\,\mathrm{ns}^{-1}\) and cross-talk strengths |α| < 1%, see Fig. 5a–c, despite the 4383 gates needed. A generous estimation of current hardware capabilities with \(\gamma = 10^{-5}\,\mathrm{ns}^{-1}\) (loosely based on T1 = 100 μs) and α = −2%, shown as red lines in Fig. 5, indicates that this simulation may be possible in the near future as long as other error sources (such as measurement error and unitary errors resulting from improper gate calibrations) are kept under control.

Fig. 5

Results from the noisy simulation estimating the expected value of the two-asset portfolio using two evaluation qubits. The perfect simulation returns 0.5 with 100% probability. This figure shows how the probability of measuring 0.5 decreases with increasing noise: a shows the results for both increasing cross-talk and increasing relaxation rate, b shows the result for a varying relaxation rate without cross-talk, and c shows the result for different cross-talk strengths without relaxation. The dashed red lines indicate the estimated state of currently available hardware.

Discussion

We developed a quantum algorithm to estimate risk, e.g. for portfolios of financial assets, with a quadratic speed-up compared to classical Monte Carlo methods. The algorithm has been demonstrated on real hardware for a small model, and its scalability and the impact of noise have been studied using a more complex model and simulation. Our approach is very flexible and straightforward to extend to other risk measures such as semi-variance.

More qubits are needed to model realistic scenarios, and the errors of actual hardware need to be reduced. Although the quadratic speed-up can already be observed for a small number of samples, many more samples are needed to achieve a practical quantum advantage. In practice, Monte Carlo simulations can be massively parallelized, which raises the bar for a quantum advantage even higher.

Our simulations of the two-asset portfolio show that circuit depth is the limiting factor on current hardware. To perform the VaR calculation for the two-asset portfolio on real quantum hardware, qubit coherence times will likely have to be increased by several orders of magnitude and cross-talk will have to be further suppressed.

However, approximating, parallelizing, and decomposing quantum phase estimation is an active area of research, and we expect significant improvements not only from better hardware but also from better algorithms.46,47,48 These can help shorten the required circuit depths and reduce the hardware requirements to achieve a quantum advantage. Circuit depth can also be shortened by using a more versatile set of gates. For instance, the ability to implement SWAP gates directly in hardware would circumvent the need to synthesize them from CNOT gates.49,50 In addition, techniques such as error mitigation51 could be applied to cope with the noisy hardware of the near future.

Another question that has only briefly been addressed in this paper is the loading of the considered random distributions or stochastic processes. For auto-correlated processes this can be rather costly and needs to be investigated further. Techniques known from classical Monte Carlo, such as importance sampling,52 might be employed here as well to improve the results or reduce the circuit depth.

Methods

The code to run the real experiments and simulations was written using the Qiskit framework.44 Qiskit also provides access to the quantum hardware and the simulators used.

Code availability

The code used for amplitude estimation is available open source in Qiskit Aqua and some of the mentioned examples are available in Qiskit Finance.53