Monday, February 24, 2014

Linear Algebra and Modern (Markowitz) Portfolio Theory within R

In this post, I demonstrate basic linear algebra, matrix factorization (LU and QR) for solving linear equations and regression, the Lagrange multiplier method, and basic numerical methods (bisection and Newton-Raphson), all in the context of modern (Markowitz, efficient frontier) portfolio theory, to estimate the optimal weights for given constraints in R. First, let's do some basic linear equation exercises in R.

Load the matrixcalc library:

library("matrixcalc")

Let's assume we have a linear equation Ax = b, where A is a square matrix, x is the unknown column vector of parameters, and b is a known column vector.

A = matrix(c(2, 2, 4, 9), 2, 2, byrow = T)
A
##      [,1] [,2]
## [1,]    2    2
## [2,]    4    9
b = c(8, 21)
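
Written out explicitly, this particular system Ax = b reads:

\[ 2x_1 + 2x_2 = 8 \] \[ 4x_1 + 9x_2 = 21 \]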

To solve this equation, the solve command can be used:

solve(A, b)
## [1] 3 1

We can also use various factorization methods to solve this linear equation.

LU Factorization

LU factorization creates elimination matrices, a lower and an upper triangular matrix, which are then used to solve linear equations.

  1. The upper triangular U has the pivots on its diagonal
  2. The lower triangular L has ones on its diagonal
  3. L has the multipliers \( l_{ij} \) below the diagonal

lu = lu.decomposition(A)
lu
## $L
##      [,1] [,2]
## [1,]    1    0
## [2,]    2    1
## 
## $U
##      [,1] [,2]
## [1,]    2    2
## [2,]    0    5
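
As a quick sanity check (my own addition), multiplying the two factors back together should reproduce A:

# L %*% U should reproduce the original matrix A
lu$L %*% lu$U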

To solve for x using L (lower triangular) and U (upper triangular), we proceed in two steps.

The first step is to find a column vector c that solves the equation Lc = b:

c = solve(lu$L, b)
c
## [1] 8 5

In the second step, the equation Ux = c is solved, which yields the unknown vector x:

x = solve(lu$U, c)
x
## [1] 3 1
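
As a side note (my own addition), since L and U are triangular, the same two steps can also be carried out with R's specialized triangular solvers forwardsolve and backsolve:

# forward substitution on the lower triangular system Lc = b
c = forwardsolve(lu$L, b)
# back substitution on the upper triangular system Ux = c
x = backsolve(lu$U, c)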

QR Factorization

QR factorization decomposes a matrix A into two matrices Q and R, where A = QR, Q is an orthogonal matrix, and R is an upper triangular matrix.


aqr = qr(A)
# orthogonal matrix
aQ = qr.Q(aqr)
aQ
##         [,1]    [,2]
## [1,] -0.4472 -0.8944
## [2,] -0.8944  0.4472
# upper triangular matrix
aR = qr.R(aqr)
aR
##        [,1]   [,2]
## [1,] -4.472 -8.944
## [2,]  0.000  2.236

Since an orthogonal matrix has the property \( Q^TQ=I \), the equation Ax = b can be rewritten as \( Rx=Q^Tb \): first A is decomposed into QR, and then both sides are multiplied by \( Q^T \).

aQT = t(aQ)
qb = aQT %*% b
x = backsolve(aR, qb)
x
##      [,1]
## [1,]    3
## [2,]    1

# all of the above steps can also be done with a single command
x = qr.solve(A, b)
x
## [1] 3 1
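
As an optional sanity check (my own addition), Q should be numerically orthogonal and the product QR should reproduce A:

# Q^T Q should be (numerically) the 2x2 identity matrix
round(t(aQ) %*% aQ, 10)
# Q %*% R should reproduce the original matrix A
aQ %*% aR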

QR Factorization for Linear Regression

QR factorization is also useful for fitting a linear regression (\( y=\beta x+\alpha \)). For example, let's estimate the coefficients (\( \beta \) and \( \alpha \)) of a linear regression between Apple's and Google's monthly returns.

library(quantmod)

# download prices for Google and Apple
getSymbols(c("GOOG", "AAPL"))
## [1] "GOOG" "AAPL"

google = c(coredata(monthlyReturn(GOOG)))
apple = c(coredata(monthlyReturn(AAPL)))
# let's plot the returns
plot(google, apple)

[Figure: scatter plot of Google vs. Apple monthly returns]

Let's now apply QR factorization to fit the linear regression (apple ~ google):

x = cbind(1, google)
xqr = qr(x)

xQ = qr.Q(xqr, complete = T)
xR = qr.R(xqr, complete = T)
# compute u = Q^T y
u = t(xQ) %*% apple
# and finally solve for $ \beta $ and $ \alpha $
backsolve(xR[1:2, 1:2], u[1:2])
## [1] 0.01636 0.66522

# let's verify the result with the lm command
lm(apple ~ google)
## 
## Call:
## lm(formula = apple ~ google)
## 
## Coefficients:
## (Intercept)       google  
##      0.0164       0.6652

Linear Algebra in Portfolio Theory

Linear equations are common in modern (Markowitz, efficient frontier) portfolio theory in finance, where they are used to estimate optimal weights subject to constraints, such as maximizing return or minimizing variance. For example:

\[ \text{maximize: } \mu^{T}w \] \[ \text{subject to: } e^{T}w=1, \quad w^T\Sigma w=\sigma^2 \]

where we have the following return vector, target risk, and covariance matrix:

mu = c(0.08, 0.1, 0.13, 0.15, 0.2)
mu
## [1] 0.08 0.10 0.13 0.15 0.20
s = 0.25^2  # target risk (variance, sigma^2)
s
## [1] 0.0625
Sigma = matrix(c(0.0196, -0.00756, 0.01288, 0.00875, -0.0098, -0.00756, 0.0324, 
    -0.00414, -0.009, 0.00945, 0.01288, -0.00414, 0.0529, 0.020125, 0.020125, 
    0.00875, -0.009, 0.020125, 0.0625, -0.013125, -0.0098, 0.00945, 0.020125, 
    -0.013125, 0.1225), 5, 5, byrow = T)
Sigma
##          [,1]     [,2]     [,3]     [,4]     [,5]
## [1,]  0.01960 -0.00756  0.01288  0.00875 -0.00980
## [2,] -0.00756  0.03240 -0.00414 -0.00900  0.00945
## [3,]  0.01288 -0.00414  0.05290  0.02013  0.02013
## [4,]  0.00875 -0.00900  0.02013  0.06250 -0.01312
## [5,] -0.00980  0.00945  0.02013 -0.01312  0.12250

To find the optimal weights, the constrained problem above needs to be solved. We will use the Lagrange multiplier method for this purpose. The Lagrangian is:

\[ F(w,\lambda)= \mu^{T}w + \lambda_1(e^{T}w-1)+\lambda_2(w^T\Sigma w-\sigma^2) \]

To maximize the return, we take the first partial derivatives of \( F(w, \lambda) \) and set them to zero, using \( \nabla_w (w^T\Sigma w) = 2\Sigma w \) for the symmetric \( \Sigma \):

\[ G(w, \lambda) = F'(w, \lambda) = \left| \begin{array}{c} \mu+\lambda_1 e + 2\lambda_2 \Sigma w \\ e^{T}w-1 \\ w^T\Sigma w-\sigma^2 \end{array} \right| = 0 \]

To solve this system of equations, which is quadratic in w, we can employ numerical methods such as bisection or the Newton-Raphson method. Let me briefly explain these methods.

Bisection Method

If a function f(x) is continuous on [a, b] and f(a) and f(b) have opposite signs, then a root of f(x) = 0 can be found with the following algorithm:

  1. Increase count = count + 1
  2. Compute f(c), where c = (a + b)/2 is the midpoint of [a, b]
  3. If sign(f(c)) == sign(f(a)), let a = c; otherwise let b = c
  4. If |b - a| > tolerance and count < totalIteration, go to step 1

An R implementation would be:

bisection <- function(f, a, b, tol = 0.001, iteration = 100) {
    count = 0
    while (b - a > tol & count < iteration) {
        c <- (a + b)/2
        # keep the half-interval whose endpoints have opposite signs
        if (sign(f(c)) == sign(f(a))) 
            a <- c else b <- c
        count = count + 1
    }
    # return the midpoint of the final interval
    (a + b)/2
}

This is analogous to the binary search algorithm in computer science.
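
For instance, a quick illustrative call (my own addition) that approximates \( \sqrt{2} \) as the root of \( x^2 - 2 \) on [0, 2]:

# root of x^2 - 2 on [0, 2], should be approximately 1.414
bisection(function(x) x^2 - 2, 0, 2)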

Newton-Raphson Method

The Newton-Raphson method uses the derivative of a function to find a root of a continuous function f(x) = 0 iteratively, given a starting point:

\[ x_{k+1}=x_k- \frac{f(x_k)}{f'(x_k)} \]

Compared to bisection, the Newton-Raphson method converges faster, if it converges at all.
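
For completeness, here is a minimal sketch of a scalar Newton-Raphson iteration in R; the helper name newton_raphson and its arguments are my own illustrative choices, not part of the original post:

# minimal Newton-Raphson sketch: iterate x <- x - f(x)/f'(x)
newton_raphson <- function(f, df, x0, tol = 1e-08, iteration = 100) {
    x <- x0
    for (i in 1:iteration) {
        x_new <- x - f(x)/df(x)
        # stop when successive iterates are close enough
        if (abs(x_new - x) < tol) 
            break
        x <- x_new
    }
    x_new
}

# example: root of x^2 - 2 starting from x0 = 1, converges to about 1.4142
newton_raphson(function(x) x^2 - 2, function(x) 2 * x, 1)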

Lagrange's Method & Newton-Raphson Method

Armed with the Newton-Raphson method, we can now solve \( G(w, \lambda) = F'(w, \lambda) = 0 \). The algorithm to solve it is:

  1. Compute G(x) and G'(x)
  2. Pick a starting point $x_0$
  3. Solve the linear system $G'(x_k)u = G(x_k)$
  4. Update $x_{k+1} = x_k - u$
  5. Repeat steps 3 and 4 either a fixed number of times or until $x_{k+1} - x_k$ is smaller than a predefined tolerance threshold.

First, we need the partial derivatives (the Jacobian) of G, which will be used in the Newton-Raphson iteration:

\[ G'(w, \lambda) = \left| \begin{array}{ccc} 2\lambda_2 \Sigma & e & 2 \Sigma w\\ e^{T} & 0 & 0 \\ 2 w^{T}\Sigma & 0 & 0 \end{array} \right| \]

The function to compute $ G(w, \lambda) $:

G <- function(x, mu, Sigma, sigmaP2) {
    # x = (w, lambda1, lambda2): stack the gradient w.r.t. w,
    # the budget constraint and the risk constraint
    n <- length(mu)
    c(mu + rep(x[n + 1], n) + 2 * x[n + 2] * (Sigma %*% x[1:n]),
      sum(x[1:n]) - 1,
      t(x[1:n]) %*% Sigma %*% x[1:n] - sigmaP2)
}

The derivative (Jacobian) of G:

DG <- function(x, mu, Sigma, sigmaP2) {
    n <- length(mu)
    grad <- matrix(0, n + 2, n + 2)
    # fill the block structure of the Jacobian G'(w, lambda)
    grad[1:n, 1:n] <- 2 * x[n + 2] * Sigma       # 2 * lambda2 * Sigma
    grad[1:n, n + 1] <- 1                        # e
    grad[1:n, n + 2] <- 2 * (Sigma %*% x[1:n])   # 2 * Sigma %*% w
    grad[n + 1, 1:n] <- 1                        # e^T
    grad[n + 2, 1:n] <- 2 * t(x[1:n]) %*% Sigma  # 2 * w^T %*% Sigma
    grad
}

Initial guesses for the weights and the $ \lambda $ values:

x = c(rep(0.5, 5), 1, 1)
x
## [1] 0.5 0.5 0.5 0.5 0.5 1.0 1.0

Let's apply the Newton-Raphson iteration now:

for (i in 1:100) {
    x <- x - solve(DG(x, mu, Sigma, s), G(x, mu, Sigma, s))
}

and the numerical solution is:

x
## [1] -0.39550  0.09606  0.04584  0.70988  0.54372 -0.09201 -0.85715

We need to verify that the upper-left 5x5 block of $ G'(x) $ (the Hessian block \( 2\lambda_2\Sigma \)) is negative definite in order for this critical point to be a maximum. Let's first see what this block of $ G'(x) $ looks like:

DG(x, mu, Sigma, s)[1:5, 1:5]
##          [,1]      [,2]      [,3]     [,4]    [,5]
## [1,] -0.03360  0.012960 -0.022080 -0.01500  0.0168
## [2,]  0.01296 -0.055543  0.007097  0.01543 -0.0162
## [3,] -0.02208  0.007097 -0.090686 -0.03450 -0.0345
## [4,] -0.01500  0.015429 -0.034500 -0.10714  0.0225
## [5,]  0.01680 -0.016200 -0.034500  0.02250 -0.2100

Since this block of $ G'(x) $ is symmetric and square, all of its eigenvalues being negative confirms that it is negative definite:

eigen(DG(x, mu, Sigma, s)[1:5, 1:5])$values
## [1] -0.02024 -0.05059 -0.05806 -0.14445 -0.22364

Since all eigenvalues are negative, we can conclude that x[1:5] contains the optimal weights for the given constraints, and the maximum return is:

t(x[1:5]) %*% mu
##        [,1]
## [1,] 0.1992
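
As a final sanity check (my own addition), the solution should satisfy both constraints: the weights sum to one and the portfolio variance equals the target \( \sigma^2 = 0.0625 \):

# budget constraint: weights should sum to 1
sum(x[1:5])
# risk constraint: portfolio variance should equal the target s = 0.0625
t(x[1:5]) %*% Sigma %*% x[1:5]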
