In this text, I demonstrate basic linear algebra, matrix factorization (LU and QR) for solving linear equations and regression, Lagrange's method, and basic numerical methods (bisection and Newton-Raphson) in the context of modern portfolio theory (Markowitz, efficient frontier) to estimate optimal weights under given constraints, all in R. First, let's do some basic linear equation exercises in R.
Load the matrixcalc library:
library("matrixcalc")
Let's assume we have a linear equation Ax=b, where A is a square matrix, x is the column vector of unknown parameters, and b is a known column vector.
A = matrix(c(2, 2, 4, 9), 2, 2, byrow = T)
A
## [,1] [,2]
## [1,] 2 2
## [2,] 4 9
b = c(8, 21)
To solve this equation, the solve command can be used:
solve(A, b)
## [1] 3 1
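As a quick sanity check, multiplying A by the computed solution should reproduce b:
# A %*% x should reproduce b
A %*% solve(A, b)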
We can also use various factorization methods to solve this linear equation.
LU Factorization
LU factorization produces elimination matrices (lower and upper triangular) which can then be used to solve linear equations:
- The upper triangular U has the pivots on its diagonal
- The lower triangular L has ones on its diagonal
- L has the multipliers \( l_{ij} \) below the diagonal
lu = lu.decomposition(A)
lu
## $L
## [,1] [,2]
## [1,] 1 0
## [2,] 2 1
##
## $U
## [,1] [,2]
## [1,] 2 2
## [2,] 0 5
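As a quick check, multiplying the two factors back together should reproduce A:
# L %*% U should reproduce the original matrix A
lu$L %*% lu$U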
To solve for x via L (lower triangular) and U (upper triangular), we have two steps.
The first step is to find a column vector c which solves the equation Lc=b:
c = solve(lu$L, b)
c
## [1] 8 5
In the last step, Ux=c is solved, which yields the unknown vector x:
x = solve(lu$U, c)
x
## [1] 3 1
QR Factorization
QR factorization decomposes a matrix A into two matrices Q and R, where A=QR, Q is an orthogonal matrix and R is an upper triangular matrix.
aqr = qr(A)
# orthogonal matrix
aQ = qr.Q(aqr)
aQ
## [,1] [,2]
## [1,] -0.4472 -0.8944
## [2,] -0.8944 0.4472
# upper triangular matrix
aR = qr.R(aqr)
aR
## [,1] [,2]
## [1,] -4.472 -8.944
## [2,] 0.000 2.236
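We can verify the two defining properties directly: Q is orthogonal and the product QR reproduces A.
# Q^T Q should be (numerically) the identity matrix
round(t(aQ) %*% aQ, 10)
# Q %*% R should reproduce A
aQ %*% aR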
Since an orthogonal matrix has the property \( Q^TQ=I \), the equation Ax=b can be rewritten as \( Rx=Q^Tb \): first A is decomposed into QR, and then both sides are multiplied by \( Q^T \).
aQT = t(aQ)
qb = aQT %*% b
x = backsolve(aR, qb)
x
## [,1]
## [1,] 3
## [2,] 1
# all of the above steps can be done with a single command too
x = qr.solve(A, b)
x
## [1] 3 1
QR Factorization for Linear Regression
QR factorization is also useful for fitting a linear regression (\( y=\beta x+\alpha \)). For example, let's estimate the coefficients (\( \beta \) and \( \alpha \)) of a linear regression between Apple's and Google's monthly returns.
library(quantmod)
# download prices of Google and Apple
getSymbols(c("GOOG", "AAPL"))
## [1] "GOOG" "AAPL"
google = c(coredata(monthlyReturn(GOOG)))
apple = c(coredata(monthlyReturn(AAPL)))
# let's plot the returns
plot(google, apple)
Let's now apply QR factorization to find the linear regression fit (apple ~ google).
x = cbind(1, google)
xqr = qr(x)
xQ = qr.Q(xqr, complete = T)
xR = qr.R(xqr, complete = T)
# compute u = Q^T y
u = t(xQ) %*% apple
# and finally solve for $ \beta $ and $ \alpha $
backsolve(xR[1:2, 1:2], u[1:2])
## [1] 0.01636 0.66522
# let's verify the result with the lm command
lm(apple ~ google)
##
## Call:
## lm(formula = apple ~ google)
##
## Coefficients:
## (Intercept) google
## 0.0164 0.6652
Linear Algebra in Portfolio Theory
Linear equations are common in modern (Markowitz, efficient frontier) portfolio theory in finance, where they are used to estimate optimal weights subject to constraints such as maximizing return or minimizing variance. For example, consider maximizing the expected portfolio return \( \mu^T w \) subject to a target portfolio variance \( w^T \Sigma w = \sigma_P^2 \) and the full-investment constraint \( \sum_i w_i = 1 \),
where we have the following return vector, target risk and covariance matrix:
mu = c(0.08, 0.1, 0.13, 0.15, 0.2)
mu
## [1] 0.08 0.10 0.13 0.15 0.20
s = 0.25^2 # target risk
s
## [1] 0.0625
Sigma = matrix(c(0.0196, -0.00756, 0.01288, 0.00875, -0.0098, -0.00756, 0.0324,
-0.00414, -0.009, 0.00945, 0.01288, -0.00414, 0.0529, 0.020125, 0.020125,
0.00875, -0.009, 0.020125, 0.0625, -0.013125, -0.0098, 0.00945, 0.020125,
-0.013125, 0.1225), 5, 5, byrow = T)
Sigma
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.01960 -0.00756 0.01288 0.00875 -0.00980
## [2,] -0.00756 0.03240 -0.00414 -0.00900 0.00945
## [3,] 0.01288 -0.00414 0.05290 0.02013 0.02013
## [4,] 0.00875 -0.00900 0.02013 0.06250 -0.01312
## [5,] -0.00980 0.00945 0.02013 -0.01312 0.12250
To find the optimum weights, the above constrained problem needs to be solved. We will use the Lagrangian method for this purpose. The Lagrangian is:
\( F(w, \lambda) = \mu^T w + \lambda_1 (1^T w - 1) + \lambda_2 (w^T \Sigma w - \sigma_P^2) \)
To maximize return, we set the first partial derivatives of $ F(w, \lambda) $ to zero, which gives the system $ F'(w, \lambda)=0 $:
\( \mu + \lambda_1 1 + 2 \lambda_2 \Sigma w = 0, \qquad 1^T w - 1 = 0, \qquad w^T \Sigma w - \sigma_P^2 = 0 \)
To solve this nonlinear (quadratic) system, we can deploy numerical methods such as the bisection or Newton-Raphson method. Let me briefly explain these methods:
Bisection Method
If a function f(x) is continuous on [a, b] and f(a) and f(b) have opposite signs, then a root of f(x)=0 can be found with the following algorithm:
- Increase count = count + 1
- Compute f(c), where c = (a + b)/2 is the midpoint of [a, b]
- If sign(f(c)) == sign(f(a)), let a = c; otherwise let b = c
- If |b - a| > tolerance and count < totalIteration, go back to the first step
R implementation would be:
bisection <- function(f, a, b, tol = 0.001, iteration = 100) {
    count = 0
    while (b - a > tol & count < iteration) {
        # midpoint of the current interval
        c <- (a + b)/2
        # keep the half interval whose endpoints have opposite signs
        if (sign(f(c)) == sign(f(a))) a <- c else b <- c
        count = count + 1
    }
    (a + b)/2
}
This is analogous to the binary search algorithm in computer science.
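For example, the root of \( x^2 - 2 = 0 \) on [0, 2] is \( \sqrt{2} \approx 1.4142 \), which bisection recovers to within the default tolerance:
# find the square root of 2 as the root of x^2 - 2 on [0, 2]
bisection(function(x) x^2 - 2, 0, 2)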
Newton-Raphson Method
The Newton-Raphson method uses derivatives to find a root of a continuous function f(x)=0 recursively, starting from a given point \( x_0 \):
\( x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \)
Compared to bisection, the Newton-Raphson method converges faster, if it converges at all.
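As a minimal sketch (the names newton_raphson, df and x0 below are illustrative, not from the original code), a one-dimensional version of this iteration with an explicit derivative could look like:
newton_raphson <- function(f, df, x0, tol = 1e-08, iteration = 100) {
    x <- x0
    for (i in 1:iteration) {
        # Newton step: f(x) / f'(x)
        step <- f(x)/df(x)
        x <- x - step
        if (abs(step) < tol) break
    }
    x
}
# same example as above: root of x^2 - 2, starting from x = 1
newton_raphson(function(x) x^2 - 2, function(x) 2 * x, 1)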
Lagrange's Method & Newton-Raphson Method
Armed with the Newton-Raphson method, we can now solve \( G(w, \lambda) = F'(w, \lambda) = 0 \). The algorithm is:
- Compute G(x) and G'(x)
- Pick a starting point ($x_0$)
- Solve the linear system $G'(x_k)u = G(x_k)$
- Update $x_{k+1} = x_k - u$
- Repeat steps 3 and 4 a fixed number of times, or until $x_{k+1} - x_k$ is smaller than a predefined tolerance threshold (a sketch follows below)
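As a sketch of this loop (the helper name newton_system and its tolerance and iteration arguments are illustrative assumptions, not part of the original code), the whole iteration can be wrapped in a reusable function, to be applied once G and its derivative are defined below:
newton_system <- function(G, DG, x0, tol = 1e-08, iters = 100, ...) {
    x <- x0
    for (k in 1:iters) {
        # step 3: solve the linear system G'(x_k) u = G(x_k)
        u <- solve(DG(x, ...), G(x, ...))
        # step 4: update x_{k+1} = x_k - u
        x <- x - u
        if (max(abs(u)) < tol) break
    }
    x
}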
Function to compute $ G(w, \lambda) $
G <- function(x, mu, Sigma, sigmaP2) {
    n <- length(mu)
    # x[1:n] are the weights, x[n + 1] and x[n + 2] are the Lagrange multipliers
    c(mu + rep(x[n + 1], n) + 2 * x[n + 2] * (Sigma %*% x[1:n]),  # dF/dw
      sum(x[1:n]) - 1,                                            # weights sum to one
      t(x[1:n]) %*% Sigma %*% x[1:n] - sigmaP2)                   # variance equals target
}
Derivative of G
DG <- function(x, mu, Sigma, sigmaP2) {
    n <- length(mu)
    grad <- matrix(0, n + 2, n + 2)
    grad[1:n, 1:n] <- 2 * x[n + 2] * Sigma       # derivative of dF/dw w.r.t. w
    grad[1:n, n + 1] <- 1                        # derivative of dF/dw w.r.t. lambda_1
    grad[1:n, n + 2] <- 2 * (Sigma %*% x[1:n])   # derivative of dF/dw w.r.t. lambda_2
    grad[n + 1, 1:n] <- 1                        # derivative of the sum constraint
    grad[n + 2, 1:n] <- 2 * t(x[1:n]) %*% Sigma  # derivative of the variance constraint
    grad
}
Initial weights and $ \lambda $:
x = c(rep(0.5, 5), 1, 1)
x
## [1] 0.5 0.5 0.5 0.5 0.5 1.0 1.0
Let's apply the Newton-Raphson iteration now:
for (i in 1:100) {
x <- x - solve(DG(x, mu, Sigma, s), G(x, mu, Sigma, s))
}
and the numerical solution is:
x
## [1] -0.39550 0.09606 0.04584 0.70988 0.54372 -0.09201 -0.85715
We need to verify that the weight block of $ G'(x) $ is negative definite in order for the critical point to be a maximum. Let's first see what $ G'(x) $ looks like:
DG(x, mu, Sigma, s)[1:5, 1:5]
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.03360 0.012960 -0.022080 -0.01500 0.0168
## [2,] 0.01296 -0.055543 0.007097 0.01543 -0.0162
## [3,] -0.02208 0.007097 -0.090686 -0.03450 -0.0345
## [4,] -0.01500 0.015429 -0.034500 -0.10714 0.0225
## [5,] 0.01680 -0.016200 -0.034500 0.02250 -0.2100
Since this block of $ G'(x) $ is symmetric and square, all of its eigenvalues being negative confirms that it is negative definite:
eigen(DG(x, mu, Sigma, s)[1:5, 1:5])$values
## [1] -0.02024 -0.05059 -0.05806 -0.14445 -0.22364
Since all eigenvalues are negative, we can conclude that x[1:5] are the optimum weights for the given constraints, and the maximum return is:
t(x[1:5]) %*% mu
## [,1]
## [1,] 0.1992
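Finally, we can confirm that both constraints hold at the numerical solution: the weights sum to one and the portfolio variance equals the target risk.
# the weights should sum to 1
sum(x[1:5])
# the portfolio variance should equal the target risk s
t(x[1:5]) %*% Sigma %*% x[1:5]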