

Efficient Automatic Differentiation of Matrix Functions
incollection
  

Author(s)
Peder A. Olsen
, Steven J. Rennie
, Vaibhava Goel

Published in Recent Advances in Algorithmic Differentiation

Editor(s) Shaun Forth, Paul Hovland, Eric Phipps, Jean Utke, Andrea Walther 
Year 2012 
Publisher Springer 
Abstract Forward and reverse mode automatic differentiation methods for functions that take a vector argument make derivative computation efficient. However, the determinant and inverse of a matrix are not readily expressed in the language of vectors. The derivative of a function f(X) for a d×d matrix X is itself a d×d matrix. The second derivative, or Hessian, is a d^2 times d^2 matrix, and so computing and storing the Hessian can be very costly. In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix operation, the box product, to accomplish this. The box product can be used to elegantly and efficiently compute both the first and second order matrix derivatives of any function that can be expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box products, which allows us to compute the Newton step efficiently. Whereas the direct computation requires O(d^4) storage and O(d^6) operations, the indirect representation of the Hessian allows the storage to be reduced to O(kd^2) , where k is the number of times the variable X occurs in the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced to O(kd^5) in general, and O(d^3) for k = 1 and k = 2. 
CrossReferences Forth2012RAi 
AD Theory and Techniques Hierarchical Approach 
BibTeX
@INCOLLECTION{
Olsen2012EAD,
title = "Efficient Automatic Differentiation of Matrix Functions",
doi = "10.1007/9783642300233_7",
author = "Peder A. Olsen and Steven J. Rennie and Vaibhava Goel",
abstract = "Forward and reverse mode automatic differentiation methods for functions that take
a vector argument make derivative computation efficient. However, the determinant and inverse of a
matrix are not readily expressed in the language of vectors. The derivative of a function $f(X)$ for
a $d \times d$ matrix $X$ is itself a $d \times d$ matrix. The second derivative, or
Hessian, is a $d^2 times d^2$ matrix, and so computing and storing the Hessian can be very costly.
In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix
operation, the box product, to accomplish this. The box product can be used to elegantly and
efficiently compute both the first and second order matrix derivatives of any function that can be
expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The
Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box
products, which allows us to compute the Newton step efficiently. Whereas the direct computation
requires $O(d^4)$ storage and $O(d^6)$ operations, the indirect representation of the Hessian allows
the storage to be reduced to $O(kd^2)$ , where $k$ is the number of times the variable $X$ occurs in
the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced
to $O(kd^5)$ in general, and $O(d^3)$ for $k = 1$ and $k = 2$.",
pages = "7181",
crossref = "Forth2012RAi",
booktitle = "Recent Advances in Algorithmic Differentiation",
series = "Lecture Notes in Computational Science and Engineering",
publisher = "Springer",
address = "Berlin",
volume = "87",
editor = "Shaun Forth and Paul Hovland and Eric Phipps and Jean Utke and Andrea Walther",
isbn = "9783540689355",
issn = "14397358",
year = "2012",
ad_theotech = "Hierarchical Approach"
}
 
back

