Publication: Efficient Automatic Differentiation of Matrix Functions
 Introduction Applications Tools Research Groups Workshops Publications List Publications Advanced Search Info Add Publications My Account About

#### Efficient Automatic Differentiation of Matrix Functions

- incollection -

Author(s)
Peder A. Olsen , Steven J. Rennie , Vaibhava Goel

Published in

Editor(s)
Shaun Forth, Paul Hovland, Eric Phipps, Jean Utke, Andrea Walther

Year
2012

Publisher
Springer

Abstract
Forward and reverse mode automatic differentiation methods for functions that take a vector argument make derivative computation efficient. However, the determinant and inverse of a matrix are not readily expressed in the language of vectors. The derivative of a function f(X) for a d×d matrix X is itself a d×d matrix. The second derivative, or Hessian, is a d^2 times d^2 matrix, and so computing and storing the Hessian can be very costly. In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix operation, the box product, to accomplish this. The box product can be used to elegantly and efficiently compute both the first and second order matrix derivatives of any function that can be expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box products, which allows us to compute the Newton step efficiently. Whereas the direct computation requires O(d^4) storage and O(d^6) operations, the indirect representation of the Hessian allows the storage to be reduced to O(kd^2) , where k is the number of times the variable X occurs in the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced to O(kd^5) in general, and O(d^3) for k = 1 and k = 2.

Cross-References
Forth2012RAi

 BibTeX @INCOLLECTION{          Olsen2012EAD,        title = "Efficient Automatic Differentiation of Matrix Functions",        doi = "10.1007/978-3-642-30023-3_7",        author = "Peder A. Olsen and Steven J. Rennie and Vaibhava Goel",        abstract = "Forward and reverse mode automatic differentiation methods for functions that take          a vector argument make derivative computation efficient. However, the determinant and inverse of a          matrix are not readily expressed in the language of vectors. The derivative of a function $f(X)$ for          a $d \times d$ matrix $X$ is itself a $d \times d$ matrix. The second derivative, or          Hessian, is a $d^2 times d^2$ matrix, and so computing and storing the Hessian can be very costly.          In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix          operation, the box product, to accomplish this. The box product can be used to elegantly and          efficiently compute both the first and second order matrix derivatives of any function that can be          expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The          Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box          products, which allows us to compute the Newton step efficiently. Whereas the direct computation          requires $O(d^4)$ storage and $O(d^6)$ operations, the indirect representation of the Hessian allows          the storage to be reduced to $O(kd^2)$ , where $k$ is the number of times the variable $X$ occurs in          the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced          to $O(kd^5)$ in general, and $O(d^3)$ for $k = 1$ and $k = 2$.",        pages = "71--81",        crossref = "Forth2012RAi",        booktitle = "Recent Advances in Algorithmic Differentiation",        series = "Lecture Notes in Computational Science and Engineering",        publisher = "Springer",        address = "Berlin",        volume = "87",        editor = "Shaun Forth and Paul Hovland and Eric Phipps and Jean Utke and Andrea Walther",        isbn = "978-3-540-68935-5",        issn = "1439-7358",        year = "2012",        ad_theotech = "Hierarchical Approach" }