# Automatic Differentiation

- `Tensors.curl`
- `Tensors.divergence`
- `Tensors.gradient`
- `Tensors.hessian`
- `Tensors.laplace`
- `Tensors.@implement_gradient`
`Tensors` supports forward mode automatic differentiation (AD) of tensorial functions to compute first order derivatives (gradients) and second order derivatives (Hessians). It does this by exploiting the `Dual` number defined in `ForwardDiff.jl`. While `ForwardDiff.jl` can itself be used to differentiate tensor functions, it is a bit awkward, because `ForwardDiff.jl` is written to work with standard Julia `Array`s. One therefore has to pass the input argument as an `Array` to `ForwardDiff.jl`, convert it to a `Tensor`, and then convert the output `Array` back to a `Tensor`. This can also be inefficient, since these `Array`s are allocated on the heap, so one needs to preallocate, which can be annoying.

Instead, it is simpler to use `Tensors`' own AD API for the differentiation. This requires no conversions, and everything is stack allocated, so there is no need to preallocate.
The API for AD in `Tensors` is `gradient(f, A)` and `hessian(f, A)`, where `f` is a function and `A` is a first or second order tensor. For `gradient`, the function can return a scalar, a vector (if the input is a vector), or a second order tensor. For `hessian`, the function should return a scalar.

When evaluating the function with dual numbers, the lower order results (the value for `gradient`; the value and gradient for `hessian`) are obtained automatically, alongside the derivative. To obtain them, `gradient` and `hessian` accept a third argument, a `Symbol`. Note that the symbol is only used to dispatch to the correct method, so it can be any symbol. In the examples the symbol `:all` is used to obtain all the lower order derivatives and values.
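Since the third argument only selects the method, any symbol behaves like `:all`. A small sketch of this, assuming `Tensors` is loaded:

```julia
using Tensors

x = rand(Vec{2})

# `:all` returns the gradient together with the function value
∇f, f = gradient(norm, x, :all)

# The symbol is only used for dispatch, so any other symbol gives the same result
∇f2, f2 = gradient(norm, x, :anything)

∇f2 == ∇f && f2 == f  # true
```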
## `Tensors.gradient` — Function

```julia
gradient(f::Function, v::Union{SecondOrderTensor, Vec, Number})
gradient(f::Function, v::Union{SecondOrderTensor, Vec, Number}, :all)
```

Computes the gradient of the input function. If the (pseudo-)keyword `:all` is given, the value of the function is also returned as a second output argument.

**Examples**

```julia
julia> A = rand(SymmetricTensor{2, 2});

julia> ∇f = gradient(norm, A)
2×2 SymmetricTensor{2, 2, Float64, 3}:
 0.434906  0.56442
 0.56442   0.416793

julia> ∇f, f = gradient(norm, A, :all);
```
## `Tensors.hessian` — Function

```julia
hessian(f::Function, v::Union{SecondOrderTensor, Vec, Number})
hessian(f::Function, v::Union{SecondOrderTensor, Vec, Number}, :all)
```

Computes the Hessian of the input function. If the (pseudo-)keyword `:all` is given, the lower order results (gradient and value) of the function are also returned as a second and third output argument.

**Examples**

```julia
julia> A = rand(SymmetricTensor{2, 2});

julia> ∇∇f = hessian(norm, A)
2×2×2×2 SymmetricTensor{4, 2, Float64, 9}:
[:, :, 1, 1] =
  0.596851  -0.180684
 -0.180684  -0.133425

[:, :, 2, 1] =
 -0.180684   0.133546
  0.133546  -0.173159

[:, :, 1, 2] =
 -0.180684   0.133546
  0.133546  -0.173159

[:, :, 2, 2] =
 -0.133425  -0.173159
 -0.173159   0.608207

julia> ∇∇f, ∇f, f = hessian(norm, A, :all);
```
## `Tensors.divergence` — Function

```julia
divergence(f, x)
```

Calculates the divergence of the vector field `f` at the point `x`.

**Examples**

```julia
julia> f(x) = 2x;

julia> x = rand(Vec{3});

julia> divergence(f, x)
6.0
```
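The result can be checked against the identity that the divergence of a vector field equals the trace of its gradient; a brief sketch:

```julia
using Tensors

f(x) = 2x
x = rand(Vec{3})

# div f = tr(∇f); for f(x) = 2x the gradient is 2I, so the trace is 6
divergence(f, x) ≈ tr(gradient(f, x))  # true
```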
## `Tensors.curl` — Function

```julia
curl(f, x)
```

Calculates the curl of the vector field `f` at the point `x`.

**Examples**

```julia
julia> f(x) = Vec{3}((x[2], x[3], -x[1]));

julia> x = rand(Vec{3});

julia> curl(f, x)
3-element Vec{3, Float64}:
 -1.0
  1.0
 -1.0
```
## `Tensors.laplace` — Function

```julia
laplace(f, x)
```

Calculates the Laplacian of the field `f` at the point `x`. If `f` is a vector field, use broadcasting.

**Examples**

```julia
julia> x = rand(Vec{3});

julia> f(x) = norm(x);

julia> laplace(f, x)
1.7833701103136868

julia> g(x) = x * norm(x);

julia> laplace.(g, x)
3-element Vec{3, Float64}:
 2.107389336871036
 2.7349658311504834
 2.019621767876747
```
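For a scalar field, the Laplacian is the trace of the Hessian, $\Delta f = \mathrm{tr}(\nabla\nabla f)$, which gives a quick consistency check:

```julia
using Tensors

f(x) = norm(x)
x = rand(Vec{3})

# The Laplacian of a scalar field equals the trace of its Hessian
laplace(f, x) ≈ tr(hessian(f, x))  # true
```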
## Examples

We here give a few examples of differentiating various functions and compare with the analytical solutions.

### Norm of a vector

\[f(\mathbf{x}) = |\mathbf{x}| \quad \Rightarrow \quad \partial f / \partial \mathbf{x} = \mathbf{x} / |\mathbf{x}|\]

```julia
julia> x = rand(Vec{2});

julia> gradient(norm, x)
2-element Vec{2, Float64}:
 0.6103600560550116
 0.7921241076829584

julia> x / norm(x)
2-element Vec{2, Float64}:
 0.6103600560550116
 0.7921241076829584
```
### Determinant of a second order symmetric tensor

\[f(\mathbf{A}) = \det \mathbf{A} \quad \Rightarrow \quad \partial f / \partial \mathbf{A} = \mathbf{A}^{-\mathrm{T}} \det \mathbf{A}\]

```julia
julia> A = rand(SymmetricTensor{2,2});

julia> gradient(det, A)
2×2 SymmetricTensor{2, 2, Float64, 3}:
  0.566237  -0.766797
 -0.766797   0.590845

julia> inv(A)' * det(A)
2×2 SymmetricTensor{2, 2, Float64, 3}:
  0.566237  -0.766797
 -0.766797   0.590845
```
### Hessian of a quadratic potential

\[\psi(\mathbf{e}) = \frac{1}{2} \mathbf{e} : \mathsf{E} : \mathbf{e} \quad \Rightarrow \quad \partial \psi / (\partial \mathbf{e} \otimes \partial \mathbf{e}) = \mathsf{E}^\text{sym}\]

where $\mathsf{E}^\text{sym}$ is the major symmetric part of $\mathsf{E}$.

```julia
julia> E = rand(Tensor{4,2});

julia> ψ(ϵ) = 1/2 * ϵ ⊡ E ⊡ ϵ;

julia> E_sym = hessian(ψ, rand(Tensor{2,2}));

julia> norm(majorsymmetric(E) - E_sym)
0.0
```
## Inserting a known derivative

When conditionals are used in a function, automatic differentiation may yield the wrong result. Consider the simplified example `f(x) = iszero(x) ? zero(x) : sin(x)`. If evaluated at `x = 0`, returning `zero(x)` gives a zero derivative, because `zero(x)` is constant, while the correct value is 1. In such cases, it is possible to insert a known derivative for a function that is part of a larger function to be automatically differentiated.

Another use case is when the analytical derivative can be computed much more efficiently than the automatically differentiated one.
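The pitfall with the conditional above can be demonstrated directly; a short sketch, using the scalar form of `gradient`:

```julia
using Tensors

# At x = 0 the branch returns the constant zero(x), so the propagated
# derivative is 0, even though the correct derivative is cos(0) = 1
f(x) = iszero(x) ? zero(x) : sin(x)

gradient(f, 0.0)    # 0.0, but the true derivative is 1.0
gradient(sin, 0.0)  # 1.0
```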
## `Tensors.@implement_gradient` — Macro

```julia
@implement_gradient(f, f_dfdx)
```

This macro allows specifying a function `f_dfdx` that provides an analytical derivative of the function `f`, and is invoked when `f` is differentiated using automatic differentiation based on `ForwardDiff.jl` (e.g. when using `Tensors.jl`'s `gradient` or `hessian`, or one of `ForwardDiff.jl`'s API functions). The function `f_dfdx` must take the same argument as `f` and should return both the value of `f` and the gradient, i.e. `fval, dfdx_val = f_dfdx(x)`. The following combinations of input and output types are supported:

| `x` | `f(x)` | `dfdx` |
| --- | --- | --- |
| `Number` | `Number` | `Number` |
| `Number` | `Vec` | `Vec` |
| `Number` | `SecondOrderTensor` | `SecondOrderTensor` |
| `Vec` | `Number` | `Vec` |
| `Vec` | `Vec` | `Tensor{2}` |
| `SecondOrderTensor` | `Number` | `SecondOrderTensor` |
| `SecondOrderTensor` | `SecondOrderTensor` | `FourthOrderTensor` |

Note that if one tensor is of symmetric type, then all tensors must be of symmetric type.
**Example**

Let's consider the function $h(\mathbf{f}(\mathbf{g}(\mathbf{x})))$ where `h(x) = norm(x)`, `f(x) = x ⋅ x`, and `g(x) = dev(x)`. For `f(x)` we have the analytical derivative

\[\frac{\partial f_{ij}}{\partial x_{kl}} = \delta_{ik} x_{lj} + x_{ik} \delta_{jl}\]

which we can supply as a known derivative using the `@implement_gradient` macro. Below, we compare with the result when the full derivative is calculated using automatic differentiation.
```julia
using Tensors

# Define functions
h(x) = norm(x)
f1(x) = x ⋅ x
f2(x) = f1(x)
g(x) = dev(x)

# Define composed functions
cfun1(x) = h(f1(g(x)))
cfun2(x) = h(f2(g(x)))

# Define the known derivative
function df2dx(x::Tensor{2, dim}) where {dim}
    println("Hello from df2dx") # Show that df2dx is called
    fval = f2(x)
    I2 = one(Tensor{2, dim})
    dfdx_val = otimesu(I2, transpose(x)) + otimesu(x, I2)
    return fval, dfdx_val
end

# Implement the known derivative
@implement_gradient f2 df2dx

# Calculate gradients
x = rand(Tensor{2, 2})
gradient(cfun1, x) ≈ gradient(cfun2, x)

# output

Hello from df2dx
true
```