Quickstart Guide

Defining the GTPSA

We first must define a Descriptor which includes all information about a GTPSA, including the number of variables and truncation order:

# 2 variables with max truncation order 10
d10 = Descriptor(2, 10)
Descriptor(NV=2, MO=10)

Calculating a Truncated Power Series

GTPSA.jl centers around the TPS (truncated power series) type, which represents a multivariable Taylor series truncated at the chosen order in the Descriptor. Using Einstein notation for the variable indices, and letting $n$ specify the order up to a maximum truncation order $MO$, we can express a function $f$ expanded around $\vec{a}$ as a TPS:

\[f(\vec{x}) = f(\vec{a}) + \sum_{n=1}^{MO} \left.\frac{\partial f}{\partial x_{i_1} \partial x_{i_2}\ldots \partial x_{i_n}}\right\rvert_{\vec{a}} \Delta x_{i_1} \Delta x_{i_2} \ldots\Delta x_{i_n}\]

The TPS type stores all of the monomial coefficients in this Taylor series up to the chosen order (e.g. $f(\vec{a})$, $\partial f /\partial x_i |_{\vec{a}}$, etc.). You can manipulate and propagate a TPS through functions like any other number type, and the result will be a TPS containing the Taylor expansion representing all preceding operations. If you have some familiarity with automatic differentiation, then you can view a TPS as basically a Dual number which has been highly optimized for high order automatic differentiation with high numbers of variables/parameters.

In the context of GTPSA.jl, we refer to "variables" as each $\Delta x_i$ in the TPS. The reason for this is that one can always choose a coordinate system where $\vec{a}=\vec{0}$, and such a choice greatly simplifies the terminology and analysis. This brings up an important point: the TPS type itself does NOT explicitly store any expansion point. It just stores the monomial coefficients of each term in the Taylor series, truncated at the chosen order of the (tiny) variables. The setup of the problem, done by you our dear user, will decide the expansion point. In the parlance of Dual numbers, the quantity $\Delta x_i$ is equivalent to $0+1\epsilon_i$.

Let's dive into some examples, which will make these above points clearer. After defining the Descriptor, we can obtain the variables, which themselves are represented as TPSs (with each variable's first order monomial coefficient set to 1) using @vars:

d6 = Descriptor(2, 6); # 2 variables to 6th order

# Returns a Vector of each variable as a TPS
Δx = @vars(d6)
2-element Vector{TPS64{Descriptor(NV=2, MO=6)}}:
 Index Coefficient                Order   Exponent
----------------------------------------------------
   1:   1.0000000000000000e+00      1      1   0
----------------------------------------------------
   2:   1.0000000000000000e+00      1      0   1

The result is a TPS vector function $\begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix}$ corresponding directly to each variable. TPS64 is an alias for TPS{Float64}, which are TPSs that represent 64-bit floats. Likewise, ComplexTPS64 is an alias for TPS{ComplexF64}. Currently, GTPSA only supports TPSs which represent Float64 and ComplexF64 numbers.

Now let's use this to compute the Maclaurin series of the function $f(\vec{x}) = \cos{(x_1)} + \textrm{i}\sin{(x_2)}$:

f(x) = cos(x[1]) + im*sin(x[2]); # Define the function

f(Δx) # Maclaurin series
ComplexTPS64{Descriptor(NV=2, MO=6)}:
 Real                     Imag                       Order   Exponent
  1.0000000000000000e+00   0.0000000000000000e+00      0      0   0
  0.0000000000000000e+00   1.0000000000000000e+00      1      0   1
 -5.0000000000000000e-01   0.0000000000000000e+00      2      2   0
  0.0000000000000000e+00  -1.6666666666666666e-01      3      0   3
  4.1666666666666664e-02   0.0000000000000000e+00      4      4   0
  0.0000000000000000e+00   8.3333333333333332e-03      5      0   5
 -1.3888888888888887e-03   0.0000000000000000e+00      6      6   0

Another way to view "variables" in the context of GTPSA.jl are as tiny "wiggles". Specifically, if you wiggle the input, how does the output depend on those input wiggles?. The vector of variables Δx are thus just unit wiggles.

To instead compute the Taylor series of f expanded around $(-\pi/2,\pi/2)$, we simply "wiggle" around $(-\pi/2,\pi/2)$:

x0 = [-pi/2, pi/2];

ft = f(x0 + Δx)
ComplexTPS64{Descriptor(NV=2, MO=6)}:
 Real                     Imag                       Order   Exponent
  6.1232339957367660e-17   1.0000000000000000e+00      0      0   0
  1.0000000000000000e+00   0.0000000000000000e+00      1      1   0
  0.0000000000000000e+00   6.1232339957367660e-17      1      0   1
 -3.0616169978683830e-17   0.0000000000000000e+00      2      2   0
  0.0000000000000000e+00  -5.0000000000000000e-01      2      0   2
 -1.6666666666666666e-01   0.0000000000000000e+00      3      3   0
  0.0000000000000000e+00  -1.0205389992894610e-17      3      0   3
  2.5513474982236524e-18   0.0000000000000000e+00      4      4   0
  0.0000000000000000e+00   4.1666666666666664e-02      4      0   4
  8.3333333333333332e-03   0.0000000000000000e+00      5      5   0
  0.0000000000000000e+00   5.1026949964473050e-19      5      0   5
 -8.5044916607455075e-20   0.0000000000000000e+00      6      6   0
  0.0000000000000000e+00  -1.3888888888888887e-03      6      0   6

ft is a TPS containing the expansion of f around $(-\pi/2,\pi/2)$ up to 6th order, with the variables specifying deviations from $(-\pi/2,\pi/2)$. Note how some of the monomial coefficients are small but nonzero due to floating point roundoff error. To hide the display of small monomial coefficients below a certain absolute value, we can set the global variable GTPSA.show_eps:

GTPSA.show_eps = 1e-15
println(ft)
ComplexTPS64{Descriptor(NV=2, MO=6)}:
 Real                     Imag                       Order   Exponent
  0.0000000000000000e+00   1.0000000000000000e+00      0      0   0
  1.0000000000000000e+00   0.0000000000000000e+00      1      1   0
  0.0000000000000000e+00  -5.0000000000000000e-01      2      0   2
 -1.6666666666666666e-01   0.0000000000000000e+00      3      3   0
  0.0000000000000000e+00   4.1666666666666664e-02      4      0   4
  8.3333333333333332e-03   0.0000000000000000e+00      5      5   0
  0.0000000000000000e+00  -1.3888888888888887e-03      6      0   6
Note

Setting GTPSA.show_eps does not actually reset any monomial coefficients in the TPSs equal to zero, rather just suppresses their output when showing any TPS.

TPS Evaluation, Composition, Translation, and Inversion

We can evaluate TPSs just like any other function. Continuing with our example above, let's evaluate ft, which is an expansion around $(-\pi/2,\pi/2)$ at, say, $(\Delta x_1,\Delta x_2)=(\pi,-\pi)$, and see how well it agrees with the exact calculation using f:

abs(ft([-pi, pi]) - f(x0 + [-pi, pi]))
0.565059330890453

The disagreement is quite large! This is due to the truncation error of the TPS. If we increase the maximum order of the GTPSA, we can obtain a better approximation of the function f for such large variables. Let's test that now, setting the truncation order to 20 for both variables:

d20 = Descriptor(2, 20); # two variables to 20th order
Δx = @vars(d20);
ft = f(x0 + Δx);
abs(ft([-pi, pi]) - f(x0 + [-pi, pi]))
5.343014327042979e-10

ft now approximates f much better for such large variables!

We also can compose TPSs. For this example, let's define a new GTPSA with only 1 variable to 1st order, and define the functions $f(x) = x^2+2x$ and $g(x)=3+4x$:

d = Descriptor(1, 1);
Δx = first(@vars(d));
f(x) = x^2 + 2*x;
g(x) = 3 + 4*x;
g (generic function with 1 method)

Of course, we can compute a TPS representing the expansion of $f\circ g$ around $x=0$ by:

fg_exact = f(g(Δx))
TPS64{Descriptor(NV=1, MO=1)}:
 Coefficient                Order   Exponent
  1.5000000000000000e+01      0      0
  3.2000000000000000e+01      1      1

However, we can also separately compute TPSs for $f$ and $g$ around $x=0$, and then compose them:

ft = f(Δx);
gt = g(Δx);
fg_composed = ft ∘ gt # Or equivalently ft(gt)
TPS64{Descriptor(NV=1, MO=1)}:
 Coefficient                Order   Exponent
  6.0000000000000000e+00      0      0
  8.0000000000000000e+00      1      1

Wait a minute, what happened here? fg_composed certainly is not equal to fg_exact.

This example presents the important concept of TPS feed-down error. The explanation is simple; for fg_composed, we separately obtained TPSs expanded around $x=0$, and then tried to compose them. However, after passing through $g$, a nonzero 0th order term (also called the scalar part of the TPS) was picked up. Thus, we should have expanded $f$ around the scalar part of gt, instead of $x=0$. Then, to keep the same coordinates as initially inputted into gt, we'll also need to translate this TPS expansion point back to the original expansion point:

# First find the expansion of `f` around the 0th order (scalar part) of `g`
ft1 = f(scalar(gt) + Δx); # We can use `scalar` to get the scalar part of a `TPS`

# Then `translate` the expansion point back to the original expansion point
ft = translate(ft1, -scalar(gt));

fg_composed = ft ∘ gt
TPS64{Descriptor(NV=1, MO=1)}:
 Coefficient                Order   Exponent
  1.5000000000000000e+01      0      0
  3.2000000000000000e+01      1      1

fg_composed now fully agrees with fg_exact.

This example shows the care that must be taken when composing two separate truncated power series. Another, perhaps simpler, way of dealing with this problem is to always chose a coordinate system such that the scalar part of a TPS is equal to zero.

GTPSA also includes a routine to invert a TPS map. The inversion routine ignores any scalar part of the TPS, so it is the responsibility of the user to ensure the coordinate system is consistent by either translating the TPSs or using a coordinate system where the scalar part is zero.

As an example, let's invert the following TPS map, which has no scalar part:

d = Descriptor(2, 2);
Δx = @vars(d);
M = [  Δx[1] + 2*Δx[2] + 3*Δx[1]*Δx[2],
     3*Δx[1] + 4*Δx[2] + Δx[1]^2 + Δx[2]^2]
M_inv = inv(M)

M_inv ∘ M
2-element Vector{TPS64{Descriptor(NV=2, MO=2)}}:
 Index Coefficient                Order   Exponent
----------------------------------------------------
   1:   1.0000000000000000e+00      1      1   0
----------------------------------------------------
   2:   1.0000000000000000e+00      1      0   1

Other TPS Constructors

After defining a Descriptor, we can construct a TPS using any of the following:


julia> d = Descriptor(3, 5); # 3 variables to 5th order
julia> t = TPS64{d}() # Constructs a blank `TPS`, equivalent to `TPS{Float64,d}()`TPS64{Descriptor(NV=3, MO=5)}: Coefficient Order Exponent 0.0000000000000000e+00 0 0 0 0
julia> t1 = TPS64{d}(1.0) # Constructs a `TPS` with scalar part set to 1.0TPS64{Descriptor(NV=3, MO=5)}: Coefficient Order Exponent 1.0000000000000000e+00 0 0 0 0
julia> t1c = ComplexTPS64{d}(1.0) # Equivalent to `TPS{ComplexF64,d}(1.0)`ComplexTPS64{Descriptor(NV=3, MO=5)}: Real Imag Order Exponent 1.0000000000000000e+00 0.0000000000000000e+00 0 0 0 0
julia> t1im = TPS{d}(1.0im) # If the number type is not specified, then it is inferredComplexTPS64{Descriptor(NV=3, MO=5)}: Real Imag Order Exponent 0.0000000000000000e+00 1.0000000000000000e+00 0 0 0 0

When constructing TPSs in this manner, it is important to include the Descriptor in the type parameter of the constructor, to ensure the Descriptor of the TPS is resolved statically (at compile time). If it is not included, then the Descriptor of the TPS will be resolved dynamically instead, at runtime. GTPSA.jl provides both static and dynamic Descriptor resolution modes, each of which have certain advantages in different use cases. See the advanced topics section of the documentation for more details.

Partial Derivative Getting/Setting

Individual Monomial Coefficient

Note

The value of a partial derivative is equal to the monomial coefficient in the Taylor series multiplied by a constant factor. E.g. for an expansion around zero $f(x)\approx f(0) + \frac{\partial f}{\partial x}\rvert_0x + \frac{1}{2!}\frac{\partial^2 f}{\partial x^2}\rvert_0 x^2 + ...$, the 2nd order monomial coefficient is $\frac{1}{2!}\frac{\partial^2 f}{\partial x^2}\rvert_0$.

Individual monomial coefficients in a TPS t can be get/set with three methods of indexing:

  1. By Monomial Index: t[idx::Integer]. Indexes the TPS with all monomials sorted by order. For example, for a TPS with two variables $Δx_1$ and $Δx_2$, the $Δx_1$ monomial is indexed with t[1] and the $Δx_1^2$ monomial is indexed with t[3]. The zeroth order part, or the scalar part of the TPS, can be indexed with t[0].
  2. By Order: t[[<Δx_1 order>, ..., <Δx_NV order>]]. For example, for a TPS with three variables $Δx_1$, $Δx_2$, and $Δx_3$, the $Δx_1^3Δx_2^1$ monomial coefficient is accessed with t[[3,1,0]] or equivalently t[[3,1]], as leaving out trailing zeros for unincluded variables is allowed. A tuple is also allowed instead of a vector for the list of orders.
  3. By Sparse Monomial: t[[<ix_var> => <order>, ...]]. This method of indexing is convenient when a TPS contains many variables and parameters. For example, for a TPS with variables $Δx_1,Δx_2,...Δx_{100}$, the $Δx_{1}^3Δx_{99}^1$ monomial coefficient is accessed with t[[1=>3, 99=>1]]. A tuple is also allowed instead of a vector for the list of pairs.

These three methods of indexing are best shown with an example:

# Example of indexing by monomial index -----------
d = Descriptor(3, 10);
t = TPS{d}(); # or equivalently TPS{Float64,d}()

t[0] = 0;
t[1] = 1;
t[2] = 2;
t[3] = 3;
t[4] = 4;
t[5] = 5;
t[6] = 6;
t[7] = 7;
t[8] = 8;
t[9] = 9;
t[10] = 10;

t
TPS64{Descriptor(NV=3, MO=10)}:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      1      1   0   0
  2.0000000000000000e+00      1      0   1   0
  3.0000000000000000e+00      1      0   0   1
  4.0000000000000000e+00      2      2   0   0
  5.0000000000000000e+00      2      1   1   0
  6.0000000000000000e+00      2      0   2   0
  7.0000000000000000e+00      2      1   0   1
  8.0000000000000000e+00      2      0   1   1
  9.0000000000000000e+00      2      0   0   2
  1.0000000000000000e+01      3      3   0   0
# Example of indexing by order -----------
d = Descriptor(3, 10);
t = TPS{d}();

t[[0]] = 1;
t[[1]] = 2;
t[[0,1]] = 3;
t[(0,0,1)] = 4;  # Tuples also allowed
t[(2,1,3)] = 5;

t
TPS64{Descriptor(NV=3, MO=10)}:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      0      0   0   0
  2.0000000000000000e+00      1      1   0   0
  3.0000000000000000e+00      1      0   1   0
  4.0000000000000000e+00      1      0   0   1
  5.0000000000000000e+00      6      2   1   3
# Example of indexing by sparse monomial -----------
d = Descriptor(3, 10);
t = TPS{d}();

t[[1=>1]] = 2;
t[[2=>1]] = 3;
t[[3=>1]] = 4;
t[(1=>2,2=>1,3=>3)] = 5;  # Tuples also allowed

t
TPS64{Descriptor(NV=3, MO=10)}:
 Coefficient                Order   Exponent
  2.0000000000000000e+00      1      1   0   0
  3.0000000000000000e+00      1      0   1   0
  4.0000000000000000e+00      1      0   0   1
  5.0000000000000000e+00      6      2   1   3

The GTPSA.cycle! function can also be used to cycle through all nonzero monomials in a TPS.

Gradients, Jacobians, Hessians

The convenience getters gradient, jacobian, and hessian (as well as their corresponding in-place methods gradient!, jacobian!, and hessian!) are also provided for extracting partial derivatives from a TPS/array of TPSs. Note that these functions are not actually calculating anything - at this point the TPS should already have been propagated through, and these functions are just extracting the corresponding partial derivatives.

using GTPSA; GTPSA.show_sparse=false; #hide
# 2nd Order TPSA with 100 variables
d = Descriptor(100, 2);
Δx = @vars(d);

out = cumsum(Δx);

# Convenience getters for partial derivative extracting:
grad1 = GTPSA.gradient(out[1]);
J = GTPSA.jacobian(out);
h1 = GTPSA.hessian(out[1]);

# Also in-place getters
GTPSA.gradient!(grad1, out[1]);
GTPSA.jacobian!(J, out);
GTPSA.hessian!(h1, out[1]);

Slicing a TPS

Parts of a TPS with certain variable orders can be extracted by slicing the TPS. When indexing by order, a colon (:) can be used in place for a variable order to include all orders of that variable. If the last specified index is a colon, then the rest of the variable indices are assumed to be colons:

d = Descriptor(5, 10);
Δx = @vars(d);

f = 2*Δx[1]^2*Δx[3] + 3*Δx[1]^2*Δx[2]*Δx[3]*Δx[4]^2*Δx[5] + 6*Δx[3] + 5;
g = f[[2,:,1]];
h = f[[2,:,1,:]];

print(f)
print(g)
print(h)
TPS64{Descriptor(NV=5, MO=10)}:
 Coefficient                Order   Exponent
  5.0000000000000000e+00      0      0   0   0   0   0
  6.0000000000000000e+00      1      0   0   1   0   0
  2.0000000000000000e+00      3      2   0   1   0   0
  3.0000000000000000e+00      7      2   1   1   2   1
TPS64{Descriptor(NV=5, MO=10)}:
 Coefficient                Order   Exponent
  2.0000000000000000e+00      3      2   0   1   0   0
TPS64{Descriptor(NV=5, MO=10)}:
 Coefficient                Order   Exponent
  2.0000000000000000e+00      3      2   0   1   0   0
  3.0000000000000000e+00      7      2   1   1   2   1

A TPS can also be sliced with indexing by sparse monomial. In this case, if a colon is included anywhere in the sparse monomial index, then all orders of all variables not explicity specified will be included:

 # Colon position does not matter in sparse-monomial indexing
g = f[[1=>2, :, 3=>1, 4=>0, 5=>0]];
h = f[[1=>2, 3=>1, :]];

print(g)
print(h)
TPS64{Descriptor(NV=5, MO=10)}:
 Coefficient                Order   Exponent
  2.0000000000000000e+00      3      2   0   1   0   0
TPS64{Descriptor(NV=5, MO=10)}:
 Coefficient                Order   Exponent
  2.0000000000000000e+00      3      2   0   1   0   0
  3.0000000000000000e+00      7      2   1   1   2   1

@FastGTPSA/@FastGTPSA! Macros

The macros @FastGTPSA/@FastGTPSA! can be used to speed up evaluation of expressions that may contain TPSs. Both macros are completely transparent to all other types, so they can be prepended to any existing expressions while still maintaining type-generic code. Any functions in the expression that are not overloaded by GTPSA will be ignored. Both macros do NOT use any -ffast-math business (so still IEEE compliant), but instead will use a thread-safe pre-allocated buffer in the Descriptor for any temporaries that may be generated during evaluation of the expression.

The first macro, @FastGTPSA can be prepended to an expression following assignment (=, +=, etc) to only construct one TPS (which requires two allocations), instead of a TPS for every temporary:

julia> using GTPSA, BenchmarkTools
julia> d = Descriptor(3, 7); Δx = @vars(d);
julia> @btime $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; 6.101 μs (20 allocations: 11.88 KiB)
julia> @btime @FastGTPSA $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; 5.547 μs (2 allocations: 1.94 KiB)
julia> y = rand(3); # transparent to non-TPS types
julia> @btime $y[1]^3*sin($y[2])/log(2+$y[3])-exp($y[1]*$y[2])*im; 20.379 ns (0 allocations: 0 bytes)
julia> @btime @FastGTPSA $y[1]^3*sin($y[2])/log(2+$y[3])-exp($y[1]*$y[2])*im; 21.323 ns (0 allocations: 0 bytes)

The second macro, @FastGTPSA! can be prepended to the left-hand side of an assignment, and will fill a preallocated TPS with the result of an expression. @FastGTPSA! will calculate a TPS expression with zero allocations, and will still have no impact if a non-TPS type (even if immutable) is used. The only requirement is that all symbols in the expression are defined:


julia> t = ComplexTPS64(); # pre-allocate
julia> @btime @FastGTPSA! $t = $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; 5.410 μs (0 allocations: 0 bytes)
julia> y = rand(3); @gensym z; # transparent to non-TPS types
julia> @btime @FastGTPSA! $z = $y[1]^3*sin($y[2])/log(2+$y[3])-exp($y[1]*$y[2])*im; 20.057 ns (0 allocations: 0 bytes)

Both @FastGTPSA and @FastGTPSA! can also be prepended to a block of code, in which case they are applied to each assignment in the block:


julia> d = Descriptor(3, 7); Δx = @vars(d);
julia> y = rand(3);
julia> @btime @FastGTPSA begin t1 = $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; t2 = $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; z = $y[1]^3*sin($y[2])/log(2+$y[3])-exp($y[1]*$y[2])*im; end; 11.251 μs (4 allocations: 3.88 KiB)
julia> t3 = ComplexTPS64(); t4 = ComplexTPS64(); @gensym w;
julia> @btime @FastGTPSA! begin $t3 = $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; $t4 = $Δx[1]^3*sin($Δx[2])/log(2+$Δx[3])-exp($Δx[1]*$Δx[2])*im; $w = $y[1]^3*sin($y[2])/log(2+$y[3])-exp($y[1]*$y[2])*im; end; 11.021 μs (0 allocations: 0 bytes)

Both macros are also compatible with broadcasted, vectorized operators:


julia> d = Descriptor(3, 7); Δx = @vars(d); y = rand(3);
julia> @btime @FastGTPSA begin out = @. $Δx^3*sin($y)/log(2+$Δx)-exp($Δx*$y)*im; end; 14.437 μs (7 allocations: 5.89 KiB)
julia> out = zeros(ComplexTPS64, 3); # pre-allocate
julia> @btime @FastGTPSA! begin @. $out = $Δx^3*sin($y)/log(2+$Δx)-exp($Δx*$y)*im; end; 13.996 μs (0 allocations: 0 bytes)

The advantages of using the macros become especially apparent in more complicated systems. See our example where we observed a significant speedup.

Automatic Differentiation using GTPSA.jl

At the time of writing this documentation, a pull request is ready for merge with Julia's DifferentiationInterface.jl package, which provides a generic interface for computing pushforwards (Jacobian vector products), derivatives, gradients, Jacobians, Hessians, and Hessian vector products using any automatic differentiation (AD) backend. If you are only interested in using GTPSA for these computations, then we strongly recommend using DifferentiationInterface.jl with the AutoGTPSA AD type.

Note

GTPSA has been highly optimized for high order AD with large numbers of variables and parameters. If you are only interested in first order AD, then other AD backends will likely be faster. GTPSA, however, will likely be faster in cases where the entire Hessian must be materialized.

Compatibility with DifferentialEquations.jl

GTPSA.jl bindings have been added to DiffEqBase.jl so that TPSs can be integrated using any of the solvers provided by SciML's extensive Differential Equations ecosystem. Simply specify your initial conditions as TPS types. This allows, for example, for computation of high-order Hamiltonian maps that can then be analyzed using canonical perturbation theory.