For Developers
Developers may fork the GTPSA.jl repo and then dev their forked repo in the REPL. For example, if my Github username is githubuser, then after forking GTPSA.jl I would run in the REPL:
import Pkg
Pkg.develop(url="https://github.com/githubuser/GTPSA.jl")The package consists of two layers: a low-level layer written in Julia that is 1-to-1 with the GTPSA C code, and a high-level, user-friendly layer that cleans up the notation for manipulating TPSs, manages temporaries generated during evaluation, and properly manages the memory in C when variables go out of scope in Julia. The low-level functions are listed at the bottom of this page.
When it comes to managing memory in C via Julia, there are certain intricacies that have to be considered. First, let's consider the Descriptor, which is the simplest:
Descriptor
The Descriptor stores all information about the GTPSA, including the indexing table for indexing specific monomials. This is a static object, only created once for a GTPSA, which all TPSs refer to. In C, these structs must exist for the entirety of the program execution. In Julia, because they must not be destroyed when out-of-scope, we wrap these objects in immutable structs. In a single program, up to 250 Descriptors can exist simultanouesly, and they can be manually optionally destroyed using GTPSA.mad_desc_del!. At program termination all Descriptors are destroyed. Given the significantly large number allowed, as well as the danger of still-existing TPS and Descriptor structs after destruction, no front-facing interface to the user is given for destroying existing Descriptors.
The Descriptor struct simply wraps a C-pointer to a low-level struct called Desc: this struct is 1-to-1 equivalent to the C struct desc in GTPSA. See the documentation for GTPSA.Desc below. By having this struct in Julia, we can unsafe_load the struct and get values in the desc. For example, to access the first TPS{Float64} in the buffer of temporaries in the Descriptor, we can do
julia> using GTPSAjulia> import GTPSA: Descjulia> d = Descriptor(5,8)Descriptor(NV=5, MO=8)julia> desc = unsafe_load(d.desc) # To access the low-level C structGTPSA.Desc(2, 5, 5, 0, 0x08, 0x00, 0xff, Ptr{UInt8} @0x00000000061bc148, 0, 4, 0x00000507, 0x00000000, 0x00000000, Ptr{Int32} @0x000000000621a118, Ptr{UInt8} @0x0000000005799d88, Ptr{UInt8} @0x0000000005fdd6b8, Ptr{UInt8} @0x0000000005d398d8, Ptr{Ptr{UInt8}} @0x000000000637c408, Ptr{Ptr{UInt8}} @0x00000000065d5e18, Ptr{Ptr{UInt8}} @0x00000000061263d8, Ptr{Int32} @0x0000000005f9ded8, Ptr{Int32} @0x0000000005cfd008, Ptr{Int32} @0x0000000005db2588, Ptr{Int32} @0x0000000006616208, Ptr{Ptr{Int32}} @0x00000000066e7688, Ptr{Ptr{Ptr{Int32}}} @0x000000000618cd18, 0x0000000000022195, 0.0, 0.0, 0.0, Ptr{Ptr{Nothing}} @0x0000000005d68bf8, Ptr{Ptr{Nothing}} @0x000000000589c888, Ptr{Int32} @0x0000000005d20bf8, Ptr{Int32} @0x0000000006029298)julia> t_jl = unsafe_load(Base.unsafe_convert(Ptr{Ptr{TPS{Float64}}}, desc.t), 1) # 1 in Julia = 0 in CPtr{TPS64} @0x00000000058141e8
In Julia, if we were to then unsafe_load(t_jl), there would in fact be allocations, as its internally creating a copy of the TPS{Float64} in Julia. This is not well documented in the documentation of unsafe_load (see the discussion here).
TPS
The TPS{Float64} struct in Julia corresponds exactly to the C struct tpsa and TPS{ComplexF64} struct in Julia corresponds exactly to ctpsa in C. To understand fully the TPS struct, some history is needed:
In early development versions of GTPSA.jl, the TPS struct was very similar to the Descriptor struct: it used to just wrap a C Ptr to a low-level struct, and instead be mutable so out-of-scope TPSs and temporaries are cleaned up. This at the time seemed like the simplest solution, since in Julia there is no way to tag C pointers for Julia garbage collection. This created some problems though: firstly, there is one indirection because Julia would tag that particular mutable wrapper struct for GC, but then the member Ptr of that struct pointed to a different place in memory that had to be cleaned up. Secondly, when calling GTPSA C functions that want a Vector of TPS (e.g. Ptr to TPS), you would need to do a map(t->t.tpsa, x) where x is a Vector{TPS64} and tpsa is the field member of the wrapper TPS struct. Thirdly, and perhaps most significantly, Julia is not aware of how much memory the C is using. Therefore, it will not call the garbage collector very often, even if actually >100GB of memory is being used. Sometimes the OS will just kill the Julia process.
In order to get around these problems, we must allocate the entire mutable TPS struct in Julia instead of in the C code (e.g. using GTPSA.mad_tpsa_newd). We then can use pointer_from_objref in Julia to get a pointer to that mutable, Julia-owned struct to pass to the C functions.
Sounds simple enough, right? If only! In the GTPSA C code, the coef member array is something called a flexible array member. This is great in C, because instead of the struct storing a pointer to an array (which would cause an indirection every time the coef array is accessed in a TPS), it actually stores the array right there in the struct, with variable size. This gives some performance gains. In Julia, there is no such analog. For those who know about StaticArrays.jl, you might think an SVector could work, but surprise, it doesn't because you cannot mutate the fields of an SVector and neither can the C code.
So the only solution it seems is to change the actual C struct in mad_tpsa_impl.h and mad_ctpsa_impl.h to use regular arrays for coef instead of flexible array members, and indeed this is what is done for GTPSA.jl. There is a tiny speed reduction due to the indirection of accessing coef, however the benefits of Julia's garbage collector knowing how much memory it's using, and keeping memory usage sane, is worth the very tiny cost.
On the Julia side, it turns out you cannot just use a regular Vector for the coef array in the TPS struct, because Julia's Vector structs are quite complex and play a lot of tricks. You might actually be able to use an MVector from StaticArrays, however using this struct might make GTPSA.jl significantly more complex because the size of the array has to be known at compile-time or else you suffer the drastic performance reductions caused by type-instabilities. The complexity of using this could be checked at some point in the future.
The decided solution was to, in the Julia, @ccall jl_malloc for the coef array, and in the finalizer for the mutable TPS struct call @ccall jl_free for the coef array. This gives us C-style arrays that Julia's garbage collector knows about, and so will make sure to keep the memory usage sane.
When @ccall is used, the arguments are Base.unsafe_converted to the corresponding specified argument types. Therefore, for TPS all we had to do then was define
Base.unsafe_convert(::Type{Ptr{TPS{T}}}, t::TPS{T}) where {T} = Base.unsafe_convert(Ptr{TPS{T}},pointer_from_objref(t))and now we can pass our TPS structs to C using @ccall.
TempTPS
Because unsafe_load of a Ptr{<:TPS} creates a copy and allocates, we cannot treat the constant buffer of pre-allocated temporaries in the Descriptor as bona-fide TPSs. Note that the memory addresses of the temporaries in the buffer are constant and do not need to be cleaned up; they are immutable!. The temporaries, which we give the type GTPSA.TempTPS, do not have any of the problems of just wrapping a pointer as do the TPSs, and so that's what they are. Also in Julia, in order to access the fields of a TempTPS (e.g. mo) via unsafe_load without allocating, we need an immutable struct having the same structure as TPS. This is the purpose of GTPSA.LowTempTPS. We need to use the mo field for @FastGTPSA to finally allocate the result TPS with the mo of the result TempTPS.
As with TPS, we also had to define Base.unsafe_convert for TempTPS so we can @ccall. In this case, the unsafe_convert returns the member Ptr of the TempTPS.
GTPSA.TempTPS — Typestruct TempTPS{T<:Union{Float64,ComplexF64},D}This is for internal use only. TempTPS is a temporary TPS, which has been pre-allocated in a buffer for each thread in the Descriptor C struct. When using the @FastGTPSA macro, all temporaries generated will be used from this buffer. "Constructors" of this type simply take a temporary from that particular thread's buffer in a stack-like manner and "Destructors" (which must be manually called because this is immutable) release it from the stack.
Fields
t::Ptr{TPS{T,D}}– Pointer to theTPSin the buffer in theDescriptor
Library Structure
All operators have an in-place, mutating version specified with a bang (!). These are the lowest-level pure Julia code, following the convention that the first argument is the one to contain the result. In the GTPSA C library, the last argument contains the result, so this is accounted for in the file inplace_operators.jl. All in-place functions can receive either a regular TPS , which the user will be using, as well as a GTPSA.TempTPS, which the user should not concern themselves with. The constants RealTPS and ComplexTPS are defined respectively in low_level/rtpsa.jl and low_level/ctpsa.jl to simplify the notation. These are just:
# Internal constants to aid multiple dispatch including temporaries
const RealTPS = Union{TempTPS{Float64}, TPS{Float64}}
const ComplexTPS = Union{TempTPS{ComplexF64}, TPS{ComplexF64}}All this does is enforce correct types for the in-place functions, while keeping the notation/code simple.
The in-place, mutating functions, defined in inplace_operators.jl must all use the RealTPS and ComplexTPS "types". Then, the higher level out-of-place functions for both TPS and TempTPS, which do different things with the result, will use these in-place functions.
The out-of-place functions for TPS are defined in operators.jl, and the out-of-place functions for TempTPS are defined in fastgtpsa/operators.jl.
Fast GTPSA Macros
The @FastGTPSA/@FastGTPSA! macros work by changes all arithmetic operators in different Julia arithmetic operators with the same operator precedence and unary operator capabilities. These special operators then dispatch on functions that use the temporaries when a TPS or TempTPS is passed, else default to their original operators, thereby making it completely transparent to non-TPS types. Both + and - must work as unary operators, and there is a very short list of allowed ones shown here. The other arithmetic operators were chosen somewhat randomly from the top of the same file, next to prec-plus, prec-times, and prec-power which defines the operator precedences. By taking this approach, we relieve ourselves of having to rewrite PEMDAS and instead let the Julia do it for us.
All arithmetic operators are changed to GTPSA.:<special symbols>, e.g. + → GTPSA.:±. All non-arithmetic operators that are supported by GTPSA are then changed to GTPSA.__t_<operator>, e.g. sin → GTPSA.__t_sin, where the prefix __t_ is also chosen somewhat arbitrarily. These operators are all defined in fastgtpsa/operators.jl, and when they encounter a TPS type, they use the temporaries, and when other number types are detected, they fallback to the regular, non-__t_ operator. This approach works extraordinarily well, and introduces no problems externally because none of these functions/symbols are exported.
Calling the C-library with pointers and pointer-to-pointers
All of the GTPSA map functions require a vector of TPS as input, in C **tpsa. In Julia, this works automatically for AbstractArray{<:Union{TPS64,ComplexTPS64}} by specifying the C argument type in the C call as ::Ptr{TPS64} or ::Ptr{ComplexTPS64}. However, this can be misleading to those C inputs which require *tpsa, e.g. also ::Ptr{TPS64}. For consistency in the low level library interface, all pointers are specified as ::Ref{TPS64} and ::Ref{ComplexTPS64}, and pointers-to-pointers are specified as Ptrs. For single TPSs, the cconvert to Ref in the ccall is handled automatically.
However, in some cases, one might have only a single TPS64 and would like the call the corresponding map functions without having to allocate an array. After some experimenting, I've found the following overrides to gives zero allocations:
Base.unsafe_convert(::Type{Ptr{TPS{T}}}, r::Base.RefValue{Ptr{Nothing}}) where {T} = Base.unsafe_convert(Ptr{TPS{T}}, Base.unsafe_convert(Ptr{Cvoid}, r))
Base.cconvert(::Type{Ptr{TPS{T}}}, t::TPS{T}) where {T} = Ref(pointer_from_objref(t))NOTE: We need to have a GC.@preserve before the cconvert to keep t valid! As such, you will see in all map functions a call to GC.@preserve.
This override is only necessary for the mutable TPS. The other array inputs of isbits types are ok and basically have the above defined for them. See this discussion for more details.
Low-Level
Below is documentation for every single 1-to-1 C function in the GTPSA library. If there is any function missing, please submit an issue to GTPSA.jl.
Monomial
GTPSA.mad_mono_add! — Functionmad_mono_add!(n::Cint, a, b, r)Sets monomial r = a + b.
Input
n– Length of monomialsa– Source monomialab– Source monomialb
Output
r– Destination monomial,r = a + b
GTPSA.mad_mono_cat! — Functionmad_mono_cat!(n::Cint, a, m::Cint, b, r)Sets monomial r equal to the concatenation of the monomials a and b
Input
n– Length of monomonialaa– Source monomialam– Length of monomialbb– Source monomialb
Output
r– Destination monomial of concatenation ofaandb(lengthn+m)
GTPSA.mad_mono_cmp — Functionmad_mono_cmp(n::Cint, a, b)::CintCompares monomial a to monomial b, and returns the first difference in the lowest order variables.
Input
n– Length of monomialsa– Monomialab– Monomialb
Output
ret– Firsta[i]-b[i] != 0
GTPSA.mad_mono_copy! — Functionmad_mono_copy!(n::Cint, a, r)Copies monomial a to monomial r.
Input
n– Length of monomialsa– Source monomialr– Destination monomial
GTPSA.mad_mono_eq — Functionmad_mono_eq(n::Cint, a, b)::BoolChecks if the monomial a is equal to the monomial b.
Input
n– Length of monomialsa– Monomialab– Monomialb
Output
ret– True if the monomials are equal, false if otherwise
GTPSA.mad_mono_eqn — Functionmad_mono_eqn(n::Cint, a, b::Cuchar)::Bool???
GTPSA.mad_mono_fill! — Functionmad_mono_fill!(n::Cint, a, v::Cuchar)Fills the monomial a with the value v.
Input
n– Monomial lengtha– Monomialv– Value
GTPSA.mad_mono_le — Functionmad_mono_le(n::Cint, a, b)::BoolChecks if monomial a is less than or equal to monomial b.
Input
n– Length of monomialsa– Monomialab– Monomialb
Output
ret– True ifa <= mono_b, false otherwise
GTPSA.mad_mono_lt — Functionmad_mono_lt(n::Cint, a, b)::BoolChecks if monomial a is less than monomial b.
Input
n– Length of monomialsa– Monomialab– Monomialb
Output
ret– True ifa < b, false otherwise
GTPSA.mad_mono_max — Functionmad_mono_max(n::Cint, a)::CucharReturns the maximum order of the monomial.
Input
n– Length of monomiala– Monomial
Output
mo– Maximum order of monomiala
GTPSA.mad_mono_min — Functionmad_mono_min(n::Cint, a)::CucharReturns the minimum order of the monomial.
Input
n– Length of monomiala– Monomial
Output
mo– Mininum order of monomiala
GTPSA.mad_mono_ord — Functionmad_mono_ord(n::Cint, a)::CintReturns the sum of the orders of the monomial a.
Input
n– Monomial lengtha– Monomial
Output
s– Sum of orders of monomial
GTPSA.mad_mono_ordp — Functionmad_mono_ordp(n::Cint, a, stp::Cint)::CdoubleReturns the product of each stp-th order in monomial a. For example, stp = 2 collects every order in the monomial with a step of 2 between each. As a is a pointer, the product can be started at any element in the monomial.
Input
n– Monomial lengtha– Monomial as byte arraystp– Step over which orders to include in the product
Output
p– Product of orders of monomial separated bystp.
GTPSA.mad_mono_ordpf — Functionmad_mono_ordpf(n::Cint, a, stp::Cint)::CdoubleReturns the product of factorials each stp-th order in monomial a. For example, stp = 2 collects every order in the monomial with a step of 2 between each. As a is a pointer, the product can be started at any element in the monomial.
Input
n– Monomial lengtha– Monomial as byte arraystp– Step over which orders to include in the product of factorials
Output
p– Product of factorials of orders of monomial separated bystp
GTPSA.mad_mono_print — Functionmad_mono_print(n::Cint, a, sep_::Cstring, fp_::Ptr{Cvoid})Prints the monomial to stdout.
Input
n– Length of monomiala– Source monomial to print tostdoutsep_– Separator stringfp_– CFILEpointer, if null will print tostdout
GTPSA.mad_mono_prt! — Functionmad_mono_prt(n::Cint, a, s::Ptr{Cuchar})::CstringWrites the monomial defined by the byte array a (with orders stored as hexadecimal) into a null terminated string s.
Input
n– Monomial and string lengtha– Monomial as byte array
Output
ret– Monomial as string
GTPSA.mad_mono_rcmp — Functionmad_mono_rcmp(n::Cint, a, b)::CintCompares monomial a to monomial b starting from the right (when the monomials are ordered by variable, which is almost never the case) and returns the first difference in the lowest order variables.
Input
n– Length of monomialsa– Monomialab– Monomialb
Output
ret– Firsta[i]-b[i] != 0whereistarts from the end.
GTPSA.mad_mono_rev! — Functionmad_mono_rev!(n::Cint, a, r)Sets destination monomial r equal to the reverse of source monomial a.
Input
n– Lengths of monomialsa– Source monomiala
Output
r– Destination monomial of reverse monomiala
GTPSA.mad_mono_str! — Functionmad_mono_str!(n::Cint, a, s::Cstring)::CintWrites the monomial defined in the string s, which stores the orders in a human-readable format (e.g. 10 is 10, not 0xa), into the byte array a with the orders specified in hexadecimal.
Input
n– Monomial and string lengths– Monomial as string "[0-9]*"
Output
a– Monomial as a byte array converted from the input stringi– Adjusted sizenof byte array if '