For Developers

Developers may fork the GTPSA.jl repo and then dev their forked repo in the REPL. For example, if my Github username is githubuser, then after forking GTPSA.jl I would run in the REPL:

import Pkg
Pkg.develop(url="https://github.com/githubuser/GTPSA.jl")

The package consists of two layers: a low-level layer written in Julia that is 1-to-1 with the GTPSA C code, and a high-level, user-friendly layer that cleans up the notation for manipulating TPSs, manages temporaries generated during evaluation, and properly manages the memory in C when variables go out of scope in Julia. The low-level functions are listed at the bottom of this page.

When it comes to managing memory in C via Julia, there are certain intricacies that have to be considered. First, let's consider the Descriptor, which is the simplest:

Descriptor

The Descriptor stores all information about the GTPSA, including the indexing table for indexing specific monomials. This is a static object, only created once for a GTPSA, which all TPSs refer to. In C, these structs must exist for the entirety of the program execution. In Julia, because they must not be destroyed when out-of-scope, we wrap these objects in immutable structs. In a single program, up to 250 Descriptors can exist simultanouesly, and they can be manually optionally destroyed using GTPSA.mad_desc_del!. At program termination all Descriptors are destroyed. Given the significantly large number allowed, as well as the danger of still-existing TPS and Descriptor structs after destruction, no front-facing interface to the user is given for destroying existing Descriptors.

The Descriptor struct simply wraps a C-pointer to a low-level struct called Desc: this struct is 1-to-1 equivalent to the C struct desc in GTPSA. See the documentation for GTPSA.Desc below. By having this struct in Julia, we can unsafe_load the struct and get values in the desc. For example, to access the first TPS{Float64} in the buffer of temporaries in the Descriptor, we can do

julia> using GTPSA
julia> import GTPSA: Desc
julia> d = Descriptor(5,8)Descriptor(NV=5, MO=8)
julia> desc = unsafe_load(d.desc) # To access the low-level C structGTPSA.Desc(2, 5, 5, 0, 0x08, 0x00, 0xff, Ptr{UInt8} @0x0000000031ec8cf8, 0, 4, 0x00000507, 0x00000000, 0x00000000, Ptr{Int32} @0x000000003170cd58, Ptr{UInt8} @0x00000000327da888, Ptr{UInt8} @0x0000000031e70d88, Ptr{UInt8} @0x0000000031b23108, Ptr{Ptr{UInt8}} @0x0000000032632888, Ptr{Ptr{UInt8}} @0x00000000322fd5b8, Ptr{Ptr{UInt8}} @0x00000000322f72e8, Ptr{Int32} @0x0000000031920ad8, Ptr{Int32} @0x00000000327f2398, Ptr{Int32} @0x00000000328326d8, Ptr{Int32} @0x00000000323e12e8, Ptr{Ptr{Int32}} @0x00000000319d2028, Ptr{Ptr{Ptr{Int32}}} @0x000000003205d0b8, 0x0000000000022195, 0.0, 0.0, 0.0, Ptr{Ptr{Nothing}} @0x000000003249a0f8, Ptr{Ptr{Nothing}} @0x000000003205cfa8, Ptr{Int32} @0x0000000031f4d0d8, Ptr{Int32} @0x0000000031ce4328)
julia> t_jl = unsafe_load(Base.unsafe_convert(Ptr{Ptr{TPS{Float64}}}, desc.t), 1) # 1 in Julia = 0 in CPtr{TPS64} @0x0000000032079c48

In Julia, if we were to then unsafe_load(t_jl), there would in fact be allocations, as its internally creating a copy of the TPS{Float64} in Julia. This is not well documented in the documentation of unsafe_load (see the discussion here).

TPS

The TPS{Float64} struct in Julia corresponds exactly to the C struct tpsa and TPS{ComplexF64} struct in Julia corresponds exactly to ctpsa in C. To understand fully the TPS struct, some history is needed:

In early development versions of GTPSA.jl, the TPS struct was very similar to the Descriptor struct: it used to just wrap a C Ptr to a low-level struct, and instead be mutable so out-of-scope TPSs and temporaries are cleaned up. This at the time seemed like the simplest solution, since in Julia there is no way to tag C pointers for Julia garbage collection. This created some problems though: firstly, there is one indirection because Julia would tag that particular mutable wrapper struct for GC, but then the member Ptr of that struct pointed to a different place in memory that had to be cleaned up. Secondly, when calling GTPSA C functions that want a Vector of TPS (e.g. Ptr to TPS), you would need to do a map(t->t.tpsa, x) where x is a Vector{TPS64} and tpsa is the field member of the wrapper TPS struct. Thirdly, and perhaps most significantly, Julia is not aware of how much memory the C is using. Therefore, it will not call the garbage collector very often, even if actually >100GB of memory is being used. Sometimes the OS will just kill the Julia process.

In order to get around these problems, we must allocate the entire mutable TPS struct in Julia instead of in the C code (e.g. using GTPSA.mad_tpsa_newd). We then can use pointer_from_objref in Julia to get a pointer to that mutable, Julia-owned struct to pass to the C functions.

Sounds simple enough, right? If only! In the GTPSA C code, the coef member array is something called a flexible array member. This is great in C, because instead of the struct storing a pointer to an array (which would cause an indirection every time the coef array is accessed in a TPS), it actually stores the array right there in the struct, with variable size. This gives some performance gains. In Julia, there is no such analog. For those who know about StaticArrays.jl, you might think an SVector could work, but surprise, it doesn't because you cannot mutate the fields of an SVector and neither can the C code.

So the only solution it seems is to change the actual C struct in mad_tpsa_impl.h and mad_ctpsa_impl.h to use regular arrays for coef instead of flexible array members, and indeed this is what is done for GTPSA.jl. There is a tiny speed reduction due to the indirection of accessing coef, however the benefits of Julia's garbage collector knowing how much memory it's using, and keeping memory usage sane, is worth the very tiny cost.

On the Julia side, it turns out you cannot just use a regular Vector for the coef array in the TPS struct, because Julia's Vector structs are quite complex and play a lot of tricks. You might actually be able to use an MVector from StaticArrays, however using this struct might make GTPSA.jl significantly more complex because the size of the array has to be known at compile-time or else you suffer the drastic performance reductions caused by type-instabilities. The complexity of using this could be checked at some point in the future.

The decided solution was to, in the Julia, @ccall jl_malloc for the coef array, and in the finalizer for the mutable TPS struct call @ccall jl_free for the coef array. This gives us C-style arrays that Julia's garbage collector knows about, and so will make sure to keep the memory usage sane.

When @ccall is used, the arguments are Base.unsafe_converted to the corresponding specified argument types. Therefore, for TPS all we had to do then was define

Base.unsafe_convert(::Type{Ptr{TPS{T}}}, t::TPS{T}) where {T} = Base.unsafe_convert(Ptr{TPS{T}},pointer_from_objref(t))

and now we can pass our TPS structs to C using @ccall.

TempTPS

Because unsafe_load of a Ptr{<:TPS} creates a copy and allocates, we cannot treat the constant buffer of pre-allocated temporaries in the Descriptor as bona-fide TPSs. Note that the memory addresses of the temporaries in the buffer are constant and do not need to be cleaned up; they are immutable!. The temporaries, which we give the type GTPSA.TempTPS, do not have any of the problems of just wrapping a pointer as do the TPSs, and so that's what they are. Also in Julia, in order to access the fields of a TempTPS (e.g. mo) via unsafe_load without allocating, we need an immutable struct having the same structure as TPS. This is the purpose of GTPSA.LowTempTPS. We need to use the mo field for @FastGTPSA to finally allocate the result TPS with the mo of the result TempTPS.

As with TPS, we also had to define Base.unsafe_convert for TempTPS so we can @ccall. In this case, the unsafe_convert returns the member Ptr of the TempTPS.

GTPSA.TempTPS — Type

struct TempTPS{T<:Union{Float64,ComplexF64},D}

This is for internal use only. TempTPS is a temporary TPS, which has been pre-allocated in a buffer for each thread in the Descriptor C struct. When using the @FastGTPSA macro, all temporaries generated will be used from this buffer. "Constructors" of this type simply take a temporary from that particular thread's buffer in a stack-like manner and "Destructors" (which must be manually called because this is immutable) release it from the stack.

Fields

t::Ptr{TPS{T,D}} – Pointer to the TPS in the buffer in the Descriptor