Bounders for TaylorModels
The following is a blog post about my work at Google Summer of Code 2020 on the project Taylor Models and a guaranteed ODE solver. This project aims to improve some of the methods in the existing package TaylorModels.jl for the Julia Language.
What is a Taylor Model?
]add TaylorModels IntervalArithmetic
# Load the required packages for this notebook
using TaylorModels
using IntervalArithmetic
using Plots
using WebSockets
using WebIO, Interact
A Taylor Model is a tool for obtaining a validated polynomial approximation to a function inline_formula not implemented, where validated refers to the fact that the true function is guaranteed to be represented (or enclosed) by this approximation.
A Taylor Model inline_formula not implemented over the domain inline_formula not implemented for the function inline_formula not implemented is the tuple inline_formula not implemented specified by a inline_formula not implemented-degree polynomial inline_formula not implemented and remainder inline_formula not implemented, such that inline_formula not implemented. In other words, inline_formula not implemented is a rigorous approximation to the function inline_formula not implemented over the domain inline_formula not implemented and inline_formula not implemented provides an enclosure that guarantees that the true function inline_formula not implemented is contained in the set inline_formula not implemented for every inline_formula not implemented in the domain. An important fact about the remainder inline_formula not implemented is that it must satisfy inline_formula not implemented, this means that the exact value of the function inline_formula not implemented is enclosed by inline_formula not implemented.
As the name suggests the polynomial inline_formula not implemented is just the Taylor Series for the function inline_formula not implemented, typically the expansion point of the series is taken to be the midpoint of inline_formula not implemented, this can be generalized to consider other types of series expansions. Notice that the remainder inline_formula not implemented acts as a "tube" around the expansion inline_formula not implemented that "absorbs" all kind of computation errors (truncation, roundoff) to ensure that inline_formula not implemented is contained in inline_formula not implemented.
Remark: It is advisable to run this blog post interactively, in order to be able to exploit the sliders that are included in some cases.
Let's see this in action, as a matter of example we're going to compute the Taylor Model for the function inline_formula not implemented using the package TaylorModels.jl.
D = 0 .. 5 # Taylor Model domain
x0 = mid(D) # expansion point for the Taylor Series
f(x) = exp(x)
xrange = range(D.lo, D.hi, length=100)
for order in 2:5
# Let's define the Taylor Model for the univariate function f(x)
tm = TaylorModel1(order, x0, D) # create a Taylor Model "variable"
fT = f(tm) # from evaluating this variable on f we get its Taylor Model
plot(fT, label="Taylor Model, order=$order")
plot!(xrange, x -> f(x), label="f(x)", lw=3)
ylims!(0, 150)
end
From the previous animation, we see that as the order of the expansion is increased its remainder inline_formula not implemented gets tighter. This is the expected behavior since there is less error due to the expansion to be included in inline_formula not implemented.
The Taylor Models defines its own arithmetic, that means, every arithmetical operation has a corresponding operation in terms of Taylor Models. So, for example, we can define the product inline_formula not implemented of two Taylor Models for the functions inline_formula not implemented, such that
formula not implementedwhere inline_formula not implemented corresponds to the resulting Taylor Model of the product inline_formula not implemented. In other words, if we define a inline_formula not implemented Taylor Models for the functions inline_formula not implemented and then multiply them, the resulting Taylor Model is guaranteed to enclose the function inline_formula not implemented, and this holds for all arithmetic operations. In this same way is possible to compute composition of Taylor Models. The details of this Taylor Model arithmetic and composition can be found in [1].
To illustrate this Taylor Model arithmetic, let's compute the following
formula not implementedf(x) = sin(x)
g(x) = cos(x)
h(x) = x^3
s(x) = h(x) * (f(x) + g(x))
D = -1 .. 1
x0 = mid(D)
xrange = range(D.lo, D.hi, length=100)
for order in 2:5
tm = TaylorModel1(order, x0, D)
fT = f(tm)
gT = g(tm)
hT = h(tm)
sT = hT * (fT + gT)
plot(sT, label="Taylor Model, order=$order")
plot!(xrange, x -> s(x), label="s(x)", lw=3)
ylims!(-3, 3)
end
From this comes the true power of Taylor Models as this enables us to get validated results for a wide variety of calculations by just doing it over Taylor Models.
Bounds
The bound for a model (not just for a Taylor Model) is the enclosure that contains the range of the function inline_formula not implemented over some domain D. So, for example, for an interval-based model, this bound is just the function evaluated in inline_formula not implemented by interval arithmetic. For a Taylor Model, this corresponds to evaluate its polynomial part on interval arithmetic over inline_formula not implemented and then sum its remainder inline_formula not implemented.
# compute the bound for a Taylor Model
f(x) = 1 + x^5 - x^4
D = 0 .. 1
x0 = mid(D)
order = 3
bound_interval = f(D)
tm = TaylorModel1(order, x0, D)
fT = f(tm)
bound_tm = fT(fT.dom - fT.x0)
box_interval = IntervalBox(D, bound_interval)
box_tm = IntervalBox(D, bound_tm)
xrange = range(0, 1, length=100)
plot(xrange, x -> f(x), color=:red, label="f(x)", lw=3)
plot!(box_interval, label="Bound Interval Arithmetic")
plot!(box_tm, label="Bound Taylor Model")
Both types of enclosures contain the range of the function inline_formula not implemented over inline_formula not implemented, but with a great over-estimation of the function range, a way to improve this is to split or mince the original domain inline_formula not implemented in smaller equally large subdomains.
# compute the bound for a Taylor Model
f(x) = 1 + x^5 - x^4
D = mince(0 .. 1, 16) # 16 equally large subdomains
order = 3
p = plot()
interval_boxes = []
tm_boxes = []
for (idx, Di) in enumerate(D)
x0 = mid(Di)
bound_interval = f(Di)
tm = TaylorModel1(order, x0, Di)
fT = f(tm)
bound_tm = fT(fT.dom - fT.x0)
box_interval = IntervalBox(Di, bound_interval)
box_tm = IntervalBox(Di, bound_tm)
if idx == 16
label_interval = "Interval Arithmetic"
label_tm = "Taylor Model order=$order"
else
label_interval = ""
label_tm = ""
end
plot!(box_interval, c=:green, label=label_interval)
plot!(box_tm, c=:red, label=label_tm)
end
xrange = range(0, 1, length=100)
plot!(xrange, x -> f(x), color=:red, label="f(x)", lw=3)
The last plot shows an important fact: Taylor Models allows for better enclosures of a function.
To improve these bounds we can go further and split the domain even further, but a better (in terms of computational efficiency) choice is to use the so-called range bounders.
Range Bounders for Taylor Models
The main idea of range bounders is to bound the minimum (or maximum) of a Taylor Model over some region in order to tighten its bound. By bounding the minimum (maximum) of the Taylor Model in an interval inline_formula not implemented, then we can just take the lower (upper) bound inline_formula not implemented (inline_formula not implemented) as one of the bound for the Taylor Model as by definition nothing can be below (above) of the minimum (maximum). The kind of range bounders that follows, use the information from the linear and quadratic terms of the Taylor Model in order to estimate this bound inline_formula not implemented and then use it to improve its bound.
Linear Dominated Bounder
The linear dominated bounder (ldb) is an algorithm capable of tightening the bounds of a Taylor Model. To accomplish this, the ldb algorithm uses the linear part of a Taylor Model as a guideline to iteratively reduce the domain. This algorithm is motivated by the fact that far from a local optimum, the linear part of the function typically dominates its behavior, so, we can use exploit it to tightening the bound of the Taylor Model. The algorithm proceeds as follow
Re-expand the polynomial inline_formula not implemented at the midpoint inline_formula not implemented of the interval domain inline_formula not implemented.
Check the sign of the linear coefficients inline_formula not implemented's and save them for later.
Compute the bound of the linear part inline_formula not implemented and of the non-linear part inline_formula not implemented.
The minimum of the function is bounded by inline_formula not implemented, define inline_formula not implemented, if inline_formula not implemented is less than some pre-specified threshold, then finish, else
Reduce the i-th domain coordinate inline_formula not implemented by the following criteria
If inline_formula not implemented; then
inline_formula not implemented
If inline_formula not implemented; then
inline_formula not implemented
Else; nothing to do, as no linear part.
Go to 1, replacing the original domain inline_formula not implemented with the reduced domain of the previous step.
As an example of this algorithm let's improve the bound of the Taylor Model for the function inline_formula not implemented over the domain inline_formula not implemented. To this end, we're going to mince inline_formula not implemented in 16 equally large subdomains and compute a Taylor Model on every one of them,
f(x) = 1 + x^5 - x^4
order = 5
subdomains = mince(0 .. 1, 16)
xrange = range(0, 1, length=100)
boxes_tm = []
boxes_ldb = []
for i in 1:length(subdomains)
p = plot(legend=false)
plot!(xrange, x -> f(x), c=:red,)
D = subdomains[i]
x0 = mid(D)
tm = TaylorModel1(order, x0, D)
fT = f(tm)
bound_tm = fT(fT.dom - fT.x0)
bound_ldb = linear_dominated_bounder(fT)
box_tm = IntervalBox(D, bound_tm)
box_ldb = IntervalBox(D, bound_ldb)
push!(boxes_tm, box_tm)
push!(boxes_ldb, box_ldb)
end
for i in 1:length(subdomains)
box_ldb = boxes_ldb[i]
p = plot()
for box in boxes_tm
plot!(box, label="", fill=0.)
end
plot!(box_ldb, label="TM5 + Linear dominated bounder")
plot!(xrange, x -> f(x), label="f(x)", lw=3)
end
As we see, there are some subdomains with a great over-estimation of the bound, the last one, and the one that has the optimum.
Quadratic Fast Bounder
The idea behind the quadratic fast bounder (qfb) is to bound the second-order terms of the Taylor Series expansion, this becomes noticeably important when there is a local optimum, as the linear part vanishes and therefore the ldb algorithm can't properly work.
The key idea of this algorithm boils down to the following decomposition of a Taylor Model
formula not implementedThen, the lower bound for this model is just
formula not implementedwhere inline_formula not implemented denotes the lower bound and inline_formula not implemented is a factor introduced for convenience. For this algorithm, inline_formula not implemented is chosen to be
formula not implementedwith inline_formula not implemented being the hessian of inline_formula not implemented. Note that, near a local minimizer this hessian becomes positive definite, that means inline_formula not implemented for every vector inline_formula not implemented, therefore inline_formula not implemented trivially. Better yet, if we choose inline_formula not implemented as
formula not implementedfor some inline_formula not implemented in the domain, then inline_formula not implemented has linear terms and for a careful selection of inline_formula not implemented, inline_formula not implemented can contain no linear terms and still have inline_formula not implemented. By this selection, we're bounding not just the second-order terms but the first ones too. It turns out that this inline_formula not implemented must be chosen so that is a solution to the following system of linear equations
formula not implementedwhere inline_formula not implemented is the vector of linear terms of inline_formula not implemented.
Let's use this range bounder for the same function as in the ldb example
f(x) = 1 + x^5 - x^4
order = 5
subdomains = mince(0 .. 1, 16)
xrange = range(0, 1, length=100)
boxes_tm = []
boxes_qfb = []
for i in 1:length(subdomains)
p = plot(legend=false)
plot!(xrange, x -> f(x), c=:red,)
D = subdomains[i]
x0 = mid(D)
tm = TaylorModel1(order, x0, D)
fT = f(tm)
bound_tm = fT(fT.dom - fT.x0)
bound_qfb = quadratic_fast_bounder(fT)
box_tm = IntervalBox(D, bound_tm)
box_qfb = IntervalBox(D, bound_qfb)
push!(boxes_tm, box_tm)
push!(boxes_qfb, box_qfb)
end
for i in 1:length(subdomains)
box_ldb = boxes_qfb[i]
p = plot()
for box in boxes_tm
plot!(box, label="", fill=0.)
end
plot!(box_ldb, label="TM5 + Quadratic fast bounder")
plot!(xrange, x -> f(x), label="f(x)", lw=3)
end
Bounds comparison for distinct functions
f(x) = x^3 * sin(x)
D = -3 .. 3
n = 16
subdomains = mince(D, n)
xrange = range(D.lo, D.hi, length=100)
boxes_interval = [IntervalBox(d, f(d)) for d in subdomains]
for i in eachindex(subdomains), order in 3:5, bounder in ("nobounder", "ldb", "qfb")
d = subdomains[i]
x0 = mid(d)
plot(boxes_interval, label="Interval Arithmetic", fill=0., legend=:top)
tm = TaylorModel1(order, x0, d)
fT = f(tm)
if bounder == "nobounder"
bound = fT(fT.dom - fT.x0)
box = IntervalBox(d, bound)
plot!(box, label="Naive Taylor Model")
elseif bounder == "ldb"
bound = linear_dominated_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Linear dominated bounder")
elseif bounder == "qfb"
bound = quadratic_fast_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Quadratic fast bounder")
end
plot!(xrange, x -> f(x), label="f(x)", c=:red, lw=2)
end
f(x) = x^2 * sin(x^2 - 3) + x
D = -3 .. 3
n = 16
subdomains = mince(D, n)
xrange = range(D.lo, D.hi, length=100)
boxes_interval = [IntervalBox(d, f(d)) for d in subdomains]
for i in eachindex(subdomains), order in 3:5, bounder in ("nobounder", "ldb", "qfb")
d = subdomains[i]
x0 = mid(d)
plot(boxes_interval, label="Interval Arithmetic", fill=0., legend=:top)
tm = TaylorModel1(order, x0, d)
fT = f(tm)
if bounder == "nobounder"
bound = fT(fT.dom - fT.x0)
box = IntervalBox(d, bound)
plot!(box, label="Naive Taylor Model")
elseif bounder == "ldb"
bound = linear_dominated_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Linear dominated bounder")
elseif bounder == "qfb"
bound = quadratic_fast_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Quadratic fast bounder")
end
plot!(xrange, x -> f(x), label="f(x)", c=:red, lw=2)
end
f(x) = x^2 * sin(x^2 - 3) + cos(3 - x^3) * x
D = -3 .. 3
n = 16
subdomains = mince(D, n)
xrange = range(D.lo, D.hi, length=200)
boxes_interval = [IntervalBox(d, f(d)) for d in subdomains]
for i in eachindex(subdomains), order in 3:5, bounder in ("nobounder", "ldb", "qfb")
d = subdomains[i]
x0 = mid(d)
plot(boxes_interval, label="Interval Arithmetic", fill=0., legend=:top)
tm = TaylorModel1(order, x0, d)
fT = f(tm)
if bounder == "nobounder"
bound = fT(fT.dom - fT.x0)
box = IntervalBox(d, bound)
plot!(box, label="Naive Taylor Model")
elseif bounder == "ldb"
bound = linear_dominated_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Linear dominated bounder")
elseif bounder == "qfb"
bound = quadratic_fast_bounder(fT)
box = IntervalBox(d, bound)
plot!(box, label="Quadratic fast bounder")
end
plot!(xrange, x -> f(x), label="f(x)", c=:red, lw=2)
end
References
[1] Mioara Joldes, Rigorous Polynomial Approximations and Applications, PhD Thesis, École Normale Supérieure de Lyon, ENS-Lyon (2011).
[2] Kyoko Makino, Rigorous Analysis of Nonlinear Motion in Particle Accelerators, PhD Thesis, Michigan State University (1991).
[3] Kyoko Makino & Martin Berz, Verified Global Optimization with Taylor Model based Range Bounders, Transactions on Computers (2005).
[4] Martin Berz, Kyoko Makino, Youn-Kyung Kim, Long-term stability of the Tevatron by verified global optimization, Nuclear Instruments and Methods in Physics Research A (2006).