Week #2: New Ai2 implementation using Zonotopes
Summary of contributions
using NeuralVerification, LazySets, BenchmarkTools, Suppressor, Plots
Introduction
In this notebook I will implement the algorithm and test it against the other implementations in NeuralVerification.jl, using their test suite.
ReLU function
Last week we presented an implementation of the ReLU activation function using Theorem 3.1 of [1], which gives a zonotope overapproximation of the ReLU image of a Zonotope.
Now I'm going to explore this implementation further. The theorem only overapproximates the output of the ReLU function when a given dimension of the input set contains both positive and negative points. Consider, for example, a Zonotope that is completely positive:
Zp = Zonotope([2., 2], [1. 0; 0 1])
plot(Zp, ratio=1)
plot(overapproximate(Rectification(Zp), Zonotope), ratio=1)
As we can see, the ReLU image of the Zonotope is identical to the original. However, when the original Zonotope has a dimension that is completely negative, the output Zonotope is 0 in that dimension:
Zn = Zonotope([-2., 2], [1. 0; 0 1])
plot(Zn, ratio=1)
plot(overapproximate(Rectification(Zn), Zonotope), ratio=1)
As we can see, in the case of a two-dimensional zonotope the resulting set is a line. The overapproximation only appears when the input set has both positive and negative points in some dimension; in that case, the algorithm overapproximates the output set. For example:
Zn = Zonotope([1.0, -1.0], [1.0 -1.0; 1.0 1.0]);
plot(Zn, ratio=1)
plot(Zn, ratio=1, label="Original Zonotope")
plot!(overapproximate(Rectification(Zn), Zonotope), label="Implemented ReLU")
plot!(VPolygon([[0., 0], [1., 1], [2., 0]]), color=:darkred, label="Real ReLU")
plot!(VPolygon([[2., 0], [3., 0]]), color=:darkred)
Here we can see the overapproximation computed by the function.
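To make the mixed-sign case concrete, below is a minimal one-dimensional sketch of the rule behind Theorem 3.1 of [1]; relu_interval is a hypothetical helper written only for this illustration, not a LazySets function:

# Hypothetical 1D illustration of the Theorem 3.1 overapproximation (not part of LazySets).
# A 1D zonotope is just an interval [l, u]; the three cases are:
#   l ≥ 0      → the set is unchanged
#   u ≤ 0      → the set collapses to {0}
#   l < 0 < u  → affine map λ·x + μ plus a new generator of size μ
function relu_interval(l::Real, u::Real)
    l ≥ 0 && return Zonotope([(l + u) / 2], hcat((u - l) / 2))
    u ≤ 0 && return Zonotope([0.0], hcat(0.0))
    λ = u / (u - l)          # slope of the enclosing parallelogram
    μ = -λ * l / 2           # added slack, also the new generator's size
    c = λ * (l + u) / 2 + μ  # new center
    return Zonotope([c], hcat(λ * (u - l) / 2, μ))
end

relu_interval(-1.0, 3.0)  # a zonotope spanning [-0.75, 3.0], which encloses ReLU([-1, 3]) = [0, 3]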
Comparing results
In this section I will compare the performance of the new implementation with the method used last week.
Z0 = Zonotope([1., 2], [[.5, .5], [0, .5], [.5, 0]]) # Initial Zonotope
Z1 = linear_map([2. -1; 0 1], Z0) # application of the weights
@btime begin
    P = convert(HPolytope, $Z1) # conversion to a polytope for the intersections
    # ReLU
    A2 = intersection(P, HalfSpace([-1., 0], 0.)) # x1 >= 0
    A3 = intersection(P, HalfSpace([1., 0], 0.)) # x1 < 0
    cd = CustomDirections([[-1., 0], [-1, 1], [1, 1]]) # directions of the generators
    Z2 = overapproximate(A2, Zonotope, cd) # overapproximate the intersection with a zonotope
    Z3 = overapproximate(A3, Zonotope, cd)
    Z4 = linear_map([1. 0; 0 1], Z2) # Z2 * (1 0; 0 1)
    Z5 = linear_map([0. 0; 0 1], Z3) # Z3 * (0 0; 0 1)
    CH6 = convex_hull(Z4, Z5) # union of Z4 and Z5
    Z6 = overapproximate(CH6, Zonotope, cd) # overapproximate the union with a zonotope
end
@btime overapproximate(Rectification($Z1), Zonotope)
We can see that the new function is faster and uses less memory; in this example it was ~600 times faster and used ~100 times less memory.
Implementation using Zonotopes in NeuralVerification.jl
To implement the new version using zonotopes in NeuralVerification.jl, I created a new solver called Ai2z. To create a Problem using the new solver, the initial conditions and the output need to be passed as a Zonotope.
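As a minimal usage sketch (the network file and the concrete sets below are illustrative assumptions, not taken from this notebook):

# Illustrative sketch only; the file path and the sets are assumptions for this example
nnet = read_nnet("examples/networks/small_nnet.nnet")

# input and output specifications expressed as (one-dimensional) zonotopes
input_set  = Zonotope([0.0], hcat(0.9))    # the interval [-0.9, 0.9]
output_set = Zonotope([55.0], hcat(25.0))  # the interval [30.0, 80.0]

problem = Problem(nnet, input_set, output_set)
result  = solve(Ai2z(), problem)           # verify the property with the new solver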
The creation of the new solver wasn't difficult, as all the structures are already there; the solver Ai2 was used as a template. To apply the ReLU rectification, the function overapproximate(r::Rectification{N, <:AbstractZonotope{N}}, ::Type{<:Zonotope}) implemented last week in LazySets was used; as we discussed, it is much faster than an implementation using polytopes. This allowed me to implement the ReLU activation function as follows:
function forward_partition(act::ReLU, input::Zonotope)
    return overapproximate(Rectification(input), Zonotope)
end
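Inside the solver, this rectification is composed with the affine map of each layer's weights and bias. Below is only a minimal sketch of that composition; forward_layer_zonotope is a hypothetical helper written for illustration, not the actual Ai2z code path:

# Hypothetical sketch of pushing a zonotope through one ReLU layer
# (illustration only; not the solver's actual code path)
function forward_layer_zonotope(W::AbstractMatrix, b::AbstractVector, Z::Zonotope)
    Y = translate(linear_map(W, Z), b)                   # affine map: W*Z ⊕ b
    return overapproximate(Rectification(Y), Zonotope)   # ReLU overapproximation from LazySets
end

# e.g. a 2×2 layer applied to a small zonotope
forward_layer_zonotope([1.0 -1.0; 0.0 1.0], [0.5, -0.5],
                       Zonotope([1.0, 2.0], [0.5 0.0; 0.0 0.5]))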
This zonotope-based implementation is much faster than the existing one in NeuralVerification.jl, especially in higher dimensions:
function forward_partition(act::ReLU, input::HPolytope)
    n = dim(input)
    output = Vector{HPolytope}(undef, 0)
    C, d = tosimplehrep(input)
    dh = [d; zeros(n)]
    for h in 0:(2^n)-1                      # one iteration per orthant
        P = getP(h, n)
        Ch = [C; I - 2P]                    # restrict the input to orthant h
        input_h = HPolytope(Ch, dh)
        if !isempty(input_h)
            push!(output, linear_map(Matrix{Float64}(P), input_h))  # project the negative coordinates to zero
        end
    end
    return output
end
As we can see, this function is very expensive, as the loop inside it scales exponentially with the dimension of the input.
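A quick back-of-the-envelope illustration of that growth:

# number of orthant intersections the polytope-based ReLU has to enumerate per layer
for n in (2, 5, 10, 20)
    println("n = $n  →  $(2^n) candidate orthants")
end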
Changes in other files such as reachability.jl were needed, as it assumed that all reachability methods received an HPolytope as input/output. Another change in that file was the output of the solve function, as it didn't give the user the final reach set.
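For example, with this change the final reach set can be read directly off the result, as the tests below do:

# sketch: `problem` stands for any Problem built as in the Testing section below
result = solve(Ai2z(), problem)
result.reachable  # the reach set(s) of the output layer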
Testing
NeuralVerification.jl has a wide range of tests. I asked Tomer Arnon which tests to run, and he recommended the "hand-made" single-input, single-output networks as well as the higher-dimensional MNIST and ACASXU networks.
The following code is taken from NeuralVerification.jl and modified to obtain the data I want.
identity_network.jl
small_nnet_file = "/home/sguadalupe/.julia/dev/NeuralVerification/examples/networks/small_nnet_id.nnet"
small_nnet = read_nnet(small_nnet_file, last_layer_activation = Id())
# The input set is always [-0.9:0.9]
in_hyper = Hyperrectangle(low = [-0.9], high = [0.9])
in_hpoly = convert(HPolytope, in_hyper)
# superset of the output
out_superset = Hyperrectangle(low = [-70.0], high = [-23.0])
# includes points in the output region
out_overlapping = Hyperrectangle(low = [-100.0], high = [-45.0])
problem_holds = Problem(small_nnet, in_hpoly, convert(HPolytope, out_superset))
problem_violated = Problem(small_nnet, in_hpoly, convert(HPolytope, out_overlapping));
begin
    for solver in [MaxSens(resolution = 0.6), ExactReach(), Ai2(), Ai2z()]
        println("$solver")
        print("holds runtime: ")
        holds = @btime solve($solver, $problem_holds)
        print("violated runtime: ")
        violated = @btime solve($solver, $problem_violated)
        print("width of output layer: ")
        println(diameter(overapproximate(violated.reachable[1], Hyperrectangle)))
        println()
    end
end
relu_network.jl
small_nnet_file = "/home/sguadalupe/.julia/dev/NeuralVerification/examples/networks/small_nnet.nnet"
# small_nnet encodes the simple function 24*max(x + 1.5, 0) + 18.5
small_nnet = read_nnet(small_nnet_file, last_layer_activation = ReLU())
# The input set is [-0.9:0.9]
in_hyper = Hyperrectangle(low = [-0.9], high = [0.9])
in_hpoly = convert(HPolytope, in_hyper)
# Output region is entirely contained in this interval:
out_superset = Hyperrectangle(low = [30.0], high = [80.0]) # 30.0 ≤ y ≤ 80.0
# Includes some points in the output region but not all:
out_overlapping = Hyperrectangle(low = [-1.0], high = [50.0]); # -1.0 ≤ y ≤ 50.0
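Assuming the corresponding problems are built exactly as for the identity network (these definitions are not shown in the listing above):

# assumed to mirror the identity-network setup above
problem_holds = Problem(small_nnet, in_hpoly, convert(HPolytope, out_superset))
problem_violated = Problem(small_nnet, in_hpoly, convert(HPolytope, out_overlapping));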
begin
    for solver in [MaxSens(resolution = 0.6), ExactReach(), Ai2(), Ai2z()]
        println("$solver")
        print("holds runtime: ")
        holds = @btime solve($solver, $problem_holds)
        print("violated runtime: ")
        violated = @btime solve($solver, $problem_violated)
        print("width of output layer: ")
        println(diameter(overapproximate(violated.reachable[1], Hyperrectangle)))
        println()
    end
end
We can see that the new implementation, using the new method of computing the ReLU activation function, is much faster than the previous implementation, obtaining the same precision at a fraction of the cost in both time and memory. Compared with the other algorithms, the new implementation is comparable in runtime and memory usage to MaxSens, although the latter is more precise on these examples, while ExactReach is the slowest of the bunch and obtains the same precision as both Ai2 implementations.
References
Singh, G., Gehr, T., Mirman, M., Püschel, M., & Vechev, M. (2018). Fast and effective robustness certification. In Advances in Neural Information Processing Systems (pp. 10802-10813).