Ludovico / Aug 12 2019

# New Surrogates and final plans

In the previous article I left off talking about new Surrogate methods.

In the past two weeks I indeed managed to code the following new surrogates:

• Linear
• Lobachevsky spline
• Neural Network
• Support vector machine
• Random forest

I also had to make sure that these new surrogates would work with the optimization methods I had coded beforehand. It turns out I had been quite sloppy: in the end I had to rework many of the surrogates' data structures to make everything compatible.

Now, this seems like a lot of work. Actually, it was not that bad, because I took advantage of a great deal of packages, such as GLM, Flux, LIBSVM, and XGBoost.
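As a rough sketch of what "compatible" means here: every surrogate subtypes `AbstractSurrogate` and exposes the same small interface that the optimization methods rely on, namely being callable at a point and supporting `add_point!` to absorb newly evaluated samples. A minimal sketch, where `DummySurrogate` and its nearest-sample prediction are purely hypothetical and only the interface pattern mirrors the package:

```julia
# Minimal sketch of the interface shared by all surrogates.
# DummySurrogate is hypothetical; only the callable + add_point!
# pattern mirrors what the optimization methods depend on.
abstract type AbstractSurrogate end

mutable struct DummySurrogate{X,Y} <: AbstractSurrogate
    x::X
    y::Y
end

# Evaluation: predict at a new point (here: value of the nearest sample)
function (s::DummySurrogate)(val)
    _, i = findmin(abs.(s.x .- val))
    return s.y[i]
end

# Optimization methods push newly evaluated points into the surrogate
function add_point!(s::DummySurrogate, new_x, new_y)
    push!(s.x, new_x)
    push!(s.y, new_y)
end

s = DummySurrogate([1.0, 2.0, 3.0], [1.0, 4.0, 9.0])
add_point!(s, 4.0, 16.0)
s(3.9)  # nearest-sample prediction: 16.0
```

Because every surrogate respects this contract, the optimization methods never need to know which model is underneath.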

## Linear Surrogate

The definition and construction of a Linear Surrogate is indeed quite easy:

```julia
mutable struct LinearSurrogate{X,Y,C,L,U} <: AbstractSurrogate
    x::X
    y::Y
    coeff::C
    lb::L
    ub::U
end
```

```julia
function LinearSurrogate(x, y, lb::Number, ub::Number)
    # lm and coef come from GLM.jl: fit an ordinary least squares model
    ols = lm(reshape(x, length(x), 1), y)
    LinearSurrogate(x, y, coef(ols), lb, ub)
end
```

The bounds are needed in the constructor because the optimization methods require explicit limits on the search space. The ND case works the same way, because I still take advantage of GLM.
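Evaluation of the fitted surrogate is then just a dot product with the stored coefficients. A minimal sketch of that idea (the helper `eval_linear` is hypothetical; the exact evaluation code in the package may differ):

```julia
# Hypothetical sketch: evaluating a linear surrogate at a new point is a
# dot product between the point and the fitted coefficients.
eval_linear(coeff, val::Number) = coeff[1] * val          # 1D case
eval_linear(coeff, val) = sum(coeff .* collect(val))      # ND case

coeff = [2.0, -1.0]
eval_linear(coeff, (3.0, 4.0))  # 2*3 - 1*4 = 2.0
```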

## Lobachevsky spline

The Lobachevsky spline is super interesting. The univariate spline f_n is built by convolving n copies of a uniform density, and it depends on a shape parameter α > 0 and a smoothness parameter n. The d-variate version, where d is the dimension of the problem, is a tensor product of univariate splines, one per coordinate.

By the central limit theorem, the d-variate Lobachevsky spline converges to the d-variate Gaussian as n grows. Hence, Lobachevsky splines asymptotically behave like radial functions, though they are not radial in themselves.
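This Gaussian limit is easy to see numerically. A small illustrative sketch (not package code): sample sums of n i.i.d. uniforms, the probabilistic building block of the spline, standardize them, and check that they look standard normal:

```julia
# Illustrative sketch (not package code): the density of a sum of n
# i.i.d. uniform variables -- the building block of the Lobachevsky
# spline -- approaches a Gaussian as n grows, by the CLT.
using Statistics

n = 10                            # number of uniforms convolved
m = 10^6                          # Monte Carlo samples
sums = [sum(rand(n)) for _ in 1:m]
z = (sums .- n/2) ./ sqrt(n/12)   # standardize: mean n/2, variance n/12

mean(z), var(z)                   # both close to (0, 1)
count(abs.(z) .< 1) / m           # fraction within one std dev, near 0.68
```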

Let's call our objective function f. If we can express f as a linear combination of Lobachevsky splines centered at the sample points, then we can also approximate the integral of f, because the integral of each spline has a closed form. Surrogates.jl makes this extremely easy, check it out:

```julia
obj = x -> 3*x + log(x)
a = 1.0
b = 4.0
x = sample(2000, a, b, SobolSample())
y = obj.(x)
alpha = 2.0
n = 6
my_loba_surr = LobacheskySurrogate(x, y, alpha, n, a, b)

# Closed-form integral of the surrogate on [a, b]:
int_1D = lobachesky_integral(my_loba_surr, a, b)

# True value: the antiderivative of 3x + log(x) is 3x^2/2 + x*log(x) - x
antideriv = x -> 3*x^2/2 + x*log(x) - x
int_val_true = antideriv(b) - antideriv(a)
@test abs(int_1D - int_val_true) < 10^-5
```

## Neural network and SVM

To build this surrogate I used the library Flux, which makes it rather easy. There is not much to say about it; I think the syntax is quite convenient:

```julia
a = 0.0
b = 10.0
obj_1D = x -> log(x)^2*x + 2*x
x = sample(10, a, b, SobolSample())
y = obj_1D.(x)
model = Chain(Dense(1, 1))
loss(x, y) = Flux.mse(model(x), y)
opt = Descent(0.01)
n_echos = 5  # number of training epochs
my_neural = NeuralSurrogate(x, y, a, b, model, loss, opt, n_echos)
val = my_neural(5.0)
```

The user just needs to define a few things about their network, and then the constructor takes care of the rest.

For the SVMSurrogate, I used the library LIBSVM. The syntax is essentially the same as for the neural network.
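For completeness, here is a sketch of what that looks like. The exact `SVMSurrogate` constructor arguments are my assumption, mirroring the `(x, y, lb, ub)` pattern of the other library-backed surrogates:

```julia
# Sketch, assuming SVMSurrogate follows the same (x, y, lb, ub) pattern
# as the other library-backed surrogates; the exact signature may differ.
a = 0.0
b = 10.0
obj_1D = x -> 2*x + x^2
x = sample(20, a, b, SobolSample())
y = obj_1D.(x)
my_svm = SVMSurrogate(x, y, a, b)
val = my_svm(5.0)
```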

## Random forest surrogate

To build this surrogate I used the library XGBoost, which again makes it rather easy.

The only difference from the other library-backed surrogates is that the user needs to input the number of rounds, that is, the number of trees:

```julia
lb = [0.0, 0.0]
ub = [10.0, 10.0]
s = sample(5, lb, ub, SobolSample())
x = Tuple.(s)
obj_ND = x -> x[1] * x[2]^2
y = obj_ND.(x)
num_round = 2  # number of boosting rounds, i.e. number of trees
my_forest_ND = RandomForestSurrogate(x, y, lb, ub, num_round)
val = my_forest_ND((1.0, 1.0))
```

## Final weeks

In these last two weeks I plan on writing docs, examples, and tutorials, because a good package is useless if I am the only one who knows how to operate it. I would also love to finish the SOP optimization method, whose PR is still open, and to code up the MARS spline surrogate.

Anyway, I have a lot more ideas for this package, so the work will certainly not end after JSOC. I cannot wait for the last article, which will wrap up these amazing three months!

Happy coding,

Ludovico