Closing the GSOC: adding a new ND optimization method, finishing benchmarks and fixing minor bugs

In these last few weeks I mainly worked on:

  1. New optimization methods for multiple dimensions: SOP

  2. Benchmarking surrogates in combination with optimization methods to make sure that everything works properly

  3. Fixing some bugs discovered by point 2 in the new surrogate models introduced during this summer.

Unfortunately, point 3 took most of my time, so I could not finish the bonus point: the DENSE surrogate. It's no big deal though, given that I will keep working on Surrogates.jl over the next months as I plan to write a paper on it, yay!

SOP ND

This optimization method was not in scope for GSOC. Since I had already implemented the 1D version, I figured the ND version could work as a "good first issue" for newcomers.

Well, I have never been more wrong! First of all, the 1D version needed heavy refactoring to make it understandable. On top of that, the paper is not very accessible to begin with for someone without the right background.

Several would-be contributors tackled this without success, so I decided that it was not a good first issue after all. After redirecting them to other issues, I took this on myself.

I spent more time than I care to admit on this, and I also discovered two small bugs in the 1D version in the process: win-win.

As we can see from the benchmarks, this optimization method can work with every surrogate and, while relatively expensive, it generally leads to very good results.

Let's take a Kriging surrogate and try it!

Before optimization:

After optimization:

So satisfying!

Benchmarks

Writing new surrogates is always incredibly fun, but benchmarking them is crucial: future users need to understand how changing the hyperparameters changes the surrogate, and ultimately the final fit.

Not only that, there are currently 5 optimization methods: they work very differently and not every one of them is good with every surrogate.

This is showcased with several nasty one-dimensional functions: take a look at them!

What's to come in the next few months

I have been working with Surrogates.jl and the SciML community for nearly two years now, and the experience has been amazing. The work continues though: I have always seen the GSOC as a time to ramp up the development of the library, which never really stops during the other months.

I will be mainly working on:

  • Fixing some strange bugs in the MARS and GEK surrogates, which make the prediction come out as mostly constant in some cases. I have been investigating this for a while now without much success.

  • Writing more surrogate models, taking some ideas from this library.

  • Attracting new Julia contributors to the library. I like to think that Surrogates.jl has a very understandable codebase, so I feel it could be a very good starting point for people interested in contributing to Julia. Plus, it is still relatively small with an easy workflow. I have been working with 3 or 4 interested students under the MLH framework and I hope to continue with this in the future.

  • Eventually writing a paper on Surrogates.jl. Chris and I have been talking about this a lot and I feel there are a few novel things in the library, but every time something comes up and the process is delayed. Hopefully I will win a grant for this, so that I do not procrastinate :D

That's all for this GSOC, thanks again to the community for this amazing experience.