Parens for Python - Predicting Sportsball & UFOs

Forecasting time series

We are going to explore some more Python libraries through the use of libpython-clj.

This time, we are going to look at Facebook Prophet.

Add libpython-clj to deps.edn:

{:deps
 {org.clojure/clojure {:mvn/version "1.10.1"}
  clj-python/libpython-clj {:mvn/version "1.36"}}}

Install the Python dependencies

pip3 install fbprophet
pip3 install holidays==0.9.12

The main tutorial is on https://facebook.github.io/prophet/docs/quick_start.html#python-api. We'll be following along until we break off to look at UFOs.

Quick Start with some Sports Ball Stuff

The tutorial forecasts some sports stuff. Honestly, I don't really follow any sports so it makes no sense to me, but the important bit is that Prophet takes a dataframe, loaded here from a CSV file, with two columns: ds and y. The ds column is the date in the format YYYY-MM-DD and y is the numeric value to forecast. That's it really.
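
As a rough sketch of that shape, here is one way to turn Clojure data into Prophet-ready CSV text. The observations, their keys, and the ->prophet-csv helper are all made up for illustration; none of them come from Prophet or libpython-clj.

```clojure
(require '[clojure.string :as str])

;; Hypothetical daily observations; the keys and values are invented for this example.
(def observations
  [{:date "2020-01-01" :value 12.0}
   {:date "2020-01-02" :value 15.5}
   {:date "2020-01-03" :value 9.25}])

(defn ->prophet-csv
  "Renders a seq of {:date :value} maps as CSV text with a ds,y header row."
  [rows]
  (str/join "\n"
            (cons "ds,y"
                  (map (fn [{:keys [date value]}] (str date "," value))
                       rows))))

(println (->prophet-csv observations))
;; ds,y
;; 2020-01-01,12.0
;; 2020-01-02,15.5
;; 2020-01-03,9.25
```

Spitting that string to a file gives something pandas can read back as a two-column dataframe.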

We start by loading the namespaces and defining a macro to help with plotting.

(ns gigasquid.facebook-prophet
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]))
;;;; have to set the headless mode before requiring pyplot
(def mplt (py/import-module "matplotlib"))
(py. mplt "use" "Agg")
(require-python 'matplotlib.pyplot)
(require-python 'matplotlib.backends.backend_agg)
(defmacro with-show
  "Evaluates forms that draw with matplotlib.pyplot, then saves the figure as a PNG under results/"
  [& body]
  `(let [_# (matplotlib.pyplot/clf)
         fig# (matplotlib.pyplot/figure)
         agg-canvas# (matplotlib.backends.backend_agg/FigureCanvasAgg fig#)]
     ~(cons 'do body)
     (py. agg-canvas# "draw")
     ;; call gensym so that each plot gets its own file name
     (matplotlib.pyplot/savefig (str "results/" (gensym) ".png"))))
(require-python '[pandas :as pd])
(require-python '[fbprophet :as fbprophet])
(require-python '[matplotlib.pyplot :as pyplot])

Let's download the sportsball data for some guy named Manning.

(def csv-file (slurp "https://raw.githubusercontent.com/facebook/prophet/master/examples/example_wp_log_peyton_manning.csv"))
(spit "manning.csv" csv-file)
(def df (pd/read_csv "manning.csv"))
(py. df head)
;; first five rows:
;;            ds         y
;; 0  2007-12-10  9.590761
;; 1  2007-12-11  8.519590
;; 2  2007-12-12  8.183677
;; 3  2007-12-13  8.072467
;; 4  2007-12-14  7.893572
;; the full frame has 2905 rows x 2 columns and runs through 2016-01-20
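
A note on the interop macros, since they are easy to mix up: py.- reads an attribute, while py. calls a method. Using py.- with a method name such as head hands back the bound method instead of the rows. A small sketch with Python's built-in math module, picked only for illustration; it assumes a working Python environment and relies, like the code above, on import-module starting the interpreter.

```clojure
(require '[libpython-clj.python :as py :refer [py. py.-]])

;; Python's standard math module, used here only as a small example
(def math (py/import-module "math"))

;; py.- reads an attribute without calling anything
(py.- math pi)      ;; the constant math.pi

;; py. calls a method with the given arguments
(py. math sqrt 16)  ;; same as math.sqrt(16) in Python

;; py.- with a method name returns the method itself, uncalled
(py.- math sqrt)
```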

Now the predicting bit. We create a Prophet model and fit it to the dataframe.

(def m (fbprophet/Prophet))
(py. m fit df)

Predictions are then made on a dataframe that extends the history by a number of future periods, here 365 days.

(def future (py. m make_future_dataframe :periods 365))
(def forecast (py. m predict future))
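
To make "periods" concrete: the Manning data ends on 2016-01-20, so 365 daily periods should reach 2017-01-19. The sketch below reproduces that date arithmetic with java.time. The future-dates helper is hypothetical and only illustrates the idea; it is not how Prophet builds the dataframe.

```clojure
(import 'java.time.LocalDate)

(defn future-dates
  "Returns `periods` consecutive daily dates (ISO strings) following last-date."
  [last-date periods]
  (let [start (LocalDate/parse last-date)]
    (mapv #(str (.plusDays start (long %)))
          (range 1 (inc periods)))))

(def dates (future-dates "2016-01-20" 365))

(println (first dates)) ;; 2016-01-21
(println (last dates))  ;; 2017-01-19
```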

The important columns are yhat (the forecast), yhat_upper, and yhat_lower (the bounds of its uncertainty interval). We can easily extract them into a Clojure format and do what we like with them.

(def predicted-vals (mapv (fn [ds yhat upper lower]
                            ;; one map per forecast row, keyed after the Prophet columns
                            {:ds ds :yhat yhat :yhat-upper upper :yhat-lower lower})
                          (py/get-item forecast "ds")
                          (py/get-item forecast "yhat")
                          (py/get-item forecast "yhat_upper")
                          (py/get-item forecast "yhat_lower")))
(println (last predicted-vals))
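
Once the values are plain Clojure maps, ordinary sequence functions apply. A minimal sketch that finds the date with the widest uncertainty interval. Here sample-vals is made-up data in the same shape, with strings standing in for the timestamps pandas would return, and interval-width is a hypothetical helper.

```clojure
;; Made-up rows shaped like the extracted forecast values
(def sample-vals
  [{:ds "2017-01-17" :yhat-upper 9.0  :yhat-lower 7.5}
   {:ds "2017-01-18" :yhat-upper 9.0  :yhat-lower 7.0}
   {:ds "2017-01-19" :yhat-upper 9.25 :yhat-lower 7.5}])

(defn interval-width
  "Distance between the upper and lower bound of one forecast row."
  [{:keys [yhat-upper yhat-lower]}]
  (- yhat-upper yhat-lower))

;; max-key returns the row for which interval-width is largest
(def widest (apply max-key interval-width sample-vals))

(println (:ds widest) (interval-width widest))
;; 2017-01-18 2.0
```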

We can plot the forecast.

(with-show
  (py. m plot forecast))

And we can plot the components of the forecast.

(with-show
  (py. m plot_components forecast))

Bring on the UFOs!

Ok. Enough with the Sportsball, let's look at some more interesting data, like UFO sightings from http://www.nuforc.org/webreports/ndxevent.html.

The data is different from the first example in that the stats are monthly. We are going to take a look at the sightings from 2010 to today.

(def csv-file (slurp "https://raw.githubusercontent.com/gigasquid/libpython-clj-examples/master/resources/ufosightings-since-2010.csv"))
(spit "ufosightings-since-2010.csv" csv-file)
(def df (pd/read_csv "ufosightings-since-2010.csv"))

We are going to do two things differently this time. First, we switch the seasonality mode to multiplicative. Second, we make our predictions monthly, since the data itself is monthly.
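
The model that follows is created with :seasonality_mode "multiplicative". As I understand the Prophet documentation, an additive seasonal effect is added to the trend, while a multiplicative one scales with it. The toy functions and numbers below are made up to show the difference; they are not Prophet's implementation.

```clojure
(defn additive-forecast
  "Seasonal effect is a fixed amount on top of the trend."
  [trend seasonal-amount]
  (+ trend seasonal-amount))

(defn multiplicative-forecast
  "Seasonal effect is a fraction of the trend, so it grows as the trend grows."
  [trend seasonal-fraction]
  (* trend (+ 1 seasonal-fraction)))

;; The same summer bump applied to a low and a high trend value
(println (additive-forecast 100 25) (additive-forecast 400 25))
;; 125 425
(println (multiplicative-forecast 100 0.25) (multiplicative-forecast 400 0.25))
;; 125.0 500.0
```

With the additive version the bump stays at 25 sightings no matter the trend; with the multiplicative version it stays at 25 percent.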

(def m (fbprophet/Prophet :seasonality_mode "multiplicative")) ;;; seasonal effects scale with the trend
(py. m fit df)
(def future (py. m make_future_dataframe :periods 48 :freq "M")) ;;; note the monthly frequency: 48 periods = 4 years
(def forecast (py. m predict future))
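
As far as I know, pandas reads the "M" frequency as month end, so the future rows should land on the last day of each month. A rough approximation of those dates with java.time follows. The month-ends helper and the start date are hypothetical, chosen only for illustration.

```clojure
(import '(java.time LocalDate YearMonth))

(defn month-ends
  "Returns n month-end dates (ISO strings) strictly after last-date."
  [last-date n]
  (let [d        (LocalDate/parse last-date)
        ym       (YearMonth/from d)
        ;; if last-date is already a month end, start from the following month
        first-ym (if (= d (.atEndOfMonth ym)) (.plusMonths ym 1) ym)]
    (mapv #(str (.atEndOfMonth (.plusMonths first-ym (long %))))
          (range n))))

(println (month-ends "2020-01-01" 3))
;; [2020-01-31 2020-02-29 2020-03-31]
```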

Now, we can see what the UFO sightings look like.

(with-show
  (py. m plot forecast))

We can see there is definitely a yearly pattern to it, but luckily it seems that UFO sightings are forecast to decrease.

(with-show
  (py. m plot_components forecast))

It seems like July is the peak time for UFO sightings. It confirms the suspicion that the Independence Day movie was onto something. It also looks like January and February are low times for sightings. This makes sense with the classical view of cold-blooded reptilian aliens, but we must keep our minds open to other life forms affected by temperature. For example, there could be butter-based aliens out there whose movement would also be hampered at colder temperatures.

Conclusion

Facebook Prophet is a powerful tool for forecasting time series data. It would work really well on everyday problems like sales data and page views.

It can also be applied to reassure ourselves that whatever we are doing in the fight against outer-space aliens seems to be working. Keep it up!
