Hugh Murrell / Aug 15 2019

Chapter 1, Introduction to Julia

This introduction assumes that you have basic knowledge of some scripting language and provides examples of the Julia syntax.

Getting Julia

The main source of information is the Julia Documentation:

https://docs.julialang.org/en/v1/

You may find many other external resources useful, for example:

https://academy.juliabox.com/

Installation

Before we can start we need to set up our environment with the necessary tools and libraries for machine learning.

Installing Julia

Download Julia for your specific system from here https://julialang.org/downloads/

Follow the platform-specific instructions to install Julia on your system from here https://julialang.org/downloads/platform.html

Julia REPL

If you have done everything correctly, you'll get a Julia prompt in the terminal like this:

julia>

This interface is known as the Julia REPL (Read, Eval, Print, Loop).

The Julia programmer can use the REPL to execute Julia commands and Julia scripts that are edited with a normal text editor.

Jupyter

A better development environment is a Jupyter notebook which runs in your browser and comes with the IJulia package.

To install IJulia from the Julia REPL, hit ] to enter package mode and then enter

add IJulia

This adds packages for the IJulia kernel which links Julia to Jupyter.

To start Jupyter just navigate to a directory where your notebooks are stored and type

jupyter notebook

and Jupyter will open in your browser.

Now let's treat Julia as a scripting language, dive in head-first, and see what happens.

Typing ? at the REPL prompt enters help mode, which shows the documentation for a Julia function (for example, ?println),

whilst typing ]? lists the commands available in the Julia package system.

Syntax

Syntax for basic math

sum = 3 + 8
11
difference = 10 - 3
7
product = 20 * 5
100
quotient = 100 / 10
10.0
power = 10 ^ 2
100
modulus = 101 % 2
1
concat = "Hugh" * "Murrell"
"HughMurrell"
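One detail in the examples above is easy to miss: / returns a floating-point result even when both operands are integers. The short sketch below (the variable names are illustrative only) contrasts it with integer division and shows that ^ repeats a string in the same way that * concatenates strings.

```julia
# `/` always returns a floating-point number, even when the division is exact
q_float = 7 / 2          # 3.5

# `÷` (typed \div<TAB>) or div() performs integer division
q_int = 7 ÷ 2            # 3
same_q = div(7, 2)       # 3

# `%` gives the remainder, equivalent to rem()
r = 7 % 2                # 1

# `^` applied to a string repeats it
repeated = "ab" ^ 3      # "ababab"

println(q_float, " ", q_int, " ", r, " ", repeated)
```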

String Interpolation

name = "Jane"
num_fingers = 10
num_toes = 10
println("Hello, my name is $name.")
println("I have $num_fingers fingers and $num_toes toes.")
println("That is $(num_fingers + num_toes) digits in all!!")

Variables and Types

We can assign a variable a value and find out what type a variable is with the typeof function:

# assign the integer value 7 to a variable
a = 7
7
  
# find out the type of the variable a
typeof(a)  
Int64
my_pi = 3.14159
typeof(my_pi)
Float64
s1 = "I am a string."
"I am a string."
typeof(s1)
String
s2 = " I am also a string. "
" I am also a string. "
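Beyond typeof, it is often useful to test whether a value belongs to a broader family of types and to convert between types. The following is a minimal sketch using standard Base functions; the variable names are made up for illustration.

```julia
a = 7

# isa checks whether a value belongs to a (possibly abstract) type
println(a isa Integer)        # true
println(a isa AbstractFloat)  # false

# type constructors convert between numeric types
b = Float64(a)                # 7.0

# parse reads a number from a string; string converts a value to text
n = parse(Int, "42")          # 42
s = string(3.5)               # "3.5"

println(typeof(b), " ", typeof(n), " ", typeof(s))
```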

Data structures

Once we start working with many pieces of data at once, it will be convenient to store data in structures like arrays or dictionaries (rather than just relying on variables).

Tuples and arrays are both ordered sequences of elements (so we can index into them).

Tuples are immutable but dictionaries and arrays are both mutable.

Tuples

We can create a tuple by enclosing an ordered collection of elements in ( ).

Syntax: (item1, item2, ...)

# use round brackets to create a tuple 
myfavoriteanimals = ("penguins", "cats", "dogs")  
("penguins", "cats", "dogs")
# we can index a tuple
myfavoriteanimals[1] 
"penguins"
# we can't update a tuple because tuples are immutable
# myfavoriteanimals[1] = "otters"   
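To see what immutability means in practice, the sketch below unpacks a tuple into separate variables and then catches the error raised by an attempted update. The names are illustrative; the try/catch is only there so that the script keeps running.

```julia
animals = ("penguins", "cats", "dogs")

# unpack a tuple into separate variables
first_animal, second_animal, third_animal = animals

# length works on tuples just as it does on arrays
n = length(animals)   # 3

# attempting to assign to an element raises an error, caught here
result = try
    animals[1] = "otters"
    "updated"
catch err
    "failed with $(typeof(err))"
end
println(result)       # failed with MethodError
```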

NamedTuples

As you might guess, NamedTuples are just like Tuples except that each element additionally has a name!

They have a special syntax using = inside a tuple:

(name1 = item1, name2 = item2, ...)

myfavoriteanimals = 
    (bird = "penguins", mammal = "cats", marsupial = "sugargliders")
(bird = "penguins", mammal = "cats", marsupial = "sugargliders")
# you can index a named tuple with an integer index
myfavoriteanimals[1]  
"penguins"
# you can also use the name to access the data
myfavoriteanimals.bird 
"penguins"

Dictionaries

If we have sets of data related to one another, we may choose to store that data in a dictionary. We can create a dictionary using the Dict() function, which we can initialize as an empty dictionary or one storing key, value pairs.

Syntax:

Dict(key1 => value1, key2 => value2, ...)

A good example is a contacts list, where we associate names with phone numbers.

myphonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")
Dict{String,String} with 2 entries: "Jenny" => "867-5309" "Ghostbusters" => "555-2368"
# We can grab Jenny's number (a value) using the associated key
myphonebook["Jenny"] 
"867-5309"
# We can add another entry to this dictionary
myphonebook["Kramer"] = "555-FILK" 
"555-FILK"
myphonebook
Dict{String,String} with 3 entries: "Jenny" => "867-5309" "Kramer" => "555-FILK" "Ghostbusters" => "555-2368"
# we can delete an entry from a dictionary with pop!, which returns the removed value
pop!(myphonebook, "Kramer") 
"555-FILK"
# dictionaries are not ordered. So, we can't index into them.
# myphonebook[1] 
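Looking up a key that is not present raises a KeyError, so it helps to know the safer access functions. A small sketch, assuming a phone book like the one above (rebuilt here so that the example is self-contained):

```julia
phonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")

# haskey tests for a key without raising an error
has_kramer = haskey(phonebook, "Kramer")              # false

# get returns a fallback value when the key is missing
kramer_number = get(phonebook, "Kramer", "unknown")   # "unknown"

# iterate over key/value pairs (the iteration order is not guaranteed)
for (name, number) in phonebook
    println(name, " => ", number)
end

# keys can be collected into an array; sort it for a stable order
sorted_names = sort(collect(keys(phonebook)))         # ["Ghostbusters", "Jenny"]
```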

Arrays

Unlike tuples, arrays are mutable. Unlike dictionaries, arrays contain ordered collections.

We can create an array by enclosing this collection in [ ].

Syntax: [item1, item2, ...]

# Create a one dimensional 4-element array
a = [1, 2, 3, 4] 
4-element Array{Int64,1}: 1 2 3 4
# arrays can contain elements of different types
mixture = [1, 2, 3, "Ted", "Robyn"] 
5-element Array{Any,1}: 1 2 3 "Ted" "Robyn"
# We can use indexing to edit an existing element of an array
# note that indexing starts at 1 
mixture[3] = "Jane" 
mixture  
5-element Array{Any,1}: 1 2 "Jane" "Ted" "Robyn"
# We can extend a one dimensional array
push!(mixture,6) 
6-element Array{Any,1}: 1 2 "Jane" "Ted" "Robyn" 6
# and remove the last element
pop!(mixture) 
6
mixture
5-element Array{Any,1}: 1 2 "Jane" "Ted" "Robyn"

Multi-dimensional arrays

So far we have seen only 1D arrays of scalars, but arrays can have an arbitrary number of dimensions and can also store other arrays.

# an array of arrays
numbers = [[1, 2, 3], [4, 5], [6, 7, 8, 9]] 
3-element Array{Array{Int64,1},1}: [1, 2, 3] [4, 5] [6, 7, 8, 9]
# a 2D array (4 rows x 3 cols) populated with random reals
rand(4, 3)   
4×3 Array{Float64,2}:
 0.591177  0.0925861  0.362851
 0.450702  0.4616     0.719446
 0.400575  0.867672   0.896866
 0.464343  0.062655   0.205624
# a 3D array (4 rows, 3 cols, 2 slices)
rand(4, 3, 2) 
4×3×2 Array{Float64,3}:
[:, :, 1] =
 0.826381  0.154968   0.666813
 0.371473  0.0719569  0.706175
 0.618831  0.381972   0.224864
 0.728921  0.772043   0.372454

[:, :, 2] =
 0.617646  0.0562041  0.68891
 0.106949  0.53928    0.947686
 0.706569  0.81484    0.530244
 0.511467  0.899308   0.785891
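Multi-dimensional arrays can also be written out literally and indexed with one index per dimension. The sketch below uses a small hand-written matrix rather than random values so that the results are predictable; the names are illustrative.

```julia
# a 2D array literal: spaces separate columns, semicolons separate rows
m = [1 2 3; 4 5 6]

dims = size(m)          # (2, 3)
element = m[2, 3]       # 6  (row 2, column 3)
first_col = m[:, 1]     # [1, 4]
second_row = m[2, :]    # [4, 5, 6]

# zeros creates an array of the given shape filled with 0.0
z = zeros(2, 2)

println(dims, " ", element, " ", first_col, " ", second_row)
```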

Copying Arrays

Be careful when you want to copy arrays!

# the original array is changed, 
# because array assignment is by reference
fibonacci = [1, 1, 2, 3, 5, 8, 13]   
somenumbers = fibonacci   
somenumbers[1] = -1   
fibonacci  
7-element Array{Int64,1}: -1 1 2 3 5 8 13
# the solution is to use the copy function 
# to make an independent copy of an array.
fibonacci = [1, 1, 2, 3, 5, 8, 13]   
somenumbers = copy(fibonacci)
somenumbers[1] = -1
fibonacci 
7-element Array{Int64,1}: 1 1 2 3 5 8 13

For arrays of arrays use deepcopy to recursively copy an array structure!

We can also make an array of a similar size and shape via the function similar. The new array is uninitialized, so the values it shows are arbitrary:
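The difference between copy and deepcopy only shows up when an array contains other mutable objects. A minimal sketch with illustrative names:

```julia
nested = [[1, 2], [3, 4]]

shallow = copy(nested)      # new outer array, but the inner arrays are shared
deep = deepcopy(nested)     # inner arrays are copied as well

# change an element of an inner array in the original
nested[1][1] = -1

println(shallow[1])         # [-1, 2]  (the change is visible through the shallow copy)
println(deep[1])            # [1, 2]   (the deep copy is unaffected)
```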

a = [1, 4, 5]
c = similar(a)
3-element Array{Int64,1}: 140597700651888 140597692167984 140597700652016

Note that arrays can be indexed by ranges and by other arrays:

a[1:2]
2-element Array{Int64,1}: 1 4
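A few more indexing forms are worth knowing. The sketch below rebuilds the array a from above and shows the end keyword, indexing by an array of positions, and indexing by a Boolean mask; the result names are illustrative.

```julia
a = [1, 4, 5]

last_item = a[end]        # 5, `end` refers to the last index
picked = a[[1, 3]]        # [1, 5], an array of positions
large = a[a .> 3]         # [4, 5], a Boolean mask built with the broadcast `.>`
reversed = a[end:-1:1]    # [5, 4, 1], a range with a negative step

println(last_item, " ", picked, " ", large, " ", reversed)
```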

Control Flow

Control flow in Julia is pretty standard. You have your basic for and while loops, and your if statements. There's more in the documentation.

for i=1:5 #for i goes from 1 to 5
    print(i," ")
end
println()

t = 0
while t<5
    print(t," ")
    t+=1 # t = t + 1
end
println()

school = :UKZN
if school==:UKZN
    println("yay!")
else
    println("Not even worth discussing.")
end

One interesting feature about Julia control flow is that we can write multiple loops in one line:

for i=1:2,j=2:4
    print(i*j," ")
end
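The one-line form is shorthand for nested loops in which the rightmost variable changes fastest. The following sketch collects the products into arrays (the array names are illustrative) to show that both forms visit the same values in the same order.

```julia
# one-line form: i is the outer loop, j the inner loop
products = Int[]
for i = 1:2, j = 2:4
    push!(products, i * j)
end
println(products)   # [2, 3, 4, 4, 6, 8]

# the equivalent explicitly nested loops
nested = Int[]
for i = 1:2
    for j = 2:4
        push!(nested, i * j)
    end
end
println(nested == products)   # true
```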

Function Syntax

# Create an inline function
f(x,y) = 2x+y 
f (generic function with 1 method)
# Call the function
f(1,2) 
4
# Long form definition
function f(x)
  x+2  
end 
f (generic function with 2 methods)

By default, Julia functions return the last value computed within them.

f(2)
4

Multiple Dispatch

A key feature of Julia is multiple dispatch.

Suppose that there is "one function", f, with two methods.

Methods are the actionable parts of a function. One method is defined as f(::Any,::Any) and another as f(::Any), meaning that if you give f two values then it will call the first method, and if you give it one value then it will call the second method.

Multiple dispatch works on types. To define a dispatch on a type, use a ::Type signifier:

f(x,y) = 2x+y 
f (generic function with 2 methods)
f(x::Int,y::Int) = 3x+2y
f (generic function with 3 methods)

Julia will dispatch to the most specific applicable type signature.

f(2,3) # 3x+2y
12
f(2.0,3) # 2x+y since 2.0 is not an Int
7.0
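Dispatch also follows the hierarchy of abstract types, which makes the "most specific method wins" rule easier to see. In the sketch below, describe is a hypothetical function invented for this illustration.

```julia
# four methods of one function, from least to most specific
describe(x) = "something"
describe(x::Number) = "a number"
describe(x::Integer) = "an integer"
describe(x::AbstractString) = "a string"

println(describe(3))      # an integer  (Integer is more specific than Number)
println(describe(2.5))    # a number    (Float64 is a Number but not an Integer)
println(describe("hi"))   # a string
println(describe(:sym))   # something   (falls back to the untyped method)
```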

We will go into more depth on multiple dispatch later since this is the core design feature of Julia.

The key feature is that Julia functions specialize on the types of their arguments.

This means that each method of f is compiled separately for each combination of concrete argument types it is called with (and, for parametric types, for each type parameter). Compilation happens the first time the method is called with those types.

Functions can also take keyword arguments with default values and return multiple values:

function test_function(x,y;z=0) # z is an optional keyword argument
  if z==0
    return x+y,x*y # Return a tuple
  else
    return x*y*z,x+y+z # Return a different tuple
  end
end
test_function (generic function with 1 method)
x,y = test_function(1,2)
(3, 2)
x,y = test_function(1,2;z=3)
(6, 6)

The return type for multiple return values is a Tuple.

The syntax for a tuple is (x,y,z,...) or inside of functions you can use the shorthand x,y,z,...
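Julia distinguishes optional positional arguments (declared before the semicolon) from keyword arguments (declared after it). The sketch below uses two hypothetical functions, scale and min_and_max, to show both kinds of argument and the unpacking of a returned tuple.

```julia
# `factor` is an optional positional argument; `offset` is a keyword argument
function scale(x, factor = 2; offset = 0)
    return x * factor + offset
end

a = scale(3)                  # 6
b = scale(3, 10)              # 30
c = scale(3; offset = 1)      # 7
d = scale(3, 10; offset = 1)  # 31

# a function returning several values returns a tuple, which can be unpacked
function min_and_max(values)
    return minimum(values), maximum(values)
end

lo, hi = min_and_max([4, 1, 9])
println(lo, " ", hi)          # 1 9
```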

Note that functions in Julia are "first-class". This means that functions are ordinary values, each with its own type.

Therefore functions can make functions, you can store functions as variables, pass them as variables and return them.

For example:

function playtime(x) 
    y = 2+x
    function test(z=1)
        2y + z # y is defined in the previous scope, so it's available here
    end
    z = test() * test()
    return z,test
end #End function definition
playtime (generic function with 1 method)
z,t = playtime(2)
(81, getfield(Main, Symbol("#test#4")){Int64}(4))
t(3)
11

Lastly we show the anonymous function syntax. This allows you to define a function inline.

g = (x,y) -> 2x+y
#5 (generic function with 1 method)
((x,y) -> 4x+2y)(4,5)
26

Unlike named functions, g is simply a function in a variable and can be overwritten at any time:

g = (x) -> 2x
#9 (generic function with 1 method)
g(3)
6

An anonymous function is normally defined with a single method. However, as of v0.5, anonymous functions are compiled just like named functions and thus carry no performance penalty.
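Anonymous functions are most often passed directly to other functions. A brief sketch, in which apply_twice and make_adder are hypothetical helpers written for this example:

```julia
# a function that takes another function as its first argument
apply_twice(f, x) = f(f(x))
seven = apply_twice(x -> x + 3, 1)     # 7

# map applies a function to every element of a collection
doubled = map(x -> 2x, [1, 2, 3])      # [2, 4, 6]

# a function that builds and returns a new function
make_adder(n) = x -> x + n
add5 = make_adder(5)
fifteen = add5(10)                     # 15

println(seven, " ", doubled, " ", fifteen)
```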

Mutating functions

For high performance, Julia provides mutating functions. These functions change the input values that are passed in, instead of returning a new value.

By convention, mutating functions tend to be defined with a ! at the end and tend to mutate their first argument.

The purpose of mutating functions is to reduce the number of memory allocations, which is crucial for achieving high performance.
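As an illustration, the standard library provides sort, which returns a new array, alongside sort!, which sorts in place. The sketch below also defines a hypothetical mutating function double! that follows the same naming convention.

```julia
v = [3, 1, 2]

sorted_copy = sort(v)    # returns a new array; v is unchanged
println(v)               # [3, 1, 2]

sort!(v)                 # sorts v in place
println(v)               # [1, 2, 3]

# a user-defined mutating function: doubles every element of its argument
function double!(x)
    for i in eachindex(x)
        x[i] *= 2
    end
    return x
end

double!(v)
println(v)               # [2, 4, 6]
```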

Structures

A type is what in many other languages is an "object": a thing which has named components.

An instance of the type is one specific such thing.

For example, you can think of a car as having a make and a model. So that means a Toyota RAV4 is an instance of the car type.

In Julia, we would define a car using the struct keyword as follows:

struct Car
    make
    model
end

We could then make the instance of a car as follows:

mycar = Car("Toyota","Rav4")
Car("Toyota", "Rav4")
mycar.make
"Toyota"

As with functions, a struct can be defined "parametrically". For example, we can have a StaffMember with a name, a field (of type Symbol) and an age. We can allow this age to be any Number type as follows:

struct StaffMember{T<:Number}
    name::String
    field::Symbol
    age::T
end
ter = StaffMember("Terry",:football,17)
StaffMember{Int64}("Terry", :football, 17)
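The type parameter is inferred from the value passed in, and values that violate the constraint are rejected. The sketch below uses a hypothetical, smaller struct named Employee so that it can be run on its own.

```julia
struct Employee{T<:Number}
    name::String
    age::T
end

e_int = Employee("Terry", 17)       # T is inferred as an integer type
e_float = Employee("Sam", 42.5)     # T is inferred as Float64
println(typeof(e_float))            # Employee{Float64}

# a String is not a Number, so no constructor matches
outcome = try
    Employee("Alex", "unknown")
    "constructed"
catch err
    "rejected with $(typeof(err))"
end
println(outcome)                    # rejected with MethodError
```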

Most of Julia's types, like Float64 and Int, are natively defined in Julia in this manner.

This means that there's no limit for user defined types, only your imagination.

Julia also has abstract types.

These types cannot be instantiated but are used to build the type hierarchy.

You've already seen one abstract type, Number.

You can define type hierarchies on abstract types. See the beautiful explanation at:

https://docs.julialang.org/en/v1/manual/types/index.html#Abstract-Types-1
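As a rough sketch of how an abstract type ties a hierarchy together, the example below defines a hypothetical Shape type with two concrete subtypes. The names Shape, Circle, Square, area and report are all invented for this illustration.

```julia
# an abstract type cannot be instantiated; it only groups its subtypes
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# one method per concrete type
area(c::Circle) = π * c.radius^2
area(s::Square) = s.side^2

# one method written against the abstract type works for every subtype
report(s::Shape) = "area = $(area(s))"

println(report(Square(2.0)))     # area = 4.0
println(Circle <: Shape)         # true
println(Int64 <: Number)         # true
```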

Types declared with struct are immutable.

This means that the fields of an instance cannot be changed after it is created; use mutable struct when fields need to be updated.

Many things like Julia's built-in Number types are defined as immutable in order to give good performance.
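In Julia 1.x a plain struct is immutable, and the mutable struct keyword opts in to mutability. The following sketch contrasts the two with hypothetical types FixedPoint and MovablePoint; the try/catch is only there so that the script continues after the failed assignment.

```julia
struct FixedPoint          # immutable: fields are fixed at construction
    x::Int
end

mutable struct MovablePoint    # mutable: fields can be reassigned
    x::Int
end

p = MovablePoint(1)
p.x = 5                    # allowed
println(p.x)               # 5

q = FixedPoint(1)
outcome = try
    q.x = 5                # raises an error for an immutable struct
    "changed"
catch
    "not allowed"
end
println(outcome)           # not allowed
```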

Lazy Iterator Types

While MATLAB and Python have easy functions for building arrays, Julia tends to side-step the actual "array" part with specially made types. Ranges are one such example. To define a range, use the start:step:stop syntax (or start:stop for a step of 1). For example:

a = 1:5
println(a)
b = 1:2:10
println(b)

We can use them like any array. For example:

println(a[2]); println(b[3])

But what is b?

println(typeof(b))

b isn't an array, it's a StepRange. A StepRange has the ability to act like an array using its fields:

fieldnames(StepRange)
(:start, :step, :stop)

Note that at any time we can get the array from these kinds of types via the collect function:

c = collect(a)
5-element Array{Int64,1}: 1 2 3 4 5

The reason why lazy iterator types are preferred is that they do not do the computations until it's absolutely necessary, and they take up much less space.

We can check this with @time:

@time a = 1:100000
@time a = 1:100
@time b = collect(1:100000);

Notice that the amount of time the range takes is much shorter. This is mostly because there is a lot less memory allocation needed: 1:100000 only builds a UnitRange, and all that holds is its start and stop values. However, b has to hold 100000 numbers, leading to the huge difference.
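Timings vary from run to run, so a more direct way to see the difference is to compare memory footprints. The sketch below, with illustrative variable names, uses sizeof and checks that the range and the collected array behave identically for indexing and summation.

```julia
r = 1:100000            # lazy range
v = collect(r)          # materialized array

println(sizeof(r))      # bytes used by the range object itself
println(sizeof(v))      # bytes used by the array's elements

# the range still behaves like the array
println(length(r) == length(v))   # true
println(r[500] == v[500])         # true
println(sum(r) == sum(v))         # true
```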

Metaprogramming

Metaprogramming is a huge feature of Julia. The key idea is that Julia code is itself represented as a Julia data structure of type Expr.
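To make this concrete, the sketch below quotes a small arithmetic expression, inspects its parts, and evaluates it. The variable names are illustrative; Expr, eval and Meta.parse are part of Base.

```julia
# quote an expression instead of evaluating it
ex = :(1 + 2 * 3)

println(typeof(ex))   # Expr
println(ex.head)      # call
println(ex.args)      # the function symbol followed by its arguments

# evaluate the stored expression
value = eval(ex)
println(value)        # 7

# the same expression can be built from a string
parsed = Meta.parse("1 + 2 * 3")
println(parsed == ex) # true
```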

Thus you can think of metaprogramming as "code which takes in code and outputs code". One basic example is a simplified version of the @time macro:

macro my_time(ex)
  return quote
    local t0 = time()
    local val = $(esc(ex)) # esc evaluates ex in the caller's scope
    local t1 = time()
    println("elapsed time: ", t1-t0, " seconds")
    val
  end
end
@my_time (macro with 1 method)
@my_time(collect(1:100000));

This takes in an expression ex, gets the time before and after evaluation, and prints the elapsed time. Note that $ "interpolates" the expression into the quoted code, and esc ensures that variables in ex refer to the caller's variables rather than to the macro's.

To finish, here is a short script that combines loops, functions and complex arithmetic to print an ASCII picture of the Mandelbrot set:

function mandelbrot(a)
    z = 0
    for i=1:50
        z = z^2 + a
    end
    return z
end
 
for y=1.0:-0.05:-1.0
    for x=-2.0:0.0315:0.5
        abs(mandelbrot(complex(x, y))) < 2 ? print("*") : print(" ")
    end
    println()
end