Chapter 1, Introduction to Julia
Introduction to Julia
This introduction assumes that you have basic knowledge of some scripting language and provides examples of the Julia syntax.
Getting Julia
The main source of information is the Julia Documentation:
https://docs.julialang.org/en/v1/
You may find many other external resources useful, for example:
https://academy.juliabox.com/
Installation
Before we can start we need to set up our environment with the necessary tools and libraries for machine learning.
Installing Julia
Download Julia for your specific system from https://julialang.org/downloads/
Follow the platform-specific instructions at https://julialang.org/downloads/platform.html to install Julia on your system.
Julia REPL
If you have done everything correctly, starting Julia from the terminal gives you a julia> prompt.
This interface is known as the Julia REPL (Read-Eval-Print Loop).
The Julia programmer can use the REPL to execute Julia commands and Julia scripts that are edited with a normal text editor.
Jupyter
A better development environment is a Jupyter notebook, which runs in your browser and comes with the IJulia package.
To install IJulia from the Julia REPL, hit ] to enter package mode and then enter
add IJulia
This adds the packages for the IJulia kernel, which links Julia to Jupyter.
To start Jupyter, navigate to the directory where your notebooks are stored and type
jupyter notebook
and Jupyter will appear in your browser.
Now let's just treat Julia as a scripting language and take a head-first dive into Julia, and see what happens.
At the julia> prompt, typing ? gets you to the documentation for a Julia function, whilst typing ]? gets you documentation on the Julia package system.
Syntax
Syntax for basic math
sum = 3 + 8
difference = 10 - 3
product = 20 * 5
quotient = 100 / 10
power = 10 ^ 2
modulus = 101 % 2
concat = "Hugh" * "Murrell"
String Interpolation
name = "Jane"
num_fingers = 10
num_toes = 10
println("Hello, my name is $name.")
println("I have $num_fingers fingers and $num_toes toes.")
println("That is $(num_fingers + num_toes) digits in all!!")
Variables and Types
We can assign a variable a value and find out what type a variable is with the typeof function:
# assign the integer value 7 to a variable
a = 7
# find out the type of the variable a
typeof(a)
my_pi = 3.14159
typeof(my_pi)
s1 = "I am a string."
typeof(s1)
s2 = " I am also a string. "
Data structures
Once we start working with many pieces of data at once, it will be convenient to store data in structures like arrays or dictionaries (rather than just relying on variables).
Tuples and arrays are both ordered sequences of elements (so we can index into them).
Tuples are immutable but dictionaries and arrays are both mutable.
Tuples
We can create a tuple by enclosing an ordered collection of elements in ( ).
Syntax: (item1, item2, ...)
# use round brackets to create a tuple
myfavoriteanimals = ("penguins", "cats", "dogs")
# we can index a tuple
myfavoriteanimals[1]
# we can't update a tuple because tuples are immutable
# myfavoriteanimals[1] = "otters"
NamedTuples
As you might guess, NamedTuples are just like Tuples except that each element additionally has a name!
They have a special syntax using = inside a tuple:
(name1 = item1, name2 = item2, ...)
myfavoriteanimals = (bird = "penguins", mammal = "cats", marsupial = "sugargliders")
# you can index a named tuple with an integer index
myfavoriteanimals[1]
# you can also use the name to access the data
myfavoriteanimals.bird
Dictionaries
If we have sets of data related to one another, we may choose to store that data in a dictionary. We can create a dictionary using the Dict() function, which we can initialize as an empty dictionary or one storing key, value pairs.
Syntax:
Dict(key1 => value1, key2 => value2, ...)
A good example is a contacts list, where we associate names with phone numbers.
myphonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")
# We can grab Jenny's number (a value) using the associated key
myphonebook["Jenny"]
# We can add another entry to this dictionary
myphonebook["Kramer"] = "555-FILK"
myphonebook
# we can delete an entry from a dictionary (pop! also returns the removed value)
pop!(myphonebook, "Kramer")
# dictionaries are not ordered, so we can't index into them by position.
# myphonebook[1]
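Looking up a key that is not present raises an error, so it can help to test for a key first or to supply a default. The following is a small sketch using the standard haskey and get functions; the contacts dictionary is illustrative only.

```julia
# Illustrative dictionary; the entries are made up for this sketch
contacts = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")

# haskey reports whether a key is present
has_jenny = haskey(contacts, "Jenny")
has_kramer = haskey(contacts, "Kramer")

# get returns the stored value, or the supplied default if the key is missing
kramer_number = get(contacts, "Kramer", "unknown")

# iterate over key => value pairs (the iteration order is not guaranteed)
for (name, number) in contacts
    println(name, " => ", number)
end
```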
Arrays
Unlike tuples, arrays are mutable. Unlike dictionaries, arrays contain ordered collections.
We can create an array by enclosing this collection in [ ].
Syntax:
[item1, item2, ...]
# Create a one dimensional 4-element array
a = [1, 2, 3, 4]
# arrays can contain elements of different types
mixture = [1, 2, 3, "Ted", "Robyn"]
# We can use indexing to edit an existing element of an array
# note that indexing starts at 1
mixture[3] = "Jane"
mixture
# We can extend a one dimensional array
push!(mixture,6)
# and remove the last element
pop!(mixture)
mixture
Multi-dimensional arrays
So far we have seen only 1D arrays of scalars, but arrays can have an arbitrary number of dimensions and can also store other arrays.
# an array of arrays
numbers = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
# a 2D array (4 rows x 3 cols) populated with random reals
rand(4, 3)
# a 3D array (4 rows, 3 cols, 2 slices)
rand(4, 3, 2)
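To make the shape of a multi-dimensional array concrete, here is a short sketch that builds a small matrix by hand and inspects it. The variable names are illustrative only.

```julia
# a 2x3 matrix written literally: spaces separate columns, semicolons separate rows
A = [1 2 3; 4 5 6]

dims = size(A)        # (rows, columns)
elem = A[2, 3]        # element in row 2, column 3
col = A[:, 1]         # the first column as a 1D array

# zeros builds an array of the given shape filled with 0.0
Z = zeros(4, 3, 2)
ndims(Z)              # number of dimensions
```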
Copying Arrays
Be careful when you want to copy arrays!
# the original array is changed,
# because array assignment is by reference
fibonacci = [1, 1, 2, 3, 5, 8, 13]
somenumbers = fibonacci
somenumbers[1] = -1
fibonacci
# the solution is to use the copy function
# to make an independent copy of an array.
fibonacci = [1, 1, 2, 3, 5, 8, 13]
somenumbers = copy(fibonacci)
somenumbers[1] = -1
fibonacci
For arrays of arrays use deepcopy to recursively copy an array structure!
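The difference between copy and deepcopy only shows up when the elements are themselves mutable. A minimal sketch, with illustrative variable names:

```julia
nested = [[1, 2, 3], [4, 5]]

shallow = copy(nested)      # new outer array, but the inner arrays are shared
deep = deepcopy(nested)     # the inner arrays are copied as well

shallow[1][1] = -1          # also changes nested[1][1]
deep[2][1] = 99             # leaves nested untouched

# nested is now [[-1, 2, 3], [4, 5]]
println(nested)
```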
We can also make an array of a similar size and shape via the function similar:
a = [1, 4, 5]
c = similar(a) # same element type and shape as a; the contents are uninitialized
Note that arrays can be indexed by arrays (or ranges):
a[1:2]
Control Flow
Control flow in Julia is pretty standard. You have your basic for and while loops, and your if statements. There's more in the documentation.
for i=1:5 # for i goes from 1 to 5
    print(i," ")
end
println()
t = 0
while t<5
    print(t," ")
    t+=1 # t = t + 1
end
println()
school = :UKZN
if school==:UKZN
    println("yay!")
else
    println("Not even worth discussing.")
end
One interesting feature about Julia control flow is that we can write multiple loops in one line:
for i=1:2,j=2:4
    print(i*j," ")
end
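The one-line form is shorthand for nested loops in which the last range varies fastest. The sketch below collects the products both ways so they can be compared, and shows the related array comprehension syntax; the variable names are illustrative only.

```julia
# one-line form
products_oneline = Int[]
for i = 1:2, j = 2:4
    push!(products_oneline, i * j)
end

# equivalent nested form: j is the inner (fastest-varying) loop
products_nested = Int[]
for i = 1:2
    for j = 2:4
        push!(products_nested, i * j)
    end
end

# a comprehension over the same two ranges builds a 2x3 matrix
table = [i * j for i = 1:2, j = 2:4]
```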
Function Syntax
# Create an inline function
f(x,y) = 2x+y
# Call the function
f(1,2)
# Long form definition
function f(x)
    x+2
end
By default, Julia functions return the last value computed within them.
f(2)
Multiple Dispatch
A key feature of Julia is multiple dispatch.
Suppose that there is "one function", f, with two methods.
Methods are the actionable parts of a function. One method is defined as f(::Any,::Any) and another as f(::Any), meaning that if you give f two values then it will call the first method, and if you give it one value then it will call the second method.
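Dispatch on the number of arguments can be sketched as follows. The function name describe is hypothetical and is used here only to avoid redefining f.

```julia
# `describe` is a hypothetical function used only for this illustration
describe(x) = "one argument"
describe(x, y) = "two arguments"

one = describe(1)          # calls the one-argument method
two = describe(1, "a")     # calls the two-argument method

# one function, two methods
method_count = length(methods(describe))
```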
Multiple dispatch works on types. To define a dispatch on a type, use a ::Type signifier:
f(x,y) = 2x+y
f(x::Int,y::Int) = 3x+2y
Julia will dispatch onto the strictest acceptable type signature, that is, the most specific method whose signature matches the arguments.
f(2,3) # 3x+2y
f(2.0,3) # 2x+y since 2.0 is not an Int
We will go into more depth on multiple dispatch later since this is the core design feature of Julia.
The key feature is that Julia functions specialize on the types of their arguments.
This means that f is a separately compiled function for each method (and for parametric types, each possible method). Each of these is compiled the first time it is called.
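As a rough illustration of specialization, a single untyped method can be called with several argument types, and each call returns a result of the matching type. The function name twice is hypothetical.

```julia
# `twice` is a hypothetical generic function: one method, no type annotations
twice(x) = 2x

# a separate specialization is compiled the first time each
# concrete argument type is encountered
int_result = twice(3)          # Int in, Int out
float_result = twice(3.0)      # Float64 in, Float64 out
rational_result = twice(1//2)  # Rational in, Rational out
```

Although three specializations are compiled, there is still only one method of twice.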
Functions can also feature optional keyword arguments and return multiple values:
function test_function(x,y;z=0) # z is an optional keyword argument
    if z==0
        return x+y,x*y # Return a tuple
    else
        return x*y*z,x+y+z # Return a different tuple
    end
end
x,y = test_function(1,2)
x,y = test_function(1,2;z=3)
The return type for multiple return values is a Tuple.
The syntax for a tuple is (x,y,z,...) or inside of functions you can use the shorthand x,y,z,...
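A returned tuple can be kept whole or unpacked into separate variables. A minimal sketch, where extremes is a hypothetical helper:

```julia
# `extremes` is a hypothetical helper that returns three values
function extremes(v)
    return minimum(v), maximum(v), length(v)   # shorthand for a tuple
end

result = extremes([3, 1, 4, 1, 5])   # the whole tuple
lo, hi, n = result                   # destructure into three variables
is_tuple = result isa Tuple
```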
Note that functions in Julia are "first-class". This means that functions are values like any other, each with a type of its own.
Therefore functions can make functions, and you can store functions in variables, pass them as arguments and return them.
For example:
function playtime(x)
    y = 2+x
    function test(z=1)
        2y + z # y is defined in the enclosing scope, so it's available here
    end
    z = test() * test()
    return z,test
end # End function definition
z,t = playtime(2)
t(3)
Lastly we show the anonymous function syntax. This allows you to define a function inline.
g = (x,y) -> 2x+y
((x,y) -> 4x+2y)(4,5)
Unlike named functions, g is simply a function in a variable and can be overwritten at any time:
g = (x) -> 2x
g(3)
An anonymous function cannot have more than one dispatch. However, as of v0.5, anonymous functions are compiled and thus have no performance disadvantage compared with named functions.
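Anonymous functions are most often passed directly to other functions. The sketch below uses the standard map and filter functions; the variable names are illustrative only.

```julia
xs = [1, 2, 3, 4, 5]

squares = map(x -> x^2, xs)           # apply the function to each element
evens = filter(x -> x % 2 == 0, xs)   # keep elements for which it returns true

# do-block syntax passes a multi-line anonymous function as the first argument
labels = map(xs) do x
    x > 3 ? "big" : "small"
end
```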
Mutating functions
For high performance, Julia provides mutating functions. These functions change the input values that are passed in, instead of returning a new value.
By convention, mutating functions tend to be defined with a ! at the end and tend to mutate their first argument.
The purpose of mutating functions is that they allow one to reduce the number of memory allocations, which is crucial for achieving high performance.
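As an illustration, the standard library pairs sort (which returns a new array) with sort! (which works in place). The function double! below is a hypothetical user-defined example that follows the same convention.

```julia
v = [3, 1, 2]

sorted_copy = sort(v)    # allocates and returns a new array; v is unchanged
sort!(v)                 # sorts v in place without allocating a new array

# `double!` is a hypothetical mutating function: it overwrites its first argument
function double!(x)
    for i in eachindex(x)
        x[i] *= 2
    end
    return x
end

double!(v)
```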
Structures
A type is what in many other languages is an "object": a thing which has named components.
An instantiation of the type is a specific instance of it.
For example, you can think of a car as having a make and a model. So that means a Toyota RAV4 is an instantiation of the car type.
In Julia, we would define a car using the struct keyword as follows:
struct Car
    make
    model
end
We could then make the instance of a car as follows:
mycar = Car("Toyota","Rav4")
mycar.make
As with functions, a struct can be set "parametrically". For example, we can have a StaffMember have a name and a field (of type Symbol) and an age. We can allow this age to be any Number type as follows:
struct StaffMember{T<:Number}
    name::String
    field::Symbol
    age::T
end
ter = StaffMember("Terry",:football,17)
Most of Julia's types, like Float64 and Int, are natively defined in Julia in this manner.
This means that there's no limit for user defined types, only your imagination.
Julia also has abstract types.
These types cannot be instantiated but are used to build the type hierarchy.
You've already seen one abstract type, Number.
You can define type hierarchies on abstract types. See the beautiful explanation at:
https://docs.julialang.org/en/v1/manual/types/index.html#Abstract-Types-1
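A small hierarchy can be sketched as follows. The types Animal, Dog and Cat and the function speak are hypothetical and serve only to show how methods defined on an abstract type apply to its subtypes.

```julia
# Hypothetical hierarchy for illustration
abstract type Animal end

struct Dog <: Animal
    name::String
end

struct Cat <: Animal
    name::String
end

# a method on the abstract type applies to every subtype...
speak(a::Animal) = "..."
# ...unless a more specific method exists
speak(d::Dog) = "Woof"

dog_says = speak(Dog("Rex"))     # uses the Dog method
cat_says = speak(Cat("Tom"))     # falls back to the Animal method
Int <: Number                    # true: Int sits below Number in the built-in hierarchy
```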
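Another distinction is between immutable and mutable types.
A type declared with struct is immutable: once an instance has been created, its fields cannot be changed. Declaring it with mutable struct instead allows the fields to be reassigned.
Many things like Julia's built-in Number types are defined as immutable in order to give good performance.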
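In Julia 1.x a plain struct is immutable, and the mutable struct keyword opts into mutability. A minimal sketch with two hypothetical types:

```julia
struct FixedPoint            # immutable: fields cannot be reassigned
    x::Float64
    y::Float64
end

mutable struct MovingPoint   # mutable: fields may be reassigned
    x::Float64
    y::Float64
end

p = MovingPoint(0.0, 0.0)
p.x = 1.5                    # allowed

q = FixedPoint(0.0, 0.0)
# assigning to a field of an immutable struct throws an error
assignment_failed = try
    q.x = 1.5
    false
catch
    true
end
```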
Lazy Iterator Types
While MATLAB or Python has easy functions for building arrays, Julia tends to side-step the actual "array" part with specially made types. One such example is ranges. To define a range, use the start:stepsize:end syntax. For example:
a = 1:5
println(a)
b = 1:2:10
println(b)
We can use them like any array. For example:
println(a[2]); println(b[3])
But what is b?
println(typeof(b))
b isn't an array, it's a StepRange. A StepRange has the ability to act like an array using its fields:
fieldnames(StepRange)
Note that at any time we can get the array from these kinds of types via the collect function:
c = collect(a)
The reason why lazy iterator types are preferred is that they do not do the computations until it's absolutely necessary, and they take up much less space.
We can check this with @time:
@time a = 1:100000
@time a = 1:100
@time b = collect(1:100000);
Notice that the amount of time the range takes is much shorter. This is mostly because there is a lot less memory allocation needed: only a range object is built, and all that holds is its start and stop values (plus a step size in the case of a StepRange). However, b has to hold 100000 numbers, leading to the huge difference.
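The memory difference can also be inspected directly with sizeof and fieldnames. This is a rough sketch with illustrative variable names; the exact byte counts depend on the platform's Int size.

```julia
r = 1:100000             # a UnitRange: stores only start and stop
s = 1:2:100000           # a StepRange: stores start, step and stop
v = collect(r)           # a Vector{Int} holding all 100000 values

range_bytes = sizeof(r)  # size of the range object itself
array_bytes = sizeof(v)  # 100000 * sizeof(Int)

unit_fields = fieldnames(typeof(r))
step_fields = fieldnames(typeof(s))

# both behave alike when indexed
same_element = r[500] == v[500]
```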
Metaprogramming
Metaprogramming is a huge feature of Julia. The key idea is that every statement in Julia can be represented as a value of type Expr.
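To see code treated as data, an expression can be quoted, inspected, modified and then evaluated. A minimal sketch with illustrative variable names:

```julia
ex = :(1 + 2 * 3)        # quote the code instead of running it

ex_type = typeof(ex)     # Expr
ex_head = ex.head        # :call
ex_func = ex.args[1]     # :+  (the function being called)

result = eval(ex)        # evaluate the expression

# expressions can be modified like any other data
ex.args[1] = :-
changed = eval(ex)       # now computes 1 - 2 * 3
```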
Thus you can think of metaprogramming as "code which takes in code and outputs code". One basic example is the @time macro, a simplified version of which can be written as:
macro my_time(ex)
    return quote
        local t0 = time()
        local val = $ex
        local t1 = time()
        println("elapsed time: ", t1-t0, " seconds")
        val
    end
end
@my_time collect(1:100000);
This takes in an expression ex, gets the time before and after evaluation, and prints the elapsed time. Note that $ex "interpolates" the expression into the macro.
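# iterate z -> z^2 + a fifty times, starting from z = 0
function mandelbrot(a)
    z = 0
    for i=1:50
        z = z^2 + a
    end
    return z
end
# print an ASCII picture: a point is marked with * if the iteration stays bounded
for y=1.0:-0.05:-1.0
    for x=-2.0:0.0315:0.5
        abs(mandelbrot(complex(x, y))) < 2 ? print("*") : print(" ")
    end
    println()
end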