Julia is a dynamically typed language with a just-in-time compiler. This means that you don’t need to compile your program before you run it, as you would in C++ or FORTRAN. Instead, Julia will take your code, infer types where necessary, and compile parts of the code just before running them. You also don’t need to explicitly specify each type: Julia will infer types for you on the go.
The main differences between Julia and other dynamic languages such as R and Python are the following. First, Julia allows the user to specify type declarations. You already saw some type declarations in Why Julia? (Section 2): they are those double colons ::
that sometimes come after variables. However, if you don’t want to specify the type of your variables or functions, Julia will gladly infer (guess) them for you.
Second, Julia allows users to define function behavior across many combinations of argument types via multiple dispatch. We also covered multiple dispatch in Section 2.3. We defined a different type behavior by defining new function signatures for argument types while using the same function name.
Variables are values that you tell the computer to store under a specific name, so that you can later recover or change them. Julia has several types of variables but, in data science, we mostly use:
Int64
Float64
Bool
String
Integers and real numbers are stored by using 64 bits by default, which is why they have the 64
suffix in the name of the type. If you need more or less precision, there are Int8
or Int128
types, for example, where a larger number means more bits and therefore a wider range of representable values. Most of the time, this won’t be an issue, so you can just stick to the defaults.
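To make the effect of the bit width concrete, the following sketch queries the storage size and the representable range of two integer types with the built-in sizeof, typemin, and typemax functions. The variable names are chosen only for illustration.

```julia
# Number of bytes used by each type (one byte is 8 bits)
bytes_int8 = sizeof(Int8)
bytes_int64 = sizeof(Int64)

# Smallest and largest values each integer type can hold
range_int8 = (typemin(Int8), typemax(Int8))
range_int64 = (typemin(Int64), typemax(Int64))

println("Int8 uses ", bytes_int8, " byte and covers ", range_int8)
println("Int64 uses ", bytes_int64, " bytes and covers ", range_int64)
```

An Int8 can only hold values from -128 to 127, which is why the 64-bit default is usually the safer choice.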
We create new variables by writing the variable name on the left and its value on the right, and in the middle we use the =
assignment operator. For example:
name = "Julia"
age = 9
9
Note that the return output of the last statement (age
) was printed to the console. Here, we are defining two new variables: name
and age
. We can recover their values by typing the names given in the assignment:
name
Julia
If you want to define new values for an existing variable, you can repeat the steps in the assignment. Note that Julia will now overwrite the previous value with the new one. Suppose Julia’s birthday has passed and it has now turned 10:
age = 10
10
We can do the same with its name
. Suppose that Julia has earned some titles due to its blazing speed. We would change the variable name
to the new value:
name = "Julia Rapidus"
Julia Rapidus
We can also do operations on variables such as addition or division. Let’s see how old Julia is, in months, by multiplying age
by 12:
12 * age
120
We can inspect the types of variables by using the typeof
function:
typeof(age)
Int64
The next question then becomes: “What else can I do with integers?” There is a nice handy function methodswith
that spits out every function available, along with its signature, for a certain type. Here, we will restrict the output to the first 5 rows:
first(methodswith(Int64), 5)
[1] AbstractFloat(x::Int64) @ Base float.jl:268
[2] Float16(x::Int64) @ Base float.jl:159
[3] Float32(x::Int64) @ Base float.jl:159
[4] Float64(x::Int64) @ Base float.jl:159
[5] Int64(x::Union{Bool, Int32, Int64, UInt16, UInt32, UInt64, UInt8, Int128, Int16, Int8, UInt128}) @ Core boot.jl:784
Having variables around without any sort of hierarchy or relationships is not ideal. In Julia, we can define that kind of structured data with a struct
(also known as a composite type). Inside each struct
, you can specify a set of fields. They differ from the primitive types (e.g. integers and floats) that are already defined by default inside the core of the Julia language. Since most struct
s are user-defined, they are known as user-defined types.
For example, let’s create a struct
to represent scientific open source programming languages. We’ll also define a set of fields along with the corresponding types inside the struct
:
struct Language
    name::String
    title::String
    year_of_birth::Int64
    fast::Bool
end
To inspect the field names you can use the fieldnames
function and pass the desired struct
as an argument:
fieldnames(Language)
(:name, :title, :year_of_birth, :fast)
To use struct
s, we must instantiate individual instances (or “objects”), each with its own specific values for the fields defined inside the struct
. Let’s instantiate two instances, one for Julia and one for Python:
julia = Language("Julia", "Rapidus", 2012, true)
python = Language("Python", "Letargicus", 1991, false)
Language("Python", "Letargicus", 1991, false)
One thing to note with struct
s is that we can’t change their values once they are instantiated. We can solve this with a mutable struct
. Also, note that mutable objects will, generally, be slower and more error-prone. Whenever possible, make everything immutable. Let’s create a mutable struct
.
mutable struct MutableLanguage
    name::String
    title::String
    year_of_birth::Int64
    fast::Bool
end
julia_mutable = MutableLanguage("Julia", "Rapidus", 2012, true)
MutableLanguage("Julia", "Rapidus", 2012, true)
Suppose that we want to change julia_mutable
’s title. Now, we can do this since julia_mutable
is an instantiated mutable struct
:
julia_mutable.title = "Python Obliteratus"
julia_mutable
MutableLanguage("Julia", "Python Obliteratus", 2012, true)
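The difference between the two kinds of struct can be checked directly. The sketch below uses two hypothetical types, FrozenPoint and MovablePoint, that are not part of the examples above; it reads a field with dot syntax, updates the mutable instance, and records whether assigning to the immutable instance raised an error.

```julia
struct FrozenPoint            # hypothetical immutable type
    x::Int64
end

mutable struct MovablePoint   # hypothetical mutable counterpart
    x::Int64
end

frozen = FrozenPoint(1)
movable = MovablePoint(1)

# Fields are read with dot syntax in both cases
println(frozen.x)

# Only the mutable instance accepts a new field value
movable.x = 2

# Assigning to a field of an immutable struct throws an error,
# which we catch here so that the script keeps running
assignment_failed = try
    frozen.x = 2
    false
catch
    true
end
```

After running this, assignment_failed is true and frozen.x still holds its original value.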
Now that we’ve covered types, we can move to boolean operators and numeric comparison.
We have three boolean operators in Julia:
!: NOT
&&: AND
||: OR
Here are a few examples with some of them:
!true
false
(false && true) || (!false)
true
(6 isa Int64) && (6 isa Real)
true
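Regarding numeric comparison, Julia has three major types of comparisons:
Equality: either == (equal) or != or ≠ (not equal)
Less than: either < (less than) or <= or ≤ (less than or equal to)
Greater than: either > (greater than) or >= or ≥ (greater than or equal to)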
Here are some examples:
1 == 1
true
1 >= 10
false
It even works between different types:
1 == 1.0
true
We can also mix and match boolean operators with numeric comparisons:
(1 != 10) || (3.14 <= 2.71)
true
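As a small illustration of how comparisons compose, the sketch below relies on two facts about Julia: the Unicode operators ≠ and ≤ are equivalent to != and <=, and comparisons can be chained. The variable names are made up for this example.

```julia
# ASCII and Unicode spellings give the same result
same_not_equal = (1 != 2) == (1 ≠ 2)
same_less_equal = (2 <= 2) == (2 ≤ 2)

# Comparisons can be chained, which reads like mathematical notation
in_range = 1 < 2 <= 3

# Comparisons return Bool values, so they combine with boolean operators
both_hold = (1 < 2) && (2.0 == 2)
```

All four variables end up holding true.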
Now that we already know how to define variables and custom types as struct
s, let’s turn our attention to functions. In Julia, a function maps argument values to one or more return values. The basic syntax goes like this:
function function_name(arg1, arg2)
    # do stuff with arg1 and arg2; the addition is only a placeholder
    result = arg1 + arg2
    return result
end
The function declaration begins with the keyword function
followed by the function name. Then, inside parentheses ()
, we define the arguments separated by a comma ,
. Inside the function, we specify what we want Julia to do with the arguments that we supplied. All variables that we define inside a function are local to it and are no longer accessible after the function returns. This is nice because it is like an automatic cleanup. After all the operations in the function body are finished, we instruct Julia to return the final result with the return
statement. Finally, we let Julia know that the function definition is finished with the end
keyword.
There is also the compact assignment form:
f_name(arg1, arg2) = arg1 + arg2  # placeholder for stuff with arg1 and arg2
It is the same function as before but with a different, more compact, form. As a rule of thumb, when your code can fit easily on one line of up to 92 characters, then the compact form is suitable. Otherwise, just use the longer form with the function
keyword. Let’s dive into some examples.
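As a quick check that the two forms are interchangeable, here is a minimal sketch that defines the same computation both ways. The names area_long and area_short are hypothetical and used only here.

```julia
# Long form with the `function` keyword
function area_long(width, height)
    return width * height
end

# Compact assignment form of the same function
area_short(width, height) = width * height

println(area_long(3, 4) == area_short(3, 4))
```

Both definitions create an ordinary generic function, so the choice between them is purely a matter of readability.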
Let’s create a new function that adds numbers together:
function add_numbers(x, y)
    return x + y
end
add_numbers (generic function with 1 method)
Now, we can use our add_numbers
function:
add_numbers(17, 29)
46
It also works with floats:
add_numbers(3.14, 2.72)
5.86
Also, we can define custom behavior by specifying type declarations. Suppose that we want to have a round_number
function that behaves differently if its argument is either a Float64
or Int64
:
function round_number(x::Float64)
    return round(x)
end
function round_number(x::Int64)
    return x
end
round_number (generic function with 2 methods)
We can see that it is a function with multiple methods:
methods(round_number)
round_number(x::Int64)
@ Main none:5
round_number(x::Float64)
@ Main none:1
There is one issue: what happens if we want to round a 32-bit float Float32
? Or an 8-bit integer Int8
?
If you want something to function on all float and integer types, you can use an abstract type as the type signature, such as AbstractFloat
or Integer
:
function round_number(x::AbstractFloat)
    return round(x)
end
round_number (generic function with 3 methods)
Now, it works as expected with any float type:
x_32 = Float32(1.1)
round_number(x_32)
1.0f0
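The same idea applies to integers. The sketch below uses a hypothetical function name, round_generic, to stay self-contained, and defines one method per abstract type so that every float and every integer type is covered.

```julia
# Hypothetical function name, used to keep this sketch self-contained
round_generic(x::AbstractFloat) = round(x)   # any float type
round_generic(x::Integer) = x                # any integer type

println(round_generic(Float32(2.6)))   # handled by the AbstractFloat method
println(round_generic(Int8(5)))        # handled by the Integer method
```

With only two methods, calls with Float16, Float32, Float64, Int8, Int64, and so on all dispatch to a matching definition.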
NOTE: We can inspect types with the supertypes and subtypes functions.
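For instance, the following sketch walks the type hierarchy around Float32. It assumes Julia 1.5 or later, where supertypes is available, and it loads the InteractiveUtils standard library explicitly, which is only needed outside the REPL.

```julia
using InteractiveUtils  # provides supertypes and subtypes outside the REPL

# Chain of types from Float32 up to Any
float32_chain = supertypes(Float32)

# Direct subtypes of the abstract type AbstractFloat
float_kinds = subtypes(AbstractFloat)

println(float32_chain)
println(float_kinds)
```

The chain shows why a method declared for AbstractFloat or Real also accepts a Float32 argument.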
Let’s go back to our Language
struct
that we defined above. We can use it for an example of multiple dispatch: we will extend the Base.show
function that prints the output of instantiated types and struct
s.
By default, a struct
has a basic output, which you saw above in the python
case. We can define a new Base.show method for our Language type, so that we have some nice printing for our programming language instances. We want to clearly communicate programming languages’ names, titles, and ages in years. The function Base.show accepts as arguments an IO type named io followed by the type for which you want to define custom behavior:
Base.show(io::IO, l::Language) = print(
    io, l.name, ", ",
    2021 - l.year_of_birth, " years old, ",
    "has the following titles: ", l.title
)
Now, let’s see how python
will output:
python
Python, 30 years old, has the following titles: Letargicus
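The method above hard-codes the year 2021, so the printed age is only correct for that year. One possible variation, sketched here with a hypothetical DatedLanguage type and the Dates standard library, computes the age from the current date instead.

```julia
using Dates  # standard library, used here for the current year

struct DatedLanguage   # hypothetical type for this sketch
    name::String
    year_of_birth::Int64
end

# Compute the age from today's date instead of a fixed year
function Base.show(io::IO, l::DatedLanguage)
    age = year(today()) - l.year_of_birth
    print(io, l.name, ", ", age, " years old")
end

# repr returns the text that show would print
shown = repr(DatedLanguage("Julia", 2012))
println(shown)
```

The trade-off is that the output now depends on when the code is run, which matters if you compare printed output in tests.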
A function can also return two or more values. See the new function add_multiply
below:
function add_multiply(x, y)
    addition = x + y
    multiplication = x * y
    return addition, multiplication
end
add_multiply (generic function with 1 method)
In that case, we can do two things:
Analogously to the return values, we can define two variables, one to hold each return value:
return_1, return_2 = add_multiply(1, 2)
return_2
2
Or we can define just one variable to hold the function’s return values and access them with either first
or last
:
all_returns = add_multiply(1, 2)
last(all_returns)
2
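Both options work because Julia packs multiple return values into a single tuple. The sketch below redefines add_multiply in a shorter form to stay self-contained and shows that the result can be inspected, indexed, and unpacked; the variable names are illustrative.

```julia
function add_multiply(x, y)
    return x + y, x * y
end

results = add_multiply(1, 2)

# The two values travel together as one tuple
println(results isa Tuple)

# Indexing works as well as first and last
first_value = results[1]

# Destructuring unpacks the tuple into separate variables
sum_value, product_value = results
```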
Some functions can accept keyword arguments in addition to positional arguments. These arguments are just like regular arguments, except that they are defined after the function’s regular arguments and separated by a semicolon ;
. For example, let’s define a logarithm
function that by default uses base \(e\) (2.718281828459045) as a keyword argument. Note that, here, we are using the abstract type Real
so that we cover all types derived from Integer
and AbstractFloat
, both of which are themselves subtypes of Real
:
AbstractFloat <: Real && Integer <: Real
true
function logarithm(x::Real; base::Real=2.7182818284590)
    return log(base, x)
end
logarithm (generic function with 1 method)
It works without specifying the base
argument as we supplied a default argument value in the function declaration:
logarithm(10)
2.3025850929940845
And also with the keyword argument base
different from its default value:
logarithm(10; base=2)
3.3219280948873626
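As a variation, the default value does not have to be typed out by hand. The sketch below uses a hypothetical function name, natural_default_log, and takes the constant from Base.MathConstants.e, which is itself a subtype of Real.

```julia
# Hypothetical variant of logarithm that defaults to Julia's built-in constant
function natural_default_log(x::Real; base::Real=Base.MathConstants.e)
    return log(base, x)
end

println(natural_default_log(10))          # default base: natural logarithm
println(natural_default_log(8; base=2))   # keyword argument overrides the default
```

Using the built-in constant avoids the small rounding error that a truncated literal introduces.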
Often we don’t care about the name of the function and want to quickly make one. What we need are anonymous functions. They are used a lot in Julia’s data science workflow. For example, when using DataFrames.jl
(Section 4) or Makie.jl
(Section 6), sometimes we need a temporary function to filter data or format plot labels. That’s when we use anonymous functions. They are especially useful when we don’t want to create a function, and a simple in-place statement would be enough.
The syntax is simple. We use the ->
operator. On the left of ->
we define the parameter name. And on the right of ->
we define what operations we want to perform on the parameter that we defined on the left of ->
. Here is an example. Suppose that we want to undo the log transformation by using an exponentiation:
map(x -> 2.7182818284590^x, logarithm(2))
2.0
Here, we are using the map
function to conveniently map the anonymous function (first argument) to logarithm(2)
(the second argument). As a result, we get back the same number, because logarithm and exponentiation are inverses (at least in the base that we’ve chosen – 2.7182818284590).
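Anonymous functions are more commonly applied to collections than to single numbers. Here is a minimal sketch with the built-in map and filter functions; the numbers vector is made up for the example.

```julia
numbers = [1, 2, 3, 4]

# Apply an anonymous function to every element
squares = map(x -> x^2, numbers)

# Keep only the elements for which the anonymous function returns true
evens = filter(x -> x % 2 == 0, numbers)

println(squares)
println(evens)
```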
In most programming languages, the user is allowed to control the computer’s flow of execution. Depending on the situation, we want the computer to do one thing or another. In Julia we can control the flow of execution with if
, elseif
, and else
keywords. These are known as conditional statements.
The if
keyword prompts Julia to evaluate an expression and, depending on whether it’s true
or false
, execute certain portions of code. We can compound several if
conditions with the elseif
keyword for complex control flow. We can also define an alternative portion to be executed if none of the conditions in the if or elseifs evaluates to true. This is the purpose of the else keyword. Finally, like all the previous keyword operators that we saw, we must tell Julia when the conditional statement is finished with the end
keyword.
Here’s an example with all the if
-elseif
-else
keywords:
a = 1
b = 2
if a < b
    "a is less than b"
elseif a > b
    "a is greater than b"
else
    "a is equal to b"
end
a is less than b
We can even wrap this in a function called compare
:
function compare(a, b)
    if a < b
        "a is less than b"
    elseif a > b
        "a is greater than b"
    else
        "a is equal to b"
    end
end
compare(3.14, 3.14)
a is equal to b
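The compare function returns a string without any return statement because, in Julia, an if block is an expression that evaluates to the value of the branch that ran. The sketch below uses this to assign the result directly to a variable; temperature and label are hypothetical names.

```julia
temperature = 25   # hypothetical value

# The whole if-elseif-else block evaluates to the chosen branch's value
label = if temperature < 10
    "cold"
elseif temperature < 30
    "mild"
else
    "hot"
end

println(label)
```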
The classical for loop in Julia follows a syntax similar to that of the conditional statements. You begin with a keyword, in this case for
. Then, you specify what Julia should “loop” for, i.e., a sequence. Also, like everything else, you must finish with the end
keyword.
So, to make Julia print every number from 1 to 10, you can use the following for loop:
for i in 1:10
    println(i)
end
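A for loop can iterate over any sequence, not only a range. As a sketch, the hypothetical function sum_values below loops over a vector and accumulates a total; the loop is placed inside a function so that the accumulator is an ordinary local variable.

```julia
# Hypothetical helper that adds up the elements of a collection
function sum_values(values)
    total = 0
    for v in values
        total += v
    end
    return total
end

println(sum_values([1, 2, 3, 4]))
println(sum_values(1:10))
```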
The while loop is a mix of the previous conditional statements and for loops. Here, the loop body is executed repeatedly for as long as the condition is true
. The syntax follows the same form as the previous one. We begin with the keyword while
, followed by a statement that evaluates to true
or false
. As usual, you must end with the end
keyword.
Here’s an example:
n = 0
while n < 3
    global n += 1
end
n
3
As you can see, we have to use the global
keyword. This is because of variable scope. Variables defined inside loops and functions exist only inside them; this is known as the scope of the variable. (Conditional statements, by contrast, do not introduce a new scope in Julia.) Here, we had to tell Julia that the n
inside the while
loop is in the global scope with the global
keyword.
Finally, we also used the +=
operator, which is a nice shorthand for n = n + 1
.
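One common way to avoid global altogether is to put the loop inside a function, where the counter is a local variable. The function name count_up_to in this sketch is hypothetical.

```julia
function count_up_to(limit)
    n = 0              # local to the function
    while n < limit
        n += 1         # no `global` needed inside a function
    end
    return n
end

println(count_up_to(3))
```

Keeping state local in this way also tends to help performance, since Julia can infer the types of local variables more easily than those of untyped globals.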