2  Control flow, structs, modules and types

We have discussed a lot of Zig’s syntax in the last chapter, especially in Section 1.2.2 and Section 1.2.3. But we still need to discuss some other very important elements of the language, elements that you will use constantly in your day-to-day routine.

We begin this chapter by discussing the different keywords and structures in Zig related to control flow (e.g. loops and if statements). Then, we talk about structs and how they can be used to do some basic Object-Oriented (OOP) patterns in Zig. We also talk about type inference and type casting. Finally, we end this chapter by discussing modules, and how they relate to structs.

2.1 Control flow

Sometimes, you need to make decisions in your program. Maybe you need to decide whether or not to execute a specific piece of code. Or maybe you need to apply the same operation over a sequence of values. These kinds of tasks involve structures that are capable of changing the “control flow” of our program.

In computer science, the term “control flow” usually refers to the order in which expressions (or commands) are evaluated in a given language or program. But this term is also used to refer to structures that are capable of changing this “evaluation order” of the commands executed by a given language/program.

These structures are better known by a set of terms, such as: loops, if/else statements, switch statements, among others. So, loops and if/else statements are examples of structures that can change the “control flow” of our program. The keywords continue and break are also examples of symbols that can change the order of evaluation, since they can move our program to the next iteration of a loop, or make the loop stop completely.

2.1.1 If/else statements

An if/else statement performs a “conditional flow operation”. A conditional flow control (or choice control) allows you to execute or ignore a certain block of commands based on a logical condition. Many programmers and computer science professionals also use the term “branching” in this case. In essence, an if/else statement allows us to use the result of a logical test to decide whether or not to execute a given block of commands.

In Zig, we write if/else statements by using the keywords if and else. We start with the if keyword followed by a logical test inside a pair of parentheses, followed by a pair of curly braces which contains the lines of code to be executed in case the logical test returns the value true.

After that, you can optionally add an else statement. To do that, just add the else keyword followed by a pair of curly braces, with the lines of code to be executed in case the logical test defined at if returns false.

In the example below, we are testing if the object x contains a number that is greater than 10. Judging by the output printed to the console, we know that this logical test returned false, because the output matches the line of code present in the else branch of the if/else statement.

const x = 5;
if (x > 10) {
    try stdout.print(
        "x > 10!\n", .{}
    );
} else {
    try stdout.print(
        "x <= 10!\n", .{}
    );
}
x <= 10!

2.1.2 Switch statements

Switch statements are also available in Zig, and their syntax is very similar to that of a match expression in Rust. As you would expect, to write a switch statement in Zig we use the switch keyword. We provide the value that we want to “switch over” inside a pair of parentheses. Then, we list the possible combinations (or “branches”) inside a pair of curly braces.

Let’s take a look at the code example below. You can see that I’m creating an enum type called Role. We talk more about enums in Section 7.6. But in summary, this Role type is listing different types of roles in a fictitious company, like SE for Software Engineer, DE for Data Engineer, PM for Product Manager, etc.

Notice that we are using the value from the role object in the switch statement, to discover which exact area we need to store in the area variable. Also notice that we are using type inference inside the switch statement, with the dot character, as we are going to describe in Section 2.4. This makes the zig compiler infer the correct data type of the values (PM, SE, etc.) for us.

Also notice that we are grouping multiple values in the same branch of the switch statement. We just separate each possible value with a comma. For example, if role contains either DE or DA, the area variable would contain the value "Data & Analytics", instead of "Platform" or "Sales".

const std = @import("std");
const stdout = std.io.getStdOut().writer();
const Role = enum {
    SE, DPE, DE, DA, PM, PO, KS
};

pub fn main() !void {
    var area: []const u8 = undefined;
    const role = Role.SE;
    switch (role) {
        .PM, .SE, .DPE, .PO => {
            area = "Platform";
        },
        .DE, .DA => {
            area = "Data & Analytics";
        },
        .KS => {
            area = "Sales";
        },
    }
    try stdout.print("{s}\n", .{area});
}
Platform

2.1.2.1 Switch statements must exhaust all possibilities

One very important aspect about switch statements in Zig is that they must exhaust all existing possibilities. In other words, all possible values that could be found inside the role object must be explicitly handled in this switch statement.

Since the role object has type Role, the only possible values to be found inside this object are PM, SE, DPE, PO, DE, DA and KS. There are no other possible values to be stored in this role object. Thus, the switch statement must have a combination (branch) for each one of these values. This is what “exhaust all existing possibilities” means. The switch statement covers every possible case.

Therefore, you cannot write a switch statement in Zig, and leave an edge case with no explicit action to be taken. This is a similar behaviour to match expressions in Rust, which also have to handle all possible cases.
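To make the exhaustiveness rule more concrete, here is a minimal sketch that reuses the Role enum from the previous example. The area_of() function is a hypothetical helper introduced only for this illustration; it is not part of the original example.

```zig
const std = @import("std");

const Role = enum { SE, DPE, DE, DA, PM, PO, KS };

// Hypothetical helper: maps every Role value to an area name.
fn area_of(role: Role) []const u8 {
    return switch (role) {
        .PM, .SE, .DPE, .PO => "Platform",
        .DE, .DA => "Data & Analytics",
        // If you delete this branch, the KS value is left unhandled,
        // and the compiler rejects the switch statement.
        .KS => "Sales",
    };
}

pub fn main() void {
    std.debug.print("{s}\n", .{area_of(Role.KS)});
}
```

Every one of the seven values of Role appears in exactly one branch, so no else branch is needed here.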

2.1.2.2 The else branch

Take a look at the dump_hex_fallible() function below as an example. This function comes from the Zig Standard Library, more precisely from the debug.zig module. There are multiple lines in this function, but I omitted them to focus solely on the switch statement found in this function. Notice that this switch statement has four branches: three explicit cases, plus an else branch.

An else branch in a switch statement works as the “default branch”. Whenever you have multiple cases in your switch statement where you want to apply the exact same action, you can use an else branch to do that.

pub fn dump_hex_fallible(bytes: []const u8) !void {
    // Many lines ...
    switch (byte) {
        '\n' => try writer.writeAll("␊"),
        '\r' => try writer.writeAll("␍"),
        '\t' => try writer.writeAll("␉"),
        else => try writer.writeByte('.'),
    }
}

Many programmers would also use an else branch to handle a “not supported” case. That is, a case that cannot be properly handled by your code, or, just a case that should not be “fixed”. Therefore, you can use an else branch to panic (or raise an error) in your program to stop the current execution.

Take the code example below. We can see that we are handling the cases for the level object being either 1, 2, or 3. All other possible cases are not supported by default, and, as a consequence, we raise a runtime error in such cases through the @panic() built-in function.

Also notice that we are assigning the result of the switch statement to a new object called category. This is another thing that you can do with switch statements in Zig. If a branch outputs a value as result, you can store the result value of the switch statement into a new object.

const level: u8 = 4;
const category = switch (level) {
    1, 2 => "beginner",
    3 => "professional",
    else => {
        @panic("Not supported level!");
    },
};
try stdout.print("{s}\n", .{category});
thread 13103 panic: Not supported level!
t.zig:9:13: 0x1033c58 in main (switch2)
            @panic("Not supported level!");
            ^

2.1.2.3 Using ranges in switch

Furthermore, you can also use ranges of values in switch statements. That is, you can create a branch in your switch statement that is used whenever the input value is within the specified range. These “range expressions” are created with the operator .... It is important to emphasize that the ranges created by this operator are inclusive on both ends.

For example, I could easily change the previous code example to support all levels between 0 and 100. Like this:

const level: u8 = 4;
const category = switch (level) {
    0...25 => "beginner",
    26...75 => "intermediary",
    76...100 => "professional",
    else => {
        @panic("Not supported level!");
    },
};
try stdout.print("{s}\n", .{category});
beginner

This is neat, and it works with character ranges too. That is, I could simply write 'a'...'z', to match any character value that is a lowercase letter, and it would work fine.
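As a small sketch of the character range idea, the hypothetical is_lowercase() helper below (a name I made up for this illustration) uses the range 'a'...'z' as a switch branch. Since a character literal is just an integer value, the range works the same way as the numeric ranges above.

```zig
const std = @import("std");

// Hypothetical helper: checks if a byte is a lowercase ASCII letter.
fn is_lowercase(c: u8) bool {
    return switch (c) {
        // Inclusive on both ends: matches 'a', 'z' and everything in between.
        'a'...'z' => true,
        // A u8 has 256 possible values, so the else branch
        // is needed to exhaust the remaining ones.
        else => false,
    };
}

pub fn main() void {
    std.debug.print("{} {}\n", .{ is_lowercase('k'), is_lowercase('K') });
}
```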

2.1.2.4 Labeled switch statements

In Section 1.7 we have talked about labeling blocks, and also, about using these labels to return a value from the block. Well, from version 0.14.0 and onwards of the zig compiler, you can also apply labels over switch statements, which makes it possible to almost implement a “C goto” like pattern.

For example, if you give the label xsw to a switch statement, you can use this label in conjunction with the continue keyword to go back to the beginning of the switch statement. In the example below, the execution goes back to the beginning of the switch statement two times, before ending at the 3 branch.

xsw: switch (@as(u8, 1)) {
    1 => {
        try stdout.print("First branch\n", .{});
        continue :xsw 2;
    },
    2 => continue :xsw 3,
    3 => return,
    4 => {},
    else => {
        try stdout.print(
            "Unmatched case, value: {d}\n", .{@as(u8, 1)}
        );
    },
}

2.1.3 The defer keyword

With the defer keyword you can register an expression to be executed when you exit the current scope. Therefore, this keyword has a similar functionality as the on.exit() function from R. Take the foo() function below as an example. When we execute this foo() function, the expression that prints the message “Exiting function …” is getting executed only when the function exits its scope.

const std = @import("std");
const stdout = std.io.getStdOut().writer();
fn foo() !void {
    defer std.debug.print(
        "Exiting function ...\n", .{}
    );
    try stdout.print("Adding some numbers ...\n", .{});
    const x = 2 + 2; _ = x;
    try stdout.print("Multiplying ...\n", .{});
    const y = 2 * 8; _ = y;
}

pub fn main() !void {
    try foo();
}
Adding some numbers ...
Multiplying ...
Exiting function ...

Therefore, we can use defer to declare an expression that is going to be executed when your code exits the current scope. Some programmers like to interpret the phrase “exit of the current scope” as “the end of the current scope”. But this interpretation might not be entirely correct, depending on what you consider as “the end of the current scope”.

I mean, what do you consider as the end of the current scope? Is it the closing curly bracket (}) of the scope? Is it when the last expression in the function gets executed? Is it when the function returns to the previous scope? Etc. For example, it would not be correct to interpret the “exit of the current scope” as the closing curly bracket of the scope, because the function might exit from an earlier position than this closing curly bracket (e.g. an error value was generated at a previous line inside the function; the function reached an earlier return statement; etc.). Anyway, just be careful with this interpretation.

Now, if you remember what we have discussed in Section 1.7, there are multiple structures in the language that create their own separate scopes: for/while loops, if/else statements, functions, normal blocks, etc. This also affects the interpretation of defer. For example, if you use defer inside the body of a for loop, the given expression is executed every time the loop body exits its own scope, that is, at the end of every iteration.
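Here is a minimal sketch of a defer placed inside a for loop. The run_loop() function and the log buffer are hypothetical names that I introduced just to record the order in which the statements run.

```zig
const std = @import("std");

// Hypothetical helper: writes into `log` the order in which things happen.
fn run_loop(log: *[6]u8) void {
    const items = [_]u8{ 'a', 'b', 'c' };
    var pos: usize = 0;
    for (items) |item| {
        // Registered again on every iteration. It runs when the
        // body of the loop exits, not when the whole loop finishes.
        defer {
            log[pos] = '|';
            pos += 1;
        }
        log[pos] = item;
        pos += 1;
    }
}

pub fn main() void {
    var log: [6]u8 = undefined;
    run_loop(&log);
    std.debug.print("{s}\n", .{&log});
}
```

The deferred block writes a '|' character right after each item, which shows that it runs once per iteration, instead of running three times at the end of the loop.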

Before we continue, it is worth emphasizing that the defer keyword is an “unconditional defer”, which means that the given expression will be executed no matter how the code exits the current scope. For example, your code might exit the current scope because of an error value being generated, or because of a return statement, or a break statement, etc.

2.1.4 The errdefer keyword

In the previous section, we have discussed the defer keyword, which you can use to register an expression to be executed at the exit of the current scope. But this keyword has a brother, which is the errdefer keyword. While defer is an “unconditional defer”, the errdefer keyword is a “conditional defer”, which means that the given expression is executed only when you exit the current scope under a very specific circumstance.

In more detail, the expression given to errdefer is executed only when an error occurs in the current scope. Therefore, if the function (or for/while loop, if/else statement, etc.) exits the current scope in a normal situation, without errors, the expression given to errdefer is not executed.

This makes the errdefer keyword one of the many tools available in Zig for error handling. In this section, we are more concerned with the control flow aspects around errdefer. But we are going to discuss errdefer later as an error handling tool in Section 10.2.4.
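As a small sketch of this “conditional” behaviour, the hypothetical may_fail() function below receives a pointer to a boolean flag, and its errdefer expression sets this flag to true. Both the function and the flag are names that I made up for this illustration.

```zig
const std = @import("std");

// Hypothetical helper: `cleanup_ran` records whether
// the errdefer expression was executed.
fn may_fail(fail: bool, cleanup_ran: *bool) !void {
    errdefer cleanup_ran.* = true;
    if (fail) return error.Failed;
}

pub fn main() void {
    var ran_on_success = false;
    may_fail(false, &ran_on_success) catch {};

    var ran_on_error = false;
    may_fail(true, &ran_on_error) catch {};

    std.debug.print(
        "success path: {}, error path: {}\n",
        .{ ran_on_success, ran_on_error },
    );
}
```

The flag is only set in the call where the function returns an error value. In the call that exits normally, the errdefer expression is skipped.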

The code example below demonstrates three things:

  • that defer is an “unconditional defer”, because the given expression gets executed regardless of how the function foo() exits its own scope.
  • that errdefer is executed because the function foo() returned an error value.
  • that defer and errdefer expressions are executed in a LIFO (last in, first out) order.
const std = @import("std");
fn foo() !void { return error.FooError; }
pub fn main() !void {
    var i: usize = 1;
    errdefer std.debug.print("Value of i: {d}\n", .{i});
    defer i = 2;
    try foo();
}
Value of i: 2
error: FooError
/t.zig:6:5: 0x1037e48 in foo (defer)
    return error.FooError;
    ^

When I say that “defer expressions” are executed in a LIFO order, what I want to say is that the last defer or errdefer expressions in the code are the first ones to be executed. You could also interpret this as: “defer expressions” are executed from bottom to top, or, from last to first.

Therefore, if I change the order of the defer and errdefer expressions, you will notice that the value of i that gets printed to the console changes to 1. This doesn’t mean that the defer expression was not executed in this case. This actually means that the defer expression was executed only after the errdefer expression. The code example below demonstrates this:

const std = @import("std");
fn foo() !void { return error.FooError; }
pub fn main() !void {
    var i: usize = 1;
    defer i = 2;
    errdefer std.debug.print("Value of i: {d}\n", .{i});
    try foo();
}
Value of i: 1
error: FooError
/t.zig:6:5: 0x1037e48 in foo (defer)
    return error.FooError;
    ^

2.1.5 For loops

A loop allows you to execute the same lines of code multiple times, thus, creating a “repetition space” in the execution flow of your program. Loops are particularly useful when we want to replicate the same function (or the same set of commands) over different inputs.

There are different types of loops available in Zig. But the most essential of them all is probably the for loop. A for loop is used to apply the same piece of code over the elements of a slice, or, an array.

For loops in Zig use a syntax that may be unfamiliar to programmers coming from other languages. You start with the for keyword, then, you list the items that you want to iterate over inside a pair of parentheses. Then, inside of a pair of pipes (|) you should declare an identifier that will serve as your iterator, or, the “repetition index of the loop”.

for (items) |value| {
    // code to execute
}

Therefore, instead of using a (value in items) syntax, in Zig, for loops use the syntax (items) |value|. In the example below, you can see that we are looping through the items of the array stored at the object name, and printing to the console the decimal representation of each character in this array.

If we wanted, we could also iterate through a slice (or a portion) of the array, instead of iterating through the entire array stored in the name object. Just use a range selector to select the section you want. For example, I could provide the expression name[0..3] to the for loop, to iterate just through the first 3 elements in the array.

const name = [_]u8{'P','e','d','r','o'};
for (name) |char| {
    try stdout.print("{d} | ", .{char});
}
80 | 101 | 100 | 114 | 111 | 
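The slice variant mentioned before the example can be sketched like this. The loop in main() iterates only over name[0..3], and the sum() function is a hypothetical helper that I added to show that a for loop works the same way over a slice received as a function argument.

```zig
const std = @import("std");

// Hypothetical helper: adds up the bytes of a slice.
fn sum(items: []const u8) u32 {
    var total: u32 = 0;
    for (items) |item| {
        total += item;
    }
    return total;
}

pub fn main() void {
    const name = [_]u8{ 'P', 'e', 'd', 'r', 'o' };
    // Iterates over 'P', 'e' and 'd' only.
    for (name[0..3]) |char| {
        std.debug.print("{d} | ", .{char});
    }
    std.debug.print("\nSum: {d}\n", .{sum(name[0..3])});
}
```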

In the above example we are using the value itself of each element in the array as our iterator. But there are many situations where we need to use an index instead of the actual values of the items.

You can do that by providing a second set of items to iterate over. More precisely, you provide the range selector 0.. to the for loop. So, yes, you can use two different iterators at the same time in a for loop in Zig.

But remember from Section 1.4 that every object you create in Zig must be used in some way. So if you declare two iterators in your for loop, you must use both iterators inside the for loop body. But if you want to use just the index iterator, and not use the “value iterator”, then you can discard the value iterator by matching the value items to the underscore character, like in the example below:

const name = "Pedro";
for (name, 0..) |_, i| {
    try stdout.print("{d} | ", .{i});
}
0 | 1 | 2 | 3 | 4 |
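When you do need both iterators, you simply give a name to each one of them. The sketch below uses a hypothetical index_of() helper (not a function from the Zig Standard Library) that compares each value against a target, and returns the index where it was found. As an assumption of this sketch, the helper returns the length of the input when nothing matches.

```zig
const std = @import("std");

// Hypothetical helper: index of the first occurrence of `needle`,
// or `haystack.len` if `needle` is not present.
fn index_of(haystack: []const u8, needle: u8) usize {
    // `char` receives each value, `i` receives the matching index.
    for (haystack, 0..) |char, i| {
        if (char == needle) return i;
    }
    return haystack.len;
}

pub fn main() void {
    const name = "Pedro";
    std.debug.print("Index of 'd': {d}\n", .{index_of(name, 'd')});
}
```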

2.1.6 While loops

A while loop is created from the while keyword. A for loop iterates through the items of an array, but a while loop keeps looping, potentially forever, until a logical test (specified by you) becomes false.

You start with the while keyword, then, you define a logical expression inside a pair of parentheses, and the body of the loop is provided inside a pair of curly braces, like in the example below:

var i: u8 = 1;
while (i < 5) {
    try stdout.print("{d} | ", .{i});
    i += 1;
}
1 | 2 | 3 | 4 | 

You can also specify the increment expression at the top of a while loop, right next to the logical test. To do that, we write the increment expression inside a pair of parentheses after a colon character (:). This expression is executed at the end of every iteration of the loop. The code example below demonstrates this other pattern.

var i: u8 = 1;
while (i < 5) : (i += 1) {
    try stdout.print("{d} | ", .{i});
}
1 | 2 | 3 | 4 | 
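One detail about this pattern is that the increment expression also runs when an iteration is skipped with the continue keyword. The sketch below illustrates this with a hypothetical sum_odd_below() function, a name that I made up for this example.

```zig
const std = @import("std");

// Hypothetical helper: sums the odd numbers lower than `limit`.
fn sum_odd_below(limit: u32) u32 {
    var total: u32 = 0;
    var i: u32 = 0;
    while (i < limit) : (i += 1) {
        // Skips even numbers. The `i += 1` expression still runs,
        // so the loop does not get stuck on an even value.
        if (i % 2 == 0) continue;
        total += i;
    }
    return total;
}

pub fn main() void {
    std.debug.print("{d}\n", .{sum_odd_below(10)});
}
```

If the increment were written as the last line of the loop body instead, the continue keyword would jump over it, and the loop would never advance past the first even number.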

2.1.7 Using break and continue

In Zig, you can explicitly stop the execution of a loop, or jump to the next iteration of the loop, by using the keywords break and continue, respectively. The while loop presented in the next code example is, at first sight, an infinite loop, because the logical value inside the parentheses will always be equal to true. But what makes this while loop stop when the i object reaches the count 10? It is the break keyword!

Inside the while loop, we have an if statement that is constantly checking if the i variable is equal to 10. Since we are incrementing the value of i at each iteration of the while loop, this i object will eventually be equal to 10, and when it is, the if statement will execute the break expression, and, as a result, the execution of the while loop is stopped.

Notice the use of the expect() function from the Zig Standard Library after the while loop. This expect() function is an “assert” type of function. It checks if the logical test provided is equal to true. If so, the function does nothing. Otherwise (i.e. the logical test is equal to false), the function returns an error value (error.TestUnexpectedResult), which is then propagated by the try keyword.

var i: usize = 0;
while (true) {
    if (i == 10) {
        break;
    }
    i += 1;
}
try std.testing.expect(i == 10);
try stdout.print("Everything worked!", .{});
Everything worked!

Since this code example ran successfully, without raising any errors, we know that, after the execution of the while loop, the i object is equal to 10. Because if it wasn’t equal to 10, an error would have been raised by expect().

Now, in the next example, we have a use case for the continue keyword. The if statement checks whether the current element is a multiple of 2. If it is, we jump to the next iteration of the loop. Otherwise, the loop just prints the current element to the console.

const ns = [_]u8{1,2,3,4,5,6};
for (ns) |i| {
    if ((i % 2) == 0) {
        continue;
    }
    try stdout.print("{d} | ", .{i});
}
1 | 3 | 5 | 

2.2 Function parameters are immutable

We have already discussed a lot of the syntax behind function declarations in Section 1.2.2 and Section 1.2.3. But I want to emphasize a curious fact about function parameters (a.k.a. function arguments) in Zig. In summary, function parameters are immutable in Zig.

Take the code example below, where we declare a simple function that just tries to add some amount to the input integer, and returns the result back. If you look closely at the body of this add2() function, you will notice that we try to save the result back into the x function argument.

In other words, this function not only uses the value that it received through the function argument x, but it also tries to change the value of this function argument, by assigning the addition result into x. However, function arguments in Zig are immutable. You cannot change their values, that is, you cannot assign values to them inside the function’s body.

This is the reason why the code example below does not compile successfully. If you try to compile this code example, you will get a compile error message about trying to change the value of an immutable (i.e. constant) object.

const std = @import("std");
fn add2(x: u32) u32 {
    x = x + 2;
    return x;
}

pub fn main() !void {
    const y = add2(4);
    std.debug.print("{d}\n", .{y});
}
t.zig:3:5: error: cannot assign to constant
    x = x + 2;
    ^

2.2.1 A free optimization

If a function argument receives as input an object whose data type is any of the primitive types that we have listed in Section 1.5, this object is always passed by value to the function. In other words, this object is copied into the function stack frame.

However, if the input object has a more complex data type (for example, a struct instance, an array, or a union value), the zig compiler takes the liberty of deciding for you which strategy is best. Thus, the zig compiler will pass your object to the function either by value or by reference, whichever it decides will be faster. This optimization that you get for free is possible only because function arguments are immutable in Zig.

2.2.2 How to overcome this barrier

There are some situations where you might need to change the value of your function argument directly inside the function’s body. This happens more often when we are passing C structs as inputs to Zig functions.

In a situation like this, you can overcome this barrier by using a pointer. In other words, instead of passing a value as input to the argument, you can pass a “pointer to value” instead. You can change the value that the pointer points to, by dereferencing it.

Therefore, if we take our previous add2() example, we can change the value of the function argument x inside the function’s body by marking the x argument as a “pointer to a u32 value” (i.e. *u32 data type), instead of a u32 value. By making it a pointer, we can finally alter the value of this function argument directly inside the body of the add2() function. You can see that the code example below compiles successfully.

const std = @import("std");
fn add2(x: *u32) void {
    const d: u32 = 2;
    x.* = x.* + d;
}

pub fn main() !void {
    var x: u32 = 4;
    add2(&x);
    std.debug.print("Result: {d}\n", .{x});
}
Result: 6

Even in this code example above, the x argument is still immutable. Which means that the pointer itself is immutable. Therefore, you cannot change the memory address that it points to. However, you can dereference the pointer to access the value that it points to, and also, to change this value, if you need to.

2.3 Structs and OOP

Zig is a language more closely related to C (which is a procedural language), than it is to C++ or Java (which are object-oriented languages). Because of that, you do not have advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or class inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.

With struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C. You give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can also register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object that you create with this new type, will always have these methods available and associated with them.

In C++, when we create a new class, we normally have a constructor method (or, a constructor function) which is used to construct (or, to instantiate) every object of this particular class, and we also have a destructor method (or a destructor function), which is the function responsible for destroying every object of this class.

In Zig, we normally declare the constructor and the destructor methods of our structs by declaring init() and deinit() methods inside the struct. This is just a naming convention that you will find across the entire Zig Standard Library. So, in Zig, the init() method of a struct is normally the constructor of that struct, while the deinit() method is the method used for destroying an existing instance of that struct.

The init() and deinit() methods are both used extensively in Zig code, and you will see both of them being used when we talk about allocators in Section 3.3. But, as another example, let’s build a simple User struct to represent a user of some sort of system.

If you look at the User struct below, you can see the struct keyword. Notice the data members of this struct: id, name and email. Every data member has its type explicitly annotated, with the colon character (:) syntax that we described earlier in Section 1.2.2. But also notice that every line in the struct body that describes a data member, ends with a comma character (,). So every time you declare a data member in your Zig code, always end the line with a comma character, instead of ending it with the traditional semicolon character (;).

Next, we have registered an init() function as a method of this User struct. This init() method is the constructor method that we will use to instantiate every new User object. That is why this init() function returns a new User object as result.

const std = @import("std");
const stdout = std.io.getStdOut().writer();
const User = struct {
    id: u64,
    name: []const u8,
    email: []const u8,

    pub fn init(id: u64,
                name: []const u8,
                email: []const u8) User {

        return User {
            .id = id,
            .name = name,
            .email = email
        };
    }

    pub fn print_name(self: User) !void {
        try stdout.print("{s}\n", .{self.name});
    }
};

pub fn main() !void {
    const u = User.init(1, "pedro", "email@gmail.com");
    try u.print_name();
}
pedro
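The User struct does not own any resource that needs to be released, so it has no deinit() method. As a rough sketch of how init() and deinit() usually work together, consider the hypothetical Buffer struct below, which I made up for this illustration. It relies on an allocator, a subject that we only discuss in Section 3.3, so don’t worry about the details for now. The important part is that init() acquires the memory, and deinit() gives it back.

```zig
const std = @import("std");

// Hypothetical struct, used only to illustrate the init()/deinit() pairing.
const Buffer = struct {
    allocator: std.mem.Allocator,
    data: []u8,

    // "Constructor": allocates `size` bytes and fills them with zeros.
    pub fn init(allocator: std.mem.Allocator, size: usize) !Buffer {
        const data = try allocator.alloc(u8, size);
        @memset(data, 0);
        return Buffer{ .allocator = allocator, .data = data };
    }

    // "Destructor": frees the memory that init() allocated.
    pub fn deinit(self: Buffer) void {
        self.allocator.free(self.data);
    }
};

pub fn main() !void {
    const buf = try Buffer.init(std.heap.page_allocator, 16);
    // Guarantees that deinit() is called when main() exits its scope.
    defer buf.deinit();
    std.debug.print("Buffer size: {d}\n", .{buf.data.len});
}
```

Notice how the defer keyword from Section 2.1.3 fits this convention: we register the call to deinit() right after the call to init().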

The pub keyword plays an important role in struct declarations, and in OOP in Zig. Every method that you declare in your struct and mark with the keyword pub becomes a public method of that struct.

Every method that you create inside your struct is, at first, a private method of that struct. In Zig, “private” means private to the source file: the method can be called from any code in the file where the struct is declared, but not from other files that import this struct. If you mark the method as public, with the keyword pub, then code in other files can also call the method directly from an instance of the User struct.

In other words, the functions marked by the keyword pub are members of the public API of that struct. For example, if the User struct lived in a separate file, and I did not mark the print_name() method as public, then I could not execute the line u.print_name() from the file that imports it, because I would not be authorized to call this method from there.

2.3.1 Anonymous struct literals

You can declare a struct object as a literal value. When we do that, we normally specify the data type of this struct literal by writing its data type just before the opening curly brace. For example, I could write a struct literal value of the type User that we have defined in the previous section like this:

const eu = User {
    .id = 1,
    .name = "Pedro",
    .email = "someemail@gmail.com"
};
_ = eu;

However, in Zig, we can also write an anonymous struct literal. That is, you can write a struct literal, but not specify explicitly the type of this particular struct. An anonymous struct is written by using the syntax .{}. So, we essentially replaced the explicit type of the struct literal with a dot character (.).

As we are going to describe in Section 2.4, when you put a dot before a struct literal, the type of this struct literal is automatically inferred by the zig compiler. In essence, the zig compiler will look for some hint of what the type of that struct is. This hint can be the type annotation of a function argument, or the return type annotation of the function that you are using, or the type annotation of an existing object. If the compiler does find such a type annotation, it will use this type in your struct literal.
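The sketch below shows these three kinds of hints at work. The Point struct and the origin() and sum_coords() functions are hypothetical names that I introduced only for this illustration.

```zig
const std = @import("std");

// Hypothetical struct used only in this illustration.
const Point = struct { x: i32, y: i32 };

// Hint: the return type annotation of the function.
fn origin() Point {
    return .{ .x = 0, .y = 0 };
}

// Hint: the type annotation of the function argument `p`.
fn sum_coords(p: Point) i32 {
    return p.x + p.y;
}

pub fn main() void {
    // Hint: the type annotation of the object being declared.
    const p: Point = .{ .x = 3, .y = 4 };
    const o = origin();
    std.debug.print(
        "{d} {d} {d}\n",
        .{ sum_coords(p), sum_coords(o), sum_coords(.{ .x = 1, .y = 2 }) },
    );
}
```

In all three places, the literal is written as .{ ... } and never mentions Point explicitly.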

Anonymous structs are very commonly used as inputs to function arguments in Zig. One example that you have already seen many times is the print() function from the stdout object. This function takes two arguments. The first argument is a template string, which should contain format specifiers that tell how the values provided in the second argument should be printed into the message.

The second argument is a struct literal that lists the values to be printed into the template message specified in the first argument. You normally want to use an anonymous struct literal here, so that the zig compiler does the job of specifying the type of this particular anonymous struct for you.

const std = @import("std");
pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    try stdout.print("Hello, {s}!\n", .{"world"});
}
Hello, world!

2.3.2 Struct declarations must be constant

Types in Zig must be const or comptime (we are going to talk more about comptime in Section 12.1). What this means is that you cannot create a new data type, and mark it as variable with the var keyword. So struct declarations are always constant. You cannot declare a new struct type using the var keyword. It must be const.

In the Vec3 example below, this declaration is allowed because I’m using the const keyword to declare this new data type.

const Vec3 = struct {
    x: f64,
    y: f64,
    z: f64,
};

2.3.3 The self method argument

In every language that has OOP, when we declare a method of some class or struct, we usually declare this method as a function that has a self argument. This self argument is a reference to the object from which the method is being called.

It is not mandatory to use this self argument. But why would you not use this self argument? There is no reason to not use it. Because the only way to get access to the data stored in the data members of your struct is to access them through this self argument. If you don’t need to use the data in the data members of your struct inside your method, you very likely don’t need a method. You can just declare this logic as a simple function, outside of your struct declaration.

Take the Vec3 struct below. Inside this Vec3 struct we declared a method named distance(). This method calculates the distance between two Vec3 objects, by following the distance formula in Euclidean space. Notice that this distance() method takes two Vec3 objects as input, self and other.

const std = @import("std");
const m = std.math;
const Vec3 = struct {
    x: f64,
    y: f64,
    z: f64,

    pub fn distance(self: Vec3, other: Vec3) f64 {
        const xd = m.pow(f64, self.x - other.x, 2.0);
        const yd = m.pow(f64, self.y - other.y, 2.0);
        const zd = m.pow(f64, self.z - other.z, 2.0);
        return m.sqrt(xd + yd + zd);
    }
};

The self argument corresponds to the Vec3 object from which this distance() method is being called, while other is a separate Vec3 object that is given as input to this method. In the example below, the self argument corresponds to the object v1, because the distance() method is being called from the v1 object, while the other argument corresponds to the object v2.

const v1 = Vec3 {
    .x = 4.2, .y = 2.4, .z = 0.9
};
const v2 = Vec3 {
    .x = 5.1, .y = 5.6, .z = 1.6
};

std.debug.print(
    "Distance: {d}\n",
    .{v1.distance(v2)}
);
Distance: 3.3970575502926055
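The snippets above can be put together into a single file and run with the zig test command. The sketch below also shows that the method call syntax is just another way of writing a plain function call in which the object is passed as the first argument, so v1.distance(v2) and Vec3.distance(v1, v2) compute the same value.

```zig
const std = @import("std");
const m = std.math;

const Vec3 = struct {
    x: f64,
    y: f64,
    z: f64,

    pub fn distance(self: Vec3, other: Vec3) f64 {
        const xd = m.pow(f64, self.x - other.x, 2.0);
        const yd = m.pow(f64, self.y - other.y, 2.0);
        const zd = m.pow(f64, self.z - other.z, 2.0);
        return m.sqrt(xd + yd + zd);
    }
};

test "method call and plain function call are equivalent" {
    const v1 = Vec3{ .x = 4.2, .y = 2.4, .z = 0.9 };
    const v2 = Vec3{ .x = 5.1, .y = 5.6, .z = 1.6 };
    // Here v1 is received as `self` and v2 as `other`.
    const from_method = v1.distance(v2);
    // The same call, with `self` passed explicitly.
    const from_function = Vec3.distance(v1, v2);
    try std.testing.expect(from_method == from_function);
}
```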

2.3.4 About the struct state

Sometimes you don’t need to care about the state of your struct object. Sometimes, you just need to instantiate and use the objects, without altering their state. You can notice this when the methods inside your struct declaration use the values that are present in the data members, but do not alter these values in any way.

The Vec3 struct that was presented in Section 2.3.3 is an example of that. This struct has a single method named distance(), and this method does use the values present in all three data members of the struct (x, y and z). But at the same time, this method does not change the values of these data members at any point.

As a result of that, when we create Vec3 objects we usually create them as constant objects, like the v1 and v2 objects presented in Section 2.3.3. We can create them as variable objects with the var keyword, if we want to. But because the methods of this Vec3 struct do not change the state of the objects at any point, it’s unnecessary to mark them as variable objects.

But why? Why am I talking about this here? It’s because the self argument in the methods is affected depending on whether the methods present in a struct change or don’t change the state of the object itself. More specifically, when you have a method in a struct that changes the state of the object (i.e. change the value of a data member), the self argument in this method must be annotated in a different manner.

As I described in Section 2.3.3, the self argument in methods of a struct is the argument that receives as input the object from which the method was called. We usually annotate this argument in the methods by writing self, followed by the colon character (:), and the data type of the struct to which the method belongs (e.g. User, Vec3, etc.).

If we take the Vec3 struct that we defined in the previous section as an example, we can see in the distance() method that this self argument is annotated as self: Vec3, because the state of the Vec3 object is never altered by this method.

But what if we do have a method that alters the state of the object, by altering the values of its data members, how should we annotate self in this instance? The answer is: “we should annotate self as a pointer of x, instead of just x”. In other words, you should annotate self as self: *x, instead of annotating it as self: x.

If we create a new method inside the Vec3 struct that, for example, expands the vector by multiplying its coordinates by a factor of two, then we need to follow the rule specified in the previous paragraph. The code example below demonstrates this idea:

const std = @import("std");
const m = std.math;
const Vec3 = struct {
    x: f64,
    y: f64,
    z: f64,

    pub fn distance(self: Vec3, other: Vec3) f64 {
        const xd = m.pow(f64, self.x - other.x, 2.0);
        const yd = m.pow(f64, self.y - other.y, 2.0);
        const zd = m.pow(f64, self.z - other.z, 2.0);
        return m.sqrt(xd + yd + zd);
    }

    pub fn twice(self: *Vec3) void {
        self.x = self.x * 2.0;
        self.y = self.y * 2.0;
        self.z = self.z * 2.0;
    }
};

Notice in the code example above that we have added a new method to our Vec3 struct named twice(). This method doubles the coordinate values of our vector object. In the case of the twice() method, we annotated the self argument as *Vec3, indicating that this argument receives a pointer (or a reference, if you prefer to call it this way) to a Vec3 object as input.

var v3 = Vec3 {
    .x = 4.2, .y = 2.4, .z = 0.9
};
v3.twice();
std.debug.print("Doubled: {d}\n", .{v3.x});
Doubled: 8.4

Now, if you change the self argument in this twice() method to self: Vec3, like in the distance() method, you will get the compiler error shown below as a result. Notice that this error message is showing a line from the twice() method body, indicating that you cannot alter the value of the x data member.

// If we change the function signature of twice to:
    pub fn twice(self: Vec3) void {
t.zig:16:13: error: cannot assign to constant
        self.x = self.x * 2.0;
        ~~~~^~

This error message indicates that the x data member belongs to a constant object, and, because of that, it cannot be changed. Ultimately, this error message is telling us that the self argument is constant.

If you take some time, and think hard about this error message, you will understand it. You already have the tools to understand why we are getting this error message. We have talked about it already in Section 2.2. So remember, every function argument is immutable in Zig, and self is no exception to this rule.

In this example, we marked the v3 object as a variable object. But this does not matter. Because it is not about the input object, it is about the function argument.

The problem begins when we try to alter the value of self directly, which is a function argument, and, every function argument is immutable by default. You may ask yourself how can we overcome this barrier, and once again, the solution was also discussed in Section 2.2. We overcome this barrier, by explicitly marking the self argument as a pointer.

Note

If a method of your x struct alters the state of the object, by changing the value of any data member, then, remember to use self: *x, instead of self: x in the function signature of this method.

You could also interpret the content discussed in this section as: “if you need to alter the state of your x struct object in one of its methods, you must explicitly pass the x struct object by reference to the self argument of this method”.
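As a minimal sketch of this rule, the test below calls a method whose self argument is annotated as *Vec3. The method call syntax takes the address of the object for us, so v.twice() behaves the same as the plain function call Vec3.twice(&v). The coordinate values are chosen only because they double exactly in floating point.

```zig
const std = @import("std");

const Vec3 = struct {
    x: f64,
    y: f64,
    z: f64,

    // `self` is a pointer, so assignments reach the caller's object.
    pub fn twice(self: *Vec3) void {
        self.x = self.x * 2.0;
        self.y = self.y * 2.0;
        self.z = self.z * 2.0;
    }
};

test "a method taking *Vec3 changes the state of the object" {
    // The object must be declared with `var` to be mutable.
    var v = Vec3{ .x = 1.5, .y = 2.0, .z = 0.5 };
    v.twice();
    try std.testing.expect(v.x == 3.0);
    // The same call, passing the pointer explicitly.
    Vec3.twice(&v);
    try std.testing.expect(v.x == 6.0);
}
```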

2.4 Type inference

Zig is a strongly typed language. But, there are some situations where you don’t have to explicitly write the type of every single object in your source code, as you would expect from a traditional strongly typed language, such as C and C++.

In some situations, the zig compiler can use type inference to work out the data types for you, easing some of the burden that you carry as a developer. The most common way this happens is through function arguments that receive struct objects as input.

In general, type inference in Zig is done by using the dot character (.). Every time you see a dot character written before a struct literal, or before an enum value, or something like that, you know that this dot character is playing a special role in this place. More specifically, it is telling the zig compiler something along the lines of: “Hey! Can you infer the type of this value for me? Please!”. In other words, this dot character is playing a similar role as the auto keyword in C++.

I gave you some examples of this in Section 2.3.1, where we used anonymous struct literals. Anonymous struct literals are struct literals that use type inference to infer their exact type. This type inference is done by looking for some minimal hint of the correct data type to be used. You could say that the zig compiler looks for any neighbouring type annotation that might tell it what the correct type would be.
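The sketch below illustrates two such "neighbouring type annotations": the annotation of the object that receives the value, and the type of a function argument. The Point struct, the Color enum and the sum() function are hypothetical names created only for this example.

```zig
const std = @import("std");

const Point = struct { x: i32, y: i32 };
const Color = enum { red, green };

fn sum(p: Point) i32 {
    return p.x + p.y;
}

test "the dot asks the compiler to infer the type" {
    // Inferred from the type annotation of the object `c`.
    const c: Color = .green;
    try std.testing.expect(c == Color.green);
    // Inferred from the type of the `p` argument of sum().
    try std.testing.expect(sum(.{ .x = 1, .y = 2 }) == 3);
}
```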

Another common place where we use type inference in Zig is in switch statements (which we talked about in Section 2.1.2). I also gave some other examples of type inference in Section 2.1.2, where we were inferring the data types of enum values listed inside of switch statements (e.g. .DE). But as another example, take a look at the fence() function reproduced below, which comes from the atomic.zig module (see footnote 2) of the Zig Standard Library.

There are a lot of things in this function that we haven’t talked about yet, such as: what comptime means? inline? extern? Let’s just ignore all of these things, and focus solely on the switch statement that is inside this function.

We can see that this switch statement uses the order object as input. This order object is one of the inputs of this fence() function, and we can see in the type annotation that this object is of type AtomicOrder. We can also see a bunch of values inside the switch statement that begin with a dot character, such as .release and .acquire.

Because these weird values contain a dot character before them, we are asking the zig compiler to infer the types of these values inside the switch statement. Then, the zig compiler looks into the current context where these values are being used, and tries to infer their types.

Since they are being used inside a switch statement, the zig compiler looks into the type of the input object given to the switch statement, which is the order object in this case. Because this object has type AtomicOrder, the zig compiler infers that these values are members of the AtomicOrder type.

pub inline fn fence(self: *Self, comptime order: AtomicOrder) void {
    // many lines of code ...
    if (builtin.sanitize_thread) {
        const tsan = struct {
            extern "c" fn __tsan_acquire(addr: *anyopaque) void;
            extern "c" fn __tsan_release(addr: *anyopaque) void;
        };

        const addr: *anyopaque = self;
        return switch (order) {
            .unordered, .monotonic => @compileError(
                @tagName(order)
                ++ " only applies to atomic loads and stores"
            ),
            .acquire => tsan.__tsan_acquire(addr),
            .release => tsan.__tsan_release(addr),
            .acq_rel, .seq_cst => {
                tsan.__tsan_acquire(addr);
                tsan.__tsan_release(addr);
            },
        };
    }

    return @fence(order);
}

This is how basic type inference is done in Zig. If we didn’t use the dot character before the values inside this switch statement, then, we would be forced to explicitly write the data types of these values. For example, instead of writing .release we would have to write AtomicOrder.release. We would have to do this for every single value in this switch statement, and this is a lot of work. That is why type inference is commonly used on switch statements in Zig.
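As a smaller, self-contained sketch of the same idea, the example below mixes both forms inside one switch statement. The Order enum and the shortName() function are hypothetical stand-ins for AtomicOrder and fence(), and are not part of the Zig Standard Library.

```zig
const std = @import("std");

// Hypothetical enum, used here in place of AtomicOrder.
const Order = enum { acquire, release, seq_cst };

fn shortName(order: Order) []const u8 {
    return switch (order) {
        // The type is inferred from the type of `order`.
        .acquire => "acq",
        // The fully qualified form means exactly the same thing.
        Order.release => "rel",
        .seq_cst => "sc",
    };
}

test "inferred and explicit enum values are interchangeable" {
    try std.testing.expectEqualStrings("rel", shortName(.release));
    try std.testing.expectEqualStrings("acq", shortName(Order.acquire));
}
```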

2.5 Type casting

In this section, I want to discuss type casting (or, type conversion) with you. We use type casting when we have an object of type “x”, and we want to convert it into an object of type “y”, i.e. we want to change the data type of the object.

Most languages have a formal way to perform type casting. In Rust for example, we normally use the keyword as, and in C, we normally use the type casting syntax, e.g. (int) x. In Zig, we use the @as() built-in function to cast an object of type “x”, into an object of type “y”.

This @as() function is the preferred way to perform type conversion (or type casting) in Zig, because it is explicit, and it only performs the cast if it is unambiguous and safe. To use this function, you just provide the target data type in the first argument, and the object that you want to cast as the second argument.

const std = @import("std");
const expect = std.testing.expect;
test {
    const x: usize = 500;
    const y = @as(u32, x);
    try expect(@TypeOf(y) == u32);
}
1/1 file10b91234368fa.test_0...OK
All 1 tests passed.
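To illustrate what "unambiguous and safe" can mean in practice, here is a sketch that widens a u32 value that is only known at runtime into a u64 with @as(). Every u32 value fits into a u64, so the compiler accepts this cast. For the opposite direction, the sketch uses the @intCast() built-in function, which is not discussed in this section and is included here only as a point of comparison. The widen() and narrow() functions are hypothetical names.

```zig
const std = @import("std");

// Widening: every u32 value fits in a u64, so @as() is accepted.
fn widen(x: u32) u64 {
    return @as(u64, x);
}

// Narrowing a runtime value: a u64 might not fit in a u32,
// so a specialized built-in is used instead of @as().
// The target type is inferred from the return type.
fn narrow(x: u64) u32 {
    return @intCast(x);
}

test "widening with @as and narrowing with @intCast" {
    try std.testing.expect(widen(500) == 500);
    try std.testing.expect(@TypeOf(widen(500)) == u64);
    try std.testing.expect(narrow(500) == 500);
}
```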

This is the general way to perform type casting in Zig. But remember, @as() works only when casting is unambiguous and safe, and there are situations where these assumptions do not hold. For example, when casting an integer value into a float value, or vice-versa, it is not clear to the compiler how to perform this conversion safely.

Therefore, we need to use specialized “casting functions” in such situations. For example, if you want to cast an integer value into a float value, then, you should use the @floatFromInt() function. In the inverse scenario, you should use the @intFromFloat() function.

In these functions, you just provide the object that you want to cast as input. Then, the target data type of the “type casting operation” is determined by the type annotation of the object where you are saving the results. In the example below, we are casting the object x into a value of type f32, because the object y, which is where we are saving the results, is annotated as an object of type f32.

const std = @import("std");
const expect = std.testing.expect;
test {
    const x: usize = 565;
    const y: f32 = @floatFromInt(x);
    try expect(@TypeOf(y) == f32);
}
1/1 file10b9130ae9221.test_0...OK
All 1 tests passed.
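The inverse scenario can be sketched in the same way. In the example below, @intFromFloat() keeps only the integer part of the float value, and the target type (i32) is taken from the return type annotation of the function. The toInt() function is a hypothetical helper created only for this example.

```zig
const std = @import("std");

// The target type of the cast is inferred from the return type.
fn toInt(x: f32) i32 {
    return @intFromFloat(x);
}

test "@intFromFloat keeps the integer part" {
    try std.testing.expect(toInt(3.99) == 3);
    try std.testing.expect(toInt(-3.99) == -3);
}
```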

Another built-in function that is very useful when performing type casting operations is @ptrCast(). In essence, we use the @as() built-in function when we want to explicitly convert (or cast) a Zig value/object from a type “x” to a type “y”. However, pointers (we are going to discuss pointers in more depth in Chapter 6) are a special type of object in Zig, i.e. they are treated differently from “normal objects”.

Every time a pointer is involved in some “type casting operation” in Zig, the @ptrCast() function is used. This function works similarly to @floatFromInt(). You just provide the pointer object that you want to cast as input to this function, and the target data type is, once again, determined by the type annotation of the object where the results are being stored.

const std = @import("std");
const expect = std.testing.expect;
test {
    const bytes align(@alignOf(u32)) = [_]u8{
        0x12, 0x12, 0x12, 0x12
    };
    const u32_ptr: *const u32 = @ptrCast(&bytes);
    try expect(@TypeOf(u32_ptr) == *const u32);
}
1/1 file10b914c434861.test_0...OK
All 1 tests passed.
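To see what the casted pointer gives us, the sketch below dereferences it. Because all four bytes have the same value (0x12), the resulting u32 value is 0x12121212 regardless of the byte order of the machine. The readAsU32() function is a hypothetical helper created only for this example.

```zig
const std = @import("std");

// Reinterprets four properly aligned bytes as a single u32 value.
fn readAsU32(ptr: *align(@alignOf(u32)) const [4]u8) u32 {
    const u32_ptr: *const u32 = @ptrCast(ptr);
    return u32_ptr.*;
}

test "reading through the casted pointer" {
    const bytes align(@alignOf(u32)) = [_]u8{
        0x12, 0x12, 0x12, 0x12,
    };
    try std.testing.expect(readAsU32(&bytes) == 0x12121212);
}
```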

2.6 Modules

We already talked about what modules are, and also, how to import other modules into your current module via import statements. Every Zig module (i.e. a .zig file) that you write in your project is internally stored as a struct object. Take the line shown below as an example. In this line we are importing the Zig Standard Library into our current module.

const std = @import("std");

When we want to access the functions and objects from the standard library, we are basically accessing the data members of the struct stored in the std object. That is why we use the same syntax that we use in normal structs, with the dot operator (.) to access the data members and methods of the struct.

When this “import statement” gets executed, the result of this expression is a struct object that contains the Zig Standard Library modules, global variables, functions, etc. And this struct object gets saved (or stored) inside the constant object named std.

Take the thread_pool.zig module from the project zap (see footnote 3) as an example. This module is written as if it was a big struct. That is why we have a top-level and public init() method written in this module. The idea is that all top-level functions written in this module are methods from the struct, and all top-level objects and struct declarations are data members of this struct. The module is the struct itself.

So you would import and use this module by doing something like this:

const std = @import("std");
const ThreadPool = @import("thread_pool.zig");
const num_cpus = std.Thread.getCpuCount()
    catch @panic("failed to get cpu core count");
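// std.math.cast() returns an optional in recent Zig versions,
// so the fallback value is provided with `orelse`.
const num_threads = std.math.cast(u16, num_cpus)
    orelse std.math.maxInt(u16);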
const pool = ThreadPool.init(
    .{ .max_threads = num_threads }
);
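To make the "module is a struct" idea more concrete, here is a small sketch. The counter struct below is a hypothetical stand-in for what a file named counter.zig would look like after being imported: its top-level declarations become members that we reach with the dot operator. The test also checks that the value returned by @import() is a type, just like a struct declaration.

```zig
const std = @import("std");

// Hypothetical stand-in for the contents of a `counter.zig` file.
const counter = struct {
    pub const start: u32 = 10;

    pub fn next(value: u32) u32 {
        return value + 1;
    }
};

test "a module is used like a struct" {
    // Both the imported module and the struct declaration are types.
    try std.testing.expect(@TypeOf(std) == type);
    try std.testing.expect(@TypeOf(counter) == type);
    // Members are accessed with the dot operator in both cases.
    try std.testing.expect(counter.next(counter.start) == 11);
}
```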

  1. https://github.com/ziglang/zig/blob/master/lib/std/debug.zig

  2. https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig

  3. https://github.com/kprotty/zap/blob/blog/src/thread_pool.zig