6 Pointers and Optionals

On our next project we are going to build a HTTP server from scratch. But in order to do that, we need to learn more about pointers and how they work in Zig. Pointers in Zig are similar to pointers in C. But they come with some extra advantages in Zig.

A pointer is an object that contains a memory address. This memory address is the address where a particular value is stored in memory. It can be any value. Most of the times, it’s a value that comes from another object (or variable) present in our code.

In the example below, I’m creating two objects (number and pointer). The pointer object contains the memory address where the value of the number object (the number 5) is stored. So, that is a pointer in a nutshell. It’s a memory address that points to a particular existing value in the memory. You could also say, that, the pointer object points to the memory address where the number object is stored.

const number: u8 = 5;
const pointer = &number;
_ = pointer;

We create a pointer object in Zig by using the & operator. When you put this operator before the name of an existing object, you get the memory address of this object as result. When you store this memory address inside a new object, this new object becomes a pointer object. Because it stores a memory address.

People mostly use pointers as an alternative way to access a particular value. For example, I can use the pointer object to access the value stored by the number object. This operation of accessing the value that the pointer “points to” is normally called of dereferencing the pointer. We can dereference a pointer in Zig by using the * method of the pointer object. Like in the example below, where we take the number 5 pointed by the pointer object, and double it.

const number: u8 = 5;
const pointer = &number;
const doubled = 2 * pointer.*;
std.debug.print("{d}\n", .{doubled});

This syntax to dereference the pointer is nice. Because we can easily chain it with methods of the value pointed by the pointer. We can use the User struct that we have created in Section 2.3 as an example. If you comeback to that section, you will see that this struct have a method named print_name().

So, for example, if we have an user object, and a pointer that points to this user object, we can use the pointer to access this user object, and, at the same time, call the method print_name() on it, by chaining the dereference method (*) with the print_name() method. Like in the example below:

const u = User.init(1, "pedro", "email@gmail.com");
const pointer = &u;
try pointer.*.print_name();

pedro

We can also use pointers to effectively alter the value of an object. For example, I could use the pointer object to set the value of the object number to 6, like in the example below.

var number: u8 = 5;
const pointer = &number;
pointer.* = 6;
try stdout.print("{d}\n", .{number});

Therefore, as I mentioned earlier, people use pointers as an alternative way to access a particular value. And they use it especially when they do not want to “move” these values around. There are situations where, you want to access a particular value in a different scope (i.e., a different location) of your code, but you do not want to “move” this value to this new scope (or location) that you are in.

This matters especially if this value is big in size. Because if it is, then, moving this value becomes an expensive operation to do. The computer will have to spend a considerable amount of time copying this value to this new location.

Therefore, many programmers prefer to avoid this heavy operation of copying the value to the new location, by accessing this value through pointers. We are going to talk more about this “moving operation” over the next sections. For now, just keep in mind that avoiding this “move operation” is one of main reasons why pointers are used in programming languages.

6.1 Constant objects vs variable objects

You can have a pointer that points to a constant object, or, a pointer that points to a variable object. But regardless of who this pointer is, a pointer must always respect the characteristics of the object that it points to. As a consequence, if the pointer points to a constant object, then, you cannot use this pointer to change the value that it points to. Because it points to a value that is constant. As we discussed in Section 1.4, you cannot change a value that is constant.

For example, if I have a number object, which is constant, I cannot execute the expression below where I’m trying to change the value of number to 6 through the pointer object. As demonstrated below, when you try to do something like that, you get a compile time error:

const number = 5;
const pointer = &number;
pointer.* = 6;

p.zig:6:12: error: cannot assign to constant
    pointer.* = 6;

If I change the number object to be a variable object, by introducing the var keyword, then, I can successfully change the value of this object through a pointer, as demonstrated below:

var number: u8 = 5;
const pointer = &number;
pointer.* = 6;
try stdout.print("{d}\n", .{number});

You can see this relationship between “constant versus variable” on the data type of your pointer object. In other words, the data type of a pointer object already gives you some clues about whether the value that it points to is constant or not.

When a pointer object points to a constant value, then, this pointer have a data type *const T, which means “a pointer to a constant value of type T”. In contrast, if the pointer points to a variable value, then, the type of the pointer is usually *T, which is simply “a pointer to a value of type T”. Hence, whenever you see a pointer object whose data type is in the format *const T, then, you know that you cannot use this pointer to change the value that it points to. Because this pointer points to a constant value of type T.

We have talked about the value pointed by the pointer being constant or not, and the consequences that arises from it. But, what about the pointer object itself? I mean, what happens if the pointer object itself is constant or not? Think about it. We can have a constant pointer that points to a constant value. But we can also have a variable pointer that points to a constant value. And vice-versa.

Until this point, the pointer object was always constant, but what does this mean for us? What is the consequence of the pointer object being constant? The consequence is that we cannot change the pointer object, because it is constant. We can use the pointer object in multiple ways, but we cannot change the memory address that is inside this pointer object.

However, if we mark the pointer object as a variable object, then, we can change the memory address pointed by this pointer object. The example below demonstrates that. Notice that the object pointed by the pointer object changes from c1 to c2.

const c1: u8 = 5;
const c2: u8 = 6;
var pointer = &c1;
try stdout.print("{d}\n", .{pointer.*});
pointer = &c2;
try stdout.print("{d}\n", .{pointer.*});

5
6

Thus, by setting the pointer object to a var or const object, you specify if the memory address contained in this pointer object can change or not in your program. On the other side, you can change the value pointed by the pointer, if, and only if this value is stored in a variable object. If this value is in a constant object, then, you cannot change this value through a pointer.

6.2 Types of pointer

In Zig, there are two types of pointers (Zig Software Foundation 2024), which are:

single-item pointer (*);
many-item pointer ([*]);

Single-item pointer objects are objects whose data types are in the format *T. So, for example, if an object have a data type *u32, it means that, this object contains a single-item pointer that points to an unsigned 32-bit integer value. As another example, if an object have type *User, then, it contains a single-item pointer to an User value.

In contrast, many-item pointers are objects whose data types are in the format [*]T. Notice that the star symbol (*) is now inside a pair of brackets ([]). If the star symbol is inside a pair of brackets, you know that this object is a many-item pointer.

When you apply the & operator over an object, you will always get a single-item pointer. Many-item pointers are more of a “internal type” of the language, more closely related to slices. So, when you deliberately create a pointer with the & operator, you always get a single-item pointer as result.

6.3 Pointer arithmetic

Pointer arithmetic is available in Zig, and they work the same way they work in C. When you have a pointer that points to an array, the pointer usually points to the first element in the array, and you can use pointer arithmetic to advance this pointer and access the other elements in the array.

Notice in the example below, that initially, the ptr object was pointing to the first element in the array ar. But then, I started to walk through the array, by advancing the pointer with simple pointer arithmetic.

const ar = [_]i32{ 1, 2, 3, 4 };
var ptr: [*]const i32 = &ar;
try stdout.print("{d}\n", .{ptr[0]});
ptr += 1;
try stdout.print("{d}\n", .{ptr[0]});
ptr += 1;
try stdout.print("{d}\n", .{ptr[0]});

1
2
3

Although you can create a pointer to an array like that, and start to walk through this array by using pointer arithmetic, in Zig, we prefer to use slices, which were presented in Section 1.6.

Behind the hood, slices already are pointers, and they also come with the len property, which indicates how many elements are in the slice. This is good because the zig compiler can use it to check for potential buffer overflows, and other problems like that.

Also, you don’t need to use pointer arithmetic to walk through the elements of a slice. You can simply use the slice[index] syntax to directly access any element you want in the slice. As I mentioned in Section 1.6, you can get a slice from an array by using a range selector inside brackets. In the example below, I’m creating a slice (sl) that covers the entire ar array. I can access any element of ar from this slice, and, the slice itself already is a pointer behind the hood.

const ar = [_]i32{1,2,3,4};
const sl = ar[0..ar.len];
_ = sl;

6.4 Optionals and Optional Pointers

Let’s talk about optionals and how they relate to pointers in Zig. By default, objects in Zig are non-nullable. This means that, in Zig, you can safely assume that any object in your source code is not null.

This is a powerful feature of Zig when you compare it to the developer experience in C. Because in C, any object can be null at any point, and, as consequence, a pointer in C might point to a null value. This is a common source of undefined behaviour in C. When programmers work with pointers in C, they have to constantly check if their pointers are pointing to null values or not.

In contrast, when working in Zig, if for some reason, your Zig code produces a null value somewhere, and, this null value ends up in an object that is non-nullable, a runtime error is always raised by your Zig program. Take the program below as an example. The zig compiler can see the null value at compile time, and, as result, it raises a compile time error. But, if a null value is raised during runtime, a runtime error is also raised by the Zig program, with a “attempt to use null value” message.

var number: u8 = 5;
number = null;

p5.zig:5:14: error: expected type 'u8',
        found '@TypeOf(null)'
    number = null;
             ^~~~

You don’t get this type of safety in C. In C, you don’t get warnings or errors about null values being produced in your program. If for some reason, your code produces a null value in C, most of the times, you end up getting a segmentation fault error as result, which can mean many things. That is why programmers have to constantly check for null values in C.

Pointers in Zig are also, by default, non-nullable. This is another amazing feature in Zig. So, you can safely assume that any pointer that you create in your Zig code is pointing to a non-null value. Therefore, you don’t have this heavy work of checking if the pointers you create in Zig are pointing to a null value.

6.4.1 What are optionals?

Ok, we know now that all objects are non-nullable by default in Zig. But what if we actually need to use an object that might receive a null value? Here is where optionals come in.

An optional object in Zig is rather similar to a std::optional object in C++. It is an object that can either contain a value, or nothing at all (a.k.a. the object can be null). To mark an object in our Zig code as “optional”, we use the ? operator. When you put this ? operator right before the data type of an object, you transform this data type into an optional data type, and the object becomes an optional object.

Take the snippet below as an example. We are creating a new variable object called num. This object have the data type ?i32, which means that, this object contains either a signed 32-bit integer (i32), or, a null value. Both alternatives are valid values to the num object. That is why, I can actually change the value of this object to null, and, no errors are raised by the zig compiler, as demonstrated below:

var num: ?i32 = 5;
num = null;

6.4.2 Optional pointers

You can also mark a pointer object as an optional pointer, meaning that, this object contains either a null value, or, a pointer that points to a value. When you mark a pointer as optional, the data type of this pointer object becomes ?*const T or ?*T, depending if the value pointed by the pointer is a constant value or not. The ? identifies the object as optional, while the * identifies it as a pointer object.

In the example below, we are creating a variable object named num, and an optional pointer object named ptr. Notice that the data type of the object ptr indicates that it’s either a null value, or a pointer to an i32 value. Also, notice that the pointer object (ptr) can be marked as optional, even if the object num is not optional.

What this code tells us is that, the num variable will never contain a null value. This variable will always contain a valid i32 value. But in contrast, the ptr object might contain either a null value, or, a pointer to an i32 value.

var num: i32 = 5;
var ptr: ?*i32 = &num;
ptr = null;
num = 6;

But what happens if we turn the table, and mark the num object as optional, instead of the pointer object. If we do that, then, the pointer object is not optional anymore. It would be a similar (although different) result. Because then, we would have a pointer to an optional value. In other words, a pointer to a value that is either a null value, or, a not-null value.

In the example below, we are recreating this idea. Now, the ptr object have a data type of *?i32, instead of ?*i32. Notice that the * symbol comes before of ? this time. So now, we have a pointer that points to a value that is either null , or, a signed 32-bit integer.

var num: ?i32 = 5;
// ptr have type `*?i32`, instead of `?*i32`.
const ptr = &num;
_ = ptr;

6.4.3 Null handling in optionals

When you have an optional object in your Zig code, you have to explicitly handle the possibility of this object being null. It’s like error-handling with try and catch. In Zig you also have to handle null values like if they were a type of error.

We can do that, by using either:

an if statement, like you would do in C.
the orelse keyword.
unwrap the optional value with the ? method.

When you use an if statement, you use a pair of pipes to unwrap the optional value, and use this “unwrapped object” inside the if block. Using the example below as a reference, if the object num is null, then, the code inside the if statement is not executed. Otherwise, the if statement will unwrap the object num into the not_null_num object. This not_null_num object is guaranteed to be not null inside the scope of the if statement.

const num: ?i32 = 5;
if (num) |not_null_num| {
    try stdout.print("{d}\n", .{not_null_num});
}

Now, the orelse keyword behaves like a binary operator. You connect two expressions with this keyword. On the left side of orelse, you provide the expression that might result in a null value, and on the right side of orelse, you provide another expression that will not result in a null value.

The idea behind the orelse keyword is: if the expression on the left side result in a not-null value, then, this not-null value is used. However, if this expression on the left side result in a null value, then, the value of the expression on the right side is used instead.

Looking at the example below, since the x object is currently null, the orelse decided to use the alternative value, which is the number 15.

const x: ?i32 = null;
const dbl = (x orelse 15) * 2;
try stdout.print("{d}\n", .{dbl});

You can use the if statement or the orelse keyword, when you want to solve (or deal with) this null value. However, if there is no clear solution to this null value, and the most logic and sane path is to simply panic and raise a loud error in your program when this null value is encountered, you can use the ? method of your optional object.

In essence, when you use this ? method, the optional object is unwrapped. If a not-null value is found in the optional object, then, this not-null value is used. Otherwise, the unreachable keyword is used. You can read more about this unreacheable keyword at the official documentation ¹. But in essence, when you build your Zig source code using the build modes ReleaseSafe or Debug, this unreacheable keyword causes the program to panic and raise an error during runtime, like in the example below:

const std = @import("std");
const stdout = std.io.getStdOut().writer();
fn return_null(n: i32) ?i32 {
    if (n == 5) return null;
    return n;
}

pub fn main() !void {
    const x: i32 = 5;
    const y: ?i32 = return_null(x);
    try stdout.print("{d}\n", .{y.?});
}

thread 12767 panic: attempt to use null value
p7.zig:12:34: 0x103419d in main (p7):
    try stdout.print("{d}\n", .{y.?});
                                 ^

https://ziglang.org/documentation/master/#unreachable.↩︎