Zig made it easy to pass strings back and forth with WebAssembly

Ronald Chen October 31st 2022

We use strings all the time in JavaScript. When we want to read/write strings with WebAssembly, we have a problem.

WebAssembly function parameter and return value types only support 32/64-bit integers/floats. WebAssembly has no concept of a string. Worse, WebAssembly doesn't even have the concept of an array.

What WebAssembly does have is a linear memory model. One can think of WebAssembly's memory as one large Uint8Array that can only grow in size.
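
A minimal sketch of that mental model, assuming a standalone WebAssembly.Memory rather than one exported from a compiled module (the page counts are arbitrary). Memory is measured in 64 KiB pages, and growing it detaches the old ArrayBuffer, so any Uint8Array view has to be recreated afterwards.

```javascript
// Standalone memory for illustration; in a real app it would come from the module's exports.
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

const before = new Uint8Array(memory.buffer);
before[0] = 42; // write a byte at index 0

const previousPages = memory.grow(1); // returns the page count before growing

// The old buffer is now detached, so `before` is empty; create a fresh view.
const after = new Uint8Array(memory.buffer);
console.log(previousPages, before.length, after.length, after[0]);
```

The data written before growing is still there at the same index; only the JavaScript view needed to be replaced.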

Back in JavaScript land, we can already read and write strings to a Uint8Array with TextEncoder and TextDecoder.
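
As a quick illustration, here is a round trip through both classes. The sample string is made up; note that the byte length and the string length differ once non-ASCII characters are involved, which matters later when lengths are passed across the boundary.

```javascript
const original = "h\u00e9llo"; // "héllo"
const bytes = new TextEncoder().encode(original); // Uint8Array of UTF-8 bytes
const roundTripped = new TextDecoder().decode(bytes); // back to a JavaScript string

// "é" takes two bytes in UTF-8, so there are more bytes than characters.
console.log(original.length, bytes.length, roundTripped === original);
```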

Putting the pieces together

Neat, if we encode/decode strings from WebAssembly's memory, we can pass the index and length via functions.

Sending a string from WebAssembly to JavaScript

  1. WebAssembly encodes a string to memory as UTF-8
  2. WebAssembly sends the index and length of the string to JavaScript
  3. JavaScript uses TextDecoder to read the string out of memory at the given index and length

Zig makes this easy as it already represents strings as arrays of UTF-8 bytes, and when Zig is compiled to WebAssembly, Zig's memory is the same as WebAssembly's memory. Furthermore, Zig keeps track of strings as slices. A slice is simply a pointer to a thing plus a length. A pointer to something in Zig's memory is the same as an index into WebAssembly's memory!

String from Zig to JavaScript

Zig defines a string and sends it to JavaScript:

extern "app" fn ask(pointer: [*]const u8, length: u32) [*:0]u8;

fn sendQuestion() void {
  const question: []const u8 = "What is your name?";
  const name_pointer: [*:0]u8 = ask(question.ptr, question.len);
  ...
}

Aside: [*]const u8 is a many-item pointer: it points to an unknown number of unmodifiable u8 values. [*]const u8 is compiled down to a plain pointer. Pointers in WebAssembly are just indices into memory, which are u32. Therefore, in this context [*]const u8 is the same as u32. [*:0]u8 will be explained later.

Aside: Where is the question string allocated? Why don't we need to use an allocator? For strings known at compile time, Zig statically allocates the bytes somewhere in memory for us. This is why question.ptr and question.len are well-defined.

JavaScript receives the pointer and length, then decodes it out of memory:

... in import object
app: {
  ask(pointer, length) {
    const question = decodeString(pointer, length);
    const name = window.prompt(question) ?? "Human";
    return encodeString(name);
  }
},
...

const decodeString = (pointer, length) => {
  const slice = new Uint8Array(
    memory.buffer, // memory exported from Zig
    pointer,
    length
  );
  return new TextDecoder().decode(slice);
};
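
To try decodeString without compiling any Zig, a standalone WebAssembly.Memory can stand in for the exported one. This is only a sketch: the address 16 is a hypothetical location, and the bytes are written from JavaScript to play the role of Zig's statically allocated string.

```javascript
// Stand-in for the memory a Zig module would export.
const memory = new WebAssembly.Memory({ initial: 1 });

const decodeString = (pointer, length) => {
  const slice = new Uint8Array(memory.buffer, pointer, length);
  return new TextDecoder().decode(slice);
};

// Pretend to be Zig: place UTF-8 bytes at an arbitrary index.
const pointer = 16; // hypothetical address
const bytes = new TextEncoder().encode("What is your name?");
new Uint8Array(memory.buffer).set(bytes, pointer);

// The length passed across the boundary is a byte count, not a character count.
const question = decodeString(pointer, bytes.length);
console.log(question);
```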

What about encodeString?

String from JavaScript to Zig

When we want to encode a string to WebAssembly memory, we run into another problem. Where in memory should we encode the string? We can't just write anywhere into memory, as Zig might already have data there.

What we need to do is ask Zig for space. We need to tell Zig to allocate a segment of memory for us.

Sending a string from JavaScript to Zig

  1. JavaScript uses TextEncoder to write the string to a temporary array
  2. JavaScript asks Zig to allocate memory of the size of the array
  3. Zig allocates memory and returns an index
  4. JavaScript writes the array into memory at the given index
  5. JavaScript returns the index of the string to Zig
  6. Zig reads a string out of memory with the given pointer

For step 2, we need a function that allocates into WebAssembly memory. To do this, in Zig, we wrap alloc and export it:

const std = @import("std");

export fn allocUint8(length: u32) [*]const u8 {
    const slice = std.heap.page_allocator.alloc(u8, length) catch
        @panic("failed to allocate memory");
    return slice.ptr;
}

Aside: std.heap.page_allocator maps directly to WebAssembly memory.

Aside: Why is the function named allocUint8? While it can allocate space for a string, it can be used more generally to allocate any run of u8.

JavaScript encodes a string to memory and sends a pointer to Zig:

... in import object
app: {
  ask(pointer, length) {
    const question = decodeString(pointer, length);
    const name = window.prompt(question) ?? "Human";
    return encodeString(name);
  }
},
...

const encodeString = (string) => {
  const buffer = new TextEncoder().encode(string);
  const pointer = allocUint8(buffer.length + 1); // ask Zig to allocate memory
  const slice = new Uint8Array(
    memory.buffer, // memory exported from Zig
    pointer,
    buffer.length + 1
  );
  slice.set(buffer);
  slice[buffer.length] = 0; // null byte to null-terminate the string
  return pointer;
};
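
To exercise encodeString in isolation, the sketch below swaps Zig's exported allocator for a hypothetical bump allocator written in JavaScript. It never frees and starts at an arbitrary offset, so it is only a stand-in for experimenting with the byte layout, not a replacement for allocUint8 from the compiled module.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical stand-in for Zig's exported allocUint8: a bump allocator that never frees.
let nextFree = 1024; // arbitrary starting offset
const allocUint8 = (length) => {
  const pointer = nextFree;
  nextFree += length;
  return pointer;
};

const encodeString = (string) => {
  const buffer = new TextEncoder().encode(string);
  const pointer = allocUint8(buffer.length + 1);
  // Create the view only after allocating: growing memory detaches older views.
  const slice = new Uint8Array(memory.buffer, pointer, buffer.length + 1);
  slice.set(buffer);
  slice[buffer.length] = 0; // null terminator
  return pointer;
};

const first = encodeString("Ada");
const second = encodeString("Grace");
const stored = Array.from(new Uint8Array(memory.buffer, first, 4));
console.log(first, second, stored);
```

Each string occupies its UTF-8 bytes plus one trailing zero byte, which is why the second pointer lands four bytes after the first.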

Why are we null-terminating the string? Two reasons.

First, string literals in Zig are both null-terminated and length-tracked. This allows Zig to easily interop with APIs that expect null-terminated strings while, at the same time, not requiring an O(N) scan to figure out the length of a string.

Second, WebAssembly multi-value return is not implemented yet. This means we can't return [pointer, buffer.length] in JavaScript. We can only return a single value, and a null-terminated string has an implicit length.

In Zig, we take the pointer and convert it back to a normal string:

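fn sendQuestion() void {
  const question: []const u8 = "What is your name?";
  const name_pointer: [*:0]u8 = ask(question.ptr, question.len);
  // std.mem.span finds the null byte and returns a sentinel-terminated slice
  const name: [:0]u8 = std.mem.span(name_pointer);
  // free expects a slice rather than a bare pointer, so free the spanned slice
  defer std.heap.page_allocator.free(name);
  ... use name
}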

Aside: We can now understand [*:0]u8 to mean a null-terminated many-item pointer to u8: the length isn't stored, but it can be found by scanning for the null byte.

Since the memory was allocated at the request of the JavaScript side, we need to deallocate it on the Zig side.

std.mem.span scans the memory starting at the pointer until it finds a null byte and returns a normal string.
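
For comparison, the same scan can be sketched in JavaScript. The helper name decodeNullTerminated is hypothetical, and the memory is a standalone stand-in for the one exported from Zig; the point is only to show the O(N) walk that a null-terminated string implies.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical helper: decode a null-terminated UTF-8 string starting at `pointer`.
const decodeNullTerminated = (pointer) => {
  const bytes = new Uint8Array(memory.buffer);
  let end = pointer;
  // O(N) scan for the terminating null byte
  while (end < bytes.length && bytes[end] !== 0) end += 1;
  return new TextDecoder().decode(bytes.subarray(pointer, end));
};

// "Hi" followed by a null byte, then an unrelated byte that must be ignored.
new Uint8Array(memory.buffer).set([72, 105, 0, 33], 8);
const text = decodeNullTerminated(8);
console.log(text);
```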

For a complete example see https://github.com/Pyrolistical/zig-wasm-canvas [demo]

Do you want to learn the latest web tech while building esports? You're in luck, Battlefy is hiring.
