by Arpit Kumar
05 Jun, 2023
8 minute read
What’s a Memory Allocator anyway?


I recently came across an interesting talk by Benjamin Feng about memory allocators. It reminded me of something I rarely think about these days, since my recent work mostly involves languages like Java and Golang, which handle memory automatically through garbage collection (GC), so we don’t have to worry much about managing memory manually. After sharing the video with my colleagues, I realized that a blog post about the basics of memory management and memory allocators could be helpful for me and for others who want to build high-performance software.

What is a memory allocator?

A memory allocator is a component of a programming language or runtime system that is responsible for managing the allocation and deallocation of memory during program execution. It is particularly important in languages that allow dynamic memory allocation, where memory can be requested and released dynamically as needed.

The primary purpose of a memory allocator is to provide a mechanism for allocating memory blocks of various sizes to fulfill the memory requirements of a program. When a program requests memory, the allocator locates a suitable free block of memory and returns a pointer to that block. The program can then use the allocated memory for storing data.

Memory allocators typically work with a lower-level memory management system, such as the operating system’s memory manager or a heap manager, to obtain larger blocks of memory from the system and divide them into smaller allocations requested by the program. The allocator keeps track of which parts of the allocated memory are in use and which parts are free or available for reuse.

In some languages, memory allocation and deallocation are explicitly managed by the programmer, using functions like malloc() and free() in C or new and delete in C++. However, many modern programming languages, such as Java, Python, and C#, provide automatic memory management through a process called garbage collection. In these languages, the memory allocator works closely with the garbage collector to automatically reclaim memory that is no longer needed by the program.

Memory allocators can use different strategies for managing memory, such as first-fit, best-fit, or buddy allocation. These strategies determine how the allocator searches for free memory blocks and selects the most appropriate one for a given allocation request. The choice of allocator can have a significant impact on a program’s performance, especially in memory-intensive applications.

Strategies for Memory Allocations

There are several different kinds of memory allocators, each with its own characteristics and purposes. Here are some commonly used memory allocators:

  1. Stack Allocator: The stack allocator manages memory using a stack data structure. It is typically used for local variables and function call frames. Memory allocation and deallocation on the stack is fast and deterministic, following a last-in, first-out (LIFO) approach.
  2. Heap Allocator: The heap allocator manages a larger pool of memory known as the heap. It allows for dynamic memory allocation and deallocation during program execution. Common interfaces include malloc and free in C, and new and delete in C++.
  3. Buddy Allocator: The buddy allocator divides memory into fixed-size blocks and allocates them based on power-of-two sizes. It satisfies allocation requests by splitting or merging blocks to provide the closest matching size. It is commonly used in operating systems and embedded systems.
  4. Pool Allocator: The pool allocator preallocates a fixed-size block of memory and divides it into smaller fixed-size blocks. It is useful when there is a known maximum number of allocations and deallocations. The pool allocator can provide faster memory management than the general-purpose heap allocator.
  5. Slab Allocator: The slab allocator manages memory in preallocated blocks called slabs, each typically spanning one or more pages. A slab is carved into equal-size slots for objects of a single type, so allocation and deallocation reduce to marking a slot used or free. It is commonly used in operating system kernels for frequently allocated objects, such as inodes or network buffers.
  6. Region-Based Allocator: The region-based allocator divides memory into contiguous regions, each with its own allocation pool. When a region is full, a new region is allocated, which reduces fragmentation. It is commonly used in garbage collectors and functional programming languages.
  7. Object Pool Allocator: The object pool allocator preallocates a fixed number of objects and provides a pool of available objects for allocation. It is useful when there is a frequent need for creating and destroying objects, as it reduces the overhead of memory allocation and deallocation.
  8. TCMalloc: TCMalloc (Thread-Caching Malloc) is a memory allocator developed by Google. It aims to reduce contention for global locks by providing per-thread memory caches. It is designed to improve performance in multithreaded applications.

These are just a few examples of memory allocators, and there are many other specialized allocators tailored for specific use cases or programming languages. The choice of allocator depends on factors such as performance requirements, memory usage patterns, and programming language conventions.

How languages determine when and how to free memory

Programming languages use several techniques to decide when memory can be reclaimed. Some of the commonly used approaches include:

  1. Manual Memory Management: Instead of relying on automatic garbage collection, programmers explicitly allocate and deallocate memory for objects. This requires careful tracking of memory allocations and deallocations, and can be error-prone if not managed properly. Languages like C and C++ rely on manual memory management.
  2. Reference Counting: This technique tracks the number of references to an object. Each time a reference is created or destroyed, the reference count is updated. When the reference count reaches zero, the object is deallocated. Reference counting can have overhead due to the constant bookkeeping, and it can also suffer from circular references, where objects reference each other and their reference counts never reach zero. Python uses a combination of reference counting and garbage collection.
  3. Automatic Reference Counting (ARC): ARC is similar to reference counting, but it aims to address some of the issues with reference counting by introducing additional techniques. ARC can dynamically determine when to insert retain and release operations to manage the reference counts. It can eliminate certain circular reference problems and reduce the overhead of reference counting. Apple’s Objective-C and Swift programming languages use ARC.
  4. Ownership Types: Ownership types provide a static analysis approach to memory management. The ownership type system tracks and enforces ownership relationships between objects, ensuring that memory is deallocated when it is no longer needed. Ownership types can eliminate certain memory-related bugs and provide compile-time guarantees, but they can also introduce complexity and require additional annotations or language features. The Rust programming language employs ownership types.

Each approach has its own trade-offs in terms of performance, complexity, and programmer productivity, and the choice of memory management technique depends on the specific requirements of the programming language or application.

I became intrigued by Zig after reading about TigerBeetle, a financial database written in Zig, and later came across a tweet from Mitchell Hashimoto in which he mentioned developing a new terminal emulator in Zig.

Zig provides flexibility for working with different memory allocators by allowing you to specify allocators explicitly when performing allocations. This allows you to choose the most appropriate allocator for your specific use case, balancing performance, memory usage, and safety.

It’s worth noting that Zig’s memory model emphasizes predictability and control over memory management. The language provides features like explicit allocator passing, control over uninitialized memory, and explicit deallocation (typically paired with defer), which can help prevent memory leaks and other memory-related issues.

When working with Zig, it’s important to consider the specific needs and constraints of your application to choose the most suitable allocator or even create custom allocators to optimize memory usage and performance.

Exploring Memory Allocators in Zig: Choosing the Right Tool for Efficient Memory Management

Here is an overview of some of the allocators available in Zig’s standard library (names as of roughly Zig 0.11; the API evolves between releases):

  1. std.heap.page_allocator: This allocator requests memory directly from the operating system, one or more whole pages at a time. It is appropriate for large allocations, but wasteful for small ones, since every allocation is rounded up to at least a full page.
  2. std.heap.FixedBufferAllocator: This allocator hands out memory from a fixed buffer you provide, which can live on the stack. Allocation is a simple pointer bump, making it highly efficient for small, short-lived allocations. Once the buffer is exhausted, further allocations fail, so it must be sized carefully.
  3. std.heap.ArenaAllocator: The arena allocator wraps another allocator and serves allocations from contiguous chunks. Individual frees are no-ops; the entire arena is released at once when it is deinitialized. This reduces fragmentation, improves cache locality, and fits workloads with many small allocations that share a lifetime, such as parsers.
  4. std.heap.GeneralPurposeAllocator: A general-purpose allocator intended as a sensible default. In debug builds it adds safety checks, such as detecting leaks, double frees, and use-after-free, at some performance cost.
  5. Custom allocators: Zig allocators are ordinary values implementing the std.mem.Allocator interface, so programmers can design allocators tailored to specific requirements, such as specialized memory management strategies or integration with external memory systems.

These allocators offer a range of options for memory management in Zig, allowing developers to choose the most appropriate allocator for their specific use cases based on factors like allocation size, performance requirements, and fragmentation considerations.
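To show how allocators are passed explicitly, here is a short sketch (API as of roughly Zig 0.11; details shift between releases):

```zig
const std = @import("std");

pub fn main() !void {
    // A general-purpose allocator with safety checks in debug builds.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit(); // reports leaks in debug builds

    const allocator = gpa.allocator();

    // Every allocation names its allocator explicitly.
    const buf = try allocator.alloc(u8, 64);
    defer allocator.free(buf);

    // An arena layered on top: free everything in one shot.
    var arena = std.heap.ArenaAllocator.init(allocator);
    defer arena.deinit(); // releases all arena allocations at once

    const a = arena.allocator();
    _ = try a.alloc(u8, 16);
    _ = try a.alloc(u8, 32); // no individual frees needed
}
```

Because the allocator is an ordinary parameter rather than a global, swapping the arena for a FixedBufferAllocator or a custom implementation changes nothing else in the calling code.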
