A History of malloc
I was reading up on malloc. I had naively assumed malloc was a system call, it's not. Under the covers malloc uses brk or sbrk to request memory from the kernel. However, malloc does not always have to use brk/sbrk, if it has memory in its "free list" that can fulfill the request then no system call is needed. So you do not always have to pay the price of a system call when you use malloc.
Another interesting thing about malloc is that it cannot return memory to the kernel unless the memory that is freed is at the top of the heap. If you malloc some memory at the start of your program, malloc some more later and then free the original memory, malloc/free cannot return the original chunk of memory to the kernel until the second piece of memory is freed.
The first malloc was written in "The Old Testament" - K&R, it's about 200 lines of code. They managed the free list by using a union with the memory that was actually stored in the free list - this saved space, an important requirement when the amount of memory available was very limited.
Poul-Henning Kamp re-wrote malloc for FreeBSB 2.2 and documented it in Malloc(3) Revisited, this malloc is known as pkmalloc. By this time systems where using virtual memory, this meant that in the K&R approach a chunk of memory on the free list could be paged out to disk, now the free list was embedded in these chunks of memory so when malloc came to look for memory on the free list it would have to page all this memory back in, killing performance!! Kamp's version of malloc was 1136 lines of code long and had a good reputation for performance.
Then came fast multi-processor machines with large memory, and another re-write of malloc. Jason Evans re-wrote malloc for FreeBSB and his version is known as jemalloc, he wrote about it in A Scalable Concurrent Malloc(3) Implementation for FreeBSD. Now the issues are less about paging to disk but fast locks and worrying about NUMA issues (trying to allocate memory close to the CPU that you think will be using it). Firefox are attempting to use jemalloc internally for their memory management.
There are quite a few malloc implementations out there, Google have one called tcmalloc in their perftools bundle. It's very easy to swap the malloc your code uses, all you have to do is link against the library with the new malloc. Though this can sometimes lead to trouble :-)
Another interesting thing about malloc is that it cannot return memory to the kernel unless the memory that is freed is at the top of the heap. If you malloc some memory at the start of your program, malloc some more later and then free the original memory, malloc/free cannot return the original chunk of memory to the kernel until the second piece of memory is freed.
The first malloc was written in "The Old Testament" - K&R, it's about 200 lines of code. They managed the free list by using a union with the memory that was actually stored in the free list - this saved space, an important requirement when the amount of memory available was very limited.
Poul-Henning Kamp re-wrote malloc for FreeBSB 2.2 and documented it in Malloc(3) Revisited, this malloc is known as pkmalloc. By this time systems where using virtual memory, this meant that in the K&R approach a chunk of memory on the free list could be paged out to disk, now the free list was embedded in these chunks of memory so when malloc came to look for memory on the free list it would have to page all this memory back in, killing performance!! Kamp's version of malloc was 1136 lines of code long and had a good reputation for performance.
Then came fast multi-processor machines with large memory, and another re-write of malloc. Jason Evans re-wrote malloc for FreeBSB and his version is known as jemalloc, he wrote about it in A Scalable Concurrent Malloc(3) Implementation for FreeBSD. Now the issues are less about paging to disk but fast locks and worrying about NUMA issues (trying to allocate memory close to the CPU that you think will be using it). Firefox are attempting to use jemalloc internally for their memory management.
There are quite a few malloc implementations out there, Google have one called tcmalloc in their perftools bundle. It's very easy to swap the malloc your code uses, all you have to do is link against the library with the new malloc. Though this can sometimes lead to trouble :-)
0 Comments:
Post a Comment
<< Home