Memory consumption of tinymail

Previous (older) memory reports

Memory report of before March 2007

Some explanations

Architecture used while creating this report

The architecture used was a typical x86 machine (with a 32-bit word size). This means that a pointer consumes 4 bytes, or 8 nibbles, or 32 bits. For non-C developers: a memory address is 4 bytes in size, so pointing to such a memory address requires 4 bytes per instance. A good example is pointing to a string of characters, like the subject of your messages. Not only does this use memory (well, mmap-ed memory, but more on this later) for the characters, it also uses memory for the pointer to it and one character for ending the string.

I see you thinking: dude, that's five bytes. What are you trying to say? That five bytes are important? Ok, well. Calculate with me: we have the to, from, cc, subject and a bunch of other pointers to other things too. Multiply that (something like 20 such things) with 100,000 items in your Linux mailing list mailbox and multiply that with four bytes per pointer and one byte for the ending character. Is it still insignificant? On your desktop, yes and I agree. On your mobile phone too?
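The calculation above can be written out. This is only a back-of-the-envelope sketch using the rough figures from the text (about 20 pointed-to fields per item, 100,000 items, 4-byte pointers and a 1-byte terminator); the variable names are illustrative and the numbers are estimates, not measurements.

```python
# Rough per-mailbox overhead of pointers and string terminators,
# using the approximate figures from the text (not measured values).
POINTER_SIZE = 4        # bytes per pointer on a 32-bit machine
TERMINATOR_SIZE = 1     # one byte to end each string
FIELDS_PER_ITEM = 20    # to, from, cc, subject and other pointers (estimate)
ITEMS = 100_000         # messages in a large mailing list folder

overhead_bytes = ITEMS * FIELDS_PER_ITEM * (POINTER_SIZE + TERMINATOR_SIZE)
overhead_mib = overhead_bytes / (1024 * 1024)

print(f"{overhead_bytes} bytes, about {overhead_mib:.1f} MiB")
```

With these estimates the overhead is 10,000,000 bytes, roughly 9.5 MiB, before a single character of actual header text is counted.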

Now, before you answer: I agree that with time comes more physical RAM in those small devices of yours. Eventually it won't be significant anymore. But let's develop for today, shall we? In my personal opinion it's sometimes good, for quantitative things, to really know where each and every byte is going per instance. It's not that our human brain is too small to grasp that simple knowledge, right?

Alignment for mmap

Because some architectures don't like non-aligned access to mmap-ed memory regions, all mmap-ed data is also data-padded to the word-size. This means that on a 32bit computer, it's data padded to 4 bytes. To put it in another way: depending on the length of a string, padding bytes will be added until there's an alignment on four bytes.

This doesn't influence heap allocations. It does, however, affect the mmap size. More on mmap below. Also read the section about compressed filesystems (like the typical flash filesystems LogFS and jffs2).
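The padding rule can be sketched as a small calculation. This assumes that the terminating byte is stored with the string and counted before rounding up, which the text does not spell out; padded_length is a hypothetical helper, not a tinymail function.

```python
WORD_SIZE = 4  # bytes; the 32-bit word size assumed in this report


def padded_length(string_length: int, word_size: int = WORD_SIZE) -> int:
    """Bytes used by a string plus its terminator, rounded up to a word boundary."""
    raw = string_length + 1  # +1 for the terminating byte (an assumption)
    return (raw + word_size - 1) // word_size * word_size


for subject in ("Hi", "Re:", "Hello"):
    used = padded_length(len(subject))
    print(f"{subject!r}: {len(subject)} characters, {used} bytes in the file")
```

Under that assumption a two-character and a three-character string both take 4 bytes, while a five-character string takes 8.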

Getting new summary information online

Getting summary information online means that the tool will get the summary of the mailbox to your local storage, by putting it somewhere in your temporary directory (usually in /tmp/tinymail.0).

Offline using the summary information

The summary information is all that is needed to display an overview of a mailbox on your screen. By an overview I mean the typical things that people want in one: things like the from, to, cc and subject headers, but also things like whether or not there are attachments and whether or not the message has been seen.

Allocated memory by the summary per folder

The larger the mailbox, the larger this summary obviously will be. Each such summary item consumes some memory. The tinymail project members make three things a top priority: the memory consumption per item, a good data alignment of each item's memory, and making the combination of the two perform fast enough.

Obviously, the larger the summary, the more items there are and the more memory will be used. Although we have mmap to get some of that memory back, each item needs some administrative data in memory too, for example pointers to the mmap-ed memory regions. Pointers do consume memory. Quite a lot if we think quantitatively.

Mapped memory, mmap, by the summary per folder

But lucky we are to have the mmap() syscall of POSIX operating systems. The mmap syscall on most kernels (which you can also interpret as "operating systems") does (more or less) exactly the same as what swap partitions or swap files do (I wrote most, this doesn't necessarily have to be true). Except that with mmap you can explicitly tell the kernel that the area of memory is set read-only.

Given that the data is also available on the (slower) filesystem, it being read-only is very interesting to know for the kernel. This of course means that it doesn't have to care about writing the data back to the filesystem: it'll never change. Tinymail uses all its mmap things read-only.

We all know what happens with memory pages that aren't often used, right? When memory is under pressure, which often happens on small & mobile devices with little memory, the kernel will start making decisions: pages that haven't recently been accessed will be paged out of physical memory. The page will still be in the virtual address space, but that might as well mean that the page is only available on slower swap or in an mmap. This basically means that the page is, at that moment, effectively not in memory and not consuming any significant system resources (other than some in-kernel administration). The good news is that with a read-only mmap, since the kernel can assume that nothing has changed, it doesn't need to write anything. This means that no flash wear would take place (so the normal drawback of swap files on flash devices doesn't exist in this case) and no "swapping" phenomenon (having to constantly write lots of data to a slower device) either. Hence the reasons for read-only mmap()s.

An mmap, especially a read-only one, is also so-called demand paged. This means that unless a page is accessed, it won't be in physical memory. Of course tinymail-based applications will access the pages, or at least some or even most of them. But under memory pressure, which happens often on devices with little memory, the kernel will be much smarter than the application about which pages it should give priority.
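A read-only mapping is easy to try out. The sketch below uses Python's mmap module on a throwaway file rather than a real tinymail summary (the file contents are made up), and only shows that reads work while writes are refused.

```python
import mmap
import os
import tempfile

# Create a small stand-in file; a real summary file would be produced by tinymail.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"Subject: hello\0" * 1000)

with open(path, "rb") as f:
    # ACCESS_READ asks for a read-only mapping of the whole file.
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

first = view[:14]            # touching the data is what brings the page in
try:
    view[0:1] = b"X"         # any write attempt is refused
    write_refused = False
except TypeError:
    write_refused = True

view.close()
os.remove(path)
print(first, write_refused)
```

Because nothing can be written through the mapping, the kernel never has pages of it that need writing back, which is the property this section relies on.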

This is the primary reason why measuring the total virtual memory consumption of a tinymail-based application isn't really going to be interesting unless you do it on the real target device under real-life circumstances.

A desktop with huge amounts (gigabytes nowadays) of RAM will simply have all the pages in real memory. That's because the kernel is allowed to use your RAM. It would be foolish not to use it if you have it, right? Maybe you are paranoid about seeing a system resource monitor telling you that some application is using one percent more of the real RAM. The kernel, however, certainly isn't paranoid. It's in charge, and it knows much better than your system resource monitor tool what the best virtual memory setting is at that particular moment in time.

Maybe you as an intelligent being know better. In that case you should join a kernel development team and implement a better virtual memory layer for that kernel.

Other memory

Memory being used by your own code and data

This memory report isn't showing you data and code that you will be allocating yourself, of course.

Memory being used for administering things

Tinymail consumes a few hundred kilobytes of administrative memory. This might be the overhead per instance within the GLib object system (GObject). This can be pointers, like doubly-linked list pointers, arrays, strings and other things like that. The testing tools used in this report consume some of this administrative memory too.

Note, however, that a larger tinymail based application might use some more parts of the entire tinymail framework. It shouldn't start consuming a lot more than what you can see on this report. It definitely shouldn't start growing: do also test with the demo user interface if your application's memory consumption starts growing due to tinymail things. You do need to destroy and free tinymail instances, of course. Also don't forget to disable the GLib slab allocator before drawing any conclusions (more on this below).

If you are being managed by a garbage collector, try to take the time to understand how it works. For example the Python one has some very interesting things for you to know about before drawing conclusions. A hint for Python? Try gc.collect() at intelligently picked locations.
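As a small illustration of the Python hint: objects that reference each other in a cycle are not released by reference counting alone, so they linger until the collector runs. The sketch below uses a made-up Header class and switches automatic collection off to make the effect visible; it shows CPython behaviour, not anything specific to tinymail's bindings.

```python
import gc
import weakref


class Header:
    """Made-up stand-in for a wrapper object that holds on to memory."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Header(), Header()
    a.other, b.other = b, a      # reference cycle: refcounts never reach zero
    return weakref.ref(a)        # lets us observe whether 'a' still exists


gc.disable()                     # keep automatic collections out of the picture
ref = make_cycle()
alive_before = ref() is not None
gc.collect()                     # explicit collection at a chosen point
alive_after = ref() is not None
gc.enable()

print(alive_before, alive_after)
```

Measuring before the explicit collection would count objects that are already garbage, which is why the place where you call gc.collect() matters.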

In-kernel memory like administrative things for the flash filesystem

The kernel might use some administration code and data. That's normal though. Most, or all, kernels are applications like any other: they consume some memory. The kernel usually doesn't consume very much, especially not if we compare the numbers with some user applications.

Specifically the filesystem layer for flash filesystems, like LogFS and jffs2, consumes some memory when being used with mmap. It's not terribly much, but it's memory. If you are measuring to the byte how much RAM your device will need, say because you want to reduce the manufacturing cost of one unit, then you must take things like this into account too.

I assume that, in this case, a professional company like yours knows this. I assume that you are consulting the kernel developers who wrote the code on which your technology depends. If you aren't: although their consultancy time might cost some money, it is a very wise thing to do.

Less memory than you might think after this report

GSlice, a slab allocator

If you follow the preparations below, you won't be using the GLib slab allocator GSlice. Software developers who work a bit closer to the system, like (but not only) the C and C++ ones, know about things like data alignment. This means that in order to make memory accesses perform better, structures are aligned to the word size of the architecture by your compiler. If the members of a structure are inefficiently ordered, this can effectively mean that you are wasting space on alignment.
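The cost of an inefficiently ordered structure can be observed from Python with ctypes, which lays out structures the way the platform's C compiler would. The two structures and their field names below are invented for this illustration; they are not tinymail types.

```python
import ctypes


class BadlyOrdered(ctypes.Structure):
    # small, large, small: padding is needed before and after the 32-bit field
    _fields_ = [
        ("seen", ctypes.c_uint8),
        ("uid", ctypes.c_uint32),
        ("has_attachment", ctypes.c_uint8),
    ]


class WellOrdered(ctypes.Structure):
    # large field first: the two small fields share one padded word
    _fields_ = [
        ("uid", ctypes.c_uint32),
        ("seen", ctypes.c_uint8),
        ("has_attachment", ctypes.c_uint8),
    ]


bad_size = ctypes.sizeof(BadlyOrdered)
good_size = ctypes.sizeof(WellOrdered)
print(bad_size, good_size)
```

On common platforms this prints 12 and 8: the same six bytes of data, but four extra bytes of padding per instance in the badly ordered variant.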

Fewer developers also know that each allocation has a small administrative cost. Depending on things like the libc and the architecture, the so-called heap-admin is used for, for example, storing the allocation size. The free() call can for example use this heap-admin to know the size of what it must free.

For many equal-sized allocations, a slab allocator will therefore reduce memory consumption. However, instead of truly destroying an object, a slab allocator often keeps the object around. That's because it assumes that the application developer will most likely request a new equal-sized object instance soon. Calls like malloc() and free() are rather expensive in terms of performance, and a slab allocator usually aims to avoid those calls.

This means that you'd see what an untrained eye might experience as a memory leak. This is why I ask you to turn off the slab allocator in the preparations section. This gives you a clearer view of the memory consumption, even though it's with the extra heap-admin (which you can subtract manually if you like).
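Why a slab allocator can look like a leak is easiest to see with a toy version. The pool below only illustrates the keep-freed-objects-around idea; it is not how GSlice is implemented, and ToyPool and its members are names made up for this sketch.

```python
class ToyPool:
    """Toy fixed-size object pool; illustrates the idea, not GSlice internals."""

    def __init__(self, chunk_size: int):
        self.chunk_size = chunk_size
        self.free_chunks = []   # released chunks kept around for reuse
        self.reserved = 0       # bytes obtained from the system so far

    def alloc(self) -> bytearray:
        if self.free_chunks:
            return self.free_chunks.pop()     # reuse, no new system allocation
        self.reserved += self.chunk_size
        return bytearray(self.chunk_size)

    def free(self, chunk: bytearray) -> None:
        self.free_chunks.append(chunk)        # kept, not returned to the system


pool = ToyPool(chunk_size=32)
chunks = [pool.alloc() for _ in range(1000)]
for chunk in chunks:
    pool.free(chunk)
chunks.clear()

reserved_after_free = pool.reserved    # everything was freed, yet nothing shrank
reused = pool.alloc()
reserved_after_reuse = pool.reserved   # unchanged: an old chunk was recycled
print(reserved_after_free, reserved_after_reuse)
```

A memory profiler watching this pool from the outside sees 32,000 bytes that never go away, even though the application has released every object.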

The mmap is quite efficient

Although I explained it above using most likely enough words (enough depending on what exactly, and on what level, you want to know about it), and although a lot of kernel developers try very hard to explain how mmap works, few people fully understand it. Even a lot of trained software developers, including myself, don't always fully grasp the subject. You usually have to talk with a trained kernel developer to get the basic idea. The reason is that it isn't simple either. On Linux, for example, you actually need to grasp quite large parts of the virtual memory layer.

A lot of people think that since the mmap-ed memory is part of the address space, it'll consume more physical RAM depending on the size of the file being mmap-ed. Yet you can easily mmap a file of gigabytes on a mobile device that only has a few megabytes of physical RAM installed. This effectively increases the virtual memory size to gigabytes, indeed. Whether or not you can do this depends on a few other things too, like support for large files on the filesystem being used. You can also split up the files into smaller ones and mmap those. Although a large mmap will indeed consume more physical RAM than a small one, that's mostly only due to the fact that for a larger file the filesystem developer may have needed more in-kernel administrative data to keep the mapping healthy and correct.

While a file of 2MB might easily fit in your available physical RAM, so that the kernel might simply put every piece of it in physical RAM, it's also possible that it doesn't fit at a specific moment, or that a larger mmap-ed file of say 2GB will not fit at all. It's also possible that, because all of the pages of the mapping are used frequently, the entire file (if it fits in your physical RAM) ends up mapped into real RAM.

I hope this example illustrates the mmap feature of your kernel a little bit. You can view it as a window onto a file, of which only the used pieces are in real physical RAM, unless you have plenty of physical RAM available to load larger pieces of it. The latter case is the typical one on your desktop, with gigabytes of RAM available and unused.
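The window idea can be tried with a file that is much larger than the part you actually look at. The sketch below maps a 16 MiB scratch file (a made-up stand-in, usually created sparse so that almost nothing is written to disk) and reads only a few bytes near its end.

```python
import mmap
import os
import tempfile

FILE_SIZE = 16 * 1024 * 1024   # 16 MiB stand-in for a large summary file

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    f.truncate(FILE_SIZE)                 # usually a sparse file: no data written
    f.seek(FILE_SIZE - mmap.PAGESIZE)
    f.write(b"last page")                 # something recognisable near the end

with open(path, "rb") as f:
    window = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

mapped_bytes = len(window)                # the whole file is in the address space
start = FILE_SIZE - mmap.PAGESIZE
tail = window[start:start + 9]            # only this part has to be read in

window.close()
os.remove(path)
print(mapped_bytes, tail)
```

The mapping adds the full 16 MiB to the virtual size of the process, while how much of it sits in physical RAM is left to the kernel.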

Compressed filesystems (might) mean less memory with mmap

Some discussions with the LogFS developer revealed to me (I already wrote above that I often don't fully grasp the exact and full picture of how mmap really works) that on compressed filesystems it's possible that mmap-ed regions use less memory per page than the typical 4K size of a page. Especially when it's a read-only mmap this compressing might become more interesting (which is the case for tinymail's mmap usage).

So the pages that are in physical RAM might be there in compressed state, and therefore consume less (a page holds more of the file's data than 4K). I don't know how accurate this is, nor am I going to investigate it in depth. I can imagine that in theory this is indeed possible. Feel free to consult a kernel developer for more information on this.

Preparations

The test accounts

There are test accounts set up for you. The default tools use these test accounts. The test accounts always have the exact same amount of test data as they get periodically flushed and reset. They contain both pure spam and mailing list data.

Spam is interesting because most strings are in a funny encoding and usually quite unique per mailbox. Tinymail sometimes reuses the same string by keeping only one copy in memory. It doesn't do this for mmap-ed data though (this would mean developing some sort of partition table too, which sounded a little bit overkill).

Mailing lists are also interesting for the exact same reason, but the other way around. Because we only keep one copy, and because mailing lists have a lot of threads with replies in them (which basically means that there will be identical strings), it's interesting to measure using this data too.
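Keeping only one copy of identical strings can be sketched with a small pool. This is a hypothetical illustration of the idea, not tinymail's actual implementation; StringPool and the sample subjects are made up.

```python
class StringPool:
    """Hypothetical pool that hands out one shared copy per distinct string."""

    def __init__(self):
        self._strings = {}

    def intern(self, value: str) -> str:
        # setdefault stores the first copy and returns it for every equal string
        return self._strings.setdefault(value, value)

    def __len__(self) -> int:
        return len(self._strings)


pool = StringPool()
# Build equal subjects at run time so each starts out as a separate object.
subjects = ["".join(["Re: ", "[PATCH] fix mmap"]) for _ in range(5)]
subjects.append("".join(["You have ", "won!!!"]))   # a unique, spam-like subject

shared = [pool.intern(s) for s in subjects]
distinct_copies = len(pool)
all_replies_shared = all(s is shared[0] for s in shared[:5])
print(distinct_copies, all_replies_shared)
```

Six subjects end up as two stored strings here: the five replies in the thread share one copy, while the unique spam subject gains nothing, which is why the two kinds of test data behave so differently.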

To put it in another way: the test accounts are often rather a worst-case than a best-case scenario. I think the test accounts are a good-enough average to do testing with. If you disagree, you can contact the tinymail team and ask whether you can provide the test accounts with your more ideal test data. You can of course prepare test reports with your own data source and contact the tinymail team about your results. We will most likely be happy to add your report to this one, in case you make your data source public (so that others can reproduce your test).

Building

cd trunk
./autogen.sh --prefix=/opt/tinymail --enable-tests
make && sudo make install

Preparing

/opt/tinymail/bin/memory-test -o -j
NAME=Testing
Getting headers of INBOX/Testing ...
NAME=700
Getting headers of INBOX/700 ...
NAME=50000
Getting headers of INBOX/50000 ...
NAME=5000
Getting headers of INBOX/5000 ...
NAME=500
Getting headers of INBOX/500 ...
NAME=40000
Getting headers of INBOX/40000 ...
NAME=30000
Getting headers of INBOX/30000 ...
NAME=3000
Getting headers of INBOX/3000 ...
NAME=2000
Getting headers of INBOX/2000 ...
NAME=200
Getting headers of INBOX/200 ...
NAME=10000
Getting headers of INBOX/10000 ...
NAME=1000
Getting headers of INBOX/1000 ...
NAME=100
Getting headers of INBOX/100 ...

Securing your test data

Note that the test-tool will create a unique folder (the last characters are a number), so it can also be /tmp/tinymail.1 for example, mostly depending on how many tests you launched recently.

mv /tmp/tinymail.0/ /tmp/tinymail.measuring

Some environment variables

These will deactivate the GLib slab allocator GSlice. For more information on this, take a look at the sections above.

export G_SLICE=always-malloc
export G_DEBUG=gc-friendly

The results

Getting the summary information while online, from scratch

Getting the summary information consumes more or less the same amount of memory as using it offline (it consumes a little bit more to actually download it than to only use it). But the allocations stick longer. By that I mean that it takes longer for the memory to get deallocated. This is because while things are being downloaded, they are not immediately put in the mmap-ed file. And that's because the file is mmap-ed read-only, which means that I can neither make it grow nor write to it unless I reload the file from scratch.

What will happen is that 1,000 items are harvested and kept in memory. After each 1,000th item a dump to a new file and a full reload of the mmap-ed file takes place.
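The harvest, dump and reload cycle can be sketched as follows. This is a simplified illustration under my own assumptions (items are plain byte strings, and each dump rewrites the whole file); SummaryWriter and its members are hypothetical names and do not come from the tinymail sources.

```python
import mmap
import os
import tempfile

BATCH_SIZE = 1000  # items kept on the heap before each dump, as described above


class SummaryWriter:
    """Hypothetical sketch: batches new items, then dumps and remaps the file."""

    def __init__(self, path: str):
        self.path = path
        self.pending = []    # downloaded items that only exist on the heap so far
        self.mapping = None  # read-only mmap of everything dumped so far
        self.reloads = 0

    def add(self, item: bytes) -> None:
        self.pending.append(item)
        if len(self.pending) == BATCH_SIZE:
            self.dump_and_reload()

    def dump_and_reload(self) -> None:
        new_path = self.path + ".new"
        with open(new_path, "wb") as out:
            if self.mapping is not None:
                out.write(self.mapping[:])      # carry over the old contents
                self.mapping.close()
            out.write(b"".join(self.pending))   # append the harvested batch
        os.replace(new_path, self.path)         # the new file becomes the summary
        with open(self.path, "rb") as f:
            self.mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        self.pending.clear()                    # heap copies can now be released
        self.reloads += 1


workdir = tempfile.mkdtemp()
writer = SummaryWriter(os.path.join(workdir, "summary.mmap"))
for n in range(2500):
    writer.add(b"item%06d\n" % n)

mapped_items = writer.mapping[:].count(b"\n")   # items now served from the mmap
heap_items = len(writer.pending)                # items still waiting on the heap
reloads = writer.reloads
writer.mapping.close()
print(mapped_items, heap_items, reloads)
```

After 2,500 items, two dumps have happened: 2,000 items are served from the mapping and 500 are still held on the heap, which is the "allocations stick longer" effect described above.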

The command to run a test

valgrind --tool=massif /opt/tinymail/bin/memory-test -o

The valgrind massif report

The order in which the folders were tested was: INBOX/Testing, INBOX/700, INBOX/50000, INBOX/5000, INBOX/500, INBOX/40000, INBOX/30000, INBOX/3000, INBOX/2000, INBOX/200, INBOX/10000, INBOX/1000, INBOX/100 (note that you might not find some folders due to their memory consumption being too small when compared to the five bigger ones).

Offline using the summary information

The command to run a test

valgrind --tool=massif /opt/tinymail/bin/memory-test -c /tmp/tinymail.measuring/

In a table

mailbox items   heap      mmap
50,000          8,613K    13,229K
40,000          6,860K    10,741K
30,000          5,000K    7,858K
10,000          1,860K    2,418K
5,000           900K      1,252K

The original valgrind massif report

The order in which the folders were tested was: INBOX/Testing, INBOX/700, INBOX/50000, INBOX/5000, INBOX/500, INBOX/40000, INBOX/30000, INBOX/3000, INBOX/2000, INBOX/200, INBOX/10000, INBOX/1000, INBOX/100 (note that you might not find some folders due to their memory consumption being too small when compared to the five bigger ones).

Original file sizes of the files that will be mmap-ed

find . -name summary.mmap -exec stat --printf "%n: %s\n" {} \;
./700/summary.mmap: 171216
./50000/summary.mmap: 13546268
./5000/summary.mmap: 1281312
./500/summary.mmap: 122604
./40000/summary.mmap: 10997876
./30000/summary.mmap: 8046364
./3000/summary.mmap: 784652
./2000/summary.mmap: 497708
./200/summary.mmap: 39952
./10000/summary.mmap: 2475932
./1000/summary.mmap: 246384
./100/summary.mmap: 21880
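These file sizes can be compared with the mmap figures reported for the five largest folders. The sketch below only does arithmetic on the byte counts listed above; rounding up to whole kilobytes is my own guess at how the table figures were derived.

```python
import math

# Sizes in bytes, copied from the find/stat output above.
file_sizes = {
    50000: 13546268,
    40000: 10997876,
    30000: 8046364,
    10000: 2475932,
    5000: 1281312,
}

size_in_k = {items: math.ceil(size / 1024) for items, size in file_sizes.items()}
bytes_per_item = {items: size / items for items, size in file_sizes.items()}

for items in file_sizes:
    print(f"{items:>6} items: {size_in_k[items]:>6}K mapped, "
          f"{bytes_per_item[items]:.0f} bytes per item")
```

Rounded up to whole kilobytes, these sizes match the mmap column of the table, and they work out to roughly 250 to 275 bytes of mapped data per message.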
