Porting Camel providers to Tinymail's camel-lite

About

Tinymail uses its own modified version of Camel, called camel-lite. Camel-lite improves on bandwidth utilization and memory consumption.

The summary changes

In general, if you already have a Camel provider implementation, you will have to take a close look at its summary customizations. A lot of Camel providers don't need any such customizations, but some do. Your summary implementation will usually have extended the CamelMessageInfo type by over-allocating it. Camel will cast your instances to CamelMessageInfoBase instances to get the generic fields out of them.
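The over-allocation pattern can be sketched as follows. The types here (DemoMessageInfoBase, MyMessageInfo) are simplified, hypothetical stand-ins and not the real Camel structures. The only point is that the generic part comes first, so a pointer to the extended instance can also be read as a pointer to the base.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic part (CamelMessageInfoBase in Camel). */
typedef struct {
	const char *uid;
	const char *subject;
	uint32_t size;
} DemoMessageInfoBase;

/* Provider-specific info: the base must be the first member. */
typedef struct {
	DemoMessageInfoBase base;
	uint32_t server_flags;	/* hypothetical custom field */
	const char *cc;		/* hypothetical custom field */
} MyMessageInfo;

/* Allocate the larger, provider-specific struct and hand it out as the base type. */
static DemoMessageInfoBase *my_message_info_new (void)
{
	MyMessageInfo *mi = calloc (1, sizeof (MyMessageInfo));
	return (DemoMessageInfoBase *) mi;	/* NULL if the allocation failed */
}

/* Generic code only ever looks at the base part. */
static uint32_t generic_get_size (const DemoMessageInfoBase *info)
{
	return info->size;
}
```

The custom fields (server_flags and cc in this sketch) are the ones that your own summary load and save code has to take care of.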

The custom fields that you add must be stored. The original Camel API allowed you to use the FILE* pointer for that: as long as your write and your read code were in sync, Camel would simply take your writing to and reading from the summary file into account.

Tinymail's camel-lite, though, uses mmap() for reading the file. The writing hasn't changed much: you have to make sure that the strings that you write are '\0' terminated. The generic Camel file operations API has been altered so that, if you simply use the standard Camel file operations for writing, you won't have to change anything at all here.

The reading, though, must be modified to get pointers to your custom data out of the mmap, so that the pointer fields of your extended CamelMessageInfo instance point to those locations.

To get string pointers out of the mmap (strings are what most of the summary file consists of), you can use camel_file_util_mmap_decode_uint32 to decode the string length, which is stored right in front of the string. It sets an integer, passed by reference, to the length of the string and returns the offset of the string. For example:

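char *ptrchr = s->filepos;
int len;
/* Decode the length prefix; ptrchr then points at the first byte of the string */
ptrchr = camel_file_util_mmap_decode_uint32 (ptrchr, &len, TRUE);
if (len)
	mi->cc = (const char*)ptrchr; /* point into the mmap, no copy */
ptrchr += len; /* skip past the string */
s->filepos = ptrchr;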
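If you want to see the idea without the Camel headers, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: demo_decode_uint32 is a stand-in decoder (7 bits per byte, high bit set on the last byte), DemoSummary only mimics the filepos field, and the stored length is assumed to include the terminating '\0'. It is not camel-lite's code. What it shows is that the reader returns a pointer into the buffer rather than a copy.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in decoder (assumption): 7 bits per byte, most significant group
 * first, high bit set on the final byte. No bounds checking is done here;
 * a real reader should check against the end of the mapping. */
static const unsigned char *demo_decode_uint32 (const unsigned char *p, uint32_t *dest)
{
	uint32_t value = 0;

	while ((*p & 0x80) == 0) {
		value = (value | *p) << 7;
		p++;
	}
	*dest = value | (*p & 0x7f);

	return p + 1;
}

/* Hypothetical summary object: only the read position in the mapped data. */
typedef struct {
	const unsigned char *filepos;
} DemoSummary;

/* Returns a pointer into the buffer (no copy), or NULL for an empty string,
 * and advances the read position past the string. */
static const char *demo_read_string (DemoSummary *s)
{
	uint32_t len;
	const unsigned char *p = demo_decode_uint32 (s->filepos, &len);
	const char *str = len ? (const char *) p : NULL;

	s->filepos = p + len;

	return str;
}
```

Because the returned strings live inside the buffer, they stay valid only for as long as the buffer (in camel-lite's case, the mapping) stays in place.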

You can also use it to decode any encoded unsigned integer and move the offset:

ptrchr = camel_file_util_mmap_decode_uint32 (ptrchr, &mi->size, FALSE);
s->filepos = ptrchr;

And if you need to get the value of a normal integer that has been stored in network byte order (very few integers in the file should be stored like this; it's preferable to encode integers before writing, which is what the standard Camel file I/O API for writing to the summary file will do for you):

s->version = g_ntohl(get_unaligned_u32(s->filepos)); 
s->filepos += 4;
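An offset inside the mmap is not guaranteed to be 4-byte aligned, and reading a 32-bit integer through a misaligned pointer can fail on some CPUs. A portable way to do such a read is sketched below. The name demo_read_be32 is hypothetical, and this is only an illustration of what an unaligned read followed by a network-to-host conversion amounts to, not camel-lite's actual implementation.

```c
#include <stdint.h>

/* Read a 32-bit integer stored in network byte order (big-endian) from any
 * address, one byte at a time, so that alignment does not matter. */
static uint32_t demo_read_be32 (const unsigned char *p)
{
	return ((uint32_t) p[0] << 24) |
	       ((uint32_t) p[1] << 16) |
	       ((uint32_t) p[2] << 8) |
	        (uint32_t) p[3];
}
```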

Samples that have already been put in place

The POP3 provider of Camel doesn't do summaries. The POP3 provider of camel-lite does, but it doesn't do anything non-generic with its summaries, so no port was needed there.

You'll notice that the porting work is trivial:

  • The one for IMAP (The ported functions are called "summary_header_load" and "message_info_load" in this file)
  • The one for NNTP (The ported function is called "summary_header_load" in this file)

The bandwidth changes (note that a lot of this is indeed obvious)

It's important for your own provider to attempt to use an absolute minimum of bandwidth. More important than the raw bandwidth is the number of queries that you need to send.

Tinymail is designed to be used on mobile devices in mobile situations. It's not uncommon that people using E-mail clients based on Tinymail will want to use GPRS. GPRS is known for its high latency: it can take seconds for a command to be received (a lot depends on the network provider, but as the software developer you usually don't control that). There's a page available about NETEM that explains how you can emulate a GPRS network by slowing down your own network interfaces and, more importantly, by adding random latency.

If you can do more with fewer roundtrips, then that is usually the preferred strategy. POP3 is a very bad protocol in this respect: it doesn't make it possible to get all summary information with a single query. If your protocol supports 'getting a lot using one query', then please use this instead of 'getting things one by one, by asking for each thing'. While streaming the result of the query, it's recommended to start saving what you have received, for example periodically.

Don't forget: having to ask for something means a roundtrip. In terms of time, this means that the latency of the network will be the main bottleneck when it comes to speed. Once you are streaming, latency is not a big problem anymore (that the packets take a second doesn't matter if the stream is big enough and each packet has a more or less consistent latency compared with the other packets being sent).
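To put rough numbers on this, here is a small back-of-the-envelope model: total time is about the number of roundtrips multiplied by the latency, plus the number of bytes divided by the throughput. The function name and the figures in the note that follows are illustrative assumptions, not measurements of any real network.

```c
/* Rough estimate, in seconds, of how long a transfer takes:
 * every roundtrip pays the latency once, and the payload is then
 * streamed at the given throughput. */
static double demo_transfer_seconds (unsigned roundtrips, double latency_s,
				     double total_bytes, double bytes_per_s)
{
	return roundtrips * latency_s + total_bytes / bytes_per_s;
}
```

With 200 messages, 1 second of latency, 500 bytes of summary information per message and 5000 bytes per second, asking for each message separately comes to about 220 seconds, while a single query for everything comes to about 21 seconds. The payload is the same in both cases; only the number of roundtrips differs.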

It's obvious that you should not do a query that requests a large result set if you only need a very small piece of the result. If your service supports making the result set exactly the size of what you need (or even smaller, with compression techniques), then this is obviously preferred. If you need advice about what exactly the generic parts of Tinymail need, you can of course ask on our mailing list. Take into account, when thinking about your result set, that some GPRS users have to pay per megabyte downloaded. For them it's usually not funny if somebody coded the synchronization of local data with the remote server very inefficiently in terms of bandwidth consumption.

It's usually also a good idea to implement capabilities like COMPRESS on IMAP, or to support deflate and other compression algorithms at the TLS (or wrapped SSL) level, for example by using a recent version of Mozilla's NSS, a recent version of OpenSSL or GnuTLS as the SSL implementation. This of course requires that the server side supports all this too.

On the client side it's recommended to store as much as possible, as quickly as possible. If the connection fails, restarting from the position where it failed is something your user will enjoy a lot (it makes a real difference in the number of bytes that the client needs to get over the wire when retrying). Storing as much state as quickly as possible is usually a very good idea. Making your code recoverable by using that state is a good idea too.
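A minimal sketch of such recoverable state is shown below, assuming a hypothetical state file that holds nothing but the number of bytes already stored. The function names are made up for this example, and this is not how the Tinymail providers do it. The idea is only that progress is written out regularly, so that a retry can resume from the saved offset instead of starting over.

```c
#include <stdio.h>

/* Persist how many bytes of the current download have been stored locally.
 * The fixed-width format makes every write overwrite the previous one fully. */
static int demo_save_progress (FILE *state, long bytes_stored)
{
	rewind (state);
	if (fprintf (state, "%020ld\n", bytes_stored) < 0)
		return -1;

	return fflush (state) == 0 ? 0 : -1;
}

/* Returns the offset to resume from, or 0 when no usable state exists. */
static long demo_load_progress (FILE *state)
{
	long bytes_stored = 0;

	rewind (state);
	if (fscanf (state, "%ld", &bytes_stored) != 1 || bytes_stored < 0)
		return 0;

	return bytes_stored;
}
```

Note that fflush only hands the data to the operating system. Whether you also need to force it to storage depends on how much you want to survive besides a dropped connection.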

The IMAP and POP providers of Tinymail have already been adapted to behave this way. The team of developers who work on Tinymail have good expertise in this area. If you have questions or need actual consultancy, they will enjoy helping you out with this.

Flaky connectivity

The networks on which Tinymail-based E-mail accounts will be used will not be of great quality. Your code should assume that connection failures will happen frequently. This is important in the world of mobile devices. Although setting the timeouts of the socket operations lower than usual is not always a bad idea, you must take into account that the latency of, for example, GPRS networks is sometimes high enough to trigger a timeout that is set too low. You don't want your software to think that a timeout occurred when there was actually nothing wrong with the connection (it was just being very slow).
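On POSIX systems, one way to set such timeouts is the SO_RCVTIMEO and SO_SNDTIMEO socket options. The helper below is a sketch with a hypothetical name. It is not taken from camel-lite, and which value is sensible depends on the networks you expect.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Apply the same receive and send timeout, in seconds, to a socket.
 * Returns 0 on success and -1 if either option could not be set. */
static int demo_set_socket_timeout (int fd, long seconds)
{
	struct timeval tv;

	tv.tv_sec = seconds;
	tv.tv_usec = 0;

	if (setsockopt (fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) != 0)
		return -1;
	if (setsockopt (fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) != 0)
		return -1;

	return 0;
}
```

A value of a few seconds may be fine on a local network, but on a link where a single command can take seconds to arrive it would make a slow connection look like a dead one.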

Please try to understand that excellent connectivity and good, stable wireless networks are not the reality for a lot of people who'll use Tinymail-based E-mail clients.

This page describes how you can simulate your network to the level of GPRS. This page adds to that how to get yourself a virtual machine with server software running in it, of which the VM's network is slowed down to the level of GPRS.