James Antill - lighttpd's "AIO sendfile"
Nov. 14th, 2006
10:01 pm - lighttpd's "AIO sendfile"
I keep track of a couple of the other web servers out there, even though I don't think much of them ... for various reasons. But when I saw that lighttpd was advertising AIO sendfile for their 1.5.x release I was pretty impressed: I hadn't been following it closely, but I didn't think the Linux AIO sendfile code had gone in yet (I'd been planning on trying to use the splice() API with IO helpers to do the same thing ... but I keep not getting around to it).
Alas, looking deeper into it, the release note is very misleading. They are actually doing AIO reads into allocated memory, and then using non-AIO sendfile() on the fd associated with that memory. Of course, this means:
- It copies the entire file into memory (in sections of half a MB).
- It requires copying an entire file section before anything is sent.
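- There is no sharing! So if you have 10,000 connections requesting the same 2MB file, you get 5GB of memory used (10,000 private half-MB sections), where the page cache would need only one shared copy of the file.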
- The real page cache readahead is going to have to be that much more intelligent to do the right thing.
- There is no guarantee that the memory isn't swapped! So you can still have all the blocking artifacts of normal non-AIO sendfile().
...basically this is a lot of work to reimplement the page cache, badly.
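For contrast, here is a minimal C sketch of the plain, blocking sendfile() path, where the kernel is handed the file's own fd and the data comes from the shared page cache with no per-connection copy. The helper name send_file_range is hypothetical (it is not lighttpd or and-httpd code), and the sketch assumes Linux and a blocking out_fd.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send `count` bytes of in_fd, starting at `offset`,
 * to out_fd using sendfile(). Returns 0 if everything was sent, -1 if an
 * error occurred or the file ended before `count` bytes were sent. */
int send_file_range(int out_fd, int in_fd, off_t offset, size_t count)
{
    while (count > 0) {
        /* sendfile() advances `offset` by the number of bytes it sent. */
        ssize_t n = sendfile(out_fd, in_fd, &offset, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;          /* real error; errno is set */
        }
        if (n == 0)
            break;              /* end of file reached early */
        count -= (size_t)n;
    }
    return count == 0 ? 0 : -1;
}
```

Any number of connections can call this on the same file and still share one cached copy; the cost is that the call may block when the pages are not yet in memory, which is the problem AIO is meant to address.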
I also seriously wonder whether the same "improvement" could have been had by just running more processes than there are CPUs (something that I understand lighttpd can do as well as and-httpd). The approach also implies that posix_fadvise() should be doing more work (it is supposed to solve this exact problem), but maybe sprinkling explicit readahead() calls would also work around the problem (without screwing everyone else on the box).
Also, why lighttpd uses sendfile() instead of just write() for the last piece, I'm not sure ... it's like doing an mmap() and then calling sendfile() on the mmap'd fd.
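Once the data is already sitting in the process's own memory, sending it needs nothing more than a write() loop. A minimal C sketch, with write_all as a hypothetical helper name and a blocking fd assumed:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to fd, retrying on
 * short writes and EINTR. Returns 0 on success, -1 on error (errno set). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        p += n;                 /* skip past the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```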