Kernel events without kevents

Posted Mar 15, 2007 12:46 UTC (Thu) by pphaneuf (guest, #23480)
In reply to: Kernel events without kevents by k8to
Parent article: Kernel events without kevents

WaitForMultipleObjects() resembles poll() a lot, except with much more baroque return-value handling, and an extra feature that could either be extremely useful or just weird, depending on who you ask (there's a bool "wait for all" parameter; if it is true, everything has to be ready before the call returns).

select() isn't too bad on the size of the blob itself (three bits per file descriptor isn't "big", IMHO), but is quite inefficient when the numbers of the file descriptors themselves are all over the place and sparse (if you only wait on fd 1000, it has to scan the bitset for fds 0 to 999 for no reason). poll() is better for that aspect, but in exchange has a "big data blob", so handling a lot of fds is better with select(). WaitForMultipleObjects() uses a simple array of object handles, but doesn't tell you if the object is "readable", "writable" or anything like that, just that it "has been signaled", and it doesn't tell you *which* ones have been signaled; you have to check them all.
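To make the interface difference concrete, here is a minimal sketch using Python's `select` module, which wraps the same system calls. It assumes a Unix-like system (`select.poll` is not available everywhere), and the pipe is only a stand-in for whatever descriptor you would really wait on.

```python
import os
import select

# A pipe whose read end is made readable by writing one byte.
r, w = os.pipe()
os.write(w, b"x")

# select(): the caller hands over sets of fds. In C these are bitmaps
# indexed by fd number, which is why a lone high-numbered fd still
# costs a scan of all the lower slots.
readable, _, _ = select.select([r], [], [], 0)

# poll(): one (fd, events) entry per descriptor of interest, however
# sparse the fd numbers are; the result is a list of (fd, revents).
p = select.poll()
p.register(r, select.POLLIN)
events = p.poll(0)

print(readable == [r])
print([(fd == r, bool(ev & select.POLLIN)) for fd, ev in events])

os.close(r)
os.close(w)
```

Both calls report the same readiness here; the difference is only in what the caller has to pass in and walk through afterwards.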

epoll is better than all of the above, IMHO. Now, the question is how does it fare compared to kevents (or BSD's kqueue)...

Kernel events without kevents

Posted Mar 15, 2007 12:57 UTC (Thu) by k8to (guest, #15413)

In the interim I went and looked up the MSDN WaitForMultipleObjects docs. Then I cried. I bet there's a way to find out which of the many objects were signalled during the call, but I can't find it anywhere in the page! Par for the course.

Regarding select I was not commenting on the size of the object, but that there are N fields to check for N fds, and on the kernel side N fields to set for N fds, thus for large values of N this has unpleasant results. Also the code for handling it (on both sides) is more bother and less focused than a simple stream of events a la epoll. Hurrah for epoll.
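The "stream of events" point can be sketched like this, assuming Linux (epoll is Linux-only) and using pipes as placeholder descriptors. Eight descriptors are registered, one becomes ready, and the caller only ever sees that one.

```python
import os
import select

# Eight pipes; only one of them will receive data.
pipes = [os.pipe() for _ in range(8)]

ep = select.epoll()
for r, _ in pipes:
    ep.register(r, select.EPOLLIN)

ready_r, ready_w = pipes[5]
os.write(ready_w, b"x")

# epoll returns only the descriptors that are ready, so there is no
# per-fd field to check for the seven idle ones.
ready_fds = [fd for fd, _ in ep.poll(0)]
print(ready_fds == [ready_r])

ep.close()
for r, w in pipes:
    os.close(r)
    os.close(w)
```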

The convolving of futexes to fds somehow rubs me the wrong way, but I suppose fds themselves are conceptually simple and inexpensive, and are probably implemented inexpensively.

Kernel events without kevents

Posted Mar 15, 2007 13:26 UTC (Thu) by pphaneuf (guest, #23480)

Note also how the whole "abandoned" deal limits the number of objects that can be watched to something like 127. Yeah, there is indeed subject for crying. :-)

fds aren't that expensive, no. And really, anything blocking is subject to wanting to put it into select() or epoll, and locking a mutex can be quite blocking. What you'd want is a kind of "try lock" primitive, that would either take it right there (if it's unlocked), or if it couldn't, will make an fd ready when it could take it.

Note that you can already implement a semaphore (and thus a mutex, which is the binary semaphore special case, easily covered by a general semaphore), by using a pipe, as I "discovered" recently. The problem compared to futexes is that, if I am not mistaken, futexes only do a syscall in case of contention, where my trick does a syscall every time (it'd be quick, but there I go for full disclosure).
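A rough sketch of the pipe trick, for the curious. The `PipeSemaphore` class and its method names are made up for illustration; the idea is just that the semaphore count is the number of bytes sitting in the pipe, so the read end becomes readable exactly when the semaphore can be taken. It assumes a Unix-like system and an initial count smaller than the pipe buffer.

```python
import os
import select


class PipeSemaphore:
    """Counting semaphore whose count is the number of bytes in a pipe."""

    def __init__(self, value=0):
        self._r, self._w = os.pipe()
        os.set_blocking(self._r, False)  # so try_acquire() never blocks
        if value:
            os.write(self._w, b"\0" * value)

    def fileno(self):
        # Lets the semaphore be passed straight to select()/poll()/epoll.
        return self._r

    def release(self):
        os.write(self._w, b"\0")  # one syscall per release

    def try_acquire(self):
        try:
            return len(os.read(self._r, 1)) == 1  # one syscall per attempt
        except BlockingIOError:
            return False  # count is zero right now

    def close(self):
        os.close(self._r)
        os.close(self._w)


sem = PipeSemaphore(1)
first = sem.try_acquire()    # takes the only unit
second = sem.try_acquire()   # nothing left to take
idle = select.select([sem], [], [], 0)[0]
sem.release()
woken = select.select([sem], [], [], 0)[0]
third = sem.try_acquire()
sem.close()
print(first, second, idle == [], woken == [sem], third)
```

The syscall on every acquire and release is visible in the code, which is exactly the cost that futexes avoid in the uncontended case.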

Kernel events without kevents

Posted Mar 16, 2007 20:45 UTC (Fri) by mikov (guest, #33179)

I am not sure what you mean about not knowing which object was signalled with WaitForMultipleObjects(). It is right there in the documentation:
http://msdn2.microsoft.com/en-us/library/ms687025.aspx

The return value WAIT_OBJECT_0+index indicates that object index was signalled.
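As a small illustration of how that return value is picked apart when "wait for all" is false, here is a sketch in Python. The constants are the values from the Windows headers; the `decode_wait_result` helper is hypothetical and does not call the real API, it only shows the arithmetic.

```python
# Values as defined in the Windows SDK headers.
WAIT_OBJECT_0 = 0x00000000
WAIT_ABANDONED_0 = 0x00000080
WAIT_TIMEOUT = 0x00000102
WAIT_FAILED = 0xFFFFFFFF


def decode_wait_result(result, count):
    """Split a wait-any return value into (kind, index) for `count` handles."""
    if WAIT_OBJECT_0 <= result < WAIT_OBJECT_0 + count:
        return ("signaled", result - WAIT_OBJECT_0)
    if WAIT_ABANDONED_0 <= result < WAIT_ABANDONED_0 + count:
        # An abandoned mutex is reported in its own index range.
        return ("abandoned", result - WAIT_ABANDONED_0)
    if result == WAIT_TIMEOUT:
        return ("timeout", None)
    if result == WAIT_FAILED:
        return ("failed", None)
    return ("unknown", None)


print(decode_wait_result(2, 4))
print(decode_wait_result(0x81, 4))
print(decode_wait_result(WAIT_TIMEOUT, 4))
```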

In general there is no arguing that Win32's handling of async IO, threads, synchronization, etc is very self consistent and sadly still quite ahead of the situation in Linux. Win32 is at least 14 years old and has had all that from the beginning, but we still can't quite catch up ...

Kernel events without kevents

Posted Mar 16, 2007 23:18 UTC (Fri) by pphaneuf (guest, #23480)

Hmm, for some reason, I thought it was the number of signalled objects, like select() or poll()...

It's actually the index of the signalled object, as you say, and of the lowest one if there are more than one, which means it's even crummier than I thought (how bad can it be, with a limit of 64 objects?!?): you can easily starve the highest numbered objects like that.
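The starvation concern can be modeled without touching Windows at all. The sketch below is a pure simulation: `lowest_signaled` is a made-up stand-in for the "smallest signaled index wins" rule, and the rotation step is just one possible workaround that I am assuming for illustration, not something taken from the documentation.

```python
def lowest_signaled(signaled):
    """Model of the rule: report the smallest index that is signaled."""
    for i, is_set in enumerate(signaled):
        if is_set:
            return i
    return None


# Handles 0 and 3 are both permanently signaled (say, two busy sockets).
signaled = [True, False, False, True]

# Naive loop: always pass the handles in the same order.
naive = [lowest_signaled(signaled) for _ in range(4)]

# Possible workaround: after serving a handle, rotate the array so that
# the handle just served ends up at the back.
order = list(range(len(signaled)))
rotated = []
for _ in range(4):
    pos = lowest_signaled([signaled[h] for h in order])
    rotated.append(order[pos])
    order = order[pos + 1:] + order[:pos + 1]

print(naive)
print(rotated)
```

With a fixed order, handle 3 is never reported as long as handle 0 stays signaled; with the rotation, the two take turns.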

While WaitForMultipleObjects() is pretty awful, it's true they have some other pretty nifty things, like the overlapped I/O and completion ports stuff. I also like what you can do with a WNDPROC on a hidden window, without the application knowing anything, as long as it pumps the message queue (which on Win32, you have to do anyway for your program to work). Too bad there's nothing like that on Unix.

Win32 didn't spring fully formed 14 years ago, by the way. I'd point at all the "Ex" and "Ex2" suffixes lying about as examples of added features, when it's not whole APIs.

Kernel events without kevents

Posted Mar 17, 2007 1:50 UTC (Sat) by mikov (guest, #33179)

Well, I have to respectfully disagree (let's be extra careful not to get into a silly Win32 vs POSIX or whatever flame :-)

WaitForMultipleObjects() is not ideal, of course, and has limitations (personally I have never struggled with them after quite a few years of system programming for Win32), but they also come with their known workarounds. However it is nothing if not consistent and easy to use. It fits perfectly with the rest of the model - files, async (overlapped) operations, threads, etc. There is never a question of what the correct way to implement something is, and more importantly there is no need for ridiculous hacks like using pipes from signals, etc.

Usually people who haven't seriously used the Base (*) Win32 API tend to underestimate it (I am not implying that you are one of them), but some parts of it are quite good, IMHO. As I said, the most important quality in my mind is that it is really well thought out and everything fits together. There are no corner cases which do not work. By comparison Linux or POSIX has a few non-obvious problems and solutions - it shows that it has grown by evolution.

(*) By the "base" Win32 API I mean pretty much only the IO, synchronization and threads. Those parts have remained consistent and stable for as long as I remember. The rest of the Win32 API (the GUI, COM, etc) is of course a complete nightmare.

About the Ex functions. AFAIK, they were not added later, they simply provide different functionality. For example ReadFileEx() is not a newer more powerful version of ReadFile() - it just operates with a different model. More importantly, you can't emulate ReadFile() using ReadFileEx()! (Well, I could be wrong about when ReadFileEx was added, since I haven't exactly tracked the API changes - it uses a fundamental concept though.)

I agree with you that poll() and epoll() are more powerful/convenient than WaitForMultipleObjects(). However they stand alone - you can't use them (yet) to wait for a signal, for the completion of a child process, for a semaphore (or has that changed already?), etc. That is the problem!

BTW, I also tend to disagree about the utility of the WNDPROC in a hidden window for purposes like you describe. The whole UI paradigm in Win32 is a remnant from Win16, and is more or less incompatible with the rest of the API - thus the need for "hacks" like MsgWaitForMultipleObjects().

In my Windows code I've always tried at all cost to avoid using window messages for anything besides "pure" UI - in my experience they make the code very fragile (and obviously non-portable). You also have the problem that UI operations performance can limit your background tasks. I am not aware of any contemporary Win32 non-GUI API that relies on messages. (Win16 sockets used to, but that has been deprecated in Win32)

Kernel events without kevents

Posted Mar 17, 2007 11:59 UTC (Sat) by pphaneuf (guest, #23480)

64 is a pretty low number of objects, I'd say, but it depends on the design. It seems more oriented to a "one thread per connection" design, where a single connection could be handling a few things (waiting for an answer from a database, but also watching the socket for disconnection, say). Also, the way it can easily lead to starvation seems like a major design problem with it. Careless ordering of the list of handles could lead to a stuck process! Odd, that.

I both agree and disagree with your reply. Win32 had advanced capabilities for a long time, such as completion ports (although I think they were severely crippled on the non-NT platforms, if I recall correctly). We still don't have many of those on Linux, and POSIX itself is so out of date, it's practically irrelevant. If you want to do something high-performance that works on multiple Unix platforms, you have to invent your own abstraction for the various high-performance APIs, because POSIX is just too pathetic. Sure, you can also make a POSIX version for complete portability, but you just know that this one isn't going to be the one for high-performance requirements.

You say

It's kind of strange, comparing the two APIs. I feel that Win32 is a bit more integrated, yes, having had this stuff for a longer time (everything is an object with a handle, that can be given to WaitForMultipleObjects()), but somehow doesn't feel like that (reading from a socket with ReadFile() seems a bit odd).

The classic Unix API feels more tasteful, but seems to suffer from some rot. For example, newer additions, like threads, feel like crummy copies of other APIs, not fitting in well with the rest. What you say about poll() not being able to wait on a number of things is really mostly linked to those new things (mostly related to threads) not having been made file descriptors in the first place! Remember that file descriptors are more or less the Unix equivalent of handles, despite having "file" in the name (it's the "everything is a file" concept).

The signals are the exception (and thus, waiting for child processes), but that's by design. There are only two ways to affect a Unix process: synchronously (but not necessarily blocking) through a file descriptor, or asynchronously through a signal. And there are bridges to go from one to the other (SIGIO to make a file descriptor asynchronous, and pipes to turn signals into a synchronous event). The signal and pipe trick might sound hacky, but really, it's a matter of keeping the core simple, providing lightweight primitives that can be built upon to achieve the same effect.
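For anyone who hasn't seen the signal-and-pipe bridge, here is a minimal sketch of it, assuming a Unix-like system and a single-threaded program (Python only allows installing signal handlers from the main thread). The handler name and the choice of SIGUSR1 are arbitrary.

```python
import os
import select
import signal

# The pipe is the bridge: the handler writes, the main loop selects.
r, w = os.pipe()
os.set_blocking(w, False)  # never block inside the handler


def on_signal(signum, frame):
    # Only record that the signal happened; real work stays in the loop.
    os.write(w, bytes([signum]))


old_handler = signal.signal(signal.SIGUSR1, on_signal)
os.kill(os.getpid(), signal.SIGUSR1)  # stand-in for an external signal

# The asynchronous signal now shows up as an ordinary readable fd.
readable, _, _ = select.select([r], [], [], 5)
received = os.read(r, 1)[0] if readable else None
print(received == signal.SIGUSR1)

signal.signal(signal.SIGUSR1, old_handler)
os.close(r)
os.close(w)
```

The same read end could sit in a select() or epoll set next to sockets, which is the whole point of the trick.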

It helps to remember Windows NT's heritage from VMS, which had a number of distinct file types, that were opened and accessed completely differently, including a B-Tree file type. Think about it: on VMS, there was the equivalent of Berkeley DB in the kernel! Where on Unix, the philosophy is more that we'll give you the tools to write Berkeley DB, and you go from there. Hence my not being incredibly excited about signalfd(): it's nice, but not exceedingly so, since I could easily do without.

In my Windows code I've always tried at all cost to avoid using window messages for anything besides "pure" UI - in my experience they make the code very fragile (and obviously non-portable). You also have the problem that UI operations performance can limit your background tasks. I am not aware of any contemporary Win32 non-GUI API that relies on messages. (Win16 sockets used to, but that has been deprecated in Win32)

You mention UI operations limiting your "background" tasks. This is always true, for a single thread, as really there isn't one that's foreground and the other that's background, they're all equals, competing for execution on a thread. If your UI is giving you grief, the answer isn't necessarily to stop using messages, but to have another thread, so that there's more execution contexts to do the work. Ideally, it would all be so uniform that UI code could run on any thread as well, so things would just get done as quickly as possible, no matter what it is, nothing having the edge over the other.

Kernel events without kevents

Posted Mar 17, 2007 18:35 UTC (Sat) by mikov (guest, #33179)

64 is a pretty low number of objects, I'd say, but it depends on the design. It seems more oriented to a "one thread per connection" design, where a single connection could be handling a few things (waiting for an answer from a database, but also watching the socket for disconnection, say). Also, the way it can easily lead to starvation seems like a major design problem with it. Careless ordering of the list of handles could lead to a stuck process! Odd, that.

Well, I've never had to use WaitForMultipleObjects() with more than a few handles. I think that for a single-threaded IO server design (which admittedly I haven't done), I'd use the ReadFileEx family of functions, which register a completion callback (APC) - this eliminates the 64 handle problem. Then I'd use WaitForMultipleObjectsEx() if I have to let the callbacks run _and_ wait for semaphores and stuff.

Admittedly, since Win32 has always had well integrated threads, the need to shoehorn everything into a single thread has never been very strong.

It's kind of strange, comparing the two APIs. I feel that Win32 is a bit more integrated, yes, having had this stuff for a longer time (everything is an object with a handle, that can be given to WaitForMultipleObjects()), but somehow doesn't feel like that (reading from a socket with ReadFile() seems a bit odd).

I guess it is force of habit. The other methods are also available - recv(), read(), etc. BTW, for some things the Win32 API can be a major PITA. For example serial communication - this is where you don't want overlapped operations and WaitForMultipleObjects() - instead you really want select(). Alas, in Win32 select() works only on sockets ...

If your UI is giving you grief, the answer isn't necessarily to stop using messages, but to have another thread, so that there's more execution contexts to do the work. Ideally, it would all be so uniform that UI code could run on any thread as well, so things would just get done as quickly as possible, no matter what it is, nothing having the edge over the other.

But that's exactly it. In the main thread of a GUI application you need messages to drive the GUI. However if you create another thread (let's say for IO), there is no point at all in using Windows messages there. None of the APIs generate them, so you'd have to write code to send them yourself, make a message loop to handle them, etc. What's the point? If you really needed some sort of message queue in your design, there is zero reason to use the Windows GUI one - you are much better off coding something custom.