2 April 2004

1 - socket event synchronization in the virtual machine

Basically, the VM needs to wait for network events at the host sockets library level in order to be efficient, and not doing so causes unnecessary complexity in addition to degraded performance.

On every supported platform, except Macintosh, the VM polls outside the host kernel for network events, using non-blocking socket functions exclusively. This is inefficient. It also leads to complications. For example, the Unix implementation makes special effort to ensure keyboard interrupts are handled during waits for network events [see aioPollForIO()]. On Win32, the sources are riddled with socket access protection [LOCKSOCKET() and UNLOCKSOCKET()] and "polling cycles" [SetEvent(sqPollEvent), which can cause delays of up to 100ms].

The Macintosh implementation uses a notification service provided by MacTCP instead of polling for network events. But Apple is phasing out MacTCP. There is some confusion as to the precise nature of the sockets interface that future versions of MacOS will provide. However, it seems that it will be a Berkeley-style interface, so Macintosh would be subject to complications such as those mentioned above if the current networking framework were used.

Finally, the way a socket's network event synchronization semaphore is used in Smalltalk is inefficient and awkward. Each method in the "waiting" protocol of Socket uses a >>whileTrue: loop to wait for a particular event (inefficient), broken by the signalling of the synchronization semaphore by a socket primitive or by a timeout Delay (awkward). The loop test selects for the desired event, effectively acting as an inefficient hedge against the occurrence of an undesired socket event.
It would make a lot more sense to simply tell the VM to associate a Smalltalk semaphore with a particular event, wait on the semaphore until the VM signals it, and let the primitives worry about timeouts (timeouts are supported by the Berkeley interface). The >>waitTimeoutMSecs: mechanism is unnecessary. In my implementation, socket operations are accomplished in the following way:
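
As a rough illustration of the semaphore-per-event scheme argued for above, here is a minimal sketch in Python rather than Smalltalk and C. It is not the author's implementation: `EventWaiter`, `demo`, and the helper thread standing in for a VM primitive are all assumptions made for the example. The point it shows is that the wait happens once, inside the host's `select()` call with a timeout, and the caller blocks on a semaphore exactly once instead of looping.

```python
import select
import socket
import threading


class EventWaiter:
    """Hypothetical stand-in for a VM primitive (not from the original post).

    It blocks inside the host's select() until the socket is readable or the
    timeout expires, then signals a semaphore exactly once.
    """

    def __init__(self, sock, timeout_secs):
        self.sock = sock
        self.timeout_secs = timeout_secs
        self.semaphore = threading.Semaphore(0)
        self.readable = False

    def start(self):
        # The helper thread plays the role of the VM waiting in host code.
        threading.Thread(target=self._wait_in_host, daemon=True).start()
        return self

    def _wait_in_host(self):
        # One blocking call; the timeout is handled by select() itself.
        ready, _, _ = select.select([self.sock], [], [], self.timeout_secs)
        self.readable = bool(ready)
        self.semaphore.release()

    def wait(self):
        # The caller blocks once; there is no retry loop and no separate Delay.
        self.semaphore.acquire()
        return self.readable


def demo():
    """Return (result_without_data, result_with_data) for a local socket pair."""
    a, b = socket.socketpair()
    try:
        without_data = EventWaiter(a, 0.05).start().wait()  # times out
        b.sendall(b"ping")
        with_data = EventWaiter(a, 1.0).start().wait()      # data is pending
        return without_data, with_data
    finally:
        a.close()
        b.close()
```

In this sketch the semaphore is tied to one event (readability), so the waiter never has to re-test which event occurred; a timeout is simply reported as the event not having happened.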
I suspect the current implementation was heavily guided by the alternate programming model of MacTCP (and possibly a hesitancy to use MacOS threads; they may not even have existed when the initial design was conceived).

2 - representation of socket states and transport types

The current implementation models sockets in all states and transport types via a single Behavior. This complicates the primitives by forcing them to keep track of the state transitions and operation semantics, increases coupling between them, and exposes them to extremely subtle bugs.

My implementation models socket states and transport types as runtime objects in Smalltalk, where they are easier to understand. Instead of one Socket class, I use separate classes for server TCP sockets, incoming TCP client sockets (used by servers), outgoing TCP client sockets, and UDP sockets. These are all subclasses of a new NetResource class, whose subclasses also include my socket address resolver class (name resolution is performed with the same synchronization scheme used for sockets), as well as hardware ports like MIDI ports and serial ports. This class factoring makes the primitives simpler, shorter (and thus more efficient), and easier to port than those of the current implementation.

3 - Slang and handwritten host code

The current implementation uses Slang as far down as it can, with stubs for handwritten host code. I think this mixture makes the system harder to understand. I don't see much benefit in having part of the primitive source in Smalltalk when most of it is in handwritten C, at least until someone writes the IP stack itself in Smalltalk (I almost did this for the aforementioned custom hardware project...).

My implementation writes the primitives completely in handwritten C and uses the plugin interface for invocation (there's no reason for the network primitives to be statically linked into the interpreter). It seems simpler this way; it's certainly easier to debug.
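
Returning briefly to the class factoring of section 2, the following is a minimal sketch of the idea in Python, not the author's Smalltalk code. Only the name NetResource comes from the text; `ServerTcpSocket`, `IncomingTcpClient`, `OutgoingTcpClient`, `UdpSocket`, the intermediate `ConnectedTcpSocket`, and every method name are hypothetical. It shows how giving each socket kind its own class leaves each with only the operations that are valid for it, so nothing has to check the socket's state at run time.

```python
import socket


class NetResource:
    """Common base (name from the text); owns one host socket handle."""

    def __init__(self, handle):
        self.handle = handle

    def close(self):
        self.handle.close()


class ConnectedTcpSocket(NetResource):
    """Hypothetical helper class: behaviour shared by connected TCP sockets."""

    def send(self, data):
        self.handle.sendall(data)

    def receive(self, count):
        chunks = []
        while count > 0:
            chunk = self.handle.recv(count)
            if not chunk:
                break  # peer closed before sending everything
            chunks.append(chunk)
            count -= len(chunk)
        return b"".join(chunks)


class IncomingTcpClient(ConnectedTcpSocket):
    """Only ever created by ServerTcpSocket.accept(); it never connects."""


class OutgoingTcpClient(ConnectedTcpSocket):
    def __init__(self, host, port):
        super().__init__(socket.create_connection((host, port), timeout=5))


class ServerTcpSocket(NetResource):
    """Listens and accepts; it has no send or receive at all."""

    def __init__(self, host="127.0.0.1", port=0):
        handle = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        handle.bind((host, port))  # port 0 lets the host choose a free port
        handle.listen(1)
        super().__init__(handle)

    @property
    def port(self):
        return self.handle.getsockname()[1]

    def accept(self):
        connection, _peer = self.handle.accept()
        return IncomingTcpClient(connection)


class UdpSocket(NetResource):
    """Datagram socket; it offers addressed sends instead of connect/accept."""

    def __init__(self):
        super().__init__(socket.socket(socket.AF_INET, socket.SOCK_DGRAM))

    def send_to(self, data, address):
        return self.handle.sendto(data, address)


def demo():
    """Send five bytes from an outgoing client to an accepted incoming one."""
    server = ServerTcpSocket()
    outgoing = OutgoingTcpClient("127.0.0.1", server.port)
    incoming = server.accept()
    try:
        outgoing.send(b"hello")
        return incoming.receive(5)
    finally:
        for resource in (outgoing, incoming, server):
            resource.close()
```

Calling `send` on a listening socket is not a run-time state error in this sketch; the method simply does not exist on `ServerTcpSocket`.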
Moving on to my higher-level Smalltalk networking framework...

My framework is based on streams, whereas the official one is not. The NetResource class mentioned above is a subclass of a new ExternalResource class. Other subclasses include a new File class. Instances of classes in the ExternalResource hierarchy are designed to be used with a new NetStream class. Using NetStreams, one can use streams on all sorts of external resources with a minimal, consistent, and familiar message interface. Besides being easier to use, NetStreams on sockets end up copying data much less than the current sockets support.

The flow hierarchy itself uses NetStreams (on any external resource). It makes writing new clients and servers much easier, by keeping track of common bookkeeping and Process management details. I see a lot of these things repeated in the current implementations of POP and HTTP, for example. I think the "pluggable" scheme I wrote is a lot more effective than using a new Socket subclass for each protocol.
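
To make the "one stream interface over any external resource" idea concrete, here is a minimal sketch in Python, not the author's Smalltalk framework. The class names ExternalResource, NetResource, File, and NetStream come from the text; every method name (`read_into`, `write_from`, `next_put_all`, `next`) and the buffer handling are assumptions made for the example.

```python
import os
import socket
import tempfile


class ExternalResource:
    """Base of the hierarchy; the two method names here are hypothetical."""

    def read_into(self, buffer):
        raise NotImplementedError

    def write_from(self, data):
        raise NotImplementedError


class File(ExternalResource):
    def __init__(self, path, mode):
        self.handle = open(path, mode, buffering=0)  # unbuffered raw file

    def read_into(self, buffer):
        return self.handle.readinto(buffer) or 0

    def write_from(self, data):
        return self.handle.write(data)

    def close(self):
        self.handle.close()


class NetResource(ExternalResource):
    def __init__(self, sock):
        self.sock = sock

    def read_into(self, buffer):
        return self.sock.recv_into(buffer)

    def write_from(self, data):
        return self.sock.send(data)

    def close(self):
        self.sock.close()


class NetStream:
    """The same small stream protocol, whatever the resource underneath."""

    def __init__(self, resource, buffer_size=4096):
        self.resource = resource
        self.buffer = bytearray(buffer_size)  # one buffer reused for all reads

    def next_put_all(self, data):
        view = memoryview(data)
        while len(view):
            view = view[self.resource.write_from(view):]

    def next(self, count):
        out = bytearray()
        while len(out) < count:
            want = min(count - len(out), len(self.buffer))
            got = self.resource.read_into(memoryview(self.buffer)[:want])
            if got == 0:
                break  # end of file, or the peer closed the connection
            out += self.buffer[:got]
        return bytes(out)

    def close(self):
        self.resource.close()


def demo():
    """Round-trip bytes through a file and a socket using the same calls."""
    results = []
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sample.bin")
        writer = NetStream(File(path, "wb"))
        writer.next_put_all(b"to a file")
        writer.close()
        reader = NetStream(File(path, "rb"))
        results.append(reader.next(9))
        reader.close()
    a, b = socket.socketpair()
    left, right = NetStream(NetResource(a)), NetStream(NetResource(b))
    left.next_put_all(b"to a socket")
    results.append(right.next(11))
    left.close()
    right.close()
    return results
```

Client code in `demo` uses the same two messages for the file and for the socket, which is the consistency the paragraph above describes; a protocol client written against `NetStream` would not need to know which kind of resource it was given.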