packet, in the form of the buffer, then moves through the network stack until it reaches the socket layer, where it is copied out of the kernel when the user’s program calls read(). Data sent by the program is handled in a similar way by the kernel, in that kernel buffers are eventually added to the transmit descriptor ring and a flag is then set to tell the device that it can place the data in the buffer on the network.
All of this work in the kernel leaves the last copy problem unsolved; several attempts have been made to extend the sockets API to remove this copy operation. 3,1 The problem remains as to how memory can safely be shared across the user/kernel boundary. The kernel cannot give its memory to the user program, because at that point it loses control over the memory. A user program that crashes may leave the kernel without a significant chunk of usable memory, leading to system performance degradation. There are also security issues inherent in sharing memory buffers across the kernel/user boundary. At this point there is no single answer to how a user program might achieve higher bandwidth using the sockets API.
For programmers who are more concerned with latency than with bandwidth, even less has been done. The only significant improvement for programs that are waiting for a network event has been the addition of a set of kernel events that a program can wait on. Kernel events, or kevents(), are an extension of the select() mechanism to encompass any possible event that the kernel might be able to tell the program about. Before the advent of kevents(), a user program could call select() on any file descriptor, which would let the program know when any of a set of file descriptors was readable, writ-able, or had an error. When programs were written to sit in a loop and wait on a set of file descriptors—for example, reading from the network and writing to disk—the select() call was sufficient, but once a program wanted to check for other events, such as timers and signals, select() no longer served. The problem for low-latency applica-
tions is that kevents() do not deliver data; they deliver only a signal that data is ready, just as the select() call did. The next logical step would be to have an event-based API that also delivers data. There is no good reason to have the application cross the user/kernel boundary twice simply to get the data that the kernel knows the application wants.
The sockets API not only presents performance problems to the application writer, but also narrows the type of communication that can take place. The client/server paradigm is inherently a 1:1 type of communication. Although a server may handle requests from a diverse group of clients, each client has only one connection to a single server for a request or set of requests. In a world in which each computer had only one network interface, that paradigm made perfect sense. A connection between a client and server is identified by a quad of <Source IP, Source Port, Destination IP, Destination Port>. Since services generally have a well-known destination port (e.g., 80 for HTTP), the only value that can easily vary is the source port, since the IP addresses are fixed.
In the Internet of 1982 each machine that was not a router had only a single network interface, meaning that to identify a service, such as a remote printer, the client computer needed a single destination address and port and had, itself, only a single source address and port to work with. The idea that a computer might have multiple ways of reaching a service was too complicated and far too expensive to implement. Given these constraints, there was no reason for the sockets API to expose to the programmer the ability to write a multihomed program— one that could manage which interfaces or connections mattered to it. Such features, when they were implemented, were a part of the routing software within the operating system. The only way programs could, eventually, get access to them was through an obscure set of nonstandard kernel APIs called a routing socket.
On a system with multiple network interfaces it is not possible, using the standard sockets API, to write an application that can easily be multihomed—that is, take advantage of both interfaces so that if one were to fail, or if the primary route over which the packets were flowing were to break, the application would not lose its connection to the server.
The recently developed SCTP (Stream Control Transport Protocol) 4 incorporates support for multihoming at the protocol level, but it is impossible to export this support through the sockets API. Several ad-hoc system calls
References:
Archives