Sun Microsystems, Inc.

Zero Copy and Checksum Off-load

In SunOS version 5.6 and compatible versions, the TCP/IP protocol stack has been enhanced to support two new features: zero copy and TCP checksum off-load.

  • Zero copy uses virtual memory MMU remapping and a copy-on-write technique to move data between the application and the kernel space.

  • Checksum off-loading relies on special hardware logic to off-load the TCP checksum calculation.

Although zero copy and checksum off-loading are functionally independent of each other, they have to work together to obtain the optimal performance. Checksum off-loading requires hardware support from the network interface. Without this hardware support, zero copy is not enabled.

Zero copy requires that the applications supply page-aligned buffers before applying virtual memory page remapping. Applications should use large, circular buffers on the transmit side to avoid expensive copy-on-write faults. A typical buffer allocation is sixteen 8K buffers.
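The buffer layout described above can be sketched as follows. This is an illustration only, not code from the system: the names tx_ring, tx_ring_init, tx_ring_next, and tx_ring_free are hypothetical, and posix_memalign() is assumed to be available (on older releases, memalign(3C) or valloc(3C) would play the same role). Whether zero copy actually takes effect still depends on the hardware support described earlier.

```c
#include <stdlib.h>
#include <unistd.h>

#define RING_SLOTS 16           /* "sixteen 8K buffers" from the text */
#define SLOT_BYTES (8 * 1024)

/* Hypothetical ring of page-aligned transmit buffers. */
struct tx_ring {
    char  *base;    /* page-aligned start of the whole region */
    size_t slot;    /* bytes per slot, a multiple of the page size */
    int    next;    /* index of the slot to hand out next */
};

/* Allocate the ring; returns 0 on success, -1 on failure. */
int tx_ring_init(struct tx_ring *r)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;
    /* Round the slot size up so every slot starts on a page boundary. */
    r->slot = ((SLOT_BYTES + (size_t)page - 1) / (size_t)page) * (size_t)page;
    if (posix_memalign(&p, (size_t)page, r->slot * RING_SLOTS) != 0)
        return -1;
    r->base = p;
    r->next = 0;
    return 0;
}

/*
 * Hand out slots in circular order, so a buffer is not rewritten
 * until all the other slots have been used.
 */
char *tx_ring_next(struct tx_ring *r)
{
    char *buf = r->base + (size_t)r->next * r->slot;

    r->next = (r->next + 1) % RING_SLOTS;
    return buf;
}

/* Release the ring's memory. */
void tx_ring_free(struct tx_ring *r)
{
    free(r->base);
    r->base = NULL;
}
```

An application would fill the buffer returned by tx_ring_next() and pass it to write(2) or send(3SOCKET). Cycling through the slots gives the kernel time to finish with a page before the application touches it again, which is what avoids the copy-on-write fault.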

Socket Options

You can set and get several options on sockets through setsockopt(3SOCKET) and getsockopt(3SOCKET). For example, you can change the send or receive buffer space. The general forms of the calls are:

setsockopt(s, level, optname, optval, optlen);


getsockopt(s, level, optname, optval, optlen);

In some cases, such as setting the buffer sizes, these are only hints to the operating system. The operating system can adjust the values appropriately at any time.
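Because a buffer size is only a hint, a program that cares about the result can read the value back after setting it. The following is a minimal sketch of that pattern; the function name request_send_buffer is hypothetical, and the value the system grants can differ from the request.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask for a send buffer of `requested` bytes on socket s, then read
 * back the size the system reports. Returns 0 on success and stores
 * the reported size in *granted; returns -1 on error.
 */
int request_send_buffer(int s, int requested, int *granted)
{
    socklen_t len = sizeof (*granted);

    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF,
        &requested, sizeof (requested)) < 0)
        return -1;
    /* len is value-result: size of *granted going in, bytes used coming out. */
    if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, granted, &len) < 0)
        return -1;
    return 0;
}
```

A caller might request 32768 bytes and then size its writes according to the value returned in granted rather than the value it asked for.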

The arguments of the setsockopt(3SOCKET) and getsockopt(3SOCKET) calls are:

s
Socket on which the option is to be applied

level
Specifies the protocol level, such as socket level, indicated by the symbolic constant SOL_SOCKET in sys/socket.h

optname
Symbolic constant defined in sys/socket.h that specifies the option

optval
Points to the value of the option

optlen
Points to the length of the value of the option

For getsockopt(3SOCKET), optlen is a value-result argument, initially set to the size of the storage area pointed to by optval and set on return to the length of storage used.

A program sometimes needs to determine the type (for example, stream or datagram) of an existing socket. Programs invoked by inetd(1M) can do this by using the SO_TYPE socket option and the getsockopt(3SOCKET) call:

#include <sys/types.h>
#include <sys/socket.h>
int type;
socklen_t size = sizeof (type);
if (getsockopt(s, SOL_SOCKET, SO_TYPE, (char *) &type, &size) < 0) {
        /* handle the error, for example by reporting it with perror(3C) */
}

After getsockopt(3SOCKET), type is set to the value of the socket type, as defined in sys/socket.h. For a datagram socket, type would be SOCK_DGRAM.
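For reference, the same query can be wrapped in a small helper. This is a sketch under the assumption that returning -1 is an acceptable error signal; the name socket_type is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Return the type of socket s (SOCK_STREAM, SOCK_DGRAM, and so on),
 * or -1 if s is not a valid socket.
 */
int socket_type(int s)
{
    int type;
    socklen_t size = sizeof (type);

    if (getsockopt(s, SOL_SOCKET, SO_TYPE, &type, &size) < 0)
        return -1;
    return type;
}
```

A server started by inetd(1M) could call socket_type(0) to decide whether to treat its standard input as a connected stream or as a datagram socket.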

inetd Daemon

The inetd(1M) daemon is invoked at startup time and gets the services for which it listens from the /etc/inet/inetd.conf file. The daemon creates one socket for each service listed in /etc/inet/inetd.conf, binding the appropriate port number to each socket. See the inetd(1M) man page for details.

The inetd(1M) daemon polls each socket, waiting for a connection request to the service corresponding to that socket. For SOCK_STREAM type sockets, inetd(1M) accepts (accept(3SOCKET)) on the listening socket, forks (fork(2)), duplicates (dup(2)) the new socket to file descriptors 0 and 1 (stdin and stdout), closes other open file descriptors, and executes (exec(2)) the appropriate server.
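The sequence for a SOCK_STREAM service can be sketched roughly as follows. This is not the daemon's actual source: the function names are hypothetical, dup2(2) is used in place of dup(2) for brevity, and only the connection descriptor is closed in the child, whereas the text above says that inetd(1M) closes all other open descriptors.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Run the program at `path` with the connected socket `conn` as its
 * standard input and output. Returns the child's pid, or -1 on error.
 */
pid_t spawn_on_socket(int conn, const char *path, char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {                     /* child */
        if (dup2(conn, 0) < 0 || dup2(conn, 1) < 0)
            _exit(127);
        if (conn > 1)
            (void) close(conn);         /* descriptors 0 and 1 remain */
        execv(path, argv);
        _exit(127);                     /* reached only if execv failed */
    }
    return pid;                         /* parent: child's pid, or -1 */
}

/* Accept one connection on `listener` and hand it to a new server. */
pid_t accept_and_spawn(int listener, const char *path, char *const argv[])
{
    int conn = accept(listener, NULL, NULL);
    pid_t pid;

    if (conn < 0)
        return -1;
    pid = spawn_on_socket(conn, path, argv);
    (void) close(conn);                 /* the child holds its own copies */
    return pid;
}
```

The parent closes its copy of the connected socket after the fork so that the connection ends when the server process exits.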

The primary benefit of using inetd(1M) is that services not in use do not consume machine resources. A secondary benefit is that inetd(1M) does most of the work to establish a connection. The server started by inetd(1M) has the socket connected to its client on file descriptors 0 and 1, and can immediately read, write, send, or receive. Servers can use buffered I/O as provided by the stdio conventions, as long as they use fflush(3C) when appropriate.
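As an illustration of the buffered I/O point, a line-oriented server started by inetd(1M) might look like the sketch below. The function name echo_lines is hypothetical; the important detail is the fflush(3C) call after each reply, without which the output could remain in the stdio buffer while the client waits.

```c
#include <stdio.h>

/*
 * Echo each line read from `in` back on `out`, flushing after every
 * line so the reply leaves the stdio buffer and reaches the client.
 * Returns the number of fgets() reads echoed (a line longer than the
 * buffer counts once per chunk).
 */
int echo_lines(FILE *in, FILE *out)
{
    char line[1024];
    int count = 0;

    while (fgets(line, sizeof (line), in) != NULL) {
        if (fputs(line, out) == EOF || fflush(out) == EOF)
            break;
        count++;
    }
    return count;
}
```

Because the connection is already on file descriptors 0 and 1, the server's main() would only need to call echo_lines(stdin, stdout).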

The getpeername(3SOCKET) routine returns the address of the peer (process) connected to a socket. This routine is useful in servers started by inetd(1M). For example, a server could use this routine to log the client's Internet address in its conventional text form, such as fec0::56:a00:20ff:fe7d:3dd2 for an IPv6 client. An inetd(1M) server could use the following sample code:

    /*
     * Needs <sys/socket.h>, <netinet/in.h>, <arpa/inet.h>, <syslog.h>,
     * <stdio.h>, and <stdlib.h>.
     */
    struct sockaddr_storage name;
    socklen_t namelen = sizeof (name);
    char abuf[INET6_ADDRSTRLEN];
    struct in6_addr addr6;
    struct in_addr addr;

    if (getpeername(fd, (struct sockaddr *)&name, &namelen) == -1) {
        perror("getpeername");
        exit(1);
    } else {
        addr = ((struct sockaddr_in *)&name)->sin_addr;
        addr6 = ((struct sockaddr_in6 *)&name)->sin6_addr;
        if (name.ss_family == AF_INET) {
                (void) inet_ntop(AF_INET, &addr, abuf, sizeof (abuf));
        } else if (name.ss_family == AF_INET6 &&
                   IN6_IS_ADDR_V4MAPPED(&addr6)) {
                /* this is an IPv4-mapped IPv6 address */
                IN6_MAPPED_TO_IN(&addr6, &addr);
                (void) inet_ntop(AF_INET, &addr, abuf, sizeof (abuf));
        } else if (name.ss_family == AF_INET6) {
                (void) inet_ntop(AF_INET6, &addr6, abuf, sizeof (abuf));
        }
        /* syslog(3C) takes a priority as its first argument */
        syslog(LOG_INFO, "Connection from %s", abuf);
    }

Broadcasting and Determining Network Configuration

Broadcasting is not supported in IPv6. It is supported only in IPv4.

Messages sent by datagram sockets can be broadcast to reach all of the hosts on an attached network. The network must support broadcast because the system provides no simulation of broadcast in software. Broadcast messages can place a high load on a network because they force every host on the network to service them. Broadcasting is usually used for either of two reasons:

  • To find a resource on a local network without having its address

  • For functions like routing that require information to be sent to all accessible neighbors

To send a broadcast message, create an Internet datagram socket:

s = socket(AF_INET, SOCK_DGRAM, 0);

and bind a port number to the socket:

sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl(INADDR_ANY);
sin.sin_port = htons(MYPORT);
bind(s, (struct sockaddr *) &sin, sizeof sin);

The datagram can be broadcast on only one network by sending to the network's broadcast address. A datagram can also be broadcast on all attached networks by sending to the special address INADDR_BROADCAST, defined in netinet/in.h.
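A send to INADDR_BROADCAST might be sketched as follows. The function names are hypothetical, and the SO_BROADCAST step is an addition not covered in the text above: most systems refuse to send to a broadcast address unless this socket-level option has been set.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Build the destination address for a broadcast on all attached networks. */
struct sockaddr_in broadcast_dest(unsigned short port)
{
    struct sockaddr_in dst;

    memset(&dst, 0, sizeof (dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);
    dst.sin_port = htons(port);
    return dst;
}

/*
 * Send len bytes from msg as a broadcast datagram on socket s.
 * Returns the number of bytes sent, or -1 on error.
 */
ssize_t send_broadcast(int s, unsigned short port, const void *msg, size_t len)
{
    int on = 1;
    struct sockaddr_in dst = broadcast_dest(port);

    /* Assumption: the system requires SO_BROADCAST before such a send. */
    if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof (on)) < 0)
        return -1;
    return sendto(s, msg, len, 0, (struct sockaddr *)&dst, sizeof (dst));
}
```

To broadcast on a single network instead, the caller would place that network's broadcast address in sin_addr rather than INADDR_BROADCAST.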

The system provides a mechanism to determine a number of pieces of information about the network interfaces on the system, including the IP address and broadcast address. The SIOCGIFCONF ioctl(2) call returns the interface configuration of a host in a single ifconf structure. This structure contains an array of ifreq structures, one for each address family supported by each network interface to which the host is connected.

The following example shows the ifreq structure defined in net/if.h.

Example 6-14 net/if.h Header File

struct ifreq {
#define IFNAMSIZ 16
	char ifr_name[IFNAMSIZ]; /* if name, e.g., "en0" */
	union {
		struct sockaddr ifru_addr;
		struct sockaddr ifru_dstaddr;
		char ifru_oname[IFNAMSIZ]; /* other if name */
		struct sockaddr ifru_broadaddr;
		short ifru_flags;
		int ifru_metric;
		char ifru_data[1]; /* interface dependent data */
		char ifru_enaddr[6];
	} ifr_ifru;
#define ifr_addr ifr_ifru.ifru_addr
#define ifr_dstaddr ifr_ifru.ifru_dstaddr
#define ifr_oname ifr_ifru.ifru_oname
#define ifr_broadaddr ifr_ifru.ifru_broadaddr
#define ifr_flags ifr_ifru.ifru_flags
#define ifr_metric ifr_ifru.ifru_metric
#define ifr_data ifr_ifru.ifru_data
#define ifr_enaddr ifr_ifru.ifru_enaddr
};
