Sun Microsystems, Inc.
spacer |
black dot
9.  Real-time Programming and Administration Memory Locking Locking a Page   Previous   Contents   Next 

Unlocking a Page

To unlock a page of memory, a process requests the release of a segment of locked virtual pages by a calling munlock(3C), which decrements the lock counts of the specified physical pages. After decrementing a page's lock count to 0, the page swaps normally.

Locking All Pages

A superuser process can request that all mappings within its address space be locked by a call to mlockall(3C). If the flag MCL_CURRENT is set, all the existing memory mappings are locked. If the flag MCL_FUTURE is set, every mapping that is added to or that replaces an existing mapping is locked into memory.

Sticky Locks

A page is permanently locked into memory when its lock count reaches 65535 (0xFFFF). The value 0xFFFF is defined by implementation and might change in future releases. Pages locked in this manner cannot be unlocked. Reboot the system to recover.

High Performance I/O

This section describes I/O with real-time processes. In SunOS, the libraries supply two sets of interfaces and calls to perform fast, asynchronous I/O operations. The POSIX asynchronous I/O interfaces are the most recent standard. The SunOS environment also provides file and in-memory synchronization operations and modes to prevent information loss and data inconsistency.

Standard UNIX I/O is synchronous to the application programmer. An application that calls read(2) or write(2) usually waits until the system call has finished.

Real-time applications need asynchronous, bounded I/O behavior. A process that issues an asynchronous I/O call proceeds without waiting for the I/O operation to complete. The caller is notified when the I/O operation has finished.

Asynchronous I/O can be used with any SunOS file. Files are opened synchronously and no special flagging is required. An asynchronous I/O transfer has three elements: call, request, and operation. The application calls an asynchronous I/O interface, the request for the I/O is placed on a queue, and the call returns immediately. At some point, the system dequeues the request and initiates the I/O operation.

Asynchronous and standard I/O requests can be intermingled on any file descriptor. The system maintains no particular sequence of read and write requests. The system arbitrarily resequences all pending read and write requests. If a specific sequence is required for the application, the application must insure the completion of prior operations before issuing the dependent requests.

POSIX Asynchronous I/O

POSIX asynchronous I/O is performed using aiocb structures. An aiocb control block identifies each asynchronous I/O request and contains all of the controlling information. A control block can be used for only one request at a time and can be reused after its request has been completed.

A typical POSIX asynchronous I/O operation is initiated by a call to aio_read(3RT) or aio_write(3RT). Either polling or signals can be used to determine the completion of an operation. If signals are used for completing operations, each operation can be uniquely tagged. The tag is then returned in the si_value component of the generated signal. See the siginfo(3HEAD) man page.


aio_read(3RT) is called with an asynchronous I/O control block to initiate a read operation.


aio_write(3RT) is called with an asynchronous I/O control block to initiate a write operation.

aio_return, aio_error

aio_return(3RT) and aio_error(3RT) are called to obtain return and error values, respectively, after an operation is known to have completed.


aio_cancel(3RT) is called with an asynchronous I/O control block to cancel pending operations. It can be used to cancel a specific request, if the control block specifies one, or all of the requests pending for the specified file descriptor.


aio_fsync(3RT) queues an asynchronous fsync(3C) or fdatasync(3RT) request for all of the pending I/O operations on the specified file.


aio_suspend(3RT) suspends the caller as though one, or more, of the preceding asynchronous I/O requests had been made synchronously.

Solaris Asynchronous I/O

This section discusses asynchronous I/O operations in the Solaris operating environment.

Notification (SIGIO)

When an asynchronous I/O call returns successfully, the I/O operation has only been queued and waits to be done. The actual operation has a return value and a potential error identifier, which would have been returned to the caller if the call had been synchronous. When the I/O is finished, both the return and error values are stored at a location given by the user at the time of the request as a pointer to an aio_result_t. The structure of the aio_result_t is defined in <sys/asynch.h>:

typedef struct aio_result_t {
 	ssize_t	aio_return; /* return value of read or write */
 	int 		aio_errno;  /* errno generated by the IO */
 } aio_result_t;

When the aio_result_t has been updated, a SIGIO signal is delivered to the process that made the I/O request.

Note that a process with two or more asynchronous I/O operations pending has no certain way to determine which request, if any, is the cause of the SIGIO signal. A process receiving a SIGIO should check all its conditions that could be generating the SIGIO signal.


The aioread(3AIO) routine is the asynchronous version of read(2). In addition to the normal read arguments, aioread(3AIO) takes the arguments specifying a file position and the address of an aio_result_t structure in which the system stores the result information about the operation. The file position specifies a seek to be performed within the file before the operation. Whether the aioread(3AIO) call succeeds or fails, the file pointer is updated.


The aiowrite(3AIO) routine is the asynchronous version of write(2). In addition to the normal write arguments, aiowrite(3AIO) takes arguments specifying a file position and the address of an aio_result_t structure in which the system is to store the resulting information about the operation.

The file position specifies that a seek operation is to be performed within the file before the operation. If the aiowrite(3AIO) call succeeds, the file pointer is updated to the position that would have resulted in a successful seek and write. The file pointer is also updated when a write fails to allow for subsequent write requests.


The aiocancel(3AIO) routine attempts to cancel the asynchronous request whose aio_result_t structure is given as an argument. An aiocancel(3AIO) call succeeds only if the request is still queued. If the operation is in progress, aiocancel(3AIO) fails.


A call to aiowait(3AIO) blocks the calling process until at least one outstanding asynchronous I/O operation is completed. The timeout parameter points to a maximum interval to wait for I/O completion. A timeout value of zero specifies that no wait is wanted. aiowait(3AIO) returns a pointer to the aio_result_t structure for the completed operation.


To determine the completion of an asynchronous I/O event synchronously rather than depend on a SIGIO interrupt, use poll(2). You can also poll to determine the origin of a SIGIO interrupt.

Use of poll(2) for very large numbers of files is slow. This problem is resolved by poll(7D).


Using /dev/poll provides a highly scalable way of polling a large number of file descriptors. This scalability is provided through a new set of APIs and a new driver, /dev/poll. The /dev/poll API is an alternative to, not a replacement of, poll(2). Use poll(7D) to provide details and examples of the /dev/poll API. When used properly, the /dev/poll API scales much better than poll(2). This API is especially suited for applications that satisfy the following criteria:

  • Applications that repeatedly poll a large number of file descriptors

  • Polled file descriptors that are relatively stable, meaning that they are not constantly closed and reopened

  • The set of file descriptors that actually have polled events pending is small, comparing to the total number of file descriptors being polled


Files are closed by calling close(2). The call to close(2) cancels any outstanding asynchronous I/O request that can be closed. close(2) waits for an operation that cannot be cancelled (see "aiocancel"). When close(2) returns, no asynchronous I/O is pending for the file descriptor. Only asynchronous I/O requests queued to the specified file descriptor are cancelled when a file is closed. Any I/O pending requests for other file descriptors are not cancelled.

Synchronized I/O

Applications might need to guarantee that information has been written to stable storage, or that file updates are performed in a particular order. Synchronized I/O provides for these needs.

Synchronization Modes

Under SunOS, data is successfully transferred to a file for a write operation when the system ensures that all written data is readable after any subsequent open of the file. This check assumes no failure of the physical storage medium, even one that follows a system or power failure. Data is successfully transferred for a read operation when an image of the data on the physical storage medium is available to the requesting process. An I/O operation is complete when either the associated data has been successfully transferred or the operation has been diagnosed as unsuccessful.

An I/O operation has reached synchronized I/O data integrity completion when:

  • For reads, the operation has been completed or diagnosed if unsuccessful. The read is complete only when an image of the data has been successfully transferred to the requesting process. If the synchronized read operation is requested at a time when there are pending write requests affecting the data to be read, these write requests are successfully completed prior to reading the data.

  • For writes, the operation has been completed or diagnosed if unsuccessful. The write is complete only when the data specified in the write request is successfully transferred, and all file system information required to retrieve the data is successfully transferred.

  • File attributes that are not necessary for data retrieval , such as access time, modification time, status change time, are not transferred prior to returning to the calling process.

  • Synchronized I/O file integrity completion is identical to synchronized I/O data integrity completion with the addition that all file attributes relative to the I/O operation, including access time, modification time, and status change time must be successfully transferred prior to returning to the calling process.

  Previous   Contents   Next