Message Linkage

A complex message can consist of several linked message blocks. If buffer size is limited or if processing expands the message, multiple message blocks are formed in the message, as shown in Figure 7-2. When a message is composed of multiple message blocks, the type associated with the first message block determines the overall message type, regardless of the types of the attached message blocks.

Figure 7-2 Linked Message Blocks

Queued Messages

A put procedure processes single messages immediately and can pass the message to the next module's put procedure using put or putnext. Alternatively, the message is linked on the message queue for later processing, to be processed by a module's service procedure (putq(9F)). Note that only the first module of a set of linked modules is linked to the next message in the queue.

Think of linked message blocks as a concatenation of messages. Queued messages are a linked list of individual messages that can also be linked message blocks.

Figure 7-3 Queued Messages

In Figure 7-3 messages are queued: Message 1 being the first message on the queue, followed by Message 2 and Message 3. Notice that Message 1 is a linked message consisting of more than one mblk.

Caution - Modules or drivers must not modify b_next and b_prev. These fields are modified by utility routines such as putq(9F) and getq(9F).

Shared Data

In Figure 7-4, two message blocks are shown pointing to one data block. db_ref indicates that there are two references (mblks) to the data block. db_base and db_lim point to an address range in the data buffer. The b_rptr and b_wptr of both message blocks must fall within the assigned range specified by the data block.

Figure 7-4 Shared Data Block

Data blocks are shared using utility routines (see dupmsg(9F) or dupb(9F)). STREAMS maintains a count of the message blocks sharing a data block in the db_ref field.

These two mblks share the same data and datablock. If a module changes the contents of the data or message type, it is visible to the owner of the message block.

When modifying data contained in the dblk or data buffer, if the reference count of the message is greater than one, the module should copy the message using copymsg(9F), free the duplicated message, and then change the appropriate data.

Note - Hardening Information. At Sun, it is assumed that a message with a db_ref > 1 is a "read-only" message and can be read but not modified. If the module wishes to modify the data, it should first copy the message, and free the original:

if ( dbp->db_ref > 1 ) {
	dblk_t	*newdbp;

	/* Get a copy of the message */
	newdbp = copymsg(dbp);

	/* Free the original */
	freemsg(dbp);

	/* make sure that we are now using the new dbp */
	dbp = newdbp;
}

STREAMS provides utility routines and macros (identified in Appendix B, Kernel Utility Interface Summary) to assist in managing messages and message queues, and in other areas of module and driver development. Always use utility routines to operate on a message queue or to free or allocate messages. If messages are manipulated in the queue without using the STREAMS utilities, the message ordering can become confused and cause inconsistent results.

Caution - Not adhering to the DDI/DKI can result in panics and system crashes.

Sending and Receiving Messages

Among the message types, the most commonly used messages are M_DATA, M_PROTO, and M_PCPROTO. These messages can be passed between a process and the topmost module in a stream, with the same message boundary alignment maintained between user and kernel space. This allows a user process to function, to some degree, as a module above the stream and maintain a service interface. M_PROTO and M_PCPROTO messages carry service interface information among modules, drivers, and user processes.

Modules and drivers do not interact directly with any interfaces except open(2) and close(2). The stream head translates and passes all messages between user processes and the uppermost STREAMS module. Message transfers between a process and the stream head can occur in different forms. For example, M_DATA and M_PROTO messages can be transferred in their direct form by getmsg(2) and putmsg(2). Alternatively, write(2) creates one or more M_DATA messages from the data buffer supplied in the call. M_DATA messages received at the stream head are consumed by read(2) and copied into the user buffer.

Any module or driver can send any message in either direction on a stream. However, based on their intended use in STREAMS and their treatment by the stream head, certain messages can be categorized as upstream, downstream, or bidirectional. For example, M_DATA, M_PROTO, or M_PCPROTO messages can be sent in both directions. Other message types such as M_IOACK are sent upstream to be processed only by the stream head. Messages to be sent downstream are silently discarded if received by the stream head. Table 7-1 and Table 7-2 indicate the intended direction of message types.

STREAMS enables modules to create messages and pass them to neighboring modules. read(2) and write(2) are not enough to enable a user process to generate and receive all messages. In the first place, read(2) and write(2) are byte-stream oriented with no concept of message boundaries. The message boundary of each service primitive must be preserved so that the start and end of each primitive can be located in order to support service interfaces. Furthermore, read(2) and write(2) offer only one buffer to the user for transmitting and receiving STREAMS messages. If control information and data is placed in a single buffer, the user has to parse the contents of the buffer to separate the data from the control information. Furthermore, read(2) and write(2) offer only one buffer to the user for transmitting and receiving STREAMS messages. If control information and data is placed in a single buffer, the user has to parse the contents of the buffer to separate the data from the control information.

getmsg(2) and putmsg(2) enable a user process and the stream to pass data and control information between one another while maintaining distinct message boundaries.

Data Alignment

Note - Hardening Information. There is no guarantee in STREAMS that a b_rptr or b_wptr will fall on a proper bit alignment. Most modules that pass data structures with pointers try to retain the desired bit alignment. If the module is in a stream where this is reasonably guaranteed, it does not need to check data alignment. However, for the purpose of hardening, modules that are concerned about data alignment should verify that pointers are properly aligned, or copy data in mblks to local structures that are properly aligned (see bcopy(3C)).

Note - Hardening Information. Ensure that the changing of pointers is uniform (b_rptr <= b_rptr). Keep pointers inside db_base and db_lim. It is easier to recover from an error if b_rptr and b_wptr are inside db_base and db_lim.

When a module changes the b_rptr and/or the b_wptr, it should verify the following relationship:

db_base <= b_rptr <= b_wptr <= db_lim

and

db_base < db_lim

Message Queues and Message Priority

Message queues grow when the STREAMS scheduler is delayed from calling a service procedure by system activity, or when the procedure is blocked by flow control. When called by the scheduler, a module's service procedure processes queued messages in a FIFO manner (getq(9F)). However, some messages associated with certain conditions, such as M_ERROR, must reach their stream destination as rapidly as possible. This is accomplished by associating priorities with the messages. These priorities imply a certain ordering of messages in the queue, as shown in Figure 7-5.

Each message has a priority band associated with it. Ordinary messages have a priority band of zero. The priority band of high-priority messages is ignored since they are high priority and thus not affected by flow control. putq(9F) places high-priority messages at the head of the message queue, followed by priority band messages (expedited data) and ordinary messages.

Figure 7-5 Message Ordering in a Queue

When a message is queued, it is placed after the messages of the same priority already in the queue (in other words, FIFO within their order of queueing). This affects the flow-control parameters associated with the band of the same priority. Message priorities range from 0 (normal) to 255 (highest). This provides up to 256 bands of message flow within a stream. An example of how to implement expedited data would be with one extra band of data flow (priority band 1), is shown in Figure 7-6. Queues are explained in detail in the next section.

Figure 7-6 Message Ordering with One Priority Band

High-priority messages are not subject to flow control. When they are queued by putq(9F), the associated queue is always scheduled, even if the queue has been disabled (noenable(9F)). When the service procedure is called by the stream's scheduler, the procedure uses getq(9F) to retrieve the first message on queue, which is a high-priority message. Service procedures must be implemented to act on high-priority messages immediately. The mechanisms just mentioned--priority message queueing, absence of flow control, and immediate processing by a procedure--result in rapid transport of high-priority messages between the originating and destination components in the stream.

In general, high-priority messages should be processed immediately by the module's put procedure and not placed on the service queue.

Caution - A service procedure must never queue a high-priority message on its own queue because an infinite loop results. The enqueuing triggers the queue to be immediately scheduled again.

Message Queues

The queue is the fundamental component of a stream. It is the interface between a STREAMS module and the rest of the stream, and is the repository for deferred message processing. For each instance of an open driver or pushed module or stream head, a pair of queues is allocated, one for the read side of the stream and one for the write side.

The RD(9F), WR(9F), and OTHERQ(9F) routines allow reference of one queue from the other. Given a queue, RD(9F) returns a pointer to the read queue, WR(9F) returns a pointer to the write queue and OTHERQ(9F) returns a pointer to the opposite queue of the pair (see QUEUE(9S)).

By convention, queue pairs are depicted graphically as side- by-side blocks, with the write queue on the left and the read queue on the right (see Figure 7-7).

Figure 7-7 Queue Pair Allocation

`queue`(9S) Structure

As previously discussed, messages are ordered in message queues. Message queues, message priority, service procedures, and basic flow control all combine in STREAMS. A service procedure processes the messages in its queue. If there is no service procedure for a queue, putq(9F) does not schedule the queue to be run. The module developer must ensure that the messages in the queue are processed. Message priority and flow control are associated with message queues.

The queue structure is defined in stream.h as a typedef queue_t, and has the following public elements:

struct  qinit   *q_qinfo;	/* procs and limits for queue */
struct  msgb    *q_first;	/* first data block */
struct  msgb    *q_last;	/* last data block */
struct  queue   *q_next;	/* Q of next stream */
struct  queue   *q_link;	/* to next Q for scheduling */
void            *q_ptr;		/* to private data structure */
size_t          q_count;	/* number of bytes on Q */
uint            q_flag;		/* queue state */
ssize_t         q_minpsz;	/* min packet size accepted by this module */
ssize_t         q_maxpsz;	/* max packet size accepted by this module */
size_t          q_hiwat;	/* queue high-water mark */
size_t          q_lowat;	/* queue low-water mark */

q_first points to the first message on the queue, and q_last points to the last message on the queue. q_count is used in flow control and contains the total number of bytes contained in normal and high-priority messages in band 0 of this queue. Each band is flow controlled individually and has its own count. For more details, see "qband Structure". qsize(9F) can be used to determine the total number of messages on the queue. q_flag indicates the state of the queue. See Table 7-3 for the definitions of these flags.

q_minpsz contains the minimum packet size accepted by the queue, and q_maxpsz contains the maximum packet size accepted by the queue. These are suggested limits, and some implementations of STREAMS may not enforce them. The SunOS stream head enforces these values but they are voluntary at the module level. You should design modules to handle messages of any size.

q_hiwat indicates the limiting maximum number of bytes that can be put on a queue before flow control occurs. q_lowat indicates the lower limit where STREAMS flow control is released.

q_ptr is the element of the queue structure where modules can put values or pointers to data structures that are private to the module. This data can include any information required by the module for processing messages passed through the module, such as state information, module IDs, routing tables, and so on. Effectively, this element can be used any way the module or driver writer chooses. q_ptr can be accessed or changed directly by the driver, and is typically initialized in the open(9E) routine.

When a queue pair is allocated, streamtab initializes q_qinfo, and module_info initializes q_minpsz, q_maxpsz, q_hiwat, and q_lowat. Copying values from the module_info structure enables them to be changed in the queue without modifying the streamtab and module_info values.

Table 7-3 lists the queue(9S) flags.

Table 7-3 Queue Flags

Flag	Description
`QENAB`	The queue is enabled to run the service procedure
`QWANTR`	Someone wants to read queue
`QWANTW`	Someone wants to write to queue
`QFULL`	The queue is full
`QREADR`	Set for all read queues
`QUSE`	This queue in use (allocation)
`QNOENB`	Do not enable the queue when data is placed on it


7. STREAMS Framework - Kernel Level Kernel-Level Messages Message Structure