Sun Microsystems, Inc.
spacerspacer
spacer www.sun.com docs.sun.com |
spacer
black dot
 
 
2.  Solaris Kernel Tunables UFS bufhwm  Previous   Contents   Next 
   
 

ndquot

Description

Number of quota structures for the UFS file system that should be allocated. Relevant only if quotas are enabled on one or more UFS file systems. Because of historical reasons, the ufs: prefix is not needed.

Data Type

Signed integer

Default

((maxusers x 40) / 4) + max_nprocs

Range

0 to MAXINT

Units

Quota structures

Dynamic?

No

Validation

None. Excessively large values hang the system.

When to Change

When the default number of quota structures is not enough. This situation is indicated by the following message displayed on the console or written in the message log.

dquot table full
Commitment Level

Unstable

ufs_ninode

Description

Number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file system basis.

A key variable in this situation is ufs_ninode. This parameter is used to compute two key limits that affect the handling of inode caching. A high watermark of ufs_ninode / 2 and a low water mark of ufs_ninode / 4 are computed.

When the system is done with an inode, one of two things can happen:

  1. The file referred to by the inode is no longer on the system so the inode is deleted. After it is deleted, the space goes back into the inode cache for use by another inode (which is read from disk or created for a new file).

  2. The file still exists but is no longer referenced by a running process. The inode is then placed on the idle queue. Any referenced pages are still in memory.

When inodes are idled, the kernel defers the idling process to a later time. If a file system is a logging file system the kernel also defers deletion of inodes. Two kernel threads do this. Each thread is responsible for one of the queues.

When the deferred processing is done, the system drops the inode onto either a delete or idle queue, each of which has a thread that can run to process it. When the inode is placed on the queue, the queue occupancy is checked against the low watermark. If it is in excess of the low watermark, the thread associated with the queue is awakened. After it is awakened, the thread runs through the queue and forces any pages associated with the inode out to disk and frees the inode. The thread stops when it has removed 50% of the inodes on the queue at the time it was awakened.

A second mechanism is in place if the idle thread is unable to keep up with the load. When the system needs to find a vnode, it goes through the ufs_vget routine. The first thing vget does is check the length of the idle queue. If the length is above the high watermark, then it pops two inodes off the idle queue and "idles" them (flushes pages and frees inodes). It does this before it gets an inode for its own use.

The system does attempt to optimize by placing inodes with no in-core pages at the head of the idle list and inodes with pages at the end of the idle list, but it does no other ordering of the list. Inodes are always removed from the front of the idle queue.

The only time that inodes are removed from the queues as a whole is when a sync, unmount, or remount occur.

For historical reasons, this parameter does not require the ufs: prefix.

Data Type

Signed integer

Default

ncsize

Range

0 to MAXINT

Units

Inodes

Dynamic?

Yes

Validation

If ufs_ninode is less than or equal to zero, the value is set to ncsize.

When to Change

When the default number of inodes is not enough. If the maxsize reached field as reported by kstat -n inode_cache is larger than the maxsize field in the kstat, the value of ufs_ninode may be too small. Excessive inode idling (described previously) can also be a problem.

This situation can be identified by using kstat -n inode_cache to look at the inode_cache kstat. Thread idles are inodes idled by the background threads while vget idles are idles by the requesting process before using an inode.

Commitment Level

Unstable

ufs:ufs_WRITES

Description

If ufs_WRITES is non-zero, the number of bytes outstanding for writes on a file is checked. See ufs_HW subsequently to determine whether the write should be issued or should be deferred until only ufs_LW bytes are outstanding. The total number of bytes outstanding is tracked on a per-file basis so that if the limit is passed for one file, it won't affect writes to other files.

Data Type

Signed integer

Default

1 (enabled)

Range

0 (disabled), 1 (enabled)

Units

Toggle (on/off)

Dynamic?

Yes

Validation

None

When to Change

When you want UFS write throttling turned off entirely. If sufficient I/O capacity does not exist, disabling this parameter can result in long service queues for disks.

Commitment Level

Unstable

ufs:ufs_LW and ufs:ufs_HW

Description

ufs_HW is the number of bytes outstanding on a single file barrier value. If the number of bytes outstanding is greater than this value and ufs_WRITES is set, then the write is deferred. The write is deferred by putting the thread issuing the write to sleep on a condition variable.

ufs_LW is the barrier for the number of bytes outstanding on a single file below which the condition variable on which other sleeping processes are toggled. When a write completes and the number of bytes is less than ufs_LW, then the condition variable is toggled, which causes all threads waiting on the variable to awaken and try to issue their writes.

Data Type

Signed integer

Default

8 x 1024 x 1024 for ufs_LW and 16 x 1024 x 1024 for ufs_HW

Range

0 to MAXINT

Units

Bytes

Dynamic?

Yes

Validation

None

Implicit

ufs_LW and ufs_HW have meaning only if ufs_WRITES is not equal to zero. ufs_HW and ufs_LW should be changed together to avoid needless churning when processes awake and find that they either cannot issue a write (when ufs_LW and ufs_HW are too close) or when they might have waited longer than necessary (when ufs_LW and ufs_HW are too far apart).

When to Change

Consider changing these values when file systems consist of striped volumes. The aggregate bandwidth available can easily exceed the current value of ufs_HW. Unfortunately, this is not a per-file system setting.

When ufs_throttles is a non-trivial number. ufs_throttles can currently be accessed only with a kernel debugger.

Commitment Level

Unstable

 
 
 
  Previous   Contents   Next