File Maintenance

mvBase uses a hash-encoding scheme to allow direct access to any item in a file without searching through the entire file. The success of the hash-encoding, and thus the speed of access to the file, depends on a number of related factors, including the type of item-ID in the file, the size of the average item and of the largest and smallest items, and the structure of the file.

In hash-encoding, items are stored in groups based on a hashing algorithm that transforms the key into a group address, thus limiting the range of the search to a particular group. For a further discussion of the hashing algorithm used on the mvBase system, see File Management.

mvBase allows you to control hashing by adjusting the file modulo. The modulo determines the number of groups used to store items, and it is used in two ways. First, the modulo is used directly in the hashing algorithm that transforms the keys. The modulo should be a prime number to minimize hashing conflicts and ensure a more even distribution of items across the groups.
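The effect of a prime modulo can be illustrated with a small sketch. The hash used here is a stand-in chosen only for illustration (a numeric item-ID reduced by the modulo); it is not the mvBase hashing algorithm, which is described in File Management. The function names and the sample item-IDs are hypothetical.

```python
def stand_in_group(item_id: str, modulo: int) -> int:
    """Hypothetical hash, NOT the mvBase algorithm: treat a numeric
    item-ID as an integer and reduce it by the modulo."""
    return int(item_id) % modulo


def groups_used(item_ids, modulo: int) -> int:
    """Count how many distinct groups the given item-IDs land in."""
    return len({stand_in_group(item_id, modulo) for item_id in item_ids})


# 100 sample item-IDs that all end in zero: "0", "10", ..., "990".
item_ids = [str(n) for n in range(0, 1000, 10)]

print(groups_used(item_ids, 100))  # 10: the IDs pile up in ten groups
print(groups_used(item_ids, 101))  # 100: every ID gets its own group
```

With this stand-in hash, item-IDs that share a pattern collide heavily when the modulo shares a factor with that pattern, while the prime modulo 101 spreads the same item-IDs across 100 different groups.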

Second, because it specifies the number of groups, the modulo determines the average number of items that are stored in each group (the group depth). The group depth is the total number of items divided by the modulo. For example, if the number of items is 808 and the modulo is 101, the group depth is 8. The modulo should be large enough that the expected number of items in a group fits in the frame allocated for the group.
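The group depth arithmetic can be sketched as follows. The function names are hypothetical, the default frame size assumes the 2 KB data frame noted in this section, and the fit check ignores any per-frame or per-item overhead, so treat it as a rough estimate rather than an exact rule.

```python
def average_group_depth(item_count: int, modulo: int) -> float:
    """Average number of items per group: item count divided by modulo."""
    if modulo <= 0:
        raise ValueError("modulo must be a positive integer")
    return item_count / modulo


def group_fits_in_frame(item_count: int, modulo: int,
                        avg_bytes_per_item: float,
                        frame_size: int = 2048) -> bool:
    """Rough check that an average group fits in one frame.

    Assumption: all frame_size bytes are usable for item data;
    any storage overhead is ignored.
    """
    depth = average_group_depth(item_count, modulo)
    return depth * avg_bytes_per_item <= frame_size


print(average_group_depth(808, 101))       # 8.0
print(group_fits_in_frame(808, 101, 200))  # True: 8 * 200 = 1600 bytes
print(group_fits_in_frame(808, 101, 300))  # False: 8 * 300 = 2400 bytes
```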

Use the following formula to calculate what the minimum modulo of a file should be for efficiency:

(ITEM COUNT) * (AVG. BYTES/ITEM) / FRAME SIZE = MODULO

Round the result of the calculation up to the next prime number and use that number for the modulo. The ISTAT command described in the next section displays the item count and average number of bytes per item (among other things). On mvBase, the data frame size is 2 KB (2,048 bytes); the abs frame size is 4 KB.
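The calculation can be sketched as follows. The function names and the sample figures (10,000 items averaging 150 bytes) are hypothetical; in practice the item count and average item size would come from ISTAT. The sketch assumes a frame size of 2,048 bytes and keeps a result that is already prime.

```python
import math


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for modulo-sized numbers."""
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0:
        return False
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


def minimum_modulo(item_count: int, avg_bytes_per_item: float,
                   frame_size: int = 2048) -> int:
    """(item count) * (avg bytes/item) / frame size, rounded up to a prime."""
    candidate = max(math.ceil(item_count * avg_bytes_per_item / frame_size), 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


# 10,000 items * 150 bytes / 2,048 = 732.42..., which rounds up to 733.
print(minimum_modulo(10_000, 150))  # 733 (already prime)
```

The result is a minimum; a file that is expected to grow would need a larger starting value.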

When you are creating an mvBase file, try to estimate the total number and the size of items that the file will contain. Remember, however, that files have a tendency to grow beyond their designers’ expectations, and therefore you should regularly check the size and hashing efficiency of important files and reallocate them when necessary.

As with most system administration tasks, perform these checks on a regular basis, so that you solve file efficiency problems before users even become aware of them.

The following topics are presented in this section:

Tools for Checking File Efficiency

Reallocating Files

See Also

Daily System Maintenance

Communicating With Users

Monitoring System Activity

Optimizing Disk Drives Containing Virtual Memory Storage Files

Using the Windows Application Event Log