File

Everything in the D3 database is an item in a file.

Files are stored on disk in blocks called frames. Frames are uniform in size (4096 bytes). The primary file space physically consists of a contiguous set of disk frames. The first frame is the base frame, and the number of contiguous frames, or groups (including the base frame), is the modulo of the file. The modulo is defined when the file is created or resized. The system automatically adds frames to, or removes frames from, groups as the amount of data within each group expands or contracts. Frames added automatically by the system come from what is called the secondary file space. This means the user does not need to redimension a file as the amount of data in that file increases or decreases.

As items are added to the file, some groups require additional frames. These frames are allocated from the secondary file space.

Frames within a group are explicitly linked together. Items are distributed to the various groups within a file based on a hashing algorithm that calculates the frame ID number (FID) of the first frame in the group. Items are distributed quasi-randomly between groups and sequentially within a group. The quasi-randomness is achieved by using the item-ID directly in the hashing algorithm. Because of the nature of the mathematical relationship defining the hashing algorithm, modulo numbers that are multiples of 2 or 5 should not be assigned.
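The effect of the modulo on distribution can be illustrated with a small sketch. The hash used here (folding the characters of the item-ID into a decimal-weighted number and taking the remainder) is an assumption chosen for illustration and is not the documented D3 algorithm. The names `group_fid`, `base_fid`, and the sample item-IDs are hypothetical.

```python
def group_fid(item_id: str, base_fid: int, modulo: int) -> int:
    """Return the FID of the first frame of the group an item hashes to.

    Illustrative hash only; NOT the actual D3 hashing algorithm.
    """
    if modulo < 1:
        raise ValueError("modulo must be at least 1")
    value = 0
    for byte in item_id.encode("utf-8"):
        # Decimal-weighted accumulation of the item-ID's bytes (assumed).
        value = value * 10 + byte
    # Groups occupy the contiguous frames base_fid .. base_fid + modulo - 1.
    return base_fid + value % modulo


# Hypothetical item-IDs that all end in "0".
sample_ids = [str(n) for n in range(1000, 2000, 10)]

for modulo in (10, 11):
    groups_used = {group_fid(i, 5000, modulo) for i in sample_ids}
    # With modulo 10 every sample ID lands in the same group.
    print(modulo, len(groups_used))
```

Under this assumed hash, a modulo that is a multiple of 2 or 5 lets only the last character of the item-ID influence the result, so similar item-IDs pile up in a few groups.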

For data transfer to and from disk to occur at optimum efficiency, the user needs only to set the modulo of the file, based on the amount of data storage anticipated, to the nearest prime number above the value that keeps the data in each group within a single frame. This trades lower utilization (50 to 75 percent) for single disk access speed.

Distributing items between groups in this way also makes the system more efficient, because it reduces the probability of two or more users accessing the same group at the same time.

The required modulo can be estimated as:

Number-of-Items x Average-Size-of-Items (bytes) / Frame-Size (bytes)

The result should be increased to the next larger prime number. The frame size for a particular release of D3 can be determined by executing the TCL command what. The number of available bytes within a frame is listed on the first line of the report under dfsize. The difference between dfsize and the actual frame size is used to hold the frame linkages (forward and backward pointers).
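The calculation above can be sketched as follows. This is a minimal illustration, not a D3 utility: `estimate_modulo` and its parameter names are hypothetical, the frame size is passed in by the caller (for example the 4096-byte frame size, or the dfsize value reported by what), and the sketch assumes that a result which is already prime is kept as is. It also skips 2 and 5, since multiples of 2 or 5 should not be assigned.

```python
import math


def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for modulo-sized numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


def estimate_modulo(item_count: int, avg_item_bytes: int, frame_bytes: int) -> int:
    """Number-of-Items x Average-Size / Frame-Size, rounded up to a prime."""
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    total_bytes = item_count * avg_item_bytes
    # Integer ceiling division: frames needed to hold all the data.
    frames_needed = -(-total_bytes // frame_bytes)
    # Start at 3 and skip 5 so the result is never a multiple of 2 or 5.
    candidate = max(frames_needed, 3)
    while not is_prime(candidate) or candidate == 5:
        candidate += 1
    return candidate


# Hypothetical file: 10,000 items averaging 200 bytes, 4096-byte frames.
print(estimate_modulo(10_000, 200, 4096))
```

For the hypothetical figures shown, 2,000,000 bytes need 489 frames, and the first prime at or above 489 is 491.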

When more than 50 percent of the groups have more than one frame, or the utilization drops below 50 percent, reallocate the file.

One method to force resizing the file is:

  1. Create a new file with the desired modulo.

  2. Copy all items from the old file into the new file.

  3. Delete the old file.

  4. Rename the new file to the old file name.

When resizing in this manner, the user must explicitly copy the index and other data, including all index and subroutine calls, from the old file’s file-defining item to the new file’s file-defining item before renaming.
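The two reallocation thresholds given earlier (more than 50 percent of groups with more than one frame, or utilization below 50 percent) can be checked with a short sketch. The function `needs_reallocation` and its inputs are hypothetical, and utilization is assumed here to mean data bytes divided by the total bytes in all frames allocated to the file; the figures themselves would have to come from whatever file statistics are available on the system.

```python
def needs_reallocation(frames_per_group, data_bytes, frame_bytes=4096):
    """Apply the two thresholds: overflowing groups and low utilization.

    frames_per_group: number of frames currently linked in each group.
    data_bytes: total bytes of item data held in the file.
    frame_bytes: usable bytes per frame (4096 assumed by default).
    """
    group_count = len(frames_per_group)
    if group_count == 0:
        raise ValueError("a file has at least one group")
    # Share of groups that have spilled into the secondary file space.
    overflowing = sum(1 for frames in frames_per_group if frames > 1)
    overflow_ratio = overflowing / group_count
    # Utilization as assumed above: data bytes over allocated bytes.
    allocated_bytes = sum(frames_per_group) * frame_bytes
    utilization = data_bytes / allocated_bytes
    return overflow_ratio > 0.5 or utilization < 0.5


# Hypothetical four-group file in which three groups have overflowed.
print(needs_reallocation([2, 3, 1, 2], data_bytes=24_000))
```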

Files can also be reallocated using the system’s save and restore processes. When the save and restore commands are used, the indexes are handled automatically. Prior to saving the system on magnetic media, attribute 13 of the file dictionary file-defining item can be set to the new modulo for that file. When the system is restored, all files are reallocated according to the new modulo specified in attribute 13. When attribute 13 is not specified, the file is restored exactly as it was saved. The save and restore process allows reallocation of many files at one time.

f-resize is a program provided with the system to automatically calculate new modulos and mark attribute 13 of the files appropriately using the current statistics in the file of files file as modified by the last file-save.

File Naming Conventions

Character             Reason to avoid
--------------------  ---------------
space                 Standard TCL command line delimiter.
quotation marks       Double quotation marks (") and single quotation marks (') are used in the retrieval language to indicate literal values and specific item-IDs.
backslash (\)         Behaves like quotation marks.
caret (^)             Used in the retrieval language as a wildcard character.
parentheses ( )       Parentheses are used to indicate that options follow.
colon (:)             Used as an OSFI/filename separator in OSFI files. Cannot be replaced in an FSI Hot Backup scenario.
control characters    Control characters do not print out and can cause a variety of problems.

While the above list is not exhaustive, avoiding the use of these characters can prevent problems later.
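A proposed file name can be screened against the characters listed above before the file is created. The following is a minimal sketch, not part of D3: `problem_characters` and the sample names are hypothetical, and control characters are assumed to be the ASCII codes below 32 plus 127. Like the list itself, the check is not exhaustive.

```python
# Characters to avoid in file names, with the reason given for each.
AVOID = {
    " ": "standard TCL command line delimiter",
    '"': "quotation mark: indicates literal values and item-IDs",
    "'": "quotation mark: indicates literal values and item-IDs",
    "\\": "behaves like quotation marks",
    "^": "wildcard character in the retrieval language",
    "(": "indicates that options follow",
    ")": "indicates that options follow",
    ":": "OSFI/filename separator in OSFI files",
}


def problem_characters(name: str):
    """Return (character, reason) pairs for each character to avoid.

    Each offending character is reported once, in order of appearance.
    """
    found = []
    seen = set()
    for ch in name:
        if ch in seen:
            continue
        if ch in AVOID:
            found.append((ch, AVOID[ch]))
            seen.add(ch)
        elif ord(ch) < 32 or ord(ch) == 127:
            # Assumed definition of a control character (ASCII range).
            found.append((ch, "control character"))
            seen.add(ch)
    return found


for candidate in ("CUSTOMER.MASTER", "MY FILE", "A:B^C"):
    print(repr(candidate), problem_characters(candidate))
```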

Examples of valid item-IDs:

WARNING

D3 uses many common words as commands, modifiers, and connectives. Avoid using those words (for example, DICT) as file names.