Memory Optimization Using Block Structures

ACM SIGPLAN Notices. November 1975.
Memory Optimization Using Block Structures
Stephen Leibowitz

One of the most distinctive features of Algol is its block structure. High level languages are generally rather wasteful of memory storage. By using blocks, memory requirements can be reduced.

When leaving a block all variables declared in the block become undefined. The memory space reserved for these variables becomes free and can be used for storage requirements of other blocks or for other programs.

However, this process means that some machine instructions and data locations are needed for run time storage allocation and addressing. Depending on the particular system and program the total costs may be higher than it would be without the blocks. The following example illustrates this point:

BEGIN ----- A:BEGIN INTEGER ARRAY Z(1:10); ----- END; ----- GO TO A: END

In order to save 10 locations while outside the inner block a number of machine instructions and data locations are needed. The program could be improved by placing the array declaration in the program block.

A Fortran program can be considered as a single block with no inner blocks. Therefore one cannot assign and free storage as in Algol. An alternative method of memory optimization is available using the EQUIVALENCE statement. The EQUIVALENCE statement is a compiler directive that permits one to assign different variable names to the same memory location. Of course, the values of the variables assigned to the same memory location must not be needed by the program at the same time, since no matter how many names a location may have, it can contain only one quantity at a time. We can effectively accomplish the same result by using the same variable name to represent different quantities. However this obscures the fact that the quantities are different. Also, assigning different names permits the use of mnemonics. For example:

DIMENSION SALES(25OO),PRICE(2400),TAX(100) EQUIVALENCE (SALES,PRICE),(SALES(2401),TAX) ----- C ARRAY SALES IS NOT USED AFTER THIS POINT. C ARRAYS PRICE AND TAX ARE NOT USED BEFORE THIS POINT. ----- END

Of course things do not always work out so nicely. If array PRICE had 10 elements then only 110 locations would have been saved by using the EQUIVALENCE statement. But on the average a fair percentage of storage can be saved by using EQUIVALENCE.

The algorithm is this. For each variable not in COMMON find the range or “block” in which it is needed by the program. Do this by first locating the two references to the variable that are farthest apart and consider that your provisional block. Locate all labels inside the block and expand the block to include all references to these labels. Look at the new labels inside the expanded block and expand the block again to include all references to these labels. Continue this process until there are no external references to any of the labels inside the block.

It should be realized that the DATA statement assigns values to variables at compile time. Therefore the range of these variables starts with the beginning of the program and their blocks should be expanded accordingly. In the case where we wish to set the elements of an array to some common value it might be preferable to use a DO loop instead of a DATA statement. This will avoid expanding the block to the beginning of the program.

Each block now contains the range where its variable is needed by the program. A useful simplification can be made by combining blocks so that if a block contains part of another block, the range of one block is entirely within the range of the other. After that a simple stacking scheme can be used to equivalence variables.

It would be difficult for the programmer to use this algorithm to the limit on large programs. However an optimizing Fortran compiler might automate the process and internally equivalence variables. To save compile time we can restrict the equivalencing to arrays of a certain minimum size.

The equivalencing method is a static method of memory optimization in that it is accomplished without run time changes in program size. Using only equivalencing many locations may go unused for part of the time. The memory requirements vary throughout the program but the program size must be fixed at the maximum needed at any time. Since we now have a “block structure” in Fortran however, the compiler can go one step further and adjust the program size during execution. Not all systems are configured to support dynamic size programs though. This method should be used with discretion. Expanding the program may require moving programs around in memory in order to make the additional space available.

Closed subprograms present a problem. We can of course equivalence between variables within a subprogram. However, equivalencing alone may leave locations unused part of the time. The problem is more serious with subprograms since locations for variables would be unused while the subprogram was not executing. The other technique for reducing storage is to use dynamic size programs. In the case where there is only one reference to the subprogram the technique is used the same way as in the main program. The situation is different where the subprogram is referenced in more than one location. Generally the storage for a program is contiguous. When the program size is dynamic the storage for variables has to be allocated at the program limits. Since the limits vary, the addresses of subprogram variables would have to be relocated for each reference to the subprogram.

This process will reduce the overall efficiency of the subprogram. As with the main program dynamic size programs should not always be used.

A Fortran compiler is nothing more than a translator that converts Fortran programs into assembly-like code. Therefore it is not difficult to conceive of an assembler that would implement the methods mentioned. Other languages such as Cobol might also internally equivalence variables and dynamically adjust the program size.

The situation is more complicated in Algol because of dynamic arrays. We cannot equivalence these arrays because we do not know their limits at compile time. What is usually done is that upon entering a block, storage is allocated and, except in the program block, the address for each variable defined in the block is calculated. When a variable is referenced additional calculations may have to be performed to access it. Upon leaving a block the storage for variables defined in it is freed.

Much of this can be eliminated by separating the dynamic arrays from the other program locations in memory. The section containing the non dynamic variables can be handled in much the same way as in Fortran. Equivalencing would be easier though since a block structure already exists in Algol. We need only search the block in which the variable is defined to locate the range where it is needed. To save compile time we can use the entire block as an approximation to the range, especially for simple variables and small arrays. Because of the simpler addressing of fixed size arrays, it sometimes may be better to use them instead of variable size arrays.

Each section can have its size dynamically adjusted. If they are to be contiguous then the stacking schemes should be arranged so that the limits are adjusted at the beginning of the first section and at the end of the second section.

A few notes are in order on the use of these methods in a virtual environment. Most of the storage for a program would generally be located on secondary memory which is less expensive than main memory. The size of the program is not important but the amount of paging performed is. Using dynamic size programs will reduce the number of pages but not reduce the amount of paging. This is because the pages eliminated would not be part of the working set. Equivalencing is another matter however since it would reduce the working set. Hence, there is a place for equivalencing in a virtual environment while dynamic size programs would be of little value.