File Handling With Little RAM
|To meet the requirements of the Mastery of Aspects for Higher Level [IB Computer Science], direct manipulation of files must be carried out in such a way that the whole file does not need to be read into RAM.|
It is important to know how to update files without using much RAM.
In the "old days" before the average nerd or even businesses could afford gigabytes of RAM for a personal computer, large quantities of data had to be stored, compared and processed using small amounts of RAM and relatively larger amounts of secondary storage, typically tape and later disk storage.
For example, the Commodore 8032 was a very popular SOHO (Small Office Home Office) computer in the early 1980's. The "80" stood for "80 columns, which was twice the number of columns of characters that most other personal computers of the day placed on the screen. The "32" represented the number of KB (KiloBytes) of RAM that came with the machine, which was about twice the memory that came with most other personal computers of the day.
File-handling techniques that used little RAM were essential processing a real-life database with 32 KB of RAM. Such techniques are still useful when the amount of data to be processed exceeds the available RAM capacity.
It happens that the International Baccalaureate requires that candidates in their higher level Computer Science program be able to search a data file and add, delete or change data in a single record without simultaneously storing copious numbers of records into RAM.
In February of 2000 IBO issued a circular on Clarifications of Internal Assessment Details. One page of clarifications concerned Direct Manipulation of Files.
At Higher Level, the Mastery of Aspects part of the Internal Assessment requires candidates to be able to manipulate files directly.
Candidates must attempt to show mastery of the following aspects:
- adding directly a new record to a file
- deleting directly a record from a file
- searching directly for a record in a file.
To satisfy these requirements, candidates must carry out file manipulation in such a way that files do not need to be read into RAM for manipulation.
For example, the record to be deleted must be found directly by searching the file in its actual location (by, for example, using a binary or linear search, reading a record, or small block or records, at once). The record could then be directly deleted (if the programming language supports this), or marked for deletion by using a flag field, or by using a rogue value in the key field.
Provided the file size alters, or a record marked for deletion is actually overwrittren during addition of new records, this manipulation would meet the requirement.
However, it is not acceptable, for example, to delete a record from a file, to read the whole file into a linked list or tree, to delete the node, and then to write the data back to the file. (Howefver, this manipulation would satisfy the requirement of being able to delete a data item from a linked list or tree.) Similarly, when directly adding new records to a file, it is not accpetable to read the file into RAM, add the data, and rewrite the file.
File-handling with little RAM is a basic requirment of exercises 16.7, 16.8 and 16.9 in Deitel's JAVA HOW TO PROGRAM. Those exercises are reproduced below for your inspection. First, however, the preferred strategy for answering those questions is given in blue. This strategy is preferred over loading all records into RAM at one time for at least two reasons:
You will learn and practice processing data with little RAM but relatively larger secondary storage. It defeats the purpose of the exercises to simply load all the records into a structure such as a vector, linked list or tree, then manipulate the data in RAM, then save all the data back to file after it has been processed. The point is to learn to process data through file-handling without copious amounts of RAM.
The International Baccalaureate Higher Level Dossier program must process records without copying the entire file into RAM.
Here the basic idea is to read the first record of the old master file into RAM, then read all of the records from the transaction file into RAM one record at a time, comparing each and every transaction record one record at a time with that first record from the old master file, updating the single master file record when there is a match. After completing all comparisons between that first record from the old master file and all the records from the transaction file, save the updated single record to the new master file.
Now repeat the process with the second record from the old master file, comparing it with all records from the transaction file one record at a time until it too is updated. Then save the updated second record to the new master file.
Repeat this process until the last record in the old master file has been updated and saved (after being updated) to the new master file.
Then delete the old master file and rename the new master file to the name for the old master file.
Most file-handling in Java such as deleting, renaming, checking the existance of, and changing properties of a file, may be improved by using the Java File class defined in the java.io package. Java's standard File Class is introduced and covered in Chapter 87 - The File Class from Bradley Kjell's Introduction to Computer Science using Java.
Exercise 16.3 asked the reader to write a series of single statements. Actually, these statements form the core of an important type of file-processing program, namely, a file-matching program. In commercial data processing, it is common to have several files in each application system. In an accounts receivable system, for example, there is generally a master file containing detailed information about each customer, such as the customer's name, address, telephone number, outstanding balance, credit limit, discount terms, contract arrangements and possibly a condensed history of recent purchases and cash payments.
After writing the program of Exercise 16.7, write a simple program to create some test data for checking out the program. Use the sample account data in Fig. 16.23 and Fig. 16.24. Run the program of Exercise 16.7, using the files of test data created in this exercise. Print the new master file. Check that the accounts have been updated correctly.
It is possible (actually common) to have several transaction records with the same record key. This occurs because a particular customer might make several purchases and cash payments during a business period. Rewrite your accounts receivable file-matching program of Exercise 16.7 to provide for the possibility of handling several transaction records with the same record key. Modify the test data of Exercise 16.8 to include the additional transaction records in Fig. 16.25.