INTERNATIONAL BACCALAUREATE ORGANISATION

Clarifications of Internal Assessment Details
February 2000

Aqua and lime text is authored by Mr. Donaldson.
Gold text is authored by IBO.

Direct Manipulation of Files

On Tuesday, February 12, 2002, Mr. Donaldson discussed this topic in great detail with his senior IB Higher Level Computer Science students of that year, debating the issues addressed by the IBO document of February 2000. His observations from that time, merged with his later revisions of February 20, 2004, are inserted below.

  1. Is a physical search, such as a sequential or binary search, acceptable when employing direct access file handling? YES!!

    IBO says YES!! The current IBO document, printed in gold, clearly states that a record may be found "by, for example, using a binary or linear search". Note that IBO offers the binary and linear searches only as examples of acceptable search methods. IBO simply specifies that a direct search is done "directly by searching the file in its actual location ...." The emphasis seems to be on "actual location", implying that a file ought to be searched by copying the contents of only one record at a time to RAM. In RAM, the content of the record's search field is compared with the target value to determine whether the sought-for record has been found.

    Significantly, it is NOT an IBO-valid search of a direct access file to copy the entire file to an array or linked list and then search the array or linked list.

    If the records of a file are pre-sorted on a key field in some ordinal order, such as alphabetical order, then a binary search may be performed. If the records are not pre-sorted in this way, then a linear search is necessary.
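
    As a minimal sketch of a binary search carried out in the file itself, the Python below assumes a hypothetical fixed-length record layout (a 4-byte integer key followed by a 20-byte name) and a file already sorted in ascending key order. The names RECORD and binary_search_file are illustrative only and do not come from the IBO document.

```python
import struct

# Hypothetical layout (an assumption): 4-byte integer key + 20-byte name.
RECORD = struct.Struct("<i20s")


def binary_search_file(f, target_key):
    """Return the record number whose key equals target_key, or -1.

    Assumes f is a binary file sorted in ascending key order.
    Only one record is held in RAM at any moment.
    """
    f.seek(0, 2)                          # move to the end to learn the size
    low = 0
    high = f.tell() // RECORD.size - 1    # number of the last record
    while low <= high:
        mid = (low + high) // 2
        f.seek(mid * RECORD.size)         # jump straight to record 'mid'
        key, _name = RECORD.unpack(f.read(RECORD.size))
        if key == target_key:
            return mid
        if key < target_key:
            low = mid + 1                 # target lies in the upper half
        else:
            high = mid - 1                # target lies in the lower half
    return -1
```

    The function expects a file opened in binary mode, for example with open(path, "rb"). Each pass of the loop reads exactly one record from its actual location in the file, so the file is never copied into an array or linked list.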

    It is necessary to read one and only one record into RAM at a time in order to compare the value of the key field of a single record with the value being sought.

    It is NOT acceptable to read the contents of the entire file into a linked list or array in RAM, search the linked list or array to get a match and thus the appropriate record number, and then write the desired change to the record itself.
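
    For an unsorted file, the same one-record-at-a-time rule can be sketched as a linear search. The record layout (4-byte integer key plus 20-byte name) and the names RECORD and linear_search_file are assumptions made for illustration.

```python
import struct

# Hypothetical layout (an assumption): 4-byte integer key + 20-byte name.
RECORD = struct.Struct("<i20s")


def linear_search_file(f, target_key):
    """Return the record number of the first record whose key matches, or -1."""
    f.seek(0)                             # start at the first record
    record_number = 0
    while True:
        chunk = f.read(RECORD.size)       # exactly one record enters RAM
        if len(chunk) < RECORD.size:      # end of file (or a truncated record)
            return -1
        key, _name = RECORD.unpack(chunk)
        if key == target_key:
            return record_number
        record_number += 1
```

    The returned record number can then be used to seek back to the record and write a change to it in place, without the rest of the file ever being held in RAM.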

  2. Does precisely identifying and thereby seeking a record by its record number constitute "searching directly for a record in a file"? NO!!

    If you already know the record number of the record that you seek, then it is incorrect to say that you are "searching" for the record. If you determine the location of a record by some means other than searching the file, then you did not honour IBO's requirement that you "search" a direct access file.

    Seeking a record by its record number assumes a different strategy. Using record numbers means that a program must save new records at the end of the file or overwrite an existing record that is "flagged" as deleted. Generally, records should not be physically deleted from anywhere other than the end of a file. Closing the gap by copying each record n+1 into location n is possible, but it would change the record numbers of all the records that follow. The algorithm used to calculate the location of a record multiplies the length of a record (in bytes) by the record number, and so assumes that each record stays correlated with its record number which, it is noted, is directly related to the relative location of the record in the file.
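
    The offset calculation described above can be sketched as follows. The record layout and the helper names record_offset, read_record and write_record are hypothetical; with this assumed layout each record occupies 24 bytes, so record number 3 begins at byte 3 * 24 = 72.

```python
import struct

# Hypothetical layout (an assumption): 4-byte integer key + 20-byte name.
RECORD = struct.Struct("<i20s")


def record_offset(record_number):
    """Byte position of a record: record number times record length."""
    return record_number * RECORD.size


def read_record(f, record_number):
    """Seek directly to one record and return it as (key, name)."""
    if record_number < 0:
        raise IndexError("record number must not be negative")
    f.seek(record_offset(record_number))
    chunk = f.read(RECORD.size)
    if len(chunk) < RECORD.size:
        raise IndexError("no such record: %d" % record_number)
    key, name = RECORD.unpack(chunk)
    return key, name.rstrip(b"\x00").decode("ascii")


def write_record(f, record_number, key, name):
    """Seek directly to one record slot and overwrite it in place."""
    f.seek(record_offset(record_number))
    f.write(RECORD.pack(key, name.encode("ascii")))
```

    Because the offset depends only on the record number, moving a record to a different slot silently changes the number by which it must be sought, which is why records are not shuffled forward to close a gap.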

  3. In order to delete a record, is it acceptable to write the contents of a file less the record to be deleted to a second file? NO!!

    IBO clearly states that a record is directly deleted when "marked for deletion by using a flag field, or by using a rogue value in the key field." This is consistent with the speed advantage that direct access file handling has over sequential file handling.

    Furthermore, IBO is clear that "it is not acceptable ... to read the whole file into a linked list or tree ... and then write the data back to the file." Earlier this IBO document stated that "candidates must carry out file manipulation in such a way that files do not need to be read into RAM for manipulation." Note that files must not be read into RAM. It is both allowable and necessary that single records be read into RAM, one record at a time, in order to compare the target value with values of the corresponding field in the records.

    If IBO finds it objectionable to carry out file manipulation by reading an entire file into RAM and then writing the RAM contents back to the file, then it should also be objectionable to rewrite an entire file to a second file less the record to be deleted. In both cases, the speed advantage of direct access file handling is defeated by the relatively long time that it takes to write or rewrite a file.

    There are at least two ways to "delete" a record located in a direct access file.

    1. A record may be flagged as "deleted" by implementing a boolean field in the record where (true == in use) and (false == deleted). Alternatively, if there is a field that also stores the record number, that field may be given a value outside the valid range of record numbers, such as -99. The data of "new" records may then overwrite the data of older "deleted" records.

    2. A record may be "overwritten" with the last record. The last record is then deleted to avoid a duplicate copy of it, and the record number of the moved (formerly last) record is updated wherever the program stores a reference to it. While updating a record number adds complexity, this technique minimizes "record creep". Record creep unnecessarily increases the size of the file by adding new records at the end of the file while leaving the space of formerly "deleted" records intact.
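
    Both techniques can be sketched in a few lines of Python. The record layout (a boolean in-use flag, a 4-byte integer key and a 20-byte name) and the function names are assumptions made for illustration, not a prescribed implementation.

```python
import struct

# Hypothetical layout (an assumption): in-use flag + 4-byte key + 20-byte name.
RECORD = struct.Struct("<?i20s")


def record_count(f):
    """Number of whole records currently stored in the file."""
    f.seek(0, 2)
    return f.tell() // RECORD.size


def delete_by_flag(f, record_number):
    """Technique 1: mark the record as deleted; the file size is unchanged."""
    if not 0 <= record_number < record_count(f):
        raise IndexError("no such record")
    f.seek(record_number * RECORD.size)
    _in_use, key, name = RECORD.unpack(f.read(RECORD.size))
    f.seek(record_number * RECORD.size)
    f.write(RECORD.pack(False, key, name))    # false == deleted


def delete_by_swap(f, record_number):
    """Technique 2: overwrite the record with the last one, then shrink the file.

    Returns the old record number of the record that was moved, or None if
    nothing moved, so the caller can update any stored references to it.
    """
    last = record_count(f) - 1
    if not 0 <= record_number <= last:
        raise IndexError("no such record")
    moved = None
    if record_number != last:
        f.seek(last * RECORD.size)
        last_record = f.read(RECORD.size)     # only one record enters RAM
        f.seek(record_number * RECORD.size)
        f.write(last_record)
        moved = last
    f.truncate(last * RECORD.size)            # drop the duplicate last record
    return moved
```

    Both functions expect a file opened for reading and writing in binary mode, such as open(path, "r+b"). The first leaves a reusable gap in the file; the second keeps the file compact at the cost of renumbering one record.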

At Higher Level, the Mastery of Aspects part of the Internal Assessment requires candidates to be able to manipulate files directly.

Candidates must attempt to show mastery of the following aspects:

To satisfy these requirements, candidates must carry out file manipulation in such a way that files do not need to be read into RAM for manipulation.

For example, the record to be deleted must be found directly by searching the file in its actual location (by, for example, using a binary or linear search, reading a record, or small block of records, at once). The record could then be directly deleted (if the programming language supports this), or marked for deletion by using a flag field, or by using a rogue value in the key field.

Provided the file size alters, or a record marked for deletion is actually overwritten during the addition of new records, this manipulation would meet the requirement.

However, it is not acceptable, for example, to delete a record from a file, to read the whole file into a linked list or tree, to delete the node, and then to write the data back to the file. (However, this manipulation would satisfy the requirement of being able to delete a data item from a linked list or tree.) Similarly, when directly adding new records to a file, it is not acceptable to read the file into RAM, add the data, and rewrite the file.
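
One way to add a record directly, sketched here under stated assumptions, is to scan the file one record at a time for a slot marked as deleted and overwrite it, appending at the end only when no such slot exists. The record layout (in-use flag, 4-byte integer key, 20-byte name) and the name add_record are hypothetical.

```python
import struct

# Hypothetical layout (an assumption): in-use flag + 4-byte key + 20-byte name.
RECORD = struct.Struct("<?i20s")


def add_record(f, key, name):
    """Store a new record and return the record number it was given."""
    new_bytes = RECORD.pack(True, key, name.encode("ascii"))
    f.seek(0)
    record_number = 0
    while True:
        chunk = f.read(RECORD.size)       # one record in RAM at a time
        if len(chunk) < RECORD.size:
            break                         # reached the end: append here
        in_use = RECORD.unpack(chunk)[0]
        if not in_use:
            break                         # reuse this flagged slot
        record_number += 1
    f.seek(record_number * RECORD.size)
    f.write(new_bytes)
    return record_number
```

Either the file grows by one record or a record marked for deletion is actually overwritten, and at no point is the file read into RAM and rewritten.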

SUMMARY

To meet the requirements of the Mastery of Aspects for Higher Level, direct manipulation of files must be carried out in such a way that the whole file does not need to be read into RAM.

Incorporating User-friendly Features

Criterion G states:

Candidates should give attention to issues of usability during the design stage. The documentation should include some explanations of the reasons for some of the usability decisions. To be given credit, candidates must include features which make the program more user-friendly, such as helpful menus, help instructions, useful guidance to the user during the execution of the program. These should be documented in some way, for example, if an output screen is particularly well designed for readability a hard copy should be provided and labelled as such. Screen dumps and even photographs may be helpful for this criterion.

This criterion refers to the user interface which the program presents to the user (for example, menus and help instructions, rather than internal error checking).

Handling Errors

Criterion H states:

This refers to detecting and rejecting erroneous data input from the user, and preventing common run-time errors caused by calculations and data-file errors. Candidates are not expected to detect or correct intermittent or fatal hardware errors such as paper-out signals from the printer, or damaged disk drives, or to prevent data-loss during a power outage.

For this criterion, candidates must attempt to trap as many errors as possible.

If the candidate's dossier genuinely has no need of error trapping (including no data input), and this is clearly documented and justified, the candidate can reach Achievement Level 2.

However, if the candidate's claim is incorrect (for example, the program accesses a file but there is no test to check that the file exists before access to it is attempted), a high score cannot be achieved.
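
As an illustration of the kind of error trapping meant here, the following sketch checks that a data file exists before it is opened and rejects erroneous record numbers typed by the user. The function names open_data_file and parse_record_number are hypothetical, and the messages are placeholders.

```python
import os


def open_data_file(path):
    """Return the file opened for binary update, or None if it cannot be opened."""
    if not os.path.isfile(path):          # test that the file exists first
        print("Data file not found: " + path)
        return None
    try:
        return open(path, "r+b")
    except OSError as err:                # for example, permission denied
        print("Could not open data file: " + str(err))
        return None


def parse_record_number(text, record_count):
    """Return a valid record number from user input, or None if it is rejected."""
    try:
        number = int(text)
    except ValueError:                    # input was not a whole number
        return None
    if 0 <= number < record_count:
        return number
    return None                           # outside the range of the file
```

Each check here would correspond to one row of an error table in the dossier: the possible error in one column and the step taken to trap it in the other.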

The documentation in the dossier can take a variety of forms. For example, it could be a table with two columns, one which identifies any error possibilities, and one which shows the steps taken to trap the errors.

Documenting the Design Process

Criterion B states:

The solution to the problem should be thoroughly designed before any programs are written, and this design process must be documented. Good top-down design results in a flexible, general, extensible solution. In this category, both the quality of the resulting design and the design process are being evaluated. The design must include a detailed representation of the algorithms used (via pseudo-code, structure diagrams, etc.) that clearly illustrate the candidate's solution.

Top-down analysis (solution decomposition) means breaking down a problem into smaller problems. These are then broken down in turn until ultimately a pseudo-code representation is obtained which can be used as a basis for program construction. It is appropriate to use diagrams for the early stages. However, for the non-standard or non-trivial modules, the final stage must be pseudo-code at a level of detail equivalent to PURE. This final stage of design should lead easily into coding in an appropriate programming language. For example, an object-oriented design should be able to be coded into several object-oriented languages, whereas a procedure-oriented design should be able to be coded into any one of several block-structured languages.

This criterion refers to the documentation of the design process which does not include the final program listing.

DESIGN          DEFINITION
Incomplete      The first and last stages of a design and one or two stages in-between are included, but the design still contains obvious gaps.
Complete        All the relevant decomposition from the problem definition through all stages to the final stage is included.
Non-portable    The final stage of the design clearly demands the target programming language used by the candidate.
Portable        The final stage of the design can be coded into more than one appropriate modular language.

For this criterion, candidates must explain the links between the levels of design to guide the reader through the design process.

Candidates must also document their dossiers thoroughly. To show mastery of an aspect, it is not sufficient if candidates only use it within a program: in the written documentation, candidates must include information about why a particular data structure is appropriate, how it is used (for example, how nodes are added, deleted and searched for) and where it is used in the program. In other words, candidates must provide cross-references between the documentation and specific procedures within the program.

Teacher Comments on the Dossier

If teachers add comments to dossiers as well as marking them, ready for moderation, this facilitates the moderation process. In addition, if teachers write a report for each candidate which justifies the Achievement Level awarded for each criterion, this also will facilitate the moderation process and make the feedback forms from the moderator more focused.


csgate@donaldson.org
ICQ# 62833374

URL:   http://donaldson.org/    Last Revised:   February 22, 2004