Developing Applications in DATATRIEVE
Style and Convention

Joe H. Gallagher, Ph. D.



Within the framework of the DATATRIEVE language, where keywords and language syntax are specified, a programmer has a great deal of freedom to influence the readability and maintainability of dictionary objects and DATATRIEVE code. This freedom is mostly vested in the naming of elementary and group fields and dictionary objects (domains, records, tables, procedures, and plots), but it extends to the naming of files and even to the formatting (or non-formatting) of nested compound statements.

Field Names

As examples of how the naming of elementary and group fields affects the readability and usability of DATATRIEVE, consider three essentially equivalent record definitions. The first record definition is as follows:


    define record ml-rec using optimize
    01 ml-rec.
       03 ln pic x(18).
       03 fn pic x(12).
       03 mi pic x.
       03 a1 pic x(30).
       03 a2 pic x(30).
       03 ct pic x(15).
       03 st pic x(2).
       03 zp pic 9(5).
    ;

The programmer of this first record definition learned BASIC about 10 years ago, when only two-character names were allowed (and maybe hasn't learned anything more since). Certainly the programmer is a very poor typist; he can't type more than two characters at a time without making a mistake. While the record definition is technically correct, procedures using this record definition are unlikely to be easily readable.

The second record definition is:


    define record mailing-list-record using optimize
    01 mailing-list-rec.
       03 mailing-list-last-name           picture x(18).
       03 mailing-list-first-name          picture x(12).
       03 mailing-list-middle-initial      picture x.
       03 mailing-list-address-line1       picture x(30).
       03 mailing-list-address-line2       picture x(30).
       03 mailing-list-city                picture x(15).
       03 mailing-list-state-code          picture x(2).
       03 mailing-list-zip-code            picture 9(5).
    ;

The mother of the programmer of this second record definition was obviously frightened by a COBOL compiler. Procedures written with this record definition are verbose and unwieldy; column headers look peculiar. Statements that join this domain with others are absolutely clear, but interminably long. It is also clear that the programmer of this record definition does not type at all. If he had to type and use such long field names, they would certainly be shortened.

The third record definition is:


    define record mailinglist-record using optimize
    01 mailinglist-rec.
       03 name.
          05 last-name pic x(18).
          05 first-name pic x(12).
          05 middle-init pic x
             valid if middle-init = " " or 
             middle-init bt "A" and "Z".
          05 print-name computed by choice of
             middle-init eq " " then first-name ||| last-name
             else first-name ||| middle-init |". "| last-name
             end-choice
             edit-string is x(34).
       03 address1 pic x(30).
       03 address2 pic x(30).
       03 city pic x(15).
       03 state-code pic x(2)
          valid if state-code in state-code-table.
       03 zip-code pic 9(5).
       03 print-citystatezip computed by
          city||", "|state-code|"  "|format zip-code using 9(5) .
    ;

Now, as you might have guessed, this third record definition was written by a DATATRIEVE programmer. The field names are of modest length, but completely and accurately describe the fields. Validation clauses are included on MIDDLE_INIT and STATE_CODE; conveniently formatted composite print buffers, PRINT_NAME and PRINT_CITYSTATEZIP, are included in the record definition.

It is amazing how different in appearance and functionality these three essentially identical records are!
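
The difference shows up as soon as the fields are used in a statement. As a rough sketch, here is the same report written against the first and the third record definitions. The domain name ML and the state code "MO" are assumptions made purely for illustration, and neither domain definition is shown in this article.

    ! first record definition (hypothetical domain ML)
    ready ml
    for ml with st = "MO" sorted by ln
        print fn, mi, ln, a1, a2, ct, st, zp

    ! third record definition (domain MAILINGLIST)
    ready mailinglist
    for mailinglist with state-code = "MO" sorted by last-name
        print print-name, address1, address2, print-citystatezip

The second version needs no comments to explain what is being selected, sorted, and printed.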

Domain names

Now consider the naming of domains. Since the name of the domain is the object or "target" of many statement or command verbs, selecting a domain name which is a plural noun creates the most natural syntax in DATATRIEVE. YACHTS, OWNERS, FAMILIES, PETS, and EMPLOYEES are some of the familiar example domains used in DATATRIEVE documentation. Most DATATRIEVE programmers do observe this domain naming convention. There is one notable exception to this convention. Where a plural noun which describes the contents of the domain has a singular equivalent, this equivalent is sometimes used as the domain name and describes the aggregation of "things" in the domain. An example of this would be the one used above, MAILINGLIST. Following the normal naming convention one would use NAMES_AND_ADDRESSES rather than MAILINGLIST; but names and addresses make up the mailing list (i.e., the mailing list is the aggregation of names and addresses).

Violating the standard naming convention with something like:


    define domain QUICK using yachts-record on yachts.dat;

gives peculiar DATATRIEVE statements like:


    ready QUICK
    for QUICK with loa gt 12
     . . .

Using a part of speech other than a noun (or perhaps a possessive pronoun like MINE, YOURS, OURS, etc.) spoils the English-like syntax of DATATRIEVE and certainly gives little indication as to the contents of the domain.
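
For comparison, the conventional plural-noun name reads almost like English. This is a minimal sketch that reuses the record and file names from the QUICK example; it assumes that BUILDER and MODEL are field (or query) names in that record, as they are in Digital's sample YACHTS domain.

    define domain YACHTS using yachts-record on yachts.dat;

    ready YACHTS
    for YACHTS with loa gt 12
        print builder, model, loa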

Record name and top-level group field name

The choice of a record name is probably the least important naming choice that a DATATRIEVE programmer makes. The record name appears only in the domain definition and the record definition; it is used nowhere else in DATATRIEVE. The only real necessity is that the name be chosen in a non-confusing way. A name such as YACHTS-TABLE would be a poor choice for a record name. My preference for a record name is to use the name of the domain followed by "-RECORD". By appending "-RECORD" it is absolutely clear that it is a record definition.

Unlike the record name, the top-level group field name is frequently used when restructuring a domain and sometimes in print statements. Examples in Digital's documentation usually name the top-level group field as either the domain name or the record name, but the well-known example of YACHTS is an exception where the record name is YACHT and the top-level group field is BOAT. The Application Design Tool (ADT) names both the top-level group field and the record name as the domain name followed by "-REC". I prefer to give the top-level group field a name which is different from both the domain name and the record name. Like ADT, I prefer to use the domain name followed by "-REC". When initially learning DATATRIEVE (several years ago), I was very confused by the lack of convention in naming domains, records, and (particularly) top-level group fields. While the naming conventions I have described are certainly not the only possible ones, I have found them to be relatively easy for beginning DATATRIEVE users to grasp.
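
Put together, the three names for the mailing list example would look something like the following sketch. The file name mailinglist.dat is an assumption, and the record body is elided. The final statement only illustrates that it is the top-level group field name, not the record name, that appears in a print statement.

    define record mailinglist-record using optimize
    01 mailinglist-rec.
       . . .
    ;

    define domain MAILINGLIST using mailinglist-record on mailinglist.dat;

    ! once the domain has been readied
    print mailinglist-rec of mailinglist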

Table names

Table names can be any unique name, but I have always found it least confusing if the table name contains the word "TABLE". Such table names as STATE_CODE_TABLE or TITLE_OF_ADDRESS_CODE_TABLE provide a clear indication of the contents of the table and an almost natural syntax when used in a print or validation statement such as


    print state-code via state-code-table
or

    03 state-code pic x(2)
       valid if state-code in state-code-table.
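
A dictionary table that supports both statements might be defined along these lines. This is a sketch only: the two code and translation pairs are placeholders chosen for illustration, not the contents of an actual table.

    define table state-code-table
        "KS" : "Kansas",
        "MO" : "Missouri"
    end-table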

Hierarchy names

Many DATATRIEVE programmers bitten with the "relational bug" try to avoid using hierarchies such as views. But the data structure of lists, or one master record with a variable number of detail records, occurs very often in many kinds of applications. The syntax of accessing fields that are in lists is certainly not as trivial as accessing non-hierarchical fields. However, some (if not most) of the difficulties are resolved by choosing appropriate names for the lists and list elements. Consider the following record definition for a domain named PATIENTS:


    define record PATIENTS-RECORD using optimize
    01 patients-rec.
       03 name pic . . .
       . . .
       03 number_of_diagnoses usage is byte
          valid if number_of_diagnoses between 0 and 9.
       03 diagnoses occurs 0 to 9 times depending on number_of_diagnoses.
          05 diagnosis pic x(n).
       . . .
    ;

By naming the list as a plural noun (just as we did with domains) and the elements of the list as singular nouns, we create the most natural DATATRIEVE syntax for referring to hierarchies. In this case, a record selection expression and a print statement would look like


    find patients with any diagnoses with diagnosis = "flat feet"

    print name, all diagnosis of diagnoses of current
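
The same pairing of a plural list name with a singular element name carries over to nested FOR loops. A brief sketch, assuming the PATIENTS domain has been defined with the record above:

    ready patients
    for patients
        for diagnoses
            print name, diagnosis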

When it comes to naming the list in a view, I depart from my own convention. Consider a view definition like:


    DEFINE DOMAIN SAILBOATS OF YACHTS, OWNERS BY
    01 SAILBOAT OCCURS FOR YACHTS.
       03 BOAT FROM YACHTS.
       03 SKIPPERS OCCURS FOR OWNERS WITH TYPE EQ BOAT.TYPE.
          05 NAME FROM OWNERS.
    ;

My preference for the view definition would instead be:


    DEFINE DOMAIN SAILBOATS OF YACHTS, OWNERS BY
    01 SAILBOAT-REC OCCURS FOR YACHTS.
       03 BOAT FROM YACHTS.
       03 OWNERS-R OCCURS FOR OWNERS WITH TYPE EQ BOAT.TYPE.
          05 NAME FROM OWNERS.
    ;

I prefer a list name like OWNERS-R to SKIPPERS since it is not completely clear that SKIPPERS is a list of OWNERS.
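
With the OWNERS-R name, a statement that walks the list makes its origin explicit. A minimal sketch, assuming the YACHTS and OWNERS domains and the preferred SAILBOATS view definition shown above:

    ready sailboats
    print boat, all name of owners-r of sailboats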

Formatting

Formatting procedures and record definitions improves the readability of the code or definitions. I have a strong preference that procedures should be indented four spaces for each level of nesting. Such code would look something like:


    . . .
    for foo begin
        if ( . . . ) then begin
            statements
            end else begin
            more statements
            end
        end

If the number of levels of nesting is very large, then two spaces of indentation for each level may be more practical.
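
As a concrete instance of the four-space convention, a short procedure might be laid out as follows. This is a sketch: the procedure name and the cutoff of 30 are invented for illustration, and the BUILDER, MODEL, and LOA names are assumed to match Digital's sample YACHTS domain.

    define procedure list-long-yachts
    ready yachts
    for yachts begin
        if loa gt 30 then begin
            print builder, model, loa
            end else begin
            print builder, model
            end
        end
    end-procedure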

For record definitions, I prefer three spaces of indentation. A record definition would then look like:


    define record foo-record using optimize
    01 foo-rec.
       03 first-foo-field ... .
       03 second-foo-field ... .
       03 group-field.
          05 first-group-field ... 
             first clause on field                                
             second clause on field
             last clause on field.
    . . .
    ;

By using three spaces of indentation, one space between the level number and the field specification, and indenting the clauses under the field specification, the resulting record definition is readable and maintainable.

I am not suggesting that the style and conventions which I have described here are the only way or even the best way of writing code and definitions in DATATRIEVE. It is a way that works for me; you should find what works best for you.


Originally published in the newsletter of the DATATRIEVE/4GL SIG, The Wombat Examiner and 4GL Dispatch, Volume 9, Number 11, pages 5-9; in the Combined SIGs Newsletters of Digital Equipment Computer Users Society, Volume 3, Number 11, July 1988.
Joe H. Gallagher, Ph. D.
dtrwiz@ix.netcom.com