How to reset RUNNING COUNT

Joe H. Gallagher, Ph. D.


The statistical function RUNNING COUNT is used to number items in a REPORT or a FOR .. PRINT construct.  Sometimes one would like to start the RUNNING COUNT over again from one so that the items of a break group are counted.  There is no inherent method in DATATRIEVE to restart or reset the RUNNING COUNT function.

This particular problem is one of the DATATRIEVE classics.  At the beginning of Wombat Magic at the 1984 Spring Symposia in Anaheim, the DATATRIEVE Wizards were challenged with this problem: create a set of report writer statements which would have the apparent effect of resetting the RUNNING COUNT at the beginning of each group so that each group would be numbered starting at one.

To illustrate the solutions, we are going to use the FAMILIES domain which is distributed with DATATRIEVE.  The report writer statements are:

READY FAMILIES SHARED READ
FIND FAMILIES CROSS KIDS
REPORT A IN CURRENT SORTED BY PARENTS, DESC AGE
AT TOP OF PARENTS PRINT PARENTS
PRINT KID-NAME, AGE, RUNNING COUNT, RESET-RUNNING-COUNT
END-REPORT

The results of this report is:

                                               RESET
                          KID         RUNNING RUNNING
  FATHER     MOTHER       NAME    AGE  COUNT   COUNT

ARNIE      ANNE
                       SCOTT       2       1        1
                       BRIAN       0       2        2
BASIL      MERIDETH
                       BEAU       28       3        1
                       BROOKS     26       4        2
                       ROBIN      24       5        3
                       JAY        22       6        4
                       JILL       20       7        5
                       WREN       17       8        6
EDWIN      TRINITA
                       ERIC       16       9        1
                       SCOTT      11      10        2
. . .

There are two methods of calculating the variable RESET-RUNNING-COUNT.
 

Technique 1: Context variable

Katherine Wrobel, Colorado Support Center, Digital Equipment Corporation, won the Wombat Magic challenge with the following simple solution.

DECLARE RESET-RUNNING-COUNT COMPUTED BY
    (RUNNING COUNT) - (COUNT OF CURRENT WITH PARENTS < A.PARENTS) .

Note that while this is a simple and elegant solution, it has performance problems.  For each record of the report, it is necessary to scan all other records to determine the count of previous parents.  This solutions is an N squared (N times N) solution.

Technique 2: Logicals

In September  of 1997, a user on the Internet posed essentially the same problem in the Usenet conference comp.os.vms.  He referred to the restart of the RUNNING COUNT as occurrence number.  In the discussion which followed, John Briggs produced the following solution involving logicals.  Note: because the group variable PARENTS would contain embedded spaces, it is necessary, in this case, to use a variable which concatenates the two fields of PARENTS into a single variable which contains no spaces (because a VMS logical can not contain embedded spaces).

DECLARE C_PARENTS COMPUTED BY FATHER || MOTHER.

DECLARE GLOB_CNT_TO_PREV COMPUTED BY
  IF FN$TRANS_LOG("CNT_TO_PREV") EQ "CNT_TO_PREV" THEN
    (0 +
      FN$CREATE_LOG("CNT_TO_PREV", 0) +
      FN$CREATE_LOG("TOTAL_COUNT", 1) +
      FN$CREATE_LOG("THIS_ONE", C_PARENTS))
  ELSE IF C_PARENTS NE FN$TRANS_LOG ("THIS_ONE") THEN
    ( FN$TRANS_LOG ("TOTAL_COUNT") +
      FN$CREATE_LOG ("CNT_TO_PREV", FN$TRANS_LOG ("TOTAL_COUNT")) +
      FN$CREATE_LOG ("TOTAL_COUNT", FN$TRANS_LOG("TOTAL_COUNT") + 1) +
      FN$CREATE_LOG ("THIS_ONE", C_PARENTS))
  ELSE
    ( FN$TRANS_LOG ("CNT_TO_PREV") +
      FN$CREATE_LOG ("TOTAL_COUNT", FN$TRANS_LOG ("TOTAL_COUNT") + 1 )).
DECLARE RESET_RUNNING_COUNT COMPUTED BY
  (RUNNING COUNT) - GLOB_CNT_TO_PREV.
PRINT "Deleting logical CNT_TO_PREV"
FN$DELETE_LOG ("CNT_TO_PREV")
DEFINE PROCEDURE zero_count
  FN$DELETE_LOG ("CNT_TO_PREV")
END-PROCEDURE

To produce the results, run the procedure ZERO_COUNT and then run the report above. The global variable, GLOB_CNT_TO_PREV, and the logical, CNT_TO_PREV, hold the running count at the end of the previous break group.  The logical, TOTAL_COUNT, has a value which is the same as the RUNNING COUNT.  The logical, THIS_ONE, contains the previous or current value of the break field.  To see what is going on, print out the variables fn$trans_log("TOTAL_COUNT") and fn$trasn_log("THIS_ONE") within a FOR loop or REPORT statement.  Also note that the global variable may be generalized to almost any situation by replacing C_PARENTS with the appropriate break variable.

John Briggs' solution  contains two concepts which I had never seen used in DATATRIEVE before (September of 1997):

  1. using a logical to tally information over multiple records, and
  2. using the fact that including a FN$CREATE_LOG in an arithmetic expression does not affect the expression.
The solution gives the desired result in this case.  However, the COMPUTED BY variable RESET_RUNNING_COUNT using logicals has some very unDATATRIEVE-like properties.  If the variable RESET_RUNNING_COUNT is used within an RSE such as

    REPORT CURRENT WITH RESET_RUNNING_COUNT EQ 1 ...

or it is used more than once in a PRINT statement as in

    PRINT KID-NAME,AGE,RUNNING COUNT,RESET-RUNNING-COUNT, RESET-RUNNING-COUNT

gives very unpredictable results.

To get the desired results, one must reference the variable RESET-RUNNING-COUNT only once for each occurrence of the record.  With this caveat, the performance of this method is of the order of N (rather than N squared).  Thus, for large data sets, the extra complexity of defining RESET-RUNNING-COUNT using logicals would have a dramatic performance improvement.

In the particular example choses, the break field was a group field, PARENTS.  In the latest versions of DATATRIEVE (V7.0 or later), there are functions for symbols which correspond to fn$create_log, fn$trans_log, and fn$delete_log.  Thus, if you are using a version of DATATRIEVE with functions for symbol, the problem can be solved using symbols rather than logicals with fewer restrictions on the format of the break variable.


An originally web publication.  February 25, 1998.

Joe H. Gallagher, Ph. D.
dtrwiz@ix.netcom.com
MAILTO 

BACK  Back